Eric Sunshine <sunsh...@sunshineco.com> wrote:
> On Tue, May 31, 2016 at 3:45 AM, Eric Wong <e...@80x24.org> wrote:
> > Eric Sunshine <sunsh...@sunshineco.com> wrote:

<snip>  Ah, thanks; I can see your point of view now.
I always had the '^' in my mind since I've written the same
thing in Perl and Ruby.

> >> I wonder if hand-coding, rather than using a regex, could be an 
> >> improvement:
> >>
> >>     static int is_mboxrd_from(const char *s, size_t n)
> >>     {
> >>         size_t f = strlen("From ");
> >>         const char *t = s + n;
> >>
> >>         while (s < t && *s == '>')
> >>             s++;
> >>         return t - s >= f && !memcmp(s, "From ", f);
> >>     }
> >>
> >> or something.
> >
> > Yikes.  I mostly work in high-level languages and do my best to
> > avoid string parsing in C; so that scares me.  A lot.
> 
> The hand-coded is_mboxrd_from() above is pretty much idiomatic C and
> (I think) typical of how such a function would be coded in Git itself,
> so it looks normal and easy to grok to me (but, of course, I'm
> probably biased since I wrote it).
> 
> > I admit a regex isn't necessary, but I'm wondering if the above
> > could be made less frightening to someone like me.
> 
> Perhaps, but it's difficult to say without knowing how it frightens you.

Pointer arithmetic leading to buffer overruns;
but yeah, I don't do string parsing in C often, if ever.
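
For comparison, the same check can be written with an index instead of a moving pointer, which may make the bounds easier to audit. This is only a sketch and not code from the patch: the name is_mboxrd_from_idx() and the assert() precondition are hypothetical. Every read of s[] is guarded by i < n, and memcmp() only runs once at least five bytes are known to remain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical index-based variant of the quoted is_mboxrd_from().
 * Returns 1 if the n-byte buffer s is zero or more '>' characters
 * followed by "From ", and 0 otherwise.  s need not be NUL-terminated.
 */
static int is_mboxrd_from_idx(const char *s, size_t n)
{
	static const char from[] = "From ";
	const size_t f = sizeof(from) - 1;	/* length without the NUL */
	size_t i = 0;

	/* Assumed precondition: a NULL buffer is only valid when empty. */
	assert(s != NULL || n == 0);

	/* Skip leading '>' quoting; i never exceeds n. */
	while (i < n && s[i] == '>')
		i++;

	/* n - i cannot wrap because i <= n holds here. */
	return n - i >= f && !memcmp(s + i, from, f);
}
```

Since i never passes n, the subtraction n - i stays unsigned-safe, and the comparison against f avoids the signed/unsigned mix of `t - s >= f`.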

> > Maybe extra test cases + valgrind can quell my fears :)
> 
> I can envision tests such as:
> 
>     ""
>     "F"
>     "From"
>     "From "
>     "From     "
>     "From foobar"
> 
> and so on, if that's what you mean.

Yes, I also noticed trailing spaces are dropped anyway, so
there's no perfect round-tripping going on.
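
A rough sketch of how those inputs could be exercised under valgrind follows. It assumes the hand-coded is_mboxrd_from() quoted above, with a (size_t) cast added to avoid the signed/unsigned comparison. The helpers check_one() and run_mboxrd_from_checks() are hypothetical, and the '>'-prefixed and negative cases are additions beyond the list above. Each input is copied into a heap buffer of exactly n bytes with no NUL terminator, so a read past the end should be reported by valgrind rather than silently landing on a terminator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Function under test, as quoted above, plus a (size_t) cast. */
static int is_mboxrd_from(const char *s, size_t n)
{
	size_t f = strlen("From ");
	const char *t = s + n;

	while (s < t && *s == '>')
		s++;
	return (size_t)(t - s) >= f && !memcmp(s, "From ", f);
}

/* Hypothetical helper: returns 1 if the result matches expected. */
static int check_one(const char *line, int expected)
{
	size_t n = strlen(line);
	char *copy = malloc(n ? n : 1);	/* exact size, no room for a NUL */
	int got;

	assert(copy != NULL);
	memcpy(copy, line, n);
	got = is_mboxrd_from(copy, n);
	free(copy);
	return got == expected;
}

/* Hypothetical driver: returns the number of failing cases. */
static int run_mboxrd_from_checks(void)
{
	static const struct {
		const char *line;
		int expected;
	} cases[] = {
		{ "", 0 },
		{ "F", 0 },
		{ "From", 0 },
		{ "From ", 1 },
		{ "From     ", 1 },
		{ "From foobar", 1 },
		{ ">From ", 1 },		/* added: quoted once */
		{ ">>From foobar", 1 },		/* added: quoted twice */
		{ ">>>", 0 },			/* added: only quoting */
		{ " From ", 0 },		/* added: leading space */
		{ "from ", 0 },			/* added: wrong case */
	};
	size_t i;
	int failures = 0;

	for (i = 0; i < sizeof(cases) / sizeof(cases[0]); i++)
		if (!check_one(cases[i].line, cases[i].expected))
			failures++;
	return failures;
}
```

Calling run_mboxrd_from_checks() from a small main() and running the binary under valgrind should report zero failures and no invalid reads if the bounds checks hold.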