Eric Sunshine <[email protected]> wrote:
> On Tue, May 31, 2016 at 3:45 AM, Eric Wong <[email protected]> wrote:
> > Eric Sunshine <[email protected]> wrote:
<snip>

Ah thanks, I can see your point of view now.
I always had the '^' in my mind since I've written the same
thing in Perl and Ruby.

> >> I wonder if hand-coding, rather than using a regex, could be an
> >> improvement:
> >>
> >> static int is_mboxrd_from(const char *s, size_t n)
> >> {
> >> 	size_t f = strlen("From ");
> >> 	const char *t = s + n;
> >>
> >> 	while (s < t && *s == '>')
> >> 		s++;
> >> 	return t - s >= f && !memcmp(s, "From ", f);
> >> }
> >>
> >> or something.
> >
> > Yikes. I mostly work in high-level languages and do my best to
> > avoid string parsing in C; so that scares me. A lot.
>
> The hand-coded is_mboxrd_from() above is pretty much idiomatic C and
> (I think) typical of how such a function would be coded in Git itself,
> so it looks normal and easy to grok to me (but, of course, I'm
> probably biased since I wrote it).
>
> > I admit a regex isn't necessary, but I'm wondering if the above
> > could be made less frightening to someone like me.
>
> Perhaps, but it's difficult to say without knowing how it frightens you.

Mostly pointer arithmetic leading to buffer overruns.
But yeah, I don't do string parsing in C often, if ever.

> > Maybe extra test cases + valgrind can quell my fears :)
>
> I can envision tests such as:
>
> ""
> "F"
> "From"
> "From "
> "From "
> "From foobar"
>
> and so on, if that's what you mean.

Yes.  I also noticed that trailing spaces are dropped anyway, so
there's no perfect round-tripping going on.
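
For what it's worth, here is a rough sketch of how those cases could be
checked in a standalone C file, which could then be run under valgrind.
The function body is your is_mboxrd_from() from above with one cast added
to avoid a signed/unsigned comparison. The check() helper and
test_is_mboxrd_from() are names I made up for illustration; they are not
anything in git.git, and the extra cases (leading '>', short lengths) are
my own guesses at what is worth covering.

```c
#include <assert.h>
#include <string.h>

/* As proposed above; only the (size_t) cast is new. */
static int is_mboxrd_from(const char *s, size_t n)
{
	size_t f = strlen("From ");
	const char *t = s + n;	/* one past the last valid byte */

	/* skip any leading '>' without stepping past the end */
	while (s < t && *s == '>')
		s++;
	/* compare only if at least strlen("From ") bytes remain */
	return (size_t)(t - s) >= f && !memcmp(s, "From ", f);
}

/* Hypothetical helper: test a NUL-terminated line. */
static int check(const char *line)
{
	return is_mboxrd_from(line, strlen(line));
}

/* Hypothetical test driver; call it from main(). */
void test_is_mboxrd_from(void)
{
	/* not enough bytes left for "From " */
	const char no_nul[4] = { 'F', 'r', 'o', 'm' };

	/* the cases listed above */
	assert(!check(""));
	assert(!check("F"));
	assert(!check("From"));
	assert(check("From "));
	assert(check("From foobar"));

	/* already-escaped lines */
	assert(check(">From foobar"));
	assert(check(">>From "));
	assert(!check(">>>>"));
	assert(!check(" From "));

	/* length-limited input: nothing past n may be read */
	assert(!is_mboxrd_from(no_nul, sizeof(no_nul)));
	assert(!is_mboxrd_from(">From x", 5));
}
```

The last two cases are the ones I care about most: the buffer either has
no trailing NUL or n stops short of "From ", so the length check has to
fail before memcmp() is ever reached.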