On Tue, Mar 26, 2013 at 11:39:29AM -0700, Junio C Hamano wrote:

> The function takes two counted strings (<basename, basenamelen> and
> <pattern, patternlen>) as parameters, together with prefix (the
> length of the prefix in pattern that is to be matched literally
> without globbing against the basename) and EXC_* flags that tell it
> how to match the pattern against the basename.
> However, it did not pay attention to the length of these counted
> strings.  Update them to do the following:
>  * When the entire pattern is to be matched literally, the pattern
>    matches the basename only when the lengths of them are the same,
>    and they match up to that length.
>  * When the pattern is "*" followed by a string to be matched
>    literally, make sure that the basenamelen is equal or longer than
>    the "literal" part of the pattern, and the tail of the basename
>    string matches that literal part.
>  * Otherwise, make sure we use only the counted part of the strings
>    when calling fnmatch_icase().  Because these counted strings are
>    full strings most of the time, avoid unnecessary allocation.

I think this is OK, with the intention that we would eventually drop the
allocations from your third bullet point in favor of using a
byte-counted version of fnmatch (i.e., nwildmatch). But until then we're
going to see a performance drop.

The pattern is usually going to be NUL-terminated at its counted length,
but every time we feed a directory as the basename, we are going to hit
this allocation. And we do it once for _every_ directory against _every_
wildcard gitignore pattern. So I think it is probably going to be
measurable. I guess we can try measuring it on something like WebKit,
which has plenty of both directories and gitattributes.
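
To make the three cases and the allocation concern concrete, here is a
rough sketch of how I read the description. It is not the code from the
patch: sketch_match_basename(), terminated(), SKETCH_ENDSWITH and
sketch_alloc_count are all made-up names, it uses plain memcmp() and
POSIX fnmatch() instead of the icase variants, and it assumes both
counted strings point into NUL-terminated buffers so that peeking at
s[len] is safe.

```c
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the "pattern is '*' + literal" EXC_* flag. */
#define SKETCH_ENDSWITH 1

/* Counts how often the slow path had to copy a string (illustration only). */
static int sketch_alloc_count;

/*
 * Return a NUL-terminated view of the counted string <s, len>.
 * Copies only when the byte just past the counted part is not NUL.
 */
static const char *terminated(const char *s, size_t len, char **to_free)
{
	char *copy;

	*to_free = NULL;
	if (s[len] == '\0')
		return s;	/* already a full string, no allocation */
	copy = malloc(len + 1);
	if (!copy)
		abort();
	memcpy(copy, s, len);
	copy[len] = '\0';
	sketch_alloc_count++;
	*to_free = copy;
	return copy;
}

static int sketch_match_basename(const char *basename, size_t basenamelen,
				 const char *pattern, size_t prefix,
				 size_t patternlen, int flags)
{
	char *free_pat, *free_base;
	const char *pat, *base;
	int matched;

	/* Case 1: the whole pattern is literal; lengths must agree. */
	if (prefix == patternlen)
		return patternlen == basenamelen &&
		       !memcmp(pattern, basename, basenamelen);

	/* Case 2: "*" followed by a literal; compare against the tail. */
	if (flags & SKETCH_ENDSWITH) {
		size_t litlen = patternlen - 1;
		return litlen <= basenamelen &&
		       !memcmp(pattern + 1,
			       basename + basenamelen - litlen, litlen);
	}

	/* Case 3: general glob; fnmatch() needs NUL-terminated strings. */
	pat = terminated(pattern, patternlen, &free_pat);
	base = terminated(basename, basenamelen, &free_base);
	matched = fnmatch(pat, base, 0) == 0;
	free(free_pat);
	free(free_base);
	return matched;
}
```

With something like this, matching <"src", 3> against "s?c" never
allocates, but matching <"src/", 3> does, which is the per-directory,
per-wildcard-pattern cost I am worried about above.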

