On Tue, Mar 26, 2013 at 01:49:10PM -0700, Junio C Hamano wrote:
> Jeff King <p...@peff.net> writes:
> > I timed this doing "git archive HEAD" on webkit.git before and after. It
> > actually ended up not mattering much (I think because it is only the
> > directories which are affected, not each individual path, so it's a
> > much smaller number than you'd think). The best-of-five timing was
> > slightly slower, but was within the noise.
> Interesting. Because "archive" has to incur a large I/O cost
> anyway, I expected extra allocation for correctness for only the
> directory paths would be dwarfed in the noise.
> I actually care more about cases other than "archive", though. Do
> we even feed directory paths to the machinery?
In general, no, I don't think so. That's why I tested "archive", since I
knew it did. In the normal case, we should just feed file paths, meaning
we only run into this code path when somebody has "foo/" in their
pattern. Testing like:
  git ls-files -z >files
  time git check-attr --stdin -z -a <files >/dev/null
showed a difference well within the noise.
> > So I do still think it would make sense to go to a byte-limited version
> > of fnmatch eventually, just for code cleanliness and predictability of
> > performance, but this is really not a bad solution in the interim.
> Yes, what we do with wildmatch is a separate issue for 'master' and
Oh, agreed. I just wanted to see how much performance would be impacted
for the interim. But it seems that it's not.
So I think your series is the right direction, but we would want to
factor out the allocation code and use it from match_pathname, as well.
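
To make the "factor out the allocation code" idea concrete, here is a minimal sketch of what a shared helper could look like. It is not the code from the series: the name fnmatch_limited() and its signature are hypothetical, and it assumes that path is a NUL-terminated string at least len bytes long, so that reading path[len] is safe. It copies only when the limited span is not already NUL-terminated, which is why plain file paths would not pay for the allocation.

```c
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not git's actual code): match `pattern` against
 * only the first `len` bytes of `path`.
 *
 * Assumes `path` is NUL-terminated and strlen(path) >= len.
 * fnmatch() needs a NUL-terminated string, so a temporary copy is made
 * only when the byte after the limited span is not already NUL.
 *
 * Returns 0 on a match and non-zero otherwise (including the case
 * where the temporary allocation fails).
 */
static int fnmatch_limited(const char *pattern, const char *path,
                           size_t len, int flags)
{
    char *copy;
    int ret;

    /* Common case: the span is the whole string, so no copy is needed. */
    if (path[len] == '\0')
        return fnmatch(pattern, path, flags);

    copy = malloc(len + 1);
    if (!copy)
        return -1;
    memcpy(copy, path, len);
    copy[len] = '\0';

    ret = fnmatch(pattern, copy, flags);
    free(copy);
    return ret;
}
```

With a helper of this shape, a call such as fnmatch_limited("foo", "foo/bar", 3, 0) would match the directory part without the caller having to modify the path in place, and both call sites could share the same copy-on-demand logic.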