Re: [PATCH 3/3] match_basename: use strncmp instead of strcmp
On Sat, Mar 9, 2013 at 2:50 PM, Junio C Hamano wrote:
> Nguyễn Thái Ngọc Duy writes:
>
>> strncmp provides length information, compared to strcmp, which could
>> be taken advantage by the implementation. Even better, we could check
>> if the lengths are equal before calling strncmp, eliminating a bit of
>> strncmp calls.
>
> I think I am a bit slower than my usual self tonight, but I am
> utterly confused by the above.
>
> strncmp() compares _only_ up to the first n bytes, so when you are
> using it for equality, it is not "we could check length", but is "we
> MUST check they match to the length of the shorter string", if you
> want to obtain not just faster but correct result.
>
> Am I mistaken?

Yep, the description is a bit misleading. Although you could get away
without a separate length check by doing !strncmp(a, b, strlen(a)+1),
since the comparison then includes a's terminating NUL.

> Even if you are using strcmp() that yields ordering not just
> equality, it can return a correct result as soon as it hits the
> first bytes that are different; I doubt using strncmp() contributes
> to the performance very much. Comparing lengths before doing
> byte-for-byte comparison could help because you can reject two
> strings with different lengths without looking at them.
>
> At the same time, I wonder if we can take advantage of the fact that
> these call sites only care about equality and not ordering.

I tried to push it further and compare hashes before doing the actual
string comparison. It slowed things down (hopefully because of the cost
of hashing, the same one from name-hash.c, not because I did it wrong).
--
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
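
[Illustration, not part of the original message.] A minimal sketch of the strlen(a)+1 idea mentioned in the reply above, next to the plain strlen(a) variant that only tests for a prefix. The helper names str_equal_strncmp and is_prefix are hypothetical and do not come from the patch.

```c
#include <string.h>

/*
 * Hypothetical helper: equality using strncmp() alone.
 * Comparing strlen(a) + 1 bytes includes a's terminating NUL,
 * so b has to end at the same position for the result to be 0.
 */
static int str_equal_strncmp(const char *a, const char *b)
{
	return !strncmp(a, b, strlen(a) + 1);
}

/*
 * For contrast: comparing only strlen(a) bytes is a prefix test,
 * not an equality test, because b may continue past that point.
 */
static int is_prefix(const char *a, const char *b)
{
	return !strncmp(a, b, strlen(a));
}
```

With these definitions, str_equal_strncmp("Makefile", "Makefile.in") is 0 while is_prefix("Makefile", "Makefile.in") is 1, which is the correctness concern raised in the quoted text.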
Re: [PATCH 3/3] match_basename: use strncmp instead of strcmp
On Fri, Mar 08, 2013 at 11:50:04PM -0800, Junio C Hamano wrote:
> At the same time, I wonder if we can take advantage of the fact that
> these call sites only care about equality and not ordering.

I did an RFC patch for that (which I mistakenly didn't send as a reply
to this e-mail), and I believe that you're correct. My solution is
inspired by curl's strequal. Is the reason for git not caring about
lower/upper case that it needs to support Windows? Or is there some
other smart reason?

I was also thinking about discarding files by looking at their
modification date. If the modification timestamp is older than or equal
to the latest commit, there's probably no reason to examine that file
any further. I'm not sure about the side effects this may imply,
though. I think they can be quite nasty.

Is this something worth digging into further, or am I already on the
wrong path?

--
Kind regards
Fredrik Gustafsson

tel: 0733-608274
e-mail: iv...@iveqy.com
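
[Illustration, not part of the original message.] A rough sketch of what an equality-only, case-insensitive comparison could look like. This is neither curl's actual strequal code nor the RFC patch mentioned above; the name equal_icase is hypothetical, and the sketch assumes plain ASCII case folding via tolower().

```c
#include <ctype.h>

/*
 * Hypothetical helper: returns 1 if a and b are equal ignoring case,
 * 0 otherwise. It only answers "equal or not" and never computes an
 * ordering, so it can stop at the first mismatching byte.
 */
static int equal_icase(const char *a, const char *b)
{
	while (*a && *b) {
		if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
			return 0;
		a++;
		b++;
	}
	/* Equal only if both strings reached their NUL together. */
	return *a == *b;
}
```

A case-sensitive variant would simply drop the tolower() calls.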
Re: [PATCH 3/3] match_basename: use strncmp instead of strcmp
Nguyễn Thái Ngọc Duy writes:

> strncmp provides length information, compared to strcmp, which could
> be taken advantage by the implementation. Even better, we could check
> if the lengths are equal before calling strncmp, eliminating a bit of
> strncmp calls.

I think I am a bit slower than my usual self tonight, but I am
utterly confused by the above.

strncmp() compares _only_ up to the first n bytes, so when you are
using it for equality, it is not "we could check length", but is "we
MUST check they match to the length of the shorter string", if you
want to obtain not just faster but correct result.

Am I mistaken?

Even if you are using strcmp() that yields ordering not just
equality, it can return a correct result as soon as it hits the
first bytes that are different; I doubt using strncmp() contributes
to the performance very much. Comparing lengths before doing
byte-for-byte comparison could help because you can reject two
strings with different lengths without looking at them.

At the same time, I wonder if we can take advantage of the fact that
these call sites only care about equality and not ordering.
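
[Illustration, not part of the original message.] A minimal sketch of the "compare lengths first" idea described above, assuming the caller already knows both lengths. The helper name equal_with_len is hypothetical; memcmp() is used here instead of strncmp() because only equality matters once the lengths agree.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: equality only, with both lengths supplied
 * by the caller. Strings of different lengths are rejected without
 * reading a single byte of either string.
 */
static int equal_with_len(const char *a, size_t alen,
			  const char *b, size_t blen)
{
	if (alen != blen)
		return 0;
	return !memcmp(a, b, alen);
}
```

If the lengths are not already known, computing them with strlen() reads both strings anyway, so the early rejection only pays off when the lengths come for free.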