Re: git blame vs git log --follow performance
On Mon, Jan 27, 2014 at 4:10 AM, Joe Perches wrote:
> Is there something that can be done about improving
> git log --follow -- performance to be nearly
> equivalent speed to git blame -- ?

Not strictly about --follow, but there is room for improvement for
diff'ing in log in general. Right now we do "diff HEAD HEAD~1", "diff
HEAD~1 HEAD~2" and so on (--follow needs diff to detect renames). At
each step we load new tree objects and reparse them. Notice that after
"diff HEAD HEAD~1" we may have "HEAD~1" and its subtrees read and
parsed (not entirely). We could reuse that for "diff HEAD~1 HEAD~2".

On git.git, "git log --raw" takes 10s and it seems tree object reading
is about 2s. In the ideal case we might be able to cut that to 1s. The
tree parsing code (update_tree_entry) takes about 5s. We might be able
to cut that in half, I'm not entirely sure. But there could be a lot
of work in caching "HEAD~1", and the overhead may turn out too high for
any gain.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
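The reuse idea in the message above can be illustrated with a toy model.
This is only a sketch under loud assumptions: the commit names, tree ids
and the parse_tree/diff_trees helpers are all hypothetical, trees are flat
dictionaries rather than nested git tree objects, and none of the caching
overhead that the message worries about is modelled. It only counts how
often a tree gets parsed when each parent tree is reparsed versus when the
parent parsed for one diff is kept for the next one.

```python
# Hypothetical toy history: commit name -> id of its root tree object.
COMMIT_TREES = {"HEAD": "t3", "HEAD~1": "t2", "HEAD~2": "t1"}

# Hypothetical tree objects: tree id -> {path: blob id}.
TREE_OBJECTS = {
    "t3": {"Makefile": "b1", "diff.c": "b4"},
    "t2": {"Makefile": "b1", "diff.c": "b3"},
    "t1": {"Makefile": "b1", "diff.c": "b2"},
}

stats = {"parses": 0}


def parse_tree(tree_id):
    """Stand-in for reading and parsing one tree object; counts each call."""
    stats["parses"] += 1
    return dict(TREE_OBJECTS[tree_id])


def diff_trees(old, new):
    """Return the sorted paths whose blob id differs between two trees."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


def walk_naive(commits):
    """Diff each commit against its parent, parsing both trees every time."""
    result = []
    for child, parent in zip(commits, commits[1:]):
        child_tree = parse_tree(COMMIT_TREES[child])
        parent_tree = parse_tree(COMMIT_TREES[parent])
        result.append(diff_trees(parent_tree, child_tree))
    return result


def walk_reusing_parent(commits):
    """Same walk, but keep the parsed parent tree for the next diff."""
    result = []
    child_tree = parse_tree(COMMIT_TREES[commits[0]])
    for parent in commits[1:]:
        parent_tree = parse_tree(COMMIT_TREES[parent])
        result.append(diff_trees(parent_tree, child_tree))
        child_tree = parent_tree  # reused as the "new" side of the next diff
    return result


if __name__ == "__main__":
    order = ["HEAD", "HEAD~1", "HEAD~2"]
    for walk in (walk_naive, walk_reusing_parent):
        stats["parses"] = 0
        changes = walk(order)
        print(walk.__name__, changes, "parses:", stats["parses"])
```

In this model a linear walk over n commits parses 2(n-1) trees naively and
n trees with reuse, which is in line with the "cut that in half" estimate,
though real tree parsing is partial and the real saving may differ.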
Re: git blame vs git log --follow performance
On Mon, 2014-01-27 at 08:33 +0700, Duy Nguyen wrote:
> On Mon, Jan 27, 2014 at 4:10 AM, Joe Perches wrote:
> > For instance (using the Linus' linux kernel git):
> >
> > $ time git log --follow -- drivers/firmware/google/Kconfig > /dev/null
> >
> > real    0m42.329s
> > user    0m40.984s
> > sys     0m0.792s
> >
> > $ time git blame -- drivers/firmware/google/Kconfig > /dev/null
> >
> > real    0m0.963s
> > user    0m0.860s
> > sys     0m0.096s
> >
>
> It's not fair to compare blame and log. If you compare, compare it to
> non follow version

Perhaps not, but git blame does follow renames.

$ git blame --help
[]
       The origin of lines is automatically followed across whole-file
       renames (currently there is no option to turn the
       rename-following off). To follow lines moved from one file to
       another, or to follow lines that were copied and pasted from
       another file, etc., see the -C and -M options.

> I tested a version with rename detection logic removed. It did not
> change the timing significantly. To improve --follow I think we need
> to do something about path filtering.

Perhaps the log history could stop being read when a commit
is found that creates the file without another file being
deleted in the same commit.
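The stopping rule suggested at the end of the message above could look
roughly like the following sketch. Everything here is an assumption made
for illustration: the history is a hand-written list (newest first) of
hypothetical commits with name-status style (status, path) pairs, and the
function name follow_with_early_stop is invented. It is not how git log
walks history; it only shows the proposed condition, "the file is added
and no other file is deleted in the same commit".

```python
# Hypothetical history, newest first. Each commit lists (status, path)
# pairs, with status "A" (added), "M" (modified) or "D" (deleted).
HISTORY = [
    ("c5", [("M", "drivers/net/foo.c")]),
    ("c4", [("M", "drivers/firmware/google/Kconfig")]),
    ("c3", [("A", "drivers/firmware/google/Kconfig"),
            ("A", "drivers/firmware/google/Makefile")]),
    ("c2", [("M", "README")]),
    ("c1", [("A", "README")]),
]


def follow_with_early_stop(history, path):
    """Return (commits touching path, number of commits visited).

    The walk stops at a commit that adds `path` while deleting no other
    file, since such a commit cannot be the destination of a rename.
    """
    touched = []
    visited = 0
    for commit, changes in history:
        visited += 1
        statuses = {status for status, changed in changes if changed == path}
        if not statuses:
            continue
        touched.append(commit)
        deleted_elsewhere = any(
            status == "D" for status, changed in changes if changed != path
        )
        if "A" in statuses and not deleted_elsewhere:
            break  # file was created here and nothing was renamed away
        # Otherwise a real implementation would run rename detection and
        # possibly switch to the old path; this sketch just keeps walking.
    return touched, visited


if __name__ == "__main__":
    print(follow_with_early_stop(HISTORY, "drivers/firmware/google/Kconfig"))
```

In the toy history the walk ends at c3 and never visits c2 or c1, which
is the kind of saving the suggestion is after for a file that has never
been renamed.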
Re: git blame vs git log --follow performance
On Mon, Jan 27, 2014 at 4:10 AM, Joe Perches wrote:
> For instance (using the Linus' linux kernel git):
>
> $ time git log --follow -- drivers/firmware/google/Kconfig > /dev/null
>
> real    0m42.329s
> user    0m40.984s
> sys     0m0.792s
>
> $ time git blame -- drivers/firmware/google/Kconfig > /dev/null
>
> real    0m0.963s
> user    0m0.860s
> sys     0m0.096s
>

It's not fair to compare blame and log. If you compare, compare it to
non follow version

$ time git log --follow -- drivers/firmware/google/Kconfig > /dev/null

real    0m35.552s
user    0m35.120s
sys     0m0.383s

$ time git log -- drivers/firmware/google/Kconfig > /dev/null

real    0m4.366s
user    0m4.215s
sys     0m0.144s

Although because we need to detect rename, we can't really filter to
one path. So the base line is more like

$ time git log > /dev/null

real    0m29.338s
user    0m28.485s
sys     0m0.813s

with rename detection taking some more time.

> Perhaps adding a whole-file rename option to the
> "git log" history simplification mechanism could
> help?
>
> Thoughts?

I tested a version with rename detection logic removed. It did not
change the timing significantly. To improve --follow I think we need
to do something about path filtering.
-- 
Duy
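To make the comparison in the message above concrete, the three "real"
timings can be put side by side. The snippet below is only arithmetic on
the numbers quoted in the message; the seconds() helper and the variable
names are made up for illustration, and a single timing run per command
is of course a rough measurement.

```python
def seconds(stamp):
    """Convert a `time` style value such as "0m35.552s" to seconds."""
    minutes, rest = stamp.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))


# "real" timings quoted in the message above.
follow = seconds("0m35.552s")        # git log --follow -- <path>
path_limited = seconds("0m4.366s")   # git log -- <path>
full_walk = seconds("0m29.338s")     # git log (no path filtering)

# Extra time on top of the unfiltered walk, which the message attributes
# to rename detection.
extra_over_full_walk = follow - full_walk

# How much slower --follow is than the path-limited run.
slowdown_vs_path_limited = follow / path_limited

if __name__ == "__main__":
    print(f"extra over full walk: {extra_over_full_walk:.3f}s")
    print(f"slowdown vs path-limited log: {slowdown_vs_path_limited:.1f}x")
```

Read this way, most of the --follow time is already present in the
unfiltered walk, which matches the conclusion that path filtering is the
part worth improving.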
git blame vs git log --follow performance
Hi.

Is there something that can be done about improving
git log --follow -- performance to be nearly
equivalent speed to git blame -- ?

The overall cpu time taken for these 2 commands that
track individual file history can be quite different.

git log --follow --
and
git blame --

It seems that there can be a couple orders of magnitude
delta in the overall time taken.

For instance (using the Linus' linux kernel git):

$ time git log --follow -- drivers/firmware/google/Kconfig > /dev/null

real    0m42.329s
user    0m40.984s
sys     0m0.792s

$ time git blame -- drivers/firmware/google/Kconfig > /dev/null

real    0m0.963s
user    0m0.860s
sys     0m0.096s

This particular file has never been renamed.

Looking at the output on screen, there does seem to be
25+ seconds of cpu time consumed after the initial
(last shown) commit that introduces this file.

Perhaps adding a whole-file rename option to the
"git log" history simplification mechanism could
help?

Thoughts?