Re: git blame vs git log --follow performance

2014-01-26 Thread Duy Nguyen
On Mon, Jan 27, 2014 at 4:10 AM, Joe Perches wrote:
> Is there something that can be done about improving
> git log --follow --  performance to be nearly
> equivalent speed to git blame --  ?

Not strictly about --follow, but there is room for improvement for
diff'ing in log in general. Right now we do "diff HEAD HEAD~1", "diff
HEAD~1 HEAD~2" and so on (--follow needs diff to detect rename). At
each step we load new tree objects and reparse. Notice after "diff
HEAD HEAD~1" we may have "HEAD~1" and its subtrees read and parsed
(not entirely). We could reuse that for "diff HEAD~1 HEAD~2".
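
As a rough illustration of the reuse idea above, here is a toy model in Python (not git's C code). It only counts how many tree loads a chain of pairwise diffs needs, with and without keeping the previous step's older tree in memory. The function names and the list of revisions are hypothetical and chosen for the example.

```python
# Toy model: count tree loads for the chain "diff HEAD HEAD~1",
# "diff HEAD~1 HEAD~2", ... All names here are hypothetical; this is
# not how git's diff machinery is actually structured.

def count_loads_naive(commits):
    """Each pairwise diff loads both of its trees from scratch."""
    loads = 0
    for _newer, _older in zip(commits, commits[1:]):
        loads += 2
    return loads


def count_loads_reusing(commits):
    """Keep the older tree of the last diff; it is the newer tree of the next."""
    loads = 0
    cached = None  # revision whose parsed tree is still in memory
    for newer, older in zip(commits, commits[1:]):
        if cached != newer:
            loads += 1  # newer side was not kept from the previous step
        loads += 1      # the older side always has to be loaded
        cached = older
    return loads


history = ["HEAD", "HEAD~1", "HEAD~2", "HEAD~3"]
print(count_loads_naive(history), count_loads_reusing(history))
```

For a linear chain of n revisions the naive walk loads 2(n-1) trees and the reusing walk loads n. The model ignores subtrees, partial parsing, and the memory and bookkeeping cost of keeping a tree around.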

On git.git, "git log --raw" takes 10s and it seems tree object reading
is about 2s. In the ideal case we might be able to cut that to 1s. The tree
parsing code (update_tree_entry) takes about 5s. We might be able to
cut that in half, I'm not entirely sure. But there could be a lot of
work in caching "HEAD~1" and the overhead may turn out too high for
any gain.
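
To make the estimate above concrete, the following Python snippet adds up the quoted figures. The numbers come from the paragraph above; the function name is hypothetical, and the result is only the hoped-for best case, before any caching overhead is counted.

```python
# Back-of-the-envelope arithmetic using the timings quoted above for
# "git log --raw" on git.git. All values are in seconds.

def best_case_total(total, read_now, read_best, parse_now, parse_best):
    """Total runtime if only the two named costs shrink as hoped."""
    return total - (read_now - read_best) - (parse_now - parse_best)


estimate = best_case_total(
    total=10.0,
    read_now=2.0, read_best=1.0,    # tree object reading: 2s -> 1s
    parse_now=5.0, parse_best=2.5,  # update_tree_entry: 5s cut in half
)
print(estimate)  # 6.5
```

So even if both savings were fully realised, the run would go from about 10s to about 6.5s, which is why the caching overhead matters.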
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git blame vs git log --follow performance

2014-01-26 Thread Joe Perches
On Mon, 2014-01-27 at 08:33 +0700, Duy Nguyen wrote:
> On Mon, Jan 27, 2014 at 4:10 AM, Joe Perches wrote:
> > For instance (using the Linus' linux kernel git):
> >
> > $ time git log --follow -- drivers/firmware/google/Kconfig > /dev/null
> >
> > real    0m42.329s
> > user    0m40.984s
> > sys     0m0.792s
> >
> > $ time git blame -- drivers/firmware/google/Kconfig > /dev/null
> >
> > real    0m0.963s
> > user    0m0.860s
> > sys     0m0.096s
> >
> 
> It's not fair to compare blame and log. If you compare, compare it to
> the non-follow version

Perhaps not, but git blame does follow renames.

$ git blame --help
[]
The origin of lines is automatically followed across
whole-file renames (currently there is no option to
turn the rename-following off). To follow lines moved
from one file to another, or to follow lines that were
copied and pasted from another file, etc., see the -C
and -M options.

> I tested a version with rename detection logic removed. It did not
> change the timing significantly. To improve --follow I think we need
> to do something about path filtering.

Perhaps the log history could stop being read when
a commit is found that creates the file without
another file being deleted in the same commit.
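
The early-stop suggestion above can be sketched as a toy history walk in Python. This is only an illustration under strong assumptions: commits are plain records of added, deleted, and modified paths, history is linear, and a deletion in the same commit as an addition is simply taken to be the rename source (real rename detection compares content). The `Commit` class and `follow` function are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    name: str
    added: set = field(default_factory=set)
    deleted: set = field(default_factory=set)
    modified: set = field(default_factory=set)


def follow(commits, path):
    """Walk newest-first; return (commits touching path, commits examined)."""
    touched, examined = [], 0
    for c in commits:
        examined += 1
        if path in c.modified:
            touched.append(c.name)
        elif path in c.added:
            touched.append(c.name)
            if not c.deleted:
                break  # created with nothing deleted: no rename source, stop
            # Toy assumption: the first deleted path is the rename source.
            path = sorted(c.deleted)[0]
    return touched, examined


history = [
    Commit("c4", modified={"b.txt"}),
    Commit("c3", added={"b.txt"}, deleted={"a.txt"}),  # rename a.txt -> b.txt
    Commit("c2", modified={"other"}),
    Commit("c1", added={"a.txt"}),                     # plain creation
    Commit("c0", added={"other"}),
]
print(follow(history, "b.txt"))  # (['c4', 'c3', 'c1'], 4)
```

In this example the walk stops at c1 and never examines c0. The sketch does not handle merges, or a file that was deleted and later re-created under the same name, so it is not a claim about what git could safely do.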




Re: git blame vs git log --follow performance

2014-01-26 Thread Duy Nguyen
On Mon, Jan 27, 2014 at 4:10 AM, Joe Perches wrote:
> For instance (using the Linus' linux kernel git):
>
> $ time git log --follow -- drivers/firmware/google/Kconfig > /dev/null
>
> real    0m42.329s
> user    0m40.984s
> sys     0m0.792s
>
> $ time git blame -- drivers/firmware/google/Kconfig > /dev/null
>
> real    0m0.963s
> user    0m0.860s
> sys     0m0.096s
>

It's not fair to compare blame and log. If you compare, compare it to
the non-follow version:

$ time git log --follow -- drivers/firmware/google/Kconfig > /dev/null

real    0m35.552s
user    0m35.120s
sys     0m0.383s

$ time git log -- drivers/firmware/google/Kconfig > /dev/null

real    0m4.366s
user    0m4.215s
sys     0m0.144s

Although, because we need to detect renames, we can't really filter to
one path. So the baseline is more like

$ time git log > /dev/null

real    0m29.338s
user    0m28.485s
sys     0m0.813s

with rename detection taking some more time.
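
To put numbers on the comparison above, this small Python snippet converts the three "real" times from this message to seconds and compares them. The helper name `seconds` is hypothetical; the timings are the ones shown above.

```python
import re


def seconds(t):
    """Convert a `time`-style value such as "0m35.552s" to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", t)
    if m is None:
        raise ValueError(f"unrecognised time value: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


follow = seconds("0m35.552s")      # git log --follow -- <path>
path_only = seconds("0m4.366s")    # git log -- <path>
unfiltered = seconds("0m29.338s")  # git log

print(f"--follow vs path-limited log: {follow / path_only:.1f}x")
print(f"--follow minus unfiltered log: {follow - unfiltered:.3f}s")
```

Measured against the path-limited log, --follow is roughly eight times slower; measured against the unfiltered log, the extra cost is a little over six seconds.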

> Perhaps adding a whole-file rename option to the
> "git log" history simplification mechanism could
> help?
>
> Thoughts?

I tested a version with rename detection logic removed. It did not
change the timing significantly. To improve --follow I think we need
to do something about path filtering.
-- 
Duy


git blame vs git log --follow performance

2014-01-26 Thread Joe Perches
Hi.

Is there something that can be done about improving
git log --follow --  performance to be nearly
equivalent speed to git blame --  ?

The overall cpu time taken for these 2 commands that
track individual file history can be quite different.

git log --follow -- 
and
git blame -- 

It seems that there can be a couple orders of magnitude
delta in the overall time taken.

For instance (using the Linus' linux kernel git):

$ time git log --follow -- drivers/firmware/google/Kconfig > /dev/null

real    0m42.329s
user    0m40.984s
sys     0m0.792s

$ time git blame -- drivers/firmware/google/Kconfig > /dev/null

real    0m0.963s
user    0m0.860s
sys     0m0.096s

This particular file has never been renamed.

Looking at the output on screen, there does seem to
be 25+ seconds of cpu time consumed after the initial
(last shown) commit that introduces this file.

Perhaps adding a whole-file rename option to the
"git log" history simplification mechanism could
help?

Thoughts?
