On 4/3/2018 6:18 AM, Ævar Arnfjörð Bjarmason wrote:
On Tue, Apr 03 2018, Lars Schneider wrote:
What is the state of this series? I can't find it in git/git nor in
git-for-windows/git. I think Stolee mentioned the config in
his Git Merge talk [1] and I was about to test it/roll it out :-)
It's in the gvfs branch of g...@github.com:Microsoft/git.git, i.e. it's
not in Git for Windows, but used in Microsoft's own in-house version
used for Windows.git.

Thanks for adding me to CC. I mentioned it in my talk because that was one thing we shipped internally as a "quick fix" until we could do the right thing.

If I remember correctly, Jeff abandoned shipping this upstream because it did have the feel of a hack and we wanted to see if users used the config setting or really cared about the output values. We saw fast adoption of the feature and even turned the config setting on automatically in the following version of GVFS.

I may be misunderstanding this feature, but my impression was that it
was a kludge as a workaround until the commit graph code landed, because
once we have that then surely we can just cheaply report the actual (or
approximate?) number in the common case, but of course it may still be
slow if your commit graph file is out of date.

You are correct that the commit-graph file may be out of date, causing slower performance. Even worse: the current graph patch only provides a constant-multiple speedup (still walking the same number of commits, but each commit is parsed much faster).

Speaking of our GVFS-specific fork [0], the 'gvfs' branch was updated just yesterday with a couple of changes that I am prepping for submission upstream:

* Lazy-load trees when parsing commits from commit-graph [1]
* Compute and consume generation numbers [2]

Each of these will speed up this ahead/behind calculation in different ways. [1] makes the cost of loading each commit a bit faster, saving up to 20% overall. [2] uses generation numbers in paint_down_to_common() to make the while() condition O(1) instead of O(Q) where Q is the size of the priority queue. The Windows repo is particularly "wide" with many parallel branches being merged in complicated ways, so the queue becomes quite large. This use of generation numbers saves about 4% on some ahead/behind calculations. This speedup is modest, but the existing code already made good use of limiting the commit walk to be mostly the "important" commits.

The real benefit of generation numbers will manifest in a way to make --topo-order much faster when rendering a small number of commits.

The generation numbers _could_ be used to approximate the ahead/behind calculation in the following way: When comparing A and B, and gen(A) < gen(B), then A is at least (gen(B) - gen(A)) behind. That's the only information that can be gathered directly from those values, but may be enough to short circuit an exact count.

To truly accelerate these ahead/behind calculations to be sub-linear* in the ahead/behind counts, we would need a bitmap-based approach. The object-reachability bitmap is a non-starter for client machines in the Windows repo, but perhaps a commit-reachability bitmap could be interesting. Performing set operations on the bitmaps could more quickly answer these questions. Just thinking about it makes me want to go down a deep rabbit hole, investigating ways to compute, store, and use these bitmaps. However: let's wait and see how necessary it is as the commit-graph feature stabilizes. (*These bitmap approaches are not guaranteed to be sub-linear, because it may include iterating through a list of O(N) bits, but good run-length encodings will likely make the count operation very fast, even with a set-difference operation included.)

There are too many fun things to work on, not enough time!

Thanks,
-Stolee

[0] https://github.com/microsoft/git
    Fork of GitForWindows that ships to Windows developers

[1] https://github.com/Microsoft/git/commit/29114bf86f591f5c87075f779a1faa2d0f17b92f     Lazy-load trees when parsing commits from commit-graph (accidentally squashed to one commit)

[2] https://github.com/microsoft/git/compare/879b7d3b1bddea2587b28cdd656c9c655018683a...a0731ca93a35fd042560c4b30e8e0edbdfa4bf9f
    Compute and consume generation numbers

Reply via email to