On Mon, Dec 18, 2017 at 08:18:19PM +0100, René Scharfe wrote:

> > The root of the matter is that the revision-walking code doesn't clean
> > up after itself. In every case, the caller is just saving these to clean
> > up commit marks, isn't it?
> 
> bundle also checks if the pending objects exist.

Thanks, I missed that one. So just adding a feature to clean up commit
marks wouldn't be sufficient to cover that case.

> > That sidesteps all of the memory ownership issues by just creating a
> > copy. That's less efficient, but I'd be surprised if it matters in
> > practice (we tend to do one or two revisions per process, there don't
> > tend to be a lot of pending tips, and we're really just talking about
> > copying some pointers here).
> [...]
> I don't know if there can be real-world use cases with millions of
> entries (when it would start to hurt).

I've seen repos which have tens of thousands of tags. Something like
"rev-list --all" would have tens of thousands of pending objects.
I think in theory it's limited to the number of objects (though in
practice more like the number of commits).

I'd note also that for most uses we don't need a full object_array. You
really just need a pointer to the "struct object" to wipe its flags.

So there we might waste 8 bytes per object in the worst case. But bear
in mind that the process is wasting a lot more than that per "struct
commit" that we're holding. And versus the existing scheme, it's only
for the moment until prepare_revision_walk() frees the old pending list.
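
To make the "just copy some pointers" idea concrete, here is a rough,
untested-against-git sketch. The structs are simplified stand-ins for the
real ones in object.h, and save_pending_tips() and clear_tip_flags() are
hypothetical names, not existing functions. It only shows that one pointer
per pending entry (8 bytes on a 64-bit system) is enough to wipe flags later:

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for git's types (assumption: the real ones have more fields). */
struct object {
	unsigned flags;
};

struct object_array_entry {
	struct object *item;
	const char *name;
};

struct object_array {
	unsigned int nr;
	unsigned int alloc;
	struct object_array_entry *objects;
};

/*
 * Hypothetical helper: copy only the object pointers out of the pending
 * list, so the caller still has them after the list itself is cleared.
 * Returns NULL if the list is empty or allocation fails; the caller
 * frees the result.
 */
static struct object **save_pending_tips(const struct object_array *pending)
{
	struct object **tips;
	unsigned int i;

	if (!pending->nr)
		return NULL;
	tips = malloc(pending->nr * sizeof(*tips));
	if (!tips)
		return NULL;
	for (i = 0; i < pending->nr; i++)
		tips[i] = pending->objects[i].item;
	return tips;
}

/* Hypothetical helper: clear the bits in "mask" on each saved tip. */
static void clear_tip_flags(struct object **tips, unsigned int nr, unsigned mask)
{
	unsigned int i;

	assert(tips || !nr);
	for (i = 0; i < nr; i++)
		tips[i]->flags &= ~mask;
}
```

This clears flags on the tips only. A real caller would presumably walk
from each saved tip (e.g., via clear_commit_marks()) rather than touch just
the tip itself, and it would not cover the bundle case, which needs more
than the object pointer.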

> Why does prepare_revision_walk() clear the list of pending objects at
> all?  Assuming the list is append-only then perhaps remembering the
> last handled index would suffice.

I assume it was mostly to clean up after itself, since there's no
explicit "I'm done with the traversal" function. But as I said earlier,
I'd be surprised if a revision walk doesn't leave some allocated cruft
in rev_info these days (e.g., pathspec cruft). In practice it doesn't
matter much because we don't do arbitrary numbers of traversals in
a single process.
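
For reference, the "remember the last handled index" alternative might look
roughly like the sketch below. This is only an illustration of the idea
under the append-only assumption; struct pending_list, its "handled" field,
and take_unhandled() are all made up here and do not exist in git:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical append-only list that is never cleared by the walker. */
struct pending_list {
	unsigned int nr;      /* number of entries appended so far */
	unsigned int handled; /* entries below this index were already processed */
	void **items;
};

/*
 * Hand back the entries appended since the last call and mark them as
 * handled. Stores the number of new entries in *count and returns a
 * pointer to the first one, or NULL if there is nothing new.
 */
static void **take_unhandled(struct pending_list *list, unsigned int *count)
{
	void **first;

	assert(list->handled <= list->nr);
	*count = list->nr - list->handled;
	if (!*count)
		return NULL;
	first = list->items + list->handled;
	list->handled = list->nr;
	return first;
}
```

The caller keeps the full list around for later cleanup, at the cost of the
list never shrinking on its own.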

-Peff
