Re: [PATCH 09/13] refs: introduce an iterator interface

2016-06-03 Thread Michael Haggerty
On 06/01/2016 01:12 AM, Eric Sunshine wrote:
> On Tue, May 31, 2016 at 3:59 AM, Michael Haggerty  
> wrote:
>> On 05/31/2016 07:29 AM, Eric Sunshine wrote:
>>> On Mon, May 30, 2016 at 3:55 AM, Michael Haggerty  
>>> wrote:
 +struct ref_iterator *empty_ref_iterator_begin(void);
 +
 +/*
 + * Return true iff ref_iterator is an empty_ref_iterator.
 + */
 +int is_empty_ref_iterator(struct ref_iterator *ref_iterator);
>>>
>>> I can see that you used this function as an optimization or
>>> convenience in overlay_ref_iterator_begin(), but do you expect it to
>>> be generally useful otherwise? Is it worth publishing? Do you have
>>> other use-cases in mind?
>>
>> It is only "published" within the refs module, in refs/refs-internal.h.
>> This header file is not meant to be used by code outside of the refs module.
> 
> Ah, I forgot about that. In that case, it's probably less of an issue.
> 
>> My thinking was that it might be useful to other reference backends. The
>> function is pretty safe for anybody to call, though I admit that it is
>> not very general.
>>
>> I don't have a strong feeling either way. If nobody else chimes in, I'll
>> remove it from the header file as you suggested. We can always add it
>> back if somebody needs it.
> 
> I don't feel strongly about it either.

OK then, I'll leave it as-is.

>>> Also, can you explain why the merge iterator doesn't also perform the
>>> optimization/convenience of checking if one iterator is an empty
>>> iterator?
>>
>> That's because the merge iterator doesn't know what its select function
>> will do. For example, you could imagine an "intersect" select function
>> that only lets through references that were in *both* sub-iterators. In
>> that case, your suggested "optimization" would be incorrect.
> 
> Makes sense. Thanks for explaining. I wonder if this deserves a
> comment somewhere in code or commit message to make the situation
> clear to a future developer who might think it a good idea to promote
> the "optimization" to the merge iterator.

Good idea. I'll add a comment.

> [...]

Thanks,
Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-06-02 Thread Michael Haggerty
On 06/02/2016 12:08 PM, Duy Nguyen wrote:
> On Mon, May 30, 2016 at 2:55 PM, Michael Haggerty  
> wrote:
>> Currently, the API for iterating over references is via a family of
>> for_each_ref()-type functions that invoke a callback function for each
>> selected reference. All of these eventually call do_for_each_ref(),
>> which knows how to do one thing: iterate in parallel through two
>> ref_caches, one for loose and one for packed refs, giving loose
>> references precedence over packed refs. This is rather complicated code,
>> and is quite specialized to the files backend. It also requires callers
>> to encapsulate their work into a callback function, which often means
>> that they have to define and use a "cb_data" struct to manage their
>> context.
>>
>> The current design is already bursting at the seams, and will become
>> even more awkward in the upcoming world of multiple reference storage
>> backends:
>>
>> * Per-worktree vs. shared references are currently handled via a kludge
>>   in git_path() rather than iterating over each part of the reference
>>   namespace separately and merging the results. This kludge will cease
>>   to work when we have multiple reference storage backends.
> 
> Question from a refs user. Right now worktree.c:get_worktrees() peeks
> directly to "$GIT_DIR/worktrees/xxx/HEAD" and parses the content
> itself, something that I  promised to fix but never got around to do
> it. Will we have an iterator to go through all worktrees' HEAD, or
> will  there be an API to say "resolve ref HEAD from worktree XYZ"?

My preference is that there is a way to say "create a ref_store object
representing the loose references stored physically under
"$GIT_DIR/worktrees/xxx". Then that ref_store could be asked to read its
`HEAD` (or iterate over all of the refs under that path or whatever).
You could even write `HEAD` through the same ref_store.

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-06-02 Thread Duy Nguyen
On Mon, May 30, 2016 at 2:55 PM, Michael Haggerty  wrote:
> Currently, the API for iterating over references is via a family of
> for_each_ref()-type functions that invoke a callback function for each
> selected reference. All of these eventually call do_for_each_ref(),
> which knows how to do one thing: iterate in parallel through two
> ref_caches, one for loose and one for packed refs, giving loose
> references precedence over packed refs. This is rather complicated code,
> and is quite specialized to the files backend. It also requires callers
> to encapsulate their work into a callback function, which often means
> that they have to define and use a "cb_data" struct to manage their
> context.
>
> The current design is already bursting at the seams, and will become
> even more awkward in the upcoming world of multiple reference storage
> backends:
>
> * Per-worktree vs. shared references are currently handled via a kludge
>   in git_path() rather than iterating over each part of the reference
>   namespace separately and merging the results. This kludge will cease
>   to work when we have multiple reference storage backends.

Question from a refs user. Right now worktree.c:get_worktrees() peeks
directly to "$GIT_DIR/worktrees/xxx/HEAD" and parses the content
itself, something that I  promised to fix but never got around to do
it. Will we have an iterator to go through all worktrees' HEAD, or
will  there be an API to say "resolve ref HEAD from worktree XYZ"?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-31 Thread Eric Sunshine
On Tue, May 31, 2016 at 3:59 AM, Michael Haggerty  wrote:
> On 05/31/2016 07:29 AM, Eric Sunshine wrote:
>> On Mon, May 30, 2016 at 3:55 AM, Michael Haggerty  
>> wrote:
>>> +struct ref_iterator *empty_ref_iterator_begin(void);
>>> +
>>> +/*
>>> + * Return true iff ref_iterator is an empty_ref_iterator.
>>> + */
>>> +int is_empty_ref_iterator(struct ref_iterator *ref_iterator);
>>
>> I can see that you used this function as an optimization or
>> convenience in overlay_ref_iterator_begin(), but do you expect it to
>> be generally useful otherwise? Is it worth publishing? Do you have
>> other use-cases in mind?
>
> It is only "published" within the refs module, in refs/refs-internal.h.
> This header file is not meant to be used by code outside of the refs module.

Ah, I forgot about that. In that case, it's probably less of an issue.

> My thinking was that it might be useful to other reference backends. The
> function is pretty safe for anybody to call, though I admit that it is
> not very general.
>
> I don't have a strong feeling either way. If nobody else chimes in, I'll
> remove it from the header file as you suggested. We can always add it
> back if somebody needs it.

I don't feel strongly about it either.

>> Also, can you explain why the merge iterator doesn't also perform the
>> optimization/convenience of checking if one iterator is an empty
>> iterator?
>
> That's because the merge iterator doesn't know what its select function
> will do. For example, you could imagine an "intersect" select function
> that only lets through references that were in *both* sub-iterators. In
> that case, your suggested "optimization" would be incorrect.

Makes sense. Thanks for explaining. I wonder if this deserves a
comment somewhere in code or commit message to make the situation
clear to a future developer who might think it a good idea to promote
the "optimization" to the merge iterator.

>>> +/*
>>> + * An iterator consisting of the union of the entries from iter0 and
>>> + * iter1. If there are entries common to the two sub-iterators, use
>>> + * the one from iter1. Each iterator must iterate over its entries in
>>> + * strcmp() order by refname for this to work.
>>> + *
>>> + * The new iterator takes ownership of its arguments and frees them
>>> + * when the iteration is over. As a convenience to callers, if iter0
>>> + * or iter1 is_empty_ref_iterator(), then abort that one immediately
>>> + * and return the other iterator directly, without wrapping it.
>>> + */
>>> +struct ref_iterator *overlay_ref_iterator_begin(struct ref_iterator *iter0,
>>> +   struct ref_iterator *iter1);
>>
>> When reading about the overlay iterator (both code and documentation),
>> my expectation was that iter0 would shadow iter1, not the other way
>> around as implemented here. Of course, that's entirely subjective, but
>> the generic names don't provide any useful clues as to which shadows
>> which. Perhaps giving them more meaningful names would help.
>
> That's a good idea. I also found myself having to refer back to the
> documentation to remind myself which was which.
>
> How about I rename them "back" and "front"? I will also reverse the
> order of the arguments.

I had a hard time coming up with better names, which is why I didn't
suggest any in my review. The best I had was "shadower" and
"shadowee", but they are far too similar (and long) for my tastes.
"back" and "front" feel a bit off, but are better than anything I
thought of.

As for the argument order, I can't explain why my expectation was that
it would be the other way around, so I certainly don't insist that
they be swapped.

> (But I will leave the names "iter0" and "iter1" in merge_ref_iterator,
> and also the constants like ITER_SELECT_0, because these don't
> necessarily have the interpretation of "back" and "front".)

Yes, those names are fine in the merge iterator.

>>> +/*
>>> + * Wrap iter0, only letting through the references whose names start
>>> + * with prefix. If trim is set, set iter->refname to the name of the
>>> + * reference with that many characters trimmed off the front;
>>> + * otherwise set it to the full refname. The new iterator takes over
>>> + * ownership of iter0 and frees it when iteration is over. It makes
>>> + * its own copy of prefix.
>>> + *
>>> + * As an convenience to callers, if prefix is the empty string and
>>> + * trim is zero, this function returns iter0 directly, without
>>> + * wrapping it.
>>> + */
>>> +struct ref_iterator *prefix_ref_iterator_begin(struct ref_iterator *iter0,
>>> +  const char *prefix,
>>> +  int trim);
>>
>> Minor: Similarly, when reading the code and documentation, I wondered
>> why this was named 'iter0' when no 'iter1' was in sight. Perhaps name
>> it simply 'iter'.
>
> I found that it got a little bit confusing, because the 

Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-31 Thread Michael Haggerty
On 05/31/2016 08:10 AM, Junio C Hamano wrote:
> Michael Haggerty  writes:
> 
>> This commit introduces a new iteration primitive for references: a
>> ref_iterator. A ref_iterator is a polymorphic object that a reference
>> storage backend can be asked to instantiate. There are three functions
>> that can be applied to a ref_iterator:
>>
>> * ref_iterator_advance(): move to the next reference in the iteration
>> * ref_iterator_abort(): end the iteration before it is exhausted
>> * ref_iterator_peel(): peel the reference currently being looked at
> 
> This part looked somewhat strange in that it makes "peel" sound
> something very special.  Even though I understand why, it made me
> feel uneasy.  I do not think of another operation like peel that may
> want to have such a specialized helper, so I'll let it pass, but the
> primary uneasiness I felt comes from the fact that "iterator-peel"
> is not an essential service of the API that needs for correctness,
> but is a pure optimization (i.e. you could grab a ref from the
> normal iterator call, and then ask "please peel this ref" to the
> usual ref API that does not know anything about iteration).

I agree that this is inelegant. The underlying reason is that the
iterator embodies both the iteration and also the reference currently
being iterated over. So ref_iterator_advance() and ref_iterator_abort()
are really iterator methods, whereas ref_iterator_peel() is really a
method on the current reference. For that matter, the refname etc fields
are conceptually parts of the current reference, too, not of the iteration.

One of my many coding spikes tried to separate the two concepts by
having the iterator give back a separate object representing the
reference. That object would have had a peel() method. But I wasn't
happy with that approach:

* Anything that made it look like the reference object had a lifetime
independent of that of the iterator seemed like asking for trouble.

* The peel() method would probably have to be a real method that knows
where the reference came from (as opposed to a simple function) because
of things like prefix_ref_iterator, where the refname stored in the
ref_iterator might have had its prefix stripped off.

The complexity level was getting beyond what I felt was tenable in C, so
I made the compromise that you see.

If I had more time I'd do some benchmarking to see whether the "peeling"
optimization is really still needed. It comes from the days when
references were stored in giant lists. Now that we store refs in tries,
and each level gets sorted as part of the iteration anyway, the lookup
might be fast enough that nobody would care.

> [...]

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-31 Thread Michael Haggerty
On 05/31/2016 07:29 AM, Eric Sunshine wrote:
> On Mon, May 30, 2016 at 3:55 AM, Michael Haggerty  
> wrote:
>> [...]
> [...]
> Either:
> 
> s/false/something other than ITER_OK/
> 
> or:
> 
> s/false/ITER_DONE or ITER_ERROR/

Thanks.

>> +int ref_iterator_advance(struct ref_iterator *ref_iterator);
>> +
>> +/*
>> + * An iterator over nothing (its first ref_iterator_advance() call
>> + * returns 0).
>> + */
> 
> s/0/ITER_DONE/

Thanks. I guess you can guess what an earlier draft of this interface
looked like :-)

>> +struct ref_iterator *empty_ref_iterator_begin(void);
>> +
>> +/*
>> + * Return true iff ref_iterator is an empty_ref_iterator.
>> + */
>> +int is_empty_ref_iterator(struct ref_iterator *ref_iterator);
> 
> I can see that you used this function as an optimization or
> convenience in overlay_ref_iterator_begin(), but do you expect it to
> be generally useful otherwise? Is it worth publishing? Do you have
> other use-cases in mind?

It is only "published" within the refs module, in refs/refs-internal.h.
This header file is not meant to be used by code outside of the refs module.

My thinking was that it might be useful to other reference backends. The
function is pretty safe for anybody to call, though I admit that it is
not very general.

I don't have a strong feeling either way. If nobody else chimes in, I'll
remove it from the header file as you suggested. We can always add it
back if somebody needs it.

> Also, can you explain why the merge iterator doesn't also perform the
> optimization/convenience of checking if one iterator is an empty
> iterator?

That's because the merge iterator doesn't know what its select function
will do. For example, you could imagine an "intersect" select function
that only lets through references that were in *both* sub-iterators. In
that case, your suggested "optimization" would be incorrect.

Incidentally, that's also why I decided to leave the select function in
charge even after one or both of the sub-iterators is exhausted—because
it lets merge_ref_iterator implement more diverse behavior.

>> +/*
>> + * Iterate over the intries from iter0 and iter1, with the values
> 
> s/intries/entries/

Thanks.

>> + * interleaved as directed by the select function. The iterator takes
>> + * ownership of iter0 and iter1 and frees them when the iteration is
>> + * over.
>> + */
>> +struct ref_iterator *merge_ref_iterator_begin(
>> +   struct ref_iterator *iter0, struct ref_iterator *iter1,
>> +   ref_iterator_select_fn *select, void *cb_data);
>> +
>> +/*
>> + * An iterator consisting of the union of the entries from iter0 and
>> + * iter1. If there are entries common to the two sub-iterators, use
>> + * the one from iter1. Each iterator must iterate over its entries in
>> + * strcmp() order by refname for this to work.
>> + *
>> + * The new iterator takes ownership of its arguments and frees them
>> + * when the iteration is over. As a convenience to callers, if iter0
>> + * or iter1 is_empty_ref_iterator(), then abort that one immediately
>> + * and return the other iterator directly, without wrapping it.
>> + */
>> +struct ref_iterator *overlay_ref_iterator_begin(struct ref_iterator *iter0,
>> +   struct ref_iterator *iter1);
> 
> When reading about the overlay iterator (both code and documentation),
> my expectation was that iter0 would shadow iter1, not the other way
> around as implemented here. Of course, that's entirely subjective, but
> the generic names don't provide any useful clues as to which shadows
> which. Perhaps giving them more meaningful names would help.

That's a good idea. I also found myself having to refer back to the
documentation to remind myself which was which.

How about I rename them "back" and "front"? I will also reverse the
order of the arguments.

(But I will leave the names "iter0" and "iter1" in merge_ref_iterator,
and also the constants like ITER_SELECT_0, because these don't
necessarily have the interpretation of "back" and "front".)

>> +/*
>> + * Wrap iter0, only letting through the references whose names start
>> + * with prefix. If trim is set, set iter->refname to the name of the
>> + * reference with that many characters trimmed off the front;
>> + * otherwise set it to the full refname. The new iterator takes over
>> + * ownership of iter0 and frees it when iteration is over. It makes
>> + * its own copy of prefix.
>> + *
>> + * As an convenience to callers, if prefix is the empty string and
>> + * trim is zero, this function returns iter0 directly, without
>> + * wrapping it.
>> + */
>> +struct ref_iterator *prefix_ref_iterator_begin(struct ref_iterator *iter0,
>> +  const char *prefix,
>> +  int trim);
> 
> Minor: Similarly, when reading the code and documentation, I wondered
> why this was named 'iter0' when no 'iter1' was in sight. Perhaps 

Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-31 Thread Junio C Hamano
Michael Haggerty  writes:

> This commit introduces a new iteration primitive for references: a
> ref_iterator. A ref_iterator is a polymorphic object that a reference
> storage backend can be asked to instantiate. There are three functions
> that can be applied to a ref_iterator:
>
> * ref_iterator_advance(): move to the next reference in the iteration
> * ref_iterator_abort(): end the iteration before it is exhausted
> * ref_iterator_peel(): peel the reference currently being looked at

This part looked somewhat strange in that it makes "peel" sound
something very special.  Even though I understand why, it made me
feel uneasy.  I do not think of another operation like peel that may
want to have such a specialized helper, so I'll let it pass, but the
primary uneasiness I felt comes from the fact that "iterator-peel"
is not an essential service of the API that needs for correctness,
but is a pure optimization (i.e. you could grab a ref from the
normal iterator call, and then ask "please peel this ref" to the
usual ref API that does not know anything about iteration).

> Iterating using a ref_iterator leaves the flow of control in the hands
> of the caller, which means that ref_iterators from multiple
> sources (e.g., loose and packed refs) can be composed and presented to
> the world as a single compound ref_iterator.

Yes, this is a very good move.  I am happy to see us going in this
direction.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Eric Sunshine
On Mon, May 30, 2016 at 3:55 AM, Michael Haggerty  wrote:
> [...]
> This commit introduces a new iteration primitive for references: a
> ref_iterator. A ref_iterator is a polymorphic object that a reference
> storage backend can be asked to instantiate. There are three functions
> that can be applied to a ref_iterator:
>
> * ref_iterator_advance(): move to the next reference in the iteration
> * ref_iterator_abort(): end the iteration before it is exhausted
> * ref_iterator_peel(): peel the reference currently being looked at
> [...]
> Signed-off-by: Michael Haggerty 
> ---
> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> @@ -249,6 +249,199 @@ int rename_ref_available(const char *oldname, const 
> char *newname);
> +/*
> + * Advance the iterator to the first or next item and return ITER_OK.
> + * If the iteration is exhausted, free the resources associated with
> + * the ref_iterator and return ITER_DONE. On errors, free the iterator
> + * resources and return ITER_ERROR. It is a bug to use ref_iterator or
> + * call this function again after it has returned false.
> + */

Either:

s/false/something other than ITER_OK/

or:

s/false/ITER_DONE or ITER_ERROR/

> +int ref_iterator_advance(struct ref_iterator *ref_iterator);
> +
> +/*
> + * An iterator over nothing (its first ref_iterator_advance() call
> + * returns 0).
> + */

s/0/ITER_DONE/

> +struct ref_iterator *empty_ref_iterator_begin(void);
> +
> +/*
> + * Return true iff ref_iterator is an empty_ref_iterator.
> + */
> +int is_empty_ref_iterator(struct ref_iterator *ref_iterator);

I can see that you used this function as an optimization or
convenience in overlay_ref_iterator_begin(), but do you expect it to
be generally useful otherwise? Is it worth publishing? Do you have
other use-cases in mind?

Also, can you explain why the merge iterator doesn't also perform the
optimization/convenience of checking if one iterator is an empty
iterator?

> +/*
> + * Iterate over the intries from iter0 and iter1, with the values

s/intries/entries/

> + * interleaved as directed by the select function. The iterator takes
> + * ownership of iter0 and iter1 and frees them when the iteration is
> + * over.
> + */
> +struct ref_iterator *merge_ref_iterator_begin(
> +   struct ref_iterator *iter0, struct ref_iterator *iter1,
> +   ref_iterator_select_fn *select, void *cb_data);
> +
> +/*
> + * An iterator consisting of the union of the entries from iter0 and
> + * iter1. If there are entries common to the two sub-iterators, use
> + * the one from iter1. Each iterator must iterate over its entries in
> + * strcmp() order by refname for this to work.
> + *
> + * The new iterator takes ownership of its arguments and frees them
> + * when the iteration is over. As a convenience to callers, if iter0
> + * or iter1 is_empty_ref_iterator(), then abort that one immediately
> + * and return the other iterator directly, without wrapping it.
> + */
> +struct ref_iterator *overlay_ref_iterator_begin(struct ref_iterator *iter0,
> +   struct ref_iterator *iter1);

When reading about the overlay iterator (both code and documentation),
my expectation was that iter0 would shadow iter1, not the other way
around as implemented here. Of course, that's entirely subjective, but
the generic names don't provide any useful clues as to which shadows
which. Perhaps giving them more meaningful names would help.

> +/*
> + * Wrap iter0, only letting through the references whose names start
> + * with prefix. If trim is set, set iter->refname to the name of the
> + * reference with that many characters trimmed off the front;
> + * otherwise set it to the full refname. The new iterator takes over
> + * ownership of iter0 and frees it when iteration is over. It makes
> + * its own copy of prefix.
> + *
> + * As an convenience to callers, if prefix is the empty string and
> + * trim is zero, this function returns iter0 directly, without
> + * wrapping it.
> + */
> +struct ref_iterator *prefix_ref_iterator_begin(struct ref_iterator *iter0,
> +  const char *prefix,
> +  int trim);

Minor: Similarly, when reading the code and documentation, I wondered
why this was named 'iter0' when no 'iter1' was in sight. Perhaps name
it simply 'iter'.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Michael Haggerty
On 05/30/2016 06:57 PM, Ramsay Jones wrote:
> 
> 
> On 30/05/16 16:22, Ramsay Jones wrote:
>>
>>
>> On 30/05/16 08:55, Michael Haggerty wrote:
>> [snip]
>>
>>>  /* Reference is a symbolic reference. */
>>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>>> index 8ab4d5f..dbf1587 100644
>>> --- a/refs/files-backend.c
>>> +++ b/refs/files-backend.c
>>> [...]
>>> +   sort_ref_dir(dir);
>>
>> do you need to sort here ...
>>> [...]
>>> +   sort_ref_dir(level->dir);
>>
>> ... given that you sort here?
> 
> I had intended to say 'or vice versa' here. When I wrote this, I had not
> finished reading this patch (let alone the series). Now, I suspect that
> you can simply drop this 'sort_ref_dir()' call site. Unless I've misread
> the code, of course! ;-)

Yes, you are right. Thanks for catching this! I'll fix it in v2.

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Ramsay Jones


On 30/05/16 16:22, Ramsay Jones wrote:
> 
> 
> On 30/05/16 08:55, Michael Haggerty wrote:
> [snip]
> 
>>  /* Reference is a symbolic reference. */
>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index 8ab4d5f..dbf1587 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> @@ -1,6 +1,7 @@
>>  #include "../cache.h"
>>  #include "../refs.h"
>>  #include "refs-internal.h"
>> +#include "../iterator.h"
>>  #include "../lockfile.h"
>>  #include "../object.h"
>>  #include "../dir.h"
>> @@ -704,6 +705,154 @@ static void prime_ref_dir(struct ref_dir *dir)
>>  }
>>  }
>>  
>> +/*
>> + * A level in the reference hierarchy that is currently being iterated
>> + * through.
>> + */
>> +struct cache_ref_iterator_level {
>> +/*
>> + * The ref_dir being iterated over at this level. The ref_dir
>> + * is sorted before being stored here.
>> + */
>> +struct ref_dir *dir;
>> +
>> +/*
>> + * The index of the current entry within dir (which might
>> + * itself be a directory). If index == -1, then the iteration
>> + * hasn't yet begun. If index == dir->nr, then the iteration
>> + * through this level is over.
>> + */
>> +int index;
>> +};
>> +
>> +/*
>> + * Represent an iteration through a ref_dir in the memory cache. The
>> + * iteration recurses through subdirectories.
>> + */
>> +struct cache_ref_iterator {
>> +struct ref_iterator base;
>> +
>> +/*
>> + * The number of levels currently on the stack. This is always
>> + * at least 1, because when it becomes zero the iteration is
>> + * ended and this struct is freed.
>> + */
>> +size_t levels_nr;
>> +
>> +/* The number of levels that have been allocated on the stack */
>> +size_t levels_alloc;
>> +
>> +/*
>> + * A stack of levels. levels[0] is the uppermost level that is
>> + * being iterated over in this iteration. (This is not
>> + * necessary the top level in the references hierarchy. If we
>> + * are iterating through a subtree, then levels[0] will hold
>> + * the ref_dir for that subtree, and subsequent levels will go
>> + * on from there.)
>> + */
>> +struct cache_ref_iterator_level *levels;
>> +};
>> +
>> +static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
>> +{
>> +struct cache_ref_iterator *iter =
>> +(struct cache_ref_iterator *)ref_iterator;
>> +
>> +while (1) {
>> +struct cache_ref_iterator_level *level =
>> +>levels[iter->levels_nr - 1];
>> +struct ref_dir *dir = level->dir;
>> +struct ref_entry *entry;
>> +
>> +if (level->index == -1)
>> +sort_ref_dir(dir);
> 
> do you need to sort here ...
>> +
>> +if (++level->index == level->dir->nr) {
>> +/* This level is exhausted; pop up a level */
>> +if (--iter->levels_nr == 0)
>> +return ref_iterator_abort(ref_iterator);
>> +
>> +continue;
>> +}
>> +
>> +entry = dir->entries[level->index];
>> +
>> +if (entry->flag & REF_DIR) {
>> +/* push down a level */
>> +ALLOC_GROW(iter->levels, iter->levels_nr + 1,
>> +   iter->levels_alloc);
>> +
>> +level = >levels[iter->levels_nr++];
>> +level->dir = get_ref_dir(entry);
>> +sort_ref_dir(level->dir);
> 
> ... given that you sort here?

I had intended to say 'or vice versa' here. When I wrote this, I had not
finished reading this patch (let alone the series). Now, I suspect that
you can simply drop this 'sort_ref_dir()' call site. Unless I've misread
the code, of course! ;-)

ATB,
Ramsay Jones


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/13] refs: introduce an iterator interface

2016-05-30 Thread Ramsay Jones


On 30/05/16 08:55, Michael Haggerty wrote:
[snip]

>  /* Reference is a symbolic reference. */
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 8ab4d5f..dbf1587 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1,6 +1,7 @@
>  #include "../cache.h"
>  #include "../refs.h"
>  #include "refs-internal.h"
> +#include "../iterator.h"
>  #include "../lockfile.h"
>  #include "../object.h"
>  #include "../dir.h"
> @@ -704,6 +705,154 @@ static void prime_ref_dir(struct ref_dir *dir)
>   }
>  }
>  
> +/*
> + * A level in the reference hierarchy that is currently being iterated
> + * through.
> + */
> +struct cache_ref_iterator_level {
> + /*
> +  * The ref_dir being iterated over at this level. The ref_dir
> + * is sorted before being stored here.
> +  */
> + struct ref_dir *dir;
> +
> + /*
> +  * The index of the current entry within dir (which might
> +  * itself be a directory). If index == -1, then the iteration
> +  * hasn't yet begun. If index == dir->nr, then the iteration
> +  * through this level is over.
> +  */
> + int index;
> +};
> +
> +/*
> + * Represent an iteration through a ref_dir in the memory cache. The
> + * iteration recurses through subdirectories.
> + */
> +struct cache_ref_iterator {
> + struct ref_iterator base;
> +
> + /*
> +  * The number of levels currently on the stack. This is always
> +  * at least 1, because when it becomes zero the iteration is
> +  * ended and this struct is freed.
> +  */
> + size_t levels_nr;
> +
> + /* The number of levels that have been allocated on the stack */
> + size_t levels_alloc;
> +
> + /*
> +  * A stack of levels. levels[0] is the uppermost level that is
> +  * being iterated over in this iteration. (This is not
> +  * necessary the top level in the references hierarchy. If we
> +  * are iterating through a subtree, then levels[0] will hold
> +  * the ref_dir for that subtree, and subsequent levels will go
> +  * on from there.)
> +  */
> + struct cache_ref_iterator_level *levels;
> +};
> +
> +static int cache_ref_iterator_advance(struct ref_iterator *ref_iterator)
> +{
> + struct cache_ref_iterator *iter =
> + (struct cache_ref_iterator *)ref_iterator;
> +
> + while (1) {
> + struct cache_ref_iterator_level *level =
> + >levels[iter->levels_nr - 1];
> + struct ref_dir *dir = level->dir;
> + struct ref_entry *entry;
> +
> + if (level->index == -1)
> + sort_ref_dir(dir);

do you need to sort here ...
> +
> + if (++level->index == level->dir->nr) {
> + /* This level is exhausted; pop up a level */
> + if (--iter->levels_nr == 0)
> + return ref_iterator_abort(ref_iterator);
> +
> + continue;
> + }
> +
> + entry = dir->entries[level->index];
> +
> + if (entry->flag & REF_DIR) {
> + /* push down a level */
> + ALLOC_GROW(iter->levels, iter->levels_nr + 1,
> +iter->levels_alloc);
> +
> + level = >levels[iter->levels_nr++];
> + level->dir = get_ref_dir(entry);
> + sort_ref_dir(level->dir);

... given that you sort here?

> + level->index = -1;
> + } else {
> + iter->base.refname = entry->name;
> + iter->base.oid = >u.value.oid;
> + iter->base.flags = entry->flag;
> + return ITER_OK;
> + }
> + }
> +}
> +

ATB,
Ramsay Jones


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html