On Mon, Jul 8, 2013 at 6:20 PM, Thomas Gummerer <t.gumme...@gmail.com> wrote:
> Duy Nguyen <pclo...@gmail.com> writes:
>> Putting filter_opts in index_state feels like a bad design. Iterator
>> information should be separated from the iterated object, so that two
>> callers can walk through the same index without stepping on each other
>> (I'm not talking about multithreading, a caller may walk a bit, then
>> the other caller starts walking, then the former caller resumes
>> walking again in a call chain).
> Yes, you're right.  We need the filter_opts to see what part of the
> index has been loaded [1] and which part has been skipped, but it
> shouldn't be used for filtering in the for_each_index_entry function.
> I think there should be two versions of the for_each_index_entry
> function then, where the for_each_index_entry function would behave the
> same way as the for_each_index_entry_filtered function with the
> filter_opts parameter set to NULL:
> for_each_index_entry_filtered(struct index_state *, each_cache_entry_fn, void 
> *cb_data, struct filter_opts *)
> for_each_index_entry(struct index_state *, each_cache_entry_fn, void *cb_data)
> Both of them then should call index_change_filter_opts to make sure all
> the entries that are needed are loaded in the in-memory format.
> Does that make sense?

Hmm.. I was confused actually (documentation on the api would help
greatly). If you already filter at load time, I don't think you need
to match again. The caller asked for filter and it should know what's
in there so for_each_index_entry just goes through all entries without
extra match_pathspec. Or is that what next_index_entry for?
match_pathspec function could be expensive when glob is involved. If
the caller wants extra matching, it could do inside the callback

It seems you could change the filter with index_change_filter_opts. In
v5 the index will be reloaded. What happens when some index entries
area already modified? Do we start to have read-only index "views" and
one read-write view? If partial views are always read-only, perhaps we
just allow the user to create a new index_state (or view) with new
filter and destroy the old one. We don't have to care about changing
or separating filter in that case because the view is the iterator.

I wanted to have a tree-based iterator api, but that seems
incompatible with pre-v5 (or at least adds some overhead on pre-v5 to
rebuild the tree structure). It looks like using pathspec to build a
list of entries, as you did, is a good way to take advantage of
tree-based v5 while maintaining code compatibility with pre-v5. By the
way with tree structure, you could use tree_entry_interesting in
read_index_filtered_v5. I think it's more efficient than

I'm still studying the code. Some of what I wrote here may be totally
wrong due to my lack of understanding. I'll get back to you later if I
find something else.

> [1] That is only important for the new index-v5 file format, which can
>     be loaded partially.  The filter_opts could always be set to NULL,
>     as the whole index is always loaded anyway.
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to