Re: [PATCH 05/22] read-cache: add index reading api

2013-07-09 Thread Thomas Gummerer
Junio C Hamano gits...@pobox.com writes:

 Thomas Gummerer t.gumme...@gmail.com writes:

 The reader often needs to rewind the read-pointer partially while
 walking the index (e.g. next_cache_entry() in unpack-trees.c and how
 the o-cache_bottom position is used throughout the subsystem).  I
 am not sure if this singly-linked list is a good way to go.

 I'm not very familiar with the unpack-trees code, but from a quick look
 the pointer (or position in the cache) is always only moved forward.

 I am more worried about o-cache_bottom processing, where it
 currently is an index into an array.

 With your ce-next_in_list_of_read_entries change, a natural rewrite
 would be to point at the ce with o-cache_bottom, but then that
 would mean you cannot in-place replace the entries like we used to
 be able to in an array based implementation.

 But your series does not seem to touch unpack-trees yet, so I may be
 worried too much before it becomes necessary.

Yes, you're right, as Duy mentioned in the other email I just responded
to it makes sense to keep the index around for now.

I looked at the unpack-trees code a bit, and adding a new api and hiding
index_state-cache[] will probably be a bit harder to do than I
originally thought, so it's best to keep that around for now, as we're
still able to get the benefits from partial loading even if it's not
hidden.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/22] read-cache: add index reading api

2013-07-08 Thread Thomas Gummerer
Duy Nguyen pclo...@gmail.com writes:

 On Sun, Jul 7, 2013 at 3:11 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 +/*
 + * Options by which the index should be filtered when read partially.
 + *
 + * pathspec: The pathspec which the index entries have to match
 + * seen: Used to return the seen parameter from match_pathspec()
 + * max_prefix, max_prefix_len: These variables are set to the longest
 + * common prefix, the length of the longest common prefix of the
 + * given pathspec
 + *
 + * read_staged: used to indicate if the conflicted entries (entries
 + * with a stage) should be included
 + * read_cache_tree: used to indicate if the cache-tree should be read
 + * read_resolve_undo: used to indicate if the resolve undo data should
 + * be read
 + */
 +struct filter_opts {
 +   const char **pathspec;
 +   char *seen;
 +   char *max_prefix;
 +   int max_prefix_len;
 +
 +   int read_staged;
 +   int read_cache_tree;
 +   int read_resolve_undo;
 +};
 +
  struct index_state {
 struct cache_entry **cache;
 unsigned int version;
 @@ -270,6 +297,8 @@ struct index_state {
 struct hash_table name_hash;
 struct hash_table dir_hash;
 struct index_ops *ops;
 +   struct internal_ops *internal_ops;
 +   struct filter_opts *filter_opts;
  };

 ...

 -/* remember to discard_cache() before reading a different cache! */
 -int read_index_from(struct index_state *istate, const char *path)
 +
 +int read_index_filtered_from(struct index_state *istate, const char *path,
 +struct filter_opts *opts)
  {
 int fd, err, i;
 struct stat st_old, st_new;
 @@ -1337,7 +1425,7 @@ int read_index_from(struct index_state *istate, const 
 char *path)
 if (istate-ops-verify_hdr(mmap, mmap_size)  0)
 err = 1;

 -   if (istate-ops-read_index(istate, mmap, mmap_size)  0)
 +   if (istate-ops-read_index(istate, mmap, mmap_size, opts)  
 0)
 err = 1;
 istate-timestamp.sec = st_old.st_mtime;
 istate-timestamp.nsec = ST_MTIME_NSEC(st_old);
 @@ -1345,6 +1433,7 @@ int read_index_from(struct index_state *istate, const 
 char *path)
 die_errno(cannot stat the open index);

 munmap(mmap, mmap_size);
 +   istate-filter_opts = opts;
 if (!index_changed(st_old, st_new)  !err)
 return istate-cache_nr;
 }

 Putting filter_opts in index_state feels like a bad design. Iterator
 information should be separated from the iterated object, so that two
 callers can walk through the same index without stepping on each other
 (I'm not talking about multithreading, a caller may walk a bit, then
 the other caller starts walking, then the former caller resumes
 walking again in a call chain).

Yes, you're right.  We need the filter_opts to see what part of the
index has been loaded [1] and which part has been skipped, but it
shouldn't be used for filtering in the for_each_index_entry function.

I think there should be two versions of the for_each_index_entry
function then, where the for_each_index_entry function would behave the
same way as the for_each_index_entry_filtered function with the
filter_opts parameter set to NULL:
for_each_index_entry_filtered(struct index_state *, each_cache_entry_fn, void 
*cb_data, struct filter_opts *)
for_each_index_entry(struct index_state *, each_cache_entry_fn, void *cb_data)

Both of them then should call index_change_filter_opts to make sure all
the entries that are needed are loaded in the in-memory format.

Does that make sense?

[1] That is only important for the new index-v5 file format, which can
be loaded partially.  The filter_opts could always be set to NULL,
as the whole index is always loaded anyway.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/22] read-cache: add index reading api

2013-07-08 Thread Thomas Gummerer
Duy Nguyen pclo...@gmail.com writes:

 On Sun, Jul 7, 2013 at 3:11 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Add an api for access to the index file.  Currently there is only a very
 basic api for accessing the index file, which only allows a full read of
 the index, and lets the users of the data filter it.  The new index api
 gives the users the possibility to use only part of the index and
 provides functions for iterating over and accessing cache entries.

 This simplifies future improvements to the in-memory format, as changes
 will be concentrated on one file, instead of the whole git source code.

 Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
 ---
  cache.h |  57 +-
  read-cache-v2.c |  96 +++--
  read-cache.c| 108 
 
  read-cache.h|  12 ++-
  4 files changed, 263 insertions(+), 10 deletions(-)

 diff --git a/cache.h b/cache.h
 index 5082b34..d38dfbd 100644
 --- a/cache.h
 +++ b/cache.h
 @@ -127,7 +127,8 @@ struct cache_entry {
 unsigned int ce_flags;
 unsigned int ce_namelen;
 unsigned char sha1[20];
 -   struct cache_entry *next;
 +   struct cache_entry *next; /* used by name_hash */
 +   struct cache_entry *next_ce; /* used to keep a list of cache entries 
 */
 char name[FLEX_ARRAY]; /* more */
  };

 From what I read, doing

 ce = start;
 while (ce) { do(something); ce = next_cache_entry(ce); }

 is the same as

 i = start_index;
 while (i  active_nr) { ce = active_cache[i]; do(something); i++; }

 What's the advantage of using the former over the latter? Do you plan
 to eliminate the latter loop (by hiding struct cache_entry **cache;
 from public index_state structure?

Yes, I wanted to eliminate the latter loop, because it depends on the
in-memory format of the index.  By moving all direct accesses of
index_state-cache to an api it gets easier to change the in-memory
format.  I played a bit with a tree-based in-memory format [1], which
represents the on-disk format of index-v5 more closely, making
modifications and partial-loading simpler.

I've tried switching all those loops to api calls, but that would make
the api too bloated because there is a lot of those loops.  I found it
more sensible to do it this way, leaving the loops how they are, while
making future changes to the in-memory format a lot simpler.

[1] https://github.com/tgummerer/git/blob/index-v5api/read-cache-v5.c#L17
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/22] read-cache: add index reading api

2013-07-08 Thread Duy Nguyen
On Mon, Jul 8, 2013 at 6:20 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Duy Nguyen pclo...@gmail.com writes:
 Putting filter_opts in index_state feels like a bad design. Iterator
 information should be separated from the iterated object, so that two
 callers can walk through the same index without stepping on each other
 (I'm not talking about multithreading, a caller may walk a bit, then
 the other caller starts walking, then the former caller resumes
 walking again in a call chain).

 Yes, you're right.  We need the filter_opts to see what part of the
 index has been loaded [1] and which part has been skipped, but it
 shouldn't be used for filtering in the for_each_index_entry function.

 I think there should be two versions of the for_each_index_entry
 function then, where the for_each_index_entry function would behave the
 same way as the for_each_index_entry_filtered function with the
 filter_opts parameter set to NULL:
 for_each_index_entry_filtered(struct index_state *, each_cache_entry_fn, void 
 *cb_data, struct filter_opts *)
 for_each_index_entry(struct index_state *, each_cache_entry_fn, void *cb_data)

 Both of them then should call index_change_filter_opts to make sure all
 the entries that are needed are loaded in the in-memory format.

 Does that make sense?

Hmm.. I was confused actually (documentation on the api would help
greatly). If you already filter at load time, I don't think you need
to match again. The caller asked for filter and it should know what's
in there so for_each_index_entry just goes through all entries without
extra match_pathspec. Or is that what next_index_entry for?
match_pathspec function could be expensive when glob is involved. If
the caller wants extra matching, it could do inside the callback
function.

It seems you could change the filter with index_change_filter_opts. In
v5 the index will be reloaded. What happens when some index entries
area already modified? Do we start to have read-only index views and
one read-write view? If partial views are always read-only, perhaps we
just allow the user to create a new index_state (or view) with new
filter and destroy the old one. We don't have to care about changing
or separating filter in that case because the view is the iterator.

I wanted to have a tree-based iterator api, but that seems
incompatible with pre-v5 (or at least adds some overhead on pre-v5 to
rebuild the tree structure). It looks like using pathspec to build a
list of entries, as you did, is a good way to take advantage of
tree-based v5 while maintaining code compatibility with pre-v5. By the
way with tree structure, you could use tree_entry_interesting in
read_index_filtered_v5. I think it's more efficient than
match_pathspec.

I'm still studying the code. Some of what I wrote here may be totally
wrong due to my lack of understanding. I'll get back to you later if I
find something else.

 [1] That is only important for the new index-v5 file format, which can
 be loaded partially.  The filter_opts could always be set to NULL,
 as the whole index is always loaded anyway.
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/22] read-cache: add index reading api

2013-07-08 Thread Thomas Gummerer
Duy Nguyen pclo...@gmail.com writes:

 On Mon, Jul 8, 2013 at 6:20 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Duy Nguyen pclo...@gmail.com writes:
 Putting filter_opts in index_state feels like a bad design. Iterator
 information should be separated from the iterated object, so that two
 callers can walk through the same index without stepping on each other
 (I'm not talking about multithreading, a caller may walk a bit, then
 the other caller starts walking, then the former caller resumes
 walking again in a call chain).

 Yes, you're right.  We need the filter_opts to see what part of the
 index has been loaded [1] and which part has been skipped, but it
 shouldn't be used for filtering in the for_each_index_entry function.

 I think there should be two versions of the for_each_index_entry
 function then, where the for_each_index_entry function would behave the
 same way as the for_each_index_entry_filtered function with the
 filter_opts parameter set to NULL:
 for_each_index_entry_filtered(struct index_state *, each_cache_entry_fn, 
 void *cb_data, struct filter_opts *)
 for_each_index_entry(struct index_state *, each_cache_entry_fn, void 
 *cb_data)

 Both of them then should call index_change_filter_opts to make sure all
 the entries that are needed are loaded in the in-memory format.

 Does that make sense?

 Hmm.. I was confused actually (documentation on the api would help
 greatly). If you already filter at load time, I don't think you need
 to match again. The caller asked for filter and it should know what's
 in there so for_each_index_entry just goes through all entries without
 extra match_pathspec. Or is that what next_index_entry for?
 match_pathspec function could be expensive when glob is involved. If
 the caller wants extra matching, it could do inside the callback
 function.

Yes, a documentation would be good.  I'll try to write something better
later today, when I have some more time.  In the meantime I'll just
outline what the functions do here shortly:

read_index_filtered(opts): This method behaves differently for index-v2
  and index-v5.
  For index-v2 it simply reads the whole index as read_cache() does, so
  we are sure we don't have to reload anything if the user wants a
  different filter.
  For index-v5 it creates an adjusted pathspec to and reads all
  directories that are matched by them.

get_index_entry_by_name(name, namelen, ce): Returns a cache_entry
  matched by name via the ce parameter.  If a cache_entry is matched
  exactly 1 is returned.
  Name may also be a path, in which case it returns 0 and the first
  cache_entry in that path. e.g. we have:
  ...
  path/file1
  
in the index and name is path, than it returns 0 and the path/file1
cache_entry.  If name is path/file1 on the other hand it returns 1
and the path/file1 cache_entry.

for_each_index_entry(fn, cb_data):  Iterates over all cache_entries in
  the index filtered by filter_opts in the index_state, and executes fn
  for each of them with the cb_data as callback data.

next_index_entry(ce): Returns the cache_entry that follows after ce

index_change_filter_opts(opts): For index-v2 it simply changes the
  filter_opts, so for_each_index_entry uses the changed index_opts.
  For index-v5 it refreshes the index if the filter_opts have changed.
  This has some optimization potential, in the case that the opts get
  stricter (less of the index should be read) it doesn't have to reload
  anything.

I'm not sure what's in the cache, because the whole index is in the
cache if the on-disk format is index-v2 and the index is filtered by the
adjusted_pathspec if the on-disk format is index-v5.  That's what I need
the extra match_pathspec for. But yes, that could also be left to the
caller.

Hope that makes it a little clearer.

 It seems you could change the filter with index_change_filter_opts. In
 v5 the index will be reloaded. What happens when some index entries
 area already modified? Do we start to have read-only index views and
 one read-write view? If partial views are always read-only, perhaps we
 just allow the user to create a new index_state (or view) with new
 filter and destroy the old one. We don't have to care about changing
 or separating filter in that case because the view is the iterator.

The read-write part is mostly covered by the next patch (6/22).  Before
changing the index, the filter_opts always have to be set to NULL, using
index_change_filter_opts and therefore use the whole index.  This is
currently hard to improve, because we always need the whole index when
we write it.  Changing this only makes sense once we have partial
writing too.

So in principle the index_change_filter_opts function implements those
views.

Even with partial writing we have to distinguish if a cache_entry has
been added/removed, in which case a full rewrite is necessary or if a
cache_entry has simply been modified (it's content changed), in which
case we could replace it in place.

 

Re: [PATCH 05/22] read-cache: add index reading api

2013-07-08 Thread Junio C Hamano
Thomas Gummerer t.gumme...@gmail.com writes:

 Add an api for access to the index file.  Currently there is only a very
 basic api for accessing the index file, which only allows a full read of
 the index, and lets the users of the data filter it.  The new index api
 gives the users the possibility to use only part of the index and
 provides functions for iterating over and accessing cache entries.

 This simplifies future improvements to the in-memory format, as changes
 will be concentrated on one file, instead of the whole git source code.

 Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
 ---
  cache.h |  57 +-
  read-cache-v2.c |  96 +++--
  read-cache.c| 108 
 
  read-cache.h|  12 ++-
  4 files changed, 263 insertions(+), 10 deletions(-)

 diff --git a/cache.h b/cache.h
 index 5082b34..d38dfbd 100644
 --- a/cache.h
 +++ b/cache.h
 @@ -127,7 +127,8 @@ struct cache_entry {
   unsigned int ce_flags;
   unsigned int ce_namelen;
   unsigned char sha1[20];
 - struct cache_entry *next;
 + struct cache_entry *next; /* used by name_hash */
 + struct cache_entry *next_ce; /* used to keep a list of cache entries */

The reader often needs to rewind the read-pointer partially while
walking the index (e.g. next_cache_entry() in unpack-trees.c and how
the o-cache_bottom position is used throughout the subsystem).  I
am not sure if this singly-linked list is a good way to go.

 +/*
 + * Options by which the index should be filtered when read partially.
 + *
 + * pathspec: The pathspec which the index entries have to match
 + * seen: Used to return the seen parameter from match_pathspec()
 + * max_prefix, max_prefix_len: These variables are set to the longest
 + * common prefix, the length of the longest common prefix of the
 + * given pathspec

These probably should use struct pathspec abstration, not just the
array of raw strings, no?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/22] read-cache: add index reading api

2013-07-08 Thread Thomas Gummerer
Junio C Hamano gits...@pobox.com writes:

 Thomas Gummerer t.gumme...@gmail.com writes:

 Add an api for access to the index file.  Currently there is only a very
 basic api for accessing the index file, which only allows a full read of
 the index, and lets the users of the data filter it.  The new index api
 gives the users the possibility to use only part of the index and
 provides functions for iterating over and accessing cache entries.

 This simplifies future improvements to the in-memory format, as changes
 will be concentrated on one file, instead of the whole git source code.

 Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
 ---
  cache.h |  57 +-
  read-cache-v2.c |  96 +++--
  read-cache.c| 108 
 
  read-cache.h|  12 ++-
  4 files changed, 263 insertions(+), 10 deletions(-)

 diff --git a/cache.h b/cache.h
 index 5082b34..d38dfbd 100644
 --- a/cache.h
 +++ b/cache.h
 @@ -127,7 +127,8 @@ struct cache_entry {
  unsigned int ce_flags;
  unsigned int ce_namelen;
  unsigned char sha1[20];
 -struct cache_entry *next;
 +struct cache_entry *next; /* used by name_hash */
 +struct cache_entry *next_ce; /* used to keep a list of cache entries */

 The reader often needs to rewind the read-pointer partially while
 walking the index (e.g. next_cache_entry() in unpack-trees.c and how
 the o-cache_bottom position is used throughout the subsystem).  I
 am not sure if this singly-linked list is a good way to go.

I'm not very familiar with the unpack-trees code, but from a quick look
the pointer (or position in the cache) is always only moved forward.  A
problem I do see though is skipping a number of entries at once.  An
example for that below:
int matches;
matches = 
cache_tree_matches_traversal(o-src_index-cache_tree,
   names, info);
/*
 * Everything under the name matches; skip the
 * entire hierarchy.  diff_index_cached codepath
 * special cases D/F conflicts in such a way that
 * it does not do any look-ahead, so this is safe.
 */
if (matches) {
o-cache_bottom += matches;
return mask;
}

This could probably be transformed into something like
skip_cache_tree_matches(cache-tree, names, info);

I'll take some time to familiarize myself with the unpack-trees code to
see if I can find a better solution than this, and if there are more
pitfalls.

 +/*
 + * Options by which the index should be filtered when read partially.
 + *
 + * pathspec: The pathspec which the index entries have to match
 + * seen: Used to return the seen parameter from match_pathspec()
 + * max_prefix, max_prefix_len: These variables are set to the longest
 + * common prefix, the length of the longest common prefix of the
 + * given pathspec

 These probably should use struct pathspec abstration, not just the
 array of raw strings, no?

Yes, thanks, that's probably a good idea.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/22] read-cache: add index reading api

2013-07-08 Thread Junio C Hamano
Thomas Gummerer t.gumme...@gmail.com writes:

 The reader often needs to rewind the read-pointer partially while
 walking the index (e.g. next_cache_entry() in unpack-trees.c and how
 the o-cache_bottom position is used throughout the subsystem).  I
 am not sure if this singly-linked list is a good way to go.

 I'm not very familiar with the unpack-trees code, but from a quick look
 the pointer (or position in the cache) is always only moved forward.

I am more worried about o-cache_bottom processing, where it
currently is an index into an array.

With your ce-next_in_list_of_read_entries change, a natural rewrite
would be to point at the ce with o-cache_bottom, but then that
would mean you cannot in-place replace the entries like we used to
be able to in an array based implementation.

But your series does not seem to touch unpack-trees yet, so I may be
worried too much before it becomes necessary.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/22] read-cache: add index reading api

2013-07-07 Thread Duy Nguyen
On Sun, Jul 7, 2013 at 3:11 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Add an api for access to the index file.  Currently there is only a very
 basic api for accessing the index file, which only allows a full read of
 the index, and lets the users of the data filter it.  The new index api
 gives the users the possibility to use only part of the index and
 provides functions for iterating over and accessing cache entries.

 This simplifies future improvements to the in-memory format, as changes
 will be concentrated on one file, instead of the whole git source code.

 Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
 ---
  cache.h |  57 +-
  read-cache-v2.c |  96 +++--
  read-cache.c| 108 
 
  read-cache.h|  12 ++-
  4 files changed, 263 insertions(+), 10 deletions(-)

 diff --git a/cache.h b/cache.h
 index 5082b34..d38dfbd 100644
 --- a/cache.h
 +++ b/cache.h
 @@ -127,7 +127,8 @@ struct cache_entry {
 unsigned int ce_flags;
 unsigned int ce_namelen;
 unsigned char sha1[20];
 -   struct cache_entry *next;
 +   struct cache_entry *next; /* used by name_hash */
 +   struct cache_entry *next_ce; /* used to keep a list of cache entries 
 */
 char name[FLEX_ARRAY]; /* more */
  };

From what I read, doing

ce = start;
while (ce) { do(something); ce = next_cache_entry(ce); }

is the same as

i = start_index;
while (i  active_nr) { ce = active_cache[i]; do(something); i++; }

What's the advantage of using the former over the latter? Do you plan
to eliminate the latter loop (by hiding struct cache_entry **cache;
from public index_state structure?
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 05/22] read-cache: add index reading api

2013-07-07 Thread Duy Nguyen
On Sun, Jul 7, 2013 at 3:11 PM, Thomas Gummerer t.gumme...@gmail.com wrote:
 +/*
 + * Options by which the index should be filtered when read partially.
 + *
 + * pathspec: The pathspec which the index entries have to match
 + * seen: Used to return the seen parameter from match_pathspec()
 + * max_prefix, max_prefix_len: These variables are set to the longest
 + * common prefix, the length of the longest common prefix of the
 + * given pathspec
 + *
 + * read_staged: used to indicate if the conflicted entries (entries
 + * with a stage) should be included
 + * read_cache_tree: used to indicate if the cache-tree should be read
 + * read_resolve_undo: used to indicate if the resolve undo data should
 + * be read
 + */
 +struct filter_opts {
 +   const char **pathspec;
 +   char *seen;
 +   char *max_prefix;
 +   int max_prefix_len;
 +
 +   int read_staged;
 +   int read_cache_tree;
 +   int read_resolve_undo;
 +};
 +
  struct index_state {
 struct cache_entry **cache;
 unsigned int version;
 @@ -270,6 +297,8 @@ struct index_state {
 struct hash_table name_hash;
 struct hash_table dir_hash;
 struct index_ops *ops;
 +   struct internal_ops *internal_ops;
 +   struct filter_opts *filter_opts;
  };

...

 -/* remember to discard_cache() before reading a different cache! */
 -int read_index_from(struct index_state *istate, const char *path)
 +
 +int read_index_filtered_from(struct index_state *istate, const char *path,
 +struct filter_opts *opts)
  {
 int fd, err, i;
 struct stat st_old, st_new;
 @@ -1337,7 +1425,7 @@ int read_index_from(struct index_state *istate, const 
 char *path)
 if (istate-ops-verify_hdr(mmap, mmap_size)  0)
 err = 1;

 -   if (istate-ops-read_index(istate, mmap, mmap_size)  0)
 +   if (istate-ops-read_index(istate, mmap, mmap_size, opts)  
 0)
 err = 1;
 istate-timestamp.sec = st_old.st_mtime;
 istate-timestamp.nsec = ST_MTIME_NSEC(st_old);
 @@ -1345,6 +1433,7 @@ int read_index_from(struct index_state *istate, const 
 char *path)
 die_errno(cannot stat the open index);

 munmap(mmap, mmap_size);
 +   istate-filter_opts = opts;
 if (!index_changed(st_old, st_new)  !err)
 return istate-cache_nr;
 }

Putting filter_opts in index_state feels like a bad design. Iterator
information should be separated from the iterated object, so that two
callers can walk through the same index without stepping on each other
(I'm not talking about multithreading, a caller may walk a bit, then
the other caller starts walking, then the former caller resumes
walking again in a call chain).
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html