Document the new index API and add examples of how it should be used
instead of the old functions that access the index directly.

Helped-by: Nguyễn Thái Ngọc Duy <>
Signed-off-by: Thomas Gummerer <>

Duy Nguyen <> writes:

> Hmm.. I was confused actually (documentation on the api would help
> greatly).

As promised, a draft of the documentation for the index API as it is in
this series.

Documentation/technical/api-in-core-index.txt | 108 +++++++++++++++++++++++++-
 1 file changed, 106 insertions(+), 2 deletions(-)

diff --git a/Documentation/technical/api-in-core-index.txt 
index adbdbf5..5269bb1 100644
--- a/Documentation/technical/api-in-core-index.txt
+++ b/Documentation/technical/api-in-core-index.txt
@@ -1,14 +1,116 @@
 in-core index API

+Reading API
+`read_index()`::
+       Read the whole index file from disk.
+`index_name_pos(name, namelen)`::
+       Find a cache_entry with name in the index.  Returns pos if an
+       entry is matched exactly, and -pos-1 if an entry is matched
+       only partially, where pos is the position at which name would
+       be inserted.
+       e.g.
+       index:
+       file1
+       file2
+       path/file1
+       zzz
+       index_name_pos("path/file1", 10) returns 2, while
+       index_name_pos("path", 4) returns -3 (that is, -2-1)
+       This method behaves differently for index-v2 and index-v5.
+       For index-v2 it simply reads the whole index as read_index()
+       does, so we are sure we don't have to reload anything if the
+       user wants a different filter.  It also sets the filter_opts
+       in the index_state, which is used to limit the results when
+       iterating over the index with for_each_index_entry().
+       The whole index is read because reading it partially is no
+       faster, and doing so avoids having to re-read the index
+       later.
+       For index-v5 it creates an adjusted_pathspec to filter the
+       reading.  First all the directory entries are read and then
+       the cache_entries in the directories that match the adjusted
+       pathspec are read.  The filter_opts in the index_state are set
+       to filter out the rest of the cache_entries that are matched
+       by the adjusted pathspec but not by the pathspec given.  The
+       rest of the index entries are filtered out when iterating over
+       the cache with for_each_index_entry().
+`get_index_entry_by_name(name, namelen, &ce)`::
+       Returns a cache_entry matched by the name, returned via the
+       &ce parameter.  If a cache entry is matched exactly, 1 is
+       returned, otherwise 0.  For an example see index_name_pos().
+       This function should be used instead of the index_name_pos()
+       function to retrieve cache entries.
+`for_each_index_entry(fn, cb_data)`::
+       Iterates over all cache_entries in the index filtered by
+       filter_opts in the index_state.  For each cache entry fn is
+       executed with cb_data as callback data.  From within the loop
+       do `return 0` to continue, or `return 1` to break the loop.
+       Returns the cache_entry that follows ce.
+       This function again has a slightly different functionality for
+       index-v2 and index-v5.
+       For index-v2 it simply changes the filter_opts, so that
+       for_each_index_entry() uses the changed filter_opts to iterate
+       over a different set of cache entries.
+       For index-v5 it refreshes the index if the filter_opts have
+       changed and sets the new filter_opts in the index state, again
+       to iterate over a different set of cache entries as with
+       index-v2.
+       This has some optimization potential: when the opts get
+       stricter (less of the index should be read), nothing needs to
+       be reloaded, but currently the index is reloaded anyway.
+Using the new index API
+Currently loops over a specific set of index entries are written as:
+  i = start_index;
+  while (i < active_nr) { ce = active_cache[i]; do(something); i++; }
+they should be rewritten to:
+  ce = start;
+  while (ce) { do(something); ce = next_cache_entry(ce); }
+which is the equivalent operation but hides the in-memory format of
+the index from the user.
+For getting a cache entry get_cache_entry_by_name() should be used
+instead of cache_name_pos(). e.g.:
+  int pos = cache_name_pos(name, namelen);
+  struct cache_entry *ce;
+  if (pos < 0) { do(something); }
+  else { ce = active_cache[pos]; do(somethingelse); }
+should be written as:
+  struct cache_entry *ce;
+  int ret = get_cache_entry_by_name(name, namelen, &ce);
+  if (!ret) { do(something); }
+  else { do(somethingelse); }
 Talk about <read-cache.c> and <cache-tree.c>, things like:

 * cache -> the_index macros
-* read_index()
 * write_index()
 * ie_match_stat() and ie_modified(); how they are different and when to
   use which.
-* index_name_pos()
 * remove_index_entry_at()
 * remove_file_from_index()
 * add_file_to_index()
@@ -18,4 +120,6 @@ Talk about <read-cache.c> and <cache-tree.c>, things like:
 * cache_tree_invalidate_path()
 * cache_tree_update()

 (JC, Linus)