== Work done in the previous 11 weeks ==
- Definition of a tentative index file v5 format. This differs
from the proposal in making it possible to bisect the directory
entries and file entries, i.e. to find them with a binary search.
The exact bits for each section were also defined. To further
compress the index, in addition to prefix compression, the stat data
is hashed: it is only ever compared, the plain values are never used.
Thanks to Michael Haggerty, Nguyen Thai Ngoc Duy, Thomas Rast
and Robin Rosenberg for feedback.
- Prototype of a converter from the index format v2/v3 to the index
format v5. The converter reads the index from a git repository and
can output parts of the index (header, index entries as in
git ls-files --debug, cache tree as in test-dump-cache-tree, or
the reuc data). It then writes the v5 index file format to
.git/index-v5. Thanks to Michael Haggerty for the code review.
- Prototype of a reader for the new index file format. The main
purpose of the reader is to show the algorithm used to read the
index in lexicographic order of the full name, which is required
by the current in-memory format. Big thanks to Michael Haggerty
for reviewing this code and giving me advice on refactoring.
- Read the v5 index format and translate it to the current in-memory
format. This doesn't include reading any of the current
extensions, which are now part of the main index. The code again
is on GitHub.  Thanks for reviewing the first steps to Thomas
- Read the cache-tree data (formerly an extension, now integrated
with the rest of the directory data) from the new on-disk format.
There are still a few optimizations to do in this algorithm.
- Started implementing the API (suggested by Duy), but it's still
in the very early stages. So far there is one commit for this on
GitHub.
- Started implementing the writer, which extracts the directories from
the in-memory format, and writes the header and the directories to
- I found a few bugs in the algorithm for extracting the directories
and decided to completely rewrite it, using a hash table instead of
simple lists, since the old one would have too many corner cases to
- Implemented writing the file block to disk, and basic tests from the
test suite are running fine, not including tests that require
conflicted data or the cache-tree to work, which both are not
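Two ideas from the format definition above, comparing hashed stat data
and bisecting sorted directory entries, can be illustrated with a
minimal sketch. Everything here is an assumption made for illustration:
struct stat_fields, hash_stat(), stat_unchanged() and find_dir() are
hypothetical names, the field selection is not the v5 layout, and
FNV-1a is only a stand-in since the hash is not named in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of stat fields; not the actual v5 layout. */
struct stat_fields {
	uint32_t ctime_sec;
	uint32_t mtime_sec;
	uint32_t ino;
	uint32_t size;
};

/* Mix the four bytes of value into hash using 32-bit FNV-1a (stand-in). */
static uint32_t fnv1a_u32(uint32_t hash, uint32_t value)
{
	int i;

	for (i = 0; i < 4; i++) {
		hash ^= (value >> (8 * i)) & 0xff;
		hash *= 16777619u;
	}
	return hash;
}

/* Hash field by field so struct padding never influences the result. */
static uint32_t hash_stat(const struct stat_fields *st)
{
	uint32_t h = 2166136261u;

	assert(st != NULL);
	h = fnv1a_u32(h, st->ctime_sec);
	h = fnv1a_u32(h, st->mtime_sec);
	h = fnv1a_u32(h, st->ino);
	h = fnv1a_u32(h, st->size);
	return h;
}

/* Only the hash is stored; a mismatch means the file may have changed. */
static int stat_unchanged(uint32_t stored_hash, const struct stat_fields *now)
{
	return stored_hash == hash_stat(now);
}

/*
 * Binary search over directory names sorted with strcmp().
 * Returns the position of name, or -1 if it is not present.
 */
static int find_dir(const char *const *dirs, int nr, const char *name)
{
	int lo = 0, hi = nr;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, dirs[mid]);

		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1;
}
```

Storing only a hash saves space, at the price that a hash collision
would make a changed file look unchanged. How acceptable that is
depends on the hash width chosen.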
== Work done in the last week ==
- Unfortunately this week's progress was slower than expected due to
exams at university. Those are now over, however, so I can fully
concentrate on the work for Google Summer of Code.
- This week I started implementing a patch to replace the ce_namelen()
function with a ce_namelen field in struct cache_entry. This
both gives us a small performance gain in some (rare) cases with
the old index format and is a preparatory refactoring for index-v5,
which won't store the length in the flags. The thread for the patch
is here. Thanks to Junio, Duy and Thomas for reviews of this patch.
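To illustrate the difference between deriving the name length from the
flags and reading it from a dedicated field, here is a minimal sketch.
It does not use git's real struct cache_entry: struct demo_entry,
NAME_MASK and the three functions are hypothetical, and the assumption
that the low 12 bits of the flags hold the length is mine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed: the low 12 bits of the flags hold the name length. */
#define NAME_MASK 0x0fff

/* Simplified stand-in for an index entry, for illustration only. */
struct demo_entry {
	unsigned int flags;
	unsigned int namelen;	/* the explicit length field */
	const char *name;
};

/* Flag-based length: names that do not fit in the mask need a strlen(). */
static size_t namelen_from_flags(const struct demo_entry *e)
{
	size_t len = e->flags & NAME_MASK;

	if (len < NAME_MASK)
		return len;
	/* The first NAME_MASK bytes are known to be non-NUL; scan the rest. */
	return strlen(e->name + NAME_MASK) + NAME_MASK;
}

/* Field-based length: constant time regardless of the name length. */
static size_t namelen_from_field(const struct demo_entry *e)
{
	return e->namelen;
}

/* Fill in both representations so they can be compared. */
static void demo_entry_init(struct demo_entry *e, const char *name)
{
	size_t len;

	assert(e != NULL && name != NULL);
	len = strlen(name);
	e->name = name;
	e->namelen = (unsigned int)len;
	e->flags = len < NAME_MASK ? (unsigned int)len : NAME_MASK;
}
```

In this sketch the two lookups only differ in cost for names that do
not fit in the mask, which would match the "(rare) cases" mentioned
above.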
== Outlook for the next week ==
- Polish the patch for the ce_namelen field.
- Implement the cache-tree and conflict data writing to the index file.