Add documentation of the index file format version 5 to
Documentation/technical.
Helped-by: Michael Haggerty mhag...@alum.mit.edu
Helped-by: Junio C Hamano gits...@pobox.com
Helped-by: Thomas Rast tr...@student.ethz.ch
Helped-by: Nguyen Thai Ngoc Duy pclo...@gmail.com
Helped-by: Robin Rosenberg robin.rosenb...@dewire.com
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com
---
Documentation/technical/index-file-format-v5.txt | 285 ++
1 file changed, 285 insertions(+)
create mode 100644 Documentation/technical/index-file-format-v5.txt
diff --git a/Documentation/technical/index-file-format-v5.txt b/Documentation/technical/index-file-format-v5.txt
new file mode 100644
index 000..6707f06
--- /dev/null
+++ b/Documentation/technical/index-file-format-v5.txt
@@ -0,0 +1,285 @@
+GIT index format
+
+
+== The git index file format
+
+ The git index file (.git/index) documents the status of the files
+ in the git staging area.
+
+ The staging area is used for preparing commits, merging, etc.
+
+ All binary numbers are in network byte order. Version 5 is described
+ here.
+
+ - A 20-byte header consisting of
+
+ sig (32-bits): Signature:
+ The signature is { 'D', 'I', 'R', 'C' } (stands for dircache)
+
+ vnr (32-bits): Version number:
+ The currently supported versions are 2, 3, 4 and 5.
+
+ ndir (32-bits): number of directories in the index.
+
+ nfile (32-bits): number of file entries in the index.
+
+ fblockoffset (32-bits): offset to the file block, relative to the
+ beginning of the file.
+
+ - Offset to the extensions.
+
+ nextensions (32-bits): number of extensions.
+
+ extoffset (32-bits): offset to an extension. There are as many
+ offsets as indicated by nextensions (possibly none).
+
+ headercrc (32-bits): crc checksum of the header and the extension
+ offsets.
+
+ - diroffsets (ndir * directory offsets): A directory offset for each
+ of the ndir directories in the index, sorted by pathname (of the
+ directory it's pointing to) (see below). The diroffsets are relative
+ to the beginning of the direntries block. [1]
+
+ - direntries (ndir * directory entries): A directory entry for each
+ of the ndir directories in the index, sorted by pathname (see
+ below). [2]
+
+ - fileoffsets (nfile * file offsets): A file offset for each of the
+ nfile files in the index (see below). The file offsets are relative
+ to the beginning of the fileentries block. [1]
+
+ - fileentries (nfile * file entries): A file entry for each of the
+ nfile files in the index (see below).
+
+ - crdata: A number of entries for conflicted data/resolved conflicts
+ (see below).
+
+ - Extensions (currently none; future extensions will be described below)
+
+ Extensions are identified by signature. Optional extensions can
+ be ignored if GIT does not understand them.
+
+ GIT supports an arbitrary number of extensions, but currently none
+ are implemented. [3]
+
+ extsig (32-bits): extension signature. If the first byte is 'A'..'Z'
+ the extension is optional and can be ignored.
+
+ extsize (32-bits): size of the extension, excluding the header
+ (extsig, extsize, extchecksum).
+
+ extchecksum (32-bits): crc32 checksum of the extension signature
+ and size.
+
+- Extension data.
+
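The fixed-size fields listed above can be read with a few `struct` calls. The following is a minimal sketch, not the reference implementation. It assumes that headercrc is a standard CRC-32 (as computed by `zlib.crc32`) and that it covers every byte preceding it; the text only says "crc checksum for the header and extension offsets". The names `parse_v5_header` and `build_sample_header` are hypothetical helpers introduced for illustration.

```python
import struct
import zlib

# sig, vnr, ndir, nfile, fblockoffset: five 32-bit fields, network byte order
HEADER = struct.Struct(">4sIIII")


def parse_v5_header(buf):
    """Parse the 20-byte header, the extension offsets and headercrc."""
    sig, vnr, ndir, nfile, fblockoffset = HEADER.unpack_from(buf, 0)
    if sig != b"DIRC":
        raise ValueError("bad index signature")
    pos = HEADER.size
    (nextensions,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    # One 32-bit offset per extension (possibly none).
    extoffsets = list(struct.unpack_from(">%dI" % nextensions, buf, pos))
    pos += 4 * nextensions
    (headercrc,) = struct.unpack_from(">I", buf, pos)
    # Assumption: the checksum is CRC-32 over all bytes before headercrc.
    crc_ok = (zlib.crc32(buf[:pos]) & 0xFFFFFFFF) == headercrc
    return {
        "vnr": vnr,
        "ndir": ndir,
        "nfile": nfile,
        "fblockoffset": fblockoffset,
        "extoffsets": extoffsets,
        "crc_ok": crc_ok,
    }


def build_sample_header(ndir, nfile, fblockoffset, extoffsets=()):
    """Build a synthetic header for testing (hypothetical helper)."""
    body = HEADER.pack(b"DIRC", 5, ndir, nfile, fblockoffset)
    body += struct.pack(">I", len(extoffsets))
    body += struct.pack(">%dI" % len(extoffsets), *extoffsets)
    return body + struct.pack(">I", zlib.crc32(body) & 0xFFFFFFFF)
```

For example, `parse_v5_header(build_sample_header(2, 3, 100))` yields a dictionary with `ndir` 2, `nfile` 3 and `crc_ok` set to `True`.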
+
+== Directory offsets (diroffsets)
+
+ diroffset (32-bits): offset to the directory relative to the beginning
+of the index file. There are ndir + 1 offsets in the diroffset table;
+the last one points to the end of the last direntry. With this last
+entry, the strlen call when reading each filename can be replaced by
+calculating its length from the offsets.
+
+ This part is needed for making the directory entries bisectable and
+thus allowing a binary search.
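
The two uses of the offset table described above (computing a name's length without strlen, and binary search) can be sketched as follows. This is an illustration under stated assumptions, not git's implementation: offsets are taken relative to the start of the `block` passed in, each entry is modelled as pathname + NUL + a fixed-size trailer, and `trailer_size` is a placeholder parameter because the full direntry layout is not fixed here. The helper names are hypothetical.

```python
def entry_name(block, offsets, i, trailer_size):
    """Return the pathname of entry i using only the offset table."""
    start, end = offsets[i], offsets[i + 1]
    # Entry layout (assumed): pathname, one NUL byte, fixed-size trailer.
    return block[start:end - trailer_size - 1]


def find_dir(block, offsets, path, trailer_size):
    """Binary search over the sorted entries; return the index or -1."""
    ndir = len(offsets) - 1  # the table holds ndir + 1 offsets
    lo, hi = 0, ndir
    while lo < hi:
        mid = (lo + hi) // 2
        if entry_name(block, offsets, mid, trailer_size) < path:
            lo = mid + 1
        else:
            hi = mid
    if lo < ndir and entry_name(block, offsets, lo, trailer_size) == path:
        return lo
    return -1


def build_block(names, trailer_size):
    """Build a synthetic entry block and its offset table for testing."""
    block, offsets = b"", []
    for name in sorted(names):
        offsets.append(len(block))
        block += name + b"\0" + bytes(trailer_size)
    offsets.append(len(block))  # extra offset: end of the last entry
    return block, offsets
```

Because every lookup reads only the offsets and a slice of the block, no entry before the one of interest has to be scanned.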
+
+== Directory entry (direntries)
+
+ Directory entries are sorted in lexicographic order by the name
+of their path starting with the root.
+
+ pathname (variable length, nul terminated): relative to top level
+directory (without the leading slash). '/' is used as path
+separator. A string of length 0 ('') indicates the root directory.
+The special path components . and .. (without quotes) are
+disallowed. The path also includes a trailing slash. [9]
+
+ foffset (32-bits): offset to the lexicographically first file in
+the file offsets (fileoffsets), relative to the beginning of
+the fileoffset block.
+
+ cr (32-bits): offset to conflicted/resolved data at the end of the
+index. 0 if there is no such data. [4]
+
+ ncr (32-bits): number of conflicted/resolved data entries at the
+end of the index if the offset is non-zero. If cr is 0, ncr is
+also 0.
+
+ nsubtrees (32-bits): number of subtrees this tree has in the index.
+
+ nfiles (32-bits): number of files in the directory that are in
+the index.
+
+ nentries