[PATCH/RFC v2 0/16] Introduce index file format version 5

2012-08-05 Thread Thomas Gummerer
First, again, apologies to those who were not credited in the first version of this series. The first version of the series was here: $gmane/202752. Changes since the last version: This series now applies to the latest master. [PATCH/RFC v2 01/16] Modify cache_header to prepare for other index

[PATCH/RFC v2 01/16] Modify cache_header to prepare for other index formats

2012-08-05 Thread Thomas Gummerer
. number of entries for index v2/3/4) can be different from one file format to another. Therefore it is split to its own struct. The structs are also moved to read-cache.c, since they are not used in any other place except test-index-version, where cache_version_header is redefined. Signed-off-by: Thomas

[PATCH/RFC v2 04/16] Modify write functions to prepare for other index formats

2012-08-05 Thread Thomas Gummerer
Modify the write_index function so that other index formats, which are written in a different way, can be added. Also mark all functions that shall only be used with v2-v4 as v2 functions. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- read-cache.c | 43

[PATCH/RFC v2 06/16] t3700: sleep for 1 second, to avoid interfering with the racy code

2012-08-05 Thread Thomas Gummerer
to avoid smudging the entry and getting correct test results. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- t/t3700-add.sh |1 + 1 file changed, 1 insertion(+) diff --git a/t/t3700-add.sh b/t/t3700-add.sh index 874b3a6..4d70805 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -184,6

[PATCH/RFC v2 09/16] Read index-v5

2012-08-05 Thread Thomas Gummerer
Make git read the index file version 5 without complaining. This version of the reader reads neither the cache-tree nor the resolve undo data, but doesn't choke on an index that includes such data. Helped-by: Thomas Rast tr...@student.ethz.ch Signed-off-by: Thomas Gummerer t.gumme

[PATCH/RFC v2 10/16] Read resolve-undo data

2012-08-05 Thread Thomas Gummerer
-by: Thomas Rast tr...@student.ethz.ch Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- read-cache.c |1 + resolve-undo.c | 36 resolve-undo.h |2 ++ 3 files changed, 39 insertions(+) diff --git a/read-cache.c b/read-cache.c index 70334f9..03370f9

[PATCH/RFC v2 12/16] Write index-v5

2012-08-05 Thread Thomas Gummerer
. Helped-by: Thomas Rast tr...@student.ethz.ch Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache.h | 10 +- read-cache.c | 587 +- 2 files changed, 595 insertions(+), 2 deletions(-) diff --git a/cache.h b/cache.h index

[PATCH/RFC v2 13/16] Write index-v5 cache-tree data

2012-08-05 Thread Thomas Gummerer
Write the cache-tree data for the index version 5 file format. The in-memory cache-tree data is converted to the ondisk format, by adding it to the directory entries, that were compiled from the cache-entries in the step before. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache

[PATCH/RFC v2 14/16] Write resolve-undo data for index-v5

2012-08-05 Thread Thomas Gummerer
Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- read-cache.c |1 + resolve-undo.c | 93 resolve-undo.h |1 + 3 files changed, 95 insertions(+) diff --git a/read-cache.c b/read-cache.c index d18383f..6496cc4 100644 --- a/read

[PATCH/RFC v2 15/16] update-index.c: add a force-rewrite option

2012-08-05 Thread Thomas Gummerer
Add a force-rewrite option to update-index, which allows the user to rewrite the index, even if there are no changes. This can be used to do performance tests of both the reader and the writer. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- builtin/update-index.c |5 - 1 file

[PATCH/RFC v2 16/16] p0002-index.sh: add perf test for the index formats

2012-08-05 Thread Thomas Gummerer
From: Thomas Rast tr...@student.ethz.ch Add a performance test for index version [23]/4/5 by using git update-index --force-rewrite, thus testing both the reader and the writer speed of all index formats. Signed-off-by: Thomas Rast tr...@student.ethz.ch Signed-off-by: Thomas Gummerer t.gumme

[PATCH/RFC v2 02/16] Modify read functions to prepare for other index formats

2012-08-05 Thread Thomas Gummerer
Modify the read_index_from function, splitting it up into one function that stays the same for every index format, doing the basic operations such as verifying the header, and a function which is specific for each index version, which does the real reading of the index. Signed-off-by: Thomas
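The split described in this patch (one common function that verifies the header, plus one reader per index version) can be pictured roughly as below. This is only a sketch, not the code from the patch: `index_view`, `read_body_v2`, `read_body_v5` and `read_index_sketch` are hypothetical names, and it assumes that the "DIRC" signature and big-endian version field of index v2-v4 are shared by v5 as well.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory view; not git's struct index_state. */
struct index_view {
	uint32_t version;
	uint32_t nr_entries; /* filled in by the version-specific reader */
};

typedef int (*read_body_fn)(struct index_view *iv,
			    const unsigned char *body, size_t len);

/* Decode a 32-bit big-endian integer. */
static uint32_t get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* v2-v4: the entry count directly follows signature and version. */
static int read_body_v2(struct index_view *iv,
			const unsigned char *body, size_t len)
{
	if (len < 4)
		return -1;
	iv->nr_entries = get_be32(body);
	return 0;
}

/* v5 placeholder: the real layout differs and is not modelled here. */
static int read_body_v5(struct index_view *iv,
			const unsigned char *body, size_t len)
{
	(void)body;
	(void)len;
	iv->nr_entries = 0;
	return 0;
}

/* Common part: verify the header, then dispatch on the version. */
int read_index_sketch(struct index_view *iv,
		      const unsigned char *buf, size_t len)
{
	read_body_fn reader;

	if (len < 8 || memcmp(buf, "DIRC", 4))
		return -1;
	iv->version = get_be32(buf + 4);
	switch (iv->version) {
	case 2:
	case 3:
	case 4:
		reader = read_body_v2;
		break;
	case 5:
		reader = read_body_v5;
		break;
	default:
		return -1;
	}
	return reader(iv, buf + 8, len - 8);
}
```

Keeping the signature and version check in one place means a new format only has to supply its own body reader, which seems to be the point of the refactoring.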

Re: [PATCH 05/16] t2104: Don't fail when index version is 5

2012-08-03 Thread Thomas Gummerer
On 08/03, Thomas Rast wrote: Thomas Gummerer t.gumme...@gmail.com writes: The test t2104 currently checks if the index version is correctly reduced to 2/increased to 3, when an entry needs extended flags, or doesn't use them anymore. Since index-v5 doesn't have extended flags

Re: [RFC 0/16] Introduce index file format version 5

2012-08-03 Thread Thomas Gummerer
On 08/03, Nguyen Thai Ngoc Duy wrote: On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer t.gumme...@gmail.com wrote: Series of patches to introduce the index version 5 file format. This series does not include any fancy stuff like partial loading or partial writing yet, though it's possible

[RFC 0/16] Introduce index file format version 5

2012-08-02 Thread Thomas Gummerer
Series of patches to introduce the index version 5 file format. This series does not include any fancy stuff like partial loading or partial writing yet, though it's possible to do that with the new format. There was already a POC for partial loading, which gave pretty good results, which was

[PATCH 04/16] Modify write functions to prepare for other index formats

2012-08-02 Thread Thomas Gummerer
Modify the write_index function so that other index formats, which are written in a different way, can be added. Also mark all functions that shall only be used with v2-v4 as v2 functions. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- read-cache.c | 40

[PATCH 05/16] t2104: Don't fail when index version is 5

2012-08-02 Thread Thomas Gummerer
is 2/3 (whichever is correct for that test) or 5. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- t/t2104-update-index-skip-worktree.sh | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/t/t2104-update-index-skip-worktree.sh b/t/t2104-update-index-skip

[PATCH 02/16] Modify read functions to prepare for other index formats

2012-08-02 Thread Thomas Gummerer
Modify the read_index_from function, splitting it up into one function that stays the same for every index format, doing the basic operations such as verifying the header, and a function which is specific for each index version, which does the real reading of the index. Signed-off-by: Thomas

[PATCH 03/16] Modify match_stat_basic to prepare for other index formats

2012-08-02 Thread Thomas Gummerer
Split match_stat_basic into one function that handles the general case, which is the same for all index formats, and a function that handles the specific parts for each index file version. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- read-cache.c | 77

[PATCH 01/16] Modify cache_header to prepare for other index formats

2012-08-02 Thread Thomas Gummerer
. number of entries for index v2/3/4) can be different from one file format to another. Therefore it is split to its own struct. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache.h | 5 - read-cache.c | 20 +--- test-index-version.c | 2 +- 3 files

[PATCH 07/16] Add documentation of the index-v5 file format

2012-08-02 Thread Thomas Gummerer
Add documentation of the index file format version 5 to Documentation/technical. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- Documentation/technical/index-file-format-v5.txt | 281 +++ 1 file changed, 281 insertions(+) create mode 100644 Documentation/technical

[PATCH 08/16] Make in-memory format aware of stat_crc

2012-08-02 Thread Thomas Gummerer
Make the in-memory format aware of the stat_crc used by index-v5. It is simply ignored by index versions prior to v5. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache.h | 1 + read-cache.c | 27 +++ 2 files changed, 28 insertions(+) diff --git a/cache.h
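To give a rough idea of what a stat_crc could look like, here is a minimal sketch that folds a handful of stat fields into one CRC-32 value, so that a single comparison replaces several field comparisons. The snippet above does not say which fields are covered or how they are serialized, so the field list, the big-endian packing and all names (`stat_sample`, `crc32_sketch`, `stat_crc_sketch`) are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320, as in zlib). */
uint32_t crc32_sketch(const unsigned char *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		for (bit = 0; bit < 8; bit++)
			crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
	}
	return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical subset of stat data; the real field list may differ. */
struct stat_sample {
	uint32_t ctime_sec, ctime_nsec, ino, dev, uid, gid;
};

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/* Serialize the fields in a fixed order and checksum the result. */
uint32_t stat_crc_sketch(const struct stat_sample *st)
{
	unsigned char buf[24];

	put_be32(buf, st->ctime_sec);
	put_be32(buf + 4, st->ctime_nsec);
	put_be32(buf + 8, st->ino);
	put_be32(buf + 12, st->dev);
	put_be32(buf + 16, st->uid);
	put_be32(buf + 20, st->gid);
	return crc32_sketch(buf, sizeof(buf));
}
```

A reader for older index versions would simply never compute or compare this value, which matches the statement that it is ignored prior to v5.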

[PATCH 06/16] t3700: sleep for 1 second, to avoid interfering with the racy code

2012-08-02 Thread Thomas Gummerer
to avoid smudging the entry and getting correct test results. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- t/t3700-add.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/t/t3700-add.sh b/t/t3700-add.sh index 874b3a6..4d70805 100755 --- a/t/t3700-add.sh +++ b/t/t3700-add.sh @@ -184,6 +184,7

[PATCH 11/16] Read cache-tree in index-v5

2012-08-02 Thread Thomas Gummerer
the directories have to be reordered with respect to the ondisk layout. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache-tree.c | 93 cache-tree.h | 6 read-cache.c | 1 + 3 files changed, 100 insertions(+) diff --git a/cache

[PATCH 12/16] Write index-v5

2012-08-02 Thread Thomas Gummerer
. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache.h | 10 +- read-cache.c | 602 ++- 2 files changed, 609 insertions(+), 3 deletions(-) diff --git a/cache.h b/cache.h index 91d9b45..fe3b446 100644 --- a/cache.h +++ b/cache.h

[PATCH 14/16] Write resolve-undo data for index-v5

2012-08-02 Thread Thomas Gummerer
Write the resolve undo data to the ondisk format, by joining the data in the resolve-undo string-list with the already existing conflicts that were compiled before, when searching the directories and add them to the corresponding directory entries. Signed-off-by: Thomas Gummerer t.gumme

[PATCH 13/16] Write index-v5 cache-tree data

2012-08-02 Thread Thomas Gummerer
Write the cache-tree data for the index version 5 file format. The in-memory cache-tree data is converted to the ondisk format, by adding it to the directory entries, that were compiled from the cache-entries in the step before. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache

[PATCH 16/16] p0002-index.sh: add perf test for the index formats

2012-08-02 Thread Thomas Gummerer
Add a performance test for index version [23]/4/5 by using git update-index --force-rewrite, thus testing both the reader and the writer speed of all index formats. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- t/perf/p0002-index.sh | 33 + 1 file

[PATCH 15/16] update-index.c: add a force-rewrite option

2012-08-02 Thread Thomas Gummerer
Add a force-rewrite option to update-index, which allows the user to rewrite the index, even if there are no changes. This can be used to do performance tests of both the reader and the writer. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- builtin/update-index.c | 5 - 1 file

[PATCH 10/16] Read resolve-undo data

2012-08-02 Thread Thomas Gummerer
-by: Thomas Gummerer t.gumme...@gmail.com --- read-cache.c | 1 + resolve-undo.c | 36 resolve-undo.h | 2 ++ 3 files changed, 39 insertions(+) diff --git a/read-cache.c b/read-cache.c index 884c2a7..cef9a4e 100644 --- a/read-cache.c +++ b/read-cache.c

[PATCH 09/16] Read index-v5

2012-08-02 Thread Thomas Gummerer
Make git read the index file version 5 without complaining. This version of the reader reads neither the cache-tree nor the resolve undo data, but doesn't choke on an index that includes such data. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache.h | 79 + read

Re: [RFC 0/16] Introduce index file format version 5

2012-08-02 Thread Thomas Gummerer
On 08/02, Nguyen Thai Ngoc Duy wrote: On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer t.gumme...@gmail.com wrote: Documentation/technical/index-file-format-v5.txt | 281 ++ builtin/update-index.c |5 +- cache-tree.c

Re: [PATCH 16/16] p0002-index.sh: add perf test for the index formats

2012-08-02 Thread Thomas Gummerer
On 08/02, Nguyen Thai Ngoc Duy wrote: On Thu, Aug 2, 2012 at 6:02 PM, Thomas Gummerer t.gumme...@gmail.com wrote: Add a performance test for index version [23]/4/5 by using git update-index --force-rewrite, thus testing both the reader and the writer speed of all index formats

Re: [PATCH 09/16] Read index-v5

2012-08-02 Thread Thomas Gummerer
, but we'd probably have to split it to at least 3 files, one for index-v2, one for index-v5 and one for the general functions/api. On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer t.gumme...@gmail.com wrote: +static struct cache_entry *cache_entry_from_ondisk_v5(struct ondisk_cache_entry_v5 *ondisk

Re: [PATCH 04/16] Modify write functions to prepare for other index formats

2012-08-02 Thread Thomas Gummerer
On 08/02, Nguyen Thai Ngoc Duy wrote: On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer t.gumme...@gmail.com wrote: @@ -1785,7 +1785,7 @@ void update_index_if_able(struct index_state *istate, struct lock_file *lockfile rollback_lock_file(lockfile); } -int write_index

Re: [PATCH 03/16] Modify match_stat_basic to prepare for other index formats

2012-08-02 Thread Thomas Gummerer
On 08/02, Nguyen Thai Ngoc Duy wrote: On Thu, Aug 2, 2012 at 6:01 PM, Thomas Gummerer t.gumme...@gmail.com wrote: @@ -1443,7 +1452,6 @@ void read_index_v2(struct index_state *istate, void *mmap, int mmap_size) src_offset += consumed; } strbuf_release

[GSoC] Designing a faster index format - Progress report week 15

2012-07-30 Thread Thomas Gummerer
== Work done in the previous 14 weeks == - Definition of a tentative index file v5 format [1]. This differs from the proposal in making it possible to bisect the directory entries and file entries, to do a binary search. The exact bits for each section were also defined. To further compress

[GSoC] Designing a faster index format - Progress report week 14

2012-07-23 Thread Thomas Gummerer
== Work done in the previous 13 weeks == - Definition of a tentative index file v5 format [1]. This differs from the proposal in making it possible to bisect the directory entries and file entries, to do a binary search. The exact bits for each section were also defined. To further compress

Re: [GSoC] Designing a faster index format - Progress report week 13

2012-07-17 Thread Thomas Gummerer
On 07/16, Junio C Hamano wrote: Thomas Gummerer t.gumme...@gmail.com writes: == Work done in the previous 12 weeks == - Definition of a tentative index file v5 format [1]. This differs from the proposal in making it possible to bisect the directory entries and file entries

Re: [GSoC] Designing a faster index format - Progress report week 13

2012-07-17 Thread Thomas Gummerer
Thanks Junio for reading the progress report, this is just a corrected version without the errors that he pointed out. == Work done in the previous 12 weeks == - Definition of a tentative index file v5 format [1]. This differs from the proposal in making it possible to bisect the directory

[GSoC] Designing a faster index format - Progress report week 13

2012-07-16 Thread Thomas Gummerer
== Work done in the previous 12 weeks == - Definition of a tentative index file v5 format [1]. This differs from the proposal in making it possible to bisect the directory entries and file entries, to do a binary search. The exact bits for each section were also defined. To further compress

[PATCH v3 1/3] read-cache.c: Handle long filenames correctly

2012-07-11 Thread Thomas Gummerer
Make git handle long file/path names (> 4096 characters) correctly. There is a bug in the current version, which causes very long file/pathnames to be handled incorrectly, or not even added to the index, if they share the first 4096 characters. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com

[PATCH v3 3/3] Replace strlen() with ce_namelen()

2012-07-11 Thread Thomas Gummerer
Replace strlen(ce->name) with ce_namelen() in a couple of places which gives us some additional bits of performance. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- read-cache.c |4 ++-- unpack-trees.c |2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/read

[PATCH v3 2/3] Strip namelen out of ce_flags into a ce_namelen field

2012-07-11 Thread Thomas Gummerer
it more clear what is a flag, and where the length is stored and make it clear which functions use stages in comparisons and which only use the length. It also makes CE_NAMEMASK private, so that users don't mistakenly write the name length in the flags. Signed-off-by: Thomas Gummerer t.gumme
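The difference between keeping the name length in the low bits of the flags and keeping it in a field of its own can be sketched as follows. The struct and function names are hypothetical and the layout is simplified; only the 12-bit mask value (0x0fff) is meant to mirror CE_NAMEMASK. With the packed form, lengths saturate at the mask and long names need a strlen() fallback, while a separate field holds the full length and can be read directly.

```c
#include <stddef.h>
#include <string.h>

#define NAME_MASK_SKETCH 0x0fff /* low 12 bits of the flags word */

/* Old style: name length packed into the flags (hypothetical struct). */
struct entry_old {
	unsigned int flags;
	const char *name;
};

/* New style: name length kept in a field of its own. */
struct entry_new {
	unsigned int flags;
	unsigned int namelen;
	const char *name;
};

/*
 * Packed length: exact below the mask value; at the mask value the
 * stored length is saturated, so the rest must be measured.
 */
size_t namelen_old(const struct entry_old *e)
{
	size_t len = e->flags & NAME_MASK_SKETCH;

	if (len < NAME_MASK_SKETCH)
		return len;
	return strlen(e->name + NAME_MASK_SKETCH) + NAME_MASK_SKETCH;
}

/* Separate field: the full length is available without scanning. */
size_t namelen_new(const struct entry_new *e)
{
	return e->namelen;
}

/* Record name and length together when the entry is created. */
void set_name_new(struct entry_new *e, const char *name)
{
	e->name = name;
	e->namelen = (unsigned int)strlen(name);
}
```

Because the length is recorded once when the entry is set up, later callers can use it instead of calling strlen() on the name again.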

[GSoC] Designing a faster index format - Progress report week 12

2012-07-09 Thread Thomas Gummerer
== Work done in the previous 11 weeks == - Definition of a tentative index file v5 format [1]. This differs from the proposal in making it possible to bisect the directory entries and file entries, to do a binary search. The exact bits for each section were also defined. To further

Introduction of a ce_namelen field

2012-07-06 Thread Thomas Gummerer
Thanks to the review of Junio, Duy and Thomas here is a reroll of the patches, where the name length is separated from the flags in the in-memory format and which includes a little bit of a performance optimization by using the ce_namelen field instead of strlen() in a couple of places thanks to

[PATCH/RFC v2 1/2] Strip namelen out of ce_flags into a ce_namelen field

2012-07-06 Thread Thomas Gummerer
-off-by: Thomas Gummerer t.gumme...@gmail.com --- builtin/apply.c|3 ++- builtin/blame.c|3 ++- builtin/checkout.c |3 ++- builtin/update-index.c |9 +--- cache.h| 18 ++-- read-cache.c | 54

[PATCH/RFC v2 2/2] Replace strlen() with ce_namelen()

2012-07-06 Thread Thomas Gummerer
Replace strlen(ce->name) with ce_namelen() in a couple of places which gives us some additional bits of performance. Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- read-cache.c |4 ++-- unpack-trees.c |2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/read
