Re: [PATCH/RFC v2 09/16] Read index-v5
On 08/05, Junio C Hamano wrote: Thomas Gummerer t.gumme...@gmail.com writes: +static struct directory_entry *read_directories_v5(unsigned int *dir_offset, + unsigned int *dir_table_offset, + void *mmap, + int mmap_size) +{ + int i, ondisk_directory_size; + uint32_t *filecrc, *beginning, *end; + struct directory_entry *current = NULL; + struct ondisk_directory_entry *disk_de; + struct directory_entry *de; + unsigned int data_len, len; + char *name; + + ondisk_directory_size = sizeof(disk_de-flags) + + sizeof(disk_de-foffset) + + sizeof(disk_de-cr) + + sizeof(disk_de-ncr) + + sizeof(disk_de-nsubtrees) + + sizeof(disk_de-nfiles) + + sizeof(disk_de-nentries) + + sizeof(disk_de-sha1); + name = (char *)mmap + *dir_offset; + beginning = mmap + *dir_table_offset; Notice how you computed name with pointer arithmetic by first casting mmap (which is void *) and when computing beginning, you forgot to cast mmap and attempted pointer arithmetic with void *. The latter does not work and breaks compilation. The pointer-arith with void * is not limited to this function. Sorry for not noticing this, it always compiled fine for me. Guess I should use -pedantic more often ;-) Please check the a band-aid (I wouldn't call it a fix-up) patch I added on top of the series before queuing the topic to 'pu'; it is primarily to illustrate the places I noticed that have this issue. I do not necessarily suggest that the way the band-aid patch makes it compile is the best approach. It might be cleaner to use a saner type like char * (or perhaps const char *) as the type to point at a piece of memory you read from the disk. I haven't formed an opinion. Thanks. I've used the type of the respective assignment for now. e.g. i have struct cache_header *hdr, so I'm using hdr = (struct cache_header *)mmap + x; read-cache-v5.c compiles with -pedantic without warnings. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v2 09/16] Read index-v5
Thomas Gummerer t.gumme...@gmail.com writes: + name = (char *)mmap + *dir_offset; + beginning = mmap + *dir_table_offset; Notice how you computed name with pointer arithmetic by first casting mmap (which is void *) and when computing beginning, you forgot to cast mmap and attempted pointer arithmetic with void *. The latter does not work and breaks compilation. The pointer-arith with void * is not limited to this function. ... I've used the type of the respective assignment for now. e.g. i have struct cache_header *hdr, so I'm using hdr = (struct cache_header *)mmap + x; You need to be careful when rewriting the above to choose the right value for 'x' if you go that route (which I wouldn't recommend). With hdr = ptr_add(mmap, x); you are making hdr point at x BYTES beyond mmap, but hdr = (struct cache_header *)mmap + x; means something entirely different, no? hdr points at x entries of struct cache_header beyond mmap (in other words, if mmap[] were defined as struct cache_header mmap[], the above is saying the same as hdr = mmap[x]). I think the way you casted to compute the value for the name pointer is the (second) right thing to do. The cast (char *) applied to mmap is about mmap is a typeless blob of memory I want to count bytes in. Give me *dir_offset bytes into that blob. It is not tied to the type of LHS (i.e. name) at all. The result then needs to be casted to the type of LHS (i.e. name), and in this case the types happen to be the same, so you do not have to cast the result of the addition but that is mere luck. The next line is not so lucky and you would need to say something like: beginning = (uint32_t *)((char *)mmap + *dir_table_offset); Again, inner cast is about mmap is a blob counted in bytes, the outer cast is about type mismatch between a byte-address and LHS of the assignment. If mmap variable in this function were not void * but something more sane like const char *, you wouldn't have to have the inner cast to begin with, and that is why I said the way you did name is the second right thing. Then you can write them like name = mmap + *dir_offset; beginning = (uint32_t *)(mmap + *dir_offset); After thinking about this, the ptr_add() macro might be the best solution, even though I originally called it as a band-aid. We know mmap is a blob of memory, byte-offset of each component of which we know about, so we can say name = ptr_add(mmap, *dir_offset); beginning = ptr_add(mmap, *dir_offset); Hmmm.. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC v2 09/16] Read index-v5
On 08/08, Junio C Hamano wrote: Thomas Gummerer t.gumme...@gmail.com writes: +name = (char *)mmap + *dir_offset; +beginning = mmap + *dir_table_offset; Notice how you computed name with pointer arithmetic by first casting mmap (which is void *) and when computing beginning, you forgot to cast mmap and attempted pointer arithmetic with void *. The latter does not work and breaks compilation. The pointer-arith with void * is not limited to this function. ... I've used the type of the respective assignment for now. e.g. i have struct cache_header *hdr, so I'm using hdr = (struct cache_header *)mmap + x; You need to be careful when rewriting the above to choose the right value for 'x' if you go that route (which I wouldn't recommend). With hdr = ptr_add(mmap, x); you are making hdr point at x BYTES beyond mmap, but hdr = (struct cache_header *)mmap + x; means something entirely different, no? hdr points at x entries of struct cache_header beyond mmap (in other words, if mmap[] were defined as struct cache_header mmap[], the above is saying the same as hdr = mmap[x]). I think the way you casted to compute the value for the name pointer is the (second) right thing to do. The cast (char *) applied to mmap is about mmap is a typeless blob of memory I want to count bytes in. Give me *dir_offset bytes into that blob. It is not tied to the type of LHS (i.e. name) at all. The result then needs to be casted to the type of LHS (i.e. name), and in this case the types happen to be the same, so you do not have to cast the result of the addition but that is mere luck. The next line is not so lucky and you would need to say something like: beginning = (uint32_t *)((char *)mmap + *dir_table_offset); Again, inner cast is about mmap is a blob counted in bytes, the outer cast is about type mismatch between a byte-address and LHS of the assignment. This is what I tried in v3 of the series, but it didn't seem quiet right. If mmap variable in this function were not void * but something more sane like const char *, you wouldn't have to have the inner cast to begin with, and that is why I said the way you did name is the second right thing. Then you can write them like name = mmap + *dir_offset; beginning = (uint32_t *)(mmap + *dir_offset); After thinking about this, the ptr_add() macro might be the best solution, even though I originally called it as a band-aid. We know mmap is a blob of memory, byte-offset of each component of which we know about, so we can say name = ptr_add(mmap, *dir_offset); beginning = ptr_add(mmap, *dir_offset); Hmmm.. I start to think so too. Casting the mmap variable to const char * in the method call doesn't feel right to me, even though it would work. Unless there are any objections I'll use ptr_add in the next version. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC v2 09/16] Read index-v5
Make git read the index file version 5 without complaining. This version of the reader doesn't read neither the cache-tree nor the resolve undo data, but doesn't choke on an index that includes such data. Helped-by: Thomas Rast tr...@student.ethz.ch Signed-off-by: Thomas Gummerer t.gumme...@gmail.com --- cache.h | 72 +++ read-cache.c | 590 +- 2 files changed, 657 insertions(+), 5 deletions(-) diff --git a/cache.h b/cache.h index 076d6af..98adcd9 100644 --- a/cache.h +++ b/cache.h @@ -110,6 +110,15 @@ struct cache_time { unsigned int nsec; }; +/* + * The *next pointer is used in read_entries_v5 for holding + * all the elements of a directory, and points to the next + * cache_entry in a directory. + * + * It is reset by the add_name_hash call in set_index_entry + * to set it to point to the next cache_entry in the + * correct in-memory format ordering. + */ struct cache_entry { struct cache_time ce_ctime; struct cache_time ce_mtime; @@ -128,11 +137,58 @@ struct cache_entry { char name[FLEX_ARRAY]; /* more */ }; +struct directory_entry { + struct directory_entry *next; + struct directory_entry *next_hash; + struct cache_entry *ce; + struct cache_entry *ce_last; + struct conflict_entry *conflict; + struct conflict_entry *conflict_last; + unsigned int conflict_size; + unsigned int de_foffset; + unsigned int de_cr; + unsigned int de_ncr; + unsigned int de_nsubtrees; + unsigned int de_nfiles; + unsigned int de_nentries; + unsigned char sha1[20]; + unsigned short de_flags; + unsigned int de_pathlen; + char pathname[FLEX_ARRAY]; +}; + +struct conflict_part { + struct conflict_part *next; + unsigned short flags; + unsigned short entry_mode; + unsigned char sha1[20]; +}; + +struct conflict_entry { + struct conflict_entry *next; + unsigned int nfileconflicts; + struct conflict_part *entries; + unsigned int namelen; + unsigned int pathlen; + char name[FLEX_ARRAY]; +}; + +struct ondisk_conflict_part { + unsigned short flags; + unsigned short entry_mode; + unsigned char sha1[20]; +}; + +#define CE_NAMEMASK (0x0fff) #define CE_STAGEMASK (0x3000) #define CE_EXTENDED (0x4000) #define CE_VALID (0x8000) #define CE_STAGESHIFT 12 +#define CONFLICT_CONFLICTED (0x8000) +#define CONFLICT_STAGESHIFT 13 +#define CONFLICT_STAGEMASK (0x6000) + /* * Range 0x in ce_flags is divided into * two parts: in-memory flags and on-disk ones. @@ -166,6 +222,18 @@ struct cache_entry { #define CE_EXTENDED_FLAGS (CE_INTENT_TO_ADD | CE_SKIP_WORKTREE) /* + * Representation of the extended on-disk flags in the v5 format. + * They must not collide with the ordinary on-disk flags, and need to + * fit in 16 bits. Note however that v5 does not save the name + * length. + */ +#define CE_INTENT_TO_ADD_V5 (0x4000) +#define CE_SKIP_WORKTREE_V5 (0x0800) +#if (CE_VALID|CE_STAGEMASK) (CE_INTENTTOADD_V5|CE_SKIPWORKTREE_V5) +#error v5 on-disk flags collide with ordinary on-disk flags +#endif + +/* * Safeguard to avoid saving wrong flags: * - CE_EXTENDED2 won't get saved until its semantic is known * - Bits in 0x have been saved in ce_flags already @@ -203,6 +271,8 @@ static inline unsigned create_ce_flags(unsigned stage) #define ce_skip_worktree(ce) ((ce)-ce_flags CE_SKIP_WORKTREE) #define ce_mark_uptodate(ce) ((ce)-ce_flags |= CE_UPTODATE) +#define conflict_stage(c) ((CONFLICT_STAGEMASK (c)-flags) CONFLICT_STAGESHIFT) + #define ce_permissions(mode) (((mode) 0100) ? 0755 : 0644) static inline unsigned int create_ce_mode(unsigned int mode) { @@ -249,6 +319,8 @@ static inline unsigned int canon_mode(unsigned int mode) } #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1) +#define directory_entry_size(len) (offsetof(struct directory_entry,pathname) + (len) + 1) +#define conflict_entry_size(len) (offsetof(struct conflict_entry,name) + (len) + 1) struct index_state { struct cache_entry **cache; diff --git a/read-cache.c b/read-cache.c index 4243606..70334f9 100644 --- a/read-cache.c +++ b/read-cache.c @@ -234,6 +234,55 @@ static int ce_match_stat_basic_v2(struct cache_entry *ce, return changed; } +static int match_stat_crc(struct stat *st, uint32_t expected_crc) +{ + uint32_t data, stat_crc = 0; + unsigned int ctimens = 0; + + data = htonl(st-st_ctime); + stat_crc = crc32(0, (Bytef*)data, 4); +#ifdef USE_NSEC + ctimens = ST_MTIME_NSEC(*st); +#endif + data = htonl(ctimens); + stat_crc = crc32(stat_crc, (Bytef*)data, 4); + data = htonl(st-st_ino); + stat_crc = crc32(stat_crc, (Bytef*)data, 4); + data = htonl(st-st_size); + stat_crc = crc32(stat_crc, (Bytef*)data, 4); + data = htonl(st-st_dev); +