Re: [PATCH v3 18/24] read-cache: write index-v5
[sorry about taking so much time to get back to you, was too busy with other stuff to work on git] Duy Nguyen writes: > On Mon, Aug 19, 2013 at 2:42 AM, Thomas Gummerer wrote: >> +char *super_directory(const char *filename) >> +{ >> + char *slash; >> + >> + slash = strrchr(filename, '/'); >> + if (slash) >> + return xmemdupz(filename, slash-filename); >> + return NULL; >> +} >> + > > Why is this function not static? There are a few other in this patch > (I did not notice in others, but I wasn't looking for them..) That's just an oversight, it should be static. Will look for the others and change them to static too. > But isn't this what dirname() is? > > Another point about this function is it returns a new allocated > string, but I see no free() anywhere in this patch. Leak alert! Yes, you're right, I'll use dirname instead. That also solves the free problem, as dirname modifies the string. I'll still use a wrapper around it, because dirname returns "." when there is no super_directory, but using NULL for that makes it simpler to check it. >> +struct directory_entry *init_directory_entry(char *pathname, int len) >> +{ >> + struct directory_entry *de = xmalloc(directory_entry_size(len)); >> + >> + memcpy(de->pathname, pathname, len); >> + de->pathname[len] = '\0'; >> + de->de_flags = 0; >> + de->de_foffset= 0; >> + de->de_cr = 0; >> + de->de_ncr= 0; >> + de->de_nsubtrees = 0; >> + de->de_nfiles = 0; >> + de->de_nentries = 0; >> + memset(de->sha1, 0, 20); >> + de->de_pathlen= len; >> + de->next = NULL; >> + de->next_hash = NULL; >> + de->ce= NULL; >> + de->ce_last = NULL; >> + de->conflict = NULL; >> + de->conflict_last = NULL; >> + de->conflict_size = 0; >> + return de; >> +} > > I think this function could be shortened to > > struct directory_entry *de = xcalloc(1, directory_entry_size(len)); > memcpy(de->pathname, pathname, len); > de->de_pathlen = len; > return de; Makes sense, thanks. >> +static struct directory_entry *get_directory(char *dir, unsigned int >> dir_len, >> +struct hash_table *table, >> +unsigned int *total_dir_len, >> +unsigned int *ndir, >> +struct directory_entry >> **current) >> +{ >> + struct directory_entry *tmp = NULL, *search, *new, *ret; >> + uint32_t crc; >> + >> + search = find_directory(dir, dir_len, &crc, table); >> + if (search) >> + return search; >> + while (!search) { >> + new = init_directory_entry(dir, dir_len); >> + insert_directory_entry(new, table, total_dir_len, ndir, crc); >> + if (!tmp) >> + ret = new; >> + else >> + new->de_nsubtrees = 1; >> + new->next = tmp; >> + tmp = new; >> + dir = super_directory(dir); > > It feels more natural to create directory_entry(s) from parent to > subdir. If you do so you could reset dir to the remaining of directory > and perform strchr() and do not need to allocate new memory everytime > you call super_directory (because it relies on NUL at the end of the > string). I'm creating them from subdir to parent because it saves a few calls whenever a superdir is already in the hash_table. And I'm using dirname so the new memory allocations are not a problem anymore. >> + dir_len = dir ? strlen(dir) : 0; >> + search = find_directory(dir, dir_len, &crc, table); >> + } >> + search->de_nsubtrees++; >> + (*current)->next = tmp; >> + while ((*current)->next) >> + *current = (*current)->next; >> + >> + return ret; >> +} > >> +static struct directory_entry *compile_directory_data(struct index_state >> *istate, >> + int nfile, >> + unsigned int *ndir, >> + unsigned int >> *total_dir_len, >> + unsigned int >> *total_file_len) >> +{ >> + int i, dir_len = -1; >> + char *dir; >> + struct directory_entry *de, *current, *search; >> + struct cache_entry **cache = istate->cache; >> + struct conflict_entry *conflict_entry; >> + struct hash_table table; >> + uint32_t crc; >> + >> + init_hash(&table); >> + de = init_directory_entry("", 0); >> + current = de; >> + *ndir = 1; >> + *total_dir_len = 1; >> + crc = crc32(0, (Bytef*)de->pathname, de->de_pathlen); >> + insert_hash(crc, de, &table); >> +
Re: [PATCH v3 18/24] read-cache: write index-v5
On Sat, Aug 24, 2013 at 11:07 AM, Duy Nguyen wrote: > On Mon, Aug 19, 2013 at 2:42 AM, Thomas Gummerer wrote: >> Write the index version 5 file format to disk. This version doesn't >> write the cache-tree data and resolve-undo data to the file. > > I keep having things to add after sending my emails. Now that we have > all conflicted entries in file block, the conflict data block becomes > optional, it functions exactly (I think) like resolve-undo extension, > which makes me think it might make sense to make conflict data block > an extension. > > If we make it so, we might want to move "cr" and "ncr" fields out of > direntries. I don't see a solution yet, but I think it's interesting > because future extensions might want to attach stuff to direntries, > just like "cr"/"ncr" from conflict extension. We may want to think now > how that might be done (and conflict extension is a good exercise to > see how it works out) And the solution is pretty obvious: keep resolve-undo extension as _extension_ just like in v2 (and no "cr/ncr" in direntries). The only difference is the time resolve-undo extension is updated: in v2, new entries are added at remove_index_entry_at(), in v5 new entries are added when new stage entries are detected at write_index(). -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 18/24] read-cache: write index-v5
On Mon, Aug 19, 2013 at 2:42 AM, Thomas Gummerer wrote: > Write the index version 5 file format to disk. This version doesn't > write the cache-tree data and resolve-undo data to the file. I keep having things to add after sending my emails. Now that we have all conflicted entries in file block, the conflict data block becomes optional, it functions exactly (I think) like resolve-undo extension, which makes me think it might make sense to make conflict data block an extension. If we make it so, we might want to move "cr" and "ncr" fields out of direntries. I don't see a solution yet, but I think it's interesting because future extensions might want to attach stuff to direntries, just like "cr"/"ncr" from conflict extension. We may want to think now how that might be done (and conflict extension is a good exercise to see how it works out) -- Duy I -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 18/24] read-cache: write index-v5
On Mon, Aug 19, 2013 at 2:42 AM, Thomas Gummerer wrote: > +char *super_directory(const char *filename) > +{ > + char *slash; > + > + slash = strrchr(filename, '/'); > + if (slash) > + return xmemdupz(filename, slash-filename); > + return NULL; > +} > + Why is this function not static? There are a few other in this patch (I did not notice in others, but I wasn't looking for them..) But isn't this what dirname() is? Another point about this function is it returns a new allocated string, but I see no free() anywhere in this patch. Leak alert! > +struct directory_entry *init_directory_entry(char *pathname, int len) > +{ > + struct directory_entry *de = xmalloc(directory_entry_size(len)); > + > + memcpy(de->pathname, pathname, len); > + de->pathname[len] = '\0'; > + de->de_flags = 0; > + de->de_foffset= 0; > + de->de_cr = 0; > + de->de_ncr= 0; > + de->de_nsubtrees = 0; > + de->de_nfiles = 0; > + de->de_nentries = 0; > + memset(de->sha1, 0, 20); > + de->de_pathlen= len; > + de->next = NULL; > + de->next_hash = NULL; > + de->ce= NULL; > + de->ce_last = NULL; > + de->conflict = NULL; > + de->conflict_last = NULL; > + de->conflict_size = 0; > + return de; > +} I think this function could be shortened to struct directory_entry *de = xcalloc(1, directory_entry_size(len)); memcpy(de->pathname, pathname, len); de->de_pathlen = len; return de; > +static struct directory_entry *get_directory(char *dir, unsigned int dir_len, > +struct hash_table *table, > +unsigned int *total_dir_len, > +unsigned int *ndir, > +struct directory_entry **current) > +{ > + struct directory_entry *tmp = NULL, *search, *new, *ret; > + uint32_t crc; > + > + search = find_directory(dir, dir_len, &crc, table); > + if (search) > + return search; > + while (!search) { > + new = init_directory_entry(dir, dir_len); > + insert_directory_entry(new, table, total_dir_len, ndir, crc); > + if (!tmp) > + ret = new; > + else > + new->de_nsubtrees = 1; > + new->next = tmp; > + tmp = new; > + dir = super_directory(dir); It feels more natural to create directory_entry(s) from parent to subdir. If you do so you could reset dir to the remaining of directory and perform strchr() and do not need to allocate new memory everytime you call super_directory (because it relies on NUL at the end of the string). > + dir_len = dir ? strlen(dir) : 0; > + search = find_directory(dir, dir_len, &crc, table); > + } > + search->de_nsubtrees++; > + (*current)->next = tmp; > + while ((*current)->next) > + *current = (*current)->next; > + > + return ret; > +} > +static struct directory_entry *compile_directory_data(struct index_state > *istate, > + int nfile, > + unsigned int *ndir, > + unsigned int > *total_dir_len, > + unsigned int > *total_file_len) > +{ > + int i, dir_len = -1; > + char *dir; > + struct directory_entry *de, *current, *search; > + struct cache_entry **cache = istate->cache; > + struct conflict_entry *conflict_entry; > + struct hash_table table; > + uint32_t crc; > + > + init_hash(&table); > + de = init_directory_entry("", 0); > + current = de; > + *ndir = 1; > + *total_dir_len = 1; > + crc = crc32(0, (Bytef*)de->pathname, de->de_pathlen); > + insert_hash(crc, de, &table); > + conflict_entry = NULL; > + for (i = 0; i < nfile; i++) { > + if (cache[i]->ce_flags & CE_REMOVE) > + continue; > + > + if (dir_len < 0 > + || cache[i]->name[dir_len] != '/' Need a check to make sure name[dir_len] is not out of bound > + || strchr(cache[i]->name + dir_len + 1, '/') > + || cache_name_compare(cache[i]->name, > ce_namelen(cache[i]), > + dir, dir_len)) { In my opinon, "if (dir_len < 0 || !(must && be && a && subdirectory))" is easier to read.. > + dir = super_directory(cache[i]->name); > + dir_len = dir ? strlen(dir) : 0; > + search = get_directory(dir, dir_len, &table, > +