Re: [PATCH v3 18/24] read-cache: write index-v5

2013-11-25 Thread Thomas Gummerer

[sorry about taking so much time to get back to you, was too busy with
other stuff to work on git]

Duy Nguyen pclo...@gmail.com writes:

 On Mon, Aug 19, 2013 at 2:42 AM, Thomas Gummerer t.gumme...@gmail.com wrote:
 +char *super_directory(const char *filename)
 +{
 +   char *slash;
 +
 +   slash = strrchr(filename, '/');
 +   if (slash)
 +   return xmemdupz(filename, slash-filename);
 +   return NULL;
 +}
 +

 Why is this function not static? There are a few other in this patch
 (I did not notice in others, but I wasn't looking for them..)

That's just an oversight, it should be static.  Will look for the others
and change them to static too.

 But isn't this what dirname() is?

 Another point about this function is it returns a new allocated
 string, but I see no free() anywhere in this patch. Leak alert!

Yes, you're right, I'll use dirname instead.  That also solves the free
problem, as dirname modifies the string.  I'll still use a wrapper
around it, because dirname returns . when there is no super_directory,
but using NULL for that makes it simpler to check it.

 +struct directory_entry *init_directory_entry(char *pathname, int len)
 +{
 +   struct directory_entry *de = xmalloc(directory_entry_size(len));
 +
 +   memcpy(de-pathname, pathname, len);
 +   de-pathname[len] = '\0';
 +   de-de_flags  = 0;
 +   de-de_foffset= 0;
 +   de-de_cr = 0;
 +   de-de_ncr= 0;
 +   de-de_nsubtrees  = 0;
 +   de-de_nfiles = 0;
 +   de-de_nentries   = 0;
 +   memset(de-sha1, 0, 20);
 +   de-de_pathlen= len;
 +   de-next  = NULL;
 +   de-next_hash = NULL;
 +   de-ce= NULL;
 +   de-ce_last   = NULL;
 +   de-conflict  = NULL;
 +   de-conflict_last = NULL;
 +   de-conflict_size = 0;
 +   return de;
 +}

 I think this function could be shortened to

 struct directory_entry *de = xcalloc(1, directory_entry_size(len));
 memcpy(de-pathname, pathname, len);
 de-de_pathlen = len;
 return de;

Makes sense, thanks.

 +static struct directory_entry *get_directory(char *dir, unsigned int 
 dir_len,
 +struct hash_table *table,
 +unsigned int *total_dir_len,
 +unsigned int *ndir,
 +struct directory_entry 
 **current)
 +{
 +   struct directory_entry *tmp = NULL, *search, *new, *ret;
 +   uint32_t crc;
 +
 +   search = find_directory(dir, dir_len, crc, table);
 +   if (search)
 +   return search;
 +   while (!search) {
 +   new = init_directory_entry(dir, dir_len);
 +   insert_directory_entry(new, table, total_dir_len, ndir, crc);
 +   if (!tmp)
 +   ret = new;
 +   else
 +   new-de_nsubtrees = 1;
 +   new-next = tmp;
 +   tmp = new;
 +   dir = super_directory(dir);

 It feels more natural to create directory_entry(s) from parent to
 subdir. If you do so you could reset dir to the remaining of directory
 and perform strchr() and do not need to allocate new memory everytime
 you call super_directory (because it relies on NUL at the end of the
 string).

I'm creating them from subdir to parent because it saves a few calls
whenever a superdir is already in the hash_table.  And I'm using dirname
so the new memory allocations are not a problem anymore.

 +   dir_len = dir ? strlen(dir) : 0;
 +   search = find_directory(dir, dir_len, crc, table);
 +   }
 +   search-de_nsubtrees++;
 +   (*current)-next = tmp;
 +   while ((*current)-next)
 +   *current = (*current)-next;
 +
 +   return ret;
 +}

 +static struct directory_entry *compile_directory_data(struct index_state 
 *istate,
 + int nfile,
 + unsigned int *ndir,
 + unsigned int 
 *total_dir_len,
 + unsigned int 
 *total_file_len)
 +{
 +   int i, dir_len = -1;
 +   char *dir;
 +   struct directory_entry *de, *current, *search;
 +   struct cache_entry **cache = istate-cache;
 +   struct conflict_entry *conflict_entry;
 +   struct hash_table table;
 +   uint32_t crc;
 +
 +   init_hash(table);
 +   de = init_directory_entry(, 0);
 +   current = de;
 +   *ndir = 1;
 +   *total_dir_len = 1;
 +   crc = crc32(0, (Bytef*)de-pathname, de-de_pathlen);
 +   insert_hash(crc, de, table);
 +   conflict_entry = NULL;
 +   for (i = 0; i  nfile; i++) {
 +   if (cache[i]-ce_flags  CE_REMOVE)
 +   continue;
 +
 +   if (dir_len  0
 +  

Re: [PATCH v3 18/24] read-cache: write index-v5

2013-08-24 Thread Duy Nguyen
On Sat, Aug 24, 2013 at 11:07 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Mon, Aug 19, 2013 at 2:42 AM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Write the index version 5 file format to disk. This version doesn't
 write the cache-tree data and resolve-undo data to the file.

 I keep having things to add after sending my emails. Now that we have
 all conflicted entries in file block, the conflict data block becomes
 optional, it functions exactly (I think) like resolve-undo extension,
 which makes me think it might make sense to make conflict data block
 an extension.

 If we make it so, we might want to move cr and ncr fields out of
 direntries. I don't see a solution yet, but I think it's interesting
 because future extensions might want to attach stuff to direntries,
 just like cr/ncr from conflict extension. We may want to think now
 how that might be done (and conflict extension is a good exercise to
 see how it works out)

And the solution is pretty obvious: keep resolve-undo extension as
_extension_ just like in v2 (and no cr/ncr in direntries). The only
difference is the time resolve-undo extension is updated: in v2, new
entries are added at remove_index_entry_at(), in v5 new entries are
added when new stage entries  are detected at write_index().
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 18/24] read-cache: write index-v5

2013-08-23 Thread Duy Nguyen
On Mon, Aug 19, 2013 at 2:42 AM, Thomas Gummerer t.gumme...@gmail.com wrote:
 Write the index version 5 file format to disk. This version doesn't
 write the cache-tree data and resolve-undo data to the file.

I keep having things to add after sending my emails. Now that we have
all conflicted entries in file block, the conflict data block becomes
optional, it functions exactly (I think) like resolve-undo extension,
which makes me think it might make sense to make conflict data block
an extension.

If we make it so, we might want to move cr and ncr fields out of
direntries. I don't see a solution yet, but I think it's interesting
because future extensions might want to attach stuff to direntries,
just like cr/ncr from conflict extension. We may want to think now
how that might be done (and conflict extension is a good exercise to
see how it works out)
-- 
Duy
I
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html