Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)

2013-10-22 Thread Karsten Blees
On 18.10.2013 21:09, Junio C Hamano wrote:
> Karsten Blees writes:
> 
>> The coredumps are caused by my patch #10, which free()s
>> cache_entries when they are removed, in combination with ...
> 
> Looking at that patch, it makes me wonder if remove_index_entry_at()
> and replace_index_entry() should be the ones that free the old
> entry in the first place.  A caller may already have a ce pointing
> at an old entry and use the information from old_ce to update a new
> one after it installed it, e.g.
> 
>   old_ce = ...
>   new_ce = make_cache_entry(... old_ce->name, ...);
>   replace_index_entry(... new_ce);
>   new_ce->ce_mode = old_ce->ce_mode;
>   free(old_ce);
> 
> The same goes for the functions that remove the entry.
> 

Moving free() to the callers, or to the callers' callers, would make the
change much more complicated (more places to touch). Besides, most callers
don't even hold a reference to old_ce and simply remove by position. Of
course, this doesn't prevent callers further up the chain from keeping a
reference to a removed or replaced entry, as found by Thomas.

> 
> Going forward, I do agree with your patch #10 that removal or
> replacing that may make an existing entry unreferenced should free
> entries that are no longer used, and "use after free" should be
> forbidden.
> 

OK, I'll spend some more time analyzing the call hierarchies to see if there 
are more uses of removed cache_entries. I'll try to post an updated v4 by the 
end of the week.

Karsten
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)

2013-10-18 Thread Jens Lehmann
On 18.10.2013 21:09, Junio C Hamano wrote:
> Karsten Blees writes:
>> Can't we just use add_file_to_cache here (which replaces
>> cache_entries by creating a copy)?
>>
>> diff --git a/submodule.c b/submodule.c
>> index 1905d75..e388487 100644
>> --- a/submodule.c
>> +++ b/submodule.c
>> @@ -116,30 +116,7 @@ int remove_path_from_gitmodules(const char *path)
>>  
>>  void stage_updated_gitmodules(void)
>>  {
>> -   struct strbuf buf = STRBUF_INIT;
>> -   struct stat st;
>> -   int pos;
>> -   struct cache_entry *ce;
>> -   int namelen = strlen(".gitmodules");
>> -
>> -   pos = cache_name_pos(".gitmodules", namelen);
>> -   if (pos < 0) {
>> -       warning(_("could not find .gitmodules in index"));
>> -       return;
>> -   }
> 
> I think the remainder is (morally) equivalent between the original
> and a single "add-file-to-cache" call, and the version after your
> "how about this" patch in the message I am responding to looks more
> correct (e.g. why does the original lstat after it has read the
> file?).

Cargo cult programming. I was looking at other code manipulating
the index (as Documentation/technical/api-in-core-index.txt is
rather terse ;-) and concluded I would need to read the possibly
updated st.st_mode, in case updating the config file would have
changed that.

> But this warning may want to stay, no?

Of course you are right on this one. All tests ran successfully
with this patch, so I think adding a test for that warning makes
sense too. And as that is submodule-related stuff, I volunteer
to fix all this ;-)

>> -   ce = active_cache[pos];
>> -   ce->ce_flags = namelen;
>> -   if (strbuf_read_file(&buf, ".gitmodules", 0) < 0)
>> -       die(_("reading updated .gitmodules failed"));
>> -   if (lstat(".gitmodules", &st) < 0)
>> -       die_errno(_("unable to stat updated .gitmodules"));
>> -   fill_stat_cache_info(ce, &st);
>> -   ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
>> -   if (remove_cache_entry_at(pos) < 0)
>> -       die(_("unable to remove .gitmodules from index"));
>> -   if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
>> -       die(_("adding updated .gitmodules failed"));
>> -   if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
>> +   if (add_file_to_cache(".gitmodules", 0))
>>         die(_("staging updated .gitmodules failed"));



Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)

2013-10-18 Thread Jens Lehmann
On 18.10.2013 02:42, Karsten Blees wrote:
> On 17.10.2013 23:07, Junio C Hamano wrote:
>> Junio C Hamano writes:
>>
>>> Karsten Blees writes:
>>>
>>>> On 16.10.2013 23:43, Junio C Hamano wrote:
>>>>> * kb/fast-hashmap (2013-09-25) 6 commits
>>>>>  - fixup! diffcore-rename.c: simplify finding exact renames
>>>>>  - diffcore-rename.c: use new hash map implementation
>>>>>  - diffcore-rename.c: simplify finding exact renames
>>>>>  - diffcore-rename.c: move code around to prepare for the next patch
>>>>>  - buitin/describe.c: use new hash map implementation
>>>>>  - add a hashtable implementation that supports O(1) removal
>>>>>
>>>>
>>>> I posted a much more complete v3 [1], but somehow missed Jonathan's fixup!
>>>> commit.
>>>
>>> Thanks; I'll replace the above with v3 and squash the fix-up in.
>>
>> Interestingly, v3 applied on 'maint' and then merged to 'master'
>> seems to break t3600 and t7001 with a coredump.
>>
>> It would conflict with es/name-hash-no-trailing-slash-in-dirs that
>> has been cooking in 'next', too; the resolution might be trivial but
>> I didn't look too deeply into it.
>>
> 
> I've pushed a rebased version to 
> https://github.com/kblees/git/commits/kb/hashmap-v3-next
> (no changes yet except for Jonathan's fixup in #04 and merge resolution).
> 
> The coredumps are caused by my patch #10, which free()s cache_entries when 
> they are removed, in combination with submodule.c::stage_updated_gitmodules 
> (5fee9952 "submodule.c: add .gitmodules staging helper functions"), which 
> removes a cache_entry, then modifies and re-adds the (now) free()d memory.
> 
> Can't we just use add_file_to_cache here (which replaces cache_entries by 
> creating a copy)?

No objections from my side. Looks like we could also copy the
cache entry just before remove_cache_entry_at() and use that
copy afterwards, but your solution is so much shorter that I
would really like to use it (unless someone more cache-savvy
than me has any objections).
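
The copy-before-remove alternative could look roughly like the sketch
below. It assumes a toy index whose remove function frees the entry, the
way patch #10 does; toy_entry, toy_index, toy_add(), toy_remove_at() and
toy_restage() are hypothetical names and do not exist in git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct cache_entry; only two fields. */
struct toy_entry {
	unsigned int ce_flags;
	char name[32];
};

/* Hypothetical index that owns the entries it stores. */
struct toy_index {
	struct toy_entry *cache[8];
	int nr;
};

/* Append an entry; the index takes ownership of it. */
static int toy_add(struct toy_index *idx, struct toy_entry *ce)
{
	if (idx->nr >= 8)
		return -1;
	idx->cache[idx->nr++] = ce;
	return 0;
}

/* Remove by position and free the entry, so old pointers become invalid. */
static int toy_remove_at(struct toy_index *idx, int pos)
{
	if (pos < 0 || pos >= idx->nr)
		return -1;
	free(idx->cache[pos]);
	memmove(&idx->cache[pos], &idx->cache[pos + 1],
		(idx->nr - pos - 1) * sizeof(idx->cache[0]));
	idx->nr--;
	return 0;
}

/* Duplicate the entry first, then remove the original and add the copy. */
int toy_restage(struct toy_index *idx, int pos, unsigned int new_flags)
{
	struct toy_entry *copy;

	if (pos < 0 || pos >= idx->nr)
		return -1;
	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;
	memcpy(copy, idx->cache[pos], sizeof(*copy));
	copy->ce_flags = new_flags;	/* modify the copy, never the original */
	if (toy_remove_at(idx, pos) < 0) {
		free(copy);
		return -1;
	}
	return toy_add(idx, copy);
}
```

The point of the sketch is only the ordering: nothing touches the
original entry after toy_remove_at() has freed it.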

And by the way: this is the last use of remove_cache_entry_at().
Would it make sense to remove that define while at it? Only the
remove_index_entry_at() function it expands to is currently
used elsewhere.

> diff --git a/submodule.c b/submodule.c
> index 1905d75..e388487 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -116,30 +116,7 @@ int remove_path_from_gitmodules(const char *path)
>  
>  void stage_updated_gitmodules(void)
>  {
> -   struct strbuf buf = STRBUF_INIT;
> -   struct stat st;
> -   int pos;
> -   struct cache_entry *ce;
> -   int namelen = strlen(".gitmodules");
> -
> -   pos = cache_name_pos(".gitmodules", namelen);
> -   if (pos < 0) {
> -       warning(_("could not find .gitmodules in index"));
> -       return;
> -   }
> -   ce = active_cache[pos];
> -   ce->ce_flags = namelen;
> -   if (strbuf_read_file(&buf, ".gitmodules", 0) < 0)
> -       die(_("reading updated .gitmodules failed"));
> -   if (lstat(".gitmodules", &st) < 0)
> -       die_errno(_("unable to stat updated .gitmodules"));
> -   fill_stat_cache_info(ce, &st);
> -   ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
> -   if (remove_cache_entry_at(pos) < 0)
> -       die(_("unable to remove .gitmodules from index"));
> -   if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
> -       die(_("adding updated .gitmodules failed"));
> -   if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
> +   if (add_file_to_cache(".gitmodules", 0))
>         die(_("staging updated .gitmodules failed"));
>  }
> 


Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)

2013-10-18 Thread Junio C Hamano
Karsten Blees  writes:

> The coredumps are caused by my patch #10, which free()s
> cache_entries when they are removed, in combination with ...

Looking at that patch, it makes me wonder if remove_index_entry_at()
and replace_index_entry() should be the ones that free the old
entry in the first place.  A caller may already have a ce pointing
at an old entry and use the information from old_ce to update a new
one after it installed it, e.g.

  old_ce = ...
  new_ce = make_cache_entry(... old_ce->name, ...);
  replace_index_entry(... new_ce);
  new_ce->ce_mode = old_ce->ce_mode;
  free(old_ce);

The same goes for the functions that remove the entry.
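
To make the hazard concrete, here is a minimal sketch, assuming a toy
single-slot index that frees the entry it replaces. The names toy_entry,
toy_index, toy_make_entry() and toy_replace_entry() are hypothetical
stand-ins, not git's API. It reorders the pseudocode above so that the
needed field is copied out of old_ce before the replacement frees it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct cache_entry; only the fields used here. */
struct toy_entry {
	unsigned int ce_mode;
	char name[32];
};

/* Hypothetical index with a single slot that owns its entry. */
struct toy_index {
	struct toy_entry *slot;
};

/* Allocate a zeroed entry and fill in name and mode. */
static struct toy_entry *toy_make_entry(const char *name, unsigned int mode)
{
	struct toy_entry *ce = calloc(1, sizeof(*ce));

	if (!ce)
		abort();
	ce->ce_mode = mode;
	strncpy(ce->name, name, sizeof(ce->name) - 1);
	return ce;
}

/* Install new_ce; the index frees the entry it no longer references. */
static void toy_replace_entry(struct toy_index *idx, struct toy_entry *new_ce)
{
	free(idx->slot);
	idx->slot = new_ce;
}

/* Replace the entry while carrying the old mode over to the new one. */
unsigned int toy_replace_keeping_mode(struct toy_index *idx)
{
	struct toy_entry *old_ce = idx->slot;
	struct toy_entry *new_ce = toy_make_entry(old_ce->name, 0);
	unsigned int saved_mode = old_ce->ce_mode;	/* copy before replacing */

	toy_replace_entry(idx, new_ce);	/* old_ce is dangling from here on */
	new_ce->ce_mode = saved_mode;
	return idx->slot->ce_mode;
}
```

With this ordering the caller never needs its own free(old_ce), and never
reads old_ce after the index has released it.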

But I am probably biased saying this, because in the old days, cache
entries could never be freed (they were carved out of a contiguous
region of memory, mmapped from the index file).  These days, we
parse and run ntoh*() on the on-disk cache entries to create the
in-core form, and "cache entries should never be freed" is no longer
true, but I would not be surprised if there is still some leftover
code that relies on "use after free" being safe, leaking unused
cache entries.
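
As an illustration of the parse-and-convert step, here is a sketch that
assumes a made-up on-disk record layout; toy_ondisk_entry and toy_entry
are hypothetical and do not reflect git's real index format. Fields kept
in network byte order are converted with ntohl() into a heap-allocated
in-core struct, which can then be freed independently of the file buffer.

```c
#include <arpa/inet.h>	/* ntohl(), htonl() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical on-disk record: big-endian fields, then a fixed-size name. */
struct toy_ondisk_entry {
	uint32_t mode;
	uint32_t size;
	char name[24];
};

/* Hypothetical in-core form, in host byte order, separately allocated. */
struct toy_entry {
	unsigned int ce_mode;
	unsigned int ce_size;
	char name[24];
};

/* Build an in-core entry from an on-disk record; the caller frees it. */
struct toy_entry *toy_entry_from_ondisk(const struct toy_ondisk_entry *ondisk)
{
	struct toy_entry *ce = malloc(sizeof(*ce));

	if (!ce)
		return NULL;
	ce->ce_mode = ntohl(ondisk->mode);
	ce->ce_size = ntohl(ondisk->size);
	memcpy(ce->name, ondisk->name, sizeof(ce->name));
	ce->name[sizeof(ce->name) - 1] = '\0';	/* force termination */
	return ce;
}
```

Because each in-core entry is its own allocation, freeing one on removal
is possible, which was not the case for entries that pointed straight
into the mmapped file.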

Going forward, I do agree with your patch #10 that removal or
replacing that may make an existing entry unreferenced should free
entries that are no longer used, and "use after free" should be
forbidden.

> Can't we just use add_file_to_cache here (which replaces
> cache_entries by creating a copy)?
>
> diff --git a/submodule.c b/submodule.c
> index 1905d75..e388487 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -116,30 +116,7 @@ int remove_path_from_gitmodules(const char *path)
>  
>  void stage_updated_gitmodules(void)
>  {
> -   struct strbuf buf = STRBUF_INIT;
> -   struct stat st;
> -   int pos;
> -   struct cache_entry *ce;
> -   int namelen = strlen(".gitmodules");
> -
> -   pos = cache_name_pos(".gitmodules", namelen);
> -   if (pos < 0) {
> -       warning(_("could not find .gitmodules in index"));
> -       return;
> -   }

I think the remainder is (morally) equivalent between the original
and a single "add-file-to-cache" call, and the version after your
"how about this" patch in the message I am responding to looks more
correct (e.g. why does the original lstat after it has read the
file?).

But this warning may want to stay, no?

> -   ce = active_cache[pos];
> -   ce->ce_flags = namelen;
> -   if (strbuf_read_file(&buf, ".gitmodules", 0) < 0)
> -       die(_("reading updated .gitmodules failed"));
> -   if (lstat(".gitmodules", &st) < 0)
> -       die_errno(_("unable to stat updated .gitmodules"));
> -   fill_stat_cache_info(ce, &st);
> -   ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
> -   if (remove_cache_entry_at(pos) < 0)
> -       die(_("unable to remove .gitmodules from index"));
> -   if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
> -       die(_("adding updated .gitmodules failed"));
> -   if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
> +   if (add_file_to_cache(".gitmodules", 0))
>         die(_("staging updated .gitmodules failed"));



>  }


Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)

2013-10-17 Thread Karsten Blees
On 17.10.2013 23:07, Junio C Hamano wrote:
> Junio C Hamano writes:
> 
>> Karsten Blees writes:
>>
>>> On 16.10.2013 23:43, Junio C Hamano wrote:
>>>> * kb/fast-hashmap (2013-09-25) 6 commits
>>>>  - fixup! diffcore-rename.c: simplify finding exact renames
>>>>  - diffcore-rename.c: use new hash map implementation
>>>>  - diffcore-rename.c: simplify finding exact renames
>>>>  - diffcore-rename.c: move code around to prepare for the next patch
>>>>  - buitin/describe.c: use new hash map implementation
>>>>  - add a hashtable implementation that supports O(1) removal
>>>>
>>>
>>> I posted a much more complete v3 [1], but somehow missed Jonathan's fixup! 
>>> commit.
>>
>> Thanks; I'll replace the above with v3 and squash the fix-up in.
> 
> Interestingly, v3 applied on 'maint' and then merged to 'master'
> seems to break t3600 and t7001 with a coredump.
> 
> It would conflict with es/name-hash-no-trailing-slash-in-dirs that
> has been cooking in 'next', too; the resolution might be trivial but
> I didn't look too deeply into it.
> 

I've pushed a rebased version to 
https://github.com/kblees/git/commits/kb/hashmap-v3-next
(no changes yet except for Jonathan's fixup in #04 and merge resolution).

The coredumps are caused by my patch #10, which free()s cache_entries when they 
are removed, in combination with submodule.c::stage_updated_gitmodules 
(5fee9952 "submodule.c: add .gitmodules staging helper functions"), which 
removes a cache_entry, then modifies and re-adds the (now) free()d memory.

Can't we just use add_file_to_cache here (which replaces cache_entries by 
creating a copy)?


diff --git a/submodule.c b/submodule.c
index 1905d75..e388487 100644
--- a/submodule.c
+++ b/submodule.c
@@ -116,30 +116,7 @@ int remove_path_from_gitmodules(const char *path)
 
 void stage_updated_gitmodules(void)
 {
-   struct strbuf buf = STRBUF_INIT;
-   struct stat st;
-   int pos;
-   struct cache_entry *ce;
-   int namelen = strlen(".gitmodules");
-
-   pos = cache_name_pos(".gitmodules", namelen);
-   if (pos < 0) {
-       warning(_("could not find .gitmodules in index"));
-       return;
-   }
-   ce = active_cache[pos];
-   ce->ce_flags = namelen;
-   if (strbuf_read_file(&buf, ".gitmodules", 0) < 0)
-       die(_("reading updated .gitmodules failed"));
-   if (lstat(".gitmodules", &st) < 0)
-       die_errno(_("unable to stat updated .gitmodules"));
-   fill_stat_cache_info(ce, &st);
-   ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
-   if (remove_cache_entry_at(pos) < 0)
-       die(_("unable to remove .gitmodules from index"));
-   if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
-       die(_("adding updated .gitmodules failed"));
-   if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
+   if (add_file_to_cache(".gitmodules", 0))
        die(_("staging updated .gitmodules failed"));
 }



Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)

2013-10-17 Thread Junio C Hamano
Junio C Hamano writes:

> Karsten Blees writes:
>
>> On 16.10.2013 23:43, Junio C Hamano wrote:
>>> * kb/fast-hashmap (2013-09-25) 6 commits
>>>  - fixup! diffcore-rename.c: simplify finding exact renames
>>>  - diffcore-rename.c: use new hash map implementation
>>>  - diffcore-rename.c: simplify finding exact renames
>>>  - diffcore-rename.c: move code around to prepare for the next patch
>>>  - buitin/describe.c: use new hash map implementation
>>>  - add a hashtable implementation that supports O(1) removal
>>> 
>>
>> I posted a much more complete v3 [1], but somehow missed Jonathan's fixup! 
>> commit.
>
> Thanks; I'll replace the above with v3 and squash the fix-up in.

Interestingly, v3 applied on 'maint' and then merged to 'master'
seems to break t3600 and t7001 with a coredump.

It would conflict with es/name-hash-no-trailing-slash-in-dirs that
has been cooking in 'next', too; the resolution might be trivial but
I didn't look too deeply into it.





Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)

2013-10-17 Thread Junio C Hamano
Karsten Blees writes:

> On 16.10.2013 23:43, Junio C Hamano wrote:
>> * kb/fast-hashmap (2013-09-25) 6 commits
>>  - fixup! diffcore-rename.c: simplify finding exact renames
>>  - diffcore-rename.c: use new hash map implementation
>>  - diffcore-rename.c: simplify finding exact renames
>>  - diffcore-rename.c: move code around to prepare for the next patch
>>  - buitin/describe.c: use new hash map implementation
>>  - add a hashtable implementation that supports O(1) removal
>> 
>
> I posted a much more complete v3 [1], but somehow missed Jonathan's fixup! 
> commit.

Thanks; I'll replace the above with v3 and squash the fix-up in.



Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)

2013-10-17 Thread Karsten Blees
On 16.10.2013 23:43, Junio C Hamano wrote:
> * kb/fast-hashmap (2013-09-25) 6 commits
>  - fixup! diffcore-rename.c: simplify finding exact renames
>  - diffcore-rename.c: use new hash map implementation
>  - diffcore-rename.c: simplify finding exact renames
>  - diffcore-rename.c: move code around to prepare for the next patch
>  - buitin/describe.c: use new hash map implementation
>  - add a hashtable implementation that supports O(1) removal
> 

I posted a much more complete v3 [1], but somehow missed Jonathan's fixup! 
commit.

By the way, the test suite didn't catch the uninitialized variable on
MinGW, on Linux, or under valgrind. Is there a way to run the tests with
STACK_POISON or something similar?

[1] http://thread.gmane.org/gmane.comp.version-control.git/235644
