Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)
On 18.10.2013 21:09, Junio C Hamano wrote:
> Karsten Blees writes:
>
>> The coredumps are caused by my patch #10, which free()s
>> cache_entries when they are removed, in combination with ...
>
> Looking at that patch, it makes me wonder if remove_index_entry_at()
> and replace_index_entry() should be the ones that free the old
> entry in the first place. A caller may already have a ce pointing
> at an old entry and use the information from old_ce to update a new
> one after it installed it, e.g.
>
>     old_ce = ...
>     new_ce = make_cache_entry(... old_ce->name, ...);
>     replace_index_entry(... new_ce);
>     new_ce->ce_mode = old_ce->ce_mode;
>     free(old_ce);
>
> The same goes for the functions that remove the entry.
>

Moving free() to the callers or callers' callers would make it much
more complicated (more places to change). Besides, most callers don't
even have a reference to old_ce and simply remove by position.

Of course, this doesn't prevent caller's caller's callers from keeping
a reference to a removed / replaced entry, as found by Thomas.

>
> Going forward, I do agree with your patch #10 that removal or
> replacing that may make an existing entry unreferenced should free
> entries that are no longer used, and "use after free" should be
> forbidden.
>

OK, I'll spend some more time analyzing the call hierarchies to see if
there are more uses of removed cache_entries. I'll try to post an
updated v4 by the end of the week.

Karsten
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
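The ownership rule discussed in this message can be sketched in isolation. The following is a toy illustration only, not git code: `toy_entry`, `toy_index`, `toy_make_entry`, `toy_replace_entry` and `toy_rename_keep_mode` are hypothetical names, and the single-slot "index" is an assumption made to keep the example small. It shows a caller reading what it needs from the old entry before the replace call, because the replace call frees the old entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a cache entry; not git's real struct cache_entry. */
struct toy_entry {
	unsigned int mode;
	char name[32];
};

/* Toy index with a single slot, enough to show who owns the entry. */
struct toy_index {
	struct toy_entry *slot;
};

static struct toy_entry *toy_make_entry(const char *name, unsigned int mode)
{
	struct toy_entry *e = calloc(1, sizeof(*e));

	if (!e)
		abort();
	/* calloc zeroed the buffer, so the copy stays NUL-terminated */
	strncpy(e->name, name, sizeof(e->name) - 1);
	e->mode = mode;
	return e;
}

/* Replacing frees the old entry; the index owns whatever it holds. */
static void toy_replace_entry(struct toy_index *idx, struct toy_entry *new_e)
{
	free(idx->slot);
	idx->slot = new_e;
}

/*
 * Caller-side pattern: copy everything needed from the old entry
 * before replacing it, since the old pointer is dead afterwards.
 */
static unsigned int toy_rename_keep_mode(struct toy_index *idx,
					 const char *new_name)
{
	struct toy_entry *old_e = idx->slot;
	unsigned int saved_mode;

	assert(old_e != NULL);
	saved_mode = old_e->mode;	/* read first ... */
	toy_replace_entry(idx, toy_make_entry(new_name, saved_mode));
	/* ... and never dereference old_e from here on */
	return idx->slot->mode;
}
```

With this ordering the caller needs neither a trailing free(old_ce) nor a second look at the old entry, which is the property a freeing replace function relies on.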
Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)
On 18.10.2013 21:09, Junio C Hamano wrote:
> Karsten Blees writes:
>> Can't we just use add_file_to_cache here (which replaces
>> cache_entries by creating a copy)?
>>
>> diff --git a/submodule.c b/submodule.c
>> index 1905d75..e388487 100644
>> --- a/submodule.c
>> +++ b/submodule.c
>> @@ -116,30 +116,7 @@ int remove_path_from_gitmodules(const char *path)
>>
>>  void stage_updated_gitmodules(void)
>>  {
>> -	struct strbuf buf = STRBUF_INIT;
>> -	struct stat st;
>> -	int pos;
>> -	struct cache_entry *ce;
>> -	int namelen = strlen(".gitmodules");
>> -
>> -	pos = cache_name_pos(".gitmodules", namelen);
>> -	if (pos < 0) {
>> -		warning(_("could not find .gitmodules in index"));
>> -		return;
>> -	}
>
> I think the remainder is (morally) equivalent between the original
> and a single "add-file-to-cache" call, and the version after your
> "how about this" patch in the message I am responding to looks more
> correct (e.g. why does the original lstat after it has read the
> file?).

Cargo cult programming. I was looking at other code manipulating the
index (as Documentation/technical/api-in-core-index.txt is rather
terse ;-) and concluded I would need to read the possibly updated
st.st_mode, in case updating the config file would have changed that.

> But this warning may want to stay, no?

Of course you are right on this one. All tests ran successfully with
this patch, so I think adding one for that warning makes sense too.
And as that is submodule-related stuff, I volunteer for fixing all
this ;-)

>> -	ce = active_cache[pos];
>> -	ce->ce_flags = namelen;
>> -	if (strbuf_read_file(&buf, ".gitmodules", 0) < 0)
>> -		die(_("reading updated .gitmodules failed"));
>> -	if (lstat(".gitmodules", &st) < 0)
>> -		die_errno(_("unable to stat updated .gitmodules"));
>> -	fill_stat_cache_info(ce, &st);
>> -	ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
>> -	if (remove_cache_entry_at(pos) < 0)
>> -		die(_("unable to remove .gitmodules from index"));
>> -	if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
>> -		die(_("adding updated .gitmodules failed"));
>> -	if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
>> +	if (add_file_to_cache(".gitmodules", 0))
>>  		die(_("staging updated .gitmodules failed"));
Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)
On 18.10.2013 02:42, Karsten Blees wrote:
> On 17.10.2013 23:07, Junio C Hamano wrote:
>> Junio C Hamano writes:
>>
>>> Karsten Blees writes:
>>>
>>>> On 16.10.2013 23:43, Junio C Hamano wrote:
>>>>> * kb/fast-hashmap (2013-09-25) 6 commits
>>>>>  - fixup! diffcore-rename.c: simplify finding exact renames
>>>>>  - diffcore-rename.c: use new hash map implementation
>>>>>  - diffcore-rename.c: simplify finding exact renames
>>>>>  - diffcore-rename.c: move code around to prepare for the next patch
>>>>>  - buitin/describe.c: use new hash map implementation
>>>>>  - add a hashtable implementation that supports O(1) removal
>>>>>
>>>>
>>>> I posted a much more complete v3 [1], but somehow missed Jonathan's
>>>> fixup! commit.
>>>
>>> Thanks; I'll replace the above with v3 and squash the fix-up in.
>>
>> Interestingly, v3 applied on 'maint' and then merged to 'master'
>> seems to break t3600 and t7001 with a coredump.
>>
>> It would conflict with es/name-hash-no-trailing-slash-in-dirs that
>> has been cooking in 'next', too; the resolution might be trivial but
>> I didn't look too deeply into it.
>>
>
> I've pushed a rebased version to
> https://github.com/kblees/git/commits/kb/hashmap-v3-next
> (no changes yet except for Jonathan's fixup in #04 and merge resolution).
>
> The coredumps are caused by my patch #10, which free()s cache_entries when
> they are removed, in combination with submodule.c::stage_updated_gitmodules
> (5fee9952 "submodule.c: add .gitmodules staging helper functions"), which
> removes a cache_entry, then modifies and re-adds the (now) free()d memory.
>
> Can't we just use add_file_to_cache here (which replaces cache_entries by
> creating a copy)?

No objections from my side. Looks like we could also copy the cache
entry just before remove_cache_entry_at() and use that copy afterwards,
but your solution is so much shorter that I would really like to use it
(unless someone more cache-savvy than me has any objections).

And by the way: this is the last use of remove_cache_entry_at(). Would
it make sense to remove that define while at it? Only
remove_index_entry_at(), the function it is defined to, is currently
used.

> diff --git a/submodule.c b/submodule.c
> index 1905d75..e388487 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -116,30 +116,7 @@ int remove_path_from_gitmodules(const char *path)
>
>  void stage_updated_gitmodules(void)
>  {
> -	struct strbuf buf = STRBUF_INIT;
> -	struct stat st;
> -	int pos;
> -	struct cache_entry *ce;
> -	int namelen = strlen(".gitmodules");
> -
> -	pos = cache_name_pos(".gitmodules", namelen);
> -	if (pos < 0) {
> -		warning(_("could not find .gitmodules in index"));
> -		return;
> -	}
> -	ce = active_cache[pos];
> -	ce->ce_flags = namelen;
> -	if (strbuf_read_file(&buf, ".gitmodules", 0) < 0)
> -		die(_("reading updated .gitmodules failed"));
> -	if (lstat(".gitmodules", &st) < 0)
> -		die_errno(_("unable to stat updated .gitmodules"));
> -	fill_stat_cache_info(ce, &st);
> -	ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
> -	if (remove_cache_entry_at(pos) < 0)
> -		die(_("unable to remove .gitmodules from index"));
> -	if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
> -		die(_("adding updated .gitmodules failed"));
> -	if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
> +	if (add_file_to_cache(".gitmodules", 0))
>  		die(_("staging updated .gitmodules failed"));
>  }
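The alternative mentioned in this message (copy the entry just before removing it, then work with the copy) might look roughly like the sketch below. It is a toy model under stated assumptions, not git's implementation: `toy_ce`, `toy_cache`, `toy_remove_at`, `toy_add` and `toy_restage` are hypothetical names, and the fixed-size array stands in for the real index.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX 8

/* Toy stand-in for a cache entry; not git's real struct cache_entry. */
struct toy_ce {
	unsigned int mode;
	char name[32];
};

struct toy_cache {
	struct toy_ce *entries[TOY_MAX];
	int nr;
};

/* Removes the entry at pos and frees it, as a freeing remove would. */
static int toy_remove_at(struct toy_cache *c, int pos)
{
	if (pos < 0 || pos >= c->nr)
		return -1;
	free(c->entries[pos]);
	memmove(&c->entries[pos], &c->entries[pos + 1],
		(c->nr - pos - 1) * sizeof(c->entries[0]));
	c->nr--;
	return 0;
}

/* Appends ce; the cache takes ownership on success. */
static int toy_add(struct toy_cache *c, struct toy_ce *ce)
{
	if (c->nr >= TOY_MAX)
		return -1;
	c->entries[c->nr++] = ce;
	return 0;
}

/*
 * Update an entry by copy, remove, re-add. The original is freed by
 * toy_remove_at() and is never touched again; only the copy is used.
 */
static int toy_restage(struct toy_cache *c, int pos, unsigned int new_mode)
{
	struct toy_ce *copy;

	if (pos < 0 || pos >= c->nr)
		return -1;
	assert(c->entries[pos] != NULL);
	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;
	*copy = *c->entries[pos];	/* duplicate before removal */
	copy->mode = new_mode;
	if (toy_remove_at(c, pos) < 0 || toy_add(c, copy) < 0) {
		free(copy);
		return -1;
	}
	return 0;
}
```

Compared with a single add call that creates its own copy, this version carries more bookkeeping, which matches the preference for the shorter patch stated above.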
Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)
Karsten Blees writes:

> The coredumps are caused by my patch #10, which free()s
> cache_entries when they are removed, in combination with ...

Looking at that patch, it makes me wonder if remove_index_entry_at()
and replace_index_entry() should be the ones that free the old
entry in the first place. A caller may already have a ce pointing
at an old entry and use the information from old_ce to update a new
one after it installed it, e.g.

    old_ce = ...
    new_ce = make_cache_entry(... old_ce->name, ...);
    replace_index_entry(... new_ce);
    new_ce->ce_mode = old_ce->ce_mode;
    free(old_ce);

The same goes for the functions that remove the entry.

But I am probably biased saying this, because in the old days, cache
entries could never be freed (they were carved out of a contiguous
region of memory, mmapped from the index file). These days, we parse
and run ntoh*() on the on-disk cache entries to create in-core form,
and the "cache entries should never be freed" is no longer true, but
I would not be surprised if there is still some code left over that
relies on "use after free" being safe, leaking unused cache entries.

Going forward, I do agree with your patch #10 that removal or
replacing that may make an existing entry unreferenced should free
entries that are no longer used, and "use after free" should be
forbidden.

> Can't we just use add_file_to_cache here (which replaces
> cache_entries by creating a copy)?
>
> diff --git a/submodule.c b/submodule.c
> index 1905d75..e388487 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -116,30 +116,7 @@ int remove_path_from_gitmodules(const char *path)
>
>  void stage_updated_gitmodules(void)
>  {
> -	struct strbuf buf = STRBUF_INIT;
> -	struct stat st;
> -	int pos;
> -	struct cache_entry *ce;
> -	int namelen = strlen(".gitmodules");
> -
> -	pos = cache_name_pos(".gitmodules", namelen);
> -	if (pos < 0) {
> -		warning(_("could not find .gitmodules in index"));
> -		return;
> -	}

I think the remainder is (morally) equivalent between the original
and a single "add-file-to-cache" call, and the version after your
"how about this" patch in the message I am responding to looks more
correct (e.g. why does the original lstat after it has read the
file?).

But this warning may want to stay, no?

> -	ce = active_cache[pos];
> -	ce->ce_flags = namelen;
> -	if (strbuf_read_file(&buf, ".gitmodules", 0) < 0)
> -		die(_("reading updated .gitmodules failed"));
> -	if (lstat(".gitmodules", &st) < 0)
> -		die_errno(_("unable to stat updated .gitmodules"));
> -	fill_stat_cache_info(ce, &st);
> -	ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
> -	if (remove_cache_entry_at(pos) < 0)
> -		die(_("unable to remove .gitmodules from index"));
> -	if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
> -		die(_("adding updated .gitmodules failed"));
> -	if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
> +	if (add_file_to_cache(".gitmodules", 0))
>  		die(_("staging updated .gitmodules failed"));
>  }
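One way to read the suggestion to keep the warning is to retain the early "is it in the index?" check and let a single add call do the rest. The sketch below illustrates only that control flow. It is not the eventual git patch: `toy_paths`, `toy_name_pos`, `toy_add_file` and `toy_stage_updated` are hypothetical stand-ins for the index and for the lookup and add functions named in the diff, and it returns a status code where the real function would call warning() or die().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX 4

/* Hypothetical in-memory "index": just a list of tracked path names. */
static const char *toy_paths[TOY_MAX];
static int toy_nr;

/* Returns the position of name, or -1 if it is not tracked. */
static int toy_name_pos(const char *name)
{
	int i;

	for (i = 0; i < toy_nr; i++)
		if (!strcmp(toy_paths[i], name))
			return i;
	return -1;
}

/* Adds name, or refreshes it in place if tracked; nonzero on failure. */
static int toy_add_file(const char *name)
{
	int pos = toy_name_pos(name);

	if (pos >= 0) {
		toy_paths[pos] = name;
		return 0;
	}
	if (toy_nr >= TOY_MAX)
		return -1;
	toy_paths[toy_nr++] = name;
	return 0;
}

enum toy_stage_result { TOY_STAGED, TOY_NOT_IN_INDEX, TOY_FAILED };

static enum toy_stage_result toy_stage_updated(const char *path)
{
	assert(path != NULL);
	/* Keep the early check so an untracked file still warns. */
	if (toy_name_pos(path) < 0) {
		fprintf(stderr, "warning: could not find %s in index\n", path);
		return TOY_NOT_IN_INDEX;
	}
	/* One add call replaces the hand-written remove/re-add steps. */
	if (toy_add_file(path))
		return TOY_FAILED;
	return TOY_STAGED;
}
```

The point of the early return is that a file missing from the index is reported and left alone instead of being silently added.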
Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)
On 17.10.2013 23:07, Junio C Hamano wrote:
> Junio C Hamano writes:
>
>> Karsten Blees writes:
>>
>>> On 16.10.2013 23:43, Junio C Hamano wrote:
>>>> * kb/fast-hashmap (2013-09-25) 6 commits
>>>>  - fixup! diffcore-rename.c: simplify finding exact renames
>>>>  - diffcore-rename.c: use new hash map implementation
>>>>  - diffcore-rename.c: simplify finding exact renames
>>>>  - diffcore-rename.c: move code around to prepare for the next patch
>>>>  - buitin/describe.c: use new hash map implementation
>>>>  - add a hashtable implementation that supports O(1) removal
>>>>
>>>
>>> I posted a much more complete v3 [1], but somehow missed Jonathan's fixup!
>>> commit.
>>
>> Thanks; I'll replace the above with v3 and squash the fix-up in.
>
> Interestingly, v3 applied on 'maint' and then merged to 'master'
> seems to break t3600 and t7001 with a coredump.
>
> It would conflict with es/name-hash-no-trailing-slash-in-dirs that
> has been cooking in 'next', too; the resolution might be trivial but
> I didn't look too deeply into it.
>

I've pushed a rebased version to
https://github.com/kblees/git/commits/kb/hashmap-v3-next
(no changes yet except for Jonathan's fixup in #04 and merge resolution).

The coredumps are caused by my patch #10, which free()s cache_entries
when they are removed, in combination with
submodule.c::stage_updated_gitmodules (5fee9952 "submodule.c: add
.gitmodules staging helper functions"), which removes a cache_entry,
then modifies and re-adds the (now) free()d memory.

Can't we just use add_file_to_cache here (which replaces cache_entries
by creating a copy)?

diff --git a/submodule.c b/submodule.c
index 1905d75..e388487 100644
--- a/submodule.c
+++ b/submodule.c
@@ -116,30 +116,7 @@ int remove_path_from_gitmodules(const char *path)

 void stage_updated_gitmodules(void)
 {
-	struct strbuf buf = STRBUF_INIT;
-	struct stat st;
-	int pos;
-	struct cache_entry *ce;
-	int namelen = strlen(".gitmodules");
-
-	pos = cache_name_pos(".gitmodules", namelen);
-	if (pos < 0) {
-		warning(_("could not find .gitmodules in index"));
-		return;
-	}
-	ce = active_cache[pos];
-	ce->ce_flags = namelen;
-	if (strbuf_read_file(&buf, ".gitmodules", 0) < 0)
-		die(_("reading updated .gitmodules failed"));
-	if (lstat(".gitmodules", &st) < 0)
-		die_errno(_("unable to stat updated .gitmodules"));
-	fill_stat_cache_info(ce, &st);
-	ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
-	if (remove_cache_entry_at(pos) < 0)
-		die(_("unable to remove .gitmodules from index"));
-	if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
-		die(_("adding updated .gitmodules failed"));
-	if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
+	if (add_file_to_cache(".gitmodules", 0))
 		die(_("staging updated .gitmodules failed"));
 }
Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)
Junio C Hamano writes:

> Karsten Blees writes:
>
>> On 16.10.2013 23:43, Junio C Hamano wrote:
>>> * kb/fast-hashmap (2013-09-25) 6 commits
>>>  - fixup! diffcore-rename.c: simplify finding exact renames
>>>  - diffcore-rename.c: use new hash map implementation
>>>  - diffcore-rename.c: simplify finding exact renames
>>>  - diffcore-rename.c: move code around to prepare for the next patch
>>>  - buitin/describe.c: use new hash map implementation
>>>  - add a hashtable implementation that supports O(1) removal
>>>
>>
>> I posted a much more complete v3 [1], but somehow missed Jonathan's fixup!
>> commit.
>
> Thanks; I'll replace the above with v3 and squash the fix-up in.

Interestingly, v3 applied on 'maint' and then merged to 'master'
seems to break t3600 and t7001 with a coredump.

It would conflict with es/name-hash-no-trailing-slash-in-dirs that
has been cooking in 'next', too; the resolution might be trivial but
I didn't look too deeply into it.
Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)
Karsten Blees writes:

> On 16.10.2013 23:43, Junio C Hamano wrote:
>> * kb/fast-hashmap (2013-09-25) 6 commits
>>  - fixup! diffcore-rename.c: simplify finding exact renames
>>  - diffcore-rename.c: use new hash map implementation
>>  - diffcore-rename.c: simplify finding exact renames
>>  - diffcore-rename.c: move code around to prepare for the next patch
>>  - buitin/describe.c: use new hash map implementation
>>  - add a hashtable implementation that supports O(1) removal
>>
>
> I posted a much more complete v3 [1], but somehow missed Jonathan's fixup!
> commit.

Thanks; I'll replace the above with v3 and squash the fix-up in.
Re: What's cooking in git.git (Oct 2013, #03; Wed, 16)
On 16.10.2013 23:43, Junio C Hamano wrote:
> * kb/fast-hashmap (2013-09-25) 6 commits
>  - fixup! diffcore-rename.c: simplify finding exact renames
>  - diffcore-rename.c: use new hash map implementation
>  - diffcore-rename.c: simplify finding exact renames
>  - diffcore-rename.c: move code around to prepare for the next patch
>  - buitin/describe.c: use new hash map implementation
>  - add a hashtable implementation that supports O(1) removal
>

I posted a much more complete v3 [1], but somehow missed Jonathan's
fixup! commit.

Btw, the test suite didn't catch the uninitialized variable, neither
on mingw nor on Linux nor with valgrind. Is there a way to run tests
with STACK_POISON or something?

[1] http://thread.gmane.org/gmane.comp.version-control.git/235644