Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
Yes, --index-info worked for this purpouse, thanks! https://github.com/jsoriano/git/blob/remove-unmerged-index-entry/t/t9904-unmerged-file-with-merged-entry.sh#L25 I'll try to send the patches to the mailing lists later today or tomorrow. On Mon, Aug 18, 2014 at 6:34 PM, Junio C Hamano wrote: > Jaime Soriano Pastor writes: > >> I'd like to add some tests too for this, but I don't know how to >> reproduce this state with git commands only, is there any way to add >> entries to the index without checkings? > > Perhaps feeding "update-index --index-info" four input lines would work? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
Jaime Soriano Pastor writes: > I'd like to add some tests too for this, but I don't know how to > reproduce this state with git commands only, is there any way to add > entries to the index without checkings? Perhaps feeding "update-index --index-info" four input lines would work? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
I'd like to add some tests too for this, but I don't know how to reproduce this state with git commands only, is there any way to add entries to the index without checkings? Or maybe it could be done by creating a "test-" command that adds the entries to an index? Thanks. On Fri, Aug 15, 2014 at 11:45 PM, Junio C Hamano wrote: > Jaime Soriano Pastor writes: > >> On Thu, Aug 14, 2014 at 1:04 AM, Junio C Hamano wrote: >>> Being a conservative, I'd rather avoid doing any magic during >>> read_cache() time. "ls-files -s" for example should show the four >>> stages so that the "broken" state can be inspected. >>> >> Well, only read_cache_unmerged() is modified in the sent patch, so no >> magic is done in read_cache(), I'd also avoid changes there. > > Ahh, I must have overlooked that; changes being only in _unmerged() > variant makes me feel much better, and it probably would make much > of ... > >>> Yes, it would be more work,... > > ... moot, hopefully ;-) > >> git reset will clean the index anyway if the loop finishes, would it >> be ok? > > Surely. > >> git merge is also affected by the loop in read_cache_unmerged(), but >> any of the solutions would be enough for it as only by finishing the >> loop with unmerged entries it will die without commiting the cache to >> the index file. > > Again, true. The mergy operations want to start from a clean slate > and they call _unmerged() variant primarily to learn that there were > unmerged entries in the index, only to abort the operation in that > case, so a change to _unmerged() variant should be safe for them. > > I'll take another look at your patch later, but not before the 2.1 > final, and by that time you may already have sent a reroll ;-) > > Thanks. -- Jaime Soriano Pastor - Software Developer -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
Jaime Soriano Pastor writes: > On Thu, Aug 14, 2014 at 1:04 AM, Junio C Hamano wrote: >> Being a conservative, I'd rather avoid doing any magic during >> read_cache() time. "ls-files -s" for example should show the four >> stages so that the "broken" state can be inspected. >> > Well, only read_cache_unmerged() is modified in the sent patch, so no > magic is done in read_cache(), I'd also avoid changes there. Ahh, I must have overlooked that; changes being only in _unmerged() variant makes me feel much better, and it probably would make much of ... >> Yes, it would be more work,... ... moot, hopefully ;-) > git reset will clean the index anyway if the loop finishes, would it > be ok? Surely. > git merge is also affected by the loop in read_cache_unmerged(), but > any of the solutions would be enough for it as only by finishing the > loop with unmerged entries it will die without commiting the cache to > the index file. Again, true. The mergy operations want to start from a clean slate and they call _unmerged() variant primarily to learn that there were unmerged entries in the index, only to abort the operation in that case, so a change to _unmerged() variant should be safe for them. I'll take another look at your patch later, but not before the 2.1 final, and by that time you may already have sent a reroll ;-) Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
On Thu, Aug 14, 2014 at 1:04 AM, Junio C Hamano wrote: > Being a conservative, I'd rather avoid doing any magic during > read_cache() time. "ls-files -s" for example should show the four > stages so that the "broken" state can be inspected. > Well, only read_cache_unmerged() is modified in the sent patch, so no magic is done in read_cache(), I'd also avoid changes there. Indeed with the patch, "ls-files -s" can be used to inspect the problem without further problems. > Instead, I suspect that the code paths with problematic iterations > over the index entries that assume that having stage #0 entry for a > path guarantees that there will not be any higher stage entry first > need to be identified (you already said "add" and "reset" may be > problematic, there may be others, or they may be the only ones, I > dunno), and then the most sensible one, which would be different > from case to case, among various possibilities need to be chosen as > a fix to each of them: > > (1) the loop may be fixed to ignore/skip unmerged entries; > (2) the loop may be fixed to ignore/skip the merged entry; > (3) the loop may be fixed not to spin indefinitely on a path with > mixed entries; or > (4) the command should error out. > git reset will clean the index anyway if the loop finishes, would it be ok? I think that it'd be acceptable for git reset --hard to clean the index as git checkout -f already does it even in this case. git merge is also affected by the loop in read_cache_unmerged(), but any of the solutions would be enough for it as only by finishing the loop with unmerged entries it will die without commiting the cache to the index file. For git add probably the best option is to error out and ask the user to check "git ls-files -s" to investigate the problem and decide what to do. The error message given by "git commit -a" is a bit confusing in this case, I can take a look to this too. I'll try to prepare a patch with these cases, and rethinking the loop to avoid future problems there, I think that is a bit dangerous to look for the position of a path entry (with index_name_pos) for the next iteration. > Yes, it would be more work, but I'd feel safer if the following > worked: > > $ git ls-files -s > 100644 3cc58df83752123644fef39faab2393af643b1d2 0 conflict > 100644 f70f10e4db19068f79bc43844b49f3eece45c4e8 1 conflict > 100644 3cc58df83752123644fef39faab2393af643b1d2 2 conflict > 100644 223b7836fb19fdf64ba2d3cd6173c6a283141f78 3 conflict > $ >empty > $ git add empty > 100644 3cc58df83752123644fef39faab2393af643b1d2 0 conflict > 100644 f70f10e4db19068f79bc43844b49f3eece45c4e8 1 conflict > 100644 3cc58df83752123644fef39faab2393af643b1d2 2 conflict > 100644 223b7836fb19fdf64ba2d3cd6173c6a283141f78 3 conflict > 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 empty > $ git cat-file blob :empty >output > $ cmp empty output && echo OK > OK > > which would be impossible to do if we nuked the "problematic" stages > whenever we read the index, I am afraid. > This works with the first patch as read_cache() is not modified, and git add would only clean the entries for the paths passed as arguments. >> BTW, I didn't know "git cat-file blob 0:$path", but I only manage to >> get "Not a valid object name" fatals. How is it supposed to be used? > > That was a typo of ":$n:$path" (where 0 <= $n <= 3). Great, thanks! -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
Jaime Soriano Pastor writes: > In the problematic cases I've seen (specially git add and git reset > --hard) the final state of both, merged and unmerged files, is that > only an entry in stage 0 exists. > Also, the current implementation of git checkout -f silently removes > higher stage entries in this case. >> >> Silently removing these at runtime may not be something we would >> want to do; after all, we do not know if the broken tool actually >> wanted to have the higher stage entries, or the merged entry. >> > > Yes, I have to agree on that, the user should have the final decission > about what stage entry to use, although I'm not sure if in the > previously commented cases there could be such an additional loss as > the operations that can be modified are already intended to silently > remove stage entries. > ... > Which option would be better? And what could be a good message? Being a conservative, I'd rather avoid doing any magic during read_cache() time. "ls-files -s" for example should show the four stages so that the "broken" state can be inspected. Instead, I suspect that the code paths with problematic iterations over the index entries that assume that having stage #0 entry for a path guarantees that there will not be any higher stage entry first need to be identified (you already said "add" and "reset" may be problematic, there may be others, or they may be the only ones, I dunno), and then the most sensible one, which would be different from case to case, among various possibilities need to be chosen as a fix to each of them: (1) the loop may be fixed to ignore/skip unmerged entries; (2) the loop may be fixed to ignore/skip the merged entry; (3) the loop may be fixed not to spin indefinitely on a path with mixed entries; or (4) the command should error out. Yes, it would be more work, but I'd feel safer if the following worked: $ git ls-files -s 100644 3cc58df83752123644fef39faab2393af643b1d2 0 conflict 100644 f70f10e4db19068f79bc43844b49f3eece45c4e8 1 conflict 100644 3cc58df83752123644fef39faab2393af643b1d2 2 conflict 100644 223b7836fb19fdf64ba2d3cd6173c6a283141f78 3 conflict $ >empty $ git add empty 100644 3cc58df83752123644fef39faab2393af643b1d2 0 conflict 100644 f70f10e4db19068f79bc43844b49f3eece45c4e8 1 conflict 100644 3cc58df83752123644fef39faab2393af643b1d2 2 conflict 100644 223b7836fb19fdf64ba2d3cd6173c6a283141f78 3 conflict 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 empty $ git cat-file blob :empty >output $ cmp empty output && echo OK OK which would be impossible to do if we nuked the "problematic" stages whenever we read the index, I am afraid. > BTW, I didn't know "git cat-file blob 0:$path", but I only manage to > get "Not a valid object name" fatals. How is it supposed to be used? That was a typo of ":$n:$path" (where 0 <= $n <= 3). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
On Tue, Aug 12, 2014 at 8:39 PM, Junio C Hamano wrote: > > Jaime Soriano Pastor writes: > > > Wrong implementations of tools that modify the index can left > > some files as merged and unmerged at the same time. Avoid undesiderable > > behaviours by handling this situation. > > It is understandable that the way _you_ decided to "handle the > situation" is so obvious to be spelled out to _you_, but that is the > most important design decision that needs to be described here. Do > you silently remove higher-stage entries when an entry at stage 0 > exists? Do you silently remove stage 0 entry when higher-stage > entries exist? Do you error out without doing neither? > Sorry, I didn't explain my decission enough, and my knowledge of git internals is not so good. The idea of my proposal is to remove higher stage entries when, after replacing an existing entry at stage 0, there are still entries in higher stages. In the problematic cases I've seen (specially git add and git reset --hard) the final state of both, merged and unmerged files, is that only an entry in stage 0 exists. Also, the current implementation of git checkout -f silently removes higher stage entries in this case. > > Silently removing these at runtime may not be something we would > want to do; after all, we do not know if the broken tool actually > wanted to have the higher stage entries, or the merged entry. > Yes, I have to agree on that, the user should have the final decission about what stage entry to use, although I'm not sure if in the previously commented cases there could be such an additional loss as the operations that can be modified are already intended to silently remove stage entries. > Ideally, I think we should error out and let the users figure out > how to proceed (we may of course need to make sure they have the > necessary tools to do so, e.g. "git cat-file blob 0:$path" to > resurrect the contents and "git update-index --cacheinfo" to stuff > the contents into the stages). > I have also tried a couple of implementations of this patch with die() and warning(). The implementation with die() would have a message like "There are other staged versions for merged file", and maybe some recomendation about how to see the blobs. The warning implementation could return -1, what would prevent git add to remove the higher-stage entries, but would still make git reset --hard to clean the index as it seems that it does it anyway if it manages to finish the call to read_index_unmerged. Another option would be to print the deleted entries as a warning but deleting them anyway. Which option would be better? And what could be a good message? BTW, I didn't know "git cat-file blob 0:$path", but I only manage to get "Not a valid object name" fatals. How is it supposed to be used? Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
On Tue, Aug 12, 2014 at 8:31 PM, Junio C Hamano wrote: > > Jaime Soriano Pastor writes: > > > A file in the index can be left as merged and unmerged at the same time > > by some tools as libgit2, this causes some undesiderable behaviours in git. > > Well, doesn't it mean that libgit2 is broken? Have you filed a bug > over there yet? > Yes, exactly, I think libgit2 is broken but I wanted to double-check that it was still happening in their master branch, and it is. I have reported the bug after checking it. https://github.com/libgit2/libgit2/issues/2515 > > Having said that, protecting ourselves from insanity left by other > people is always a good idea, provided that it can be done without > bending overly backwards. Yes, I think the most important thing in this case is to protect git against this kind of inconsistencies. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
Jaime Soriano Pastor writes: > Wrong implementations of tools that modify the index can left > some files as merged and unmerged at the same time. Avoid undesiderable > behaviours by handling this situation. It is understandable that the way _you_ decided to "handle the situation" is so obvious to be spelled out to _you_, but that is the most important design decision that needs to be described here. Do you silently remove higher-stage entries when an entry at stage 0 exists? Do you silently remove stage 0 entry when higher-stage entries exist? Do you error out without doing neither? Silently removing these at runtime may not be something we would want to do; after all, we do not know if the broken tool actually wanted to have the higher stage entries, or the merged entry. Ideally, I think we should error out and let the users figure out how to proceed (we may of course need to make sure they have the necessary tools to do so, e.g. "git cat-file blob 0:$path" to resurrect the contents and "git update-index --cacheinfo" to stuff the contents into the stages). Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] read-cache.c: Ensure unmerged entries are removed
Jaime Soriano Pastor writes: > A file in the index can be left as merged and unmerged at the same time > by some tools as libgit2, this causes some undesiderable behaviours in git. Well, doesn't it mean that libgit2 is broken? Have you filed a bug over there yet? Having said that, protecting ourselves from insanity left by other people is always a good idea, provided that it can be done without bending overly backwards. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] read-cache.c: Ensure unmerged entries are removed
A file in the index can be left as merged and unmerged at the same time by some tools as libgit2, this causes some undesiderable behaviours in git. I have seen, at least, these behaviours: - git reset --hard consuming 100% CPU and never ending - git reset --hard consuming all memory in git < 2.0 - git add/git mergetool not resolving a conflict, even if they finish succesfully The state is something like this: $ git ls-files -s 100644 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 0 conflict 100644 257cc5642cb1a054f08cc83f2d943e56fd3ebe99 1 conflict 100644 5716ca5987cbf97d6bb54920bea6adde242d87e6 2 conflict 100644 f2e41136eac73c39554dede1fd7e67b12502d577 3 conflict This can be caused e.g. by libgit2 doing this: 1. Merge with conflicts, without solving them 2. Force checkout I see that this is not caused by git (I haven't been able to reproduce it only using git) but I think that git should be able to detect this situation and even handle it, specially to avoid the never-ending git resets. The proposed patch serves as protection and autoremediation for this kind of cases. Another option would be to detect the issue and tell the user to clean the index with git read-tree --empty, but I think this would be more intrussive than the patch. Regards, Jaime Soriano. Jaime Soriano Pastor (1): read-cache.c: Ensure unmerged entries are removed read-cache.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) -- 2.0.4.1.g8a38f21.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] read-cache.c: Ensure unmerged entries are removed
Wrong implementations of tools that modify the index can left some files as merged and unmerged at the same time. Avoid undesiderable behaviours by handling this situation. Signed-off-by: Jaime Soriano Pastor --- read-cache.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/read-cache.c b/read-cache.c index 7f5645e..23e46e1 100644 --- a/read-cache.c +++ b/read-cache.c @@ -935,6 +935,7 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE; int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK; int new_only = option & ADD_CACHE_NEW_ONLY; + int replaced = 0; cache_tree_invalidate_path(istate->cache_tree, ce->name); pos = index_name_stage_pos(istate, ce->name, ce_namelen(ce), ce_stage(ce)); @@ -943,9 +944,10 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e if (pos >= 0) { if (!new_only) replace_index_entry(istate, pos, ce); - return 0; - } - pos = -pos-1; + pos++; + replaced = 1; + } else + pos = -pos-1; /* * Inserting a merged entry ("stage 0") into the index @@ -959,6 +961,8 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e } } + if (replaced) + return 0; if (!ok_to_add) return -1; if (!verify_path(ce->name)) -- 2.0.4.1.g8a38f21.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html