Re: [PATCH 1/2] Check order when reading index
On Wed, Aug 27, 2014 at 11:11 PM, Junio C Hamano wrote: > I was asking for an answer more from what you know about the code. > For example, would read_index_unmerged() choke if the index has two > or more stage #1 (or stage #3) entries for the same path (provided > that the index is otherwise normal, i.e. no stage #0 entry for that > path, entries are sorted by pathname order and stages are in an > order that does not decrease)? Oh, ok :) Re-reading the code a bit I think that there can be a potential problem in the add_index_entry_with_check() function. It's currently implemented to allow an only entry for each stage and each path, if an entry for a path and a stage is being added, and another one existed before, the old one is replaced, but just the first one, so adding an entry to stage #1 in an index with multiple entries at stage #1 would replace the first occurence, but not the rest, what could not be expected. The user could maybe expect that all entries are replaced, or only an specific one. If an stage #0 entry is added and there are multiple entries for any of the higher-stage entries there wouldn't be any problem as this function removes all the higher-stage entries for the same path without checking the stage. This last case is the one in read_index_unmerged(). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
Jaime Soriano Pastor writes: > On Tue, Aug 26, 2014 at 7:43 PM, Junio C Hamano wrote: >> Does the current codebase choke with such entries in the index file, >> like you saw in your index file with both stage #0 and stage #1 >> entries? > > Not sure, I couldn't reproduce an scenario with an index with multiple > entries in the same stage for the same path. I think we have been discussing how to protect broken index file left by tools other people wrote, so I wouldn't be so surprised if our current toolset does not let you recreate certain breakages ;-) I was asking for an answer more from what you know about the code. For example, would read_index_unmerged() choke if the index has two or more stage #1 (or stage #3) entries for the same path (provided that the index is otherwise normal, i.e. no stage #0 entry for that path, entries are sorted by pathname order and stages are in an order that does not decrease)? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
On Tue, Aug 26, 2014 at 7:43 PM, Junio C Hamano wrote: > Does the current codebase choke with such entries in the index file, > like you saw in your index file with both stage #0 and stage #1 > entries? Not sure, I couldn't reproduce an scenario with an index with multiple entries in the same stage for the same path. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
Jaime Soriano Pastor writes: > On Tue, Aug 26, 2014 at 6:53 PM, Junio C Hamano wrote: >> >> Yes---that is what I meant by the "virtual stuff". Unlike resolve >> work by Daniel (around Sep 2005 $gmane/8088) that tried to use >> multiple ancestors directly in a single merge, recursive limits >> itself to repeated use of pairwise merges. > > Ok, I see now. Then what about checking also in check_ce_order() that > only stage #1 can be repeated? We could use multiple stage #3 entries to natively represent an octopus merge in progress if we wanted to. I do not think we want to close the door for doing so, unless there is some compelling reason. Does the current codebase choke with such entries in the index file, like you saw in your index file with both stage #0 and stage #1 entries? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
On Tue, Aug 26, 2014 at 6:53 PM, Junio C Hamano wrote: > > Yes---that is what I meant by the "virtual stuff". Unlike resolve > work by Daniel (around Sep 2005 $gmane/8088) that tried to use > multiple ancestors directly in a single merge, recursive limits > itself to repeated use of pairwise merges. Ok, I see now. Then what about checking also in check_ce_order() that only stage #1 can be repeated? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
Jeff King writes: >> With "natively" do you mean some internal state that is never written >> into the index? If this were the case then there wouldn't be any >> problem with the restriction when reading the index file. > > FWIW, that was my question on reading Junio's response, too. The current code may not put two stage #1 entries for the same path but allowing such entries was a part of the design from very early days; iow it is a valid index file, unlike the one that has both stage #0 and stage #1 for the same path. It is a no-no to reject such an index as long as we are discussing to add a new code to tighten the sanity check on the file content. >> I have also tried to reproduce it by directly calling >> git-merge-recursive and the most I have got is what it seemed to be >> like a conflict in the stage #1: >> >> $ git ls-files -s >> 100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict >> 100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict >> 100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict >> >> $ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b >> <<< Temporary merge branch 1 >> G >> === >> E >> F >> >>> Temporary merge branch 2 > > Yes, I think merge-recursive resolves the earlier merges and then feeds > the result into the main merge. And that's how you end up with the > "temporary merge branch" name instead of something useful. Yes---that is what I meant by the "virtual stuff". Unlike resolve work by Daniel (around Sep 2005 $gmane/8088) that tried to use multiple ancestors directly in a single merge, recursive limits itself to repeated use of pairwise merges. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
On Tue, Aug 26, 2014 at 02:08:35PM +0200, Jaime Soriano Pastor wrote: > > That is how we natively (read: not with the funky "virtual" stuff > > merge-recursive does) express a merge with multiple merge bases. > > You also should be able to read this in the way how "git merge" invokes > > merge strategies (one or more bases, double-dash and then current > > HEAD and the other branches). I think there are some tests in 3way > > merge tests that checks what should happen when our HEAD matches > > one of the stage #1 while their branch matches a different one of the > > stage #1, too. > > I'm a bit lost with this, conceptually it doesn't seem to be any > problem with having multiple merge bases, but I don't manage to > reproduce it. > With "natively" do you mean some internal state that is never written > into the index? If this were the case then there wouldn't be any > problem with the restriction when reading the index file. FWIW, that was my question on reading Junio's response, too. > I have also tried to reproduce it by directly calling > git-merge-recursive and the most I have got is what it seemed to be > like a conflict in the stage #1: > > $ git ls-files -s > 100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict > 100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict > 100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict > > $ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b > <<< Temporary merge branch 1 > G > === > E > F > >>> Temporary merge branch 2 Yes, I think merge-recursive resolves the earlier merges and then feeds the result into the main merge. And that's how you end up with the "temporary merge branch" name instead of something useful. It might work to have a recursive merge that causes a conflict on path X, and then we further need to resolve that conflict. I'm not sure if we try to represent that in the index somehow, or if merge-recursive just bails in this case (I didn't try it). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
On Mon, Aug 25, 2014 at 10:52 PM, Junio C Hamano wrote: > On Mon, Aug 25, 2014 at 12:44 PM, Jeff King wrote: >> For my own curiosity, how do you get into this situation, and what does >> it mean to have multiple stage#1 entries for the same path? What would >> "git cat-file :1:path" output? > > That is how we natively (read: not with the funky "virtual" stuff > merge-recursive does) express a merge with multiple merge bases. > You also should be able to read this in the way how "git merge" invokes > merge strategies (one or more bases, double-dash and then current > HEAD and the other branches). I think there are some tests in 3way > merge tests that checks what should happen when our HEAD matches > one of the stage #1 while their branch matches a different one of the > stage #1, too. I'm a bit lost with this, conceptually it doesn't seem to be any problem with having multiple merge bases, but I don't manage to reproduce it. With "natively" do you mean some internal state that is never written into the index? If this were the case then there wouldn't be any problem with the restriction when reading the index file. I have also tried to reproduce it by directly calling git-merge-recursive and the most I have got is what it seemed to be like a conflict in the stage #1: $ git ls-files -s 100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict 100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict 100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict $ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b <<< Temporary merge branch 1 G === E F >>> Temporary merge branch 2 And after thinking a bit more about it, I don't see a way to have two stage #1 entries for the same path with git commands only. It seems that all entries are added through the add_index_entry_with_check() function (except maybe the added to the cached tree extension), and this function replaces existing entries if they have the same name and stage. Also, all tests pass with the patch, without allowing two entries for the same stage. So I'm afraid that I don't fully understand this case :/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
On Mon, Aug 25, 2014 at 12:44 PM, Jeff King wrote: > For my own curiosity, how do you get into this situation, and what does > it mean to have multiple stage#1 entries for the same path? What would > "git cat-file :1:path" output? That is how we natively (read: not with the funky "virtual" stuff merge-recursive does) express a merge with multiple merge bases. You also should be able to read this in the way how "git merge" invokes merge strategies (one or more bases, double-dash and then current HEAD and the other branches). I think there are some tests in 3way merge tests that checks what should happen when our HEAD matches one of the stage #1 while their branch matches a different one of the stage #1, too. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
On Mon, Aug 25, 2014 at 7:21 PM, Junio C Hamano wrote: > Jaime Soriano Pastor writes: > >> Subject: Re: [PATCH 1/2] Check order when reading index > > Please be careful when crafting the commit title. This single line > will be the only one that readers will have to identify the change > among hundreds of entries in "git shortlog" output when trying to > see what kind of change went into the project during the given > period. Something like: > > read_index_from(): catch out of order entries while reading an index file > > perhaps? > Ok, reprashing it. >> +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce) > > Does this have to be global, i.e. not "static void ..."? > Not really, changing it to static. >> + if (ce_stage(ce) >= ce_stage(next_ce)) >> + die("Unordered stage entries for '%s'", >> + ce->name); > > Not quite. We do allow multiple higher stage entries; having two or > more stage #1 entries is perfectly fine during a merge resolution, > and both ce and next_ce may be pointing at the stage #1 entries of > the same path. Replacing the comparison with ">" is sufficient, I > think. > Ok, but like Jeff, I'm also curious about how to have multiple stage #1 entries for the same path. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
On Mon, Aug 25, 2014 at 10:21:58AM -0700, Junio C Hamano wrote: > > + if (ce_stage(ce) >= ce_stage(next_ce)) > > + die("Unordered stage entries for '%s'", > > + ce->name); > > Not quite. We do allow multiple higher stage entries; having two or > more stage #1 entries is perfectly fine during a merge resolution, > and both ce and next_ce may be pointing at the stage #1 entries of > the same path. Replacing the comparison with ">" is sufficient, I > think. For my own curiosity, how do you get into this situation, and what does it mean to have multiple stage#1 entries for the same path? What would "git cat-file :1:path" output? -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] Check order when reading index
Jaime Soriano Pastor writes: > Subject: Re: [PATCH 1/2] Check order when reading index Please be careful when crafting the commit title. This single line will be the only one that readers will have to identify the change among hundreds of entries in "git shortlog" output when trying to see what kind of change went into the project during the given period. Something like: read_index_from(): catch out of order entries while reading an index file perhaps? > Signed-off-by: Jaime Soriano Pastor > --- > read-cache.c | 18 ++ > 1 file changed, 18 insertions(+) > > diff --git a/read-cache.c b/read-cache.c > index 7f5645e..c1a9619 100644 > --- a/read-cache.c > +++ b/read-cache.c > @@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct > ondisk_cache_entry *ondisk, > return ce; > } > > +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce) Does this have to be global, i.e. not "static void ..."? > +{ > + int name_compare = strcmp(ce->name, next_ce->name); > + if (0 < name_compare) > + die("Unordered stage entries in index"); > + if (!name_compare) { > + if (!ce_stage(ce)) > + die("Multiple stage entries for merged file '%s'", > + ce->name); OK. If ce is at stage #0, no other entry can have the same name regardless of the stage, and next_ce having the same name violates that rule. > + if (ce_stage(ce) >= ce_stage(next_ce)) > + die("Unordered stage entries for '%s'", > + ce->name); Not quite. We do allow multiple higher stage entries; having two or more stage #1 entries is perfectly fine during a merge resolution, and both ce and next_ce may be pointing at the stage #1 entries of the same path. Replacing the comparison with ">" is sufficient, I think. Thanks. > + } > +} > + > /* remember to discard_cache() before reading a different cache! */ > int read_index_from(struct index_state *istate, const char *path) > { > @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const > char *path) > ce = create_from_disk(disk_ce, &consumed, previous_name); > set_index_entry(istate, i, ce); > > + if (i > 0) > + check_ce_order(istate->cache[i - 1], ce); > + > src_offset += consumed; > } > strbuf_release(&previous_name_buf); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] Check order when reading index
Signed-off-by: Jaime Soriano Pastor --- read-cache.c | 18 ++ 1 file changed, 18 insertions(+) diff --git a/read-cache.c b/read-cache.c index 7f5645e..c1a9619 100644 --- a/read-cache.c +++ b/read-cache.c @@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk, return ce; } +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce) +{ + int name_compare = strcmp(ce->name, next_ce->name); + if (0 < name_compare) + die("Unordered stage entries in index"); + if (!name_compare) { + if (!ce_stage(ce)) + die("Multiple stage entries for merged file '%s'", + ce->name); + if (ce_stage(ce) >= ce_stage(next_ce)) + die("Unordered stage entries for '%s'", + ce->name); + } +} + /* remember to discard_cache() before reading a different cache! */ int read_index_from(struct index_state *istate, const char *path) { @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const char *path) ce = create_from_disk(disk_ce, &consumed, previous_name); set_index_entry(istate, i, ce); + if (i > 0) + check_ce_order(istate->cache[i - 1], ce); + src_offset += consumed; } strbuf_release(&previous_name_buf); -- 2.0.4.1.g0b8a4f9.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html