Re: [PATCH 1/2] Check order when reading index

2014-08-27 Thread Jaime Soriano Pastor
On Wed, Aug 27, 2014 at 11:11 PM, Junio C Hamano  wrote:
> I was asking for an answer more from what you know about the code.
> For example, would read_index_unmerged() choke if the index has two
> or more stage #1 (or stage #3) entries for the same path (provided
> that the index is otherwise normal, i.e. no stage #0 entry for that
> path, entries are sorted by pathname order and stages are in an
> order that does not decrease)?

Oh, ok :) Re-reading the code a bit I think that there can be a
potential problem in the add_index_entry_with_check() function. It's
currently implemented to allow an only entry for each stage and each
path, if an entry for a path and a stage is being added, and another
one existed before, the old one is replaced, but just the first one,
so adding an entry to stage #1 in an index with multiple entries at
stage #1 would replace the first occurence, but not the rest, what
could not be expected. The user could maybe expect that all entries
are replaced, or only an specific one.
If an stage #0 entry is added and there are multiple entries for any
of the higher-stage entries there wouldn't be any problem as this
function removes all the higher-stage entries for the same path
without checking the stage. This last case is the one in
read_index_unmerged().
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-27 Thread Junio C Hamano
Jaime Soriano Pastor  writes:

> On Tue, Aug 26, 2014 at 7:43 PM, Junio C Hamano  wrote:
>> Does the current codebase choke with such entries in the index file,
>> like you saw in your index file with both stage #0 and stage #1
>> entries?
>
> Not sure, I couldn't reproduce an scenario with an index with multiple
> entries in the same stage for the same path.

I think we have been discussing how to protect broken index file
left by tools other people wrote, so I wouldn't be so surprised if
our current toolset does not let you recreate certain breakages ;-)

I was asking for an answer more from what you know about the code.
For example, would read_index_unmerged() choke if the index has two
or more stage #1 (or stage #3) entries for the same path (provided
that the index is otherwise normal, i.e. no stage #0 entry for that
path, entries are sorted by pathname order and stages are in an
order that does not decrease)?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-27 Thread Jaime Soriano Pastor
On Tue, Aug 26, 2014 at 7:43 PM, Junio C Hamano  wrote:
> Does the current codebase choke with such entries in the index file,
> like you saw in your index file with both stage #0 and stage #1
> entries?

Not sure, I couldn't reproduce an scenario with an index with multiple
entries in the same stage for the same path.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-26 Thread Junio C Hamano
Jaime Soriano Pastor  writes:

> On Tue, Aug 26, 2014 at 6:53 PM, Junio C Hamano  wrote:
>>
>> Yes---that is what I meant by the "virtual stuff".  Unlike resolve
>> work by Daniel (around Sep 2005 $gmane/8088) that tried to use
>> multiple ancestors directly in a single merge, recursive limits
>> itself to repeated use of pairwise merges.
>
> Ok, I see now. Then what about checking also in check_ce_order() that
> only stage #1 can be repeated?

We could use multiple stage #3 entries to natively represent an
octopus merge in progress if we wanted to.  I do not think we want
to close the door for doing so, unless there is some compelling
reason.

Does the current codebase choke with such entries in the index file,
like you saw in your index file with both stage #0 and stage #1
entries?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-26 Thread Jaime Soriano Pastor
On Tue, Aug 26, 2014 at 6:53 PM, Junio C Hamano  wrote:
>
> Yes---that is what I meant by the "virtual stuff".  Unlike resolve
> work by Daniel (around Sep 2005 $gmane/8088) that tried to use
> multiple ancestors directly in a single merge, recursive limits
> itself to repeated use of pairwise merges.

Ok, I see now. Then what about checking also in check_ce_order() that
only stage #1 can be repeated?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-26 Thread Junio C Hamano
Jeff King  writes:

>> With "natively" do you mean some internal state that is never written
>> into the index? If this were the case then there wouldn't be any
>> problem with the restriction when reading the index file.
>
> FWIW, that was my question on reading Junio's response, too.

The current code may not put two stage #1 entries for the same path
but allowing such entries was a part of the design from very early
days; iow it is a valid index file, unlike the one that has both
stage #0 and stage #1 for the same path.  It is a no-no to reject
such an index as long as we are discussing to add a new code to
tighten the sanity check on the file content.

>> I have also tried to reproduce it by directly calling
>> git-merge-recursive and the most I have got is what it seemed to be
>> like a conflict in the stage #1:
>> 
>> $ git ls-files -s
>> 100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict
>> 100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict
>> 100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict
>> 
>> $ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b
>> <<< Temporary merge branch 1
>> G
>> ===
>> E
>> F
>> >>> Temporary merge branch 2
>
> Yes, I think merge-recursive resolves the earlier merges and then feeds
> the result into the main merge. And that's how you end up with the
> "temporary merge branch" name instead of something useful.

Yes---that is what I meant by the "virtual stuff".  Unlike resolve
work by Daniel (around Sep 2005 $gmane/8088) that tried to use
multiple ancestors directly in a single merge, recursive limits
itself to repeated use of pairwise merges.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-26 Thread Jeff King
On Tue, Aug 26, 2014 at 02:08:35PM +0200, Jaime Soriano Pastor wrote:

> > That is how we natively (read: not with the funky "virtual" stuff
> > merge-recursive does) express a merge with multiple merge bases.
> > You also should be able to read this in the way how "git merge" invokes
> > merge strategies (one or more bases, double-dash and then current
> > HEAD and the other branches). I think there are some tests in 3way
> > merge tests that checks what should happen when our HEAD matches
> > one of the stage #1 while their branch matches a different one of the
> > stage #1, too.
> 
> I'm a bit lost with this, conceptually it doesn't seem to be any
> problem with having multiple merge bases, but I don't manage to
> reproduce it.
> With "natively" do you mean some internal state that is never written
> into the index? If this were the case then there wouldn't be any
> problem with the restriction when reading the index file.

FWIW, that was my question on reading Junio's response, too.

> I have also tried to reproduce it by directly calling
> git-merge-recursive and the most I have got is what it seemed to be
> like a conflict in the stage #1:
> 
> $ git ls-files -s
> 100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict
> 100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict
> 100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict
> 
> $ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b
> <<< Temporary merge branch 1
> G
> ===
> E
> F
> >>> Temporary merge branch 2

Yes, I think merge-recursive resolves the earlier merges and then feeds
the result into the main merge. And that's how you end up with the
"temporary merge branch" name instead of something useful.

It might work to have a recursive merge that causes a conflict on path
X, and then we further need to resolve that conflict. I'm not sure if we
try to represent that in the index somehow, or if merge-recursive just
bails in this case (I didn't try it).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-26 Thread Jaime Soriano Pastor
On Mon, Aug 25, 2014 at 10:52 PM, Junio C Hamano  wrote:
> On Mon, Aug 25, 2014 at 12:44 PM, Jeff King  wrote:
>> For my own curiosity, how do you get into this situation, and what does
>> it mean to have multiple stage#1 entries for the same path? What would
>> "git cat-file :1:path" output?
>
> That is how we natively (read: not with the funky "virtual" stuff
> merge-recursive does) express a merge with multiple merge bases.
> You also should be able to read this in the way how "git merge" invokes
> merge strategies (one or more bases, double-dash and then current
> HEAD and the other branches). I think there are some tests in 3way
> merge tests that checks what should happen when our HEAD matches
> one of the stage #1 while their branch matches a different one of the
> stage #1, too.

I'm a bit lost with this, conceptually it doesn't seem to be any
problem with having multiple merge bases, but I don't manage to
reproduce it.
With "natively" do you mean some internal state that is never written
into the index? If this were the case then there wouldn't be any
problem with the restriction when reading the index file.

I have also tried to reproduce it by directly calling
git-merge-recursive and the most I have got is what it seemed to be
like a conflict in the stage #1:

$ git ls-files -s
100644 1036ba5101378f06aa10c5fa249b67e14f83166b 1 conflict
100644 2638c45f8e7bc5b56f70e92390153572811782c3 2 conflict
100644 178481050188cf00d7d9cd5a11e43ab8fab9294f 3 conflict

$ git cat-file blob 1036ba5101378f06aa10c5fa249b67e14f83166b
<<< Temporary merge branch 1
G
===
E
F
>>> Temporary merge branch 2

And after thinking a bit more about it, I don't see a way to have two
stage #1 entries for the same path with git commands only. It seems
that all entries are added through the add_index_entry_with_check()
function (except maybe the added to the cached tree extension), and
this function replaces existing entries if they have the same name and
stage.
Also, all tests pass with the patch, without allowing two entries for
the same stage.

So I'm afraid that I don't fully understand this case :/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-25 Thread Junio C Hamano
On Mon, Aug 25, 2014 at 12:44 PM, Jeff King  wrote:
> For my own curiosity, how do you get into this situation, and what does
> it mean to have multiple stage#1 entries for the same path? What would
> "git cat-file :1:path" output?

That is how we natively (read: not with the funky "virtual" stuff
merge-recursive does) express a merge with multiple merge bases.
You also should be able to read this in the way how "git merge" invokes
merge strategies (one or more bases, double-dash and then current
HEAD and the other branches). I think there are some tests in 3way
merge tests that checks what should happen when our HEAD matches
one of the stage #1 while their branch matches a different one of the
stage #1, too.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-25 Thread Jaime Soriano Pastor
On Mon, Aug 25, 2014 at 7:21 PM, Junio C Hamano  wrote:
> Jaime Soriano Pastor  writes:
>
>> Subject: Re: [PATCH 1/2] Check order when reading index
>
> Please be careful when crafting the commit title.  This single line
> will be the only one that readers will have to identify the change
> among hundreds of entries in "git shortlog" output when trying to
> see what kind of change went into the project during the given
> period.  Something like:
>
> read_index_from(): catch out of order entries while reading an index file
>
> perhaps?
>
Ok, reprashing it.

>> +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
>
> Does this have to be global, i.e. not "static void ..."?
>
Not really, changing it to static.

>> + if (ce_stage(ce) >= ce_stage(next_ce))
>> + die("Unordered stage entries for '%s'",
>> + ce->name);
>
> Not quite.  We do allow multiple higher stage entries; having two or
> more stage #1 entries is perfectly fine during a merge resolution,
> and both ce and next_ce may be pointing at the stage #1 entries of
> the same path.  Replacing the comparison with ">" is sufficient, I
> think.
>
Ok, but like Jeff, I'm also curious about how to have multiple stage
#1 entries for the same path.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-25 Thread Jeff King
On Mon, Aug 25, 2014 at 10:21:58AM -0700, Junio C Hamano wrote:

> > +   if (ce_stage(ce) >= ce_stage(next_ce))
> > +   die("Unordered stage entries for '%s'",
> > +   ce->name);
> 
> Not quite.  We do allow multiple higher stage entries; having two or
> more stage #1 entries is perfectly fine during a merge resolution,
> and both ce and next_ce may be pointing at the stage #1 entries of
> the same path.  Replacing the comparison with ">" is sufficient, I
> think.

For my own curiosity, how do you get into this situation, and what does
it mean to have multiple stage#1 entries for the same path? What would
"git cat-file :1:path" output?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Check order when reading index

2014-08-25 Thread Junio C Hamano
Jaime Soriano Pastor  writes:

> Subject: Re: [PATCH 1/2] Check order when reading index

Please be careful when crafting the commit title.  This single line
will be the only one that readers will have to identify the change
among hundreds of entries in "git shortlog" output when trying to
see what kind of change went into the project during the given
period.  Something like:

read_index_from(): catch out of order entries while reading an index file

perhaps?

> Signed-off-by: Jaime Soriano Pastor 
> ---
>  read-cache.c | 18 ++
>  1 file changed, 18 insertions(+)
>
> diff --git a/read-cache.c b/read-cache.c
> index 7f5645e..c1a9619 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct 
> ondisk_cache_entry *ondisk,
>   return ce;
>  }
>  
> +void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)

Does this have to be global, i.e. not "static void ..."?

> +{
> + int name_compare = strcmp(ce->name, next_ce->name);
> + if (0 < name_compare)
> + die("Unordered stage entries in index");
> + if (!name_compare) {
> + if (!ce_stage(ce))
> + die("Multiple stage entries for merged file '%s'",
> + ce->name);

OK.  If ce is at stage #0, no other entry can have the same name
regardless of the stage, and next_ce having the same name violates
that rule.

> + if (ce_stage(ce) >= ce_stage(next_ce))
> + die("Unordered stage entries for '%s'",
> + ce->name);

Not quite.  We do allow multiple higher stage entries; having two or
more stage #1 entries is perfectly fine during a merge resolution,
and both ce and next_ce may be pointing at the stage #1 entries of
the same path.  Replacing the comparison with ">" is sufficient, I
think.

Thanks.

> + }
> +}
> +
>  /* remember to discard_cache() before reading a different cache! */
>  int read_index_from(struct index_state *istate, const char *path)
>  {
> @@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const 
> char *path)
>   ce = create_from_disk(disk_ce, &consumed, previous_name);
>   set_index_entry(istate, i, ce);
>  
> + if (i > 0)
> + check_ce_order(istate->cache[i - 1], ce);
> +
>   src_offset += consumed;
>   }
>   strbuf_release(&previous_name_buf);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] Check order when reading index

2014-08-24 Thread Jaime Soriano Pastor
Signed-off-by: Jaime Soriano Pastor 
---
 read-cache.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 7f5645e..c1a9619 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1438,6 +1438,21 @@ static struct cache_entry *create_from_disk(struct 
ondisk_cache_entry *ondisk,
return ce;
 }
 
+void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
+{
+   int name_compare = strcmp(ce->name, next_ce->name);
+   if (0 < name_compare)
+   die("Unordered stage entries in index");
+   if (!name_compare) {
+   if (!ce_stage(ce))
+   die("Multiple stage entries for merged file '%s'",
+   ce->name);
+   if (ce_stage(ce) >= ce_stage(next_ce))
+   die("Unordered stage entries for '%s'",
+   ce->name);
+   }
+}
+
 /* remember to discard_cache() before reading a different cache! */
 int read_index_from(struct index_state *istate, const char *path)
 {
@@ -1499,6 +1514,9 @@ int read_index_from(struct index_state *istate, const 
char *path)
ce = create_from_disk(disk_ce, &consumed, previous_name);
set_index_entry(istate, i, ce);
 
+   if (i > 0)
+   check_ce_order(istate->cache[i - 1], ce);
+
src_offset += consumed;
}
strbuf_release(&previous_name_buf);
-- 
2.0.4.1.g0b8a4f9.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html