On 08/22, Junio C Hamano wrote:
> Thomas Gummerer <t.gumme...@gmail.com> writes:
> 
> > Hmm, it does describe what happens in the code, which is what this
> > patch implements.  Maybe we should rephrase the title here?
> >
> > Or are you suggesting dropping this patch (and the next one)
> > completely, as we don't want to try and handle the case where this
> > kind of garbage is thrown at 'rerere'?
> 
> I consider these two patches as merely attempting to punt a bit
> better.  Once users start committing conflict-marker-looking lines
> in the contents, and getting them involved in actual conflicts, I do
> not think any approach (including what the original rerere uses
> before this patch) that assumes the markers will neatly form a set of
> blocks of text enclosed in << == >> will reliably step around such
> broken contents.  E.g. it is entirely conceivable both branches have
> the <<< beginning of conflict marker plus contents from the HEAD
> before they recorded the marker that are identical, that diverge as
> you scan the text down and get closer to ===, something like:
> 
>         side A                  side B
>         --------------------    --------------------
> 
>         shared                  shared
>         <<<<<<<                 <<<<<<<
>         version before          version before
>         these guys merged       these guys merged
>         their ancestor          their ancestor
>         versions                versions.
>         but some                now some
>         lines are different     lines are different
>         =======                 ========
>         and other               totally different
>         contents                contents
>         ...                     ...
> 
> And a merge of these may make <<< part shared (i.e. outside the
> conflicted region) while lines near and below ==== part of conflict,
> which would give us something like
> 
>         merge of side A & B
>         -------------------
> 
>         shared                  
>         <<<<<<<                 (this is part of contents)
>         version before          
>         these guys merged       
>         their ancestor          
>         <<<<<<< HEAD            (conflict marker)
>         versions
>         but some
>         lines are different
>         =======                 (this is part of contents)
>         and other
>         contents
>         ...
>         =======                 (conflict marker)
>         versions.
>         now some
>         lines are different
>         =======                 (this is part of contents)
>         totally different
>         contents
>         ...
>         >>>>>>> theirs          (conflict marker)
> 
> Depending on the shape of the original conflict that was committed,
> we may have two versions of <<<, together with the real conflict
> marker, but shared closing >>> marker.  With contents like that,
> there is no way for us to split these lines into two groups at a
> line '=====' (which one?) and swap to come up with the normalized
> shape.
> 
> The original rerere algorithm would punt when such unmatched
> markers are found, and deals with the "nested conflict" situation by
> avoiding creating such a thing altogether.  I am sure your two
> patches may make the code punt less, but I suspect that is not a
> foolproof "solution" but more of a workaround, as I do not think it
> is solvable, once you allow users to commit conflict-marker looking
> strings in contents.

Agreed.  I think it may be solvable if we'd actually get the
information about what belongs to which side from the merge algorithm
directly.  But that sounds way more involved than what I'm able to
commit to for something that I don't foresee running into myself :)
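
To make the punting concrete, here is a minimal sketch in Python (not
rerere's actual C code; the names `is_marker` and `split_conflicts` and
the fixed marker size of 7 are assumptions made for illustration) of a
scanner that only sees the text and expects strict <<< / === / >>>
blocks.  Fed the merge result from the example above, it meets the real
'<<<<<<< HEAD' marker while it already believes it is inside side one,
and has to give up:

```python
from typing import List, Optional, Tuple

MARKER_SIZE = 7  # assumed marker length for this sketch


def is_marker(line: str, ch: str) -> bool:
    """True if `line` starts with MARKER_SIZE copies of `ch`,
    followed by end of line or whitespace."""
    if not line.startswith(ch * MARKER_SIZE):
        return False
    rest = line[MARKER_SIZE:]
    return rest == "" or rest[0].isspace()


def split_conflicts(
    lines: List[str],
) -> Optional[List[Tuple[List[str], List[str]]]]:
    """Return a list of (side one, side two) hunks, or None to punt
    when a marker shows up where the strict grammar does not allow it."""
    hunks = []
    state = "context"
    ours: List[str] = []
    theirs: List[str] = []
    for line in lines:
        if is_marker(line, "<"):
            if state != "context":
                return None  # '<<<' inside an open hunk: punt
            state, ours, theirs = "ours", [], []
        elif is_marker(line, "="):
            if state != "ours":
                return None  # '===' without a preceding '<<<': punt
            state = "theirs"
        elif is_marker(line, ">"):
            if state != "theirs":
                return None  # '>>>' without a preceding '===': punt
            hunks.append((ours, theirs))
            state = "context"
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
    # An unterminated hunk at end of input is also a punt.
    return hunks if state == "context" else None


# A well-formed conflict splits cleanly.
clean = ["shared", "<<<<<<< HEAD", "a", "=======", "b", ">>>>>>> theirs"]

# The merge result from the example above: the first '<<<<<<<' is
# committed content, the second one is the real conflict marker.
merged = [
    "shared", "<<<<<<<", "version before", "these guys merged",
    "their ancestor", "<<<<<<< HEAD", "versions", "but some",
    "lines are different", "=======", "and other", "contents", "...",
    "=======", "versions.", "now some", "lines are different",
    "=======", "totally different", "contents", "...", ">>>>>>> theirs",
]

print(split_conflicts(clean))
print(split_conflicts(merged))
```

The scanner cannot tell which '<<<<<<<' or '=======' is content and
which is a marker; only the merge machinery that wrote the markers
knows that, which is why getting the information from there would be
the more robust route.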

>                       As the heuristics used in such a workaround
> are very likely to change, and something the end-users should not
> even rely on, I'd rather not document and promise the exact
> behaviour---perhaps we should stress "don't do that" even stronger
> instead.

Fair enough.  I thought of the technical documentation as something
that doesn't promise users anything, but rather describes how the
internals work right now, which is what this bit of documentation
attempted to write down.  But if we are worried about this giving end
users ideas then I definitely agree and we should get rid of this bit
of documentation.  I'll send a patch for that, and for adding a note
about "don't do that" in the man page.
