Re: [Monotone-devel] Re: Deterministic *-merge

2007-01-13 Thread Oren Ben-Kiki
On Fri, 2007-01-12 at 18:14 -0800, Nathaniel J. Smith wrote:
 Anyway, the answer to your question is that I'm not proposing anything
 at all change in monotone -- that's why I said at the beginning of the
 writeup that my note had no practical consequences :-).

I think that merge behavior is one of the most crucial pieces of how a
VCS behaves. Darcs is an example of this approach taken to extreme, and
demonstrates how a solid theoretical basis for the merge can be viewed
as the core design choice of a VCS. Having something equivalent for
more traditional version control systems would be a real breakthrough
IMVHO.

As for practical implications - cherry picking can be defined in terms
of a merge operation, as long as you *really* trust your merge :-)

 We've
 generally taken the line that the history graph should attempt to
 model as closely as possible the user's intrinsic understanding of
 versioning a bunch of files, because this is the both the most
 user-friendly and the most future-proof approach.  (Sucks to build a
 particular merge algorithm's assumptions into your hash-chained
 history graph, and thus be stuck with that merge algorithm for a few
 decades...)

That's true. On the other hand, there's immense power to be gained from
picking a powerful merge abstraction and running with it (again Darcs is
a good example).

 That said, I dunno, maybe someone will figure out how to use this kind
 of stuff to improve VCS UI -- it's just some ideas, who knows what
 will happen to them once they start wandering around in people's
 heads...

For starters, annotating the '#' with the list of candidate values
immediately suggests an interface for conflict resolution. Also
presumably the cases where conflict actually occurs would (hopefully)
be more in line with user expectations. And then there are the advanced
features you could build on top of this (like cherry-picking).

These all require that the chosen merge abstraction be really solid,
which translates to trying to poke theoretical holes in it, and then
trying it in practice for a while, until sufficient confidence is
reached.

Most VCS don't trust their merge method (for good reason). On the down
side, this restrict their power; on the up side, they allow plugging in
different merge algorithms, so it would be possible to implement and
test deterministic *-merge in them to see how well it works in practice.
Monotone is a particularly good platform for this because it is based on
solid abstractions in other areas (what is a version, what is a branch,
etc.)



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Deterministic *-merge

2007-01-12 Thread Oren Ben-Kiki
On Fri, 2007-01-12 at 03:00 -0800, Nathaniel J. Smith wrote:
 ...

 Deterministic merging
 =

Beautiful! There's just one point I didn't follow, though.

 But, magically, with deterministic *-merge, all orders work the same
 -- it even turns out to be possible to merge two conflicts and get out
 a non-conflict (!):
 
   a a a
  / \   / \   / \
 b*  b*b*  b*b*  b*
/ \ / \   / \ / \   / \ / \
   c*  b   c*c*  b   c*c*  b   c*
\ /   /   \   \ /   \ / \ /
 #   / \   # #   #
  \ /   \ /   \ /
   c c c

You lost me here. In the '#' node, you lost the specific values 'b' and
'c' - all you have is 'some unknown value'. How does merging two
'unknown values' produce a specific value?

I like the idea of not using the generic '#' and instead listing the set
of candidate values. This can be done in text files by some
not-so-pretty syntax tricks. Speaking of which, how did you envision
representing '#' in actual text files?

If the result of a conflict was replacing a scalar with a set of
conflicting values, we'd get a user action (creating a separate node)
that would pick one or replace the set by a new one. It seems like this
would be very useful when users need to resolve conflicts in practice...

At first I thought that was how you obtained 'c' above, but it turns out
not to be the case:

  a a a
 / \   / \   / \
b*  b*b*  b*b*  b*
   / \ / \   / \ / \   / \ / \
  c*  b   c*c*  b   c*c*  b   c*
   \ /   /   \   \ /   \ / \ /
  {b,c} / \ {b,c} {b,c}{b,c}
 \ /   \ /   \ /
  c c   {b,c}

So I'm still stumped on how you obtained 'c' in all three cases. I must have 
missed some important point here...



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


[Monotone-devel] Re: Deterministic *-merge

2007-01-12 Thread Oren Ben-Kiki
On Fri, 2007-01-12 at 10:35 -0600, Timothy Brownawell wrote:
 Because the value of a merge node is chosen from *(node).
 
 The multi-*-merge writup at
 http://article.gmane.org/gmane.comp.version-control.revctrl/93
 says that *(A) is the minimal set of marked ancestors of A.
 
 Adding labels to the individual nodes produces
   a1
  / \
b1*  b2*
   /  \ /  \
 c1*   b3   c2*
   \  / \  /
#1   #2
 \  /
  c3

The article states:
  Algorithm: Given two nodes to merge, A and B, we consider four cases:
   a) value(A) = value(B): return the shared value
   b) *(A)  B: return value(B)
   c) *(B)  A: return value(A)
   d) else: conflict; escalate to user
  Where *(A)  B means all elements of the set *(A) are non-strict
  ancestors of the revision B.  The right way to read this is as try
  (a) first, and then if that fails try (b), (c), (d) simultaneously.

The post said:
  This is already obviously true for non-merge nodes.  We want it to be
  true for merge nodes too.  The easy way is to simply use it as the
  definition of merge(A, B).  If every node in *(A) union *(B) has the
  same value, then cool -- we can just make merge(A, B) that value.  If
  there are different values, then we have only one option -- we must
  define merge(A, B) to be #, because # is the only thing that is
  similar to multiple different values.

I'm still missing it. Just to see I get the algorithms right (sorry for
being dense):

  Old way:
value(b1) = value(b2)
By (a), merge into b3

  New way:
*(b1) union *(b2) = (b1, b2)
exist v (= 'b') such that forall n in above, value(n) = v
Merge into b3

  Old way:
value(c1) != value(b3)
*(c1) = c1
*(b3) = (b1, b2)
Not *(b3)  c
Not *(c1)  b3
Conflict.

  New way:
*(c1) = c1
*(b3) = (b1, b2)
*(c1) union *(b3) = (c1, b1, b2)
not exist v such that forall n in above value(n) = v
Merge into '#'

So far, so good. But:
  *(#1) = (c1, b2)
  *(#2) = (c2, b1)
  *(#1) union *(#2) = (c1, b2, c2, b1)
  not exist v such that forall n in above value(n) = v
  Merge into '#' - how does this yield 'c'?

Trying a less formal way of thinking about it:

Looking at the graph above, if you (informally) think in terms of
values, then in both '#1' and '#2' we can see in the history the user
preferred the value 'c' over the value 'b'. If we (somehow) worked in
term of values rather than nodes, both '#1' and '#2' would become 'c'.
Some people would argue that an ideal merge should give this result :-)

However we are thinking in term of ancestor nodes instead. In this case,
we have no indication that c1 overrides b2 (or c2 overrided b1), so we
are forced to merge the node into '#'. That's probably acceptable, given
the advantages of the approach.

Now we merge the two nodes, we know that c1 overrides b1 and c2
overrides b2. So it makes sense that the result is 'c' (even though this
would be surprising to the uninitiated newbie). However, I don't see how
the algorithm produces this result.

It is tempting to try to improve the algorithm to be more
value-oriented. If we culled the minimal marked ancestor set such that
there were no duplicate values... Or possibly culled the union of the
two merged nodes... If we culled *(b3) to be just (b1), or culled *(c1)
union *(b3) to be (c1, b1), then we could merge c1 and b3 to 'c'.
Previous attempts in this direction weren't very successful, but perhaps
the addition of '#' into the mix makes it possible again?



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


[Monotone-devel] Re: Deterministic *-merge

2007-01-12 Thread Oren Ben-Kiki
On Fri, 2007-01-12 at 12:57 -0600, Timothy Brownawell wrote:
 What you're missing is the minimal part of the definition of *(node).
 You don't just union the mark sets of the parent nodes, you take that
 union and then run erase_ancestors() on it.

smack self on head of course. Its also sort of what I had in mind when
I thought of culling the set. Makes perfect sense. I knew I was missing
something obvious. Sorry.

One last question :-) Is the idea that the '#' nodes be completely
virtual? That is, if I have two actual versions in monotone, one with a
value of 'a' and one with a value of 'b', and I try to merge them...
what would happen in practice?

The simplest way would be to force the user to resolve every '#' to a
specific value (just like he does today when there is a conflict), so in
the monotone repository we would see:

  a*  b*
   \ /
c*

Only during the merge algorithm the graph would be considered to be:

  a*  b*
   \ /
#
|
c*

So all the '#' nodes would only exist virtually, never in a real
version. If that's the case, then the example of merging two '#' nodes
would be impossible to construct in practice.

The more complex alternative would be to allow '# values in real
versions. This would allow merging two '#' nodes to cleanly remove the
conflict, but it would also require some way of actually representing
'# values inside text files.

I assume that what you are proposing is the former (simple) method?



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


[Monotone-devel] Re: Using monotone in a team

2006-12-02 Thread Oren Ben-Kiki
On Sat, 2006-12-02 at 01:49 +, Boris wrote:
 Hugo Cornelis wrote:
  Well, what I really want to do is have a mechanism that automatically
  distributes the necessary hooks from one central point.  For sure the
  members of the same team need the same trust policies, local
  alterations not allowed.  So if you want to join the team, you first
  have to synchronize the trust policies.  This does not preclude simple
  cloning for private work.  Perhaps the right question is, does this
  layer belong to monotone or not ?
 
 I guess it does not belong to monotone. As you even want to forbid 
 developers to change trust policies I think a distributed revision control 
 system is not what you want. What monotone does is copying data between 
 databases. Once it is there revisions can be combined flexibly. And how they 
 are combined depends on the local configuration (the hook functions). At 
 least that's my understanding so far (as usual without any warranties ;).

Sounds strange to me. I'd expect the trust policy and (all? some?) of
the hooks to be part of the data that monotone allows copying between
the databases.

This raises some issues with merging branches - you'd want to merge just
the code changes, and not necessarily the policy changes. Also some
hooks depend on the developer environment (e.g. the messy line-ending
issue), while others may need be shared by all developers (e.g., simple
pre-commit verifications). So it may be necessary to have that
distinction be explicit in the system somehow.

Tricky as this may be to get right, this is something you'd expect every
distributed version control system to address somehow...

Oren Ben-Kiki



___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Re: [Revctrl] improvements to *-merge

2005-09-03 Thread Oren Ben-Kiki
On Friday 02 September 2005 21:48, Bram Cohen wrote:
 First of all though, there's a point I have to get out of the way.
 Just how does one pronounce 'precise-*-merge'? Star-merge is already
 taken as a term, and asterisk-merge just doesn't roll off the tongue
 in quite the same way.

Precis(e)tar-r-merge? :-)

 This approach is entirely based on values, not graph node

There's a subtlety involved:

a1
   / \
  d1  b1
   \ / \
b?  c1   

b? would have been a conflict, resolved in favor of b. So the rule is:

In case of a conflict resolved for one of the values, you keep the 
winner's generational number.

On the other hand,

      a1
     /  \
    b1   a1
     \  /
      a2?

Here there would have been a clean merge to b1, overridden for a, so the 
rule is: 

In case a clean merge is overridden for the losing value, you increment 
the generation number.

It sounds like a potential divergence between the programmer's intent 
and the algorithm's interpretation. It does seem to work however... Any 
rationale?

Then there's implicit undo :-) The whole approach philosophically 
opposes implicit undo, since the undone generation number _must_ be 
higher than the original, causing conflicts. I really don't see how you 
could even approach trying to make implicit undo work...

Have fun,

Oren Ben-Kiki


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Re: 3-way merge considered harmful

2005-05-07 Thread Oren Ben-Kiki
 May 2005 00:29:48 -0700, Nathaniel Smith [EMAIL PROTECTED] said:

 njs Here's another pathological case for 3-way merge:
 njsA
 njs|
 njsB
 njs   / \
 njs  C   D

I'd like to throw in $0.02 worth here.

First, this problem has nothing to do with project file operations. The 
same thing would happen if you were talking about adding, removing and 
changing lines in a text file.

It seems to me you have two ways to go about solving this. The first is 
that a merge doesn't consider the state of the file (or project tree or 
whatever), but also the history leading to it. Taking this road to its 
logical end you end up with something like Darcs. In such a system, a 
version is defined as the sum of a set of deltas, rather than as a 
particular state.

On Wednesday 04 May 2005 18:41, [EMAIL PROTECTED] wrote:
 As I see it, this case and the criss-cross problem just show that
 picking any common ancestor is not enough to provide a correct
 merge ancestor (one that doesnt lose work), and much less a nice
 merge ancestor (a correct merge ancestor that also never requires the
 users to solve the same conflict twice).

Exactly. The second way is to stick with only using the three states of 
the file (or project tree or whatever). It is then inevitable that the 
choice of ancestor will affect the end result of a 3-way merge.

You can view this as a feature more than a bug. For example, using 3-way 
merge, with appropriate common ancestor selection, yields a 
cherry-picking operation:

/- B -- C
   A
\- D -- E

A 3-way merge of C and E using B as a base would cherry-pick the 
B--C change and apply it to E. True, B isn't even an ancestor of E, 
unless one considers reverse deltas; but that just drives the point 
that the base for a 3-way merge is a different thing from least 
common ancestor, depending on what you are trying to achieve.

 I fail to imagine an example 
 where the LCAD algorithm could be not correct by choosing an ancestor
 such as A in your example.

In a state-based (rather than delta-state) system, selecting the 
ancestor for a 3-way merge becomes a semantical operation; it reflects 
the _intent_ of the operation. This means there's no way to discover 
the one and only true common ancestor for merging versions B and E 
above; there is simply no such thing. The best the system can do is 
provide an automated way to locate the ancestor which causes the merge 
to reflect _a particular developer intent_.

 Im also not very convinced yet that there exists no algorithm to find
 a nice merge ancestor. Even then, I personally would be willing to
 solve the same conflict twice once in a while, if it was necessary to
 be able to use a standard 3-way merger to solve conflicts.

Certainly the system can compute the correct ancestor for a particular 
intent, such as merging a branch back to the main trunk etc. However, I 
suspect that any such algorithm must, to some extent, rely on 
meta-information that is not available in the topology itself (call 
this the expected development workflow).

BTW, in a Darcs-like system, the situation isn't much different. For the 
system to automatically select the set of deltas to include in a 
system, it again must rely on such meta-information.

Put another way - if Monotone, or Darcs for that matter subscribed to a 
particular workflow methodology (trunk/branch, development/release, 
etc.) then there would be no problem in the first place. However both 
systems attempt at providing basic building blocks which can be used to 
implement many types of workflows. Well, surprise - this means that 
when you use these building blocks, _you_ have to decide on how to 
apply them according to _your_ particular workflow, and there are some 
ways to apply them that make little or no sense.

Nathaniel's pathological case is an example of a nonsensical usage of 
3-way merge, just like char *p = 0; *p = 'a'; is a nonsensical use of 
C pointers; this doesn't mean 3-way merge, or pointers, are inherently 
useless, or need to be fixed somehow.

Have fun,

Oren Ben-Kiki


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] Re: 3-way merge considered harmful

2005-05-07 Thread Oren Ben-Kiki
 May 2005 00:29:48 -0700, Nathaniel Smith [EMAIL PROTECTED] said:

 njs Here's another pathological case for 3-way merge:
 njsA
 njs|
 njsB
 njs   / \
 njs  C   D

I'd like to throw in $0.02 worth here.

First, this problem has nothing to do with project file operations. The 
same thing would happen if you were talking about adding, removing and 
changing lines in a text file.

It seems to me you have two ways to go about solving this. The first is 
that a merge doesn't consider the state of the file (or project tree or 
whatever), but also the history leading to it. Taking this road to its 
logical end you end up with something like Darcs. In such a system, a 
version is defined as the sum of a set of deltas, rather than as a 
particular state.

On Wednesday 04 May 2005 18:41, [EMAIL PROTECTED] wrote:
 As I see it, this case and the criss-cross problem just show that
 picking any common ancestor is not enough to provide a correct
 merge ancestor (one that doesnt lose work), and much less a nice
 merge ancestor (a correct merge ancestor that also never requires the
 users to solve the same conflict twice).

Exactly. The second way is to stick with only using the three states of 
the file (or project tree or whatever). It is then inevitable that the 
choice of ancestor will affect the end result of a 3-way merge.

You can view this as a feature more than a bug. For example, using 3-way 
merge, with appropriate common ancestor selection, yields a 
cherry-picking operation:

/- B -- C
   A
\- D -- E

A 3-way merge of C and E using B as a base would cherry-pick the 
B--C change and apply it to E. True, B isn't even an ancestor of E, 
unless one considers reverse deltas; but that just drives the point 
that the base for a 3-way merge is a different thing from least 
common ancestor, depending on what you are trying to achieve.

 I fail to imagine an example 
 where the LCAD algorithm could be not correct by choosing an ancestor
 such as A in your example.

In a state-based (rather than delta-state) system, selecting the 
ancestor for a 3-way merge becomes a semantical operation; it reflects 
the _intent_ of the operation. This means there's no way to discover 
the one and only true common ancestor for merging versions B and E 
above; there is simply no such thing. The best the system can do is 
provide an automated way to locate the ancestor which causes the merge 
to reflect _a particular developer intent_.

 Im also not very convinced yet that there exists no algorithm to find
 a nice merge ancestor. Even then, I personally would be willing to
 solve the same conflict twice once in a while, if it was necessary to
 be able to use a standard 3-way merger to solve conflicts.

Certainly the system can compute the correct ancestor for a particular 
intent, such as merging a branch back to the main trunk etc. However, I 
suspect that any such algorithm must, to some extent, rely on 
meta-information that is not available in the topology itself (call 
this the expected development workflow).

BTW, in a Darcs-like system, the situation isn't much different. For the 
system to automatically select the set of deltas to include in a 
system, it again must rely on such meta-information.

Put another way - if Monotone, or Darcs for that matter subscribed to a 
particular workflow methodology (trunk/branch, development/release, 
etc.) then there would be no problem in the first place. However both 
systems attempt at providing basic building blocks which can be used to 
implement many types of workflows. Well, surprise - this means that 
when you use these building blocks, _you_ have to decide on how to 
apply them according to _your_ particular workflow, and there are some 
ways to apply them that make little or no sense.

Nathaniel's pathological case is an example of a nonsensical usage of 
3-way merge, just like char *p = 0; *p = 'a'; is a nonsensical use of 
C pointers; this doesn't mean 3-way merge, or pointers, are inherently 
useless, or need to be fixed somehow.

Have fun,

Oren Ben-Kiki


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel