Re: [Monotone-devel] Re: Deterministic *-merge
On Fri, 2007-01-12 at 18:14 -0800, Nathaniel J. Smith wrote: Anyway, the answer to your question is that I'm not proposing anything at all change in monotone -- that's why I said at the beginning of the writeup that my note had no practical consequences :-). I think that merge behavior is one of the most crucial pieces of how a VCS behaves. Darcs is an example of this approach taken to extreme, and demonstrates how a solid theoretical basis for the merge can be viewed as the core design choice of a VCS. Having something equivalent for more traditional version control systems would be a real breakthrough IMVHO. As for practical implications - cherry picking can be defined in terms of a merge operation, as long as you *really* trust your merge :-) We've generally taken the line that the history graph should attempt to model as closely as possible the user's intrinsic understanding of versioning a bunch of files, because this is the both the most user-friendly and the most future-proof approach. (Sucks to build a particular merge algorithm's assumptions into your hash-chained history graph, and thus be stuck with that merge algorithm for a few decades...) That's true. On the other hand, there's immense power to be gained from picking a powerful merge abstraction and running with it (again Darcs is a good example). That said, I dunno, maybe someone will figure out how to use this kind of stuff to improve VCS UI -- it's just some ideas, who knows what will happen to them once they start wandering around in people's heads... For starters, annotating the '#' with the list of candidate values immediately suggests an interface for conflict resolution. Also presumably the cases where conflict actually occurs would (hopefully) be more in line with user expectations. And then there are the advanced features you could build on top of this (like cherry-picking). These all require that the chosen merge abstraction be really solid, which translates to trying to poke theoretical holes in it, and then trying it in practice for a while, until sufficient confidence is reached. Most VCS don't trust their merge method (for good reason). On the down side, this restrict their power; on the up side, they allow plugging in different merge algorithms, so it would be possible to implement and test deterministic *-merge in them to see how well it works in practice. Monotone is a particularly good platform for this because it is based on solid abstractions in other areas (what is a version, what is a branch, etc.) ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Deterministic *-merge
On Fri, 2007-01-12 at 03:00 -0800, Nathaniel J. Smith wrote: ... Deterministic merging = Beautiful! There's just one point I didn't follow, though. But, magically, with deterministic *-merge, all orders work the same -- it even turns out to be possible to merge two conflicts and get out a non-conflict (!): a a a / \ / \ / \ b* b*b* b*b* b* / \ / \ / \ / \ / \ / \ c* b c*c* b c*c* b c* \ / / \ \ / \ / \ / # / \ # # # \ / \ / \ / c c c You lost me here. In the '#' node, you lost the specific values 'b' and 'c' - all you have is 'some unknown value'. How does merging two 'unknown values' produce a specific value? I like the idea of not using the generic '#' and instead listing the set of candidate values. This can be done in text files by some not-so-pretty syntax tricks. Speaking of which, how did you envision representing '#' in actual text files? If the result of a conflict was replacing a scalar with a set of conflicting values, we'd get a user action (creating a separate node) that would pick one or replace the set by a new one. It seems like this would be very useful when users need to resolve conflicts in practice... At first I thought that was how you obtained 'c' above, but it turns out not to be the case: a a a / \ / \ / \ b* b*b* b*b* b* / \ / \ / \ / \ / \ / \ c* b c*c* b c*c* b c* \ / / \ \ / \ / \ / {b,c} / \ {b,c} {b,c}{b,c} \ / \ / \ / c c {b,c} So I'm still stumped on how you obtained 'c' in all three cases. I must have missed some important point here... ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
[Monotone-devel] Re: Deterministic *-merge
On Fri, 2007-01-12 at 10:35 -0600, Timothy Brownawell wrote: Because the value of a merge node is chosen from *(node). The multi-*-merge writup at http://article.gmane.org/gmane.comp.version-control.revctrl/93 says that *(A) is the minimal set of marked ancestors of A. Adding labels to the individual nodes produces a1 / \ b1* b2* / \ / \ c1* b3 c2* \ / \ / #1 #2 \ / c3 The article states: Algorithm: Given two nodes to merge, A and B, we consider four cases: a) value(A) = value(B): return the shared value b) *(A) B: return value(B) c) *(B) A: return value(A) d) else: conflict; escalate to user Where *(A) B means all elements of the set *(A) are non-strict ancestors of the revision B. The right way to read this is as try (a) first, and then if that fails try (b), (c), (d) simultaneously. The post said: This is already obviously true for non-merge nodes. We want it to be true for merge nodes too. The easy way is to simply use it as the definition of merge(A, B). If every node in *(A) union *(B) has the same value, then cool -- we can just make merge(A, B) that value. If there are different values, then we have only one option -- we must define merge(A, B) to be #, because # is the only thing that is similar to multiple different values. I'm still missing it. Just to see I get the algorithms right (sorry for being dense): Old way: value(b1) = value(b2) By (a), merge into b3 New way: *(b1) union *(b2) = (b1, b2) exist v (= 'b') such that forall n in above, value(n) = v Merge into b3 Old way: value(c1) != value(b3) *(c1) = c1 *(b3) = (b1, b2) Not *(b3) c Not *(c1) b3 Conflict. New way: *(c1) = c1 *(b3) = (b1, b2) *(c1) union *(b3) = (c1, b1, b2) not exist v such that forall n in above value(n) = v Merge into '#' So far, so good. But: *(#1) = (c1, b2) *(#2) = (c2, b1) *(#1) union *(#2) = (c1, b2, c2, b1) not exist v such that forall n in above value(n) = v Merge into '#' - how does this yield 'c'? Trying a less formal way of thinking about it: Looking at the graph above, if you (informally) think in terms of values, then in both '#1' and '#2' we can see in the history the user preferred the value 'c' over the value 'b'. If we (somehow) worked in term of values rather than nodes, both '#1' and '#2' would become 'c'. Some people would argue that an ideal merge should give this result :-) However we are thinking in term of ancestor nodes instead. In this case, we have no indication that c1 overrides b2 (or c2 overrided b1), so we are forced to merge the node into '#'. That's probably acceptable, given the advantages of the approach. Now we merge the two nodes, we know that c1 overrides b1 and c2 overrides b2. So it makes sense that the result is 'c' (even though this would be surprising to the uninitiated newbie). However, I don't see how the algorithm produces this result. It is tempting to try to improve the algorithm to be more value-oriented. If we culled the minimal marked ancestor set such that there were no duplicate values... Or possibly culled the union of the two merged nodes... If we culled *(b3) to be just (b1), or culled *(c1) union *(b3) to be (c1, b1), then we could merge c1 and b3 to 'c'. Previous attempts in this direction weren't very successful, but perhaps the addition of '#' into the mix makes it possible again? ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
[Monotone-devel] Re: Deterministic *-merge
On Fri, 2007-01-12 at 12:57 -0600, Timothy Brownawell wrote: What you're missing is the minimal part of the definition of *(node). You don't just union the mark sets of the parent nodes, you take that union and then run erase_ancestors() on it. smack self on head of course. Its also sort of what I had in mind when I thought of culling the set. Makes perfect sense. I knew I was missing something obvious. Sorry. One last question :-) Is the idea that the '#' nodes be completely virtual? That is, if I have two actual versions in monotone, one with a value of 'a' and one with a value of 'b', and I try to merge them... what would happen in practice? The simplest way would be to force the user to resolve every '#' to a specific value (just like he does today when there is a conflict), so in the monotone repository we would see: a* b* \ / c* Only during the merge algorithm the graph would be considered to be: a* b* \ / # | c* So all the '#' nodes would only exist virtually, never in a real version. If that's the case, then the example of merging two '#' nodes would be impossible to construct in practice. The more complex alternative would be to allow '# values in real versions. This would allow merging two '#' nodes to cleanly remove the conflict, but it would also require some way of actually representing '# values inside text files. I assume that what you are proposing is the former (simple) method? ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
[Monotone-devel] Re: Using monotone in a team
On Sat, 2006-12-02 at 01:49 +, Boris wrote: Hugo Cornelis wrote: Well, what I really want to do is have a mechanism that automatically distributes the necessary hooks from one central point. For sure the members of the same team need the same trust policies, local alterations not allowed. So if you want to join the team, you first have to synchronize the trust policies. This does not preclude simple cloning for private work. Perhaps the right question is, does this layer belong to monotone or not ? I guess it does not belong to monotone. As you even want to forbid developers to change trust policies I think a distributed revision control system is not what you want. What monotone does is copying data between databases. Once it is there revisions can be combined flexibly. And how they are combined depends on the local configuration (the hook functions). At least that's my understanding so far (as usual without any warranties ;). Sounds strange to me. I'd expect the trust policy and (all? some?) of the hooks to be part of the data that monotone allows copying between the databases. This raises some issues with merging branches - you'd want to merge just the code changes, and not necessarily the policy changes. Also some hooks depend on the developer environment (e.g. the messy line-ending issue), while others may need be shared by all developers (e.g., simple pre-commit verifications). So it may be necessary to have that distinction be explicit in the system somehow. Tricky as this may be to get right, this is something you'd expect every distributed version control system to address somehow... Oren Ben-Kiki ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Re: [Revctrl] improvements to *-merge
On Friday 02 September 2005 21:48, Bram Cohen wrote: First of all though, there's a point I have to get out of the way. Just how does one pronounce 'precise-*-merge'? Star-merge is already taken as a term, and asterisk-merge just doesn't roll off the tongue in quite the same way. Precis(e)tar-r-merge? :-) This approach is entirely based on values, not graph node There's a subtlety involved: a1 / \ d1 b1 \ / \ b? c1 b? would have been a conflict, resolved in favor of b. So the rule is: In case of a conflict resolved for one of the values, you keep the winner's generational number. On the other hand, a1 / \ b1 a1 \ / a2? Here there would have been a clean merge to b1, overridden for a, so the rule is: In case a clean merge is overridden for the losing value, you increment the generation number. It sounds like a potential divergence between the programmer's intent and the algorithm's interpretation. It does seem to work however... Any rationale? Then there's implicit undo :-) The whole approach philosophically opposes implicit undo, since the undone generation number _must_ be higher than the original, causing conflicts. I really don't see how you could even approach trying to make implicit undo work... Have fun, Oren Ben-Kiki ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Re: 3-way merge considered harmful
May 2005 00:29:48 -0700, Nathaniel Smith [EMAIL PROTECTED] said: njs Here's another pathological case for 3-way merge: njsA njs| njsB njs / \ njs C D I'd like to throw in $0.02 worth here. First, this problem has nothing to do with project file operations. The same thing would happen if you were talking about adding, removing and changing lines in a text file. It seems to me you have two ways to go about solving this. The first is that a merge doesn't consider the state of the file (or project tree or whatever), but also the history leading to it. Taking this road to its logical end you end up with something like Darcs. In such a system, a version is defined as the sum of a set of deltas, rather than as a particular state. On Wednesday 04 May 2005 18:41, [EMAIL PROTECTED] wrote: As I see it, this case and the criss-cross problem just show that picking any common ancestor is not enough to provide a correct merge ancestor (one that doesnt lose work), and much less a nice merge ancestor (a correct merge ancestor that also never requires the users to solve the same conflict twice). Exactly. The second way is to stick with only using the three states of the file (or project tree or whatever). It is then inevitable that the choice of ancestor will affect the end result of a 3-way merge. You can view this as a feature more than a bug. For example, using 3-way merge, with appropriate common ancestor selection, yields a cherry-picking operation: /- B -- C A \- D -- E A 3-way merge of C and E using B as a base would cherry-pick the B--C change and apply it to E. True, B isn't even an ancestor of E, unless one considers reverse deltas; but that just drives the point that the base for a 3-way merge is a different thing from least common ancestor, depending on what you are trying to achieve. I fail to imagine an example where the LCAD algorithm could be not correct by choosing an ancestor such as A in your example. In a state-based (rather than delta-state) system, selecting the ancestor for a 3-way merge becomes a semantical operation; it reflects the _intent_ of the operation. This means there's no way to discover the one and only true common ancestor for merging versions B and E above; there is simply no such thing. The best the system can do is provide an automated way to locate the ancestor which causes the merge to reflect _a particular developer intent_. Im also not very convinced yet that there exists no algorithm to find a nice merge ancestor. Even then, I personally would be willing to solve the same conflict twice once in a while, if it was necessary to be able to use a standard 3-way merger to solve conflicts. Certainly the system can compute the correct ancestor for a particular intent, such as merging a branch back to the main trunk etc. However, I suspect that any such algorithm must, to some extent, rely on meta-information that is not available in the topology itself (call this the expected development workflow). BTW, in a Darcs-like system, the situation isn't much different. For the system to automatically select the set of deltas to include in a system, it again must rely on such meta-information. Put another way - if Monotone, or Darcs for that matter subscribed to a particular workflow methodology (trunk/branch, development/release, etc.) then there would be no problem in the first place. However both systems attempt at providing basic building blocks which can be used to implement many types of workflows. Well, surprise - this means that when you use these building blocks, _you_ have to decide on how to apply them according to _your_ particular workflow, and there are some ways to apply them that make little or no sense. Nathaniel's pathological case is an example of a nonsensical usage of 3-way merge, just like char *p = 0; *p = 'a'; is a nonsensical use of C pointers; this doesn't mean 3-way merge, or pointers, are inherently useless, or need to be fixed somehow. Have fun, Oren Ben-Kiki ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel
Re: [Monotone-devel] Re: 3-way merge considered harmful
May 2005 00:29:48 -0700, Nathaniel Smith [EMAIL PROTECTED] said: njs Here's another pathological case for 3-way merge: njsA njs| njsB njs / \ njs C D I'd like to throw in $0.02 worth here. First, this problem has nothing to do with project file operations. The same thing would happen if you were talking about adding, removing and changing lines in a text file. It seems to me you have two ways to go about solving this. The first is that a merge doesn't consider the state of the file (or project tree or whatever), but also the history leading to it. Taking this road to its logical end you end up with something like Darcs. In such a system, a version is defined as the sum of a set of deltas, rather than as a particular state. On Wednesday 04 May 2005 18:41, [EMAIL PROTECTED] wrote: As I see it, this case and the criss-cross problem just show that picking any common ancestor is not enough to provide a correct merge ancestor (one that doesnt lose work), and much less a nice merge ancestor (a correct merge ancestor that also never requires the users to solve the same conflict twice). Exactly. The second way is to stick with only using the three states of the file (or project tree or whatever). It is then inevitable that the choice of ancestor will affect the end result of a 3-way merge. You can view this as a feature more than a bug. For example, using 3-way merge, with appropriate common ancestor selection, yields a cherry-picking operation: /- B -- C A \- D -- E A 3-way merge of C and E using B as a base would cherry-pick the B--C change and apply it to E. True, B isn't even an ancestor of E, unless one considers reverse deltas; but that just drives the point that the base for a 3-way merge is a different thing from least common ancestor, depending on what you are trying to achieve. I fail to imagine an example where the LCAD algorithm could be not correct by choosing an ancestor such as A in your example. In a state-based (rather than delta-state) system, selecting the ancestor for a 3-way merge becomes a semantical operation; it reflects the _intent_ of the operation. This means there's no way to discover the one and only true common ancestor for merging versions B and E above; there is simply no such thing. The best the system can do is provide an automated way to locate the ancestor which causes the merge to reflect _a particular developer intent_. Im also not very convinced yet that there exists no algorithm to find a nice merge ancestor. Even then, I personally would be willing to solve the same conflict twice once in a while, if it was necessary to be able to use a standard 3-way merger to solve conflicts. Certainly the system can compute the correct ancestor for a particular intent, such as merging a branch back to the main trunk etc. However, I suspect that any such algorithm must, to some extent, rely on meta-information that is not available in the topology itself (call this the expected development workflow). BTW, in a Darcs-like system, the situation isn't much different. For the system to automatically select the set of deltas to include in a system, it again must rely on such meta-information. Put another way - if Monotone, or Darcs for that matter subscribed to a particular workflow methodology (trunk/branch, development/release, etc.) then there would be no problem in the first place. However both systems attempt at providing basic building blocks which can be used to implement many types of workflows. Well, surprise - this means that when you use these building blocks, _you_ have to decide on how to apply them according to _your_ particular workflow, and there are some ways to apply them that make little or no sense. Nathaniel's pathological case is an example of a nonsensical usage of 3-way merge, just like char *p = 0; *p = 'a'; is a nonsensical use of C pointers; this doesn't mean 3-way merge, or pointers, are inherently useless, or need to be fixed somehow. Have fun, Oren Ben-Kiki ___ Monotone-devel mailing list Monotone-devel@nongnu.org http://lists.nongnu.org/mailman/listinfo/monotone-devel