On 04.07.20 at 23:43, James Cook wrote:
> On Sat, 4 Jul 2020 at 09:59, Ben Franksen <ben.frank...@online.de> wrote:
>> On 04.07.20 at 00:10, James Cook wrote:
>>>>> I'd say this is a feature, not a bug. If you mark the conflict
>>>>> between A and B as resolved, *without* recording a new patch, then
>>>>> this means you are satisfied with the resolution that reverts both
>>>>> changes. There is no reason why pulling C should conflict now. If,
>>>>> on the other hand, you are not satisfied with this resolution, you
>>>>> may want to record a "proper" resolution, usually a mixture of A
>>>>> and B. This will then typically conflict with C.
>>>
>>> That's a great point. I'm much less worried now.
>>
>> You should be, though. I think what I wrote above is nonsense.
>>
>> [...]
>
> I'm not sure I understand your concern right.
Ah, forget it. My concern was about cancelling inverses. I am biased
by my own theorizing about patches. I once thought that patch
sequences should form a groupoid, which requires that we must always
cancel inverses. This is apparently not what you are aiming at.
However, I do think you need to cancel inverses in /addresses/ as part
of the minimization process.

> Let me try to step through what I'd like to happen.
>
> Suppose the starting context is s and patches A and B conflict and
> have ending contexts a and b.
>
> In repository R1, the user tries to merge A and B, sees conflict
> markers, and records a conflict resolution. The conflict resolution
> consists of A^, B^, and a new patch C which replaces A and B. The
> repository now looks like this (after re-ordering a bit):
>
> s A a A^ s B b B^ s C c

There is nothing wrong with that as long as you always keep all
patches in the sequence and never annihilate inverses.

> Now, suppose the user tries to pull from another repository R2 that
> has just patch A. Here, it is very important that R1 "remember" that
> the A\/B conflict was already resolved. Otherwise, it may treat C \/ A
> as a new conflict.
>
> But can't we just observe that the name "A" is already present in R1,
> and therefore there's no need to pull in A? Wouldn't that ensure that
> we never again try to resolve a conflict involving A?

Of course. If you keep "inverses" in the repo (i.e. don't cancel
them), then you are now in a situation where you want to merge
(A;A^;B;B^;C)\/(A). The common subsequence is (A), the uncommon parts
are (A^;B;B^;C)\/(), and their merge is trivially ()/\(A^;B;B^;C). So
the result is A;A^;B;B^;C, i.e. equal to R1.

> I don't know how darcs figures out what patches need to be pulled,
> especially if R1 is lazy.

Don't let yourself be confused by lazy repos.
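The trivial merge above can be illustrated with a toy model in which a
patch is just a name and a repo is a list of names. None of this is
darcs's actual code, and commutation (which real darcs needs to move
common patches to the front) is not modeled at all:

```python
# Toy model: a repo is a list of patch names, already ordered so that
# the common patches form a common prefix.

def split_common(xs, ys):
    """Split two repos into (common prefix, uncommon left, uncommon right)."""
    i = 0
    while i < len(xs) and i < len(ys) and xs[i] == ys[i]:
        i += 1
    return xs[:i], xs[i:], ys[i:]

def merge_repos(xs, ys):
    """Merge is trivial when one side has nothing new; otherwise the
    real merge algorithm would run on the two uncommon sequences."""
    common, ls, rs = split_common(xs, ys)
    if not rs:
        return common + ls
    if not ls:
        return common + rs
    raise NotImplementedError("non-trivial merge not sketched here")

# (A;A^;B;B^;C) \/ (A): common part is (A), uncommon parts are
# (A^;B;B^;C) \/ (), so the result equals R1 unchanged.
r1 = ["A", "A^", "B", "B^", "C"]
r2 = ["A"]
print(merge_repos(r1, r2))  # ['A', 'A^', 'B', 'B^', 'C']
```

The point of the sketch is only that pulling A from R2 adds nothing:
the name "A" is already present, so the merge degenerates to the
identity on R1.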
Conceptually, darcs reads the complete remote repo, figures out the
common patches, commutes all uncommon patches in both repos to the
end, and then runs the actual merge algorithm only on the remaining
(uncommon) sequences. The camp paper has a chapter with many nice
pictures where this is explained in detail.

In reality we almost never need to read the full remote repo. This is
because the order of patches in a repository is stored in so-called
inventories. An inventory is a file that lists patch names along with
the content hash of each patch. Inventories can be nested and are
stored hashed, too (i.e. the file name is equal to the content hash).
This makes it easy to compare two repos efficiently to find the latest
point at which they diverge: we only need to download remote
inventories until we find an identical one in our local repo. Since
cloning preserves inventories and operations deep inside a repo are
uncommon (and also pretty expensive), this search usually terminates
quickly.

Lazy repos are missing only patches listed in closed inventories,
which means that pulling from a lazy repo normally works fine.
However, it /can/ fail if we actually need access to such a patch
(e.g. in order to commute it) and we cannot download it from any
source repository listed in the local repo (which includes the
per-user cache).

> I suspect the current implementation might
> need to be extended so that a repository stores a set of all names of
> patches that have been cancelled out, i.e. the names of all patches A
> such that both A and A^ are present. But this is speculative because I
> don't know how that algorithm works.

See above. You can add such a set as an optimization, but the
algorithm doesn't need it *if* you keep all patches in the sequence
that represents the state of your repo.
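The inventory comparison described above could be sketched roughly
like this. The hashes, names, and data layout are invented for
illustration; this is not darcs's actual on-disk format:

```python
# Toy model: each closed inventory is identified by its content hash,
# and the remote repo presents its inventories as a chain ordered
# newest first. We "download" remote inventories only until we hit a
# hash that already exists locally; everything older is known to be
# shared by both repos.

def divergence(local_hashes, remote_chain):
    """Return the remote inventory hashes that must be fetched before
    we find one that is identical to a local inventory (i.e. the point
    at which the two repos diverge)."""
    needed = []
    for h in remote_chain:
        if h in local_hashes:
            break  # identical inventory found: stop downloading
        needed.append(h)
    return needed

# Local repo knows inventories h1 and h2; the remote chain is
# h4 -> h3 -> h2 -> h1, so only h4 and h3 need to be examined.
print(divergence({"h1", "h2"}, ["h4", "h3", "h2", "h1"]))  # ['h4', 'h3']
```

Because inventories are content-addressed, comparing two of them is a
single string comparison, which is why the search terminates as soon
as the histories converge.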
>> If you want to have proper inverses (for conflict resolution) and
>> still remember all of the history, you must abandon the idea that a
>> repo is a sequence of patches (or rather: an equivalence class of
>> such sequences, modulo permutations). So your idea of phantom
>> patches (a patch together with its tombstone) amounts to saying that
>> a repo is not a sequence of patches, but a tree (with multiple dead
>> branches), and to merge a patch means to merge it with all branches
>> of the tree (you'll have to work out what that means precisely).
>
> I had been thinking about representing a repo as a tree.

A sequence of patches is a path in the patch graph. If you can walk
back, then what you get has the form of a tree. This is no longer true
if you can walk in circles, though.

If we assume that the state of a repository is given by a sequence of
patches (modulo commutation), then we could store the repo as a tree.
This representation uses less memory and requires less commutation (we
don't have to store or commute inverse patches), but the algorithms
become more complicated. This is an implementation issue, so not of
interest for the theory.

> I think there is a certain equivalence between trees and patch
> sequence addresses:

I am pretty sure you only get unique minimal patch addresses if you
eliminate primitive inverse pairs (see below), which means minimal
sequence addresses are necessarily linear. This does not contradict
what I wrote above: the repository state is not a sequence address,
but a plain sequence.

> given a tree, you can turn it into a patch
> sequence address by doing a depth first search (record positive
> patches as you go deeper into the tree, and negative patches when you
> return back upward). I suspect the other direction can be done too,
> after you extend (Qi) in a patch sequence address to be a cycle and
> then do some commuting, but I'm not sure.

I guess you want to avoid proper cycles, i.e.
cycles other than those of the form patches;patches^, but that's just
an intuition.

> An advantage of the patch sequence address representation over the
> tree representation is that I find it easier to think about how
> commuting can modify a cycle. With the tree representation you'd want
> to push conflicts as far down the tree as possible. This should be
> related to the minimality condition for patch sequence addresses, but
> I haven't wrapped my head around it.

Well, suppose you have primitive A;(B\/C) where neither A;B nor A;C
commute. If we start with the sequence A;B;B^;C and want to minimize
that to A;C, we have no other choice but to cancel inverses. Otherwise
you'd have the situation that A;B;B^;C and A;C are not equivalent as
context addresses. (I am not using the full notation here because
primitive contexts are implicit in primitive patches, assuming all
patches are well-typed, and the sets X and Y are irrelevant here.)

Cheers
Ben

_______________________________________________
darcs-users mailing list
darcs-users@osuosl.org
https://lists.osuosl.org/mailman/listinfo/darcs-users
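The two ideas above, depth-first flattening of a repo tree into a
sequence with inverses, and cancellation of adjacent inverse pairs
during minimization, can be sketched with a toy model. Patches are
bare names, inversion just appends "^", and commutation is not modeled
at all, so this is only an illustration, not darcs code:

```python
# Toy model: a tree is a list of (patch, subtree) pairs. Earlier
# siblings are dead branches; the last sibling is the live line of
# development.

def inv(p):
    return p + "^"

def flatten_dead(tree):
    """Inside a dead branch, every edge is entered and left again."""
    result = []
    for p, sub in tree:
        result += [p] + flatten_dead(sub) + [inv(p)]
    return result

def flatten(tree):
    """Depth-first flattening: emit a patch on the way down an edge
    and its inverse on the way back up, except along the live branch,
    which we never return from."""
    if not tree:
        return []
    result = []
    for p, sub in tree[:-1]:
        result += [p] + flatten_dead(sub) + [inv(p)]
    p, sub = tree[-1]
    return result + [p] + flatten(sub)

def cancel(seq):
    """Minimization step: cancel adjacent inverse pairs (without any
    commutation), so A;B;B^;C becomes A;C."""
    out = []
    for p in seq:
        if out and (out[-1] == inv(p) or p == inv(out[-1])):
            out.pop()
        else:
            out.append(p)
    return out

# Dead branches A and B, live branch C: the tree flattens to the
# sequence A;A^;B;B^;C from the example earlier in this mail.
tree = [("A", []), ("B", []), ("C", [])]
print(flatten(tree))                  # ['A', 'A^', 'B', 'B^', 'C']
print(cancel(["A", "B", "B^", "C"]))  # ['A', 'C']
```

Note that cancel deliberately ignores commutation: in the A;(B\/C)
example above, where A;B and A;C do not commute, plain cancellation of
the B;B^ pair is the only way to reach the minimal A;C.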