On Tue, Aug 10, 2010 at 10:39 AM, Torben Weis <[email protected]> wrote:
> 2010/8/10 Joseph Gentle <[email protected]> > >> <?a "k"="v" ?>hello<?a "k" ?> >> >> ... if you want to delete the characters in the middle, you can use this >> op: >> retain(1), delete_characters("ell"), retain(1) >> This is fine because if you apply that operation, then apply its inverse >> you'll end up back at the original. >> > > I think you might we wrong, finally (although you have always been right so > far :-) > > Imagine several other users, each deleting one letter of "hello". The last > user inserts a letter in the middle of hello. This insert arrives at the > server as the last operation, however, they all target the same document > version. > You will end up in an inconsistent state, because you have the same problem > that I mentioned in the very beginning. > The format of the inserted character will be incorrect. > Its up to the transform function to transform this correctly. Some of the hairy edge cases are listed in the fedone tests[1] (thats how I figured it out). In this case, if your document is this: <?a "k"="v" ?>ab<?a "k" ?> ... and you have three concurrent operations: A = annotation_update("k", "v" -> NULL), delete_characters("a"), annotation_end("k"), retain(1) B = retain(1), delete_characters("b") C = retain(2), insert_characters("c") They reach the server in the order A, B, C. So, B gets transformed by A and C gets transformed by A and B'. In this case, B transforms to: B' = annotation_update("k", "v" -> NULL), delete_characters("b"), annotation_end("k") C gets transformed first by A then B'. When C is transformed by A it becomes: C' = retain(1), insert_characters("c") Finally, we transform C' against B'. C'' = retain(1), annotation_update("k", NULL, "v"), insert_characters("c"), annotation_end("k") If we instead transformed B' against C', we would get: B'' = annotation_update("k", "v" -> NULL), delete_characters("b"), retain(1), annotation_end("k") Using the random op generator is basically the only way to make sure you handle all the edge cases correctly. (I still have a few bugs in my transform function I haven't fixed yet...) Good luck! :D -Joseph [1] http://code.google.com/p/wave-protocol/source/browse/test/org/waveprotocol/wave/model/document/operation/algorithm/DocOpTransformerTest.java?repo=libraries > Greetings > Torben > > > >> >> However, if you want to delete the whole string, you'll need to say this: >> annotation_boundary("k", "v" -> NULL), delete_characters("hello"), >> annotation_boundary_end("k") >> >> .. for the same reason. >> >> >>> >>> According to the specs, the same is true for the retain operation. >>> >>> Quote: "For update components (retain is one), the old values in the >>> annotations update match the annotation values of each item in the input >>> document that the component processes. The generated items in the output >>> document will have the same annotations as the corresponding input items, >>> except for the annotation keys in the annotations update; for those keys, >>> the generated items will have the new values." >>> >>> This means when I want to insert a single new character, the mutation >>> operation contains 1000 annotation updates, because I must "retain" all 1000 >>> characters and therefore I must provide their annotation. This is quite >>> costly. >>> >> >> No, thats not true of retains. You can retain past all the annotations. If >> you insert characters, the characters you insert will inherit the >> annotations from their left. >> >> The best way to think of it is to imagine a 3-pass algorithm. >> >> Pass 1: Do all the insert operations. Skip past everything else. >> Pass 2: Apply all the annotation_boundary components in the op. Skip past >> everything else. >> Pass 3: Do all the delete operations, and merge any adjacent annotation >> boundaries. Skip past everything else. >> >> Its quite possible (though much more complicated) to do a single-pass >> compose function. But, it has to match that functionality. This results in >> what looks like some somewhat twisted logic, for example in a single-pass >> compose function when you encounter an annotation boundary the behaviour can >> be different depending on whether the next annotation component is a >> retain/delete or a create. >> >> I fixed a whole bunch of weird bugs in my code when I started using the >> random op generator. I think the three-pass algorithm is probably pretty >> simple, but yeah - the corner cases in the 1 pass functions will get you. >> >> -J >> >> One might argue that this is an unlikely scenario. But remember that >>> cursors are realized with annotations. If 100 users have edited in a public >>> wave, then there are at least 100 annotation boundaries in the document >>> which have to be repeated in every operation. Even worse, each new session >>> creates a new annotation ... >>> >>> Is there any garbage collection on the server side that removes old >>> annotations to clear up the mess? >>> I must admit that annotations & OT are quite mind twisting when it comes >>> to the corner cases. >>> >>> Greetings >>> Torben >>> >>> 2010/8/6 Joseph Gentle <[email protected]> >>> >>>> In this case, the delete op must also delete the annotation. If it >>>> didn't, the op wouldn't be invertable (if you apply the op then apply its >>>> inverse, you'll end up removing the annotation). >>>> >>>> Your example should look like: >>>> mutation1 := anno('bold': 'true' -> NULL), delete 'ab', anno(end >>>> 'bold'), retain 2 >>>> mutation2 := retain 2, insert 'x', retain 2 >>>> >>>> After transform, mutation1_t is the same as you suggested, but >>>> >>>> mutation2_t := anno ('bold': NULL -> true) insert 'x', anno (end >>>> 'bold'), retain 2 >>>> >>>> ... and you get the correct behaviour. >>>> >>>> -J >>>> >>>> >>>> On Fri, Aug 6, 2010 at 11:14 AM, Torben Weis <[email protected]>wrote: >>>> >>>>> Hi, >>>>> >>>>> I encountered a problem while unit testing my OT code. Either I just >>>>> found a major flaw in the OT algorithm (i.e. a flaw that is almost >>>>> impossible to repair) or I am too tired to see the solution. Here is the >>>>> problem: >>>>> >>>>> Let "A, B ,C" be BOLD letters, and "a, b, c" denote the same letters, >>>>> but not in bold. >>>>> Now imagine a simple document with the content >>>>> >>>>> doc := ABcd >>>>> >>>>> i.e. "ab" in bold and "cd" not in bold. Now two users edit the document >>>>> concurrently. >>>>> >>>>> mutation1 := delete:2, retain:2 >>>>> mutation2 := retain:2, insert:"x", retain:2 >>>>> >>>>> i.e. user1 sees "cd" and user2 sees "ABXcd". Notice that the "x" is >>>>> bold -> "X", because in word processors you inherit the style from the >>>>> left. >>>>> This is important here! >>>>> >>>>> Now I transform both mutations and I get: >>>>> >>>>> mutation1_t := delete:2 retain:3 // I skip the inserted "x" -> >>>>> therefore retain2 >>>>> mutation2_t := insert"x", retain:2 // No need to skip "AB", because >>>>> they are already deleted >>>>> >>>>> Now user1 one applies mutation2_t and user2 applies mutation1_t. They >>>>> should end up in the same state. >>>>> >>>>> However, in the end user1 sees "xcd" and user2 sees "Xcd". The >>>>> difference is that x is not consistently annotated. >>>>> >>>>> As far as can see right now, the transformation function cannot do >>>>> anything because it needs style annotation that is only in the document >>>>> but >>>>> not in the mutations. >>>>> >>>>> I hope that someone can enlighten me here, because otherwise the OT >>>>> algorithm is broken. >>>>> >>>>> Greetings >>>>> Torben >>>>> >>>>> -- >>>>> You received this message because you are subscribed to the Google >>>>> Groups "Wave Protocol" group. >>>>> To post to this group, send email to [email protected]. >>>>> To unsubscribe from this group, send email to >>>>> [email protected]<wave-protocol%[email protected]> >>>>> . >>>>> For more options, visit this group at >>>>> http://groups.google.com/group/wave-protocol?hl=en. >>>>> >>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "Wave Protocol" group. >>>> To post to this group, send email to [email protected]. >>>> To unsubscribe from this group, send email to >>>> [email protected]<wave-protocol%[email protected]> >>>> . >>>> For more options, visit this group at >>>> http://groups.google.com/group/wave-protocol?hl=en. >>>> >>> >>> >>> >>> -- >>> --------------------------- >>> Prof. Torben Weis >>> Universitaet Duisburg-Essen >>> [email protected] >>> >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "Wave Protocol" group. >>> To post to this group, send email to [email protected]. >>> To unsubscribe from this group, send email to >>> [email protected]<wave-protocol%[email protected]> >>> . >>> For more options, visit this group at >>> http://groups.google.com/group/wave-protocol?hl=en. >>> >> >> -- >> You received this message because you are subscribed to the Google Groups >> "Wave Protocol" group. >> To post to this group, send email to [email protected]. >> To unsubscribe from this group, send email to >> [email protected]<wave-protocol%[email protected]> >> . >> For more options, visit this group at >> http://groups.google.com/group/wave-protocol?hl=en. >> > > > > -- > --------------------------- > Prof. Torben Weis > Universitaet Duisburg-Essen > [email protected] > > -- > You received this message because you are subscribed to the Google Groups > "Wave Protocol" group. > To post to this group, send email to [email protected]. > To unsubscribe from this group, send email to > [email protected]<wave-protocol%[email protected]> > . > For more options, visit this group at > http://groups.google.com/group/wave-protocol?hl=en. > -- You received this message because you are subscribed to the Google Groups "Wave Protocol" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/wave-protocol?hl=en.
