On Tue, Aug 10, 2010 at 3:33 AM, Torben Weis <[email protected]> wrote:

> Hi,
>
> I read the spec back and forth and my conclusion is that all document
> operations must repeat all annotations of the document.
>
> To build a worst case: A document consists of 1000 characters, each with
> different annotations.
> When deleting 1000 characters I have to repeat the annotation for each
> deleted character in the mutation operation (this is what the spec says and
> complies with the solution suggested by Joseph).
> So delete can be expensive, I accept this.
>

Not always. If you have this document:

<?a "k"="v" ?>hello<?a "k" ?>

... if you want to delete the characters in the middle, you can use this op:
retain(1), delete_characters("ell"), retain(1)
This is fine because if you apply that operation, then apply its inverse
you'll end up back at the original.

However, if you want to delete the whole string, you'll need to say this:
annotation_boundary("k", "v" -> NULL), delete_characters("hello"),
annotation_boundary_end("k")

.. for the same reason.


>
> According to the specs, the same is true for the retain operation.
>
> Quote: "For update components (retain is one), the old values in the
> annotations update match the annotation values of each item in the input
> document that the component processes. The generated items in the output
> document will have the same annotations as the corresponding input items,
> except for the annotation keys in the annotations update; for those keys,
> the generated items will have the new values."
>
> This means when I want to insert a single new character, the mutation
> operation contains 1000 annotation updates, because I must "retain" all 1000
> characters and therefore I must provide their annotation. This is quite
> costly.
>

No, thats not true of retains. You can retain past all the annotations. If
you insert characters, the characters you insert will inherit the
annotations from their left.

The best way to think of it is to imagine a 3-pass algorithm.

Pass 1: Do all the insert operations. Skip past everything else.
Pass 2: Apply all the annotation_boundary components in the op. Skip past
everything else.
Pass 3: Do all the delete operations, and merge any adjacent annotation
boundaries. Skip past everything else.

Its quite possible (though much more complicated) to do a single-pass
compose function. But, it has to match that functionality. This results in
what looks like some somewhat twisted logic, for example in a single-pass
compose function when you encounter an annotation boundary the behaviour can
be different depending on whether the next annotation component is a
retain/delete or a create.

I fixed a whole bunch of weird bugs in my code when I started using the
random op generator. I think the three-pass algorithm is probably pretty
simple, but yeah - the corner cases in the 1 pass functions will get you.

-J

One might argue that this is an unlikely scenario. But remember that cursors
> are realized with annotations. If 100 users have edited in a public wave,
> then there are at least 100 annotation boundaries in the document which have
> to be repeated in every operation. Even worse, each new session creates a
> new annotation ...
>
> Is there any garbage collection on the server side that removes old
> annotations to clear up the mess?
> I must admit that annotations & OT are quite mind twisting when it comes to
> the corner cases.
>
> Greetings
> Torben
>
> 2010/8/6 Joseph Gentle <[email protected]>
>
>> In this case, the delete op must also delete the annotation. If it didn't,
>> the op wouldn't be invertable (if you apply the op then apply its inverse,
>> you'll end up removing the annotation).
>>
>>  Your example should look like:
>> mutation1 := anno('bold': 'true' -> NULL), delete 'ab', anno(end 'bold'),
>> retain 2
>> mutation2 := retain 2, insert 'x', retain 2
>>
>> After transform, mutation1_t is the same as you suggested, but
>>
>> mutation2_t := anno ('bold': NULL -> true)  insert 'x', anno (end 'bold'),
>> retain 2
>>
>> ... and you get the correct behaviour.
>>
>> -J
>>
>>
>> On Fri, Aug 6, 2010 at 11:14 AM, Torben Weis <[email protected]>wrote:
>>
>>> Hi,
>>>
>>> I encountered a problem while unit testing my OT code. Either I just
>>> found a major flaw in the OT algorithm (i.e. a flaw that is almost
>>> impossible to repair) or I am too tired to see the solution. Here is the
>>> problem:
>>>
>>> Let "A, B ,C" be BOLD letters, and "a, b, c" denote the same letters, but
>>> not in bold.
>>> Now imagine a simple document with the content
>>>
>>> doc := ABcd
>>>
>>> i.e. "ab" in bold and "cd" not in bold. Now two users edit the document
>>> concurrently.
>>>
>>> mutation1 := delete:2, retain:2
>>> mutation2 := retain:2, insert:"x", retain:2
>>>
>>> i.e. user1 sees "cd" and user2 sees "ABXcd". Notice that the "x" is bold
>>> -> "X", because in word processors you inherit the style from the left. This
>>> is important here!
>>>
>>> Now I transform both mutations and I get:
>>>
>>> mutation1_t := delete:2 retain:3      // I skip the inserted "x" ->
>>> therefore retain2
>>> mutation2_t := insert"x", retain:2    // No need to skip "AB", because
>>> they are already deleted
>>>
>>> Now user1 one applies mutation2_t and user2 applies mutation1_t. They
>>> should end up in the same state.
>>>
>>> However, in the end user1 sees "xcd" and user2 sees "Xcd". The difference
>>> is that x is not consistently annotated.
>>>
>>> As far as can see right now, the transformation function cannot do
>>> anything because it needs style annotation that is only in the document but
>>> not in the mutations.
>>>
>>> I hope that someone can enlighten me here, because otherwise the OT
>>> algorithm is broken.
>>>
>>> Greetings
>>> Torben
>>>
>>>  --
>>> You received this message because you are subscribed to the Google Groups
>>> "Wave Protocol" group.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to
>>> [email protected]<wave-protocol%[email protected]>
>>> .
>>> For more options, visit this group at
>>> http://groups.google.com/group/wave-protocol?hl=en.
>>>
>>
>>  --
>> You received this message because you are subscribed to the Google Groups
>> "Wave Protocol" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected]<wave-protocol%[email protected]>
>> .
>> For more options, visit this group at
>> http://groups.google.com/group/wave-protocol?hl=en.
>>
>
>
>
> --
> ---------------------------
> Prof. Torben Weis
> Universitaet Duisburg-Essen
> [email protected]
>
> --
> You received this message because you are subscribed to the Google Groups
> "Wave Protocol" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected]<wave-protocol%[email protected]>
> .
> For more options, visit this group at
> http://groups.google.com/group/wave-protocol?hl=en.
>

-- 
You received this message because you are subscribed to the Google Groups "Wave 
Protocol" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/wave-protocol?hl=en.

Reply via email to