Re: Why Google Wave should not support TP2

Daniel Paull Fri, 26 Feb 2010 17:57:40 -0800


On Feb 26, 4:27 pm, Alexandre Mah <[email protected]> wrote:
> > 1) The removal of the server acknowledgement, leading to lower latency
> > between clients
>
> The server acknowledgements aren't related to the lack of TP2.


Yes they are.  See below.

> In fact, OT systems without TP2 (which is essentially all two-party OT
> algorithms) typically don't use server acknowledgements.

Holy misinterpretation Batman.  I have not claimed that the server ACK
was required if you don't satisfy TP2.  In adopting the Jupiter
system, the Wave team had a big problem - high memory consumption due
to multiple state spaces on the server.  You then had a choice to
make:

  A) Implement this Server Ack idea and suffer the consequences of it,
such as increased latency.
  B) Satisfy TP2 and reap the rewards that TP2 has to offer.

Both options lead to a single state space on the server.  When worded
in such a biased manner, (A) seems like a poor choice.

> > 2) Trivial "recovery" algorithms as the server can, after a rollback,
> > get operations back from clients
>
> Shifting the responsibility for reliability from a trusted server to
> other clients seems like a questionable solution.  I think reliably
> storing the server's data shouldn't be the job of clients.

This is just silly.  No one is shifting any responsibility.  However,
if you do not support the recovery of lost operations from clients,
then you have (at least) two big problems:

1) The server must promise durability under all fault conditions.
Google might be prepared to promise this, but most wave providers
won't be able to.

2) Clients cannot work offline without the fear of not being able to
commit their work.  If my offline client has an operation that the
server lost due to a rollback, then I will not be able to commit my
work.

> > 3) Relaxed durability requirements.  Because of (2), the server can
> > promise less.  Now we can remove the horrid commit-notice message.
>
> It seems to me that farming out responsibility for storing data to
> other clients doesn't really give any guarantees that your data is
> being reliably stored.

So, if you *need* durability, then put it into the system and
understand that latency will be increased.

If you introduce enough redundancy (even without the promise of
durability) then the chance of losing data is less than the chance of a
catastrophic failure of your single point of failure.  There is no
reason why clients can't take part in that redundancy; you accepted a
delta from a client once - why would you not accept it again?

> I'm not a suitable person to comment on commit-notice itself (although
> I did have some pretty strong opinions regarding commit-notice when it
> came up for internal debate) so I'll refrain from doing so.

Nicely played.  Perhaps keep your internal politics internal.

> > 4) Intention preservation that actually preserves intent.
>
> I'm not sure I know what you mean here.  

Consider the following scenario.  There is a document whose current
state is "X".  Three clients concurrently generate operations against
this document as follows:

  Client 1:  insert "A" before the "X"
  Client 2:  insert "B" after the "X"
  Client 3:  delete the "X"

I'm sure that everyone will agree that the final document state must
be "AB" if the intention of the users is to be preserved.

There are six possible orders in which the server could apply these
operations; two of them may give unexpected results!  If the server
decides to apply the delete operation first, then the two inserts, once
transformed to include the effect of the delete, will appear to be
concurrent inserts at the same position.  In the literature, the site
identifier is used to break this tie, meaning that half the time the
final state will be "AB" and half the time it will be "BA".  Wave
doesn't use site identifiers, but rather relies on the order that the
server chooses for these inserts - this still leads to "BA" half the
time.

What is described above is a classic TP2 puzzle.  In a peer-to-peer
system this would represent divergence amongst peers.  In wave it just
gives a plain weird result.  I see it as not preserving the intention
of the users.
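
To make the puzzle concrete, here is a minimal Python sketch of a
server applying the three operations in every possible order.  The
transform rules, the Op type and the tie rule (the insert the server
applied first stays on the left) are illustrative assumptions, not
Wave's actual transform functions.

```python
from dataclasses import dataclass, replace
from itertools import permutations


@dataclass(frozen=True)
class Op:
    name: str      # label used only for printing the order
    kind: str      # "ins", "del" or "nop"
    pos: int
    ch: str = ""


def apply_op(doc, op):
    """Apply a single-character insert or delete to a string."""
    if op.kind == "ins":
        return doc[:op.pos] + op.ch + doc[op.pos:]
    if op.kind == "del":
        return doc[:op.pos] + doc[op.pos + 1:]
    return doc


def transform(op, applied):
    """Rewrite `op` so it can be applied after `applied`.

    Assumed tie rule: when two inserts land on the same position,
    the one the server applied first stays on the left.
    """
    if op.kind == "nop" or applied.kind == "nop":
        return op
    if applied.kind == "ins":
        if applied.pos <= op.pos:
            return replace(op, pos=op.pos + 1)
        return op
    # `applied` is a delete
    if applied.pos < op.pos:
        return replace(op, pos=op.pos - 1)
    if applied.pos == op.pos and op.kind == "del":
        return replace(op, kind="nop")  # character is already gone
    return op


def server_result(order, doc="X"):
    """Apply concurrent ops in the given server order."""
    applied = []
    for op in order:
        for prev in applied:
            op = transform(op, prev)
        doc = apply_op(doc, op)
        applied.append(op)
    return doc


OPS = (
    Op("A", "ins", 0, "A"),  # Client 1: insert "A" before the "X"
    Op("B", "ins", 1, "B"),  # Client 2: insert "B" after the "X"
    Op("D", "del", 0),       # Client 3: delete the "X"
)

results = {
    "".join(op.name for op in order): server_result(order)
    for order in permutations(OPS)
}
for order, final in sorted(results.items()):
    print(order, "->", final)
```

Under these assumed rules the four orders that apply an insert first
all end in "AB".  Of the two delete-first orders, D, A, B ends in "AB"
and D, B, A ends in "BA"; the opposite insert tie rule would simply
move the surprise to the other delete-first order.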

> That means two-party algorithms can always be at least as good at
> intention preservation as multi-party algorithms.

Do you still believe this?

> In fact, two-party algorithms often have nicer intention preservation
> properties than algorithms that satisfy TP2, because they're
> unencumbered by the requirement to satisfy TP2.

A fact eh?  I think I just showed that being unencumbered leads to
breaking of intention preservation in cases where the encumbered
system does not.

> I think algorithms which satisfy TP2 in order to work as a distributed
> algorithm tend to be considerably more complicated, have less elegant
> transforms, have slightly less desirable behaviour, or result in
> operations which are uglier and more verbose.  But if you can
> demonstrate otherwise, that would definitely be of interest to us.

We should take that discussion offline initially.  Last time I spoke
with your team you didn't seem interested in having this discussion.

> Here are some other considerations:
>
> A lot of Wave is built around the notion that there is a single
> universal history where all the operations have a total ordering.

Of course, this total ordering is a lie.  The ordering of concurrent
operations imposed by the server does not represent what actually
happened.  There is in reality a partial ordering based on causality.
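
As a rough illustration of that partial ordering, the sketch below
compares vector times in Python.  The dict representation (site name
to count, missing sites counting as 0) and the function names are
assumptions made for the example.

```python
def happened_before(u, v):
    """True if vector time u causally precedes vector time v."""
    sites = set(u) | set(v)
    no_greater = all(u.get(s, 0) <= v.get(s, 0) for s in sites)
    some_less = any(u.get(s, 0) < v.get(s, 0) for s in sites)
    return no_greater and some_less


def concurrent(u, v):
    """True if neither vector time precedes the other."""
    sites = set(u) | set(v)
    differ = any(u.get(s, 0) != v.get(s, 0) for s in sites)
    return differ and not happened_before(u, v) \
        and not happened_before(v, u)


# Two operations generated without knowledge of each other.
op1 = {"c1": 1}
op3 = {"c3": 1}
# A later operation by c1, generated after it had seen c3's.
op1_later = {"c1": 2, "c3": 1}

print(concurrent(op1, op3))             # True
print(happened_before(op3, op1_later))  # True
print(happened_before(op1_later, op1))  # False
```

Any total order the server picks for op1 and op3 is a choice it makes,
not something the vector times themselves record.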

> This is useful for features like playback (among some other more minor
> features).  In a peer-to-peer algorithm, it's a lot less natural to
> define a single universal history.

What do you mean by "natural"?  I would have thought that a user would
find it more natural to play back events in the order that he witnessed
them.  I find it unnatural that when playing back history I might see
things happen in an order that contradicts my memory of things.

> Another consideration is that the version vector in a distributed
> algorithm might become very large in size if you have hundreds or
> thousands of participants.  Maybe there are good ways to mitigate
> this, but please take into account that this is a vector you'd be
> sending with every operation (even tiny single-character operations).

Rubbish.  You exchange vector times upon connection.  Each operation
is sent with an (s, t) pair where s identifies the site and t is the
sequence number for that site.
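
A minimal Python sketch of that idea follows.  The Peer class, its
method names and the missing() helper are hypothetical, and the sketch
assumes reliable, in-order delivery of each site's operations.  It
only shows what goes on the wire; it does not show how the receiver
works out the full causal context of an operation.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    site: str
    # Vector time: site id -> highest sequence number seen.
    clock: dict = field(default_factory=dict)

    def handshake(self):
        """Full vector time, exchanged once upon connection."""
        return dict(self.clock)

    def generate(self):
        """Stamp a new local operation with an (s, t) pair."""
        t = self.clock.get(self.site, 0) + 1
        self.clock[self.site] = t
        return (self.site, t)

    def receive(self, stamp):
        """Record an operation received with its (s, t) pair."""
        s, t = stamp
        if t != self.clock.get(s, 0) + 1:
            raise ValueError("gap or duplicate from site %r" % s)
        self.clock[s] = t


def missing(local, remote):
    """(s, t) pairs that `local` has seen and `remote` has not."""
    return [
        (s, t)
        for s in sorted(local)
        for t in range(remote.get(s, 0) + 1, local[s] + 1)
    ]


a, b = Peer("a"), Peer("b")
first, second = a.generate(), a.generate()

# On connection the peers compare vector times once.
to_send = missing(a.handshake(), b.handshake())
print(to_send)

# After that, each operation carries only its (s, t) pair.
for stamp in to_send:
    b.receive(stamp)
print(b.clock)
```

The same comparison of vector times would let a server that has rolled
back ask a client for the operations it no longer has.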

Cheers,

Dan
