On Feb 27, 3:10 am, Tad Glines <[email protected]> wrote:
>
> > Aren't they all servers that you trust and want them all to converge?
> > Can you give a more specific case that you are worried about?  I
> > struggle to understand how this network could be established (ie, why
> > did servers accept connections) if they did not want to exchange
> > operations.
>
> I think some of the confusion surrounds what we each mean by "server". In
> all my past e-mails, when I said server (e.g. host server, remote server,
> etc...), I was referring to a single wave domain (e.g. acmewave.com,
> wave.google.com, etc...). From the point of view of Wave federation and wave
> OT, the federation host or federation remote is referred to as a server (in
> the code anyway). In actual implementations the "server" would likely be a
> cluster of physical servers.
>
> In my example above you could replace A, B and C with A.com, B.com and
> C.com.

I think you should read David Barrett-Lennard's post on trust being
transitive.  Of course, Dave and I agree.  I said in another thread
that "It's always been a matter of trust".  I think Dave said it
better than Billy Joel though.

>
> > The TCP ACK is sufficient though.
>
> No, it's not. Just because a TCP packet has been ACK'ed does NOT mean that
> the application has actually read the data out of the TCP receive buffer.
> Nor does it mean that the application has finished parsing the message and
> reliably done anything with it.

Why do you care if the application has "finished parsing the message
and reliably done anything with it?"  All you do is send operations,
relying on the reliable, ordered delivery of TCP.
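To make the argument concrete, here is a minimal sketch of what "just send operations over TCP" looks like. The framing protocol and the `send_op`/`recv_op` names are illustrative assumptions, not Wave's actual wire format; the point is only that a byte stream over one TCP connection delivers operations exactly once and in order, or fails loudly - no per-operation application-level ACK is needed while the connection lives.

```python
import json
import socket
import struct

def send_op(sock, op):
    # Length-prefix each operation so the receiver can re-frame the byte stream.
    # (Illustrative framing, not Wave's real protocol.)
    payload = json.dumps(op).encode()
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def _recv_exact(sock, n):
    # Read exactly n bytes, or raise if the connection drops mid-message.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return buf

def recv_op(sock):
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length))

# Loopback demo: every operation arrives exactly once, in order.
sender, receiver = socket.socketpair()
sent = [{"op": "ins", "pos": i, "ch": c} for i, c in enumerate("abc")]
for op in sent:
    send_op(sender, op)
received = [recv_op(receiver) for _ in sent]
assert received == sent
sender.close()
receiver.close()
```

Recovering after a dropped connection (re-sync from a known version) is a separate, higher-level concern; within a single connection, TCP's guarantees do the work.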

> > > 6) Branching and merging of data in an arbitrary manner
>
> > > Branching and merging is still possible without TP2.
>
> > No it's not.  Your turn.
>
> Perhaps we are thinking of two different things here, but, as far as I can
> tell, it is already possible to branch the history of a wavelet, and then
> merge changes from one branch to another. I plan to test this, but it's not
> at the top of my task list right now.

Make sure you do some really weird stuff, like creating lots of
branches, merging some of the branches together, then merging these
new branches onto other branches.  To expose problems you will need
insert and delete operations that fall very close to each other on
separate branches.  I think it will be easy to show divergence on
different branches though - pick your favorite TP2 puzzle and use
branches to prescribe the order in which the operations are applied.
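To illustrate the kind of puzzle meant here, below is a sketch using naive character-wise OT (insert/delete with the usual position-shifting transform rules and a character tie-break). The transform rules and operation encoding are my own illustrative assumptions, not Wave's transforms. TP2 requires T(T(o,o1), T(o2,o1)) == T(T(o,o2), T(o1,o2)); with these naive rules it fails, so two branches that apply the same three operations in different orders diverge.

```python
def transform(a, b):
    """Transform operation a against concurrent operation b.
    Ops are ("ins", pos, ch), ("del", pos) or ("nop",). Naive rules, for demo only."""
    if a[0] == "ins" and b[0] == "ins":
        # Equal positions tie-break on the character, so transform is deterministic.
        if a[1] < b[1] or (a[1] == b[1] and a[2] <= b[2]):
            return a
        return ("ins", a[1] + 1, a[2])
    if a[0] == "ins" and b[0] == "del":
        return a if a[1] <= b[1] else ("ins", a[1] - 1, a[2])
    if a[0] == "del" and b[0] == "ins":
        return a if a[1] < b[1] else ("del", a[1] + 1)
    if a[0] == "del" and b[0] == "del":
        if a[1] < b[1]:
            return a
        if a[1] > b[1]:
            return ("del", a[1] - 1)
        return ("nop",)  # both deleted the same character
    return a

def apply_op(doc, op):
    if op[0] == "ins":
        return doc[:op[1]] + op[2] + doc[op[1]:]
    if op[0] == "del":
        return doc[:op[1]] + doc[op[1] + 1:]
    return doc

# Three concurrent operations against "abc" - note the insert and delete
# falling right next to each other, as suggested above:
o  = ("ins", 1, "y")
o1 = ("ins", 2, "x")
o2 = ("del", 1)

# TP2: transform o along the two possible orders of o1 and o2.
path_a = transform(transform(o, o1), transform(o2, o1))  # ("ins", 1, "y")
path_b = transform(transform(o, o2), transform(o1, o2))  # ("ins", 2, "y")

# The two-op paths themselves converge (TP1 holds): both branches reach "axc".
base = apply_op(apply_op("abc", o1), transform(o2, o1))
# But applying o via the two paths diverges: "ayxc" vs "axyc".
print(apply_op(base, path_a), apply_op(base, path_b))
```

Branches make this easy to stage: put o1 and o2 on separate branches, merge them in opposite orders on two replicas, then deliver o to both.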

> > > > 7) Distributing the OT load across multiple servers
>
> > > This already happens. Each wavelet will need to be processed on only one
> > > machine concurrently, but thousands of wavelets can be processed on
> > > thousands of servers concurrently.
>
> > So, by "Distributing the OT load across multiple servers" you mean
> > "putting the OT load for a given wavelet on a single server at any
> > point in time"?  You still have fundamental problems - let's say that
> > all 100,000 users tried to edit the same wavelet.  How do you
> > distribute the OT load?  You can't.  You're stuffed.  With TP2 in
> > the mix, you have multiple servers take the load.  The secret is to
> > reduce the fan-in (ie, number of connections) on each server.
>
> Frankly, I think 100,000 users all contributing to a single wavelet at the
> same time is an extreme example

Ok, pick a reasonable number and tell me if you think Wave is still up
to it.  At what point do you think Wave will fail to scale?

> I'm not aware of ANY collaborative system
> that supports 100,000 users concurrently modifying the same item.

That sounds like a challenge I'd like to take up.

> Also, each machine will still have to process the deltas from all 100,000
> users. Otherwise there is no way to arrive at a consistent state.
> This is true for the clients as well. Each client in a 100,000 user wave
> will have to process the deltas from all 100,000 users. TP2 doesn't
> magically reduce the number of transforms that will need to be applied.
> It just relaxes ordering requirements.

TP2 does indeed reduce the number of operations performed by each
machine.  If you have a single server with 1000 connections to users
and they all happen to generate a single operation concurrently, then
the server needs to transform 1000 operations against 1000 other
operations.  The naive OT algorithm is O(n^2), so 1,000,000
transformations are required.

If instead you use 11 servers - 10 servers each having 100 connections
to clients and the final server being connected to the other 10 servers
(think of a snowflake), then each of the ten servers sees 100
concurrent operations and performs 10,000 transformations (running
total of 100,000 transformations).  The final server receives 10
composite operations (which it considers to be performed concurrently)
and transforms these composite operations against each other - that's
another 10,000 transformations.  So, for the server to include all of
the operations, there was a total of 110,000 transformations - around
10% of the single-server scenario.

Of course, that's not the end of the story, as these composite
operations get sent out to the 10 servers and all the clients.
However, no single machine ever needed to perform all of the
transformations.  Downstream machines benefit from upstream
calculations.  Note that different servers end up applying operations
in different orders, so TP2 must be met.

To scale further, a system satisfying TP2 can just stick in more
servers to spread the computational load.  Scaling to "100,000 users
concurrently modifying the same item" should not be a problem at all,
given enough resources.
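The arithmetic above can be written down directly. This sketch just reproduces the counts from the paragraphs above under their stated assumptions: a naive quadratic transform cost at each server, and the 10,000 figure quoted for the root server transforming the 10 composite operations against each other.

```python
def naive_transform_count(n):
    """A naive OT server with n mutually concurrent operations transforms
    each one against every other - the O(n^2) assumption used above."""
    return n * n

# Single server, 1000 clients, one concurrent operation each.
single_server = naive_transform_count(1000)   # 1,000,000 transformations

# Snowflake: 10 leaf servers with 100 clients each, plus one root server.
leaves = 10 * naive_transform_count(100)      # 100,000 transformations
root = 10_000                                  # the figure quoted above for the
                                               # root's 10 composite operations
snowflake = leaves + root                      # 110,000 transformations

print(snowflake / single_server)               # the ~10% ratio claimed above
```

Adding more leaf servers (or another tier to the snowflake) shrinks the per-machine share further, which is the whole scaling argument.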

It's not magic, just math.  The result is magical though.

Dan

-- 
You received this message because you are subscribed to the Google Groups "Wave 
Protocol" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/wave-protocol?hl=en.
