Re: ZAB kick Paxos butt?

2010-01-20 Thread Qing Yan
Yeah, actually I have no doubts about the Paxos protocol itself, but rather
about the state machine implementation part (as described in "Paxos Made
Simple", section 3), where there can be multiple Paxos instances.
Shouldn't the execution of Paxos instances be serialized in order to make
the state machine abstraction useful/friendly for real-world use? If one
Paxos instance fails, the application would be notified so that corresponding
actions could be taken (retry, rollback, notify the client, etc.), instead of
blindly continuing and getting unpredictable results later on. Actually, in
the Google Chubby case, the database changelog is being streamed into the
Paxos cluster; how can they afford to lose some of the log entries without
breaking the database integrity?
Did I miss something?

On the other hand, I think adopting the FIFO-based protocol is a very smart
engineering decision.
It makes the whole thing less complicated and is also more powerful.
E.g. it saves you guys the effort of inventing another language/compiler (like
the Google people did).

Just curious: given how pervasively the TCP stack is deployed today, why does
the research community still need to stick to the asynchronous system
assumption? Just because TCP sounds less cool than an asynchronous system on
paper? hehe..
Cheers

Qing



Re: ZAB kick Paxos butt?

2010-01-20 Thread Henry Robinson
Qing -

Also, as you pointed out, ZAB requires this FIFO property of the
point-to-point links. Paxos copes with more adversarial networks which allow
reordering and missed messages. It's easy to alter Paxos so as not to
'publish' the results of consensus rounds where there are gaps in the
previous commit history. (You may be interested in the 'Fast Paxos' paper by
Lamport which talks about making the protocol 2-message optimal in all cases
when order is not important, i.e. the messages commute). You can express the
ordering dependency between messages by supplying a proposal number with
each that is monotonically increasing in causal order.
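
To make the "don't publish past a gap" idea concrete, here is a minimal
sketch in Python. It is not taken from any Paxos implementation; the class
name GapFreeLog and its methods are made up for illustration. Decisions may
arrive in any order, but values are handed to the state machine only as a
contiguous prefix of instance numbers.

```python
class GapFreeLog:
    """Holds decided values by instance number; delivers only a gap-free prefix.

    Hypothetical helper for illustration, not part of any real Paxos library.
    """

    def __init__(self, first_instance=0):
        self.decided = {}                 # instance number -> decided value
        self.next_to_deliver = first_instance
        self.delivered = []               # values given to the state machine, in order

    def decide(self, instance, value):
        """Record a decided instance and return the values released by it."""
        self.decided[instance] = value
        released = []
        # Release values only while the next instance in sequence is known.
        while self.next_to_deliver in self.decided:
            released.append(self.decided.pop(self.next_to_deliver))
            self.next_to_deliver += 1
        self.delivered.extend(released)
        return released


log = GapFreeLog(first_instance=27)
held = log.decide(28, "B")      # 28 is decided first, so it is held back
released = log.decide(27, "A")  # the gap closes; 27 and 28 are released in order
```

With this wrapper, instance 28 is never visible to the application before
instance 27, at the price of buffering decided values until the gap is closed.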

ZAB takes care of all of this for you by using TCP sequence numbers, and it
gets deep pipelining from knowing that no update being voted on depends on
an update that has not yet arrived. The 'cost' is relying on a stronger
network model than Paxos presupposes.
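
As a rough illustration of that pipelining, the toy leader below keeps
several proposals in flight and commits them strictly in id order. This is
only a sketch under assumptions: PipelinedLeader, its methods, and the
"quorum = number of follower acks needed" rule are invented for the example
and are not ZooKeeper's actual code. The usage assumes each follower acks in
the order it received proposals, as a FIFO link would give you.

```python
class PipelinedLeader:
    """Toy leader with many outstanding proposals, committed in id order.

    Hypothetical sketch; not the real Zab implementation.
    """

    def __init__(self, quorum):
        self.quorum = quorum      # follower acks needed to commit (toy rule)
        self.next_id = 1
        self.acks = {}            # outstanding proposal id -> followers that acked
        self.committed = []       # proposal ids in commit order

    def propose(self):
        """Issue a new proposal without waiting for earlier ones to commit."""
        pid = self.next_id
        self.next_id += 1
        self.acks[pid] = set()
        return pid

    def on_ack(self, follower, pid):
        """Record an ack and return the proposal ids committed as a result."""
        self.acks[pid].add(follower)
        newly = []
        # Walk outstanding proposals in id order; stop at the first one
        # that has not reached a quorum, so commits never skip an id.
        for p in sorted(self.acks):
            if len(self.acks[p]) < self.quorum:
                break
            newly.append(p)
        for p in newly:
            del self.acks[p]
        self.committed.extend(newly)
        return newly


leader = PipelinedLeader(quorum=2)
ids = [leader.propose() for _ in range(3)]          # three proposals in flight
early = [leader.on_ack("f1", pid) for pid in ids]   # one ack each: nothing commits
first = leader.on_ack("f2", 1)                      # proposal 1 reaches quorum
second = leader.on_ack("f2", 2)                     # then proposal 2
```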

Henry



Re: ZAB kick Paxos butt?

2010-01-20 Thread Benjamin Reed

hi Qing,

i'm glad you like the page and Zab.

yes, we are very familiar with Paxos. that page is meant to show a 
weakness of Paxos and a design point for Zab. it is not to say Paxos is 
not useful. Paxos is used in the real world in production systems. 
sometimes there are not order dependencies between messages, so Paxos is 
fine.


in cases where order is important, multiple messages are batched into a 
single operation and only one operation is outstanding at a time. (i 
believe that this is what Chubby does, for example.) this is the 
solution you allude to: wait for 27 to commit before 28 is issued.
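
A minimal sketch of that "batch and keep one operation outstanding" approach,
in Python. The SerialProposer class and its method names are hypothetical and
only illustrate the idea; whether Chubby works this way is, as said above, a
belief and not something this sketch confirms.

```python
from collections import deque


class SerialProposer:
    """Batches queued messages and keeps at most one operation in flight.

    Hypothetical helper for illustration only.
    """

    def __init__(self, first_instance=0):
        self.pending = deque()          # messages waiting for the next batch
        self.next_instance = first_instance
        self.in_flight = None           # (instance, batch) or None

    def submit(self, message):
        """Queue a message; return the operation issued now, if any."""
        self.pending.append(message)
        return self._maybe_issue()

    def on_commit(self, instance):
        """Mark the in-flight operation committed; return the next one issued, if any."""
        if self.in_flight is None or self.in_flight[0] != instance:
            raise ValueError("commit for an instance that is not in flight")
        self.in_flight = None
        return self._maybe_issue()

    def _maybe_issue(self):
        # Issue a new operation only when nothing is outstanding.
        if self.in_flight is not None or not self.pending:
            return None
        batch = list(self.pending)
        self.pending.clear()
        self.in_flight = (self.next_instance, batch)
        self.next_instance += 1
        return self.in_flight


p = SerialProposer(first_instance=27)
first = p.submit("A")     # issued immediately as instance 27
second = p.submit("B")    # queued: 27 is still outstanding
third = p.submit("C")     # queued as well
nxt = p.on_commit(27)     # 27 committed, so B and C go out together as 28
```

The trade-off is visible in the usage: B and C wait for a full round before
they are even proposed, which is the latency the pipelined design avoids.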


for ZooKeeper we do have order dependencies and we wanted to have 
multiple operations in progress at various stages of the pipeline to 
allow us to lower latencies as well as increase our bandwidth 
utilization, which led us to Zab.


ben





ZAB kick Paxos butt?

2010-01-20 Thread Qing Yan
Hello,
Anyone familiar with the Paxos protocol here?
I was doing some comparison of ZAB vs Paxos... first of all, ZAB's
FIFO-based protocol is really cool!

http://wiki.apache.org/hadoop/ZooKeeper/PaxosRun mentions the
inconsistency case for Paxos ("the state change B depends upon A, but A was
not committed").
In the "Paxos Made Simple" paper, the author suggests filling the gap (lost
state machine changes) with a "no-op" operation.

Now I have some serious doubts about how Paxos could be useful in the real
world. Yeah, you do reach consensus, albeit the content is
inconsistent/corrupted!?

E.g. on the wiki page, why does the Paxos state machine allow firing off 27
and 28 concurrently when there is actually a dependency? Shouldn't you wait
for instance 27 to be committed before starting 28?
Did I miss something?

Thanks for the enlightenment!
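
To show what that gap-filling can do when B depends on A, here is a small
Python sketch. The operations ("create", "increment"), the NOOP marker and
the apply_log function are all made up for illustration; they are not from
the wiki page or the paper.

```python
NOOP = ("noop",)   # hypothetical marker for a gap filled with a no-op


def apply_log(log):
    """Apply log entries in instance order; return (state, errors)."""
    state, errors = {}, []
    for instance in sorted(log):
        op = log[instance]
        if op[0] == "noop":
            continue                       # a filled gap changes nothing
        if op[0] == "create":
            _, key, value = op
            state[key] = value
        elif op[0] == "increment":
            _, key = op
            if key not in state:
                # B was committed, but the change it depends on was not.
                errors.append((instance, "increment of missing key " + key))
            else:
                state[key] += 1
    return state, errors


A = ("create", "counter", 0)    # state change A
B = ("increment", "counter")    # state change B, which depends on A

intended = apply_log({27: A, 28: B})        # both committed
gap_filled = apply_log({27: NOOP, 28: B})   # 27 lost and replaced by a no-op
```

Every replica still agrees on the same log in the second case, so consensus
holds; the problem is that the agreed log no longer makes sense to the
application.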

   Cheers

Qing