Re: stop script not working

2010-01-28 Thread Michael Bauland
Hi Patrick,

thank you very much for your quick reply.

With your help I found the problem. The variable $ZOOPIDFILE in the
zkServer.sh script contained the following (weird) string:

/zookeeper_server.pidver1/data

And this of course doesn't work. The problem was that in the zoo.cfg
file I had DOS carriage returns. :-(
For some reason this didn't bother the snapshots and logs, as they were
written to my data directory. But the PID file wasn't there.
Once I fixed this and used UNIX line endings, everything worked. I
can stop zookeeper now. :-)
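
The odd value above is consistent with a trailing carriage return in the directory read from zoo.cfg: when a terminal prints a string containing \r, the cursor jumps back to column 0 and the text that follows overwrites the start of the line. The sketch below is a minimal illustration in Python, not the logic of zkServer.sh. The dataDir value and the helper names `render_with_cr` and `to_unix_line_endings` are hypothetical, chosen only to show the effect.

```python
def render_with_cr(text: str) -> str:
    """Mimic how a terminal displays text containing carriage returns."""
    screen: list[str] = []
    col = 0
    for ch in text:
        if ch == "\r":
            col = 0  # cursor returns to the start of the line
            continue
        if col < len(screen):
            screen[col] = ch  # overwrite what was already displayed
        else:
            screen.append(ch)
        col += 1
    return "".join(screen)


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert DOS line endings (CR LF) to UNIX line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


# Hypothetical dataDir value, as read from a zoo.cfg saved with DOS line endings.
data_dir = "/usr/local/zk/srv/server1/data\r"
pid_file = data_dir + "/zookeeper_server.pid"

# The real path still starts with the data directory, but the terminal shows
# the PID file name written over the beginning of it.
print(render_with_cr(pid_file))  # /zookeeper_server.pidver1/data

# Normalizing the config bytes removes the stray carriage return.
print(to_unix_line_endings(b"dataDir=/usr/local/zk/srv/server1/data\r\n"))
```

Running a conversion like `to_unix_line_endings` over the config file contents (or using a tool such as dos2unix) is one way to apply the fix described above.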

Sorry for this stupid error and thanks again for your help.

Cheers,

Michael


-- 

Michael Bauland
michael.baul...@knipp.de
bauland.tel


Re: Q about ZK internal: how commit is being remembered

2010-01-28 Thread Benjamin Reed
henry is correct. just to state another way, Zab guarantees that if a 
quorum of servers have accepted a transaction, the transaction will 
commit. this means that if less than a quorum of servers have accepted a 
transaction, we can commit or discard. the only constraint we have in 
choosing is ordering. we have to decide which partially accepted 
transactions are going to be committed and which discarded before we 
propose any new messages so that ordering is preserved.
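
The rule stated above can be sketched as a small classification, assuming a simple majority quorum. This is a toy model in Python, not ZooKeeper code; `quorum_size` and `classify` are hypothetical names, and `accepted_by` counts every server (the leader included) that has logged the transaction.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a majority."""
    return ensemble_size // 2 + 1


def classify(accepted_by: int, ensemble_size: int) -> str:
    """What recovery is allowed to do with a transaction, per the rule above."""
    if accepted_by >= quorum_size(ensemble_size):
        return "must commit"
    return "may commit or discard"


# A 5-server ensemble: the boundary sits at 3 servers.
for accepted_by in range(1, 6):
    print(accepted_by, classify(accepted_by, 5))
```

The model leaves out the ordering constraint: whichever choice is made for a partially accepted transaction has to be settled before any new proposal is issued.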


ben

Henry Robinson wrote:

Hi -

Note that a machine that has the highest received zxid will necessarily have
seen the most recent transaction that was logged by a quorum of followers
(the FIFO property of TCP again ensures that all previous messages will have
been seen). This is the property that ZAB needs to preserve. The idea is to
avoid missing a commit that went to a node that has since failed.

I was therefore slightly imprecise in my previous mail - it's possible for
only partially-proposed proposals to be committed if the leader that is
elected next has seen them. Only when another proposal is committed instead
must the original proposal be discarded.
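
The argument above rests on quorum intersection, which can be checked exhaustively for a small ensemble. The Python sketch below is a toy model under stated assumptions: zxids are plain integers (real zxids combine an epoch and a counter), each server's log is a prefix of the leader's proposal sequence because of FIFO delivery, and the function names are hypothetical.

```python
from itertools import combinations


def highest_quorum_logged(last_zxid: dict[str, int]) -> int:
    """Highest zxid that at least a majority of servers have logged.

    Logs are prefixes, so a server whose last zxid is z has logged every
    transaction up to z. The answer is the quorum-th largest last zxid.
    """
    quorum = len(last_zxid) // 2 + 1
    return sorted(last_zxid.values(), reverse=True)[quorum - 1]


def every_quorum_sees_it(last_zxid: dict[str, int]) -> bool:
    """True if, in every possible voting quorum, the server with the highest
    zxid has seen the most recent quorum-logged transaction."""
    quorum = len(last_zxid) // 2 + 1
    target = highest_quorum_logged(last_zxid)
    # Checking quorum-sized voter sets is enough: a larger set can only
    # raise the maximum zxid among its members.
    return all(
        max(last_zxid[s] for s in voters) >= target
        for voters in combinations(last_zxid, quorum)
    )


# Hypothetical state: zxid 7 is on a quorum, zxid 8 only reached s5.
state = {"s1": 7, "s2": 7, "s3": 7, "s4": 5, "s5": 8}
print(highest_quorum_logged(state))
print(every_quorum_sees_it(state))
```

In this state, zxid 8 is the partially proposed case: it survives only if s5 takes part in the election and wins.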

I highly recommend Ben Reed's and Flavio Junqueira's LADIS paper on the
subject, for those with portal.acm.org access:
http://portal.acm.org/citation.cfm?id=1529978

Henry

On 27 January 2010 21:52, Qian Ye yeqian@gmail.com wrote:

  

Hi Henry:

According to your explanation, *ZAB makes the guarantee that a proposal
which has been logged by a quorum of followers will eventually be
committed*. However, the ZooKeeper source code (the FastLeaderElection.java
file) shows that, in the election, the candidates only provide their zxid in
their votes, and the one with the max zxid wins the election. I mean, it
seems that no check is made to ensure that the latest proposal has been
logged by a quorum of servers.

In this situation, ZooKeeper would deliver a proposal that the client knows
as a failed one. Imagine this scenario: in a ZooKeeper cluster with 5
servers, the leader receives only 1 ack for proposal A, and after a timeout
the client is told that the proposal failed. At this point, all servers
restart due to a power failure. The server that has proposal A in its log
would become the leader, even though the client was told that proposal A
failed.

Do I misunderstand this?


On Wed, Jan 27, 2010 at 10:37 AM, Henry Robinson he...@cloudera.com
wrote:



Qing -

That part of the documentation is slightly confusing. The elected leader
must have the highest zxid that has been written to disk by a quorum of
followers. ZAB makes the guarantee that a proposal which has been logged by
a quorum of followers will eventually be committed. Conversely, any
proposals that *don't* get logged by a quorum before the leader sending them
dies will not be committed. One of the ZAB papers covers both these
situations - making sure proposals are committed or skipped at the right
moments.

So you get the neat property that leader election can be live in exactly the
case where the ZK cluster is live. If a quorum of peers aren't available to
elect the leader, the resulting cluster won't be live anyhow, so it's ok for
leader election to fail.

FLP impossibility isn't actually strictly relevant for ZAB, because FLP
requires that message reordering is possible (see all the stuff in that
paper about non-deterministically drawing messages from a potentially
deliverable set). TCP FIFO channels don't reorder, so provide the extra
signalling that ZAB requires.

cheers,
Henry

2010/1/26 Qing Yan qing...@gmail.com

  

Hi,

I have a question about how ZooKeeper *remembers* a commit operation.

According to
http://hadoop.apache.org/zookeeper/docs/r3.2.2/zookeeperInternals.html#sc_summary

quote
The leader will issue a COMMIT to all followers as soon as a quorum of
followers have ACKed a message. Since messages are ACKed in order, COMMITs
will be sent by the leader as received by the followers in order.

COMMITs are processed in order. Followers deliver a proposal's message when
that proposal is committed.
/quote

My question is: will the leader wait for the COMMIT to be processed by a
quorum of followers before considering the COMMIT a success? From the
documentation it seems that the leader handles COMMIT asynchronously and
doesn't expect confirmation from followers. In the extreme case, what happens
if the leader issues a COMMIT to all followers and crashes immediately,
before the COMMIT message can go out onto the network? How does the system
remember that the COMMIT ever happened?

Actually this is related to the leader election process:

quote
ZooKeeper messaging doesn't care about the exact method of electing a leader
as long as the following holds:

  - The leader has seen the highest zxid of all the followers.
  - A quorum of servers have committed to following the leader.

 Of these two 

Dependency on JBoss JMX

2010-01-28 Thread Gustavo Niemeyer
Hello there,

Is the dependency on JBoss a hard one, or is there a way to not use
it?  Perhaps an alternative package providing the same interface?

I'm trying to get it included in Ubuntu and being asked about this.

Thanks in advance,

-- 
Gustavo Niemeyer
http://niemeyer.net


Re: Dependency on JBoss JMX

2010-01-28 Thread Gustavo Niemeyer
 there aren't any dependencies on jboss. can you clarify the dependency that
 you are seeing?

Sorry, that was JBoss JMX, more specifically, and I actually don't see
it being used explicitly either.  Probably an indirect dependency.

I'll check back with Matthias Klose, who's doing the favor of
packaging it for us.

-- 
Gustavo Niemeyer
http://niemeyer.net


Re: Q about ZK internal: how commit is being remembered

2010-01-28 Thread Qian Ye
Thanks Henry and Ben. Actually I have read the paper Henry mentioned in this
mail, but I'm still not clear on some of the details. Anyway, maybe more
study of the source code will help me understand. Ben said that if less than
a quorum of servers have accepted a transaction, we can commit or discard.
Would this feature cause any unexpected problems? Can you give some hints
about this issue?




Re: Q about ZK internal: how commit is being remembered

2010-01-28 Thread Mahadev Konar
Qian,

  ZooKeeper guarantees that if a client sees a transaction response, then
that transaction will persist, but the ones that a client does not see might
be discarded or committed. So if a quorum does not log the transaction, a
ZooKeeper server that does not have the logged transaction might become the
leader (because the machines with the logged transaction are down). In that
case the transaction is discarded. If a machine that has the logged
transaction becomes the leader, that transaction will be committed.
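
The two cases above can be played through in a toy model. The Python sketch below assumes integer zxids, a majority quorum, and an election that picks the live server with the highest last zxid (ties broken by name, which is an arbitrary choice here). The names `elect` and `fate` are hypothetical and this is not the FastLeaderElection implementation.

```python
from typing import Optional


def elect(last_zxid: dict[str, int], live: set[str]) -> Optional[str]:
    """Pick the live server with the highest last zxid, or None if the
    live servers do not form a majority of the ensemble."""
    quorum = len(last_zxid) // 2 + 1
    if len(live) < quorum:
        return None
    # sorted() makes the tie-break deterministic: max() keeps the first maximum.
    return max(sorted(live), key=lambda s: last_zxid[s])


def fate(txn_zxid: int, last_zxid: dict[str, int], live: set[str]) -> str:
    """Outcome for a transaction that was not logged by a quorum."""
    leader = elect(last_zxid, live)
    if leader is None:
        return "undecided (no quorum)"
    return "committed" if last_zxid[leader] >= txn_zxid else "discarded"


# Hypothetical 5-server ensemble: transaction 5 was logged only on s5.
logs = {"s1": 4, "s2": 4, "s3": 4, "s4": 4, "s5": 5}

print(fate(5, logs, {"s1", "s2", "s3", "s4", "s5"}))  # s5 can win the election
print(fate(5, logs, {"s1", "s2", "s3"}))              # s5 is down
print(fate(5, logs, {"s1", "s5"}))                    # too few servers to elect
```

Either outcome is allowed here because no client ever saw a successful response for transaction 5.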

Hope that clears up your doubt.

mahadev



Re: Q about ZK internal: how commit is being remembered

2010-01-28 Thread Qing Yan
Hi Qian Ye,


Could you forward me a copy of the paper?  I don't have ACM access... many
thanks!


btw, I was a ZJUer too..

cheers,

Qing




Re: Q about ZK internal: how commit is being remembered

2010-01-28 Thread Qian Ye
Thanks Mahadev, I see what you mean.



Re: Q about ZK internal: how commit is being remembered

2010-01-28 Thread Ted Dunning
Just a quick plug for my company, but all ACM publications are available to
rent at www.deepdyve.com.  See, for instance,
http://www.deepdyve.com/search?query=A+simple+totally+ordered+broadcast+protocol

This rental service isn't the same as getting the PDF and you may prefer to
subscribe to the ACM to get the actual documents.  It is much cheaper,
however, and might fit the bill.

If you try it, send me email.  It is a new service and we need feedback.


