Re: Consistent problem when solve Digest mismatch

2013-03-06 Thread Jason Tang
Actually, I did not update the same record concurrently: I first create it,
then search for it, then delete it. The version conflict was resolved
incorrectly because the local timestamp of the delete is earlier than the
local timestamp of the create.


2013/3/6 aaron morton aa...@thelastpickle.com

 Otherwise, does it mean that version conflict resolution strongly depends
 on a global sequence id (the timestamp), which needs to be provided by the
 client?

 Yes.
 If you have an area of your data model that has a high degree of
 concurrency, C* may not be the right match.

 In 1.1 we have atomic updates, so clients see either the entire write or
 none of it. And sometimes you can design a data model that does not mutate
 shared values, but writes ledger entries instead. See Matt Denis's talk here
 http://www.datastax.com/events/cassandrasummit2012/presentations or this
 post http://thelastpickle.com/2012/08/18/Sorting-Lists-For-Humans/

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/03/2013, at 4:30 PM, Jason Tang ares.t...@gmail.com wrote:

 Hi

 The timestamp provided by my client is a unix timestamp (with ntp), and as
 I said, because of ntp drift, the local unix timestamps are not
 synchronized accurately enough for my case.

 In short, the client cannot provide a global sequence number to indicate
 the event order.

 But I wonder: I configured the Cassandra consistency level as QUORUM for
 writes, so for a single record I would expect Cassandra to be able to
 decide the final result of the updates.

 Otherwise, does it mean that version conflict resolution strongly depends
 on a global sequence id (the timestamp), which needs to be provided by the
 client?


 //Tang


 2013/3/4 Sylvain Lebresne sylv...@datastax.com

 The problem is, what exactly is the sequence number you are talking about?

 Or let me put it another way: if you do have a sequence number that
 provides a total ordering of your operations, then that is exactly what
 you should use as your timestamp. What Cassandra calls the timestamp is
 exactly what you call seqID: it's the number Cassandra uses to decide the
 order of operations.

 Except that in real life, provided you have more than one client talking
 to Cassandra, providing a total ordering of operations is hard, and in
 fact not doable efficiently. So in practice, people use unix timestamps
 (with ntp), which provide a very good yet cheap approximation of the
 real-life order of operations.

 But again, if you do know how to assign a more precise timestamp,
 Cassandra lets you use that: you can provide your own timestamp (using the
 unix timestamp is just the default). The point being, the unix timestamp
 is the best approximation we have in practice.

 --
 Sylvain


 On Mon, Mar 4, 2013 at 9:26 AM, Jason Tang ares.t...@gmail.com wrote:

 Hi

   Previously I ran into a consistency problem; you can refer to the link
 below for the whole story.

 http://mail-archives.apache.org/mod_mbox/cassandra-user/201206.mbox/%3CCAFb+LUxna0jiY0V=AvXKzUdxSjApYm4zWk=ka9ljm-txc04...@mail.gmail.com%3E

   After checking the code, it seems I found a clue to the problem.
 Maybe someone can check this.

   In short, I have a Cassandra cluster (1.0.3). The consistency level is
 QUORUM for reads and writes, and replication_factor is 3.

   Here is the event sequence:

 seqID   NodeA   NodeB   NodeC
 1. New  New   New
 2. Update  Update   Update
 3. Delete   Delete

 When trying to read from NodeB and NodeC, a Digest mismatch exception is
 triggered, so Cassandra tries to resolve the version conflict.
 But the result is the value Update.

 Here is the suspected root cause: the version conflict is resolved based
 on the timestamp.

 Node C local time is a bit earlier than node A.

 The Update request was sent from node C with timestamp 00:00:00.050, and
 the Delete was sent from node A with timestamp 00:00:00.020, which does
 not match the event sequence.

 So the version conflict was resolved incorrectly.

 Is that true?

 If yes, then it means the consistency level can ensure that the conflict
 is found, but resolving it correctly depends on the accuracy of time
 synchronization, e.g. NTP?








Re: Consistent problem when solve Digest mismatch

2013-03-05 Thread aaron morton
 Otherwise, does it mean that version conflict resolution strongly depends
 on a global sequence id (the timestamp), which needs to be provided by the
 client?
Yes.
If you have an area of your data model that has a high degree of concurrency,
C* may not be the right match.

In 1.1 we have atomic updates, so clients see either the entire write or none
of it. And sometimes you can design a data model that does not mutate shared
values, but writes ledger entries instead. See Matt Denis's talk here
http://www.datastax.com/events/cassandrasummit2012/presentations or this post
http://thelastpickle.com/2012/08/18/Sorting-Lists-For-Humans/
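
For illustration only, here is a minimal Python sketch of the ledger idea.
It is not code from the talk or the post. Plain dictionaries stand in for a
wide row, and the names `ledger`, `record`, and `balance` are made up for this
example. Each writer adds its own uniquely keyed entry instead of overwriting
a shared value, so two writers never compete for the same cell, and the
current value is derived at read time.

```python
import uuid
from collections import defaultdict

# Hypothetical in-memory stand-in for a wide row: account -> {entry_id: amount}.
ledger = defaultdict(dict)


def record(account: str, amount: int) -> str:
    """Append one ledger entry under a unique key; nothing is overwritten."""
    entry_id = str(uuid.uuid4())
    ledger[account][entry_id] = amount
    return entry_id


def balance(account: str) -> int:
    """Derive the current value by summing all entries at read time."""
    return sum(ledger[account].values())


record("acct-1", 100)
record("acct-1", -30)
print(balance("acct-1"))  # 70
```

Because every entry has its own key, the order in which the two writes are
applied does not change the result, which is what makes this shape less
sensitive to timestamp skew than a single shared value.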

Cheers

-
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com




Re: Consistent problem when solve Digest mismatch

2013-03-04 Thread Sylvain Lebresne
The problem is, what exactly is the sequence number you are talking about?

Or let me put it another way: if you do have a sequence number that
provides a total ordering of your operations, then that is exactly what you
should use as your timestamp. What Cassandra calls the timestamp is
exactly what you call seqID: it's the number Cassandra uses to decide the
order of operations.
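
A minimal Python sketch of that rule, using the two timestamps from the
original report (00:00:00.050 for the Update, 00:00:00.020 for the Delete).
This is an illustration, not Cassandra's actual code: the `Cell` type and
`reconcile` function are invented here, and tie-breaking between equal
timestamps is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]  # None stands for a delete marker (tombstone)
    timestamp: int        # client-supplied; milliseconds in this sketch


def reconcile(a: Cell, b: Cell) -> Cell:
    """Keep whichever version carries the higher timestamp."""
    return a if a.timestamp >= b.timestamp else b


update_from_c = Cell(value="Update", timestamp=50)  # 00:00:00.050
delete_from_a = Cell(value=None, timestamp=20)      # 00:00:00.020, issued later in real time

winner = reconcile(update_from_c, delete_from_a)
print(winner.value)  # Update
```

The Delete was issued last, but it carries the lower number, so the Update
is what survives. The replicas have no other information about which
operation came first.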

Except that in real life, provided you have more than one client talking to
Cassandra, providing a total ordering of operations is hard, and in fact
not doable efficiently. So in practice, people use unix timestamps (with
ntp), which provide a very good yet cheap approximation of the real-life
order of operations.

But again, if you do know how to assign a more precise timestamp,
Cassandra lets you use that: you can provide your own timestamp (using the
unix timestamp is just the default). The point being, the unix timestamp is
the best approximation we have in practice.
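
As one possible way to produce such a timestamp on the client side, here is
a hedged Python sketch. The class `MonotonicTimestamps` is hypothetical and
belongs to no driver. It hands out microsecond values that never go
backwards within a single process, even if the local clock is stepped back.
It does nothing about skew between different clients; for that, all writers
would need to share one source of sequence numbers.

```python
import threading
import time


class MonotonicTimestamps:
    """Hypothetical helper: strictly increasing microsecond timestamps
    for one process, based on the local clock."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so the behaviour can be tested
        self._last = 0
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            now = int(self._clock() * 1_000_000)
            # If the clock stalled or moved backwards, step past the last value.
            self._last = max(now, self._last + 1)
            return self._last


timestamps = MonotonicTimestamps()
create_ts = timestamps.next()
delete_ts = timestamps.next()
print(create_ts < delete_ts)  # True
```

The values returned by `next()` would then be passed as the explicit
timestamp of each write, so that a create followed by a delete from the same
process is always ordered correctly.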

--
Sylvain



Re: Consistent problem when solve Digest mismatch

2013-03-04 Thread Jason Tang
Hi

The timestamp provided by my client is a unix timestamp (with ntp), and as I
said, because of ntp drift, the local unix timestamps are not synchronized
accurately enough for my case.

In short, the client cannot provide a global sequence number to indicate the
event order.

But I wonder: I configured the Cassandra consistency level as QUORUM for
writes, so for a single record I would expect Cassandra to be able to decide
the final result of the updates.

Otherwise, does it mean that version conflict resolution strongly depends on
a global sequence id (the timestamp), which needs to be provided by the
client?


//Tang

