Re: Cassandra Consistency problem with NTP
Just FYI: on one of my projects, I got around the NTP drift problem by always reading more than I need. For example, if I want to read all the messages before x seconds, I query Cassandra for (x seconds + 500 ms) and then filter the duplicates in the client.

Yes, it uses more network, and yes, the client needs more logic to handle it.

Regards,
/VJ

On Thu, Jan 17, 2013 at 2:56 AM, Jason Tang ares.t...@gmail.com wrote:

> Hi
>
> I am using Cassandra in a message bus solution; the major responsibility of Cassandra is recording the incoming requests for later consuming.
>
> One strategy is First In First Out (FIFO), so I need to get the stored requests in reversed order.
>
> I use NTP to synchronize the system time of the nodes in the cluster (4 nodes). But the local time of each node still has some inaccuracy, around 40 ms.
>
> The consistency level is write ALL and read ONE, and the replication factor is 3.
>
> But here is the problem:
> Request A comes to node One at local time PM 10:00:01.000
> Request B comes to node Two at local time PM 10:00:00.980
>
> The correct order is A -- B
> But the timestamp order is B -- A
>
> So is there any way for Cassandra to keep the correct order for read operations (e.g. a logical timestamp)? Or does Cassandra strongly depend on the time synchronization solution?
>
> BRs
> //Tang
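A minimal Python sketch of the over-read workaround VJ describes at the top of this message: read a window that starts 500 ms earlier than strictly needed, then drop anything the client has already seen. It is only an illustration under stated assumptions. The `fetch` callable is a hypothetical stand-in for a Cassandra range query, the `(message_id, timestamp, payload)` layout is made up, and it assumes each message carries a unique ID.

```python
from typing import Callable, List, Set, Tuple

# (message_id, timestamp_seconds, payload); this layout is an assumption.
Message = Tuple[str, float, str]

OVERLAP_SECONDS = 0.5  # the 500 ms margin mentioned above


def read_window(
    fetch: Callable[[float, float], List[Message]],
    start: float,
    end: float,
    seen_ids: Set[str],
) -> List[Message]:
    """Read [start - overlap, end) and drop messages already delivered."""
    fresh = []
    for message in fetch(start - OVERLAP_SECONDS, end):
        message_id = message[0]
        if message_id in seen_ids:
            continue  # duplicate from the overlapping part of the window
        seen_ids.add(message_id)
        fresh.append(message)
    return fresh


def make_in_memory_fetch(
    store: List[Message],
) -> Callable[[float, float], List[Message]]:
    """Hypothetical stand-in for a Cassandra range query over timestamps."""
    def fetch(start: float, end: float) -> List[Message]:
        return [m for m in store if start <= m[1] < end]
    return fetch
```

A message stamped slightly in the past by a lagging node still shows up in the next window, at the cost of re-reading the overlap. In a long-running client, `seen_ids` would need pruning once IDs are older than the overlap.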
Re: Cassandra Consistency problem with NTP
> So what I want is, Cassandra provide some information for client, to indicate A is stored before B, e.g. global unique timestamp, or row order.

The row order is determined by 1) the comparator you use for the column family and 2) the column names you, the client, choose for A and B. So what are the column names you use for A and B?

Now what you could do is use a TimeUUID comparator for that column family and use a time UUID for the A and B column names. In that case, provided A and B are sent from the same client node and B is sent after A on that client (which you said is the case), any non-buggy time UUID generator will guarantee that the UUID generated for A is smaller than the one for B, and thus that in Cassandra, A will be sorted before B.

In any case, the point I want to make is that Cassandra itself cannot do anything for your problem, because by design the row ordering is something entirely controlled client side. (And just so there is no misunderstanding: I make that point not because I'm trying to suggest you were wrong to ask on this mailing list, but because we can't suggest a proper solution unless we clearly understand what the problem is.)

--
Sylvain
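To illustrate the point that ordering is controlled by the column names the client picks, here is a small Python sketch of a client-side generator whose names strictly increase, even if the local clock stalls or steps backwards. It is not a TimeUUID implementation, only a simplified stand-in for the guarantee a well-behaved time UUID generator gives on a single client. The class name `MonotonicColumnNames` and the integer-microsecond name format are assumptions made for the example.

```python
import threading
import time
from typing import Callable


class MonotonicColumnNames:
    """Hands out strictly increasing integer column names on one client."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock          # returns seconds since the epoch
        self._last = 0               # last name handed out, in microseconds
        self._lock = threading.Lock()

    def next_name(self) -> int:
        with self._lock:
            now_micros = int(self._clock() * 1_000_000)
            # Never go backwards: if the clock did not advance (or jumped
            # back), bump the previous name by one microsecond instead.
            self._last = max(now_micros, self._last + 1)
            return self._last
```

As in the message above, this only orders A before B when both names come from the same generator on the same client. It does nothing for names produced independently on two machines with skewed clocks.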
Re: Cassandra Consistency problem with NTP
If you have 40 ms of NTP drift, something is VERY VERY wrong. You should have a local NTP server on the same subnet; do not try to use one on the moon.
Re: Cassandra Consistency problem with NTP
One solution is to only read up to (now - 1 second). If this is a public API where you want to guarantee full consistency (i.e., if you have added a message to the queue, it will definitely appear to be there), you can instead delay requests for 1 second before reading up to the moment that the request was received.

In either of these approaches you can tune the time offset based on how closely synchronized you believe you can keep your clocks. The tradeoff, of course, will be increased latency.
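The "read up to (now - 1 second)" idea can be sketched in Python as follows. This is a rough illustration, not a Cassandra client: messages are modelled as `(timestamp_seconds, payload)` tuples in a plain list, and the names `SAFETY_MARGIN_SECONDS` and `readable` are invented for the example.

```python
import time
from typing import List, Optional, Tuple

Message = Tuple[float, str]  # (timestamp_seconds, payload); assumed layout

# Tune this to how closely you believe your clocks are synchronized.
SAFETY_MARGIN_SECONDS = 1.0


def readable(
    messages: List[Message],
    now: Optional[float] = None,
    margin: float = SAFETY_MARGIN_SECONDS,
) -> List[Message]:
    """Return messages stamped at or before (now - margin), oldest first."""
    if now is None:
        now = time.time()
    cutoff = now - margin
    settled = [m for m in messages if m[0] <= cutoff]
    return sorted(settled, key=lambda m: m[0])
```

Anything newer than the cutoff is left for a later read, which is where the added latency comes from.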
Re: Cassandra Consistency problem with NTP
A delayed read is acceptable, but the problem is still there:

Request A comes to node One at local time PM 10:00:01.000
Request B comes to node Two at local time PM 10:00:00.980

The correct order is A -- B

I am not sure how node C will handle the data: A came before B, but B's timestamp is earlier than A's.
Re: Cassandra Consistency problem with NTP
I'm not sure I fully understand your problem. You seem to be talking about ordering the requests in the order they are generated. But in that case, you will rely on the ordering of columns within whatever row you store requests A and B in, and that order depends on the column names, which in turn are client provided and don't depend at all on the time synchronization of the cluster nodes. And since you are able to say that request A comes before B, I suppose this means said requests are generated from the same source. In which case you just need to make sure that the column names storing each request respect the correct ordering.

The column timestamps Cassandra uses are there to determine which update *to the same column* is the more recent one. So they only come into play if your requests A and B update the same column and you're interested in knowing which one of the updates will win when you read. But even if that's your case (which doesn't sound like it at all from your description), the column timestamp is only generated server side if you use CQL. And even in that latter case, it's a convenience, and you can force a timestamp client side if you really wish. In other words, Cassandra's dependency on time synchronization is not a strong one even in that case. But again, that doesn't seem at all to be the problem you are trying to solve.

--
Sylvain
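The role of column timestamps described in this message (deciding which update to the same column wins) can be sketched in Python as a last-write-wins merge. This is a simplified model for illustration only, not Cassandra's actual code: the `Write` type and `reconcile` function are invented names, and tie-breaking on equal timestamps is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Write:
    column: str
    value: str
    timestamp_micros: int


def reconcile(writes: Iterable[Write]) -> Dict[str, str]:
    """Keep, for each column, the value with the highest timestamp."""
    latest: Dict[str, Write] = {}
    for write in writes:
        current = latest.get(write.column)
        # Timestamps only matter when two writes target the same column.
        if current is None or write.timestamp_micros > current.timestamp_micros:
            latest[write.column] = write
    return {column: write.value for column, write in latest.items()}
```

Two writes to different columns never compete here, which is why timestamps are irrelevant to the ordering question when A and B are stored under different column names.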
Re: Cassandra Consistency problem with NTP
Yes, Sylvain, you are correct.

When I say A comes before B, I mean the client secures the order: B is sent only after the response to request A has been received.

And yes, A and B do not update the same record, so it is not a typical Cassandra consistency problem.

And yes, the column name is provided by the client. Right now I use the local timestamp, and the local times for A and B are not well synchronized, so I have a problem.

So what I want is for Cassandra to provide some information to the client that indicates A is stored before B, e.g. a globally unique timestamp, or row order.