Hi,
The attached conversation may be of some help to you.
Regards
Amit Singh
From: Sanjeev T [mailto:san...@gmail.com]
Sent: Wednesday, December 21, 2016 9:24 AM
To: user@cassandra.apache.org
Subject: Handling Leap second delay
Hi,
Can some of you share pointers on which versions are affected and how to handle
the leap-second delay on Dec 31, 2016?
Regards
-Sanjeev
--- Begin Message ---
As I've said previously, pretty much every way of avoiding your leap-second
ordering issue is going to be a "hack", and there will be some amount of hope
involved.
If the updates occur more than 300ms apart and you are confident your nodes
have clocks that are within 150ms of each other, then I'd close my eyes and
hope they all leap second at the same time within that 150ms.
If they are less than 300ms apart (I'm guessing you meant less than 300ms), then
I would figure out the smallest gap between those two updates and make sure your
nodes' clocks are close enough that the leap second will occur on all nodes
within that gap.
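The safety condition in the two paragraphs above can be reduced to a one-line check: server-side timestamps preserve update order as long as the worst-case clock skew between nodes stays below the smallest gap between the two updates. A minimal sketch (the numbers are illustrative, not measured):

```python
# Sketch of the safety condition described above: two updates keep their
# order under server-side timestamps as long as the worst-case clock skew
# between any two nodes is smaller than the smallest gap between updates.

def ordering_is_safe(min_update_gap_ms: float, max_clock_skew_ms: float) -> bool:
    """True if the second update can never receive an older timestamp."""
    return max_clock_skew_ms < min_update_gap_ms

print(ordering_is_safe(300, 150))  # updates 300 ms apart, clocks within 150 ms
print(ordering_is_safe(100, 150))  # updates closer than the skew: order may flip
```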
If that's not good enough, you could just halt those scenarios for 2 seconds
over the leap second and then resume them once you've confirmed all clocks have
skipped.
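The halt-and-resume workaround above can be sketched in a few lines. The window size here is an assumption for illustration; the leap second in question was scheduled for the end of 2016-12-31 23:59:59 UTC:

```python
# Hedged sketch of the "halt over the leap second" workaround: the
# application simply refuses to issue order-sensitive writes inside a
# small window around the leap second, then resumes once past it.
import time
from datetime import datetime, timezone

# The 2016 leap second was inserted just before 2017-01-01 00:00:00 UTC.
LEAP_INSTANT = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc).timestamp()
HOLD_SECONDS = 2.0  # pause writes this long on either side of the leap

def wait_out_leap_second() -> None:
    """Block until we are safely past the leap-second window."""
    now = time.time()
    if abs(now - LEAP_INSTANT) <= HOLD_SECONDS:
        time.sleep(LEAP_INSTANT + HOLD_SECONDS - now)
```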
On Wed, 2 Nov 2016 at 18:13 Anuj Wadehra
<anujw_2...@yahoo.co.in> wrote:
Thanks, Ben, for taking the time to write such a detailed reply!
We don't need strict ordering for all operations, but we are looking at
scenarios where two quick updates to the same column of the same row are
possible. By quick updates, I mean >300 ms. Configuring NTP properly (as
mentioned in some blogs in your link) should give fair relative accuracy between
the Cassandra nodes. But a leap second takes the clock back an ENTIRE second
(huge), and the probability of an old write overwriting the new one increases
drastically. So, we want to be proactive about it.
I agree that you should avoid such scenarios by design (if possible).
Good to know that you have set up your own NTP servers as per the
recommendation. Curious: do you also do some monitoring around NTP?
Thanks
Anuj
On Fri, 28 Oct, 2016 at 12:25 AM, Ben Bromhead
<b...@instaclustr.com> wrote:
If you need guaranteed strict ordering in a distributed system, I would
not use Cassandra; it does not provide this out of the box. I would look to a
system that uses Lamport or vector clocks. Based on your description of how your
system runs at the moment (and how close your updates are together), you have
either already experienced out-of-order updates or there is a real possibility
you will in the future.
Sorry to be so dire, but if you do require causal consistency / strict
ordering, you are not getting it at the moment. Distributed systems theory is
really tricky, even for people that are "experts" on distributed systems over
unreliable networks (I would certainly not put myself in that category). People
have made a very good name for themselves by showing that the vast majority of
distributed databases have had bugs when it comes to their various consistency
models and the claims these databases make.
So make sure you really do need guaranteed causal consistency/strict
ordering, or see whether you can design around it (e.g. using conflict-free
replicated data types) or choose a system that is designed to provide it.
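As a hedged illustration of the CRDT route mentioned above: a grow-only counter is about the simplest conflict-free replicated data type. Each replica increments only its own slot, and merging takes the element-wise maximum, so replicas converge regardless of delivery order and no wall-clock timestamps are involved:

```python
# Minimal grow-only counter (G-Counter) CRDT sketch. Merge is an
# element-wise max, so replicas converge no matter what order updates
# arrive in. No wall-clock timestamps are used, which is exactly why
# CRDTs sidestep leap-second problems.

class GCounter:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts: dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        # Each replica only ever increments its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max: commutative, associative, idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())

a, b = GCounter("node-a"), GCounter("node-b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # both replicas converge to 5
```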
Having said that... here are some hacky things you could do in Cassandra
to try and get this behaviour, which I in no way endorse doing :)
* Cassandra counters do leverage a logical clock per shard, and you could
hack something together with counters and lightweight transactions, but you
would want to do your homework on counter accuracy before diving into it... as I
don't know if the implementation is safe in the context of your question. Also,
this would probably require a significant rework of your application plus a
significant performance hit. I would invite a counter guru to jump in here...
* You can leverage the fact that timestamps are monotonic if you isolate
writes to a single node for a single shard... but you then lose Cassandra's
availability guarantees; e.g. a keyspace with an RF of 1 and a CL of ONE will
get monotonic timestamps (if generated on the server side).
* Continuing down the path of isolating writes to a single node for a
given shard, you could also isolate writes to the primary replica using your
client driver during the leap second (make it a minute either side of the leap),
but again you lose out on availability, and you are probably already
experiencing out-of-order writes given how close your writes and updates are.
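The monotonic server-side timestamps mentioned in the second point can be sketched as follows. This shows the general technique only (never hand out a value lower than the last one), not Cassandra's actual implementation:

```python
# Sketch of a monotonic timestamp generator: take wall-clock microseconds,
# but never go backwards. If the clock steps back (leap second, NTP step),
# hand out last + 1 instead. Not Cassandra's actual code, just the idea.
import threading
import time

class MonotonicTimestampGenerator:
    def __init__(self):
        self._lock = threading.Lock()
        self._last = 0

    def next(self) -> int:
        """Return a strictly increasing microsecond timestamp."""
        with self._lock:
            now_micros = int(time.time() * 1_000_000)
            # If the wall clock went backwards, advance by 1 microsecond.
            self._last = max(now_micros, self._last + 1)
            return self._last

gen = MonotonicTimestampGenerator()
ts = [gen.next() for _ in range(3)]
print(ts[0] < ts[1] < ts[2])  # True: strictly increasing
```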
A note on NTP: NTP is generally fine if you use it to keep the clocks
synced between the Cassandra nodes. If you are interested in how we have
implemented NTP at Instaclustr, see our blogpost on it
https://www.instaclustr.com/blog/2015/11/05/apache-cassandra-synchronization/.
Ben
On Thu, 27 Oct 2016 at 10:18 Anuj Wadehra <anujw_2...@yahoo.co.in> wrote:
Hi Ben,
Thanks for your reply. We don't use timestamps in the primary key. We rely
on server-side timestamps generated by the coordinator, so no client-side
functions would help.
Yes, drift can create problems too. But even if you ensure that the nodes
are perfectly synced with NTP, you will surely mess up the order of updates
during the leap second (interleaving). Some applications update the same column
of the same row quickly (within a second), and reversing the order would corrupt
the data.
I am interested in learning how people who rely on strict ordering of
updates handle the leap-second scenario, when the clock goes back one second
(the same second is repeated). What kind of tricks do people use to ensure that
server-side timestamps are monotonic?
As per my understanding, NTP slew mode may not be suitable for Cassandra,
as it may cause unpredictable drift among the Cassandra nodes.
Ideas ??
Thanks
Anuj
On Thu, 20 Oct, 2016 at 11:25 PM, Ben Bromhead
<b...@instaclustr.com> wrote:
http://www.datastax.com/dev/blog/preparing-for-the-leap-second
gives a pretty good overview
If you are using a timestamp as part of your primary key, this is
the situation where you could end up overwriting data. I would suggest using
timeuuid instead which will ensure that you get different primary keys even for
data inserted at the exact same timestamp.
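The timeuuid suggestion can be illustrated with Python's standard library: uuid.uuid1() produces a version-1 (time-based) UUID, combining a 100-nanosecond timestamp with a clock sequence and node identifier, so two IDs generated at the same instant still differ:

```python
import uuid

# Version-1 UUIDs are time-based, like CQL's timeuuid: they embed a
# 100-nanosecond timestamp plus a clock sequence and a node identifier,
# so even IDs generated in the same clock tick remain distinct.
u1 = uuid.uuid1()
u2 = uuid.uuid1()
print(u1 != u2)    # True: distinct even when generated back-to-back
print(u1.version)  # 1
```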
The blog post also suggests using certain monotonic timestamp
classes in Java; however, these will not help you if you have multiple clients
that may overwrite data.
As for the interleaving or out of order problem, this is hard to
address in Cassandra without resorting to external coordination or LWTs. If you
are relying on a wall clock to guarantee order in a distributed system you will
get yourself into trouble even without leap seconds (clock drift, NTP
inaccuracy etc).
On Thu, 20 Oct 2016 at 10:30 Anuj Wadehra <anujw_2...@yahoo.co.in>
wrote:
Hi,
I would like to know how you guys handle leap seconds with
Cassandra.
I am not bothered about the livelock issue, as we are using
appropriate versions of Linux and Java. I am more interested in finding an
optimum answer to the following question:
How do you handle wrong ordering of multiple writes (on the same row
and column) during the leap second? You may overwrite the new value with the old
one (disaster).
And downtime is not an option :)
I can see that CASSANDRA-9131 is still open..
FYI, we are on 2.0.14.
Thanks
Anuj
--
Ben Bromhead
CTO | Instaclustr<https://www.instaclustr.com/>
+1 650 284 9692
Managed Cassandra / Spark on AWS, Azure and Softlayer
--- End Message ---