Re: potential data loss in Cassandra 1.1.0 .. 1.1.4

Alain RODRIGUEZ Thu, 18 Oct 2012 05:31:36 -0700

Hi Jonathan.

We are currently running the DataStax AMI on Amazon, with Cassandra
1.1.2.


I guess that the DataStax repo (deb
http://debian.datastax.com/communitystable main) will be updated
directly to 1.1.6?

"Replaying already-flushed data a second time is harmless -- except for
counters.  So, to avoid replaying flushed counter data, we recommend
performing drain when shutting down the pre-1.1.6 C* prior to upgrade."
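
To make the counter exception concrete, here is a minimal sketch in Python.
It is a toy model, not Cassandra's code: the function names and the log
format are made up for illustration, and real counters are stored in a more
involved way. It only assumes that regular writes are resolved by timestamp
while counter updates are applied as increments.

```python
# Toy model (hypothetical names): regular writes vs. counter increments.

def apply_write(table, key, value, timestamp):
    """Regular column write: the value with the newest timestamp wins."""
    current = table.get(key)
    if current is None or timestamp >= current[1]:
        table[key] = (value, timestamp)

def apply_increment(counters, key, delta):
    """Counter update: the delta is added to whatever is already stored."""
    counters[key] = counters.get(key, 0) + delta

def replay_log(log, table, counters):
    """Apply every commitlog entry in order."""
    for entry in log:
        if entry[0] == "write":
            _, key, value, timestamp = entry
            apply_write(table, key, value, timestamp)
        else:
            _, key, delta = entry
            apply_increment(counters, key, delta)

log = [("write", "name", "alain", 100), ("incr", "hits", 5)]
table, counters = {}, {}

replay_log(log, table, counters)   # original application of the writes
replay_log(log, table, counters)   # same entries replayed a second time

print(table)     # the regular column is unchanged by the second replay
print(counters)  # the counter has been incremented twice
```

In this model the second replay leaves the regular column as it was, but the
counter ends up at twice the intended value.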

I'm afraid I will forget to drain my node before my next upgrade, or
upgrade + expand.

Could you ask your team to add this specific warning to your documentation,
for example here: http://www.datastax.com/docs/1.1/install/expand_ami (we
usually update to the latest stable release before expanding) or here:
http://www.datastax.com/docs/1.1/install/upgrading or anywhere else
where it would be useful?

Having counters replayed would make a big mess in our app. I guess there
are more people in our situation who could save a lot of time and money
with up-to-date documentation.

Anyway, thank you for this bug fix and this warning.

Alain

2012/10/17 Jonathan Ellis <[email protected]>

> I wanted to call out a particularly important bug for those who aren't
> in the habit of reading CHANGES.
>
> Summary: the bug was fixed in 1.1.5, with a follow-on fix for 1.1.6
> that only affects users of 1.1.0 .. 1.1.4.  Thus, if you upgraded from
> 1.0.x or earlier directly to 1.1.5, you're okay as far as this is
> concerned.  But if you used an earlier 1.1 release, you should upgrade
> to 1.1.6.
>
> Explanation:
>
> A rewrite of the commitlog code for 1.1.0 used Java's nanotime api to
> generate commitlog segment IDs.  This could cause data loss in the
> event of a power failure, since we assume commitlog IDs are strictly
> increasing in our replay logic.  Simplified, the replay logic looks like
> this:
>
> 1. Take the most recent flush time X for each columnfamily
> 2. Replay all activity in the commitlog that occurred after X
>
> The problem is that nanotime gets effectively a new random seed after
> a reboot.  If the new seed is substantially below the old one, any new
> commitlog segments will never be "after" the pre-reboot flush
> timestamps.  Subsequently, restarting Cassandra will not replay any
> unflushed updates.
>
> We fixed the nanotime problem in 1.1.5 (CASSANDRA-4601).  But, we
> didn't realize the implications for replay timestamps until later
> (CASSANDRA-4782).  To fix these retroactively, 1.1.6 sets the flush
> time of pre-1.1.6 sstables to zero.  Thus, the first startup of 1.1.6
> will result in replaying the entire commitlog, including data that may
> have already been flushed.
>
> Replaying already-flushed data a second time is harmless -- except for
> counters.  So, to avoid replaying flushed counter data, we recommend
> performing drain when shutting down the pre-1.1.6 C* prior to upgrade.
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
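
The replay problem described in the quoted message can be sketched as a toy
model in Python. This is only an illustration under stated assumptions: the
function, the segment IDs and the write strings are invented, and the real
replay code is considerably more involved. It assumes only what the message
says: segments are replayed when their ID is "after" the recorded flush
time, and 1.1.6 treats the flush time of older sstables as zero.

```python
# Toy model (hypothetical names and numbers) of commitlog replay.

def replay(segments, last_flush_time):
    """Return the writes from every segment whose ID is after the flush time."""
    replayed = []
    for segment_id, writes in sorted(segments.items()):
        if segment_id > last_flush_time:
            replayed.extend(writes)
    return replayed

# Before the reboot, nanotime-based IDs happened to be large, and the last
# flush was recorded at this value.
flush_time_before_reboot = 9_000_000

# After the reboot, nanotime starts from a different arbitrary origin, so
# segments written later can get IDs far below the recorded flush time.
segments_after_reboot = {
    1_000: ["insert k1", "insert k2"],
    2_000: ["insert k3"],
}

# 1.1.0 .. 1.1.4 behaviour in this model: nothing is "after" the flush time.
lost = replay(segments_after_reboot, flush_time_before_reboot)

# 1.1.6 behaviour for pre-1.1.6 sstables: flush time treated as zero,
# so the entire commitlog is replayed.
recovered = replay(segments_after_reboot, 0)

print(lost)
print(recovered)
```

With the old flush time nothing is replayed, which is the data loss case;
with the flush time set to zero everything is replayed, including anything
already flushed, which is why the drain matters for counters.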
