Thanks - I'm getting a better picture of things. So "entropy" is the tendency of a C* datastore to become inconsistent because writes/updates do not reach ALL nodes that carry a replica of a row (which can happen if nodes are down for maintenance). It can also happen due to node crashes/restarts, which can result in the loss of uncommitted data. The result is either stale data or ghost data (a column/row re-appearing after a delete). So there are "anti-entropy" processes in place to help with this:
- hinted handoff
- read repair (can happen while performing a consistent read, OR asynchronously as driven/configured by *_read_repair_chance AFTER a consistent read)
- commit logs
- explicit/manual repair via command
- compaction (an indirect mechanism to purge tombstones, thereby ensuring that stale data will NOT resurrect)
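To keep those knobs concrete, here is a minimal sketch of where they live as table options (in the pre-4.0 era), using the DataStax Python driver. The contact point, keyspace/table names, and all option values are illustrative assumptions only, not something from this thread:

from cassandra.cluster import Cluster

# Contact point, keyspace, table and option values below are illustrative only.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS metrics
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")

# default_time_to_live: rows written to this table expire (TTL) after 7 days.
# gc_grace_seconds: how long tombstones are kept before compaction may purge them.
# dclocal_read_repair_chance: probability of the async read repair mentioned above
# (a table option that existed before Cassandra 4.0).
session.execute("""
    CREATE TABLE IF NOT EXISTS metrics.readings (
        sensor_id text,
        ts        timestamp,
        value     double,
        PRIMARY KEY (sensor_id, ts)
    ) WITH default_time_to_live = 604800
      AND gc_grace_seconds = 864000
      AND dclocal_read_repair_chance = 0.1
""")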
So for an application where the data is purely time-series, or where data is only ever inserted, I would like to understand the need for manual repair. I see/hear advice that there should always be a periodic (mostly weekly) manual/explicit repair in a C* system, and that is what I am trying to understand. Repair is a really expensive process, and I would like to be able to justify the need to expend resources on it (when and how much). Among other things, this advice also gives people not familiar with C* (e.g. me) the impression that it is too fragile and needs substantial manual intervention.

Appreciate all the feedback and details that you have been sharing.

From: Edward Capriolo <edlinuxg...@gmail.com>
Date: Monday, February 27, 2017 at 8:00 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: Benjamin Roth <benjamin.r...@jaumo.com>
Subject: Re: Is periodic manual repair necessary?

There are 4 anti-entropy systems in Cassandra:

Hinted handoff
Read repair
Commit logs
Repair command

All are basically best effort. Commit logs get corrupt and only flush periodically. Bits rot on disk and while crossing networks. Read repair is async and only happens randomly. Hinted handoff stops after some time and is not guaranteed.

On Monday, February 27, 2017, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote:

Thanks Roth and Oskar for your quick responses.

This is a single-datacenter, multi-rack setup.

> A TTL is technically similar to a delete - in the end both create tombstones.
> If you want to eliminate the possibility of resurrected deleted data, you should run repairs.

So why do I need to worry about data resurrection? Because the TTL for the data is specified at the row level (at least in this case), i.e. across ALL columns across ALL replicas. So the replicas will all have the same data or won't have the data at all (i.e. it would have been tombstoned).

> If you can guarantee 100% that data is read-repaired before gc_grace_seconds after the data has been TTL'ed, you won't need an extra repair.

Why read-repaired before gc_grace_seconds? Isn't gc_grace_seconds the grace period for compaction to occur? So if the data was not consistent and read repair happens before that, then well and good. Does read repair not happen after gc/compaction? If this table has data being constantly/periodically inserted, then compaction will also happen accordingly, right?

Thanks,
Jayesh

From: Benjamin Roth <benjamin.r...@jaumo.com>
Date: Monday, February 27, 2017 at 11:53 AM
To: <user@cassandra.apache.org>
Subject: Re: Is periodic manual repair necessary?

A TTL is technically similar to a delete - in the end both create tombstones. If you want to eliminate the possibility of resurrected deleted data, you should run repairs. If you can guarantee 100% that data is read-repaired before gc_grace_seconds after the data has been TTL'ed, you won't need an extra repair.

2017-02-27 18:29 GMT+01:00 Oskar Kjellin <oskar.kjel...@gmail.com>:

Are you running multi-DC?

Sent from my iPad

On 27 Feb 2017, at 16:08, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote:

Suppose I have an application where there are no deletes, only 5-10% of rows are occasionally updated (and that too only once), and there are a lot of reads.
Furthermore, I have replication = 3 and both reads and writes are configured for LOCAL_QUORUM. Occasionally, servers do go into maintenance. I understand that when the maintenance lasts longer than the period for which hinted handoffs are preserved, the hints are lost and servers may have stale data. But I do expect that to be rectified on reads. If the stale data is not read again, I don't care for it to be corrected, as the data will then be automatically purged because of the TTL. In such a situation, do I need to have a periodic (weekly?) manual/batch repair process?

Thanks,
Jayesh Thakrar

--
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer

--
Sorry, this was sent from mobile. Will do less grammar and spell check than usual.
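As an aside to the gc_grace_seconds discussion above: if you do schedule repairs, the usual rule of thumb is that a full repair cycle (plus some slack for the hinted-handoff window) should complete within gc_grace_seconds, so that every replica learns about a tombstone before any replica is allowed to purge it. A rough back-of-the-envelope sketch in Python, using illustrative default values only (not a recommendation):

# Illustrative values: the default gc_grace_seconds (10 days), the default
# max_hint_window_in_ms (3 hours), and the commonly suggested weekly repair.
GC_GRACE_SECONDS = 10 * 24 * 3600
HINT_WINDOW_SECONDS = 3 * 3600
REPAIR_INTERVAL_SECONDS = 7 * 24 * 3600   # e.g. a weekly "nodetool repair -pr" per node

def repair_fits_in_gc_grace(repair_interval, hint_window, gc_grace):
    """A repair cycle plus hint slack should finish inside gc_grace so that
    no replica purges a tombstone another replica has never received."""
    return repair_interval + hint_window < gc_grace

print(repair_fits_in_gc_grace(REPAIR_INTERVAL_SECONDS,
                              HINT_WINDOW_SECONDS,
                              GC_GRACE_SECONDS))   # True: ~7 days + 3 hours < 10 days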