Jeff,

If Cassandra is down, how will it generate a Merkle tree to compare?


Regards,
Nitan
Cell: 510 449 9629

> On May 27, 2020, at 11:15 AM, Jeff Jirsa <jji...@gmail.com> wrote:
> 
> 
> You definitely can repair with a node down by passing `-hosts specific_hosts`
> 
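For illustration, a repair that skips the down node might look something like the following, run from one of the live nodes; the IPs, keyspace, and table names are placeholders, and `-hosts` is repeated once per live replica to include. The idea is that only the listed hosts build and exchange Merkle trees, so the down node is never asked for one.

    nodetool repair -full -hosts 10.0.0.2 -hosts 10.0.0.3 keyspace1 table1
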
>> On Wed, May 27, 2020 at 9:06 AM Nitan Kainth <nitankai...@gmail.com> wrote:
>> I didn't get you, Leon.
>> 
>> But the simple thing is just to follow the steps and you will be fine. You
>> can't run a repair if the node is down.
>> 
>>> On Wed, May 27, 2020 at 10:34 AM Leon Zaruvinsky <leonzaruvin...@gmail.com> 
>>> wrote:
>>> Hey Jeff/Nitan,
>>> 
>>> 1) This concern should not be a problem if the repair happens before the
>>> corrupted node is brought back online, right?
>>> 2) In this case, is option (3) equivalent to replacing the node, where we
>>> repair the two live nodes and then bring up the third node with no data?
>>> 
>>> Leon
>>> 
>>>> On Tue, May 26, 2020 at 10:11 PM Jeff Jirsa <jji...@gmail.com> wrote:
>>>> There are two problems with this approach if you need strict correctness:
>>>> 
>>>> 1) After you delete the sstable and before you repair, you'll violate
>>>> consistency, so you'll potentially serve incorrect data for a while.
>>>> 
>>>> 2) The sstable may have a tombstone past gc grace that's shadowing data in
>>>> another sstable that's not corrupt, and deleting it may resurrect that
>>>> deleted data.
>>>> 
>>>> The only strictly safe thing to do here, unfortunately, is to treat the
>>>> host as failed and rebuild it from its neighbors (and, being pedantic
>>>> again, that means: stop the host; while it's stopped, repair the
>>>> surviving replicas; then bootstrap a replacement on top of the same tokens).
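Spelled out as a rough sketch (placeholder IPs: 10.0.0.1 is the corrupted node, 10.0.0.2 and 10.0.0.3 are the surviving replicas; a package-style install is assumed, and the exact replace procedure varies by Cassandra version):

    # 1. On the corrupted node: stop Cassandra and leave it down.
    sudo service cassandra stop

    # 2. From a surviving replica: repair only the live hosts.
    nodetool repair -full -hosts 10.0.0.2 -hosts 10.0.0.3 keyspace1

    # 3. On the replacement (with empty data directories), add this to
    #    cassandra-env.sh before starting, so it bootstraps onto the
    #    dead node's tokens:
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.1"
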
>>>> 
>>>> 
>>>> 
>>>> > On May 26, 2020, at 4:46 PM, Leon Zaruvinsky <leonzaruvin...@gmail.com> 
>>>> > wrote:
>>>> > 
>>>> > 
>>>> > Hi all,
>>>> > 
>>>> > I'm looking to understand Cassandra's behavior in an sstable corruption 
>>>> > scenario, and what the minimum amount of work is that needs to be done 
>>>> > to remove a bad sstable file.
>>>> > 
>>>> > Consider: a 3-node, RF=3 cluster, reads/writes at quorum.
>>>> > An sstable corruption exception on one node at
>>>> > keyspace1/table1/lb-1-big-Data.db.
>>>> > sstablescrub does not work.
>>>> > 
>>>> > Is it safest to, after running a repair on the two live nodes,
>>>> > 1) Delete only keyspace1/table1/lb-1-big-Data.db,
>>>> > 2) Delete all files associated with that sstable (i.e., 
>>>> > keyspace1/table1/lb-1-*),
>>>> > 3) Delete all files under keyspace1/table1/, or
>>>> > 4) Any of the above are the same from a correctness perspective.
>>>> > 
>>>> > Thanks,
>>>> > Leon
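As a side note on option (2): a single sstable is a set of component files sharing the same prefix, so "all files associated with that sstable" would be roughly the following (illustrative listing; the exact components vary by sstable format and version):

    $ ls keyspace1/table1/lb-1-big-*
    keyspace1/table1/lb-1-big-CompressionInfo.db
    keyspace1/table1/lb-1-big-Data.db
    keyspace1/table1/lb-1-big-Digest.adler32
    keyspace1/table1/lb-1-big-Filter.db
    keyspace1/table1/lb-1-big-Index.db
    keyspace1/table1/lb-1-big-Statistics.db
    keyspace1/table1/lb-1-big-Summary.db
    keyspace1/table1/lb-1-big-TOC.txt

Deleting only lb-1-big-Data.db (option 1) would leave the remaining components orphaned on disk.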
>>>> > 
>>>> 
