Re: Safe to run cleanup before repair?
Great, thanks for your replies.

2017-11-12 21:44 GMT+01:00 Jeff Jirsa:

> That is: bootstrap will maintain whatever consistency guarantees you had
> when you started.
>
> --
> Jeff Jirsa
>
> On Nov 12, 2017, at 12:41 PM, kurt greaves wrote:
>
> By default, bootstrap will stream from the primary replica of the range it
> is taking ownership of. So Node 3 would have to stream from Node 2 if it
> was taking ownership of Node 2's tokens.
>
> On 13 Nov. 2017 05:00, "Joel Samuelsson" wrote:
>
>> Yeah, sounds right. What I'm worried about is the following:
>> I used to have only 2 nodes with RF 2 so both nodes had a copy of all
>> data. There were inconsistencies since I was unable to run repair, so some
>> parts of the data may only exist on one node. I have now added two nodes,
>> thus changing which nodes own what parts of the data. My concern is if a
>> piece of data is now owned by say Node 1 and Node 3 but before the addition
>> of new nodes only existed on Node 2 and a cleanup would then delete it
>> permanently since Node 2 no longer owns it. Could this ever happen?
Re: Safe to run cleanup before repair?
That is: bootstrap will maintain whatever consistency guarantees you had when you started.

--
Jeff Jirsa

> On Nov 12, 2017, at 12:41 PM, kurt greaves wrote:
>
> By default, bootstrap will stream from the primary replica of the range it is
> taking ownership of. So Node 3 would have to stream from Node 2 if it was
> taking ownership of Node 2's tokens.
>
>> On 13 Nov. 2017 05:00, "Joel Samuelsson" wrote:
>> Yeah, sounds right. What I'm worried about is the following:
>> I used to have only 2 nodes with RF 2 so both nodes had a copy of all data.
>> There were inconsistencies since I was unable to run repair, so some parts of
>> the data may only exist on one node. I have now added two nodes, thus
>> changing which nodes own what parts of the data. My concern is if a piece of
>> data is now owned by say Node 1 and Node 3 but before the addition of new
>> nodes only existed on Node 2 and a cleanup would then delete it permanently
>> since Node 2 no longer owns it. Could this ever happen?
Re: Safe to run cleanup before repair?
By default, bootstrap will stream from the primary replica of the range it is taking ownership of. So Node 3 would have to stream from Node 2 if it was taking ownership of Node 2's tokens.

On 13 Nov. 2017 05:00, "Joel Samuelsson" wrote:

> Yeah, sounds right. What I'm worried about is the following:
> I used to have only 2 nodes with RF 2 so both nodes had a copy of all
> data. There were inconsistencies since I was unable to run repair, so some
> parts of the data may only exist on one node. I have now added two nodes,
> thus changing which nodes own what parts of the data. My concern is if a
> piece of data is now owned by say Node 1 and Node 3 but before the addition
> of new nodes only existed on Node 2 and a cleanup would then delete it
> permanently since Node 2 no longer owns it. Could this ever happen?
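The scenario under discussion can be played through in a toy model. This is only a sketch, not Cassandra code: it assumes, as the reply above describes, that the joining node copies the moved range from the node that held it before, and it compares that with a hypothetical alternative where the copy comes from the other old replica. The helper names (`bootstrap`, `cleanup`, `surviving_copies`, `run`) and the keys `a`, `b`, `c` are made up for illustration.

```python
def bootstrap(data, new_node, source_node, moved_keys):
    """Toy bootstrap: the new node copies whatever the source holds of the moved keys."""
    data[new_node] = data.get(new_node, set()) | (data[source_node] & moved_keys)


def cleanup(data, node, still_owned):
    """Toy cleanup: the node drops every key it no longer owns."""
    data[node] = data[node] & still_owned


def surviving_copies(data, key):
    """Return the sorted names of the nodes that still hold `key`."""
    return sorted(node for node, keys in data.items() if key in keys)


def run(source_node):
    # Two old nodes with RF 2. Key "c" exists only on node2 because
    # repair never ran (the inconsistency described in the question).
    data = {"node1": {"a", "b"}, "node2": {"a", "b", "c"}}
    # node3 joins and takes over key "c" from node2's range.
    bootstrap(data, "node3", source_node, moved_keys={"c"})
    # node2 no longer owns "c", so cleanup on node2 removes it there.
    cleanup(data, "node2", still_owned={"a", "b"})
    return surviving_copies(data, "c")


print(run("node2"))  # ['node3']  streamed from the old holder: the key survives
print(run("node1"))  # []         streamed from the other replica: the key is lost
```

In this model the key survives cleanup only because the new node streamed from the node that was giving the range up. A later repair would then be able to copy it to the other current owner.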
Re: Safe to run cleanup before repair?
Yeah, sounds right. What I'm worried about is the following:

I used to have only 2 nodes with RF 2, so both nodes had a copy of all data. There were inconsistencies since I was unable to run repair, so some parts of the data may only exist on one node. I have now added two nodes, thus changing which nodes own what parts of the data. My concern is if a piece of data is now owned by, say, Node 1 and Node 3, but before the addition of new nodes only existed on Node 2, and a cleanup would then delete it permanently since Node 2 no longer owns it. Could this ever happen?
Re: Safe to run cleanup before repair?
Cleanup, very simply, throws away data no longer owned by the instance because of range movements. Repair only repairs data owned by the instance (it ignores data that would be cleared by cleanup). I don't see any reason why you can't run cleanup before repair.

On Sun, Nov 12, 2017 at 9:35 AM, Joel Samuelsson wrote:

> So, I have a cluster which grew too large data-wise so that compactions no
> longer worked (because of full disk). I have now added new nodes so that
> data is spread more thin. However, I know there are inconsistencies in the
> cluster and I need to run a repair but those also fail because of out of
> disk errors. Is it safe to run cleanup before I run the repair or might I
> lose data because of said inconsistencies?
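To make the description of cleanup above concrete, here is a minimal toy model of ownership on a token ring. It is a sketch under stated assumptions, not how Cassandra implements it: one token per node (no vnodes), integer keys used directly as tokens, and a simple placement where the replicas for a key are the first node whose token is at or after the key plus the next distinct nodes clockwise. The function names and token values are hypothetical.

```python
from bisect import bisect_left


def replicas_for(token, ring, rf):
    """Return the nodes replicating `token`.

    `ring` is a list of (token, node) pairs sorted by token. The first
    replica is the first node whose token is >= `token` (wrapping around),
    followed by the next distinct nodes clockwise until `rf` are found.
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)
    distinct_nodes = len({node for _, node in ring})
    replicas = []
    i = start
    while len(replicas) < min(rf, distinct_nodes):
        node = ring[i % len(ring)][1]
        if node not in replicas:
            replicas.append(node)
        i += 1
    return replicas


def cleanup(node, local_keys, ring, rf):
    """Keep only the keys that `node` still replicates under `ring`."""
    return {k for k in local_keys if node in replicas_for(k, ring, rf)}


# Two nodes with RF 2: each node replicates every key.
old_ring = [(50, "node1"), (100, "node2")]
# After adding node3 and node4, ownership is spread over four nodes.
new_ring = [(25, "node3"), (50, "node1"), (75, "node4"), (100, "node2")]

node2_keys = {10, 30, 60, 90}
print(sorted(cleanup("node2", node2_keys, old_ring, rf=2)))  # [10, 30, 60, 90]
print(sorted(cleanup("node2", node2_keys, new_ring, rf=2)))  # [60, 90]
```

In this toy ring, key 10 ends up replicated by node3 and node1, so cleanup on node2 discards its copy. That is the disk space cleanup frees, and it is also why it matters that the new owners received the data during bootstrap.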
Safe to run cleanup before repair?
So, I have a cluster which grew too large data-wise, so that compactions no longer worked (because of a full disk). I have now added new nodes so that the data is spread more thinly. However, I know there are inconsistencies in the cluster and I need to run a repair, but repairs also fail because of out-of-disk errors. Is it safe to run cleanup before I run the repair, or might I lose data because of said inconsistencies?