Re: Re: eliminate need to repair by using column TTL??

2011-07-22 Thread jonathan . colby
Good points, Aaron. I realize now how expensive read repair is. I'm  
going to keep running repairs regularly but still set a max TTL on all  
columns to make sure we don't have really old data we no longer need  
getting buried in the cluster.



Re: eliminate need to repair by using column TTL??

2011-07-22 Thread aaron morton
Read repair will only repair data that is read on the nodes that are up at that 
time, and does not guarantee that any changes it detects will be written back 
to the nodes. The diff mutations are asynchronous fire-and-forget messages, 
which may go missing or be dropped or ignored by the recipient just like any 
other message. 

Also, getting hit with a bunch of read repair operations is pretty painful. The 
normal read runs, the coordinator detects the digest mismatch, the read runs 
again against all nodes and they all have to return their full data (no digests 
this time), the coordinator detects the diffs, and mutations are sent back to 
each node that needs them. All of this happens synchronously with the read 
request when the CL > ONE. That's 2 reads with more network IO and up to RF 
mutations. 
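
To make the cost of that sequence concrete, here is a rough sketch in Python. It is a toy model, not Cassandra code: the replicas are plain dicts, and the names `Column`, `digest` and `read_with_repair` are made up for illustration. It only counts the work described above (a first round of one data read plus digests, then a second round of full reads and repair mutations on a mismatch).

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    value: str
    timestamp: int


def digest(col: Optional[Column]) -> str:
    """Hash of a column, standing in for the digest a replica would return."""
    raw = b"missing" if col is None else f"{col.value}:{col.timestamp}".encode()
    return hashlib.md5(raw).hexdigest()


def read_with_repair(replicas, key):
    """Toy coordinator: returns (column, stats) and repairs stale replicas.

    `replicas` is a list of dicts (hypothetical stand-ins for nodes).
    """
    stats = {"data_reads": 0, "digest_reads": 0, "mutations": 0}

    # Round 1: one replica returns full data, the others return digests only.
    first, *rest = replicas
    data = first.get(key)
    stats["data_reads"] += 1
    digests = [digest(r.get(key)) for r in rest]
    stats["digest_reads"] += len(rest)
    if all(d == digest(data) for d in digests):
        return data, stats

    # Round 2: digest mismatch, so every replica returns its full data.
    full = [r.get(key) for r in replicas]
    stats["data_reads"] += len(replicas)
    newest = max((c for c in full if c is not None), key=lambda c: c.timestamp)

    # Send the newest version back to each replica that needs it.
    for replica, col in zip(replicas, full):
        if col != newest:
            replica[key] = newest
            stats["mutations"] += 1
    return newest, stats


replicas = [{"k": Column("new", 2)}, {"k": Column("old", 1)}, {}]
result, stats = read_with_repair(replicas, "k")
print(result, stats)
```

In this model a consistent read costs one data read plus digests, while an inconsistent one adds a full read from every replica and a mutation for each stale or missing copy. The real mutations are fire-and-forget, which the sketch does not model.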

The delete thing is important but repair also reduces the chance of reads 
getting hit with RR and gives me confidence when it's necessary to nuke a bad 
node. 

Your plan may work but it feels risky to me. You may end up with worse read 
performance and unpleasant emotions if you ever have to nuke a node. Others may 
disagree. 

I'm not ignoring the fact that repair can take a long time, fail, hurt 
performance, etc. There are plans to improve it, though. 

Cheers
  
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22 Jul 2011, at 19:55, jonathan.co...@gmail.com wrote:

> One of the main reasons for regularly running repair is to make sure deletes 
> are propagated in the cluster, i.e., data is not resurrected if a node never 
> received the delete call.
> 
> And repair-on-read takes care of repairing inconsistencies "on-the-fly".
> 
> So if I were to set a universal TTL on all columns - so everything would only 
> live for a certain age, would I be able to get away without having to do 
> regular repairs with nodetool?
> 
> I realize this scenario would not be applicable for everyone, but our data 
> model would allow us to do this. 
> 
> So could this be an alternative to running the (resource-intensive, 
> long-running) repairs with nodetool?
> 
> Thanks.



eliminate need to repair by using column TTL??

2011-07-22 Thread jonathan . colby
One of the main reasons for regularly running repair is to make sure  
deletes are propagated in the cluster, i.e., data is not resurrected if a  
node never received the delete call.
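
The resurrection case can be sketched with a small toy model in Python. This is not Cassandra code: `Replica`, `repair`, `scenario` and the `gc_grace` parameter are hypothetical names. The model assumes newest-timestamp-wins reconciliation, and that tombstones are discarded at compaction once they are older than a grace period.

```python
TOMBSTONE = object()  # sentinel marking a deleted value


class Replica:
    def __init__(self):
        self.cells = {}  # key -> (value or TOMBSTONE, timestamp)

    def write(self, key, value, ts):
        # Newest timestamp wins; older or equal writes are ignored.
        current = self.cells.get(key)
        if current is None or ts > current[1]:
            self.cells[key] = (value, ts)

    def delete(self, key, ts):
        self.write(key, TOMBSTONE, ts)

    def compact(self, now, gc_grace):
        # Drop tombstones older than the grace period.
        self.cells = {
            k: (v, ts)
            for k, (v, ts) in self.cells.items()
            if not (v is TOMBSTONE and now - ts > gc_grace)
        }


def repair(replicas, key):
    """Push the newest version of `key` to every replica; return the live value."""
    versions = [r.cells[key] for r in replicas if key in r.cells]
    if not versions:
        return None
    value, ts = max(versions, key=lambda cell: cell[1])
    for r in replicas:
        r.write(key, value, ts)
    return None if value is TOMBSTONE else value


def scenario(repair_before_gc, gc_grace=10):
    a, b, c = Replica(), Replica(), Replica()
    for r in (a, b, c):
        r.write("k", "v", ts=1)
    for r in (a, b):  # c is down and never receives the delete
        r.delete("k", ts=2)
    if repair_before_gc:
        repair([a, b, c], "k")  # tombstone reaches c in time
    for r in (a, b, c):
        r.compact(now=100, gc_grace=gc_grace)
    return repair([a, b, c], "k")


print(scenario(repair_before_gc=False))  # the deleted value comes back
print(scenario(repair_before_gc=True))   # the delete sticks
```

Without the repair, the two replicas that saw the delete eventually forget it, and the third replica's old copy wins the next reconciliation.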


And repair-on-read takes care of repairing inconsistencies "on-the-fly".

So if I were to set a universal TTL on all columns - so everything would  
only live for a certain age, would I be able to get away without having to  
do regular repairs with nodetool?


I realize this scenario would not be applicable for everyone, but our data  
model would allow us to do this.


So could this be an alternative to running the (resource-intensive,  
long-running) repairs with nodetool?


Thanks.