Re: How to predict time to complete for nodetool repair
Also, Reaper will skip the anticompaction phase, which you might be going through with nodetool (depending on your version of Cassandra). That will reduce the overall time spent on repair and remove some compaction pressure.

But as Erick said, unless you have past repairs to rely on and a stable data size, it is impossible to predict how long a repair will take to complete.

Cheers,

Alex

On Mon, Mar 23, 2020 at 12:44, Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:

> On Mon, Mar 23, 2020 at 5:49 AM Shishir Kumar wrote:
>
>> Hi,
>>
>> Is it possible to get/predict how much time it will take for *nodetool
>> repair -pr* to complete on a node? Currently in one of my envs (~800GB
>> data per node in a 6-node cluster), it has been running for the last 3 days.
>
> Cassandra Reaper used to provide a reasonably accurate estimate as I
> recall. Of course, the repair has to be triggered by Reaper itself--it's
> no use if you have already started it with nodetool.
>
> Regards,
> --
> Alex
Re: How to predict time to complete for nodetool repair
On Mon, Mar 23, 2020 at 5:49 AM Shishir Kumar wrote:

> Hi,
>
> Is it possible to get/predict how much time it will take for *nodetool
> repair -pr* to complete on a node? Currently in one of my envs (~800GB
> data per node in a 6-node cluster), it has been running for the last 3 days.

Cassandra Reaper used to provide a reasonably accurate estimate as I recall. Of course, the repair has to be triggered by Reaper itself--it's no use if you have already started it with nodetool.

Regards,
--
Alex
Re: How to predict time to complete for nodetool repair
There are a lot of moving parts with repairs, and how long a repair takes depends on various factors including (but not limited to):

- how busy the nodes are
- how fast the CPUs are
- how fast the disks are
- how much network bandwidth is available
- how much data needs to be repaired

Predicting it is more art than science, but you get closer to science the more successful runs you have had: you make an educated guesstimate based on the previous repair runs. The very first run is always pretty bad, so treat it as an outlier, but use the subsequent runs as indicators and you should be able to extrapolate from there.

Finally, be careful when comparing runs during low-traffic periods of the month with runs at primetime/peak. Repairs run faster when the cluster is not busy and perform horrendously at peak times, since they compete for the same resources as normal app traffic.

Cheers!
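The extrapolation from previous runs described above can be sketched roughly as follows. This is only an illustration, not something Cassandra or nodetool provides: the function name `estimate_repair_hours`, the (data size, duration) record format, and all the numbers in `history` are made up for the example. It drops the first run as an outlier, computes the throughput of each remaining run, and scales the median to the current data size.

```python
from statistics import median


def estimate_repair_hours(past_runs, current_gb, skip_first=True):
    """Guesstimate repair duration from earlier successful runs.

    past_runs: list of (data_gb, duration_hours) tuples, oldest first.
    current_gb: data size per node for the run being estimated.
    """
    # The very first repair is usually an outlier, so ignore it when
    # there is at least one later run to rely on.
    runs = past_runs[1:] if skip_first and len(past_runs) > 1 else past_runs
    if not runs:
        raise ValueError("need at least one past repair run")

    # Throughput (GB repaired per hour) observed in each past run.
    rates = [gb / hours for gb, hours in runs]
    typical = median(rates)

    return {
        "expected": current_gb / typical,
        "best_case": current_gb / max(rates),
        "worst_case": current_gb / min(rates),
    }


# Hypothetical history for illustration only (not figures from this thread).
history = [(700, 90.0), (720, 36.0), (760, 40.0), (780, 39.0)]
estimate = estimate_repair_hours(history, current_gb=800)
print(estimate)
```

Only compare runs taken under similar load: mixing off-peak and peak runs in the same history will widen the best/worst range considerably.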
How to predict time to complete for nodetool repair
Hi,

Is it possible to get/predict how much time it will take for *nodetool repair -pr* to complete on a node? Currently in one of my envs (~800GB data per node in a 6-node cluster), it has been running for the last 3 days.

Regards,
Shishir