Re: How to predict time to complete for nodetool repair

2020-03-23 Thread Alexander DEJANOVSKI
Also, Reaper will skip the anticompaction phase, which you might be going
through with nodetool (depending on your version of Cassandra).
That'll reduce the overall time spent on repair and will remove some
compaction pressure.

But as Erick said, unless you have past repairs to rely on and a stable
data size, it is impossible to predict the time it takes for repair to
complete.

Cheers,

Alex

On Mon, Mar 23, 2020 at 12:44, Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Mon, Mar 23, 2020 at 5:49 AM Shishir Kumar 
> wrote:
>
>> Hi,
>>
>> Is it possible to get/predict how much time it will take for *nodetool
>> repair -pr* to complete on a node? Currently in one of my envs (~800GB data
>> per node in a 6-node cluster), it has been running for the last 3 days.
>>
>
> Cassandra Reaper used to provide a reasonably accurate estimate as I
> recall.  Of course, the repair has to be triggered by Reaper itself--it's
> no use if you have already started it with nodetool.
>
> Regards,
> --
> Alex
>
>


Re: How to predict time to complete for nodetool repair

2020-03-23 Thread Oleksandr Shulgin
On Mon, Mar 23, 2020 at 5:49 AM Shishir Kumar 
wrote:

> Hi,
>
> Is it possible to get/predict how much time it will take for *nodetool
> repair -pr* to complete on a node? Currently in one of my envs (~800GB data
> per node in a 6-node cluster), it has been running for the last 3 days.
>

Cassandra Reaper used to provide a reasonably accurate estimate as I
recall.  Of course, the repair has to be triggered by Reaper itself--it's
no use if you have already started it with nodetool.

Regards,
--
Alex


Re: How to predict time to complete for nodetool repair

2020-03-23 Thread Erick Ramirez
There are a lot of moving parts with repairs, and how long they take depends
on various factors including (but not limited to):

   - how busy the nodes are
   - how fast the CPUs are
   - how fast the disks are
   - how much network bandwidth is available
   - how much data needs to be repaired

It's more art than science trying to predict it, but you get closer to
science the more successful runs you've had, meaning you can make an educated
guesstimate based on the previous repair runs. The very first run is always
pretty bad, so consider it an outlier, but use the successive runs as
indicators and you should be able to extrapolate from there.
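
As a rough sketch of that kind of extrapolation (the function name and the
durations below are hypothetical placeholders, not measurements from any real
cluster), you could drop the first run and take the median of the rest:

```python
from statistics import median


def estimate_repair_hours(past_run_hours, skip_first=True):
    """Guesstimate the next repair duration from earlier successful runs.

    Returns (median, shortest, longest) in hours. The first run is treated
    as an outlier and dropped whenever later runs are available.
    """
    runs = list(past_run_hours)
    if skip_first and len(runs) > 1:
        runs = runs[1:]  # ignore the very first (usually worst) run
    if not runs:
        raise ValueError("need at least one past repair run")
    return median(runs), min(runs), max(runs)


# Hypothetical history: first run is slow, later runs settle down.
history_hours = [96.0, 52.0, 48.0, 55.0]
estimate, low, high = estimate_repair_hours(history_hours)
print(f"expect roughly {estimate:.0f}h (seen {low:.0f}h to {high:.0f}h)")
```

This only means something if the data size and cluster load are comparable
between runs, so treat the result as a ballpark figure rather than a prediction.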

Finally, be careful when comparing runs during low-traffic periods of the
month with runs at primetime/peak. Repairs will run faster when the cluster is
not busy and will perform horrendously at peak times since they're competing
for the same resources as normal app traffic. Cheers!


How to predict time to complete for nodetool repair

2020-03-22 Thread Shishir Kumar
Hi,

Is it possible to get/predict how much time it will take for *nodetool repair
-pr* to complete on a node? Currently in one of my envs (~800GB data per node
in a 6-node cluster), it has been running for the last 3 days.

Regards,
Shishir