Hey list,

Tried to find an answer to this elsewhere, but turned up nothing.

Two days ago we ran our first incremental repair following a large DC
migration; during the migration itself the cluster had been running full
repairs. Our nodes are now going through anticompaction, as expected.

However, two days later, there is little to no apparent progress on the
anticompaction. The compaction count does increase, in bursts, but
nodetool compactionstats hangs with no response. Our disk space footprint
is also growing steadily, and the number of sstables on disk is getting
high.

In the past, when our compactions appeared to hang, a restart moved things
along; at the very least, it allowed JMX to respond again. However, I'm not
sure of the repercussions of a restart during anticompaction.

Given my understanding of anticompaction, I would expect the following
after a restart: the sstables that had already been split and marked
repaired would remain that way; the ones that had not yet been split would
be left as unrepaired, so some ranges would probably be re-repaired on the
next incremental repair; and the machine would run standard compaction
within each of the two sets (repaired and unrepaired) separately. In other
words, we wouldn't lose any progress in incremental repair +
anticompaction, but some already-repaired data would get re-repaired. Does
this seem reasonable?
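
To make that expectation concrete, here is a toy model in Python. It is
only a sketch of my mental model, not Cassandra's actual code: the SSTable
class, the anticompact() and run() functions, and the stop_after parameter
are all made up for illustration, and it assumes each sstable covers a
single contiguous token range, which real sstables do not.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable covering tokens [lo, hi)."""
    name: str
    lo: int
    hi: int
    repaired: bool = False


def anticompact(t: SSTable, r_lo: int, r_hi: int) -> List[SSTable]:
    """Split one unrepaired sstable around the repaired range [r_lo, r_hi)."""
    if t.repaired:
        return [t]
    lo, hi = max(t.lo, r_lo), min(t.hi, r_hi)
    if lo >= hi:
        return [t]  # no overlap with the repaired range: stays unrepaired
    out = [SSTable(t.name + "-rep", lo, hi, True)]
    if t.lo < lo:
        out.append(SSTable(t.name + "-unrep-left", t.lo, lo, False))
    if hi < t.hi:
        out.append(SSTable(t.name + "-unrep-right", hi, t.hi, False))
    return out


def run(tables: List[SSTable], r_lo: int, r_hi: int,
        stop_after: Optional[int] = None) -> List[SSTable]:
    """Anticompact tables in order; stop_after simulates a restart after
    that many sstables have been processed (the rest are left untouched)."""
    result: List[SSTable] = []
    for i, t in enumerate(tables):
        if stop_after is not None and i >= stop_after:
            result.append(t)  # never reached before the restart
        else:
            result.extend(anticompact(t, r_lo, r_hi))
    return result


tables = [SSTable("a", 0, 100), SSTable("b", 50, 150), SSTable("c", 120, 200)]
after = run(tables, r_lo=0, r_hi=100, stop_after=2)  # restart before "c"
repaired = [t.name for t in after if t.repaired]
unrepaired = [t.name for t in after if not t.repaired]
print(repaired)    # ['a-rep', 'b-rep']
print(unrepaired)  # ['b-unrep-right', 'c']
```

In this model the work done before the restart survives (a-rep, b-rep), and
anything not yet processed simply stays in the unrepaired set, to be picked
up by the next incremental repair. Whether the real thing behaves this way
is exactly what I'm asking.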

Should I just let this anticompaction run its course? We did the migration
procedure (marking sstables as repaired) a while ago, but did a full repair
again after that, before we decommissioned our old DC.

Any guidance would be appreciated! Thanks,

Bryan
