A bit more information. Using jmxterm to inspect a node while it is
"slow" playing hints, I can see the following on the node that has
hints to play:
$>get MaxHintsInProgress
#mbean = org.apache.cassandra.db:type=StorageProxy:
MaxHintsInProgress = 2048;
$>get HintsInProgress
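
For anyone who wants to poll the same two attributes without an interactive jmxterm session, here is a minimal sketch using the standard Java JMX remote API. It assumes JMX is reachable on Cassandra's default port 7199 without authentication or SSL. The class name HintsProbe and its helper methods are made up for illustration. The MBean and attribute names are the ones from the session above.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, not part of Cassandra or jmxterm.
public class HintsProbe {
    // MBean name as reported by jmxterm in the session above.
    static final String STORAGE_PROXY = "org.apache.cassandra.db:type=StorageProxy";

    // Builds the usual RMI-based JMX service URL for a host and port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Reads a single attribute from the StorageProxy MBean.
    static Object readAttribute(MBeanServerConnection conn, String attribute) throws Exception {
        return conn.getAttribute(new ObjectName(STORAGE_PROXY), attribute);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: unauthenticated JMX on the default port unless overridden.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7199;

        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            System.out.println("MaxHintsInProgress = " + readAttribute(conn, "MaxHintsInProgress"));
            System.out.println("HintsInProgress = " + readAttribute(conn, "HintsInProgress"));
        }
    }
}
```

Running this in a loop against the node that holds the hints would show whether HintsInProgress ever gets close to MaxHintsInProgress during the restart.
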
We have a 96-node cluster running Cassandra 3.11 with 256 vnodes per node.
We're doing a rolling restart. As we restart nodes, we notice that each
restarted node takes a while before all other nodes are marked as up, and
this corresponds to nodes that haven't finished playing hints.
We looked at the hinted handoff