Now that my cluster appears to run smoothly, after a few successful
repairs and compactions, I'm back to deleting portions of data based
on their insertion date. For reasons too lengthy to explain here,
I don't want to use TTL.

I use a batch mutator in Pycassa to delete ~1M rows, using a longish
list of keys that I extract from an auxiliary CF (the extraction itself
works without any problem).

However, it appears that such a head-on delete puts a temporary
but large load on the cluster. I have SSDs and they go to 100%
utilization, and CPU load spikes significantly.

Does anyone throttle this kind of mass-delete procedure?
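
To make the question concrete, here is a rough sketch of the kind of throttling I have in mind: flush the deletes in fixed-size chunks and pause between chunks. This is only an illustration, not something I have tuned. The function name throttled_delete and the parameters batch_size and pause_s are made up for this example, and the only thing it assumes about the mutator is that it has remove(key) and send() methods, as the Pycassa batch mutator does.

```python
import time


def throttled_delete(keys, mutator, batch_size=500, pause_s=0.5, sleep=time.sleep):
    """Queue row deletions on `mutator`, flushing and pausing every `batch_size` keys.

    `mutator` is assumed to expose remove(key) and send(), like a Pycassa
    batch mutator. `batch_size` and `pause_s` are illustrative defaults,
    not recommended values. Returns the number of keys submitted.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    pending = 0  # deletions queued since the last flush
    total = 0    # deletions submitted overall

    for key in keys:
        mutator.remove(key)  # queue a whole-row delete
        pending += 1
        total += 1
        if pending >= batch_size:
            mutator.send()   # flush this chunk to the cluster
            pending = 0
            sleep(pause_s)   # give the cluster room before the next chunk

    if pending:
        mutator.send()       # flush the final partial chunk, no pause needed

    return total
```

With Pycassa I would expect to pass something like cf.batch(queue_size=1000) as the mutator, with queue_size larger than batch_size so that the mutator does not flush on its own before the explicit send(). The right chunk size and pause presumably depend on the cluster, so both would have to be found by experiment.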

Thanks in advance,

Maxim
