On Fri, Nov 26, 2010 at 10:49 AM, Peter Schuller <peter.schul...@infidyne.com> wrote:
>> Making compaction parallel isn't a priority because the problem is
>> almost always the opposite: how do we spread it out over a longer
>> period of time instead of sharp spikes of activity that hurt
>> read/write latency. I'd be very surprised if latency would be
>> acceptable if you did have parallel compaction. In other words, your
>> real problem is you need more capacity for your workload.
>
> Do you expect this to be true even with the I/O situation improved
> (i.e., under conditions where the additional I/O is not a problem)? It
> seems counter-intuitive to me that single-core compaction would make a
> huge impact on latency when compaction is CPU bound on an 8+ core
> system under moderate load (even taking into account cache
> coherency/NUMA etc).
>
> --
> / Peter Schuller
Carlos,

I wanted to mention a specific technique I used to solve a situation I ran into. We had a large influx of data that pushed the limits of our hardware, and as stated above the true answer was more capacity. In the meantime, though, a single node failed several large compactions; after failing 2 or 3 big compactions we ended up with ~1000 SSTables for one column family. That became a chicken-and-egg situation: reads were slow because there were so many SSTables (plus extra data like tombstones), yet compaction was brutally slow because of the read/write traffic.

My solution was to create a side-by-side install on the same box, using different data directories and different ports (/var/lib/cassandra/compact, 9168, etc.). I moved the data to the new install, started it up, and ran nodetool compact on the new instance. Since this instance saw no read or write traffic, I was surprised to see the machine at 400% CPU (out of a possible 1600%) with very little io-wait. Compacting 600 GB of small SSTables took about 4 days. (When the SSTables are larger I have compacted 400 GB in 4 hours on the same hardware.) Afterward I moved the data files back in place and brought the node back into the cluster. A rough sketch of the steps is at the end of this mail.

I have lived on both sides of the fence, sometimes wanting long slow compactions and sometimes breakneck fast ones, and I believe there is room for other compaction models. I am interested, for example, in systems that can optimize the case with multiple data directories. My experiment suggests that a major compaction cannot fully utilize the hardware in specific conditions, although knowing which models to use where, and how to automatically select the optimal strategy, are interesting questions.
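For anyone who wants to try the same trick, here is a minimal sketch of the procedure. The install paths, the keyspace name "MyKeyspace", and the exact ports are made up for illustration, and the config file differs by version (storage-conf.xml on 0.6, cassandra.yaml on 0.7), so adjust everything to your own layout:

    # Sketch only: /opt paths, "MyKeyspace", and port numbers are examples.

    # 1. Make a second install with its own data and commitlog directories.
    cp -a /opt/apache-cassandra /opt/cassandra-compact
    mkdir -p /var/lib/cassandra/compact/data/MyKeyspace \
             /var/lib/cassandra/compact/commitlog
    # Edit the copy's config (storage-conf.xml on 0.6, cassandra.yaml on 0.7)
    # so the data/commitlog directories point at the paths above, and give it
    # storage/Thrift/JMX ports that differ from the live node's so the two
    # instances cannot collide.

    # 2. Move the bloated column family's SSTables to the side install.
    mv /var/lib/cassandra/data/MyKeyspace/* \
       /var/lib/cassandra/compact/data/MyKeyspace/

    # 3. Start the side instance and major-compact at full speed; it sees
    #    no read or write traffic. -p is the JMX port you gave the copy
    #    (9168 in my example above).
    /opt/cassandra-compact/bin/cassandra
    /opt/cassandra-compact/bin/nodetool -h localhost -p 9168 compact

    # 4. When it finishes, stop the side instance, move the (now few, large)
    #    SSTables back into the real node's data directory, and restart it.

The separate ports are just for isolation: the compacting instance serves no clients, so it can use all the CPU and disk bandwidth it wants without hurting the live node.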