MemtablePostFlusher (MPF) is indeed pretty lightweight, but its job is
to mark the commitlog replay position after a flush, which has to be
done in flush order to preserve correctness in failure scenarios. So
you'll see its pending op count go up whenever multiple flushes are in
progress. This is expected.
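
To illustrate the ordering point, here is a minimal Python sketch. It is
not Cassandra's actual code: `simulate`, `post_flush`, and the per-flush
events are hypothetical names, and a single-threaded executor stands in
for the post-flush stage. Each post-flush task waits for its own flush,
and because tasks run strictly in submission order, tasks for flushes
that finished early still sit in the queue behind a slower, earlier one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def simulate(completion_order):
    """Return the order in which replay positions get marked.

    completion_order lists flush ids in the order the (simulated)
    flushes finish writing; flush ids are 0..n-1 in start order.
    """
    n = len(completion_order)
    flushed = [threading.Event() for _ in range(n)]
    marked = []  # flush ids, in the order their position was marked

    def post_flush(i):
        flushed[i].wait()  # block until flush i has finished
        marked.append(i)   # stand-in for marking the replay position

    # One worker thread: tasks run strictly in submission (flush) order.
    with ThreadPoolExecutor(max_workers=1) as post_flusher:
        futures = [post_flusher.submit(post_flush, i) for i in range(n)]
        for i in completion_order:
            flushed[i].set()  # simulate flush i completing
        for f in futures:
            f.result()
    return marked
```

Even if the flushes complete in reverse order, as in
`simulate([2, 1, 0])`, the positions are marked as 0, 1, 2. Until flush 0
finishes, all three tasks count as active or pending, which is the kind
of backlog tpstats reports for this stage.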

Your real problem is the roughly 17,500 pending mutations in
MutationStage, the 22 active + pending flushes in FlushWriter (8 + 14),
and probably compaction activity as well.

(Also, if each of those pending mutations is 10,000 columns, you may
be causing yourself memory pressure as well.)
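
A back-of-the-envelope sketch of those numbers, in Python. The function
names are hypothetical, and the assumption that every pending mutation
carries a full 10,000 columns is a worst case, not something measured.
The send rate also ignores the time each send itself takes, so it is an
upper bound.

```python
def pending_columns(pending_mutations, columns_per_mutation):
    """Worst-case number of columns held by queued mutations."""
    return pending_mutations * columns_per_mutation


def columns_per_second(columns_per_send, delay_ms):
    """Upper bound on hint replay rate: one send per delay interval."""
    sends_per_second = 1000 // delay_ms
    return columns_per_send * sends_per_second


# Figures taken from the tpstats output and the hint settings quoted below.
queued = pending_columns(17564, 10000)   # 175,640,000 columns
rate = columns_per_second(10000, 50)     # 200,000 columns per second
```

Multiply `queued` by whatever your average column size is to get a rough
idea of how much heap the MutationStage queue could be holding.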

On Wed, Apr 27, 2011 at 11:01 AM, Terje Marthinussen
<tmarthinus...@gmail.com> wrote:
> 0.8 trunk:
>
> When playing back a fairly large chunk of hints, things basically lock up
> under load.
> The hints are never processed successfully. Lots of Mutations dropped.
>
> One thing is that maybe the default 10k columns per send with 50ms delays is
> a bit on the aggressive side (10k * 20 = 200,000 columns per second?), the
> other thing is that it seems like the whole memtable flushing locks up.
>
> I tried to increase the number of memtable flushers and the queue a bit (8
> concurrent flushers) to make things work, but no luck.
>
> Pool Name                    Active   Pending      Completed
> ReadStage                         0         0              1
> RequestResponseStage              0         0        2236304
> MutationStage                   100     17564        4011533
> ReadRepairStage                   0         0              0
> ReplicateOnWriteStage             0         0              0
> GossipStage                       0         0           2281
> AntiEntropyStage                  0         0              0
> MigrationStage                    0         0              0
> MemtablePostFlusher               1        13             50
> StreamStage                       0         0              0
> FlushWriter                       8        14             73
> MiscStage                         0         0              0
> FlushSorter                       0         0              0
> InternalResponseStage             0         0              0
> HintedHandoff                     1         8              3
>
> A quick source code scan makes me believe that the MemtablePostFlusher
> should not normally use a lot of time, but it seems like it does so here.
> What may cause this?
>
> Terje
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
