Re: Memtable flush blocking writes

2013-09-24 Thread Ken Hancock
This is on Cassandra 1.2.9, though packaged into DSE, which I suspect may
come into play here.  I never really got to the bottom of it other than to
raise memtable_flush_queue_size to 32, which is about the number of CFs I
have.  After that, the mutation drops disappeared and the FlushWriter
blocks went away.
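
For reference, the knob in question lives in cassandra.yaml. A minimal
sketch of the change described above (the value 32 simply mirrors the
CF count in this cluster rather than being a general recommendation;
memtable_flush_writers is shown commented out, only for context):

# cassandra.yaml (Cassandra 1.2.x) -- illustrative excerpt
# Full memtables allowed to queue for a flush slot before writes block;
# raised from the default of 4 to roughly count(CF) for this cluster.
memtable_flush_queue_size: 32

# Concurrent flush writer threads; usually left at the default
# (one per data directory) unless flushing itself is the bottleneck.
# memtable_flush_writers: 1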


On Mon, Sep 23, 2013 at 6:03 PM, Robert Coli rc...@eventbrite.com wrote:

 On Fri, Aug 23, 2013 at 10:35 AM, Ken Hancock ken.hanc...@schange.com wrote:

 I appear to have a problem illustrated by
 https://issues.apache.org/jira/browse/CASSANDRA-1955. At low data
 rates, I'm seeing mutation messages dropped because writers are
 blocked as I get a storm of memtables being flushed. OpsCenter
 memtables seem to also contribute to this:

 ...

 Now I can increase memtable_flush_queue_size, but it seems based on
 the above that in order to solve the problem, I need to set this to
 count(CF). What's the downside of this approach? It seems a backwards
 solution to the real problem...


 What version of Cassandra? Did you ever get to the bottom of this?

 =Rob





Re: Memtable flush blocking writes

2013-09-23 Thread Robert Coli
On Fri, Aug 23, 2013 at 10:35 AM, Ken Hancock ken.hanc...@schange.com wrote:

 I appear to have a problem illustrated by
 https://issues.apache.org/jira/browse/CASSANDRA-1955. At low data
 rates, I'm seeing mutation messages dropped because writers are
 blocked as I get a storm of memtables being flushed. OpsCenter
 memtables seem to also contribute to this:

...

 Now I can increase memtable_flush_queue_size, but it seems based on
 the above that in order to solve the problem, I need to set this to
 count(CF). What's the downside of this approach? It seems a backwards
 solution to the real problem...


What version of Cassandra? Did you ever get to the bottom of this?

=Rob


Memtable flush blocking writes

2013-08-23 Thread Ken Hancock
I appear to have a problem illustrated by
https://issues.apache.org/jira/browse/CASSANDRA-1955. At low data
rates, I'm seeing mutation messages dropped because writers are
blocked as I get a storm of memtables being flushed. OpsCenter
memtables seem to also contribute to this:

INFO [OptionalTasks:1] 2013-08-23 01:53:58,522 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-runratecountforiczone@1281182121(14976/120803 serialized/live
bytes, 360 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,523 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-runratecountforchannel@705923070(278200/1048576
serialized/live bytes, 6832 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,525 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-solr_resources@1615459594(66362/66362 serialized/live bytes,
4 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,525 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-scheduleddaychannelie@393647337(33203968/36700160
serialized/live bytes, 865620 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,530 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-failediecountfornetwork@1781160199(8680/124903
serialized/live bytes, 273 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,530 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-rollups7200@37425413(6504/23 serialized/live bytes, 271
ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,531 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-rollups60@1943691367(638176/1048576 serialized/live bytes,
39894 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,531 ColumnFamilyStore.java
(line 630) Enqueuing flush of Memtable-events@99567005(1133/1133
serialized/live bytes, 39 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,532 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-rollups300@532892022(184296/1048576 serialized/live bytes,
7679 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,532 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-ie@1309405764(457390051/152043520 serialized/live bytes,
16956160 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,823 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-videoexpectedformat@1530999508(684/24557 serialized/live
bytes, 12453 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:58,929 ColumnFamilyStore.java
(line 630) Enqueuing flush of
Memtable-failediecountforzone@411870848(9200/95294 serialized/live
bytes, 284 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:59,012 ColumnFamilyStore.java
(line 630) Enqueuing flush of Memtable-rollups86400@744253892(456/456
serialized/live bytes, 19 ops)
INFO [OptionalTasks:1] 2013-08-23 01:53:59,364 ColumnFamilyStore.java
(line 630) Enqueuing flush of Memtable-peers@2024878954(2006/40629
serialized/live bytes, 452 ops)

I had tpstats running across all the nodes in my cluster every 5
seconds or so and observed the following for the FlushWriter pool
(the columns after the pool name are Active, Pending, Completed,
Blocked, and All time blocked):

2013-08-23T01:53:47 192.168.131.227 FlushWriter 0 0 33 0 0
2013-08-23T01:53:55 192.168.131.227 FlushWriter 0 0 33 0 0
2013-08-23T01:54:00 192.168.131.227 FlushWriter 2 10 37 1 5
2013-08-23T01:54:07 192.168.131.227 FlushWriter 1 1 53 0 11
2013-08-23T01:54:12 192.168.131.227 FlushWriter 1 1 53 0 11
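
For completeness, the sampling above doesn't need anything fancier than
a loop around nodetool. A rough sketch of how it could be scripted
(host list, interval, and the FlushWriter-only filter are illustrative;
it assumes nodetool is on the PATH):

#!/usr/bin/env python
# Sample `nodetool tpstats` on each node every few seconds and log the
# FlushWriter pool line, prefixed with a timestamp and the host.
import subprocess
import time
from datetime import datetime

HOSTS = ["192.168.131.227"]   # add the other nodes in the cluster here
INTERVAL = 5                  # seconds between samples

while True:
    stamp = datetime.now().strftime("%Y-%m-%dT%H:%M:%S")
    for host in HOSTS:
        out = subprocess.check_output(["nodetool", "-h", host, "tpstats"])
        for line in out.decode().splitlines():
            if line.startswith("FlushWriter"):
                print(stamp, host, line)
    time.sleep(INTERVAL)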

Now I can increase memtable_flush_queue_size, but it seems based on
the above that in order to solve the problem, I need to set this to
count(CF). What's the downside of this approach? It seems a backwards
solution to the real problem...