Re: Why the StageManager thread pools have 60 seconds keepalive time?

2012-08-21 Thread aaron morton
One thing we did change in the past weeks was the memtable_flush_queue_size in order to occupy less heap space with memtables; this was due to having received this warning message and some OOM exceptions: Danger. Do you know any strategy to diagnose if memtables flushing to disk and

Re: Why the StageManager thread pools have 60 seconds keepalive time?

2012-08-21 Thread Guillermo Winkler
Aaron, thanks for your answer. We do have big batch updates, not always with the columns belonging to the same row (i.e. many threads are needed to handle the updates), but it did not represent a problem when the CFs had less data in them. One thing we did change in the past weeks was the

Re: Why the StageManager thread pools have 60 seconds keepalive time?

2012-08-19 Thread aaron morton
You're seeing dropped mutations reported by nodetool tpstats? Take a look at the logs. Look for messages from the MessagingService with the pattern "{} {} messages dropped in last {}ms". They will be followed by info about the TP stats. First would be the workload. Are you sending very big

Re: Why the StageManager thread pools have 60 seconds keepalive time?

2012-08-17 Thread Guillermo Winkler
Aaron, thanks for your answer. I'm actually tracking a problem where mutations get dropped and cfstats shows no activity whatsoever. I have 100 threads for the mutation pool and no running or pending tasks, but some mutations get dropped nonetheless. I'm thinking about some scheduling problems but

Why the StageManager thread pools have 60 seconds keepalive time?

2012-08-16 Thread Guillermo Winkler
Hi, I have a Cassandra cluster where I'm seeing a lot of thread thrashing in the mutation pool (e.g. MutationStage:72031), where threads get created and disposed of by the hundreds every few minutes. Since it's a 16-core server, concurrent_writes is set to 100 in cassandra.yaml. concurrent_writes:
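
The churn described above is what a fixed-size java.util.concurrent.ThreadPoolExecutor does when its idle threads are allowed to time out. The following is a minimal sketch of that behaviour, not Cassandra's StageManager code: the class KeepAliveSketch and its helpers are hypothetical names, and it assumes the pool lets core threads time out (allowCoreThreadTimeOut), which is consistent with the symptoms but is not confirmed in this thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class KeepAliveSketch {

    // Hypothetical helper: a fixed-size pool whose idle threads are
    // reclaimed after the keepalive time. Assumption: core threads may
    // time out, as the reported thread churn suggests.
    static ThreadPoolExecutor newStagePool(int threads, long keepAlive, TimeUnit unit) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, keepAlive, unit, new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    // Runs one short task per thread, leaves the pool idle for idleMillis,
    // then returns how many threads are still alive in the pool.
    static int poolSizeAfterIdle(int threads, long keepAliveMillis, long idleMillis)
            throws InterruptedException {
        ThreadPoolExecutor pool = newStagePool(threads, keepAliveMillis, TimeUnit.MILLISECONDS);
        try {
            CountDownLatch done = new CountDownLatch(threads);
            for (int i = 0; i < threads; i++) {
                // Below the core size, each execute() starts a new thread.
                pool.execute(done::countDown);
            }
            done.await();
            Thread.sleep(idleMillis);
            return pool.getPoolSize();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Idle period longer than the keepalive: threads are disposed of.
        System.out.println("short keepalive: " + poolSizeAfterIdle(4, 100, 1000));
        // Idle period shorter than a 60 s keepalive: threads are kept.
        System.out.println("60 s keepalive:  " + poolSizeAfterIdle(4, 60_000, 200));
    }
}
```

With a bursty write load, any gap longer than the keepalive lets the whole pool drain, and the next batch recreates the threads, which would match threads appearing and disappearing in groups every few minutes.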

Re: Why the StageManager thread pools have 60 seconds keepalive time?

2012-08-16 Thread aaron morton
That's some pretty old code. I would guess it was done that way to conserve resources. And _I think_ thread creation is pretty lightweight. Jonathan / Brandon / others - opinions? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On