On 10.09.2013, at 02:34, "Laing, Michael" <michael.la...@nytimes.com> wrote:
> I have seen something similar.
>
> Of course correlation is not causation...

Thanks for sharing - interesting. However, I still find it confusing that C*
does not refuse service before it dies. Maybe that is a by-product of the SEDA
architecture, though.

I switched back from hsha to sync and increased memtable max size and heap.
That did the trick. Now it flies.

Jan

> Like you, doing testing with heavy writes.
>
> I was using a python client to drive the writes using the cql module which is
> thrift based.
>
> The correlation I eventually tracked down was that whichever node my python
> client(s) connected to eventually ran out of memory because it could not gain
> enough back by flushing memtables. It was just a matter of time.
>
> I switched to the new python-driver client and the problem disappeared.
>
> I have now been able to return almost all parameters to defaults and get out
> of the business of manually managing the JVM heap, to my great relief!
>
> Currently, I have to retool my test harness as I have been unable to drive
> C*2.0.0 to destruction (yet).
>
> Michael
>
>
> On Mon, Sep 9, 2013 at 8:11 PM, Jan Algermissen <jan.algermis...@nordsc.com>
> wrote:
> I have a strange pattern: In a cluster with three equally dimensioned and
> configured nodes I keep losing one because apparently it fails to flush its
> memtables:
>
> http://twitpic.com/dcrtel
>
>
> It is a different node every time.
>
> So far I understand that I should expect to see the chain-saw graph when
> memtables are built up and then get flushed. But what about that third node?
> Has anyone seen something similar?
>
> Jan
>
> C* dsc 2.0, 3x 4GB, 2CPU nodes with heavy writes of 70 col-rows (approx. 10
> of those rows per wide row)
>
> I have turned off caches, reduced overall memtable and set flush-writers to
> 2, rpc_reader and writer threads to 1.