Re: Exceptions on 0.7.0

2011-02-23 Thread Stu Hood
o. I was too puzzled by the numbers >>>>> >>>>> >>>>> On Thu, Feb 10, 2011 at 10:30 AM, aaron morton < >>>>> aa...@thelastpickle.com> wrote: >>>>> >>>>>> Shimi, >>>>>> You may be seeing th

Re: Exceptions on 0.7.0

2011-02-22 Thread David Boxenhorn
seeing the result of CASSANDRA-1992, are you able to test >>>>> with the most recent 0.7 build ? >>>>> https://hudson.apache.org/hudson/job/Cassandra-0.7/ >>>>> >>>>> >>>>> Aaron >>>>> >>>> I wil

Re: Exceptions on 0.7.0

2011-02-22 Thread shimi
>>>> >>>> >>>> Aaron >>>> >>> I will. I hope the data was not corrupted. >>> >>> >>> >>> On Thu, Feb 10, 2011 at 10:30 AM, aaron morton >>> wrote: >>> >>>> Shimi, >>>> You

Re: Exceptions on 0.7.0

2011-02-22 Thread David Boxenhorn
> >>> Shimi, >>> You may be seeing the result of CASSANDRA-1992, are you able to test with >>> the most recent 0.7 build ? >>> https://hudson.apache.org/hudson/job/Cassandra-0.7/ >>> >>> >>> Aaron >>> >>> On 10 Feb 2

Re: frequent client exceptions on 0.7.0

2011-02-21 Thread Peter Schuller
> AFAIK the MemtablePostFlusher is the TP writing sstables, if it has a queue > then there is the potential for writes to block while it waits for Memtables > to be flushed. Take a look at your Memtable settings per CF, could it be that > all the Memtables are flushing at once? There is info in

Re: frequent client exceptions on 0.7.0

2011-02-20 Thread Aaron Morton
AFAIK the MemtablePostFlusher is the TP writing sstables, if it has a queue then there is the potential for writes to block while it waits for Memtables to be flushed. Take a look at your Memtable settings per CF, could it be that all the Memtables are flushing at once? There is info in the logs

Re: frequent client exceptions on 0.7.0

2011-02-18 Thread Andy Skalet
On Thu, Feb 17, 2011 at 12:22 PM, Aaron Morton wrote: > Messages being dropped means the node is overloaded. Look at the > thread pool stats to see which thread pools have queues. It may be IO > related, so also check the read and write latency on the CF and use iostat. > > I would try th

Re: frequent client exceptions on 0.7.0

2011-02-17 Thread Aaron Morton
t > were not worth it. I have ended up running the nodes closer to the wire and > living with an increased rate of client side exceptions and nodes going down > for short periods. > > Dan > > -Original Message- > From: Andy Skalet [mailto:aeska...@bitjug.com] >

RE: frequent client exceptions on 0.7.0

2011-02-17 Thread Dan Hendry
tions and nodes going down for short periods. Dan -Original Message- From: Andy Skalet [mailto:aeska...@bitjug.com] Sent: February-17-11 4:18 To: Peter Schuller Cc: user@cassandra.apache.org Subject: Re: frequent client exceptions on 0.7.0 On Thu, Feb 17, 2011 at 12:37 AM, Peter Schulle

Re: frequent client exceptions on 0.7.0

2011-02-17 Thread Andy Skalet
On Thu, Feb 17, 2011 at 12:37 AM, Peter Schuller wrote: > Bottom line: Check /var/log/cassandra/system.log to begin with and see > if it's reporting anything or being restarted. Thanks, Peter. In the system.log, I see quite a few of these across several machines. Everything else in the log is I

Re: frequent client exceptions on 0.7.0

2011-02-17 Thread Peter Schuller
>   raise EOFError() > EOFError [snip] > error: [Errno 104] Connection reset by peer Sounds like you either have a firewalling/networking issue that is tearing down TCP connections, or your Cassandra node is dying. Have you checked the Cassandra system log? A frequent mistake is configuring mem

frequent client exceptions on 0.7.0

2011-02-16 Thread Andy Skalet
Hello, We were occasionally experiencing client exceptions with 0.6.3, so we upgraded to 0.7.0 a couple weeks ago, but unfortunately we now get more client exceptions, and more frequently. Also, occasionally nodetool ring will show a node Down even though cassandra is still running and the node w

Re: Exceptions on 0.7.0

2011-02-10 Thread Aaron Morton
Can someone with a better understanding of CASSANDRA-1992 jump in? Aaron On 11 Feb, 2011, at 02:51 AM, Attila Babo wrote: The same problem here, even with apache-cassandra-2011-02-10_06-30-00-bin.tar.gz from hudson. I'm happy to share the full log if needed or run tests to identify the core problem

Re: Exceptions on 0.7.0

2011-02-10 Thread Attila Babo
The same problem here, even with apache-cassandra-2011-02-10_06-30-00-bin.tar.gz from hudson. I'm happy to share the full log if needed or run tests to identify the core problem, which looks like an overflow to me. Database was upgraded from 0.6.8, there were no problems with it before. /Attila -

Re: Exceptions on 0.7.0

2011-02-10 Thread shimi
? >> https://hudson.apache.org/hudson/job/Cassandra-0.7/ >> >> >> Aaron >> >> On 10 Feb 2011, at 13:42, Dan Hendry wrote: >> >> Out of curiosity, do you really have on the order of 1,986,622,313 >> elements (I believe elements=keys) in the cf? >>

Re: Exceptions on 0.7.0

2011-02-10 Thread aaron morton
ut of curiosity, do you really have on the order of 1,986,622,313 elements >> (I believe elements=keys) in the cf? >> >> Dan >> >> From: shimi [mailto:shim...@gmail.com] >> Sent: February-09-11 15:06 >> To: user@cassandra.apache.org >> Subject: Excepti

Re: Exceptions on 0.7.0

2011-02-10 Thread shimi
86,622,313 elements > (I believe elements=keys) in the cf? > > Dan > > *From:* shimi [mailto:shim...@gmail.com] > *Sent:* February-09-11 15:06 > *To:* user@cassandra.apache.org > *Subject:* Exceptions on 0.7.0 > > I have a 4 node test cluster where I test the port to 0.7.0 from

Re: Exceptions on 0.7.0

2011-02-10 Thread aaron morton
313 elements > (I believe elements=keys) in the cf? > > Dan > > From: shimi [mailto:shim...@gmail.com] > Sent: February-09-11 15:06 > To: user@cassandra.apache.org > Subject: Exceptions on 0.7.0 > > I have a 4 node test cluster where I test the port to 0.7.0 from

RE: Exceptions on 0.7.0

2011-02-09 Thread Dan Hendry
Out of curiosity, do you really have on the order of 1,986,622,313 elements (I believe elements=keys) in the cf? Dan From: shimi [mailto:shim...@gmail.com] Sent: February-09-11 15:06 To: user@cassandra.apache.org Subject: Exceptions on 0.7.0 I have a 4 node test cluster where I test

Exceptions on 0.7.0

2011-02-09 Thread shimi
I have a 4 node test cluster where I test the port to 0.7.0 from 0.6.X. On 3 out of the 4 nodes I get exceptions in the log. I am using RP. Changes that I made: 1. changed the replication factor from 3 to 4 2. configured the nodes to use Dynamic Snitch 3. RR of 0.33 I run repair on 2 nodes before I