Found the reason.
IncomingTCPConnection.run() hit an exception and the thread
terminated. The next incarnation of the thread did not come up until
20 seconds later, which caused TimedOutException and
UnavailableException errors on the clients.
WARN [Thread-28] 2011-09-28 02:17:57,561
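To illustrate the failure mode described above, here is a minimal sketch in Python. It is not Cassandra's Java code, and `run_handler`, `inbox`, and `handle` are made-up names. The idea: if an exception escapes the read loop, the thread ends and nothing is processed until a new one starts, whereas catching the error per message keeps the thread alive.

```python
import logging
import queue
import threading

log = logging.getLogger("handler")


def run_handler(inbox, handle, processed):
    """Consume messages until a None sentinel arrives.

    A failure in handle() is logged and skipped, so one bad message
    does not end the thread.
    """
    while True:
        msg = inbox.get()
        if msg is None:
            break
        try:
            processed.append(handle(msg))
        except Exception:
            # Hypothetical policy: log and keep going instead of dying.
            log.warning("error handling %r; continuing", msg, exc_info=True)


def demo():
    """Feed one bad message (0) between two good ones."""
    inbox = queue.Queue()
    processed = []
    for item in (1, 0, 2, None):
        inbox.put(item)
    worker = threading.Thread(
        target=run_handler,
        args=(inbox, lambda m: 10 // m, processed),
    )
    worker.start()
    worker.join()
    return processed
```

Calling `demo()` shows that the message after the failing one is still processed. Whether swallowing the error is appropriate depends on whether the connection state is still usable after the exception.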
ITC threads are started as soon as the other party connects.
On Tue, Sep 27, 2011 at 9:22 PM, Yang tedd...@gmail.com wrote:
found the reason.
the IncomingTCPConnection.run() hit an exception and the thread
terminated. the next incarnation of the thread did not come up until
20 seconds
It looks like the new conns are created by the sending side
(OutboundTCPconnection.java) when it detects an IOException on
write(),
Since these timeouts happen rather frequently, about 10 to 20 times
per hour, I wonder whether it's really due to the network in EC2, and I
would really like some way to ascertain
I have this happening on 0.8.x. It looks to me as if this happens when the node
is under heavy load such as unthrottled compactions or a huge GC.
2011/9/24 Yang tedd...@gmail.com
I'm using 1.0.0
There seem to be too many node Up/Dead events detected by the failure
detector.
I'm using a 2
On 25.9.2011 9:29, Philippe wrote:
I have this happening on 0.8.x. It looks to me as if this happens when the
node is under heavy load such as unthrottled compactions or a huge GC.
I have this problem too. Node-down detection must be improved:
increase the timeouts a bit or make more attempts.
On 25.9.2011 14:31, Radim Kolar wrote:
On 25.9.2011 9:29, Philippe wrote:
I have this happening on 0.8.x. It looks to me as if this happens when
the node is under heavy load such as unthrottled compactions or a
huge GC.
I have this problem too. Node-down detection must be improved -
On Sat, Sep 24, 2011 at 4:54 PM, Yang tedd...@gmail.com wrote:
I'm using 1.0.0
There seem to be too many node Up/Dead events detected by the failure
detector.
I'm using a 2 node cluster on EC2, in the same region, same security
group, so I assume the message drop
rate should be
Thanks Brandon.
I suspected that, but I think that's ruled out, since
I set up another background job to do
echo | nc other_box 7000
in a loop.
This job seems to work fine all the time, so the network seems fine.
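For anyone who wants timestamps and connect latency from such a probe, a rough Python equivalent of the `nc` loop might look like the following. This is only a sketch: `probe` and `probe_loop` are made-up names, and the timeout and interval values are arbitrary assumptions.

```python
import socket
import time


def probe(host, port, timeout=2.0):
    """Try one TCP connect; return (ok, seconds_taken)."""
    start = time.monotonic()
    try:
        # Open and immediately close a connection, like `echo | nc host port`.
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        ok = False
    return ok, time.monotonic() - start


def probe_loop(host, port, interval=1.0, count=None):
    """Probe repeatedly, printing one timestamped line per attempt."""
    attempts = 0
    while count is None or attempts < count:
        ok, took = probe(host, port)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        status = "ok" if ok else "FAILED"
        print(f"{stamp} {host}:{port} {status} {took:.3f}s")
        attempts += 1
        time.sleep(interval)
```

For example, `probe_loop("other_box", 7000)` runs until interrupted. Note that a successful connect only shows that the port accepts connections at that moment; it says nothing about whether messages on an already established connection are delivered on time.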
Yang
On Sun, Sep 25, 2011 at 10:39 AM, Brandon Williams
On Sun, Sep 25, 2011 at 12:52 PM, Yang tedd...@gmail.com wrote:
Thanks Brandon.
I suspected that, but I think that's precluded as a possibility since
I set up another background job to do
echo | nc other_box 7000
in a loop,
this job seems to be working fine all the time, so network seems
Thanks Brandon.
I'll try this.
But you can also see my later post regarding message drops:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201109.mbox/%3ccaanh3_8aehidyh9ybt82_emh3likbcdsenrak3jhfzaj2l+...@mail.gmail.com%3E
that seems to show something in either code or background load
On Sun, Sep 25, 2011 at 1:10 PM, Yang tedd...@gmail.com wrote:
Thanks Brandon.
I'll try this.
but you can also see my later post regarding message drop :
I'm using 1.0.0
There seem to be too many node Up/Dead events detected by the failure
detector.
I'm using a 2 node cluster on EC2, in the same region, same security
group, so I assume the message drop
rate should be fairly low.
But about every 5 minutes, I'm seeing some node detected as