If you want this to be part of the JIRA record, you need to add it as a comment on the issue; JIRA is not configured to turn emails into comments automatically.
On Sun, Dec 27, 2009 at 11:07 PM, Michael Lee <mail.list.steel.men...@gmail.com> wrote:
> I confirmed this issue with the following tests.
>
> The cluster contains 8 nodes and holds about 10000 rows (keys ranging from 1 to 10000):
>
> Address       Status     Load        Range                                        Ring
>                                      170141183460469231731687303715884105728
> 10.237.4.85   Up         757.13 MB   21267647932558653966460912964485513216      |<--|
> 10.237.1.135  Up         761.54 MB   42535295865117307932921825928971026432      |   ^
> 10.237.1.137  Up         748.02 MB   63802943797675961899382738893456539648      v   |
> 10.237.1.139  Up         732.36 MB   85070591730234615865843651857942052864      |   ^
> 10.237.1.140  Up         725.6 MB    106338239662793269832304564822427566080     v   |
> 10.237.1.141  Up         726.59 MB   127605887595351923798765477786913079296     |   ^
> 10.237.1.143  Up         728.16 MB   148873535527910577765226390751398592512     v   |
> 10.237.1.144  Up         745.69 MB   170141183460469231731687303715884105728     |-->|
>
> (1) Read keys in the range [1-10000]: all keys read OK (the client sends read requests directly to 10.237.4.85, 10.237.1.137, 10.237.1.140, and 10.237.1.143).
> (2) Turn off 10.237.1.135 while the load keeps running: some read requests time out. After all nodes know that 10.237.1.135 is down (about 10 s later), all read requests succeed again. That's fine.
> (3) Turn 10.237.1.135 back on (and start the cassandra service again): some read requests time out again, and they keep timing out FOREVER, even after all nodes know that 10.237.1.135 is up. That's a PROBLEM!
> (4) Reboot 10.237.1.135: the problem remains.
> (5) Stop the load, reboot the whole cluster, and repeat step 1: everything is fine again...
>
> All read requests use the QUORUM consistency level. The version of Cassandra is apache-cassandra-incubating-0.5.0-beta2; I have also tested apache-cassandra-incubating-0.5.0-RC1, and the problem remains.
>
> After reading system.log, I found that once 10.237.1.135 goes down and comes back up, the other nodes never establish a TCP connection to it (on TCP port 7000) again, and read requests destined for 10.237.1.135 (placed into Pending-Writes because the socket channel is closed) are never sent onto the network (observed with tcpdump).
>
> It seems that when 10.237.1.135 went down in step 2, some socket channels were reset; after 10.237.1.135 came back, those socket channels remained closed, forever.
> ---------END----------
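For context on step (1): client reads in this era went through the Thrift interface, with the consistency level passed on each call. Below is a minimal read-loop sketch, not the reporter's actual test harness; the schema names (Keyspace1, Standard1, a single column named "value") and a replication factor of 3 are assumptions the email does not state, and the generated Thrift classes lived under org.apache.cassandra.service in the 0.5 line.

    // Rough reproduction of step (1): read keys 1..10000 at QUORUM.
    // Keyspace1/Standard1 and the column "value" are assumed; the report
    // does not name its schema.
    import org.apache.cassandra.service.Cassandra;
    import org.apache.cassandra.service.ColumnPath;
    import org.apache.cassandra.service.ConsistencyLevel;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class QuorumReadTest {
        public static void main(String[] args) throws Exception {
            // One of the four contact nodes named in step (1).
            TSocket socket = new TSocket("10.237.4.85", 9160);
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));
            socket.open();
            ColumnPath path = new ColumnPath("Standard1", null, "value".getBytes("UTF-8"));
            for (int key = 1; key <= 10000; key++) {
                // Each read blocks until a quorum of replicas answers,
                // or times out if the quorum cannot be assembled.
                client.get("Keyspace1", Integer.toString(key), path, ConsistencyLevel.QUORUM);
            }
            socket.close();
        }
    }

At QUORUM with RF=3, each read needs two replica responses, which is consistent with step (2): once the failure detector marks 10.237.1.135 down, the remaining replicas satisfy the quorum and reads recover. The bug is that step (3) never recovers the same way.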
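The Pending-Writes observation points at a one-shot outbound connection: messages queue up against a socket channel that was closed when the peer reset it, and nothing ever re-opens the channel after the peer returns. Here is a minimal Java NIO illustration of that failure mode and of the reconnect that prevents it; this is a sketch of the general pattern, not Cassandra's actual inter-node code.

    // Illustration only, NOT Cassandra's inter-node code. It models the
    // reported symptom: writes queued against a closed channel never reach
    // the wire unless something actively reconnects.
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;
    import java.util.ArrayDeque;
    import java.util.Queue;

    class OutboundConnection {
        private final InetSocketAddress peer;   // e.g. 10.237.1.135:7000
        private SocketChannel channel;          // closed after the peer resets it
        private final Queue<ByteBuffer> pending = new ArrayDeque<ByteBuffer>();

        OutboundConnection(InetSocketAddress peer) {
            this.peer = peer;
        }

        synchronized void send(ByteBuffer message) throws IOException {
            pending.add(message);
            if (channel == null || !channel.isOpen()) {
                // Without this reconnect, 'pending' grows forever and nothing
                // is sent, matching what tcpdump showed on port 7000.
                channel = SocketChannel.open(peer);
            }
            while (!pending.isEmpty()) {
                channel.write(pending.peek()); // blocking mode: writes the full buffer
                pending.remove();
            }
        }
    }

Whatever the real fix looks like, the invariant to preserve is that a closed channel plus a live peer must eventually trigger a reconnect rather than indefinite queueing.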
> -----Original Message-----
> From: Jonathan Ellis (JIRA) [mailto:j...@apache.org]
> Sent: Thursday, December 24, 2009 10:47 AM
> To: cassandra-comm...@incubator.apache.org
> Subject: [jira] Updated: (CASSANDRA-651) cassandra 0.5 version throttles and sometimes kills traffic to a node if you restart it.
>
> [ https://issues.apache.org/jira/browse/CASSANDRA-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Jonathan Ellis updated CASSANDRA-651:
> -------------------------------------
>
>     Fix Version/s: 0.5
>          Assignee: Jaakko Laine
>
>> cassandra 0.5 version throttles and sometimes kills traffic to a node if you restart it.
>> -----------------------------------------------------------------------------------------
>>
>>                 Key: CASSANDRA-651
>>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-651
>>             Project: Cassandra
>>          Issue Type: Bug
>>          Components: Core
>>    Affects Versions: 0.5
>>        Environment: latest in 0.5 branch
>>           Reporter: Ramzi Rabah
>>           Assignee: Jaakko Laine
>>            Fix For: 0.5
>>
>>
>> From the cassandra user message board:
>> "I just recently upgraded to the latest in the 0.5 branch, and I am running into a serious issue. I have a cluster with 4 nodes, the RackUnaware strategy, and my own tokens distributed evenly over the hash space. I am writing to and reading from them equally, at a rate of about 230 reads/writes per second (and cfstats shows that). The first 3 nodes are seeds; the last one isn't. When I start all the nodes together at the same time, they all receive equal amounts of reads/writes (about 230).
>> When I bring node 4 down and bring it back up again, node 4's load fluctuates between the 230 it used to get and sometimes no traffic at all. The other 3 still receive the same amount of traffic, and no errors whatsoever are seen in the logs."
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
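A note on the two ring layouts: under RandomPartitioner the token space is [0, 2^127), so "tokens distributed evenly over the hash space" means token_i = i * 2^127 / N, which is exactly what the 8-node ring quoted above shows. A quick sketch of the arithmetic follows; applying it to Rabah's 4-node cluster is my assumption, since he does not list his tokens.

    // Evenly spaced RandomPartitioner tokens: token_i = i * 2^127 / N.
    // N = 8 reproduces the ring quoted earlier (the first token printed is
    // 2^127 / 8 = 21267647932558653966460912964485513216).
    import java.math.BigInteger;

    public class EvenTokens {
        public static void main(String[] args) {
            int n = Integer.parseInt(args[0]);              // number of nodes
            BigInteger ringSize = BigInteger.valueOf(2).pow(127);
            for (int i = 1; i <= n; i++) {
                System.out.println(ringSize.multiply(BigInteger.valueOf(i))
                                           .divide(BigInteger.valueOf(n)));
            }
        }
    }

Running "java EvenTokens 8" prints the eight tokens from the ring above; "java EvenTokens 4" gives the analogous even layout for a 4-node cluster.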