[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

2011-06-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058061#comment-13058061
 ] 

Héctor Izquierdo commented on CASSANDRA-2768:
-

I'm on 0.8.1, upgrading from 0.7.6-2, and I have stumbled upon this bug. I can't 
run repair on a node whose disk broke:

INFO [manual-repair-02182a20-5659-4aa0-aab9-2fff430f8a71] 2011-06-30 20:29:51,487 AntiEntropyService.java (line 179) Excluding /10.20.13.80 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
INFO [manual-repair-6b50f51a-f689-4825-bcb9-bebf68664117] 2011-06-30 20:29:51,487 AntiEntropyService.java (line 179) Excluding /10.20.13.76 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
INFO [manual-repair-6b50f51a-f689-4825-bcb9-bebf68664117] 2011-06-30 20:29:51,487 AntiEntropyService.java (line 179) Excluding /10.20.13.77 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
INFO [manual-repair-02182a20-5659-4aa0-aab9-2fff430f8a71] 2011-06-30 20:29:51,487 AntiEntropyService.java (line 179) Excluding /10.20.13.76 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
INFO [manual-repair-c46dd589-ed22-4b7a-809c-d97c094d2354] 2011-06-30 20:29:51,487 AntiEntropyService.java (line 179) Excluding /10.20.13.80 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
INFO [manual-repair-c46dd589-ed22-4b7a-809c-d97c094d2354] 2011-06-30 20:29:51,488 AntiEntropyService.java (line 179) Excluding /10.20.13.79 from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.
INFO [manual-repair-02182a20-5659-4aa0-aab9-2fff430f8a71] 2011-06-30 20:29:51,488 AntiEntropyService.java (line 782) No neighbors to repair with for sbs on (141784319550391026443072753096570088105,170141183460469231731687303715884105727]: manual-repair-02182a20-5659-4aa0-aab9-2fff430f8a71 completed.
INFO [manual-repair-6b50f51a-f689-4825-bcb9-bebf68664117] 2011-06-30 20:29:51,487 AntiEntropyService.java (line 782) No neighbors to repair with for sbs on (170141183460469231731687303715884105727,28356863910078205288614550619314017621]: manual-repair-6b50f51a-f689-4825-bcb9-bebf68664117 completed.
INFO [manual-repair-c46dd589-ed22-4b7a-809c-d97c094d2354] 2011-06-30 20:29:51,488 AntiEntropyService.java (line 782) No neighbors to repair with for sbs on (113427455640312821154458202477256070484,141784319550391026443072753096570088105]: manual-repair-c46dd589-ed22-4b7a-809c-d97c094d2354 completed.


All nodes are on 0.8.1.
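
When many repair sessions log at once, it can help to reduce the log to the set of endpoints that were excluded. The following is a minimal sketch, not part of Cassandra: the class name ExcludedEndpoints and its method are hypothetical, and the pattern assumes the message text is exactly as it appears in the log lines quoted in this comment, with one complete entry per input line.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: collects the endpoints that
// the AntiEntropyService log reports as excluded from repair.
class ExcludedEndpoints {
    // Matches the message text as it appears in the quoted log lines.
    private static final Pattern EXCLUDING = Pattern.compile(
            "Excluding /(\\S+) from repair because it is on version 0\\.7 or sooner");

    // Returns the distinct excluded addresses in order of first appearance.
    // Each element of logLines is expected to hold one complete log entry.
    static Set<String> find(List<String> logLines) {
        Set<String> endpoints = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher m = EXCLUDING.matcher(line);
            if (m.find()) {
                endpoints.add(m.group(1));
            }
        }
        return endpoints;
    }
}
```

Feeding it the lines of system.log gives the addresses to compare against the ring; an endpoint excluded by several sessions is listed once.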

 AntiEntropyService excluding nodes that are on version 0.7 or sooner
 

 Key: CASSANDRA-2768
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2768
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
 Environment: 4 node environment -- 
 Originally 0.7.6-2 with a Keyspace defined with RF=3
 Upgraded all nodes ( 1 at a time ) to version 0.8.0:  For each node, the node 
 was shut down, new version was turned on, using the existing data files / 
 directories and a nodetool repair was run.  
Reporter: Sasha Dolgy

 When I run nodetool repair on any of the nodes, the 
 /var/log/cassandra/system.log reports errors similar to:
 INFO [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 
 21:28:39,877 AntiEntropyService.java (line 177) Excluding /10.128.34.18 from 
 repair because it is on version 0.7 or sooner. You should consider updating 
 this node before running repair again.
 ERROR [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 
 21:28:39,877 AbstractCassandraDaemon.java (line 113) Fatal exception in 
 thread Thread[manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec,5,RMI 
 Runtime]
 java.util.ConcurrentModificationException
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
   at java.util.HashMap$KeyIterator.next(HashMap.java:828)
   at 
 org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:173)
   at 
 org.apache.cassandra.service.AntiEntropyService$RepairSession.run(AntiEntropyService.java:776)
 The INFO message and subsequent ERROR message are logged for 2 nodes .. I 
 suspect that this is because RF=3.  
 nodetool ring shows that all nodes are up.  
 Client connections (read / write) are not having issues..  
 nodetool version on all nodes shows that each node is 0.8.0
 At suggestion of some contributors, I have restarted each node and tried to 
 run a nodetool repair 

[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

2011-06-14 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049078#comment-13049078
 ] 

Sylvain Lebresne commented on CASSANDRA-2768:
-

The important point here is that this is not a repair-specific thing per se. The 
important part of the log is the 'Excluding ...' message.
It is triggered by the following code in AES.getNeighbors:
{noformat}
  if (Gossiper.instance.getVersion(endpoint) <= MessagingService.VERSION_07)
  {
      logger.info("Excluding " + endpoint + " from repair because it is on version 0.7 or sooner. You should consider updating this node before running repair again.");
      neighbors.remove(endpoint);
  }
{noformat}
Since Sasha has reportedly verified that all nodes report being on 0.8.0, this 
suggests a Gossiper bug that reports the wrong version (even after node 
restarts).

The exception itself has been fixed in CASSANDRA-2767 and should not be the 
focus of attention here.
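
For readers wondering why the exception appears at all: the stack trace shows a HashMap key iterator inside getNeighbors, and the quoted code removes from the collection it is filtering. The sketch below reproduces that general pattern in isolation. It is an illustration under assumptions, not the Cassandra source and not necessarily how CASSANDRA-2767 fixed it; the class NeighborFilter, the String endpoints, the versionOf lookup, and the placeholder value of VERSION_07 are all hypothetical.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.function.ToIntFunction;

// Simplified stand-in for the filtering step; names and types are illustrative.
class NeighborFilter {
    static final int VERSION_07 = 1; // placeholder value, an assumption

    // Removes from the set inside a for-each loop over the same set. Once a
    // removal has happened and the loop asks for another element, the
    // iterator throws ConcurrentModificationException.
    static void removeOldUnsafe(Set<String> neighbors, ToIntFunction<String> versionOf) {
        for (String endpoint : neighbors) {
            if (versionOf.applyAsInt(endpoint) <= VERSION_07) {
                neighbors.remove(endpoint);
            }
        }
    }

    // Safe variant: remove through the iterator, which keeps its own
    // modification bookkeeping consistent.
    static void removeOldSafe(Set<String> neighbors, ToIntFunction<String> versionOf) {
        Iterator<String> it = neighbors.iterator();
        while (it.hasNext()) {
            if (versionOf.applyAsInt(it.next()) <= VERSION_07) {
                it.remove();
            }
        }
    }
}
```

With a HashSet of three endpoints that all look old, removeOldUnsafe fails on the second iteration, while removeOldSafe simply empties the set. This matches the observation that the exception only shows up when an 'Excluding ...' line has just been logged.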

 AntiEntropyService excluding nodes that are on version 0.7 or sooner
 

 Key: CASSANDRA-2768
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2768
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
 Environment: 4 node environment -- 
 Originally 0.7.6-2 with a Keyspace defined with RF=3
 Upgraded all nodes ( 1 at a time ) to version 0.8.0:  For each node, the node 
 was shut down, new version was turned on, using the existing data files / 
 directories and a nodetool repair was run.  
Reporter: Sasha Dolgy
Assignee: Sylvain Lebresne

 When I run nodetool repair on any of the nodes, the 
 /var/log/cassandra/system.log reports errors similar to:
 INFO [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 
 21:28:39,877 AntiEntropyService.java (line 177) Excluding /10.128.34.18 from 
 repair because it is on version 0.7 or sooner. You should consider updating 
 this node before running repair again.
 ERROR [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 
 21:28:39,877 AbstractCassandraDaemon.java (line 113) Fatal exception in 
 thread Thread[manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec,5,RMI 
 Runtime]
 java.util.ConcurrentModificationException
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
   at java.util.HashMap$KeyIterator.next(HashMap.java:828)
   at 
 org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:173)
   at 
 org.apache.cassandra.service.AntiEntropyService$RepairSession.run(AntiEntropyService.java:776)
 The INFO message and subsequent ERROR message are logged for 2 nodes .. I 
 suspect that this is because RF=3.  
 nodetool ring shows that all nodes are up.  
 Client connections (read / write) are not having issues..  
 nodetool version on all nodes shows that each node is 0.8.0
 At suggestion of some contributors, I have restarted each node and tried to 
 run a nodetool repair again ... the result is the same with the messages 
 being logged.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

2011-06-14 Thread Sasha Dolgy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049214#comment-13049214
 ] 

Sasha Dolgy commented on CASSANDRA-2768:


I'd like to add that we are also using Ec2Snitch across the four nodes, 
although all four nodes are in the Singapore EC2 region.





[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

2011-06-14 Thread Sasha Dolgy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049243#comment-13049243
 ] 

Sasha Dolgy commented on CASSANDRA-2768:


Hi ... able to give more information now:


cassandra:~$ nodetool ring
Address         Status State   Load       Owns    Token
                                                  170141183460469231731687303715884105726
10.128.103.148  Up     Normal  961.38 KB  11.22%  1909554714494251628118265338228798
10.128.94.227   Up     Normal  667.56 KB  22.11%  56713727820156410577229101238628035242
10.128.34.18    Up     Normal  688.1 KB   33.33%  113427455640312821154458202477256070484
10.128.90.109   Up     Normal  965.76 KB  33.33%  170141183460469231731687303715884105726

Not a lot of data.  I created a new keyspace (RF=2) and dropped the old one.  
I ran repair on the nodes, and now I no longer get the error on some of the 
nodes. 

I can confirm again all systems are reporting:  ReleaseVersion: 0.8.0  from 
'nodetool version'

I am seeing this error on two of the nodes:   

ERROR [pool-2-thread-14] 2011-06-14 23:33:40,544 CustomTThreadPoolServer.java (line 199) Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
ERROR [pool-2-thread-16] 2011-06-14 23:33:42,024 CustomTThreadPoolServer.java (line 199) Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

109 and 148 look to be communicating fine.

18 -- 109 (version error)
18 -- 227 (version error)
227 -- 18 (version error)
227 -- 148 (version error)

For my sanity, I checked and confirmed that all four instances are part of the 
same security group and that there are firewall rules allowing communication 
between all four nodes on ports 7000 and 9090.

Configuration on all nodes is standard with the following exceptions:


#listen_address: localhost
endpoint_snitch: org.apache.cassandra.locator.Ec2Snitch


 AntiEntropyService excluding nodes that are on version 0.7 or sooner
 

 Key: CASSANDRA-2768
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2768
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
 Environment: 4 node environment -- 
 Originally 0.7.6-2 with a Keyspace defined with RF=3
 Upgraded all nodes ( 1 at a time ) to version 0.8.0:  For each node, the node 
 was shut down, new version was turned on, using the existing data files / 
 directories and a nodetool repair was run.  
Reporter: Sasha Dolgy
Assignee: Brandon Williams

 When I run nodetool repair on any of the nodes, the 
 /var/log/cassandra/system.log reports errors similar to:
 INFO [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 
 21:28:39,877 AntiEntropyService.java (line 177) Excluding /10.128.34.18 from 
 repair because it is on version 0.7 or sooner. You should consider updating 
 this node before running repair again.
 ERROR [manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec] 2011-06-13 
 21:28:39,877 AbstractCassandraDaemon.java (line 113) Fatal exception in 
 thread Thread[manual-repair-1c6b33bc-ef14-4ec8-94f6-f1464ec8bdec,5,RMI 
 Runtime]
 java.util.ConcurrentModificationException
   at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
   at java.util.HashMap$KeyIterator.next(HashMap.java:828)
   at 
 org.apache.cassandra.service.AntiEntropyService.getNeighbors(AntiEntropyService.java:173)
   at 
 org.apache.cassandra.service.AntiEntropyService$RepairSession.run(AntiEntropyService.java:776)
 The INFO message 

[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

2011-06-14 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049326#comment-13049326
 ] 

Brandon Williams commented on CASSANDRA-2768:
-

bq. Since Sasha has reportedly verified that all node report being on 0.8.0, 
this suggests a Gossiper bug that reports the wrong version (even after node 
restarts).

Gossiper's setVersion and getVersion are fairly straightforward. setVersion is 
called in IncomingTcpConnection, which sets the version to whatever the remote 
node reported, so a bug here looks unlikely.  The version information is not 
persisted anywhere, so the remote node has to be indicating it is 0.7.  I was 
unable to reproduce this following Sasha's steps, so I think the most likely 
explanation is that a node is mistakenly still on 0.7.
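
The behaviour described above (a version recorded only when a peer connects, held in memory, and lost on restart) can be modelled in a few lines. This is an illustrative sketch only, not the Gossiper source: the class PeerVersions is hypothetical, endpoints are plain strings, and because the thread does not say what the real getVersion returns for an endpoint it has not heard from, the sketch returns an empty OptionalInt rather than guessing a default.

```java
import java.util.Map;
import java.util.OptionalInt;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative model only; this is not the Gossiper implementation.
class PeerVersions {
    // Held only in memory, so it starts out empty after every restart.
    private final Map<String, Integer> versions = new ConcurrentHashMap<>();

    // Called when a peer opens a connection and announces its version.
    void setVersion(String endpoint, int version) {
        versions.put(endpoint, version);
    }

    // Empty until the peer has connected at least once in this process lifetime.
    OptionalInt getVersion(String endpoint) {
        Integer v = versions.get(endpoint);
        return v == null ? OptionalInt.empty() : OptionalInt.of(v);
    }
}
```

In this model, whatever a node answers about a peer depends entirely on what that peer last announced to it since startup, which is why a wrong answer points at the remote node's announcement rather than at stale stored state.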

{quote}
I am seeing this error on two of the nodes:

ERROR [pool-2-thread-14] 2011-06-14 23:33:40,544 CustomTThreadPoolServer.java 
(line 199) Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in 
readMessageBegin, old client?
{quote}

This is indicative of a client-side thrift compatibility problem and is 
unrelated, as thrift is not used for internode communication.





[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

2011-06-14 Thread Sasha Dolgy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049411#comment-13049411
 ] 

Sasha Dolgy commented on CASSANDRA-2768:


I can assure you they are all on 0.8.0, if the output of nodetool version is 
correct:

cassandra@ip-10-128-34-18:~$ nodetool -h 10.128.34.18 -p 9090 ring
Address         Status State   Load       Owns    Token
                                                  170141183460469231731687303715884105726
10.128.103.148  Up     Normal  1.02 MB    11.22%  1909554714494251628118265338228798
10.128.94.227   Up     Normal  667.56 KB  22.11%  56713727820156410577229101238628035242
10.128.34.18    Up     Normal  688.1 KB   33.33%  113427455640312821154458202477256070484
10.128.90.109   Up     Normal  1.11 MB    33.33%  170141183460469231731687303715884105726
cassandra@ip-10-128-34-18:~$

cassandra@ip-10-128-34-18:~$ nodetool -h 10.128.34.18 -p 9090 version
ReleaseVersion: 0.8.0
cassandra@ip-10-128-34-18:~$


cassandra@ip-10-128-94-227:~$ nodetool -h 10.128.94.227 -p 9090 version
ReleaseVersion: 0.8.0
cassandra@ip-10-128-94-227:~$


cassandra@ip-10-128-90-109:~$ nodetool -h 10.128.90.109 -p 9090 version
ReleaseVersion: 0.8.0
cassandra@ip-10-128-90-109:~$


cassandra@ip-10-128-103-148:~$ nodetool -h 10.128.103.148 -p 9090 version
ReleaseVersion: 0.8.0
cassandra@ip-10-128-103-148:~$
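
Comparing these outputs by eye works for four nodes; for more, the check can be scripted. The sketch below assumes the output format shown above (a line beginning with "ReleaseVersion:") and takes the captured output per host as input; the class ReleaseVersions and its methods are hypothetical, and how the output is collected is left out. Note that this only checks what each node says about itself, which is a different thing from the version its peers have recorded for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups hosts by the release version they report.
class ReleaseVersions {
    private static final String PREFIX = "ReleaseVersion:";

    // Extracts "0.8.0" from "ReleaseVersion: 0.8.0"; null if no such line exists.
    static String parse(String nodetoolOutput) {
        for (String line : nodetoolOutput.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith(PREFIX)) {
                return trimmed.substring(PREFIX.length()).trim();
            }
        }
        return null;
    }

    // Maps each reported version to the hosts reporting it. More than one key
    // means the cluster is mixed; unparseable output is filed under "unknown".
    static Map<String, List<String>> groupByVersion(Map<String, String> outputByHost) {
        Map<String, List<String>> byVersion = new TreeMap<>();
        for (Map.Entry<String, String> entry : outputByHost.entrySet()) {
            String version = parse(entry.getValue());
            String key = version == null ? "unknown" : version;
            byVersion.computeIfAbsent(key, k -> new ArrayList<>()).add(entry.getKey());
        }
        return byVersion;
    }
}
```

For the four outputs above, groupByVersion would yield a single key, 0.8.0, listing all four hosts.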






[jira] [Commented] (CASSANDRA-2768) AntiEntropyService excluding nodes that are on version 0.7 or sooner

2011-06-14 Thread Sasha Dolgy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049421#comment-13049421
 ] 

Sasha Dolgy commented on CASSANDRA-2768:


After dropping the old keyspace and creating the new one, I hadn't cycled 
through and stopped / started each node.  I have just done that now and all 
appears to be fine.  No more errors in the logs.  This problem was 100% there 
on version 0.8.0 with the old keyspace.  
