[jira] [Updated] (CASSANDRA-4223) Non Unique Streaming session ID's

2012-05-10 Thread Aaron Morton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Morton updated CASSANDRA-4223:


Attachment: 4223_counter_session_id-V2.diff

4223_counter_session_id-V2.diff 

Uses the stream source flag as discussed. Added the flags to StreamHeader so 
they are kept together. 


 Non Unique Streaming session ID's
 -

 Key: CASSANDRA-4223
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4223
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu 10.04.2 LTS
 java version 1.6.0_24
 Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
 Bare metal servers from 
 https://www.stormondemand.com/servers/baremetal.html 
 The servers run on a custom hypervisor.
  
Reporter: Aaron Morton
Assignee: Aaron Morton
  Labels: datastax_qa
 Fix For: 1.0.11, 1.1.1

 Attachments: 4223_counter_session_id-V2.diff, 
 4223_counter_session_id.diff, NanoTest.java, fmm streaming bug.txt


 I have observed repair processes failing due to duplicate streaming session 
 IDs. In this installation the problem is preventing rebalance from completing. 
 I believe it has also prevented repair from completing in the past. 
 The attached streaming-logs.txt file contains log messages and an explanation 
 of what was happening during a repair operation. It contains the evidence for 
 the duplicate session IDs.
 The duplicate session IDs were generated on the repairing node and sent to 
 the streaming node. The streaming source replaced the first session with the 
 second, which resulted in both sessions failing when the first FILE_FINISHED 
 message was received. 
 The errors were:
 {code:java}
 DEBUG [MiscStage:1] 2012-05-03 21:40:33,997 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db', action=FILE_FINISHED)
 ERROR [MiscStage:1] 2012-05-03 21:40:34,027 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:1,5,main]
 java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db but is null
     at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
     at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
     at java.lang.Thread.run(Unknown Source)
 {code}
 and
 {code:java}
 DEBUG [MiscStage:2] 2012-05-03 21:40:36,497 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db', action=FILE_FINISHED)
 ERROR [MiscStage:2] 2012-05-03 21:40:36,497 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:2,5,main]
 java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db but is null
     at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
     at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
     at java.lang.Thread.run(Unknown Source)
 {code}
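
 The way a duplicate ID causes one session to replace another can be illustrated with a small sketch. It assumes, hypothetically, that the streaming source keeps its sessions in a map keyed by host and session ID. The class name, the key format, and the host address below are made up for illustration and are not Cassandra's actual code; only the session ID is taken from the log above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a session registry; not Cassandra's real classes.
final class SessionRegistrySketch {
    // Key is "host/sessionId", mirroring a lookup by (host, session ID).
    private final Map<String, String> sessions = new HashMap<String, String>();

    // Registers a session and returns the description of the session it
    // replaced, or null if the key was not in use.
    String register(String host, long sessionId, String description) {
        return sessions.put(host + "/" + sessionId, description);
    }

    String lookup(String host, long sessionId) {
        return sessions.get(host + "/" + sessionId);
    }

    public static void main(String[] args) {
        SessionRegistrySketch registry = new SessionRegistrySketch();
        long duplicateId = 26132848816442266L; // session ID from the log above
        String host = "10.0.0.1";              // hypothetical address

        registry.register(host, duplicateId, "FMM_Studio/PartsData");
        // The second registration silently replaces the first one.
        String replaced = registry.register(host, duplicateId, "OpsCenter/rollups7200");

        System.out.println("replaced: " + replaced);
        System.out.println("now registered: " + registry.lookup(host, duplicateId));
    }
}
```

 Once the first session has been replaced, a reply that refers to the first session's file is checked against the second session's state, which is consistent with the IllegalStateException shown above.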
 I think this is because System.nanoTime() is used for the session ID when 
 creating the StreamInSession objects (driven from 
 StorageService.requestRanges()). 
 From the documentation 
 (http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime()): 
 {quote}
 This method provides nanosecond precision, but not necessarily nanosecond 
 accuracy. No guarantees are made about how frequently values change. 
 {quote}
 There is also some information on clocks and timers here: 
 https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks
 The hypervisor may be at fault here, but it seems we cannot rely on 
 successive calls to nanoTime() to return different values. 
 To avoid message/interface changes on the StreamHeader it would be good to 
 keep the session ID as a long. The simplest approach may be to make successive 
 calls to nanoTime() until the result changes. We could fail if a certain 
 number of milliseconds has passed. 
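
 A minimal sketch of the "call nanoTime() until the result changes" idea follows. It is not taken from any attached patch. The class and method names are hypothetical, and using System.currentTimeMillis() for the timeout is an assumption made here so that the deadline still works if nanoTime() stops advancing.

```java
// Hypothetical sketch; names are illustrative and not part of Cassandra.
final class SpinningSessionIdSketch {
    private boolean issued = false; // true once an ID has been handed out
    private long lastId;            // last ID handed out, valid only if issued

    // Returns a nanoTime() value that is later than the previous ID, spinning
    // until the clock advances. Throws if that takes longer than timeoutMillis.
    synchronized long nextId(long timeoutMillis) {
        // Assumption: wall-clock time is used for the deadline, because the
        // clock being waited on is the one suspected of not advancing.
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long candidate = System.nanoTime();
        // nanoTime() values are compared by difference, as its Javadoc advises.
        while (issued && candidate - lastId <= 0) {
            if (System.currentTimeMillis() > deadline) {
                throw new IllegalStateException(
                    "nanoTime() did not advance within " + timeoutMillis + " ms");
            }
            candidate = System.nanoTime();
        }
        issued = true;
        lastId = candidate;
        return candidate;
    }
}
```

 This only makes IDs unique within one JVM, and every caller pays for the spin while the clock is stalled, which is one reason a plain counter may be simpler.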
 

[jira] [Updated] (CASSANDRA-4223) Non Unique Streaming session ID's

2012-05-09 Thread Aaron Morton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Morton updated CASSANDRA-4223:


Attachment: 4223_counter_session_id.diff

Uses an AtomicLong in StreamInSession and another in StreamOutSession to 
generate the session ID. 

Sessions are always looked up by (inet_address, session_id), and in and out 
sessions are kept in their own collections. 
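
As a rough illustration of the counter approach (this is a sketch, not the contents of the attached diff), each session type can draw its IDs from its own AtomicLong. The class, field, and method names below are hypothetical, and starting the counters at zero is an assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of counter-based session IDs; not the attached patch.
final class CounterSessionIdSketch {
    // One counter per direction. Starting at 0 is an assumption of this sketch.
    private static final AtomicLong nextInSessionId = new AtomicLong(0);
    private static final AtomicLong nextOutSessionId = new AtomicLong(0);

    // incrementAndGet() is atomic, so concurrent callers never see the same value.
    static long newInSessionId() {
        return nextInSessionId.incrementAndGet();
    }

    static long newOutSessionId() {
        return nextOutSessionId.incrementAndGet();
    }

    public static void main(String[] args) {
        System.out.println("in:  " + newInSessionId() + ", " + newInSessionId());
        System.out.println("out: " + newOutSessionId() + ", " + newOutSessionId());
    }
}
```

An in session and an out session can end up with the same number under this scheme. That is acceptable only because, as noted above, the two kinds of session live in separate collections and are looked up together with the peer address.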

 Non Unique Streaming session ID's
 -

 Key: CASSANDRA-4223
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4223
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu 10.04.2 LTS
 java version 1.6.0_24
 Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
 Bare metal servers from 
 https://www.stormondemand.com/servers/baremetal.html 
 The servers run on a custom hypervisor.
  
Reporter: Aaron Morton
Assignee: Aaron Morton
  Labels: datastax_qa
 Fix For: 1.0.11, 1.1.1

 Attachments: 4223_counter_session_id.diff, NanoTest.java, fmm 
 streaming bug.txt



[jira] [Updated] (CASSANDRA-4223) Non Unique Streaming session ID's

2012-05-07 Thread Aaron Morton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Morton updated CASSANDRA-4223:


Attachment: NanoTest.java

Test for unique nanoTime() results.
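
The attachment itself is not reproduced in this thread. As a rough idea of what such a check can look like, the sketch below counts how often two consecutive nanoTime() calls return the same value. The class and method names are hypothetical and this is not the attached NanoTest.java.

```java
// Hypothetical sketch; this is not the attached NanoTest.java.
final class NanoTimeDuplicateSketch {
    // Calls System.nanoTime() 'samples' times on the current thread and counts
    // how often a value equals the one returned by the preceding call.
    static int countConsecutiveDuplicates(int samples) {
        int duplicates = 0;
        long previous = System.nanoTime();
        for (int i = 1; i < samples; i++) {
            long current = System.nanoTime();
            if (current == previous) {
                duplicates++;
            }
            previous = current;
        }
        return duplicates;
    }

    public static void main(String[] args) {
        int samples = 1000000;
        int duplicates = countConsecutiveDuplicates(samples);
        System.out.println(duplicates + " duplicates in " + samples + " calls");
    }
}
```

The result depends on the hardware, operating system, and hypervisor, so any non-zero count on a given machine only shows that nanoTime() cannot be treated as a unique ID source there.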

 Non Unique Streaming session ID's
 -

 Key: CASSANDRA-4223
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4223
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.9
 Environment: Ubuntu 10.04.2 LTS
 java version 1.6.0_24
 Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
 Bare metal servers from 
 https://www.stormondemand.com/servers/baremetal.html 
 The servers run on a custom hypervisor.
  
Reporter: Aaron Morton
Assignee: Aaron Morton
  Labels: datastax_qa
 Attachments: NanoTest.java, fmm streaming bug.txt




[jira] [Updated] (CASSANDRA-4223) Non Unique Streaming session ID's

2012-05-07 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4223:
--

         Reviewer: yukim
Affects Version/s:     (was: 1.0.9)
    Fix Version/s: 1.1.1
                   1.0.11

 Non Unique Streaming session ID's
 -

 Key: CASSANDRA-4223
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4223
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu 10.04.2 LTS
 java version 1.6.0_24
 Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
 Bare metal servers from 
 https://www.stormondemand.com/servers/baremetal.html 
 The servers run on a custom hypervisor.
  
Reporter: Aaron Morton
Assignee: Aaron Morton
  Labels: datastax_qa
 Fix For: 1.0.11, 1.1.1

 Attachments: NanoTest.java, fmm streaming bug.txt



[jira] [Updated] (CASSANDRA-4223) Non Unique Streaming session ID's

2012-05-06 Thread Aaron Morton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Morton updated CASSANDRA-4223:


Attachment: fmm streaming bug.txt

 Non Unique Streaming session ID's
 -

 Key: CASSANDRA-4223
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4223
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.0.9
 Environment: Ubuntu 10.04.2 LTS
 java version 1.6.0_24
 Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
 Bare metal servers from 
 https://www.stormondemand.com/servers/baremetal.html 
 The servers run on a custom hypervisor.
  
Reporter: Aaron Morton
Assignee: Aaron Morton
 Attachments: fmm streaming bug.txt


 I have observed repair processes failing due to duplicate streaming session 
 IDs. In this installation the problem is preventing rebalance from completing. 
 I believe it has also prevented repair from completing in the past. 
 The attached streaming-logs.txt file contains log messages and an explanation 
 of what was happening during a repair operation. It contains the evidence for 
 the duplicate session IDs.
 The duplicate session IDs were generated on the repairing node and sent to 
 the streaming node. The streaming source replaced the first session with the 
 second, which resulted in both sessions failing when the first FILE_FINISHED 
 message was received. 
 The errors were:
 {code:java}
 DEBUG [MiscStage:1] 2012-05-03 21:40:33,997 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db', action=FILE_FINISHED)
 ERROR [MiscStage:1] 2012-05-03 21:40:34,027 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:1,5,main]
 java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/FMM_Studio/PartsData-hc-1-Data.db but is null
     at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
     at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
     at java.lang.Thread.run(Unknown Source)
 {code}
 and
 {code:java}
 DEBUG [MiscStage:2] 2012-05-03 21:40:36,497 StreamReplyVerbHandler.java (line 47) Received StreamReply StreamReply(sessionId=26132848816442266, file='/var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db', action=FILE_FINISHED)
 ERROR [MiscStage:2] 2012-05-03 21:40:36,497 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[MiscStage:2,5,main]
 java.lang.IllegalStateException: target reports current file is /var/lib/cassandra/data/OpsCenter/rollups7200-hc-3-Data.db but is null
     at org.apache.cassandra.streaming.StreamOutSession.validateCurrentFile(StreamOutSession.java:195)
     at org.apache.cassandra.streaming.StreamReplyVerbHandler.doVerb(StreamReplyVerbHandler.java:58)
     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
     at java.lang.Thread.run(Unknown Source)
 {code}
 I think this is because System.nanoTime() is used for the session ID when 
 creating the StreamInSession objects (driven from 
 StorageService.requestRanges()). 
 From the documentation 
 (http://docs.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime()): 
 {quote}
 This method provides nanosecond precision, but not necessarily nanosecond 
 accuracy. No guarantees are made about how frequently values change. 
 {quote}
 There is also some information on clocks and timers here: 
 https://blogs.oracle.com/dholmes/entry/inside_the_hotspot_vm_clocks
 The hypervisor may be at fault here, but it seems we cannot rely on 
 successive calls to nanoTime() to return different values. 
 To avoid message/interface changes on the StreamHeader it would be good to 
 keep the session ID as a long. The simplest approach may be to make successive 
 calls to nanoTime() until the result changes. We could fail if a certain 
 number of milliseconds has passed. 
 Hashing the file names and ranges is also a possibility, but more involved. 
 (We may also want to drop latency times that are 0 nanoseconds.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: