[jira] [Updated] (CASSANDRA-12506) nodetool compactionhistory in it's output should have timestamp in human readable format

2016-08-19 Thread Kenneth Failbus (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Failbus updated CASSANDRA-12506:

Description: 
While running nodetool compactionhistory, the output shows the id and other 
columns. I would also like to see, for each id, the timestamp at which the 
compaction executed, in human-readable format.
So, for example, if the id column were preceded by a human-readable timestamp, 
it would help in understanding when a particular compaction ran and its 
impact on system resources.

  was:
While running nodetool compactionhistory the output shows id and other columns. 
I wanted to also see the timestamp for each id that got executed in human 
readable format
So, e.g. in the output if column id can be preceded by timestamp, it will help 
in understanding when a particular compaction ran and it's impact on the system 
resources.


> nodetool compactionhistory in it's output should have timestamp in human 
> readable format
> --
>
> Key: CASSANDRA-12506
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12506
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
> Environment: AWS
>Reporter: Kenneth Failbus
>Priority: Minor
>
> While running nodetool compactionhistory, the output shows the id and other 
> columns. I would also like to see, for each id, the timestamp at which the 
> compaction executed, in human-readable format.
> So, for example, if the id column were preceded by a human-readable 
> timestamp, it would help in understanding when a particular compaction 
> ran and its impact on system resources.
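The timestamp behind this request is stored as epoch milliseconds (it is visible that way via the system.compaction_history table). Until the tool formats it natively, it can be converted in the shell; a minimal sketch, assuming GNU date on Linux:

```shell
# Sketch: convert an epoch-milliseconds compacted_at value into a
# human-readable UTC timestamp.
# Assumes GNU date (Linux); macOS/BSD date uses -r instead of -d.
to_human() {
  millis="$1"
  # Drop the millisecond part, then format the remaining epoch seconds.
  date -u -d "@$((millis / 1000))" '+%Y-%m-%d %H:%M:%S UTC'
}

to_human 1471564800000   # prints: 2016-08-19 00:00:00 UTC
```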



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12506) nodetool compactionhistory in it's output should have timestamp in human readable form

2016-08-19 Thread Kenneth Failbus (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Failbus updated CASSANDRA-12506:

Description: 
While running nodetool compactionhistory, the output shows the id and other 
columns. I would also like to see, for each id, the timestamp at which the 
compaction executed, in human-readable format.
So, for example, if the id column were preceded by a timestamp, it would help 
in understanding when a particular compaction ran and its impact on the system 
resources.

  was:
While running nodetool compactionhistory the output shows id and other columns. 
I wanted to also see the timestamp for each id that got executed.
So, e.g. in the output if column id can be preceded by timestamp, it will help 
in understanding when a particular compaction ran and it's impact on the system 
resources.


> nodetool compactionhistory in it's output should have timestamp in human 
> readable form
> --
>
> Key: CASSANDRA-12506
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12506
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
> Environment: AWS
>Reporter: Kenneth Failbus
>Priority: Minor
>
> While running nodetool compactionhistory, the output shows the id and other 
> columns. I would also like to see, for each id, the timestamp at which the 
> compaction executed, in human-readable format.
> So, for example, if the id column were preceded by a timestamp, it would 
> help in understanding when a particular compaction ran and its impact on 
> the system resources.





[jira] [Updated] (CASSANDRA-12506) nodetool compactionhistory in it's output should have timestamp in human readable form

2016-08-19 Thread Kenneth Failbus (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Failbus updated CASSANDRA-12506:

Summary: nodetool compactionhistory in it's output should have timestamp in 
human readable form  (was: nodetool compactionhistory in it's output should 
have timestamp)

> nodetool compactionhistory in it's output should have timestamp in human 
> readable form
> --
>
> Key: CASSANDRA-12506
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12506
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
> Environment: AWS
>Reporter: Kenneth Failbus
>Priority: Minor
>
> While running nodetool compactionhistory, the output shows the id and other 
> columns. I would also like to see, for each id, the timestamp at which the 
> compaction executed.
> So, for example, if the id column were preceded by a timestamp, it would 
> help in understanding when a particular compaction ran and its impact on 
> the system resources.





[jira] [Updated] (CASSANDRA-12506) nodetool compactionhistory in it's output should have timestamp in human readable format

2016-08-19 Thread Kenneth Failbus (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Failbus updated CASSANDRA-12506:

Summary: nodetool compactionhistory in it's output should have timestamp in 
human readable format  (was: nodetool compactionhistory in it's output should 
have timestamp in human readable form)

> nodetool compactionhistory in it's output should have timestamp in human 
> readable format
> --
>
> Key: CASSANDRA-12506
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12506
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
> Environment: AWS
>Reporter: Kenneth Failbus
>Priority: Minor
>
> While running nodetool compactionhistory, the output shows the id and other 
> columns. I would also like to see, for each id, the timestamp at which the 
> compaction executed, in human-readable format.
> So, for example, if the id column were preceded by a timestamp, it would 
> help in understanding when a particular compaction ran and its impact on 
> the system resources.





[jira] [Created] (CASSANDRA-12507) Wanted to know if there will be any future work in supporting hybrid snitches

2016-08-19 Thread Kenneth Failbus (JIRA)
Kenneth Failbus created CASSANDRA-12507:
---

 Summary: Wanted to know if there will be any future work in 
supporting hybrid snitches
 Key: CASSANDRA-12507
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12507
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Kenneth Failbus
Priority: Minor


I run a Cassandra cluster on mixed cloud technologies. I wanted to know 
whether I can use EC2MRS as the snitch on the EC2 cloud while using other 
snitch types, e.g. GPFS, in the other cloud environments.
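One way to approximate a hybrid setup today (offered only as a sketch, not a confirmed roadmap item) is to run GossipingPropertyFileSnitch on every node and encode the cloud-specific datacenter/rack in each node's cassandra-rackdc.properties, since mixing Ec2MultiRegionSnitch with other snitch types in one cluster is generally unsupported. The dc/rack names below are hypothetical:

```shell
# Sketch: per-node cassandra-rackdc.properties files for a hybrid cluster
# running GossipingPropertyFileSnitch everywhere (all names hypothetical).
mkdir -p demo/ec2-node demo/onprem-node

# An EC2 node declares an EC2-flavored datacenter name.
cat > demo/ec2-node/cassandra-rackdc.properties <<'EOF'
dc=ec2-us-east
rack=rack1
EOF

# An on-premises node declares its own datacenter name.
cat > demo/onprem-node/cassandra-rackdc.properties <<'EOF'
dc=onprem-dc1
rack=rack1
EOF
```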





[jira] [Created] (CASSANDRA-12506) nodetool compactionhistory in it's output should have timestamp

2016-08-19 Thread Kenneth Failbus (JIRA)
Kenneth Failbus created CASSANDRA-12506:
---

 Summary: nodetool compactionhistory in it's output should have 
timestamp
 Key: CASSANDRA-12506
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12506
 Project: Cassandra
  Issue Type: Improvement
  Components: Compaction
 Environment: AWS
Reporter: Kenneth Failbus
Priority: Minor


While running nodetool compactionhistory, the output shows the id and other 
columns. I would also like to see, for each id, the timestamp at which the 
compaction executed.
So, for example, if the id column were preceded by a timestamp, it would help 
in understanding when a particular compaction ran and its impact on the system 
resources.





[jira] [Updated] (CASSANDRA-8898) sstableloader utility should allow loading of data from mounted filesystem

2016-07-27 Thread Kenneth Failbus (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Failbus updated CASSANDRA-8898:
---
Description: 
When trying to load data from a mounted filesystem onto a new cluster, the 
following exceptions are observed intermittently, and at some point the 
sstableloader process hangs without completing the loading process.

Please note that in my case the scenario was loading the existing sstables from 
an existing cluster onto a brand-new cluster.

Finally, it was found that some hard assumptions had been made by the 
sstableloader utility with respect to responses from the filesystem, which did 
not work with a mounted filesystem.

The work-around was to copy each existing node's sstable data files locally and 
then point sstableloader at that local filesystem to load the data onto the new 
cluster.
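The copy-then-load work-around can be scripted; a minimal sketch (the paths and the target node address are hypothetical):

```shell
# Sketch of the work-around: stage sstables from the mounted filesystem onto
# local disk first, then point sstableloader at the local copy.
stage_sstables() {
  src="$1"   # directory on the mounted filesystem
  dest="$2"  # local staging directory laid out as <keyspace>/<table>
  mkdir -p "$dest"
  cp -R "$src"/. "$dest"/
}

# Usage (hypothetical paths; the sstableloader call is shown, not executed):
# stage_sstables /mnt/old-cluster/MyKeyspace/MyTable /var/tmp/stage/MyKeyspace/MyTable
# sstableloader -d 10.0.0.1 /var/tmp/stage/MyKeyspace/MyTable
```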

When restoring data from backups with sstableloader during disaster recovery, 
this copying of data files to a local filesystem before loading would take a 
long time.

It would be a good enhancement to the sstableloader utility to enable use of a 
mounted filesystem, as copying data locally and then loading is time-consuming.

Below is the exception seen when using the mounted filesystem.
{code}
java.lang.AssertionError: Reference counter -1 for Min-System-jb-5449-Data.db
    at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1146)
    at org.apache.cassandra.streaming.StreamTransferTask.complete(StreamTransferTask.java:74)
    at org.apache.cassandra.streaming.StreamSession.received(StreamSession.java:542)
    at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:424)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
    at java.lang.Thread.run(Thread.java:744)
WARN 21:07:16,853 [Stream #3e5a5ba0-bdef-11e4-a975-5777dbff0945] Stream failed

    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:59)
    at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1406)
    at org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:55)
    at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:59)
    at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:311)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: /opt/tmp/casapp-c1-c00055-g.ch.tvx.comcast.com/MinDataService/System/MinDataService-System-jb-5997-Data.db (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
    at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:55)
    ... 8 more
Exception in thread "STREAM-OUT-/96.115.88.196" java.lang.NullPointerException
    at org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:205)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
    at java.lang.Thread.run(Thread.java:744)
ERROR 20:49:35,646 [Stream #d9fce650-bdf3-11e4-b6c0-252cb9b3e9f3] Streaming error occurred
java.lang.AssertionError: Reference counter -3 for /opt/tmp/casapp-c1-c00055-g.ch.tvx.comcast.com/MinDataService/System/MinDataService-System-jb-4897-Data.db
    at org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1146)
    at org.apache.cassandra.streaming.StreamTransferTask.complete(StreamTransferTask.java:74)
    at org.apache.cassandra.streaming.StreamSession.received(StreamSession.java:542)
    at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:424)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
    at java.lang.Thread.run(Thread.java:744)
Exception in thread "STREAM-IN-/96.115.88.196" java.lang.NullPointerException
    at

[jira] [Commented] (CASSANDRA-10660) Support user-defined compactions through nodetool

2015-11-12 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15002515#comment-15002515
 ] 

Kenneth Failbus commented on CASSANDRA-10660:
-

[~jeromatron] Thanks. I also wrote a script.
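For reference, a script along these lines typically drives the existing JMX operation forceUserDefinedCompaction on the CompactionManager MBean; a sketch, where the jmxterm jar, the JMX endpoint, and the sstable file name are assumptions:

```shell
# Sketch: build the jmxterm command line that triggers a user-defined
# compaction for a given sstable data file via JMX.
build_compaction_cmd() {
  sstable="$1"  # e.g. MyKeyspace-MyTable-jb-42-Data.db (hypothetical name)
  printf 'run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction %s\n' "$sstable"
}

# Usage (hypothetical jar location and JMX endpoint):
# build_compaction_cmd MyKeyspace-MyTable-jb-42-Data.db \
#   | java -jar jmxterm.jar -l localhost:7199
```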

> Support user-defined compactions through nodetool
> -
>
> Key: CASSANDRA-10660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10660
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Tyler Hobbs
>Priority: Minor
>  Labels: lhf
>
> For a long time, we've supported running user-defined compactions through 
> JMX.  This comes in handy fairly often, mostly when dealing with low disk 
> space or tombstone purging, so it would be good to add something to nodetool 
> for this.  An extra option for {{nodetool compact}} would probably suffice.





[jira] [Commented] (CASSANDRA-10660) Support user-defined compactions through nodetool

2015-11-05 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992175#comment-14992175
 ] 

Kenneth Failbus commented on CASSANDRA-10660:
-

This would be a good enhancement to nodetool: an option to compact and purge 
tombstones.

> Support user-defined compactions through nodetool
> -
>
> Key: CASSANDRA-10660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10660
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Tyler Hobbs
>Priority: Minor
>  Labels: lhf
>
> For a long time, we've supported running user-defined compactions through 
> JMX.  This comes in handy fairly often, mostly when dealing with low disk 
> space or tombstone purging, so it would be good to add something to nodetool 
> for this.  An extra option for {{nodetool compact}} would probably suffice.





[jira] [Commented] (CASSANDRA-6053) system.peers table not updated after decommissioning nodes in C* 2.0

2015-11-03 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987349#comment-14987349
 ] 

Kenneth Failbus commented on CASSANDRA-6053:


Even though this was fixed, it surfaced again in the 2.0.14 release that we 
have in production. We are going to follow the work-around mentioned above.
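The work-around referenced above amounts to deleting the stale rows from system.peers on each node that still lists a decommissioned peer; a sketch, where the peer address is hypothetical:

```shell
# Sketch: build the CQL statement that removes a stale decommissioned peer
# from a node's local system.peers table.
build_peer_cleanup() {
  peer_ip="$1"
  printf "DELETE FROM system.peers WHERE peer = '%s';" "$peer_ip"
}

# Usage (hypothetical address; run with cqlsh on each affected node):
# cqlsh -e "$(build_peer_cleanup 10.0.0.99)"
```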

> system.peers table not updated after decommissioning nodes in C* 2.0
> --
>
> Key: CASSANDRA-6053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6053
> Project: Cassandra
>  Issue Type: Bug
> Environment: Datastax AMI running EC2 m1.xlarge instances
>Reporter: Guyon Moree
>Assignee: Tyler Hobbs
> Fix For: 1.2.14, 2.0.5
>
> Attachments: 6053-v1.patch, peers
>
>
> After decommissioning my cluster from 20 to 9 nodes using opscenter, I found 
> all but one of the nodes had incorrect system.peers tables.
> This became a problem (afaik) when using the python-driver, since this 
> queries the peers table to set up its connection pool. Resulting in very slow 
> startup times, because of timeouts.
> The output of nodetool didn't seem to be affected. After removing the 
> incorrect entries from the peers tables, the connection issues seem to have 
> disappeared for us. 
> Would like some feedback on if this was the right way to handle the issue or 
> if I'm still left with a broken cluster.
> Attached is the output of nodetool status, which shows the correct 9 nodes. 
> Below that the output of the system.peers tables on the individual nodes.





[jira] [Comment Edited] (CASSANDRA-6053) system.peers table not updated after decommissioning nodes in C* 2.0

2015-11-03 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987349#comment-14987349
 ] 

Kenneth Failbus edited comment on CASSANDRA-6053 at 11/3/15 2:30 PM:
-

[~brandon.williams] FYI - Even though this was fixed, it surfaced again in the 
2.0.14 release that we have in production. We are going to follow the 
work-around mentioned above.


was (Author: kenfailbus):
Even though, this was fixed it surfaced in 2.0.14 release that we have in 
production. We are going to follow the work-around as mentioned above.

> system.peers table not updated after decommissioning nodes in C* 2.0
> --
>
> Key: CASSANDRA-6053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6053
> Project: Cassandra
>  Issue Type: Bug
> Environment: Datastax AMI running EC2 m1.xlarge instances
>Reporter: Guyon Moree
>Assignee: Tyler Hobbs
> Fix For: 1.2.14, 2.0.5
>
> Attachments: 6053-v1.patch, peers
>
>
> After decommissioning my cluster from 20 to 9 nodes using opscenter, I found 
> all but one of the nodes had incorrect system.peers tables.
> This became a problem (afaik) when using the python-driver, since this 
> queries the peers table to set up its connection pool. Resulting in very slow 
> startup times, because of timeouts.
> The output of nodetool didn't seem to be affected. After removing the 
> incorrect entries from the peers tables, the connection issues seem to have 
> disappeared for us. 
> Would like some feedback on if this was the right way to handle the issue or 
> if I'm still left with a broken cluster.
> Attached is the output of nodetool status, which shows the correct 9 nodes. 
> Below that the output of the system.peers tables on the individual nodes.





[jira] [Commented] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-11-02 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985302#comment-14985302
 ] 

Kenneth Failbus commented on CASSANDRA-8072:


I am seeing this issue in 2.0.14, and it's reproducible. We are adding a node 
to a large vnode-based cluster. We have successfully added nodes in the past, 
but this time around we are seeing this issue.

Is there a work-around for this situation? Any ideas?



> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan Springer
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, screenshot-1.png, 
> trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster 
> in either ec2 or locally, an error occurs sometimes with one of the nodes 
> refusing to start C*.  The error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) 
> Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
> at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
> at 
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
> at 
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java 
> (line 1279) Announcing shutdown
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 
> MessagingService.java (line 701) Waiting for messaging service to quiesce
>  INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 
> MessagingService.java (line 941) MessagingService has terminated the accept() 
> thread
> This error does not always occur when provisioning a 2-node cluster, but 
> probably around half of the time on only one of the nodes.  I haven't been 
> able to reproduce this error with DSC 2.0.9, and there have been no code or 
> definition file changes in Opscenter.
> I can reproduce locally with the above steps.  I'm happy to test any proposed 
> fixes since I'm the only person able to reproduce reliably so far.





[jira] [Commented] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-11-02 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985589#comment-14985589
 ] 

Kenneth Failbus commented on CASSANDRA-8072:


Upon enabling trace on the one seed node and re-bootstrapping the new node, I 
got the following exception:
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}

> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan Springer
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, screenshot-1.png, 
> trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster 
> in either ec2 or locally, an error occurs sometimes with one of the nodes 
> refusing to start C*.  The error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) 
> Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
> at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
> at 
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
> at 
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java 
> (line 1279) Announcing shutdown
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 
> MessagingService.java (line 701) Waiting for messaging service to quiesce
>  INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 
> MessagingService.java (line 941) MessagingService has terminated the accept() 
> thread
> This error does not always occur when provisioning a 2-node cluster, but 
> probably around half of the time on only one of the nodes.  I haven't been 
> able to reproduce this error with DSC 2.0.9, and there have been no code or 
> definition file changes in Opscenter.
> I can reproduce locally with the above steps.  I'm happy to test any proposed 
> fixes since I'm the only person able to reproduce reliably so far.





[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-11-02 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985589#comment-14985589
 ] 

Kenneth Failbus edited comment on CASSANDRA-8072 at 11/2/15 5:38 PM:
-

Upon enabling trace on the one seed node and re-bootstrapping the new node, I 
got the following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}


was (Author: kenfailbus):
Upon enabling trace on the one see and re-bootstrapping the new node, I got the 
following exception
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}

> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan Springer
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, screenshot-1.png, 
> trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster 
> in either ec2 or locally, an error occurs sometimes with one of the nodes 
> refusing to start C*.  The error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) 
> Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
> at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
> at 
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
> at 
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java 
> (line 1279) Announcing shutdown
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 
> MessagingService.java (line 701) Waiting for messaging service to quiesce
>  INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 
> MessagingService.java (line 941) MessagingService has terminated the accept() 
> thread
> This error does not always occur when provisioning a 2-node cluster, but 
> probably around half of the time on only one of the nodes.  I haven't been 
> able to reproduce this error with DSC 2.0.9, and there have been no code or 
> definition file changes in Opscenter.
> I can reproduce locally with the above steps.  I'm happy to test any proposed 
> fixes since I'm the only person able to reproduce reliably so far.





[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-11-02 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985589#comment-14985589
 ] 

Kenneth Failbus edited comment on CASSANDRA-8072 at 11/2/15 5:49 PM:
-

Upon enabling trace on the one seed node and re-bootstrapping the new node, I 
got the following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}


was (Author: kenfailbus):
Upon enabling trace on the one see and re-bootstrapping the new node, I got the 
following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}

> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan Springer
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, screenshot-1.png, 
> trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster 
> in either ec2 or locally, an error occurs sometimes with one of the nodes 
> refusing to start C*.  The error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) 
> Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
> at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
> at 
> org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
> at 
> org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java 
> (line 1279) Announcing shutdown
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 
> MessagingService.java (line 701) Waiting for messaging service to quiesce
>  INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 
> MessagingService.java (line 941) MessagingService has terminated the accept() 
> thread
> This error does not always occur when provisioning a 2-node cluster, but 
> probably around half of the time on only one of the nodes.  I haven't been 
> able to reproduce this error with DSC 2.0.9, and there have been no code or 
> definition file changes in Opscenter.
> I can reproduce locally with the above steps.  I'm happy to test any proposed 
> fixes since I'm the only person able to reproduce reliably so far.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-11-02 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985589#comment-14985589
 ] 

Kenneth Failbus edited comment on CASSANDRA-8072 at 11/2/15 5:50 PM:
-

Upon enabling trace on the one seed node and re-bootstrapping the new node, I 
got the following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}

And then the usual exception that this ticket describes:
{code}
2015-11-02 17:34:55,526 [EXPIRING-MAP-REAPER:1] TRACE ExpiringMap Expired 0 
entries
2015-11-02 17:35:00,526 [EXPIRING-MAP-REAPER:1] TRACE ExpiringMap Expired 0 
entries
2015-11-02 17:35:05,527 [EXPIRING-MAP-REAPER:1] TRACE ExpiringMap Expired 0 
entries
2015-11-02 17:35:10,527 [EXPIRING-MAP-REAPER:1] TRACE ExpiringMap Expired 0 
entries
2015-11-02 17:35:11,982 [main] ERROR CassandraDaemon Exception encountered 
during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1296)
at 
org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:457)
at 
org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:671)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:623)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:515)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:437)
at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:423)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:641)
2015-11-02 17:35:11,986 [Thread-7] INFO DseDaemon DSE shutting down...
2015-11-02 17:35:11,987 [StorageServiceShutdownHook] WARN Gossiper No local 
state or state is in silent shutdown, not announcing shutdown
2015-11-02 17:35:11,987 [StorageServiceShutdownHook] INFO MessagingService 
Waiting for messaging service to quiesce
2015-11-02 17:35:11,987 [StorageServiceShutdownHook] DEBUG MessagingService 
Closing accept() thread
2015-11-02 17:35:11,988 [ACCEPT-/10.22.168.53] DEBUG MessagingService 
Asynchronous close seen by server thread
2015-11-02 17:35:11,988 [ACCEPT-/10.22.168.53] INFO MessagingService 
MessagingService has terminated the accept() thread
2015-11-02 17:35:12,068 [Thread-7] ERROR CassandraDaemon Exception in thread 
Thread[Thread-7,5,main]
{code}



> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ryan Springer
>Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, screenshot-1.png, 
> trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster 
> in either ec2 or locally, an error occurs sometimes with one of the nodes 
> refusing to start C*.  The error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 

[jira] [Commented] (CASSANDRA-10233) IndexOutOfBoundsException in HintedHandOffManager

2015-10-07 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946882#comment-14946882
 ] 

Kenneth Failbus commented on CASSANDRA-10233:
-

Folks,

I am seeing the same exception in the 2.0.14 release. Are there plans to apply 
the patch to that release? What is the work-around?

> IndexOutOfBoundsException in HintedHandOffManager
> -
>
> Key: CASSANDRA-10233
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10233
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.2.0
>Reporter: Omri Iluz
>Assignee: J.P. Eiti Kimura
> Attachments: cassandra-2.1.8-10233-v2.txt, 
> cassandra-2.1.8-10233-v3.txt, cassandra-2.1.8-10233-v4.txt, 
> cassandra-2.2.1-10233.txt
>
>
> After upgrading our cluster to 2.2.0, the following error started showing 
> exactly every 10 minutes on every server in the cluster:
> {noformat}
> INFO  [CompactionExecutor:1381] 2015-08-31 18:31:55,506 
> CompactionTask.java:142 - Compacting (8e7e1520-500e-11e5-b1e3-e95897ba4d20) 
> [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-540-big-Data.db:level=0,
>  ]
> INFO  [CompactionExecutor:1381] 2015-08-31 18:31:55,599 
> CompactionTask.java:224 - Compacted (8e7e1520-500e-11e5-b1e3-e95897ba4d20) 1 
> sstables to 
> [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-541-big,] 
> to level=0.  1,544,495 bytes to 1,544,495 (~100% of original) in 93ms = 
> 15.838121MB/s.  0 total partitions merged to 4.  Partition merge counts were 
> {1:4, }
> ERROR [HintedHandoff:1] 2015-08-31 18:31:55,600 CassandraDaemon.java:182 - 
> Exception in thread Thread[HintedHandoff:1,1,main]
> java.lang.IndexOutOfBoundsException: null
>   at java.nio.Buffer.checkIndex(Buffer.java:538) ~[na:1.7.0_79]
>   at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:410) 
> ~[na:1.7.0_79]
>   at org.apache.cassandra.utils.UUIDGen.getUUID(UUIDGen.java:106) 
> ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager.scheduleAllDeliveries(HintedHandOffManager.java:515)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:88)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:168)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> [na:1.7.0_79]
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) 
> [na:1.7.0_79]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_79]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> {noformat}





[jira] [Commented] (CASSANDRA-10233) IndexOutOfBoundsException in HintedHandOffManager

2015-10-07 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946978#comment-14946978
 ] 

Kenneth Failbus commented on CASSANDRA-10233:
-

This is what I saw when running the above command.
{code}
cqlsh> SELECT target_id,hint_id,message_version FROM system.hints LIMIT 5;

 target_id | hint_id  | message_version
---+--+-
   | af72b680-6622-11e5-bb6e-5b476c07be60 |   7
   | b27f3330-6622-11e5-bb6e-5b476c07be60 |   7
   | b5f0da50-6622-11e5-bb6e-5b476c07be60 |   7
   | b6a1b3c0-6622-11e5-bb6e-5b476c07be60 |   7
   | b6bcb5d0-6622-11e5-bb6e-5b476c07be60 |   7
{code}
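The empty target_id cells above line up with the stack trace in this ticket: UUIDGen.getUUID reads 16 bytes from the hint's partition key, so a null or empty target_id underruns the buffer. A minimal Python sketch of that failure mode (uuid_from_bytes is a hypothetical helper, not Cassandra code):
{code}
import uuid

def uuid_from_bytes(buf: bytes) -> uuid.UUID:
    # A UUID is exactly 16 bytes; a null/empty target_id yields a
    # shorter buffer, which is what UUIDGen.getUUID trips over.
    if len(buf) < 16:
        raise ValueError(f"need 16 bytes for a UUID, got {len(buf)}")
    return uuid.UUID(bytes=buf[:16])

print(uuid_from_bytes(uuid.uuid1().bytes))  # a normal hint row round-trips

try:
    uuid_from_bytes(b"")  # empty target_id, as in the rows above
except ValueError as e:
    print("error:", e)
{code}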

> IndexOutOfBoundsException in HintedHandOffManager
> -
>
> Key: CASSANDRA-10233
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10233
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.2.0
>Reporter: Omri Iluz
>Assignee: J.P. Eiti Kimura
> Attachments: cassandra-2.1.8-10233-v2.txt, 
> cassandra-2.1.8-10233-v3.txt, cassandra-2.1.8-10233-v4.txt, 
> cassandra-2.2.1-10233.txt
>
>
> After upgrading our cluster to 2.2.0, the following error started showing 
> exactly every 10 minutes on every server in the cluster:
> {noformat}
> INFO  [CompactionExecutor:1381] 2015-08-31 18:31:55,506 
> CompactionTask.java:142 - Compacting (8e7e1520-500e-11e5-b1e3-e95897ba4d20) 
> [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-540-big-Data.db:level=0,
>  ]
> INFO  [CompactionExecutor:1381] 2015-08-31 18:31:55,599 
> CompactionTask.java:224 - Compacted (8e7e1520-500e-11e5-b1e3-e95897ba4d20) 1 
> sstables to 
> [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-541-big,] 
> to level=0.  1,544,495 bytes to 1,544,495 (~100% of original) in 93ms = 
> 15.838121MB/s.  0 total partitions merged to 4.  Partition merge counts were 
> {1:4, }
> ERROR [HintedHandoff:1] 2015-08-31 18:31:55,600 CassandraDaemon.java:182 - 
> Exception in thread Thread[HintedHandoff:1,1,main]
> java.lang.IndexOutOfBoundsException: null
>   at java.nio.Buffer.checkIndex(Buffer.java:538) ~[na:1.7.0_79]
>   at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:410) 
> ~[na:1.7.0_79]
>   at org.apache.cassandra.utils.UUIDGen.getUUID(UUIDGen.java:106) 
> ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager.scheduleAllDeliveries(HintedHandOffManager.java:515)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:88)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:168)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> [na:1.7.0_79]
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) 
> [na:1.7.0_79]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_79]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> {noformat}





[jira] [Commented] (CASSANDRA-10233) IndexOutOfBoundsException in HintedHandOffManager

2015-10-07 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14946996#comment-14946996
 ] 

Kenneth Failbus commented on CASSANDRA-10233:
-

Since the select output above suggests that no target_id is present, would it 
be better to truncate the whole hints table and restart Cassandra? Also, we 
have reduced our hint window to 5 min., beyond which we just bootstrap the node 
back or run repairs on it.

> IndexOutOfBoundsException in HintedHandOffManager
> -
>
> Key: CASSANDRA-10233
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10233
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: Cassandra 2.2.0
>Reporter: Omri Iluz
>Assignee: J.P. Eiti Kimura
> Attachments: cassandra-2.1.8-10233-v2.txt, 
> cassandra-2.1.8-10233-v3.txt, cassandra-2.1.8-10233-v4.txt, 
> cassandra-2.2.1-10233.txt
>
>
> After upgrading our cluster to 2.2.0, the following error started showing 
> exactly every 10 minutes on every server in the cluster:
> {noformat}
> INFO  [CompactionExecutor:1381] 2015-08-31 18:31:55,506 
> CompactionTask.java:142 - Compacting (8e7e1520-500e-11e5-b1e3-e95897ba4d20) 
> [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-540-big-Data.db:level=0,
>  ]
> INFO  [CompactionExecutor:1381] 2015-08-31 18:31:55,599 
> CompactionTask.java:224 - Compacted (8e7e1520-500e-11e5-b1e3-e95897ba4d20) 1 
> sstables to 
> [/cassandra/data/system/hints-2666e20573ef38b390fefecf96e8f0c7/la-541-big,] 
> to level=0.  1,544,495 bytes to 1,544,495 (~100% of original) in 93ms = 
> 15.838121MB/s.  0 total partitions merged to 4.  Partition merge counts were 
> {1:4, }
> ERROR [HintedHandoff:1] 2015-08-31 18:31:55,600 CassandraDaemon.java:182 - 
> Exception in thread Thread[HintedHandoff:1,1,main]
> java.lang.IndexOutOfBoundsException: null
>   at java.nio.Buffer.checkIndex(Buffer.java:538) ~[na:1.7.0_79]
>   at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:410) 
> ~[na:1.7.0_79]
>   at org.apache.cassandra.utils.UUIDGen.getUUID(UUIDGen.java:106) 
> ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager.scheduleAllDeliveries(HintedHandOffManager.java:515)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager.access$000(HintedHandOffManager.java:88)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.db.HintedHandOffManager$1.run(HintedHandOffManager.java:168)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
>  ~[apache-cassandra-2.2.0.jar:2.2.0]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
> [na:1.7.0_79]
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) 
> [na:1.7.0_79]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_79]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_79]
>   at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
> {noformat}





[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-06 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660740#comment-14660740
 ] 

Kenneth Failbus commented on CASSANDRA-5220:


Will this fix be back-ported to the 2.0.x or 2.1.x releases? That would be a 
big help, since it would make the product stable on those releases; vnodes are 
a very good scaling feature only if repairs can keep up.

 Repair improvements when using vnodes
 -

 Key: CASSANDRA-5220
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.0 beta 1
Reporter: Brandon Williams
Assignee: Marcus Olsson
  Labels: performance, repair
 Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
 cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
 cassandra-3.0-5220.patch


 Currently when using vnodes, repair takes much longer to complete than 
 without them.  This appears at least in part because it's using a session per 
 range and processing them sequentially.  This generates a lot of log spam 
 with vnodes, and while being gentler and lighter on hard disk deployments, 
 ssd-based deployments would often prefer that repair be as fast as possible.





[jira] [Commented] (CASSANDRA-7909) Do not exit nodetool repair when receiving JMX NOTIF_LOST

2015-07-21 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14635625#comment-14635625
 ] 

Kenneth Failbus commented on CASSANDRA-7909:


This issue is still not fixed in 2.0.14. We have a 48-node cluster, and when 
nodetool repair is run, notifications are lost.

[2015-07-21 18:25:19,627] Lost notification. You should check server log for 
repair status of keyspace x

[2015-07-21 18:37:19,636] Lost notification. You should check server log for 
repair status of keyspace x
[2015-07-21 18:41:19,647] Lost notification. You should check server log for 
repair status of keyspace x
[2015-07-21 18:43:19,649] Lost notification. You should check server log for 
repair status of keyspace x
[2015-07-21 18:45:19,651] Lost notification. You should check server log for 
repair status of keyspace x
[2015-07-21 18:49:19,655] Lost notification. You should check server log for 
repair status of keyspace x
[2015-07-21 18:52:19,657] Lost notification. You should check server log for 
repair status of keyspace x
[2015-07-21 18:53:19,659] Lost notification. You should check server log for 
repair status of keyspace x
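The behavior the ticket below asks for is to treat NOTIF_LOST as a warning and keep listening, rather than exiting. A toy Python sketch of that control flow (event names are hypothetical, not the real JMX API):
{code}
def drive_repair(events):
    # Keep listening after a lost notification instead of exiting;
    # the repair may still be running server-side.
    for ev in events:
        if ev == "NOTIF_LOST":
            print("Lost notification. Check server log for repair status")
            continue
        if ev == "COMPLETE":
            return "finished"
    return "connection closed"

print(drive_repair(["NOTIF_LOST", "NOTIF_LOST", "COMPLETE"]))  # finished
{code}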


 Do not exit nodetool repair when receiving JMX NOTIF_LOST
 -

 Key: CASSANDRA-7909
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7909
 Project: Cassandra
  Issue Type: Improvement
Reporter: Yuki Morishita
Assignee: Yuki Morishita
Priority: Trivial
 Fix For: 2.0.11, 2.1.1

 Attachments: 
 0001-Do-not-quit-nodetool-when-JMX-notif_lost-received.patch


 {{nodetool repair}} prints out 'Lost notification...' and exits when JMX 
 NOTIF_LOST message is received. But we should not exit right away since that 
 message just indicates some messages are lost because they arrive so fast 
 that they cannot be delivered to the remote client quickly enough according 
 to 
 https://weblogs.java.net/blog/emcmanus/archive/2007/08/when_can_jmx_no.html. 
 So we should just continue to listen to events until repair finishes or 
 connection is really closed.





[jira] [Commented] (CASSANDRA-8641) Repair causes a large number of tiny SSTables

2015-07-14 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627034#comment-14627034
 ] 

Kenneth Failbus commented on CASSANDRA-8641:


Folks,

We are seeing a similar situation in 2.0.14. Our cluster uses vnodes, and 
repair with the -par option creates a huge number of sstables on certain nodes, 
but not on all of the nodes where it runs. We are using STCS for the column 
family.
So, like the others, my question is: is this fixed in a 2.0.x release?
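As background on why repair-generated tiny sstables linger under STCS: size-tiered compaction groups sstables into buckets of similar size and compacts a bounded batch at a time, so tens of thousands of tiny tables take many rounds to drain. A simplified Python sketch of the bucketing (approximate, not Cassandra's exact parameters):
{code}
def stcs_buckets(sizes, bucket_low=0.5, bucket_high=1.5, min_threshold=4):
    # Group sstable sizes into STCS-style buckets of "similar" size.
    # Thousands of tiny repair-produced sstables all land in one bucket,
    # so compaction churns through them a batch at a time.
    buckets = []
    for size in sorted(sizes):
        for b in buckets:
            avg = sum(b) / len(b)
            if bucket_low * avg <= size <= bucket_high * avg:
                b.append(size)
                break
        else:
            buckets.append([size])
    # Only buckets with enough sstables are eligible for compaction.
    return [b for b in buckets if len(b) >= min_threshold]

print(stcs_buckets([1] * 8 + [1000]))  # [[1, 1, 1, 1, 1, 1, 1, 1]]
{code}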



 Repair causes a large number of tiny SSTables
 -

 Key: CASSANDRA-8641
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8641
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Ubuntu 14.04
Reporter: Flavien Charlon
 Fix For: 2.1.3


 I have a 3 nodes cluster with RF = 3, quad core and 32 GB or RAM. I am 
 running 2.1.2 with all the default settings. I'm seeing some strange 
 behaviors during incremental repair (under write load).
 Taking the example of one particular column family, before running an 
 incremental repair, I have about 13 SSTables. After finishing the incremental 
 repair, I have over 114000 SSTables.
 {noformat}
 Table: customers
 SSTable count: 114688
 Space used (live): 97203707290
 Space used (total): 99175455072
 Space used by snapshots (total): 0
 SSTable Compression Ratio: 0.28281112416526505
 Memtable cell count: 0
 Memtable data size: 0
 Memtable switch count: 1069
 Local read count: 0
 Local read latency: NaN ms
 Local write count: 11548705
 Local write latency: 0.030 ms
 Pending flushes: 0
 Bloom filter false positives: 0
 Bloom filter false ratio: 0.0
 Bloom filter space used: 144145152
 Compacted partition minimum bytes: 311
 Compacted partition maximum bytes: 1996099046
 Compacted partition mean bytes: 3419
 Average live cells per slice (last five minutes): 0.0
 Maximum live cells per slice (last five minutes): 0.0
 Average tombstones per slice (last five minutes): 0.0
 Maximum tombstones per slice (last five minutes): 0.0
 {noformat}
 Looking at the logs during the repair, it seems Cassandra is struggling to 
 compact minuscule memtables (often just a few kilobytes):
 {noformat}
 INFO  [CompactionExecutor:337] 2015-01-17 01:44:27,011 
 CompactionTask.java:251 - Compacted 32 sstables to 
 [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-228341,].
   8,332 bytes to 6,547 (~78% of original) in 80,476ms = 0.78MB/s.  32 
 total partitions merged to 32.  Partition merge counts were {1:32, }
 INFO  [CompactionExecutor:337] 2015-01-17 01:45:35,519 
 CompactionTask.java:251 - Compacted 32 sstables to 
 [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-229348,].
   8,384 bytes to 6,563 (~78% of original) in 6,880ms = 0.000910MB/s.  32 
 total partitions merged to 32.  Partition merge counts were {1:32, }
 INFO  [CompactionExecutor:339] 2015-01-17 01:47:46,475 
 CompactionTask.java:251 - Compacted 32 sstables to 
 [/mnt/data/cassandra/data/business/customers-d9d42d209ccc11e48ca54553c90a9d45/business-customers-ka-229351,].
   8,423 bytes to 6,401 (~75% of original) in 10,416ms = 0.000586MB/s.  32 
 total partitions merged to 32.  Partition merge counts were {1:32, }
 {noformat}
  
 Here is an excerpt of the system logs showing the abnormal flushing:
 {noformat}
 INFO  [AntiEntropyStage:1] 2015-01-17 15:28:43,807 ColumnFamilyStore.java:840 
 - Enqueuing flush of customers: 634484 (0%) on-heap, 2599489 (0%) off-heap
 INFO  [AntiEntropyStage:1] 2015-01-17 15:29:06,823 ColumnFamilyStore.java:840 
 - Enqueuing flush of levels: 129504 (0%) on-heap, 222168 (0%) off-heap
 INFO  [AntiEntropyStage:1] 2015-01-17 15:29:07,940 ColumnFamilyStore.java:840 
 - Enqueuing flush of chain: 4508 (0%) on-heap, 6880 (0%) off-heap
 INFO  [AntiEntropyStage:1] 2015-01-17 15:29:08,124 ColumnFamilyStore.java:840 
 - Enqueuing flush of invoices: 1469772 (0%) on-heap, 2542675 (0%) off-heap
 INFO  [AntiEntropyStage:1] 2015-01-17 15:29:09,471 ColumnFamilyStore.java:840 
 - Enqueuing flush of customers: 809844 (0%) on-heap, 3364728 (0%) off-heap
 INFO  [AntiEntropyStage:1] 2015-01-17 15:29:24,368 ColumnFamilyStore.java:840 
 - Enqueuing flush of levels: 28212 (0%) on-heap, 44220 (0%) off-heap
 INFO  [AntiEntropyStage:1] 2015-01-17 15:29:24,822 ColumnFamilyStore.java:840 
 - Enqueuing flush of chain: 860 (0%) on-heap, 1130 (0%) off-heap
 INFO  [AntiEntropyStage:1] 2015-01-17 15:29:24,985 ColumnFamilyStore.java:840 
 - Enqueuing flush of invoices: 334480 (0%) on-heap, 568959 (0%) off-heap
 INFO  [AntiEntropyStage:1] 2015-01-17 15:29:27,375 ColumnFamilyStore.java:840 
 - Enqueuing flush of customers: 221568 (0%) on-heap, 929962 (0%) off-heap
 INFO  [AntiEntropyStage:1] 

[jira] [Commented] (CASSANDRA-8448) Comparison method violates its general contract in AbstractEndpointSnitch

2015-07-08 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618635#comment-14618635
 ] 

Kenneth Failbus commented on CASSANDRA-8448:


Folks,

I still see this error in the 2.0.14 version of Cassandra. Can you please check 
that nothing has regressed?
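For context, TimSort raises "Comparison method violates its general contract!" when a comparator's ordering changes mid-sort; with DynamicEndpointSnitch the endpoint latency scores can change while sortByProximity is running. A minimal Python sketch (hypothetical names) of one way to keep comparisons consistent by snapshotting the scores first:
{code}
def sort_by_proximity(endpoints, live_scores):
    # Snapshot the mutable score map so every comparison during the
    # sort sees the same values; a key that changes mid-sort is what
    # breaks the comparator contract under TimSort.
    snap = dict(live_scores)
    return sorted(endpoints, key=lambda e: snap[e])

scores = {"a": 3.0, "b": 1.0, "c": 2.0}
print(sort_by_proximity(["a", "b", "c"], scores))  # ['b', 'c', 'a']
{code}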


 Comparison method violates its general contract in AbstractEndpointSnitch
 ---

 Key: CASSANDRA-8448
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8448
 Project: Cassandra
  Issue Type: Bug
Reporter: J.B. Langston
Assignee: Brandon Williams
 Fix For: 2.1.3, 2.0.13

 Attachments: 8448-v2.txt, 8448-v3.txt, 8448.txt


 Seen in both 1.2 and 2.0.  The error is occurring here: 
 https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/locator/AbstractEndpointSnitch.java#L49
 {code}
 ERROR [Thrift:9] 2014-12-04 20:12:28,732 CustomTThreadPoolServer.java (line 
 219) Error occurred during processing of message.
 com.google.common.util.concurrent.UncheckedExecutionException: 
 java.lang.IllegalArgumentException: Comparison method violates its general 
 contract!
   at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2199)
   at com.google.common.cache.LocalCache.get(LocalCache.java:3932)
   at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3936)
   at 
 com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4806)
   at 
 org.apache.cassandra.service.ClientState.authorize(ClientState.java:352)
   at 
 org.apache.cassandra.service.ClientState.ensureHasPermission(ClientState.java:224)
   at 
 org.apache.cassandra.service.ClientState.hasAccess(ClientState.java:218)
   at 
 org.apache.cassandra.service.ClientState.hasColumnFamilyAccess(ClientState.java:202)
   at 
 org.apache.cassandra.thrift.CassandraServer.createMutationList(CassandraServer.java:822)
   at 
 org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:954)
   at com.datastax.bdp.server.DseServer.batch_mutate(DseServer.java:576)
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3922)
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3906)
   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
   at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:201)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.IllegalArgumentException: Comparison method violates its 
 general contract!
   at java.util.TimSort.mergeHi(TimSort.java:868)
   at java.util.TimSort.mergeAt(TimSort.java:485)
   at java.util.TimSort.mergeCollapse(TimSort.java:410)
   at java.util.TimSort.sort(TimSort.java:214)
   at java.util.TimSort.sort(TimSort.java:173)
   at java.util.Arrays.sort(Arrays.java:659)
   at java.util.Collections.sort(Collections.java:217)
   at 
 org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49)
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:157)
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:186)
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:151)
   at 
 org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1408)
   at 
 org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1402)
   at 
 org.apache.cassandra.service.AbstractReadExecutor.getReadExecutor(AbstractReadExecutor.java:148)
   at 
 org.apache.cassandra.service.StorageProxy.fetchRows(StorageProxy.java:1223)
   at 
 org.apache.cassandra.service.StorageProxy.read(StorageProxy.java:1165)
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:255)
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:225)
   at org.apache.cassandra.auth.Auth.selectUser(Auth.java:243)
   at org.apache.cassandra.auth.Auth.isSuperuser(Auth.java:84)
   at 
 org.apache.cassandra.auth.AuthenticatedUser.isSuper(AuthenticatedUser.java:50)
   at 
 org.apache.cassandra.auth.CassandraAuthorizer.authorize(CassandraAuthorizer.java:69)
   at 

[jira] [Commented] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed

2015-07-08 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14618637#comment-14618637
 ] 

Kenneth Failbus commented on CASSANDRA-9519:


I see this error in the 2.0.14 version of Cassandra as well.

 CASSANDRA-8448 Doesn't seem to be fixed
 ---

 Key: CASSANDRA-9519
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jeremiah Jordan
 Fix For: 2.1.x, 2.2.x


 Still seeing the Comparison method violates its general contract! in 2.1.5
 {code}
 java.lang.IllegalArgumentException: Comparison method violates its general 
 contract!
   at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45]
   at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45]
   at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45]
   at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45]
   at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45]
   at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45]
   at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45]
   at 
 org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238)
  ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) 
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) 
 ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
 {code}
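For context, TimSort throws this error when a comparator breaks its general contract (consistent ordering of the same pair), which can happen when the comparator reads state that mutates during the sort — the situation the trace points at in DynamicEndpointSnitch's score-based sorting. A minimal illustrative sketch (hypothetical names, not Cassandra's code) of that comparator shape:

```java
import java.util.HashMap;
import java.util.Map;

public class UnstableComparatorDemo {
    // Mutable shared state read by the comparator -- the shape of the bug:
    // latency scores can be updated while a sort is in progress.
    static final Map<String, Double> scores = new HashMap<>();

    static int compareByScore(String a, String b) {
        return Double.compare(scores.get(a), scores.get(b));
    }

    // Returns true when the same pair compares in opposite orders before and
    // after a concurrent-style score update: a contract violation that
    // TimSort may detect and report as IllegalArgumentException.
    static boolean demonstratesViolation() {
        scores.put("hostA", 1.0);
        scores.put("hostB", 2.0);
        int before = compareByScore("hostA", "hostB"); // negative
        scores.put("hostA", 3.0);                      // score updated "mid-sort"
        int after = compareByScore("hostA", "hostB");  // now positive
        return before < 0 && after > 0;
    }

    public static void main(String[] args) {
        System.out.println(demonstratesViolation()); // prints "true"
    }
}
```

A common fix for this class of bug is to snapshot the scores once before sorting so the comparator sees an immutable view.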



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7317) Repair range validation is too strict

2015-05-08 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534858#comment-14534858
 ] 

Kenneth Failbus commented on CASSANDRA-7317:


Folks,

I am seeing this error again in the 2.0.9 release. I have vnodes enabled in 
my cluster.

 Repair range validation is too strict
 -

 Key: CASSANDRA-7317
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7317
 Project: Cassandra
  Issue Type: Bug
Reporter: Nick Bailey
Assignee: Yuki Morishita
 Fix For: 2.0.9

 Attachments: 7317-2.0.txt, Untitled Diagram(1).png


 From what I can tell, the calculation (using the -pr option) and validation of 
 tokens for repairing ranges are broken, or at least should be improved. Using 
 an example with ccm:
 Nodetool ring:
 {noformat}
 Datacenter: dc1
 ===============
 Address    Rack  Status  State   Load       Owns     Token
                                                      -10
 127.0.0.1  r1    Up      Normal  188.96 KB  50.00%   -9223372036854775808
 127.0.0.2  r1    Up      Normal  194.77 KB  50.00%   -10
 Datacenter: dc2
 ===============
 Address    Rack  Status  State   Load       Owns     Token
                                                      0
 127.0.0.4  r1    Up      Normal  160.58 KB  0.00%    -9223372036854775798
 127.0.0.3  r1    Up      Normal  139.46 KB  0.00%    0
 {noformat}
 Schema:
 {noformat}
 CREATE KEYSPACE system_traces WITH replication = {
   'class': 'NetworkTopologyStrategy',
   'dc2': '2',
   'dc1': '2'
 };
 {noformat}
 Repair -pr:
 {noformat}
 [Nicks-MacBook-Pro:21:35:58 cassandra-2.0] cassandra$ bin/nodetool -p 7100 
 repair -pr system_traces
 [2014-05-28 21:36:01,977] Starting repair command #12, repairing 1 ranges for 
 keyspace system_traces
 [2014-05-28 21:36:02,207] Repair session f984d290-e6d9-11e3-9edc-5f8011daec21 
 for range (0,-9223372036854775808] finished
 [2014-05-28 21:36:02,207] Repair command #12 finished
 [Nicks-MacBook-Pro:21:36:02 cassandra-2.0] cassandra$ bin/nodetool -p 7200 
 repair -pr system_traces
 [2014-05-28 21:36:14,086] Starting repair command #1, repairing 1 ranges for 
 keyspace system_traces
 [2014-05-28 21:36:14,406] Repair session 00bd45b0-e6da-11e3-98fc-5f8011daec21 
 for range (-9223372036854775798,-10] finished
 [2014-05-28 21:36:14,406] Repair command #1 finished
 {noformat}
 Note that repairing both nodes in dc1 leaves very small ranges unrepaired, 
 for example (-10,0]. Repairing the 'primary range' in dc2 will repair those 
 small ranges. Maybe that is the behavior we want, but it seems 
 counterintuitive.
 The behavior when manually trying to repair the full range of 127.0.0.1 
 definitely needs improvement, though.
 Repair command:
 {noformat}
 [Nicks-MacBook-Pro:21:50:44 cassandra-2.0] cassandra$ bin/nodetool -p 7100 
 repair -st -10 -et -9223372036854775808 system_traces
 [2014-05-28 21:50:55,803] Starting repair command #17, repairing 1 ranges for 
 keyspace system_traces
 [2014-05-28 21:50:55,804] Starting repair command #17, repairing 1 ranges for 
 keyspace system_traces
 [2014-05-28 21:50:55,804] Repair command #17 finished
 [Nicks-MacBook-Pro:21:50:56 cassandra-2.0] cassandra$ echo $?
 1
 {noformat}
 system.log:
 {noformat}
 ERROR [Thread-96] 2014-05-28 21:40:05,921 StorageService.java (line 2621) 
 Repair session failed:
 java.lang.IllegalArgumentException: Requested range intersects a local range 
 but is not fully contained in one; this would lead to imprecise repair
 {noformat}
 * The actual output of the repair command doesn't really indicate that there 
 was an issue, although the command does return a non-zero exit status.
 * The error here is invisible if you are using the synchronous JMX repair 
 API. It will appear as though the repair completed successfully.
 * Personally, I believe that should be a valid repair command. For the 
 system_traces keyspace, 127.0.0.1 is responsible for this range (and, I would 
 argue, it is the 'primary range' of the node).
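For reference, with -pr a node repairs only its primary range: the interval (predecessor's token, own token] on the sorted ring. A hedged sketch of that computation (hypothetical helper, not Cassandra's code), using the tokens from the ring above, reproduces the ranges reported by the two repair runs and shows why e.g. (-10, 0] is left to the dc2 node at token 0:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PrimaryRangeDemo {
    // Primary range of the node owning `token`: (predecessor's token, token].
    static long[] primaryRange(long token, List<Long> ringTokens) {
        List<Long> sorted = new ArrayList<>(ringTokens);
        Collections.sort(sorted);
        int i = sorted.indexOf(token);
        // The predecessor wraps around the ring for the smallest token.
        long start = sorted.get((i - 1 + sorted.size()) % sorted.size());
        return new long[] { start, token };
    }

    public static void main(String[] args) {
        // Tokens from the ring above: Long.MIN_VALUE is -9223372036854775808
        // and Long.MIN_VALUE + 10 is -9223372036854775798.
        List<Long> ring = List.of(Long.MIN_VALUE, Long.MIN_VALUE + 10, -10L, 0L);
        long[] r = primaryRange(0L, ring);
        // The dc2 node at token 0 is primary for (-10, 0], one of the small
        // ranges left unrepaired when -pr runs only on the dc1 nodes.
        System.out.println("(" + r[0] + ", " + r[1] + "]"); // prints "(-10, 0]"
    }
}
```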





[jira] [Comment Edited] (CASSANDRA-7317) Repair range validation is too strict

2015-05-08 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534858#comment-14534858
 ] 

Kenneth Failbus edited comment on CASSANDRA-7317 at 5/8/15 4:49 PM:


Folks,

I am seeing this error again in the 2.0.9 release. I have vnodes enabled in 
my cluster.
{code}
2015-05-08 15:01:56,021 [AntiEntropyStage:1] INFO Validator [repair 
#254edb00-f593-11e4-9397-51babce9f892] Sending completed merkle tree to 
/10.22.168.35 for CF1/Sequence
2015-05-08 15:01:58,518 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/permissions
2015-05-08 15:01:58,791 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/credentials
2015-05-08 15:01:58,980 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/users
2015-05-08 15:02:00,640 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/credentials
2015-05-08 15:02:01,345 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/users
2015-05-08 15:02:01,577 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/permissions
2015-05-08 15:02:01,753 [AntiEntropyStage:1] INFO Validator [repair 
#27dba060-f593-11e4-873b-9d346bbba08e] Sending completed merkle tree to 
/10.22.168.87 for CF1/Sequence
2015-05-08 15:02:02,622 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/credentials
2015-05-08 15:02:02,873 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/users
2015-05-08 15:02:03,508 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/permissions
2015-05-08 15:02:03,988 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/credentials
2015-05-08 15:02:04,759 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/users
2015-05-08 15:02:05,066 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/permissions
2015-05-08 15:02:05,200 [Thread-227856] ERROR StorageService Repair session 
failed:
java.lang.IllegalArgumentException: Requested range intersects a local range 
but is not fully contained in one; this would lead to imprecise repair
at 
org.apache.cassandra.service.ActiveRepairService.getNeighbors(ActiveRepairService.java:161)
at 
org.apache.cassandra.repair.RepairSession.<init>(RepairSession.java:130)
at 
org.apache.cassandra.repair.RepairSession.<init>(RepairSession.java:119)
at 
org.apache.cassandra.service.ActiveRepairService.submitRepairSession(ActiveRepairService.java:97)
at 
org.apache.cassandra.service.StorageService.forceKeyspaceRepair(StorageService.java:2628)
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2564)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.lang.Thread.run(Thread.java:744)
{code}
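The IllegalArgumentException above comes from the containment check in ActiveRepairService.getNeighbors (per the trace): a requested repair range must be fully contained in one of the node's local ranges, and a range that merely intersects a local range is rejected. A hedged sketch of that validation (simplified to non-wrapping (start, end] ranges; not Cassandra's code):

```java
public class RangeValidationDemo {
    // Simplified, non-wrapping (start, end] ranges; real token ranges can wrap.
    static boolean contains(long[] outer, long[] inner) {
        return outer[0] <= inner[0] && inner[1] <= outer[1];
    }

    static boolean intersects(long[] a, long[] b) {
        return a[0] < b[1] && b[0] < a[1];
    }

    static void validate(long[][] localRanges, long[] requested) {
        for (long[] local : localRanges)
            if (contains(local, requested))
                return; // fully contained in one local range: precise repair
        for (long[] local : localRanges)
            if (intersects(local, requested))
                throw new IllegalArgumentException(
                        "Requested range intersects a local range but is not "
                        + "fully contained in one; this would lead to imprecise repair");
    }

    public static void main(String[] args) {
        // With vnodes a node holds many small local ranges; a requested range
        // spanning a vnode boundary intersects two of them and is rejected.
        long[][] locals = { { 0, 100 }, { 100, 200 } };
        validate(locals, new long[] { 10, 50 });  // OK: inside (0, 100]
        validate(locals, new long[] { 50, 150 }); // throws IllegalArgumentException
    }
}
```

This is why vnode clusters hit the error more easily: the repaired ranges must line up exactly with the many small vnode ranges.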


was (Author: kenfailbus):
Folks,

I am seeing this error again in the 2.0.9 release. I have vnodes enabled in 
my cluster.
{code}
2015-05-08 15:01:56,021 [AntiEntropyStage:1] INFO Validator [repair 
#254edb00-f593-11e4-9397-51babce9f892] Sending completed merkle tree to 
/10.22.168.35 for CF1/Sequence
2015-05-08 15:01:58,518 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/permissions
2015-05-08 15:01:58,791 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/credentials
2015-05-08 15:01:58,980 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/users
2015-05-08 15:02:00,640 [AntiEntropyStage:1] INFO Validator [repair 

[jira] [Comment Edited] (CASSANDRA-7317) Repair range validation is too strict

2015-05-08 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534858#comment-14534858
 ] 

Kenneth Failbus edited comment on CASSANDRA-7317 at 5/8/15 4:48 PM:


Folks,

I am seeing this error again in the 2.0.9 release. I have vnodes enabled in 
my cluster.
{code}
2015-05-08 15:01:56,021 [AntiEntropyStage:1] INFO Validator [repair 
#254edb00-f593-11e4-9397-51babce9f892] Sending completed merkle tree to 
/10.22.168.35 for CF1/Sequence
2015-05-08 15:01:58,518 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/permissions
2015-05-08 15:01:58,791 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/credentials
2015-05-08 15:01:58,980 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/users
2015-05-08 15:02:00,640 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/credentials
2015-05-08 15:02:01,345 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/users
2015-05-08 15:02:01,577 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/permissions
2015-05-08 15:02:01,753 [AntiEntropyStage:1] INFO Validator [repair 
#27dba060-f593-11e4-873b-9d346bbba08e] Sending completed merkle tree to 
/10.22.168.87 for CF1/Sequence
2015-05-08 15:02:02,622 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/credentials
2015-05-08 15:02:02,873 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/users
2015-05-08 15:02:03,508 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/permissions
2015-05-08 15:02:03,988 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/credentials
2015-05-08 15:02:04,759 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/users
2015-05-08 15:02:05,066 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/permissions
2015-05-08 15:02:05,200 [Thread-227856] ERROR StorageService Repair session 
failed:
java.lang.IllegalArgumentException: Requested range intersects a local range 
but is not fully contained in one; this would lead to imprecise repair
at 
org.apache.cassandra.service.ActiveRepairService.getNeighbors(ActiveRepairService.java:161)
at 
org.apache.cassandra.repair.RepairSession.<init>(RepairSession.java:130)
at 
org.apache.cassandra.repair.RepairSession.<init>(RepairSession.java:119)
at 
org.apache.cassandra.service.ActiveRepairService.submitRepairSession(ActiveRepairService.java:97)
at 
org.apache.cassandra.service.StorageService.forceKeyspaceRepair(StorageService.java:2628)
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2564)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.lang.Thread.run(Thread.java:744)



was (Author: kenfailbus):
Folks,

I am seeing this error again in the 2.0.9 release. I have vnodes enabled in 
my cluster.

 Repair range validation is too strict
 -

 Key: CASSANDRA-7317
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7317
 Project: Cassandra
  Issue Type: Bug
Reporter: Nick Bailey
Assignee: Yuki Morishita
 Fix For: 2.0.9

 Attachments: 7317-2.0.txt, Untitled Diagram(1).png


 From what I can tell, the calculation (using the -pr option) and validation of 
 tokens for repairing ranges are broken, or at least should be improved. Using 
 an example with ccm:
 Nodetool ring:
 {noformat}
 Datacenter: dc1
 ===============
 Address    Rack  Status  State   Load       Owns     Token
                                                      -10
 127.0.0.1  

[jira] [Comment Edited] (CASSANDRA-7317) Repair range validation is too strict

2015-05-08 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534858#comment-14534858
 ] 

Kenneth Failbus edited comment on CASSANDRA-7317 at 5/8/15 4:53 PM:


Folks,

I am seeing this error again in the 2.0.9 release. I have vnodes enabled in 
my cluster. We run our repairs on our cluster with the -local option and 
without -pr.
{code}
2015-05-08 15:01:56,021 [AntiEntropyStage:1] INFO Validator [repair 
#254edb00-f593-11e4-9397-51babce9f892] Sending completed merkle tree to 
/10.22.168.35 for CF1/Sequence
2015-05-08 15:01:58,518 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/permissions
2015-05-08 15:01:58,791 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/credentials
2015-05-08 15:01:58,980 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/users
2015-05-08 15:02:00,640 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/credentials
2015-05-08 15:02:01,345 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/users
2015-05-08 15:02:01,577 [AntiEntropyStage:1] INFO Validator [repair 
#e0d31e00-f592-11e4-993e-c9cc22925782] Sending completed merkle tree to 
/10.22.168.97 for system_auth/permissions
2015-05-08 15:02:01,753 [AntiEntropyStage:1] INFO Validator [repair 
#27dba060-f593-11e4-873b-9d346bbba08e] Sending completed merkle tree to 
/10.22.168.87 for CF1/Sequence
2015-05-08 15:02:02,622 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/credentials
2015-05-08 15:02:02,873 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/users
2015-05-08 15:02:03,508 [AntiEntropyStage:1] INFO Validator [repair 
#dba213a0-f592-11e4-b745-192986bd7af2] Sending completed merkle tree to 
/10.22.168.117 for system_auth/permissions
2015-05-08 15:02:03,988 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/credentials
2015-05-08 15:02:04,759 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/users
2015-05-08 15:02:05,066 [AntiEntropyStage:1] INFO Validator [repair 
#d0a2ad70-f592-11e4-a5a2-b73fe73dbe79] Sending completed merkle tree to 
/10.22.168.109 for system_auth/permissions
2015-05-08 15:02:05,200 [Thread-227856] ERROR StorageService Repair session 
failed:
java.lang.IllegalArgumentException: Requested range intersects a local range 
but is not fully contained in one; this would lead to imprecise repair
at 
org.apache.cassandra.service.ActiveRepairService.getNeighbors(ActiveRepairService.java:161)
at 
org.apache.cassandra.repair.RepairSession.<init>(RepairSession.java:130)
at 
org.apache.cassandra.repair.RepairSession.<init>(RepairSession.java:119)
at 
org.apache.cassandra.service.ActiveRepairService.submitRepairSession(ActiveRepairService.java:97)
at 
org.apache.cassandra.service.StorageService.forceKeyspaceRepair(StorageService.java:2628)
at 
org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2564)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.lang.Thread.run(Thread.java:744)
{code}


was (Author: kenfailbus):
Folks,

I am seeing this error again in the 2.0.9 release. I have vnodes enabled in 
my cluster.
{code}
2015-05-08 15:01:56,021 [AntiEntropyStage:1] INFO Validator [repair 
#254edb00-f593-11e4-9397-51babce9f892] Sending completed merkle tree to 
/10.22.168.35 for CF1/Sequence
2015-05-08 15:01:58,518 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/permissions
2015-05-08 15:01:58,791 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/credentials
2015-05-08 15:01:58,980 [AntiEntropyStage:1] INFO Validator [repair 
#e3ca16e0-f592-11e4-bce3-6f1b5fa480b1] Sending completed merkle tree to 
/10.22.168.105 for system_auth/users
2015-05-08 15:02:00,640 

[jira] [Updated] (CASSANDRA-8898) sstableloader utility should allow loading of data from mounted filesystem

2015-03-03 Thread Kenneth Failbus (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Failbus updated CASSANDRA-8898:
---
Description: 
When trying to load data from a mounted filesystem onto a new cluster, the 
following exceptions are observed intermittently, and at some point the 
sstableloader process hangs without completing the loading process.

Please note that in my case the scenario was loading existing sstables from 
an existing cluster onto a brand-new cluster.

It turned out that the sstableloader utility makes some hard assumptions 
about responses from the filesystem, and those assumptions do not hold for a 
mounted filesystem.

The work-around was to copy each existing node's sstable data files locally 
and then point sstableloader at that local filesystem to load the data onto 
the new cluster.

When restoring data from backups with sstableloader during disaster 
recovery, copying the data files to a local filesystem before loading would 
take a long time.

It would be a good enhancement for the sstableloader utility to support 
loading from a mounted filesystem, since copying data locally and then 
loading it is time-consuming.
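The "Reference counter -1" AssertionError in the trace below suggests an acquire/release imbalance on an sstable reference. A minimal sketch (illustrative, not Cassandra's SSTableReader code) of the invariant such a counter guards: every acquire must be paired with exactly one release, and an extra release drives the counter negative and trips the assertion.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountDemo {
    final AtomicInteger references = new AtomicInteger(0);

    void acquire() {
        references.incrementAndGet();
    }

    void release() {
        int count = references.decrementAndGet();
        // Mirrors the guard in the trace: the count can only go negative if
        // release() is called more times than acquire(), e.g. when a transfer
        // task is completed twice for the same file.
        if (count < 0)
            throw new AssertionError("Reference counter " + count);
    }

    public static void main(String[] args) {
        RefCountDemo ref = new RefCountDemo();
        ref.acquire();
        ref.release(); // balanced: counter back to 0
        ref.release(); // unbalanced: throws AssertionError("Reference counter -1")
    }
}
```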

Below is the exception seen when using the mounted filesystem.
{code}
java.lang.AssertionError: Reference counter -1 for 
/opt/tmp/casapp-c1-c00053-g.ch.tvx.comcast.com/MinDataService/System/MinDataService-System-jb-5449-Data.db
 
at 
org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1146)
 
at 
org.apache.cassandra.streaming.StreamTransferTask.complete(StreamTransferTask.java:74)
 
at 
org.apache.cassandra.streaming.StreamSession.received(StreamSession.java:542) 
at 
org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:424)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
 
at java.lang.Thread.run(Thread.java:744) 
WARN 21:07:16,853 [Stream #3e5a5ba0-bdef-11e4-a975-5777dbff0945] Stream failed

  at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:59)
 
at 
org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1406)
 
at 
org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:55)
 
at 
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:59)
 
at 
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42)
 
at 
org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:311)
 
at java.lang.Thread.run(Thread.java:744) 
Caused by: java.io.FileNotFoundException: 
/opt/tmp/casapp-c1-c00055-g.ch.tvx.comcast.com/MinDataService/System/MinDataService-System-jb-5997-Data.db
 (No such file or directory) 
at java.io.RandomAccessFile.open(Native Method) 
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241) 
at 
org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
 
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
 
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:55)
 
... 8 more 
Exception in thread STREAM-OUT-/96.115.88.196 java.lang.NullPointerException 
at 
org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:205)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
 
at java.lang.Thread.run(Thread.java:744) 
ERROR 20:49:35,646 [Stream #d9fce650-bdf3-11e4-b6c0-252cb9b3e9f3] Streaming 
error occurred 
java.lang.AssertionError: Reference counter -3 for 
/opt/tmp/casapp-c1-c00055-g.ch.tvx.comcast.com/MinDataService/System/MinDataService-System-jb-4897-Data.db
 
at 
org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1146)
 
at 
org.apache.cassandra.streaming.StreamTransferTask.complete(StreamTransferTask.java:74)
 
at 
org.apache.cassandra.streaming.StreamSession.received(StreamSession.java:542) 
at 
org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:424)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
 
at java.lang.Thread.run(Thread.java:744) 
Exception in thread STREAM-IN-/96.115.88.196 

[jira] [Created] (CASSANDRA-8898) sstableloader utility should allow loading of data from mounted filesystem

2015-03-03 Thread Kenneth Failbus (JIRA)
Kenneth Failbus created CASSANDRA-8898:
--

 Summary: sstableloader utility should allow loading of data from 
mounted filesystem
 Key: CASSANDRA-8898
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8898
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: 2.0.12
Reporter: Kenneth Failbus


When trying to load data from a mounted filesystem onto a new cluster, the 
following exceptions are observed intermittently, and at some point the 
sstableloader process hangs without completing the loading process.

Please note that in my case the scenario was loading existing sstables from 
an existing cluster onto a brand-new cluster.

It turned out that the sstableloader utility makes some hard assumptions 
about responses from the filesystem, and those assumptions do not hold for a 
mounted filesystem.

The work-around was to copy each existing node's sstable data files locally 
and then point sstableloader at that local filesystem to load the data.

When restoring data from backups with sstableloader during disaster 
recovery, copying the data files to a local filesystem before loading would 
take a long time.

It would be a good enhancement for the sstableloader utility to support 
loading from a mounted filesystem, since copying data locally and then 
loading it is time-consuming.

Below is the exception seen when using the mounted filesystem.
{code}
java.lang.AssertionError: Reference counter -1 for 
/opt/tmp/casapp-c1-c00053-g.ch.tvx.comcast.com/MinDataService/System/MinDataService-System-jb-5449-Data.db
 
at 
org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1146)
 
at 
org.apache.cassandra.streaming.StreamTransferTask.complete(StreamTransferTask.java:74)
 
at 
org.apache.cassandra.streaming.StreamSession.received(StreamSession.java:542) 
at 
org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:424)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
 
at java.lang.Thread.run(Thread.java:744) 
WARN 21:07:16,853 [Stream #3e5a5ba0-bdef-11e4-a975-5777dbff0945] Stream failed

  at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:59)
 
at 
org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1406)
 
at 
org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:55)
 
at 
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:59)
 
at 
org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42)
 
at 
org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:311)
 
at java.lang.Thread.run(Thread.java:744) 
Caused by: java.io.FileNotFoundException: 
/opt/tmp/casapp-c1-c00055-g.ch.tvx.comcast.com/MinDataService/System/MinDataService-System-jb-5997-Data.db
 (No such file or directory) 
at java.io.RandomAccessFile.open(Native Method) 
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241) 
at 
org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
 
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
 
at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:55)
 
... 8 more 
Exception in thread STREAM-OUT-/96.115.88.196 java.lang.NullPointerException 
at 
org.apache.cassandra.streaming.ConnectionHandler$MessageHandler.signalCloseDone(ConnectionHandler.java:205)
 
at 
org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
 
at java.lang.Thread.run(Thread.java:744) 
ERROR 20:49:35,646 [Stream #d9fce650-bdf3-11e4-b6c0-252cb9b3e9f3] Streaming 
error occurred 
java.lang.AssertionError: Reference counter -3 for 
/opt/tmp/casapp-c1-c00055-g.ch.tvx.comcast.com/MinDataService/System/MinDataService-System-jb-4897-Data.db
 
at 
org.apache.cassandra.io.sstable.SSTableReader.releaseReference(SSTableReader.java:1146)
 
at 
org.apache.cassandra.streaming.StreamTransferTask.complete(StreamTransferTask.java:74)
 
at 
org.apache.cassandra.streaming.StreamSession.received(StreamSession.java:542) 
at 
org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:424)
 
at