[jira] Created: (CASSANDRA-2076) Not starting due to Invalid saved cache

2011-01-31 Thread Thibaut (JIRA)
Not starting due to Invalid saved cache
---

 Key: CASSANDRA-2076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2076
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: linux
Reporter: Thibaut
Priority: Minor
 Fix For: 0.7.2


This occurred on two of my nodes (running 0.7.1 from svn).

One node was killed by the kernel due to an OOM, and the other node was hanging, 
so I had to kill it manually with kill -9 (a plain kill didn't work). Maybe 
these were faulty hardware nodes; I don't know.

The saved cache was corrupt afterwards and I couldn't start the nodes. 

After deleting the saved_caches directory I could start the nodes again. 

Instead of refusing to start when such an error occurs, could Cassandra simply 
delete the erroneous file and continue starting?
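
The suggested behaviour could look roughly like the sketch below. This is not Cassandra's code: the class TolerantCacheLoader, the method loadOrDiscard and the file layout (a 4-byte length followed by that many UTF-8 bytes per key) are assumptions made purely for illustration. The point is only that a read or decode failure leads to deleting the file and starting with an empty cache rather than aborting startup.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TolerantCacheLoader {

    /**
     * Hypothetical loader. Returns the saved keys, or an empty list if the
     * file is missing or unreadable. An unreadable file is deleted so the
     * next start does not trip over it again.
     */
    static List<String> loadOrDiscard(Path cacheFile) {
        if (!Files.exists(cacheFile)) {
            return Collections.emptyList();
        }
        try {
            return readKeys(cacheFile);
        } catch (IOException e) {
            // CharacterCodingException (bad UTF-8) is an IOException too.
            System.err.println("Ignoring unreadable saved cache " + cacheFile + ": " + e);
            try {
                Files.deleteIfExists(cacheFile);
            } catch (IOException deleteFailure) {
                System.err.println("Could not delete " + cacheFile + ": " + deleteFailure);
            }
            return Collections.emptyList();
        }
    }

    // Assumed layout (NOT the real saved-cache format):
    // repeated [4-byte big-endian length][UTF-8 bytes of the key].
    private static List<String> readKeys(Path cacheFile) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(Files.readAllBytes(cacheFile));
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        List<String> keys = new ArrayList<>();
        while (buf.hasRemaining()) {
            if (buf.remaining() < Integer.BYTES) {
                throw new IOException("truncated key length");
            }
            int length = buf.getInt();
            if (length < 0 || length > buf.remaining()) {
                throw new IOException("implausible key length " + length);
            }
            ByteBuffer key = buf.slice();
            key.limit(length);
            keys.add(decoder.decode(key).toString());
            buf.position(buf.position() + length);
        }
        return keys;
    }
}
```

A key cache only speeds up warm-up, so discarding it costs performance after the restart but no data.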




 INFO 22:31:11,570 reading saved cache /hd1/cassandra_md5/saved_caches/table_attributes-table_attributes-KeyCache
ERROR 22:31:11,595 Exception encountered during startup.
java.lang.RuntimeException: The provided key was not UTF8 encoded.
   at org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:159)
   at org.apache.cassandra.dht.OrderPreservingPartitioner.decorateKey(OrderPreservingPartitioner.java:44)
   at org.apache.cassandra.db.ColumnFamilyStore.readSavedCache(ColumnFamilyStore.java:281)
   at org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:218)
   at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:458)
   at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:440)
   at org.apache.cassandra.db.Table.initCf(Table.java:360)
   at org.apache.cassandra.db.Table.init(Table.java:290)
   at org.apache.cassandra.db.Table.open(Table.java:107)
   at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:167)
   at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:312)
   at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:81)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
   at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
   at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:781)
   at org.apache.cassandra.utils.FBUtilities.decodeToUTF8(FBUtilities.java:403)
   at org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:155)
   ... 11 more
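
The root cause in the trace is a strict UTF-8 decode rejecting the key bytes. The following sketch reproduces that kind of failure with the JDK alone; StrictUtf8 and its methods are hypothetical names and are not the FBUtilities.decodeToUTF8 implementation. It shows that a decoder configured to report errors throws on bytes that are not valid UTF-8 (the byte 0xFF never is) instead of silently substituting a replacement character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8 {

    /** Decodes raw bytes as UTF-8 and throws if any byte sequence is malformed. */
    static String decode(byte[] raw) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(raw))
                .toString();
    }

    /** Convenience check: true only if the bytes decode cleanly. */
    static boolean isValidUtf8(byte[] raw) {
        try {
            decode(raw);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A check like isValidUtf8 is one way to tell whether a given key could be handled by a partitioner that insists on UTF-8 keys.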

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (CASSANDRA-1600) Merge get_indexed_slices with get_range_slices

2011-01-31 Thread Ben Pirt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988732#comment-12988732
 ] 

Ben Pirt commented on CASSANDRA-1600:
-

Hopefully it is useful to get another use case for why this is important to a 
real-world user.

We are storing time-series data and would like to be able to pull out all 
values between time A and time B that have a specific value as a property. 
Because we aren't able to combine a range slice with an indexed slice we are 
having to duplicate our data into several keyspaces so we can still do the 
range slice. Our ideal scenario would be to be able to say "Give me all keys 
between time A and time B whose property P is greater than or equal to 5".

I would imagine that in another time-series type scenario of storing lots of 
logs (e.g. Apache logs) it would be very useful to say "Give me all logs 
between time A and time B with a status code of 200".
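
To make the requested query shape concrete, here is a small in-memory sketch. It does not use the Cassandra API: RangeWithPredicate and keysBetween are hypothetical names, and a sorted map from timestamp to the value of property P stands in for the stored rows. It restricts to the key range first and then applies the property predicate.

```java
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.stream.Collectors;

final class RangeWithPredicate {

    /**
     * Returns the keys in [start, end] (inclusive) whose value is at least
     * minValue. 'rows' maps a timestamp key to the value of property P.
     */
    static List<Long> keysBetween(NavigableMap<Long, Integer> rows,
                                  long start, long end, int minValue) {
        return rows.subMap(start, true, end, true).entrySet().stream()
                .filter(e -> e.getValue() >= minValue)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Without a combined call, the same result needs either a full range slice filtered on the client or a duplicated copy of the data keyed for the second access path.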

Please do let me know if I'm misunderstanding things and there is a better 
way of doing this, but it seems to me that it would be very useful 
functionality. Very much looking forward to 0.8 for this fix alone!

 Merge get_indexed_slices with get_range_slices
 --

 Key: CASSANDRA-1600
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1600
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 0.7 beta 1
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 0001-Add-optional-IndexClause-to-KeyRange-and-serialize-wit.txt, 
 0002-Drop-the-IndexClause.count-parameter.txt, 
 0003-Execute-RangeSliceCommands-using-scan-when-an-IndexCla.txt, 
 0004-Remove-get_indexed_slices-method.txt, 
 0005-Update-system-tests-to-use-get_range_slices.txt, 
 0006-Remove-start_key-from-IndexClause-for-the-start_key-in.txt, 
 0007-Respect-end_key-for-filtered-queries.txt, 
 0008-allow-applying-row-filtering-to-sequential-scan.txt, 
 0009-rename-Index-Filter.txt, AbstractScanIterator.java


 From a comment on 1157:
 {quote}
 IndexClause only has a start key for get_indexed_slices, but it would seem 
 that the reasoning behind using 'KeyRange' for get_range_slices applies there 
 as well, since if you know the range you care about in the primary index, you 
 don't want to continue scanning until you exhaust 'count' (or the cluster).
 Since it would appear that get_indexed_slices would benefit from a KeyRange, 
 why not smash get_(range|indexed)_slices together, and make IndexClause an 
 optional field on KeyRange?
 {quote}





[jira] Issue Comment Edited: (CASSANDRA-1600) Merge get_indexed_slices with get_range_slices

2011-01-31 Thread Ben Pirt (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988732#comment-12988732
 ] 

Ben Pirt edited comment on CASSANDRA-1600 at 1/31/11 11:47 AM:
---

Hopefully it is useful to get another use case for why this is important to a 
real-world user.

We are storing time-series data and would like to be able to pull out all 
values between time A and time B that have a specific value as a property. 
Because we aren't able to combine a range slice with an indexed slice we are 
having to duplicate our data into several keyspaces so we can still do the 
range slice. Our ideal scenario would be to be able to say "Give me all keys 
between time A and time B whose property P is greater than or equal to 5".

I would imagine that in another time-series type scenario of storing lots of 
logs (e.g. Apache logs) it would be very useful to say "Give me all logs 
between time A and time B with a status code of 200".

My only question is how this works in conjunction with limit. As a user I would 
expect that if I limited the results to 100, I would get a max of 100 results 
between time A and time B which matched the secondary index query, however I 
understand this may be at odds with how get_range applies the limit. I would 
want the limit to be applied after the secondary index predicate has been 
applied.
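
The difference between the two orderings can be shown with a tiny in-memory sketch. LimitOrdering and its two methods are hypothetical names used only for illustration; nothing here reflects how get_range_slices is implemented.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class LimitOrdering {

    /** Takes the first 'limit' rows of the range, then filters them. */
    static List<Integer> limitThenFilter(List<Integer> values,
                                         Predicate<Integer> predicate, int limit) {
        return values.stream().limit(limit).filter(predicate)
                .collect(Collectors.toList());
    }

    /** Filters first, then returns at most 'limit' matching rows. */
    static List<Integer> filterThenLimit(List<Integer> values,
                                         Predicate<Integer> predicate, int limit) {
        return values.stream().filter(predicate).limit(limit)
                .collect(Collectors.toList());
    }
}
```

For the values 1, 6, 2, 7, 8 with the predicate "at least 5" and a limit of 2, limitThenFilter yields only 6, while filterThenLimit yields 6 and 7. The second behaviour is the one described above as the expected one.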

Please do let me know if I'm misunderstanding things and there is a better 
way of doing this, but it seems to me that it would be very useful 
functionality. Very much looking forward to 0.8 for this fix alone!





svn commit: r1065627 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java

2011-01-31 Thread jbellis
Author: jbellis
Date: Mon Jan 31 14:45:33 2011
New Revision: 1065627

URL: http://svn.apache.org/viewvc?rev=1065627&view=rev
Log:
include stacktrace for configuration errors in system log

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java?rev=1065627&r1=1065626&r2=1065627&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
 Mon Jan 31 14:45:33 2011
@@ -374,19 +374,19 @@ public class DatabaseDescriptor
 }
 catch (UnknownHostException e)
 {
-logger.error("Fatal error: " + e.getMessage());
+logger.error("Fatal configuration error ", e);
 System.err.println("Unable to start with unknown hosts configured.  Use IP addresses instead of hostnames.");
 System.exit(2);
 }
 catch (ConfigurationException e)
 {
-logger.error("Fatal error: " + e.getMessage());
+logger.error("Fatal configuration error", e);
 System.err.println("Bad configuration; unable to start server");
 System.exit(1);
 }
 catch (YAMLException e)
 {
-logger.error("Fatal error: " + e.getMessage());
+logger.error("Fatal configuration error error", e);
 System.err.println("Bad configuration; unable to start server");
 System.exit(1);
 }
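
The effect of this change, passing the exception itself instead of only its message, can be illustrated with a self-contained sketch. It uses java.util.logging so that it runs without dependencies, which is an assumption: the logger in the commit belongs to Cassandra's own logging setup, and StacktraceLogging and CapturingHandler are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class StacktraceLogging {

    /** Collects log records in memory so their contents can be inspected. */
    static final class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override public void publish(LogRecord record) { records.add(record); }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Logs the same exception in the old and the new style and returns both records. */
    static List<LogRecord> logBothWays(Exception e) {
        Logger logger = Logger.getAnonymousLogger();
        logger.setUseParentHandlers(false);
        CapturingHandler handler = new CapturingHandler();
        logger.addHandler(handler);

        // Old style: only the message text survives; the stack trace is lost.
        logger.log(Level.SEVERE, "Fatal error: " + e.getMessage());
        // New style: the throwable travels with the record, so a formatter
        // can print the full stack trace.
        logger.log(Level.SEVERE, "Fatal configuration error", e);

        return handler.records;
    }
}
```

In the first record getThrown() is null, while the second carries the original exception, which is what makes the stack trace available in the system log.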




[jira] Commented: (CASSANDRA-2076) Not starting due to Invalid saved cache

2011-01-31 Thread Thibaut (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988780#comment-12988780
 ] 

Thibaut commented on CASSANDRA-2076:


This might be related:

Two other nodes (still running) also show the "The provided key was not UTF8 
encoded." error in the log.

I have never seen this error in 0.7.0.


ERROR [MutationStage:19] 2011-01-30 21:36:16,951 DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.lang.RuntimeException: The provided key was not UTF8 encoded.
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: The provided key was not UTF8 encoded.
    at org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:159)
    at org.apache.cassandra.dht.OrderPreservingPartitioner.decorateKey(OrderPreservingPartitioner.java:44)
    at org.apache.cassandra.db.Table.apply(Table.java:406)
    at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:190)
    at org.apache.cassandra.service.StorageProxy$2.runMayThrow(StorageProxy.java:288)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:781)
    at org.apache.cassandra.utils.FBUtilities.decodeToUTF8(FBUtilities.java:403)
    at org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:155)
    ... 8 more
ERROR [MutationStage:19] 2011-01-30 21:36:16,991 AbstractCassandraDaemon.java (line 119) Fatal exception in thread Thread[MutationStage:19,5,main]
java.lang.RuntimeException: java.lang.RuntimeException: The provided key was not UTF8 encoded.
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: The provided key was not UTF8 encoded.
    at org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:159)
    at org.apache.cassandra.dht.OrderPreservingPartitioner.decorateKey(OrderPreservingPartitioner.java:44)
    at org.apache.cassandra.db.Table.apply(Table.java:406)
    at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:190)
    at org.apache.cassandra.service.StorageProxy$2.runMayThrow(StorageProxy.java:288)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
    at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:781)
    at org.apache.cassandra.utils.FBUtilities.decodeToUTF8(FBUtilities.java:403)
    at org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:155)
    ... 8 more
 WARN [ScheduledTasks:1] 2011-01-30 21:36:21,450 MessagingService.java (line 506) Dropped 8 MUTATION messages in the last 5000ms






[jira] Updated: (CASSANDRA-2076) Not starting due to Invalid saved cache

2011-01-31 Thread Thibaut (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thibaut updated CASSANDRA-2076:
---

Priority: Critical  (was: Minor)

 Not starting due to Invalid saved cache
 ---

 Key: CASSANDRA-2076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2076
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: linux
Reporter: Thibaut
Priority: Critical
 Fix For: 0.7.2






[jira] Updated: (CASSANDRA-2076) Not starting due to Invalid saved cache

2011-01-31 Thread Thibaut (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thibaut updated CASSANDRA-2076:
---

Fix Version/s: 0.7.1

 Not starting due to Invalid saved cache
 ---

 Key: CASSANDRA-2076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2076
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: linux
Reporter: Thibaut
Priority: Critical
 Fix For: 0.7.1, 0.7.2






[jira] Commented: (CASSANDRA-2076) Not starting due to Invalid saved cache

2011-01-31 Thread Thibaut (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988781#comment-12988781
 ] 

Thibaut commented on CASSANDRA-2076:


I brought down and restarted the entire cluster (100 nodes, 5x20 nodes).

Every single node complains of an invalid file in the saved_caches directory.

 Not starting due to Invalid saved cache
 ---

 Key: CASSANDRA-2076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2076
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: linux
Reporter: Thibaut
Priority: Critical
 Fix For: 0.7.1, 0.7.2






[jira] Updated: (CASSANDRA-2076) Not restarting due to Invalid saved cache

2011-01-31 Thread Thibaut (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thibaut updated CASSANDRA-2076:
---

Summary: Not restarting due to Invalid saved cache  (was: Not starting due 
to Invalid saved cache)

 Not restarting due to Invalid saved cache
 -

 Key: CASSANDRA-2076
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2076
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: linux
Reporter: Thibaut
Priority: Critical
 Fix For: 0.7.1, 0.7.2


 This occurred on two of my nodes (running 0.7.1 from svn).
 One node was killed by the kernel due to an OOM; the other was hanging 
 and I had to kill it manually with kill -9 (plain kill didn't work). (Maybe 
 these were faulty hardware nodes, I don't know.)
 The saved cache was corrupt afterwards and I couldn't start the nodes. 
 After deleting the saved_caches directory I could start the nodes again. 
 Instead of refusing to start when this error occurs, could Cassandra simply 
 delete the erroneous file and continue starting?
  INFO 22:31:11,570 reading saved cache
 /hd1/cassandra_md5/saved_caches/table_attributes-table_attributes-KeyCache
 ERROR 22:31:11,595 Exception encountered during startup.
 java.lang.RuntimeException: The provided key was not UTF8 encoded.
at 
 org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:159)
at 
 org.apache.cassandra.dht.OrderPreservingPartitioner.decorateKey(OrderPreservingPartitioner.java:44)
at 
 org.apache.cassandra.db.ColumnFamilyStore.readSavedCache(ColumnFamilyStore.java:281)
at 
 org.apache.cassandra.db.ColumnFamilyStore.init(ColumnFamilyStore.java:218)
at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:458)
at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:440)
at org.apache.cassandra.db.Table.initCf(Table.java:360)
at org.apache.cassandra.db.Table.init(Table.java:290)
at org.apache.cassandra.db.Table.open(Table.java:107)
at 
 org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:167)
at 
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:312)
at 
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:81)
 Caused by: java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:260)
at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:781)
at 
 org.apache.cassandra.utils.FBUtilities.decodeToUTF8(FBUtilities.java:403)
at 
 org.apache.cassandra.dht.OrderPreservingPartitioner.getToken(OrderPreservingPartitioner.java:155)
... 11 more

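The reporter's suggestion (discard a corrupt saved cache instead of aborting startup) could look roughly like the sketch below. It is not Cassandra code: the class name TolerantCacheLoader, the length-prefixed key format, and the 64 KB sanity limit are all assumptions made for illustration.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra: reads length-prefixed UTF-8 keys
// and treats any corruption as "no saved cache" rather than a fatal error.
class TolerantCacheLoader
{
    static List<String> loadKeys(File cacheFile)
    {
        List<String> keys = new ArrayList<String>();
        if (!cacheFile.exists())
            return keys;
        try (DataInputStream in = new DataInputStream(new FileInputStream(cacheFile)))
        {
            while (true)
            {
                int length;
                try
                {
                    length = in.readInt();
                }
                catch (EOFException e)
                {
                    break; // no further entries
                }
                if (length < 0 || length > 64 * 1024) // assumed sanity limit
                    throw new IOException("implausible key length " + length);
                byte[] raw = new byte[length];
                in.readFully(raw); // throws EOFException if the entry is truncated
                // A fresh decoder reports malformed input instead of replacing it.
                keys.add(StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(raw)).toString());
            }
        }
        catch (IOException e) // CharacterCodingException is a subclass of IOException
        {
            // The stream is already closed here, so the file can be removed.
            System.err.println("Ignoring corrupt saved cache " + cacheFile + ": " + e);
            keys.clear();
            cacheFile.delete();
        }
        return keys;
    }
}
```

A saved cache is only a warm-up hint, so starting with an empty cache costs some initial read latency but loses no data.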




[jira] Commented: (CASSANDRA-2073) Streaming occasionally makes gossip back up

2011-01-31 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988782#comment-12988782
 ] 

Gary Dusbabek commented on CASSANDRA-2073:
--

+1

 Streaming occasionally makes gossip back up
 ---

 Key: CASSANDRA-2073
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2073
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Brandon Williams
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.7.2

 Attachments: 2073.txt


 Streaming occasionally makes gossip back up, causing nodes to mark each other 
 as down even though the network is ok.  This appears to happen just after 
 streaming has finished.  I noticed this in the course of working on 
 CASSANDRA-2072, so decommission is one way to reproduce.  It seems to happen 
 maybe one of fifteen or twenty tries, so it's fairly rare.





svn commit: r1065654 - in /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra: config/DatabaseDescriptor.java service/AbstractCassandraDaemon.java service/StorageService.java

2011-01-31 Thread jbellis
Author: jbellis
Date: Mon Jan 31 15:40:48 2011
New Revision: 1065654

URL: http://svn.apache.org/viewvc?rev=1065654&view=rev
Log:
more informative error messages for configuration problems
patch by jbellis

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java?rev=1065654&r1=1065653&r2=1065654&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
 Mon Jan 31 15:40:48 2011
@@ -120,8 +120,17 @@ public classDatabaseDescriptor
 {
 URL url = getStorageConfigURL();
 logger.info("Loading settings from " + url);
-
-InputStream input = url.openStream();
+
+InputStream input = null;
+try
+{
+input = url.openStream();
+}
+catch (IOException e)
+{
+// getStorageConfigURL should have ruled this out
+throw new AssertionError(e);
+}
 org.yaml.snakeyaml.constructor.Constructor constructor = new 
org.yaml.snakeyaml.constructor.Constructor(Config.class);
 TypeDescription desc = new TypeDescription(Config.class);
 desc.putListPropertyType("keyspaces", RawKeyspace.class);
@@ -253,7 +262,16 @@ public classDatabaseDescriptor
 
 /* Local IP or hostname to bind RPC server to */
 if (conf.rpc_address != null)
-rpcAddress = InetAddress.getByName(conf.rpc_address);
+{
+try
+{
+rpcAddress = InetAddress.getByName(conf.rpc_address);
+}
+catch (UnknownHostException e)
+{
+throw new ConfigurationException("Unknown host in 
rpc_address " + conf.rpc_address);
+}
+}
 
 if (conf.thrift_framed_transport_size_in_mb > 0 && 
conf.thrift_max_message_length_in_mb < conf.thrift_framed_transport_size_in_mb)
 {
@@ -291,6 +309,10 @@ public classDatabaseDescriptor
 {
 throw new ConfigurationException("Invalid Request 
Scheduler class " + conf.request_scheduler);
 }
+catch (Exception e)
+{
+throw new ConfigurationException("Unable to instantiate 
request scheduler", e);
+}
 }
 else
 {
@@ -369,31 +391,28 @@ public classDatabaseDescriptor
 }
 for (String seedString : conf.seeds)
 {
-seeds.add(InetAddress.getByName(seedString));
+try
+{
+seeds.add(InetAddress.getByName(seedString));
+}
+catch (UnknownHostException e)
+{
+throw new ConfigurationException("Unknown seed " + 
seedString + ".  Consider using IP addresses instead of host names");
+}
 }
 }
-catch (UnknownHostException e)
-{
-logger.error("Fatal configuration error ", e);
-System.err.println("Unable to start with unknown hosts configured. 
 Use IP addresses instead of hostnames.");
-System.exit(2);
-}
 catch (ConfigurationException e)
 {
 logger.error("Fatal configuration error", e);
-System.err.println("Bad configuration; unable to start server");
+System.err.println(e.getMessage() + "\nFatal configuration error; 
unable to start server.  See log for stacktrace.");
 System.exit(1);
 }
 catch (YAMLException e)
 {
 logger.error("Fatal configuration error error", e);
-System.err.println("Bad configuration; unable to start server");
+System.err.println(e.getMessage() + "\nInvalid yaml; unable to 
start server.  See log for stacktrace.");
 System.exit(1);
 }
-catch (Exception e)
-{
-throw new RuntimeException(e);
-}
 }
 
 private static IEndpointSnitch createEndpointSnitch(String 
endpointSnitchClassName) throws ConfigurationException

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
URL: 

svn commit: r1065660 - in /cassandra/branches/cassandra-0.7: conf/cassandra.yaml src/java/org/apache/cassandra/locator/TokenMetadata.java

2011-01-31 Thread jbellis
Author: jbellis
Date: Mon Jan 31 16:02:21 2011
New Revision: 1065660

URL: http://svn.apache.org/viewvc?rev=1065660&view=rev
Log:
move initialization out of constructor where possible

Modified:
cassandra/branches/cassandra-0.7/conf/cassandra.yaml

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/TokenMetadata.java

Modified: cassandra/branches/cassandra-0.7/conf/cassandra.yaml
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/conf/cassandra.yaml?rev=1065660&r1=1065659&r2=1065660&view=diff
==
--- cassandra/branches/cassandra-0.7/conf/cassandra.yaml (original)
+++ cassandra/branches/cassandra-0.7/conf/cassandra.yaml Mon Jan 31 16:02:21 
2011
@@ -220,7 +220,7 @@ rpc_timeout_in_ms: 1
 # org.apache.cassandra.locator.PropertyFileSnitch:
 #  - Proximity is determined by rack and data center, which are
 #explicitly configured in cassandra-topology.properties.
-endpoint_snitch: org.apache.cassandra.locator.SimpleSnitch
+endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch
 
 # dynamic_snitch -- This boolean controls whether the above snitch is
 # wrapped with a dynamic snitch, which will monitor read latencies

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/TokenMetadata.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/TokenMetadata.java?rev=1065660&r1=1065659&r2=1065660&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/TokenMetadata.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/TokenMetadata.java
 Mon Jan 31 16:02:21 2011
@@ -50,22 +50,22 @@ public class TokenMetadata
 // for any nodes that boot simultaneously between same two nodes. For this 
we cannot simply make pending ranges a <tt>Multimap</tt>,
 // since that would make us unable to notice the real problem of two nodes 
trying to boot using the same token.
 // In order to do this properly, we need to know what tokens are booting 
at any time.
-private BiMap<Token, InetAddress> bootstrapTokens;
+private BiMap<Token, InetAddress> bootstrapTokens = HashBiMap.create();
 
 // we will need to know at all times what nodes are leaving and calculate 
ranges accordingly.
 // An anonymous pending ranges list is not enough, as that does not tell 
which node is leaving
 // and/or if the ranges are there because of bootstrap or leave operation.
 // (See CASSANDRA-603 for more detail + examples).
-private Set<InetAddress> leavingEndpoints;
+private Set<InetAddress> leavingEndpoints = new HashSet<InetAddress>();
 
-private ConcurrentMap<String, Multimap<Range, InetAddress>> pendingRanges;
+private ConcurrentMap<String, Multimap<Range, InetAddress>> pendingRanges 
= new ConcurrentHashMap<String, Multimap<Range, InetAddress>>();
 
 /* Use this lock for manipulating the token map */
 private final ReadWriteLock lock = new ReentrantReadWriteLock(true);
 private ArrayList<Token> sortedTokens;
 
 /* list of subscribers that are notified when the tokenToEndpointMap 
changed */
-private final CopyOnWriteArrayList<AbstractReplicationStrategy> 
subscribers;
+private final CopyOnWriteArrayList<AbstractReplicationStrategy> 
subscribers = new CopyOnWriteArrayList<AbstractReplicationStrategy>();
 
 public TokenMetadata()
 {
@@ -77,11 +77,7 @@ public class TokenMetadata
 if (tokenToEndpointMap == null)
 tokenToEndpointMap = HashBiMap.create();
 this.tokenToEndpointMap = tokenToEndpointMap;
-bootstrapTokens = HashBiMap.create();
-leavingEndpoints = new HashSet<InetAddress>();
-pendingRanges = new ConcurrentHashMap<String, Multimap<Range, 
InetAddress>>();
 sortedTokens = sortTokens();
-subscribers = new CopyOnWriteArrayList<AbstractReplicationStrategy>();
 }
 
 private ArrayList<Token> sortTokens()

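The pattern in this commit (initializing collection fields at their declaration rather than in a constructor) can be shown in isolation with the sketch below. The Endpoints class and its members are hypothetical and only illustrate the idea; they are not taken from TokenMetadata.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class, for illustration only.
class Endpoints
{
    // Initialized at the declaration, so every constructor sees a non-null set
    // and the field can be final without repeating the assignment.
    private final Set<String> leaving = new HashSet<String>();
    private final String name;

    Endpoints()
    {
        this("default");
    }

    Endpoints(String name)
    {
        this.name = name; // only state that depends on arguments stays here
    }

    void markLeaving(String endpoint)
    {
        leaving.add(endpoint);
    }

    int leavingCount()
    {
        return leaving.size();
    }

    String name()
    {
        return name;
    }
}
```

Field initializers run before the constructor body, so adding a new constructor later cannot accidentally leave one of these fields null.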



svn commit: r1065664 - in /cassandra/branches/cassandra-0.7: CHANGES.txt src/java/org/apache/cassandra/gms/Gossiper.java src/java/org/apache/cassandra/net/IncomingTcpConnection.java src/java/org/apach

2011-01-31 Thread gdusbabek
Author: gdusbabek
Date: Mon Jan 31 16:12:57 2011
New Revision: 1065664

URL: http://svn.apache.org/viewvc?rev=1065664&view=rev
Log:
ignore messages from the future. keep track of nodes in gossip regardless. 
patch by gdusbabek, reviewed by jbellis. CASSANDRA-1970

Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/gms/Gossiper.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/net/IncomingTcpConnection.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/net/MessagingService.java

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1065664&r1=1065663&r2=1065664&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Mon Jan 31 16:12:57 2011
@@ -49,7 +49,8 @@
  * fix math in RandomPartitioner.describeOwnership (CASSANDRA-2071)
  * fix deletion of sstable non-data components (CASSANDRA-2059)
  * avoid blocking gossip while deleting handoff hints (CASSANDRA-2073)
-
+ * ignore messages from newer versions, keep track of nodes in gossip 
+   regardless of version (CASSANDRA-1970)
 
 0.7.0-final
  * fix offsets to ByteBuffer.get (CASSANDRA-1939)

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/gms/Gossiper.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/gms/Gossiper.java?rev=1065664&r1=1065663&r2=1065664&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/gms/Gossiper.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/gms/Gossiper.java
 Mon Jan 31 16:12:57 2011
@@ -26,6 +26,7 @@ import java.util.*;
 import java.util.Map.Entry;
 import java.util.concurrent.*;
 
+import org.cliffc.high_scale_lib.NonBlockingHashMap;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -141,6 +142,10 @@ public class Gossiper implements IFailur
  * after removal to prevent nodes from falsely reincarnating during the 
time when removal
  * gossip gets propagated to all nodes */
 Map<InetAddress, Long> justRemovedEndpoints_ = new 
ConcurrentHashMap<InetAddress, Long>();
+
+// protocol versions of the other nodes in the cluster
+private final ConcurrentMap<InetAddress, Integer> versions = new 
NonBlockingHashMap<InetAddress, Integer>();
+
 
 private Gossiper()
 {
@@ -169,6 +174,20 @@ public class Gossiper implements IFailur
 {
 subscribers_.remove(subscriber);
 }
+
+public void setVersion(InetAddress address, int version)
+{
+Integer old = versions.put(address, version);
+EndpointState state = endpointStateMap_.get(address);
+if (state == null)
+addSavedEndpoint(address);
+}
+
+public Integer getVersion(InetAddress address)
+{
+return versions.get(address);
+}
+
 
 public Set<InetAddress> getLiveMembers()
 {

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/net/IncomingTcpConnection.java?rev=1065664&r1=1065663&r2=1065664&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
 Mon Jan 31 16:12:57 2011
@@ -24,6 +24,7 @@ package org.apache.cassandra.net;
 import java.io.*;
 import java.net.Socket;
 
+import org.apache.cassandra.gms.Gossiper;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -52,6 +53,7 @@ public class IncomingTcpConnection exten
 {
 DataInputStream input;
 boolean isStream;
+int version;
 try
 {
 // determine the connection type to decide whether to buffer
@@ -62,6 +64,8 @@ public class IncomingTcpConnection exten
 if (!isStream)
 // we should buffer
 input = new DataInputStream(new 
BufferedInputStream(socket.getInputStream(), 4096));
+version = MessagingService.getBits(header, 15, 8);
+Gossiper.instance.setVersion(socket.getInetAddress(), version);
 }
 catch (IOException e)
 {
@@ -74,6 +78,12 @@ public class IncomingTcpConnection exten
 {
 if (isStream)
 {
+if (version > MessagingService.version_)
+{
+logger.error("Received untranslated stream from newer 
protcol version. 

svn commit: r1065665 - in /cassandra/branches/cassandra-0.7: src/java/org/apache/cassandra/locator/ src/java/org/apache/cassandra/service/ test/unit/org/apache/cassandra/dht/

2011-01-31 Thread jbellis
Author: jbellis
Date: Mon Jan 31 16:13:20 2011
New Revision: 1065665

URL: http://svn.apache.org/viewvc?rev=1065665&view=rev
Log:
convert SS.partitioner, valueFactory to instance fields

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/Ec2Snitch.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/MigrationManager.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageLoadBalancer.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java

cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/dht/BootStrapperTest.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/Ec2Snitch.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/Ec2Snitch.java?rev=1065665&r1=1065664&r2=1065665&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/Ec2Snitch.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/locator/Ec2Snitch.java
 Mon Jan 31 16:13:20 2011
@@ -89,7 +89,7 @@ public class Ec2Snitch extends AbstractN
 {
 // Share EC2 info via gossip.  We have to wait until Gossiper is 
initialized though.
 logger.info("Ec2Snitch adding ApplicationState ec2region=" + ec2region 
+ " ec2zone=" + ec2zone);
-Gossiper.instance.addLocalApplicationState(ApplicationState.DC, 
StorageService.valueFactory.datacenter(ec2region));
-Gossiper.instance.addLocalApplicationState(ApplicationState.RACK, 
StorageService.valueFactory.rack(ec2zone));
+Gossiper.instance.addLocalApplicationState(ApplicationState.DC, 
StorageService.instance.valueFactory.datacenter(ec2region));
+Gossiper.instance.addLocalApplicationState(ApplicationState.RACK, 
StorageService.instance.valueFactory.rack(ec2zone));
 }
 }

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/MigrationManager.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/MigrationManager.java?rev=1065665&r1=1065664&r2=1065665&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/MigrationManager.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/MigrationManager.java
 Mon Jan 31 16:13:20 2011
@@ -97,7 +97,7 @@ public class MigrationManager implements
 MessagingService.instance().sendOneWay(msg, host);
 // this is for notifying nodes as they arrive in the cluster.
 if (!StorageService.instance.isClientMode())
-
Gossiper.instance.addLocalApplicationState(ApplicationState.SCHEMA, 
StorageService.valueFactory.migration(version));
+
Gossiper.instance.addLocalApplicationState(ApplicationState.SCHEMA, 
StorageService.instance.valueFactory.migration(version));
 }
 
 /**

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageLoadBalancer.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageLoadBalancer.java?rev=1065665&r1=1065664&r2=1065665&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageLoadBalancer.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageLoadBalancer.java
 Mon Jan 31 16:13:20 2011
@@ -348,7 +348,7 @@ public class StorageLoadBalancer impleme
 if (logger_.isDebugEnabled())
 logger_.debug("Disseminating load info ...");
 
Gossiper.instance.addLocalApplicationState(ApplicationState.LOAD,
-   
StorageService.valueFactory.load(StorageService.instance.getLoad()));
+   
StorageService.instance.valueFactory.load(StorageService.instance.getLoad()));
 }
 };
 StorageService.scheduledTasks.scheduleWithFixedDelay(runnable, 2 * 
Gossiper.intervalInMillis_, BROADCAST_INTERVAL, TimeUnit.MILLISECONDS);

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java?rev=1065665&r1=1065664&r2=1065665&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/StorageService.java
 (original)
+++ 

svn commit: r1065668 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java

2011-01-31 Thread jbellis
Author: jbellis
Date: Mon Jan 31 16:16:06 2011
New Revision: 1065668

URL: http://svn.apache.org/viewvc?rev=1065668&view=rev
Log:
fix circular initialization problem with PropertyFileSnitch caused by #1951
patch by slebresne; reviewed by jbellis

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java?rev=1065668&r1=1065667&r2=1065668&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
 Mon Jan 31 16:16:06 2011
@@ -56,11 +56,6 @@ import org.mortbay.thread.ThreadPool;
  */
 public abstract class AbstractCassandraDaemon implements CassandraDaemon
 {
-public AbstractCassandraDaemon()
-{
-StorageService.instance.registerDaemon(this);
-}
-
 //Initialize logging in such a way that it checks for config changes every 
10 seconds.
 static
 {
@@ -184,6 +179,7 @@ public abstract class AbstractCassandraD
 SystemTable.purgeIncompatibleHints();
 
 // start server internals
+StorageService.instance.registerDaemon(this);
 try
 {
 StorageService.instance.initServer();




svn commit: r1065669 - /cassandra/branches/cassandra-0.7/conf/cassandra.yaml

2011-01-31 Thread jbellis
Author: jbellis
Date: Mon Jan 31 16:16:44 2011
New Revision: 1065669

URL: http://svn.apache.org/viewvc?rev=1065669&view=rev
Log:
set default snitch back to Simple

Modified:
cassandra/branches/cassandra-0.7/conf/cassandra.yaml

Modified: cassandra/branches/cassandra-0.7/conf/cassandra.yaml
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/conf/cassandra.yaml?rev=1065669&r1=1065668&r2=1065669&view=diff
==
--- cassandra/branches/cassandra-0.7/conf/cassandra.yaml (original)
+++ cassandra/branches/cassandra-0.7/conf/cassandra.yaml Mon Jan 31 16:16:44 
2011
@@ -220,7 +220,7 @@ rpc_timeout_in_ms: 1
 # org.apache.cassandra.locator.PropertyFileSnitch:
 #  - Proximity is determined by rack and data center, which are
 #explicitly configured in cassandra-topology.properties.
-endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch
+endpoint_snitch: org.apache.cassandra.locator.SimpleSnitch
 
 # dynamic_snitch -- This boolean controls whether the above snitch is
 # wrapped with a dynamic snitch, which will monitor read latencies




svn commit: r1065676 - in /cassandra/trunk: ./ conf/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/config/ src/java/org/apache/cassandra/db/ src/java/org/apache/

2011-01-31 Thread gdusbabek
Author: gdusbabek
Date: Mon Jan 31 16:30:16 2011
New Revision: 1065676

URL: http://svn.apache.org/viewvc?rev=1065676&view=rev
Log:
merge from 0.7

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/conf/cassandra.yaml

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
cassandra/trunk/src/java/org/apache/cassandra/db/HintedHandOffManager.java
cassandra/trunk/src/java/org/apache/cassandra/gms/Gossiper.java
cassandra/trunk/src/java/org/apache/cassandra/locator/Ec2Snitch.java
cassandra/trunk/src/java/org/apache/cassandra/locator/TokenMetadata.java
cassandra/trunk/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
cassandra/trunk/src/java/org/apache/cassandra/net/MessagingService.java

cassandra/trunk/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
cassandra/trunk/src/java/org/apache/cassandra/service/MigrationManager.java

cassandra/trunk/src/java/org/apache/cassandra/service/StorageLoadBalancer.java
cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
cassandra/trunk/test/unit/org/apache/cassandra/dht/BootStrapperTest.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Mon Jan 31 16:30:16 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1064713
-/cassandra/branches/cassandra-0.7:1026516-1064915
+/cassandra/branches/cassandra-0.7:1026516-1065665
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3:774578-796573

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1065676&r1=1065675&r2=1065676&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Mon Jan 31 16:30:16 2011
@@ -58,7 +58,9 @@
(CASSANDRA-2058)
  * fix math in RandomPartitioner.describeOwnership (CASSANDRA-2071)
  * fix deletion of sstable non-data components (CASSANDRA-2059)
-
+ * avoid blocking gossip while deleting handoff hints (CASSANDRA-2073)
+ * ignore messages from newer versions, keep track of nodes in gossip 
+   regardless of version (CASSANDRA-1970)
 
 0.7.0-final
  * fix offsets to ByteBuffer.get (CASSANDRA-1939)

Modified: cassandra/trunk/conf/cassandra.yaml
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/conf/cassandra.yaml?rev=1065676&r1=1065675&r2=1065676&view=diff
==
--- cassandra/trunk/conf/cassandra.yaml (original)
+++ cassandra/trunk/conf/cassandra.yaml Mon Jan 31 16:30:16 2011
@@ -225,7 +225,7 @@ rpc_timeout_in_ms: 1
 # org.apache.cassandra.locator.PropertyFileSnitch:
 #  - Proximity is determined by rack and data center, which are
 #explicitly configured in cassandra-topology.properties.
-endpoint_snitch: org.apache.cassandra.locator.SimpleSnitch
+endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch
 
 # dynamic_snitch -- This boolean controls whether the above snitch is
 # wrapped with a dynamic snitch, which will monitor read latencies

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Mon Jan 31 16:30:16 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1064713
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1064915
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1065665
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573

Propchange: 

[jira] Commented: (CASSANDRA-1970) Message version resolution

2011-01-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988795#comment-12988795
 ] 

Hudson commented on CASSANDRA-1970:
---

Integrated in Cassandra-0.7 #231 (See 
[https://hudson.apache.org/hudson/job/Cassandra-0.7/231/])
ignore messages from the future. keep track of nodes in gossip regardless. 
patch by gdusbabek, reviewed by jbellis. CASSANDRA-1970


 Message version resolution
 --

 Key: CASSANDRA-1970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1970
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Gary Dusbabek
Assignee: Gary Dusbabek
Priority: Minor
 Fix For: 0.7.2

 Attachments: 1970.txt, 
 v3-0001-ignore-messages-from-newer-versions-keep-track-of-node.txt


 When a new node (version N) contacts an old node (version N-1) for the 
 first time, the old node will not understand the message.  One resolution 
 mechanism would be for the old node to bounce the message back to the 
 sender.  The sender would then respond by translating the message to the 
 appropriate version and resending it.
 For this to work, 0.7.1 will need to have the bounce feature.
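
The bounce-and-resend idea described above could be sketched as follows. This is an illustration only, not the committed patch (whose log says newer messages are ignored rather than bounced); the class VersionedMessaging, the Action enum, the string peer names, and the version numbers are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of per-peer protocol version tracking.
class VersionedMessaging
{
    static final int MY_VERSION = 2; // assumed protocol version of this node

    enum Action { PROCESS, BOUNCE }

    // Last protocol version learned for each peer.
    private final ConcurrentMap<String, Integer> versions = new ConcurrentHashMap<String, Integer>();

    // Receiver side: remember the sender's version, bounce what we cannot read.
    Action onReceive(String peer, int messageVersion)
    {
        versions.put(peer, messageVersion);
        return messageVersion > MY_VERSION ? Action.BOUNCE : Action.PROCESS;
    }

    // Sender side: a bounce tells us which version the peer actually speaks.
    void onBounce(String peer, int peerVersion)
    {
        versions.put(peer, peerVersion);
    }

    // Sender side: translate down to the older of the two versions before resending.
    int versionToSend(String peer)
    {
        Integer known = versions.get(peer);
        return known == null ? MY_VERSION : Math.min(known, MY_VERSION);
    }
}
```

Until a peer's version is known, the sender uses its own version; after one bounce it downgrades for that peer only.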





[jira] Created: (CASSANDRA-2079) AsciiType comparator no longer usable on numeric types in 0.7

2011-01-31 Thread Robbie Strickland (JIRA)
AsciiType comparator no longer usable on numeric types in 0.7
-

 Key: CASSANDRA-2079
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2079
 Project: Cassandra
  Issue Type: Improvement
  Components: Documentation & website
Affects Versions: 0.7.0
 Environment: Ubuntu 10
Reporter: Robbie Strickland
 Fix For: 0.7.0


Prior to 0.7, if you wanted to use integer values other than long types as 
column names, you had to use AsciiType to get a valid numeric-order comparison. 
 If you migrate to 0.7 you need to change the comparison type to IntegerType, 
otherwise you will get the following error: InvalidRequestException(Why: 
Invalid byte for ascii: -51), or something similar.  The documentation should 
be updated to warn users of this issue.
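
A small sketch of why such a column name fails an ASCII check: integer encodings can contain bytes with the high bit set, which Java reports as negative values. The AsciiCheck class and the value 205 are chosen here purely for illustration (205 happens to encode to a byte that prints as -51); the reporter's actual data and Cassandra's validation code are not shown in this message.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical check, for illustration only.
class AsciiCheck
{
    // Returns the first byte outside 7-bit ASCII, or null if every byte is ASCII.
    static Byte firstNonAscii(byte[] bytes)
    {
        for (byte b : bytes)
            if (b < 0) // high bit set, so not 7-bit ASCII
                return b;
        return null;
    }

    public static void main(String[] args)
    {
        byte[] asText = "205".getBytes(StandardCharsets.US_ASCII);
        byte[] asInteger = BigInteger.valueOf(205).toByteArray(); // big-endian two's complement: 0x00 0xCD
        System.out.println(firstNonAscii(asText));    // prints null
        System.out.println(firstNonAscii(asInteger)); // prints -51
    }
}
```

The same number is valid ASCII when written as text digits but not when stored as raw integer bytes, which is why the comparator has to match how the names were serialized.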





[jira] Commented: (CASSANDRA-1969) Use BB for row cache - To Improve GC performance.

2011-01-31 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988806#comment-12988806
 ] 

Vijay commented on CASSANDRA-1969:
--

Hi Jonathan,

1) Can we catch the OOM when allocating direct memory, log it, and return null 
so that normal operations are not affected?
2) Can we add a JVM parameter to limit the JVM's direct memory allocations 
(which will include the allocations for the page cache)?
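
A minimal sketch of point 1, assuming the caller treats a null return as "do 
not cache this row". DirectAlloc and tryAllocateDirect are hypothetical names, 
not part of the attached patches. For point 2, HotSpot already has the 
-XX:MaxDirectMemorySize=<size> option, which caps direct buffer allocations 
made through ByteBuffer.allocateDirect.

```java
import java.nio.ByteBuffer;

/** Illustrative only: hypothetical helper, not code from the attached patches. */
final class DirectAlloc {
    /** Tries to allocate off-heap memory; returns null if the JVM refuses. */
    static ByteBuffer tryAllocateDirect(int size) {
        try {
            return ByteBuffer.allocateDirect(size);
        } catch (OutOfMemoryError e) {
            // Thrown when the direct memory limit is exhausted; log and carry on.
            System.err.println("direct allocation of " + size + " bytes failed: " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = tryAllocateDirect(1024);
        System.out.println(buf != null && buf.isDirect());
    }
}
```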

 Use BB for row cache - To Improve GC performance.
 -

 Key: CASSANDRA-1969
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1969
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux and Mac
Reporter: Vijay
Assignee: Vijay
Priority: Minor
 Attachments: 0001-Config-1969.txt, 
 0001-introduce-ICache-InstrumentingCache-IRowCacheProvider.txt, 
 0002-Update_existing-1965.txt, 0002-implement-SerializingCache.txt, 
 0003-New_Cache_Providers-1969.txt, 0003-add-ICache.isCopying-method.txt, 
 0004-TestCase-1969.txt, BB_Cache-1945.png, JMX-Cache-1945.png, 
 Old_Cahce-1945.png, POC-0001-Config-1945.txt, 
 POC-0002-Update_existing-1945.txt, POC-0003-New_Cache_Providers-1945.txt


 Java's ByteBuffer.allocateDirect() allocates native memory outside the JVM 
 heap, which helps reduce GC pressure in the JVM when the cache is large.
 Some basic tests show around a 50% improvement over a normal object cache.
 In addition, this patch gives users the option to choose either 
 BB.allocateDirect or storing everything in the heap.





[jira] Created: (CASSANDRA-2080) Upgrade to release of Whirr 0.3.0

2011-01-31 Thread Stu Hood (JIRA)
Upgrade to release of Whirr 0.3.0
-

 Key: CASSANDRA-2080
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2080
 Project: Cassandra
  Issue Type: Improvement
Reporter: Stu Hood
Assignee: Stu Hood
Priority: Trivial


Whirr 0.3.0 has been released.





[jira] Commented: (CASSANDRA-1600) Merge get_indexed_slices with get_range_slices

2011-01-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988844#comment-12988844
 ] 

Jonathan Ellis commented on CASSANDRA-1600:
---

You can do this with the existing get_indexed_slices API, you just have to 
manually stop paging when you get to B.
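
A rough sketch of that client-side loop, with an in-memory sorted set standing 
in for the index. fetchPage is a hypothetical stand-in for one 
get_indexed_slices call, and the sketch assumes rows come back in key order 
with an inclusive start key; it is not the Thrift API itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Illustrative only: simulates paging over an index and stopping at an end key. */
final class PagingSketch {
    /** Stand-in for one paged call: up to count keys >= startKey, in order. */
    static List<String> fetchPage(NavigableSet<String> index, String startKey, int count) {
        List<String> page = new ArrayList<>();
        for (String k : index.tailSet(startKey, true)) {
            if (page.size() == count) break;
            page.add(k);
        }
        return page;
    }

    /** Pages from startKey and stops on the client side once endKey has been passed. */
    static List<String> scan(NavigableSet<String> index, String startKey, String endKey, int count) {
        if (count < 2) throw new IllegalArgumentException("count must be >= 2 to make progress");
        List<String> out = new ArrayList<>();
        String next = startKey;
        boolean first = true;
        while (true) {
            List<String> page = fetchPage(index, next, count);
            for (String k : page) {
                if (!first && k.equals(next)) continue;  // start key is inclusive, skip the repeat
                if (k.compareTo(endKey) > 0) return out; // past B: stop paging manually
                out.add(k);
            }
            if (page.size() < count) return out;         // index exhausted
            next = page.get(page.size() - 1);
            first = false;
        }
    }

    public static void main(String[] args) {
        NavigableSet<String> index = new TreeSet<>(List.of("a", "b", "c", "d", "e"));
        System.out.println(scan(index, "a", "c", 2));
    }
}
```

The cost the ticket is concerned with is visible here: the last page is still 
fetched in full even though most of it lies beyond B.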

 Merge get_indexed_slices with get_range_slices
 --

 Key: CASSANDRA-1600
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1600
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 0.7 beta 1
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 
 0001-Add-optional-IndexClause-to-KeyRange-and-serialize-wit.txt, 
 0002-Drop-the-IndexClause.count-parameter.txt, 
 0003-Execute-RangeSliceCommands-using-scan-when-an-IndexCla.txt, 
 0004-Remove-get_indexed_slices-method.txt, 
 0005-Update-system-tests-to-use-get_range_slices.txt, 
 0006-Remove-start_key-from-IndexClause-for-the-start_key-in.txt, 
 0007-Respect-end_key-for-filtered-queries.txt, 
 0008-allow-applying-row-filtering-to-sequential-scan.txt, 
 0009-rename-Index-Filter.txt, AbstractScanIterator.java


 From a comment on 1157:
 {quote}
 IndexClause only has a start key for get_indexed_slices, but it would seem 
 that the reasoning behind using 'KeyRange' for get_range_slices applies there 
 as well, since if you know the range you care about in the primary index, you 
 don't want to continue scanning until you exhaust 'count' (or the cluster).
 Since it would appear that get_indexed_slices would benefit from a KeyRange, 
 why not smash get_(range|indexed)_slices together, and make IndexClause an 
 optional field on KeyRange?
 {quote}





[jira] Created: (CASSANDRA-2081) Consistency QUORUM does not work anymore

2011-01-31 Thread Thibaut (JIRA)
Consistency QUORUM does not work anymore


 Key: CASSANDRA-2081
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2081
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: linux, hector + cassandra
Reporter: Thibaut
Priority: Blocker
 Fix For: 0.7.1


I'm using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25.

Using consistency level QUORUM no longer works (tested on reads). 
Consistency level ONE still works, though.

I have tried this with one dead node in my cluster.
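
For reference, a quorum in Cassandra is RF / 2 + 1 replicas (integer 
division), so a single dead replica should not by itself prevent QUORUM reads 
when RF is 3 or more. The replication factor of this cluster is not stated in 
the report, so the numbers below are an assumption for illustration only.

```java
/** Illustrative only: quorum arithmetic, with an assumed replication factor. */
final class QuorumMath {
    /** Number of replicas that must respond for a QUORUM operation. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True if enough replicas are alive to satisfy QUORUM. */
    static boolean canSatisfy(int replicationFactor, int liveReplicas) {
        return liveReplicas >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3; // assumption: RF is not given in the ticket
        System.out.println(quorum(rf));             // prints 2
        System.out.println(canSatisfy(rf, rf - 1)); // prints true
    }
}
```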

If I restart cassandra with an older svn revision 
(apache-cassandra-2011-01-28_20-06-01.jar), I can access the cluster with 
consistency level QUORUM again, while still using 
apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25 in my application.


11/01/31 19:54:38 ERROR connection.CassandraHostRetryService: Downed 
intr1n18(192.168.0.18):9160 host still appears to be down: Unable to open 
transport to intr1n18(192.168.0.18):9160 , java.net.NoRouteToHostException: No 
route to host
11/01/31 19:54:38 INFO connection.CassandraHostRetryService: Downed Host retry 
status false with host: intr1n18(192.168.0.18):9160
11/01/31 19:54:45 ERROR connection.HConnectionManager: Could not fullfill 
request on this host CassandraClientintr1n11:9160-483

However, intr1n11 is marked as up, and I can also access the node through the 
Cassandra CLI.


192.168.0.1   Up     Normal  8.02 GB  5.00%   0cc
192.168.0.2   Up     Normal  7.96 GB  5.00%   199
192.168.0.3   Up     Normal  8.24 GB  5.00%   266
192.168.0.4   Up     Normal  4.94 GB  5.00%   333
192.168.0.5   Up     Normal  5.02 GB  5.00%   400
192.168.0.6   Up     Normal  5 GB     5.00%   4cc
192.168.0.7   Up     Normal  5.1 GB   5.00%   599
192.168.0.8   Up     Normal  5.07 GB  5.00%   666
192.168.0.9   Up     Normal  4.78 GB  5.00%   733
192.168.0.10  Up     Normal  4.34 GB  5.00%   7ff
192.168.0.11  Up     Normal  5.01 GB  5.00%   8cc
192.168.0.12  Up     Normal  5.31 GB  5.00%   999
192.168.0.13  Up     Normal  5.56 GB  5.00%   a66
192.168.0.14  Up     Normal  5.82 GB  5.00%   b33
192.168.0.15  Up     Normal  5.57 GB  5.00%   c00
192.168.0.16  Up     Normal  5.03 GB  5.00%   ccc
192.168.0.17  Up     Normal  4.77 GB  5.00%   d99
192.168.0.18  Down   Normal  ?        5.00%   e66
192.168.0.19  Up     Normal  4.78 GB  5.00%   f33
192.168.0.20  Up     Normal  4.83 GB  5.00%










[jira] Updated: (CASSANDRA-2072) Race condition during decommission

2011-01-31 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2072:


Attachment: (was: 
0003-Remove-endpoint-state-when-expiring-justRemovedEndpo.patch)

 Race condition during decommission
 --

 Key: CASSANDRA-2072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2072
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Attachments: 
 0001-announce-having-left-the-ring-for-RING_DELAY-on-deco.patch, 
 0002-Improve-TRACE-logging-for-Gossiper.patch


 Occasionally when decommissioning a node, there is a race condition that 
 occurs where another node will never remove the token and thus propagate it 
 again with a state of down.  With CASSANDRA-1900 we can solve this, but it 
 shouldn't occur in the first place.
 Given nodes A, B, and C, if you decommission B it will stream to A and C.  
 When complete, B will decommission and receive this stacktrace:
 ERROR 00:02:40,282 Fatal exception in thread Thread[Thread-5,5,main]
 java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut 
 down
 at 
 org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:62)
 at 
 java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
 at 
 java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
 at 
 org.apache.cassandra.net.MessagingService.receive(MessagingService.java:387)
 at 
 org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:91
 At this point A will show it is removing B's token, but C will not and 
 instead its failure detector will report that B is dead, and nodetool ring on 
 C shows B in a leaving/down state.  In another gossip round, C will propagate 
 this state back to A.
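
 The RejectedExecutionException in the trace is the standard JDK behaviour 
 when a task is submitted to an executor that has already been shut down. A 
 minimal reproduction with a plain JDK pool (not Cassandra's 
 DebuggableThreadPoolExecutor, which installs its own rejection handler) 
 looks like this; RejectAfterShutdown is a name made up for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

/** Illustrative only: shows that a shut-down pool rejects late submissions. */
final class RejectAfterShutdown {
    /** Returns true if submitting after shutdown() was rejected. */
    static boolean rejectedAfterShutdown() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.shutdown(); // e.g. the node has finished decommissioning
        try {
            pool.execute(() -> { }); // a message arriving late
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectedAfterShutdown()); // prints true
    }
}
```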





[jira] Commented: (CASSANDRA-2081) Consistency QUORUM does not work anymore

2011-01-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988865#comment-12988865
 ] 

Jonathan Ellis commented on CASSANDRA-2081:
---

What kind of "doesn't work" are you seeing?





[jira] Created: (CASSANDRA-2083) Hinted Handoff and schema race

2011-01-31 Thread Brandon Williams (JIRA)
Hinted Handoff and schema race
--

 Key: CASSANDRA-2083
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2083
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Brandon Williams
Priority: Minor


If a node is down while a keyspace/cf is created and then data is inserted into 
the CF causing other nodes to hint, when the down node recovers it will lose 
some hints until the schema propagates:

{noformat}
ERROR 19:59:28,264 Error in row mutation
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find 
cfId=1000
at 
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:117)
at 
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:377)
at 
org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:50)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:70)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
 INFO 19:59:28,356 Applying migration 28e2e7a4-2d74-11e0-9b6b-cdc89135952c
{noformat}
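
The ordering dependency can be modelled in a few lines. This is a toy model 
under stated assumptions: SchemaRaceSketch, applyMigration and deliverHint are 
invented names, and a hint is reduced to the cfId it targets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a toy model of hints arriving before the schema does. */
final class SchemaRaceSketch {
    private final Map<Integer, String> knownCfs = new HashMap<>();
    private final List<Integer> applied = new ArrayList<>();

    /** The schema migration that makes a column family known locally. */
    void applyMigration(int cfId, String name) {
        knownCfs.put(cfId, name);
    }

    /** Returns false (the hint is lost) when cfId is not yet in the local schema. */
    boolean deliverHint(int cfId) {
        if (!knownCfs.containsKey(cfId)) return false;
        applied.add(cfId);
        return true;
    }

    public static void main(String[] args) {
        SchemaRaceSketch node = new SchemaRaceSketch();
        System.out.println(node.deliverHint(1000)); // hint arrives first: false
        node.applyMigration(1000, "cf");
        System.out.println(node.deliverHint(1000)); // after migration: true
    }
}
```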





[jira] Issue Comment Edited: (CASSANDRA-2081) Consistency QUORUM does not work anymore

2011-01-31 Thread Thibaut (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988867#comment-12988867
 ] 

Thibaut edited comment on CASSANDRA-2081 at 1/31/11 8:02 PM:
-

My application hangs forever because I catch all Hector exceptions and 
retry whenever an error occurs.

The log messages above repeat again and again.

There are also no error messages in the cassandra log file.
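
A catch-everything retry loop turns a persistent failure into a hang. A 
bounded variant, sketched below, surfaces the last error instead. BoundedRetry 
is a hypothetical helper and does not use any Hector API; the operation is 
passed in as a plain Callable.

```java
import java.util.concurrent.Callable;

/** Illustrative only: retry a call a limited number of times, then give up. */
final class BoundedRetry {
    /** Runs op up to maxAttempts times and rethrows the last failure. */
    static <T> T call(Callable<T> op, int maxAttempts) throws Exception {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(call(() -> "ok", 3));
    }
}
```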





[jira] Updated: (CASSANDRA-2081) Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)

2011-01-31 Thread Thibaut (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thibaut updated CASSANDRA-2081:
---

Summary: Consistency QUORUM does not work anymore (hector:Could not 
fullfill request on this host)  (was: Consistency QUORUM does not work anymore)





[jira] Created: (CASSANDRA-2084) Corrupt sstables cause compaction to fail again, and again and again, ...

2011-01-31 Thread Dan Hendry (JIRA)
Corrupt sstables cause compaction to fail again, and again and again, ...
-

 Key: CASSANDRA-2084
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2084
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
 Environment: Ubuntu 10.10
Cassandra 0.7.0
4 Nodes
Reporter: Dan Hendry


I have been having some serious data corruption issues in my cluster. I suspect 
a deeper, more serious Cassandra bug, but I don't know what or where it is, and 
I have not found a way to reproduce the issues. 

This ticket is for a behaviour I have observed where Cassandra starts 
compacting a set of sstables, fails, does not clean up the tmp files, and then 
starts compacting the exact same set of sstables again (see logs below). After 
a while, the node runs out of disk space and crashes. At the very least, 
Cassandra should clean up temp files after a failed compaction. Better yet, it 
should stop trying to compact that file and log which file the error occurred 
in. The list of corrupt sstables does not even have to be persistent; an 
in-memory list that gets wiped out on a restart would do.
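
The two requests (clean up the tmp file, remember the failing inputs in 
memory) could look roughly like the sketch below. CompactionGuard and its 
Merge interface are hypothetical and unrelated to CompactionManager. Because 
the sketch cannot tell which input was corrupt, it marks the whole set as 
suspect, which is coarser than what the ticket asks for.

```java
import java.io.File;
import java.io.IOError;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: hypothetical guard around a compaction step. */
final class CompactionGuard {
    /** Hypothetical stand-in for the real merge of sstables into a tmp file. */
    interface Merge {
        void run(List<String> inputs, File tmpOutput) throws IOException;
    }

    // In-memory only, so the list is forgotten on restart.
    private final Set<String> suspect = new HashSet<>();

    /** Drops inputs that already failed once in this process. */
    List<String> eligible(List<String> sstables) {
        List<String> ok = new ArrayList<>();
        for (String s : sstables)
            if (!suspect.contains(s)) ok.add(s);
        return ok;
    }

    /** Runs the merge; on failure deletes the tmp file and remembers the inputs. */
    boolean compact(List<String> inputs, File tmpOutput, Merge merge) {
        try {
            merge.run(inputs, tmpOutput);
            return true;
        } catch (IOException | IOError e) {
            System.err.println("compaction of " + inputs + " failed: " + e);
            tmpOutput.delete();       // do not leak disk space
            suspect.addAll(inputs);   // do not retry the same set forever
            return false;
        }
    }
}
```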

Here is a sample log: the same 4 sstables are compacted, fail, and are then 
compacted again. 

 INFO [CompactionExecutor:1] 2011-01-31 13:08:26,434 CompactionManager.java 
(line 272) Compacting 
[org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-562-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-692-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-773-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-940-Data.db')]
 INFO [HintedHandoff:1] 2011-01-31 13:08:28,878 HintedHandOffManager.java (line 
226) Could not complete hinted handoff to /192.168.4.16
 INFO [HintedHandoff:1] 2011-01-31 13:08:28,879 ColumnFamilyStore.java (line 
648) switching in a fresh Memtable for HintsColumnFamily at 
CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1296500864696.log',
 position=104140211)
 INFO [HintedHandoff:1] 2011-01-31 13:08:28,879 ColumnFamilyStore.java (line 
952) Enqueuing flush of Memtable-HintsColumnFamily@1652350488(1155546 bytes, 
20839 operations)
 INFO [FlushWriter:1] 2011-01-31 13:08:28,879 Memtable.java (line 155) Writing 
Memtable-HintsColumnFamily@1652350488(1155546 bytes, 20839 operations)
 INFO [FlushWriter:1] 2011-01-31 13:08:29,199 Memtable.java (line 162) 
Completed flushing /var/lib/cassandra/data/system/HintsColumnFamily-e-9-Data.db 
(1075487 bytes)
 INFO [GossipStage:1] 2011-01-31 13:08:45,508 Gossiper.java (line 569) 
InetAddress /192.168.4.16 is now UP
 INFO [COMMIT-LOG-WRITER] 2011-01-31 13:08:59,736 CommitLogSegment.java (line 
50) Creating new commitlog segment 
/var/lib/cassandra/commitlog/CommitLog-1296500939735.log
 INFO [MutationStage:8] 2011-01-31 13:09:15,868 ColumnFamilyStore.java (line 
648) switching in a fresh Memtable for UserSearch at 
CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1296500939735.log',
 position=56028937)
 INFO [MutationStage:8] 2011-01-31 13:09:15,868 ColumnFamilyStore.java (line 
952) Enqueuing flush of Memtable-UserSearch@1186863256(174163962 bytes, 2097155 
operations)
 INFO [FlushWriter:1] 2011-01-31 13:09:15,868 Memtable.java (line 155) Writing 
Memtable-UserSearch@1186863256(174163962 bytes, 2097155 operations)
ERROR [CompactionExecutor:1] 2011-01-31 13:09:22,462 
AbstractCassandraDaemon.java (line 91) Fatal exception in thread 
Thread[CompactionExecutor:1,1,main]
java.io.IOError: java.io.EOFException: attempted to skip 776104308 bytes but 
only skipped 8469212
at 
org.apache.cassandra.io.sstable.SSTableIdentityIterator.init(SSTableIdentityIterator.java:78)
at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:178)
at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:143)
at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:135)
at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
at 
org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
at 
org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
at 
org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
at 
org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
at 

[jira] Issue Comment Edited: (CASSANDRA-2081) Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)

2011-01-31 Thread Thibaut (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988867#comment-12988867
 ] 

Thibaut edited comment on CASSANDRA-2081 at 1/31/11 8:10 PM:
-

My application hangs forever because I catch all Hector exceptions and 
retry whenever an error occurs.

The log messages above repeat again and again.

There are also no error messages in the Cassandra log file.

Also, "Could not fullfill request on this host CassandraClient" is an error 
message I have never seen before. 





[jira] Updated: (CASSANDRA-2084) Corrupt sstables cause compaction to fail again, and again and again, ...

2011-01-31 Thread Dan Hendry (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dan Hendry updated CASSANDRA-2084:
--

Environment: 
Ubuntu 10.10
Cassandra 0.7.0 (4 Nodes)

Java:
- java version 1.6.0_22
- Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
- Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03, mixed mode)


  was:
Ubuntu 10.10
Cassandra 0.7.0
4 Nodes



[jira] Commented: (CASSANDRA-2081) Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)

2011-01-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988879#comment-12988879
 ] 

Jonathan Ellis commented on CASSANDRA-2081:
---

Is this RF=3?

What do you see in the Cassandra log when you set log level to debug, for the 
queries that Hector gives up on?

Which versions did you try that work, and which don't?  (In the description 
above, both versions are given as apache-cassandra-2011-01-28_20-06-01.jar.)

 Consistency QUORUM does not work anymore (hector:Could not fullfill request 
 on this host)
 -

 Key: CASSANDRA-2081
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2081
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: linux, hector + cassandra
Reporter: Thibaut
Priority: Blocker
 Fix For: 0.7.1


 I'm using apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25.
 Consistency level QUORUM no longer works (tested on reads). 
 Consistency level ONE still works, though.
 I have tried this with one dead node in my cluster.
 If I restart cassandra with an older svn revision 
 (apache-cassandra-2011-01-28_20-06-01.jar), I can access the cluster with 
 consistency level QUORUM again, while still using 
 apache-cassandra-2011-01-28_20-06-01.jar and hector 7.0.25 in my application.
 11/01/31 19:54:38 ERROR connection.CassandraHostRetryService: Downed 
 intr1n18(192.168.0.18):9160 host still appears to be down: Unable to open 
 transport to intr1n18(192.168.0.18):9160 , java.net.NoRouteToHostException: 
 No route to host
 11/01/31 19:54:38 INFO connection.CassandraHostRetryService: Downed Host 
 retry status false with host: intr1n18(192.168.0.18):9160
 11/01/31 19:54:45 ERROR connection.HConnectionManager: Could not fullfill 
 request on this host CassandraClientintr1n11:9160-483
 However, intr1n11 is marked as up, and I can also access the node through 
 the cassandra cli.
 192.168.0.1   Up    Normal  8.02 GB  5.00%  0cc
 192.168.0.2   Up    Normal  7.96 GB  5.00%  199
 192.168.0.3   Up    Normal  8.24 GB  5.00%  266
 192.168.0.4   Up    Normal  4.94 GB  5.00%  333
 192.168.0.5   Up    Normal  5.02 GB  5.00%  400
 192.168.0.6   Up    Normal  5 GB     5.00%  4cc
 192.168.0.7   Up    Normal  5.1 GB   5.00%  599
 192.168.0.8   Up    Normal  5.07 GB  5.00%  666
 192.168.0.9   Up    Normal  4.78 GB  5.00%  733
 192.168.0.10  Up    Normal  4.34 GB  5.00%  7ff
 192.168.0.11  Up    Normal  5.01 GB  5.00%  8cc
 192.168.0.12  Up    Normal  5.31 GB  5.00%  999
 192.168.0.13  Up    Normal  5.56 GB  5.00%  a66
 192.168.0.14  Up    Normal  5.82 GB  5.00%  b33
 192.168.0.15  Up    Normal  5.57 GB  5.00%  c00
 192.168.0.16  Up    Normal  5.03 GB  5.00%  ccc
 192.168.0.17  Up    Normal  4.77 GB  5.00%  d99
 192.168.0.18  Down  Normal  ?        5.00%  e66
 192.168.0.19  Up    Normal  4.78 GB  5.00%  f33
 192.168.0.20  Up    Normal  4.83 GB  5.00%

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (CASSANDRA-2081) Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)

2011-01-31 Thread Thibaut (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1295#comment-1295
 ] 

Thibaut commented on CASSANDRA-2081:


RF=3

I will enable the debug log level for Cassandra tomorrow, switch back to 
apache-cassandra-2011-01-28_20-06-01.jar, and post the results.

The last version that I tried that worked was 
apache-cassandra-2011-01-24_06-01-26.jar. 
apache-cassandra-2011-01-28_20-06-01.jar doesn't work anymore.
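
For reference, QUORUM in Cassandra means a majority of the replicas for a key (RF / 2 + 1 with integer division), so with RF=3 a read needs two live replicas, and a single dead node should not by itself make QUORUM fail. A minimal sketch of that arithmetic follows; the function names are illustrative only and are not taken from Cassandra or Hector.

```python
def quorum(replication_factor):
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def quorum_reachable(replication_factor, live_replicas):
    """True when enough replicas are alive to satisfy QUORUM."""
    return live_replicas >= quorum(replication_factor)


# RF=3 with one replica down leaves two live replicas, which is still a quorum.
rf3_one_down = quorum_reachable(3, 2)
# RF=3 with two replicas down cannot satisfy QUORUM.
rf3_two_down = quorum_reachable(3, 1)
```

Under this arithmetic, the reported failure with only one dead node points at something other than plain replica availability.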






[jira] Commented: (CASSANDRA-2058) Nodes periodically spike in load

2011-01-31 Thread David King (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988890#comment-12988890
 ] 

David King commented on CASSANDRA-2058:
---

I have upgraded to 0.6.11 and am definitely still seeing this problem (although 
I'm no longer seeing the 30% performance hit while the nodes are up).

 Nodes periodically spike in load
 

 Key: CASSANDRA-2058
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2058
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.6.10, 0.7.1
Reporter: David King
Assignee: Jonathan Ellis
 Fix For: 0.6.11, 0.7.1

 Attachments: 2058-0.7-v2.txt, 2058-0.7-v3.txt, 2058-0.7.txt, 
 2058.txt, cassandra.pmc01.log.bz2, cassandra.pmc14.log.bz2, graph a.png, 
 graph b.png


 (Filing as a placeholder bug as I gather information.)
 At ~10 pm on 24 Jan, I upgraded our 20-node cluster from 0.6.8 to 0.6.10, 
 turned on the DES, and moved some CFs from one KS into another (drain whole 
 cluster, take it down, move files, change schema, put it back up). Since 
 then, I've had four storms whereby a node's load will shoot to 700+ (400% CPU 
 on a 4-cpu machine) and become totally unresponsive. After a moment or two 
 like that, its neighbour dies too, and the failure cascades around the ring. 
 Unfortunately, because of the high load I'm not able to get into the machine 
 to pull a thread dump to see wtf it's doing as it happens.
 I've also had an issue where a single node spikes up to high load, but 
 recovers. This may or may not be the same issue from which the nodes don't 
 recover as above, but both are new behaviour.





[jira] Commented: (CASSANDRA-2081) Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)

2011-01-31 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988894#comment-12988894
 ] 

Brandon Williams commented on CASSANDRA-2081:
-

I'm not able to reproduce this with contrib/stress. Can you try that?





[jira] Updated: (CASSANDRA-2072) Race condition during decommission

2011-01-31 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2072:
--

Fix Version/s: 0.7.2

 Race condition during decommission
 --

 Key: CASSANDRA-2072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2072
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.7.2

 Attachments: 
 0001-announce-having-left-the-ring-for-RING_DELAY-on-deco.patch, 
 0002-Improve-TRACE-logging-for-Gossiper.patch, 
 0003-Remove-endpoint-state-when-expiring-justRemovedEndpo.patch


 Occasionally when decommissioning a node, a race condition occurs in which 
 another node never removes the token and thus propagates it again with a 
 state of down.  With CASSANDRA-1900 we can solve this, but it shouldn't 
 occur in the first place.
 Given nodes A, B, and C, if you decommission B it will stream to A and C.  
 When complete, B will decommission and receive this stacktrace:
 ERROR 00:02:40,282 Fatal exception in thread Thread[Thread-5,5,main]
 java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
     at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:62)
     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767)
     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658)
     at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:387)
     at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:91
 At this point A will show it is removing B's token, but C will not; instead, 
 its failure detector will report that B is dead, and nodetool ring on C shows 
 B in a leaving/down state.  In another gossip round, C will propagate this 
 state back to A.
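
The stack trace above shows a message being handed to an executor that has already shut down. As a rough analogy only (this is Python's standard library, not Cassandra's MessagingService, and the helper name is made up for illustration), the same failure mode can be reproduced like this:

```python
from concurrent.futures import ThreadPoolExecutor


def try_deliver(executor, handler, message):
    """Return True if the executor accepted the message, False if it had shut down."""
    try:
        executor.submit(handler, message)
        return True
    except RuntimeError:
        # Python raises RuntimeError here; Java raises RejectedExecutionException.
        return False


received = []
pool = ThreadPoolExecutor(max_workers=1)

# A message that arrives while the pool is running is processed.
accepted_before = try_deliver(pool, received.append, "gossip before shutdown")

# Shut the pool down, as a decommissioning node does with its stages.
pool.shutdown(wait=True)

# A message that arrives afterwards is rejected instead of being processed.
accepted_after = try_deliver(pool, received.append, "gossip after shutdown")
```

The point of the sketch is only that late-arriving messages are dropped once the pool is gone, which is consistent with the report that some peers never see B's final state.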





[jira] Commented: (CASSANDRA-2058) Nodes periodically spike in load

2011-01-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988912#comment-12988912
 ] 

Jonathan Ellis commented on CASSANDRA-2058:
---

Please tell me you're at least seeing this less often than with .10 :)





svn commit: r1065827 - in /cassandra/trunk: ./ conf/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/service/

2011-01-31 Thread jbellis
Author: jbellis
Date: Mon Jan 31 22:12:43 2011
New Revision: 1065827

URL: http://svn.apache.org/viewvc?rev=1065827&view=rev
Log:
merge from 0.7

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/conf/cassandra.yaml

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/trunk/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Mon Jan 31 22:12:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1064713
-/cassandra/branches/cassandra-0.7:1026516-1065665
+/cassandra/branches/cassandra-0.7:1026516-1065826
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3:774578-796573

Modified: cassandra/trunk/conf/cassandra.yaml
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/conf/cassandra.yaml?rev=1065827&r1=1065826&r2=1065827&view=diff
==
--- cassandra/trunk/conf/cassandra.yaml (original)
+++ cassandra/trunk/conf/cassandra.yaml Mon Jan 31 22:12:43 2011
@@ -225,7 +225,7 @@ rpc_timeout_in_ms: 1
 # org.apache.cassandra.locator.PropertyFileSnitch:
 #  - Proximity is determined by rack and data center, which are
 #explicitly configured in cassandra-topology.properties.
-endpoint_snitch: org.apache.cassandra.locator.PropertyFileSnitch
+endpoint_snitch: org.apache.cassandra.locator.SimpleSnitch
 
 # dynamic_snitch -- This boolean controls whether the above snitch is
 # wrapped with a dynamic snitch, which will monitor read latencies

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Mon Jan 31 22:12:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1064713
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1065665
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1065826
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Mon Jan 31 22:12:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1064713
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1065665
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1065826
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1051699-1053689
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/column_t.java:774578-792198

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Mon Jan 31 22:12:43 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:922689-1052356,1052358-1053452,1053454,1053456-1064713
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java:1026516-1065665

[jira] Commented: (CASSANDRA-2067) refactor o.a.c.utils.UUIDGen to allow creating type 1 UUIDs for a given time

2011-01-31 Thread Folke Behrens (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988927#comment-12988927
 ] 

Folke Behrens commented on CASSANDRA-2067:
--

You could also use the Preferences system to store a random permanent node ID.

 refactor o.a.c.utils.UUIDGen to allow creating type 1 UUIDs for a given time
 

 Key: CASSANDRA-2067
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2067
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Eric Evans
Assignee: Eric Evans
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-2067-o.a.c.utils.UUIDGen-adapted-from-flewto.txt, 
 v1-0002-eliminate-usage-of-JUG-for-UUIDs.txt, 
 v1-0003-remove-JUG-jar-and-references.txt, 
 v2-0001-CASSANDRA-2067-o.a.c.utils.UUIDGen-adapted-from-flewto.txt, 
 v2-0002-eliminate-usage-of-JUG-for-UUIDs.txt, 
 v2-0003-remove-JUG-jar-and-license-files.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 CASSANDRA-2027 creates the need to generate type 1 UUIDs using arbitrary 
 date/times.  IMO, this would be a good opportunity to replace 
 o.a.c.utils.UUIDGen with the class that Gary Dusbabek wrote for Flewton 
 (https://github.com/flewton/flewton/blob/master/src/com/rackspace/flewton/util/UUIDGen.java),
  which is better/more comprehensive.  We can even eliminate the dependency on 
 JUG.
 Patches to follow.
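
To illustrate what "type 1 UUIDs for a given time" involves, here is a minimal sketch using Python's standard uuid module rather than the Flewton class or o.a.c.utils.UUIDGen. A version 1 UUID stores a 60-bit count of 100-nanosecond intervals since 1582-10-15; the function name, the node value, and the default clock sequence below are assumptions chosen for the example.

```python
import datetime
import uuid

# 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def uuid1_for_time(when, node, clock_seq=0):
    """Build a version 1 UUID for a timezone-aware datetime and a 48-bit node id."""
    delta = when - UNIX_EPOCH
    # Integer arithmetic avoids floating-point rounding of the timestamp.
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    timestamp = micros * 10 + UUID_EPOCH_OFFSET
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = (timestamp >> 48) & 0x0FFF
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = (clock_seq >> 8) & 0x3F
    fields = (time_low, time_mid, time_hi_version,
              clock_seq_hi_variant, clock_seq_low, node)
    # version=1 makes the constructor set the version and variant bits.
    return uuid.UUID(fields=fields, version=1)


when = datetime.datetime(2011, 1, 31, 22, 12, 43, tzinfo=datetime.timezone.utc)
example = uuid1_for_time(when, node=0x010203040506)
```

The node argument is where a stored random node ID, as suggested in the comment above, would be passed in.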





[jira] Updated: (CASSANDRA-1551) create tell me what nodes you have hints for jmx api

2011-01-31 Thread Jon Hermes (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Hermes updated CASSANDRA-1551:
--

Attachment: 1551-v4.txt

Rebased, deleteHFE() now accepts an ipaddr or hostname.

 create tell me what nodes you have hints for jmx api
 --

 Key: CASSANDRA-1551
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1551
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Jon Hermes
Priority: Minor
 Fix For: 0.7.2

 Attachments: 1551-v2.txt, 1551-v3.txt, 1551-v4.txt, 1551.txt

   Original Estimate: 4h
  Remaining Estimate: 4h

 We can do this efficiently in 0.7 due to the new HH schema.  In 0.6 this 
 would require scanning all hints, so it is probably not worth it.





[jira] Commented: (CASSANDRA-2074) Currently voted on 7.0.1 release won't start on windows

2011-01-31 Thread Joaquin Casares (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988961#comment-12988961
 ] 

Joaquin Casares commented on CASSANDRA-2074:


I tried to reproduce this using the same version of Cassandra that you 
downloaded and couldn't.

I updated the instructions to include Windows and Cassandra 0.7 configurations 
here: http://wiki.apache.org/cassandra/RunningCassandraInEclipse.

I did, however, notice that you aren't passing the -Dcassandra-foreground 
argument. What you are probably seeing is the output before Cassandra starts 
running in the background, since it seems like everything was processed fine.

You could either include the foreground option or access Cassandra using the 
cassandra-cli. Do either of these options give you better results?

 Currently voted on 7.0.1 release won't start on windows
 ---

 Key: CASSANDRA-2074
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2074
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Windows 7
Reporter: Thibaut
Assignee: Joaquin Casares
 Fix For: 0.7.1


 The proposed release 
 (https://hudson.apache.org/hudson/job/Cassandra-0.7/228/) won't start on my 
 Windows dev machine running Eclipse. (Haven't tested this on Linux.)
 Startup parameters:
 -Dcassandra.config=cassandra-test/cassandra.yaml
 -ea -Xmx2G
 It exits right after the following message; no ERROR message is shown. I 
 also tried deleting all my data folders, but Cassandra still exits.
 INFO 18:02:09,690 Will not load MX4J, mx4j-tools.jar is not in the classpath
 apache-cassandra-2011-01-24_06-01-26.jar works fine though.





[jira] Commented: (CASSANDRA-2058) Nodes periodically spike in load

2011-01-31 Thread David King (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988968#comment-12988968
 ] 

David King commented on CASSANDRA-2058:
---

It's hard to say. I lost 5 nodes in about an hour, but I don't know how many I 
lost last time.

 Nodes periodically spike in load
 

 Key: CASSANDRA-2058
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2058
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.6.10, 0.7.1
 Environment: OpenJDK 64-Bit Server VM (build 1.6.0_0-b12, mixed mode)
 Ubuntu 8.10
 Linux pmc01 2.6.27-22-xen #1 SMP Fri Feb 20 23:58:13 UTC 2009 x86_64 GNU/Linux
Reporter: David King
Assignee: Jonathan Ellis
 Fix For: 0.6.11, 0.7.1

 Attachments: 2058-0.7-v2.txt, 2058-0.7-v3.txt, 2058-0.7.txt, 
 2058.txt, cassandra.pmc01.log.bz2, cassandra.pmc14.log.bz2, graph a.png, 
 graph b.png


 (Filing as a placeholder bug as I gather information.)
 At ~10 pm on 24 Jan, I upgraded our 20-node cluster from 0.6.8 to 0.6.10, 
 turned on the DES, and moved some CFs from one KS into another (drain whole 
 cluster, take it down, move files, change schema, put it back up). Since 
 then, I've had four storms whereby a node's load will shoot to 700+ (400% CPU 
 on a 4-cpu machine) and become totally unresponsive. After a moment or two 
 like that, its neighbour dies too, and the failure cascades around the ring. 
 Unfortunately, because of the high load I'm not able to get into the machine 
 to pull a thread dump to see wtf it's doing as it happens.
 I've also had an issue where a single node spikes up to high load, but 
 recovers. This may or may not be the same issue from which the nodes don't 
 recover as above, but both are new behaviour.





[Cassandra Wiki] Update of RunningCassandraInEclipse by JoaquinCasares

2011-01-31 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The RunningCassandraInEclipse page has been changed by JoaquinCasares.
http://wiki.apache.org/cassandra/RunningCassandraInEclipse?action=diff&rev1=18&rev2=19

--

  Right click on the build.xml (in your project root) - Run As - Ant 
Build.
  This will do a whole lot of good things, eg. generate the CLI grammar with 
ANTLR, generate avro and thrift code.
  
+ '''UPDATE''' New for Cassandra 0.7.1: Right click on the build.xml (in your 
project root) - Run As - Ant Build... and select 
generate-eclipse-files. This will automatically build most of the jars in the 
right places. All that is left to do is the very next step, in which you 
add all of the jars in the lib/ folder to the Build Path, and all the 
dissociations should disappear. If so, skip to the Run Cassandra section.
+ 
  Next thing you want to do is to add all the needed third party libraries to 
the build path. 
  Expand the lib/ folder and find a bunch of jar files. Shift select all of 
them and right mouse click and choose Build Path - Add to Build Path.   
  This will force Eclipse to update the entire workspace, so please be patient. 
Some of the errors should also have disappeared by now (not all though).
@@ -64, +66 @@

  Now, if you are lucky, your Eclipse workspace should look something like this:
  
  {{attachment:FixSrcJavaSourceFolder-11.png}}
-  
+ 
+ = Common Errors =
+ 
- (Some Eclipse users have complained about the following error message: 
+ Some Eclipse users have complained about the following error message: 
  
  'Access restriction: The method getDuration() from the type GcInfo is not 
accessible due to restriction on required library 
/System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Classes/classes.jar'.
 
  
@@ -80, +84 @@

  
  
  Now the errors should be gone and you are ready to create a run/debug 
configuration for cassandra.
+ 
+ = Run Cassandra =
  
  Click Run - Run Configurations. Select 
org.apache.cassandra.thrift.CassandraDaemon as your Main class, and make sure 
that your cassandra project is selected in the Project field.
  Under the Arguments tab you can specify VM arguments. Below is my complete VM 
arguments list for Cassandra 0.6:


[jira] Created: (CASSANDRA-2085) digest latencies are not included in snitch calculations

2011-01-31 Thread Jonathan Ellis (JIRA)
digest latencies are not included in snitch calculations


 Key: CASSANDRA-2085
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2085
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.6.9
Reporter: Jonathan Ellis
 Fix For: 0.6.11


ResponseVerbHandler calls

MessagingService.instance.maybeAddLatency(cb, message.getFrom(), age);

but maybeAddLatency needs to include DigestResponseHandler (it was ported from 
0.7 where that no longer exists)
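
The report implies that maybeAddLatency only records a sample for certain callback types and that the digest callback is missing from that set in 0.6. The following is a minimal sketch of that kind of type filter, not Cassandra's code: every class and function here is a hypothetical stand-in, and only the name DigestResponseHandler comes from the report.

```python
class DigestResponseHandler:
    """Stand-in for the callback type named in the report."""


class DataResponseHandler:
    """Hypothetical stand-in for a read callback that is already tracked."""


class WriteResponseHandler:
    """Hypothetical stand-in for a callback that should not feed the snitch."""


# Callback types whose response age is reported to the snitch.
# Leaving DigestResponseHandler out of this tuple reproduces the reported bug.
TRACKED_CALLBACKS = (DataResponseHandler, DigestResponseHandler)


def maybe_add_latency(callback, source, age_ms, samples):
    """Record age_ms for source only when the callback is a tracked type."""
    if isinstance(callback, TRACKED_CALLBACKS):
        samples.setdefault(source, []).append(age_ms)


samples = {}
maybe_add_latency(DigestResponseHandler(), "192.168.0.1", 12, samples)
maybe_add_latency(WriteResponseHandler(), "192.168.0.1", 40, samples)
```

With the digest type in the tracked set, digest responses contribute latency samples; without it they are silently ignored.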





[jira] Created: (CASSANDRA-2086) array index out of bounds on compact & repair

2011-01-31 Thread Jeffrey Damick (JIRA)
array index out of bounds on compact & repair
-

 Key: CASSANDRA-2086
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2086
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Jeffrey Damick
Priority: Critical



We're seeing array index out of bounds exceptions (below) on 0.7.0 when running 
compact.
 
The repair seems to hang indefinitely on all nodes (also throws index oob).


On 1 node in our cluster (running compact):

 INFO [CompactionExecutor:1] 2011-01-31 20:07:12,140 CompactionManager.java (line 272) Compacting [org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data//XXX-e-318-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/xxx/xxx-e-317-Data.db')]
ERROR [CompactionExecutor:1] 2011-01-31 20:07:12,295 AbstractCassandraDaemon.java (line 91) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.ArrayIndexOutOfBoundsException: 7
    at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:58)
    at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
    at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
    at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(Unknown Source)
    at java.util.concurrent.ConcurrentSkipListMap.doPut(Unknown Source)
    at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(Unknown Source)



And another node (running compact):

 INFO [StreamStage:1] 2011-01-31 20:03:48,663 StreamOutSession.java (line 174) Streaming to /xxx.xxx.xxx.xxx
ERROR [CompactionExecutor:1] 2011-01-31 20:03:52,587 AbstractCassandraDaemon.java (line 91) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.ArrayIndexOutOfBoundsException
ERROR [CompactionExecutor:1] 2011-01-31 20:03:54,216 AbstractCassandraDaemon.java (line 91) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.lang.ArrayIndexOutOfBoundsException: 6
    at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
    at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
    at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
    at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(Unknown Source)
    at java.util.concurrent.ConcurrentSkipListMap.doPut(Unknown Source)
    at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(Unknown Source)
    at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:218)


Is this related to: CASSANDRA-1959 or CASSANDRA-1992?

This has left some of my data in an unrecoverable & inaccessible state - how 
can I repair this situation? 



-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Issue Comment Edited: (CASSANDRA-2081) Consistency QUORUM does not work anymore (hector:Could not fullfill request on this host)

2011-01-31 Thread Aaron Morton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989032#comment-12989032
 ] 

Aaron Morton edited comment on CASSANDRA-2081 at 2/1/11 4:46 AM:
-

I've sort of stumbled onto something similar with an 0.7 install. I need to go 
home now so cannot dig any deeper and rule out human error, but this is what I 
have.

5 node 0.7.0 install

1) Load data in using

python stress.py -d jb-cass1,jb-cass2,jb-cass3,jb-cass4,jb-cass5 -o insert -n 100 -e QUORUM -t 10 -i 1 -l 3
(use all 5 nodes, insert 1,000,000 rows with RF 3 and QUORUM and 10 threads, 
report progress every second)

2) Read back using 

python stress.py -d jb-cass2,jb-cass3,jb-cass4,jb-cass5 -o read -n 100 -e QUORUM -t 10 -i 1
(note that jb-cass1 is removed from the list)

3) make big bang

Once the read had run for a few seconds I ran reboot -f on node 1. I expected 
the read operations to complete, but the output was 

11270,1315,1315,0.00839671943578,9
11631,361,361,0.00746133188792,11
11631,0,0,NaN,12
11631,0,0,NaN,13
11631,0,0,NaN,14
11631,0,0,NaN,15
11631,0,0,NaN,16
11631,0,0,NaN,17
11631,0,0,NaN,18
11631,0,0,NaN,19
Process Reader-10:
Traceback (most recent call last):
  File "/vol/apps/python-2.6.4_64/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "stress.py", line 279, in run
    r = self.cclient.get_slice(key, parent, p, consistency)
  File "/local1/frameworks/cassandra/apache-cassandra-0.7.0-src/contrib/py_stress/cassandra/Cassandra.py", line 432, in get_slice
    return self.recv_get_slice()
  File "/local1/frameworks/cassandra/apache-cassandra-0.7.0-src/contrib/py_stress/cassandra/Cassandra.py", line 462, in recv_get_slice
    raise result.te

All clients died. stress.py is not setting a timeout on the thrift socket, so 
I am guessing the timeout is server side.
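
To illustrate the distinction being drawn here, the following is a minimal sketch of what a client-side timeout looks like, using only the standard `socket` module rather than Thrift. The helper names (`read_with_timeout`, `start_silent_server`) are hypothetical and not part of stress.py. With the Thrift Python library the equivalent knob is, as far as I recall, `TSocket.setTimeout`; since stress.py does not set it, a client-side timeout like the one below could not have fired, which is consistent with the exception (`result.te`) having been sent back by the server.

```python
import socket
import threading


def read_with_timeout(host, port, timeout_s):
    """Connect, wait for a reply, and report whether the client timeout fired."""
    sock = socket.create_connection((host, port), timeout=timeout_s)
    try:
        sock.settimeout(timeout_s)  # applies to the recv() below
        data = sock.recv(1024)
        return "data" if data else "closed"
    except socket.timeout:
        # Raised locally by the client; the server never said anything.
        return "client timeout"
    finally:
        sock.close()


def start_silent_server():
    """Hypothetical stand-in for an unresponsive node: accepts, never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    held = []  # keep accepted connections open so the client sees no EOF

    def accept_once():
        conn, _ = server.accept()
        held.append(conn)

    threading.Thread(target=accept_once, daemon=True).start()
    return server, held
```

Without the `settimeout` call the `recv` would simply block, so any timeout error a client like that reports has to come from the other end.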

I was running DEBUG on all the nodes (but had turned off the line numbers); 
this is from one of them. The 114.63 machine is obviously the one I killed. 
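
For reference, the "blocking for 2 responses" line in the log below follows from the usual majority rule for QUORUM. The sketch here is plain arithmetic, not Cassandra code, and the function names are made up for illustration; it also shows why the reads were expected to survive the loss of a single node at RF 3.

```python
def quorum(replication_factor):
    """Number of replicas that must answer for QUORUM: a strict majority."""
    return replication_factor // 2 + 1


def quorum_reachable(replication_factor, replicas_down):
    """True if enough replicas remain alive to satisfy QUORUM."""
    live = replication_factor - replicas_down
    return live >= quorum(replication_factor)


print(quorum(3))               # 2
print(quorum_reachable(3, 1))  # True: one dead replica still leaves a majority
print(quorum_reachable(3, 2))  # False
```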


DEBUG [pool-1-thread-2] 2011-02-01 17:14:08,186 StorageService.java (line 
org.apache.cassandra.service.StorageService) Sorted endpoints are 
/192.168.114.63,jb08.wetafx.co.nz/192.168.114.67,/192.168.114.64
DEBUG [pool-1-thread-2] 2011-02-01 17:14:08,186 QuorumResponseHandler.java 
(line org.apache.cassandra.service.QuorumResponseHandler) QuorumResponseHandler 
blocking for 2 responses
DEBUG [pool-1-thread-2] 2011-02-01 17:14:08,186 StorageProxy.java (line 
org.apache.cassandra.service.StorageProxy) strongread reading digest for 
SliceFromReadCommand(table='Keyspace1', key='30323334343534', 
column_parent='QueryPath(columnFamilyName='Standard1', superColumnName='null', 
columnName='null')', start='', finish='', reversed=false, count=5) from 
6...@jb08.wetafx.co.nz/192.168.114.67
DEBUG [pool-1-thread-2] 2011-02-01 17:14:08,187 StorageProxy.java (line 
org.apache.cassandra.service.StorageProxy) strongread reading data for 
SliceFromReadCommand(table='Keyspace1', key='30323334343534', 
column_parent='QueryPath(columnFamilyName='Standard1', superColumnName='null', 
columnName='null')', start='', finish='', reversed=false, count=5) from 
6623@/192.168.114.63
DEBUG [pool-1-thread-2] 2011-02-01 17:14:08,187 StorageProxy.java (line 
org.apache.cassandra.service.StorageProxy) strongread reading digest for 
SliceFromReadCommand(table='Keyspace1', key='30323334343534', 
column_parent='QueryPath(columnFamilyName='Standard1', superColumnName='null', 
columnName='null')', start='', finish='', reversed=false, count=5) from 
6624@/192.168.114.64
DEBUG [ReadStage:19] 2011-02-01 17:14:08,187 SliceQueryFilter.java (line 
org.apache.cassandra.db.filter.SliceQueryFilter) collecting 0 of 5: 
4330:false:34@1296532428248604
DEBUG [ReadStage:19] 2011-02-01 17:14:08,187 SliceQueryFilter.java (line 
org.apache.cassandra.db.filter.SliceQueryFilter) collecting 1 of 5: 
4331:false:34@1296532428248637
DEBUG [ReadStage:19] 2011-02-01 17:14:08,187 SliceQueryFilter.java (line 
org.apache.cassandra.db.filter.SliceQueryFilter) collecting 2 of 5: 
4332:false:34@1296532428248640
DEBUG [ReadStage:19] 2011-02-01 17:14:08,187 SliceQueryFilter.java (line 
org.apache.cassandra.db.filter.SliceQueryFilter) collecting 3 of 5: 
4333:false:34@1296532428248642
DEBUG [ReadStage:19] 2011-02-01 17:14:08,187 SliceQueryFilter.java (line 
org.apache.cassandra.db.filter.SliceQueryFilter) collecting 4 of 5: 
4334:false:34@1296532428248656
DEBUG [ReadStage:19] 2011-02-01 17:14:08,187 ReadVerbHandler.java (line 
org.apache.cassandra.db.ReadVerbHandler) digest is 
220b82e28c2bb4be869c168243d75f01
DEBUG [ReadStage:19] 2011-02-01 17:14:08,187 ReadVerbHandler.java (line 
org.apache.cassandra.db.ReadVerbHandler) Read key 30323334343534; sending 
response to 
7d8fa1fd-a2fe-6a54-7bb0-3b129206d...@jb08.wetafx.co.nz/192.168.114.67
DEBUG [RequestResponseStage:13] 2011-02-01 17:14:08,188 
ResponseVerbHandler.java (line 

[jira] Resolved: (CASSANDRA-2086) array index out of bounds on compact repair

2011-01-31 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2086.
---

Resolution: Duplicate

this is CASSANDRA-1992.

 array index out of bounds on compact & repair
 -

 Key: CASSANDRA-2086
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2086
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Jeffrey Damick
Priority: Critical

 We're seeing array index out of bounds exceptions (below) on 0.7.0 when 
 running compact.
  
 The repair seems to hang indefinitely on all nodes (also throws index oob).
 On 1 node in our cluster (running compact):
  INFO [CompactionExecutor:1] 2011-01-31 20:07:12,140 CompactionManager.java 
 (line 272) Compacting 
 [org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data//XXX-e-318-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/xxx/xxx-e-317-Data.db')]
 ERROR [CompactionExecutor:1] 2011-01-31 20:07:12,295 AbstractCassandraDaemon.java (line 91) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
 java.lang.ArrayIndexOutOfBoundsException: 7
     at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:58)
     at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
     at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
     at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(Unknown Source)
     at java.util.concurrent.ConcurrentSkipListMap.doPut(Unknown Source)
     at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(Unknown Source)
 And another node (running compact):
  INFO [StreamStage:1] 2011-01-31 20:03:48,663 StreamOutSession.java (line 174) Streaming to /xxx.xxx.xxx.xxx
 ERROR [CompactionExecutor:1] 2011-01-31 20:03:52,587 AbstractCassandraDaemon.java (line 91) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
 java.lang.ArrayIndexOutOfBoundsException
 ERROR [CompactionExecutor:1] 2011-01-31 20:03:54,216 AbstractCassandraDaemon.java (line 91) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
 java.lang.ArrayIndexOutOfBoundsException: 6
     at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(TimeUUIDType.java:56)
     at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:45)
     at org.apache.cassandra.db.marshal.TimeUUIDType.compare(TimeUUIDType.java:29)
     at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(Unknown Source)
     at java.util.concurrent.ConcurrentSkipListMap.doPut(Unknown Source)
     at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(Unknown Source)
     at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:218)
 Is this related to: CASSANDRA-1959 or CASSANDRA-1992?
 This has left some of my data in an unrecoverable & inaccessible state - how 
 can I repair this situation? 





[jira] Commented: (CASSANDRA-2086) array index out of bounds on compact repair

2011-01-31 Thread Jeffrey Damick (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989035#comment-12989035
 ] 

Jeffrey Damick commented on CASSANDRA-2086:
---

But is there any way to repair the problem without deleting all of my data? 





[jira] Issue Comment Edited: (CASSANDRA-1941) Add distributed test doing reads during MovementTest

2011-01-31 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989048#comment-12989048
 ] 

Stu Hood edited comment on CASSANDRA-1941 at 2/1/11 5:28 AM:
-

MovementTest performs a loadbalance, which is almost a full roundtrip.

It should be possible to test bootstrap by:
# decommissioning the node via nodetool
# killing the process
# wiping its state
# starting it again



These commands exist in the Whirr scripts now.

  was (Author: stuhood):
MovementTest performs a loadbalance, which almost a full roundtrip.

It should be possible to test bootstrap by:
# decommissioning the node via nodetool
# killing the process
# wiping its state
# starting it again



These commands exist in the Whirr scripts now.
  
 Add distributed test doing reads during MovementTest
 

 Key: CASSANDRA-1941
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1941
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.8


 Following introduction of the distributed test framework in CASSANDRA-1859, 
 we should extend that to test reads while bootstrap happens (this is a 
 scenario that has had regressions in the past).
 See test/distributed/README.txt for intro.





[jira] Commented: (CASSANDRA-1941) Add distributed test doing reads during MovementTest

2011-01-31 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989048#comment-12989048
 ] 

Stu Hood commented on CASSANDRA-1941:
-

MovementTest performs a loadbalance, which almost a full roundtrip.

It should be possible to test bootstrap by:
# decommissioning the node via nodetool
# killing the process
# wiping its state
# starting it again



These commands exist in the Whirr scripts now.





[jira] Created: (CASSANDRA-2088) Temp files for failed compactions/streaming not cleaned up

2011-01-31 Thread Stu Hood (JIRA)
Temp files for failed compactions/streaming not cleaned up
--

 Key: CASSANDRA-2088
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2088
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Stu Hood
 Fix For: 0.7.2


From separate reports, compaction and repair are currently missing 
opportunities to clean up tmp files after failures.
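
As a rough illustration of the cleanup being asked for, here is a minimal sketch in Python (not Cassandra's Java code) of the write-to-tmp, rename-on-success, delete-on-failure pattern. The function name `compact_to` and the `write_fn` callback are hypothetical stand-ins for the real compaction or streaming writer.

```python
import os
import tempfile


def compact_to(final_path, write_fn):
    """Write output to a tmp file; rename on success, delete the tmp on failure."""
    directory = os.path.dirname(final_path) or "."
    fd, tmp_path = tempfile.mkstemp(suffix=".tmp", dir=directory)
    try:
        with os.fdopen(fd, "wb") as out:
            write_fn(out)  # may raise partway through, e.g. on corrupt input
        os.rename(tmp_path, final_path)  # only a complete file gets the real name
    except Exception:
        # The failure path is the one that leaks files if this is missing.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path
```

The exception is re-raised after the cleanup, so the caller still sees the original failure; only the half-written tmp file is gone.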





[jira] Commented: (CASSANDRA-2088) Temp files for failed compactions/streaming not cleaned up

2011-01-31 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989053#comment-12989053
 ] 

Stu Hood commented on CASSANDRA-2088:
-

Regarding repair: 
http://www.mail-archive.com/user@cassandra.apache.org/msg09259.html
And compaction: CASSANDRA-2084





[jira] Updated: (CASSANDRA-2087) Keep in-memory list of uncompactable sstables

2011-01-31 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2087:


Summary: Keep in-memory list of uncompactable sstables  (was: Keep in 
memory list of uncompactable sstables)

 Keep in-memory list of uncompactable sstables
 -

 Key: CASSANDRA-2087
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2087
 Project: Cassandra
  Issue Type: Improvement
Reporter: Stu Hood
Priority: Minor

 Rather than retrying compactions that we know will fail we should:
 {quote}stop trying to compact that file and log what file the error occurred 
 for. The list of corrupt sstables does not even have to be persistent, just 
 an in memory list which gets wiped out on a restart.{quote}
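
The quoted proposal could be sketched roughly as follows. This is an illustrative Python model, not Cassandra's compaction code: `CompactionScheduler`, `CorruptSSTableError` and the `compact_fn` callback are all hypothetical names, and the sketch assumes the failure can be attributed to one specific file.

```python
class CorruptSSTableError(Exception):
    """Hypothetical error that names the sstable a compaction failed on."""

    def __init__(self, path, reason):
        Exception.__init__(self, reason)
        self.path = path


class CompactionScheduler(object):
    """Skips sstables that have already failed; the list lives only in memory."""

    def __init__(self, compact_fn):
        self.compact_fn = compact_fn
        self.uncompactable = set()  # not persisted, so a restart wipes it
        self.log = []

    def candidates(self, sstables):
        return [s for s in sstables if s not in self.uncompactable]

    def run_once(self, sstables):
        todo = self.candidates(sstables)
        if len(todo) < 2:
            return None  # nothing worth compacting
        try:
            return self.compact_fn(todo)
        except CorruptSSTableError as e:
            # Remember the bad file and log it instead of retrying forever.
            self.uncompactable.add(e.path)
            self.log.append("not compacting %s again: %s" % (e.path, e))
            return None
```

On the next pass the failed file is filtered out by `candidates`, so the remaining sstables can still be compacted.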





[jira] Issue Comment Edited: (CASSANDRA-2084) Corrupt sstables cause compaction to fail again, and again and again, ...

2011-01-31 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12989054#comment-12989054
 ] 

Stu Hood edited comment on CASSANDRA-2084 at 2/1/11 6:00 AM:
-

EDIT: Just double checked, apparently version 'f' was in the 0.7 branch, but 
did not make it into 0.7.0: apologies. I'll take a close look at this tomorrow.

-It looks like those SSTables were created with a pre-release version of 
Cassandra 0.7 (version 'e', vs the release version 'f'). Mind you, that is a 
usecase that we would like to support, but it's important information to 
include in a bug report.-

-This error occurs suspiciously close to the bloom filter reading code, which 
changed between e and f. I'll CC kingryan to have him take a look tomorrow.-

Keeping a list of uncompactable SSTables is an excellent idea: opened 
CASSANDRA-2087. Also opened CASSANDRA-2088 for the compaction cleanup problem. 
Thanks for the report!

  was (Author: stuhood):
It looks like those SSTables were created with a pre-release version of 
Cassandra 0.7 (version 'e', vs the release version 'f'). Mind you, that is a 
usecase that we would like to support, but it's important information to 
include in a bug report.

This error occurs suspiciously close to the bloom filter reading code, which 
changed between e and f. I'll CC kingryan to have him take a look tomorrow.

Keeping a list of uncompactable SSTables is an excellent idea: opened 
CASSANDRA-2087. Also opened CASSANDRA-2088 for the compaction cleanup problem. 
Thanks for the report!
  
 Corrupt sstables cause compaction to fail again, and again and again, ...
 -

 Key: CASSANDRA-2084
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2084
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0
 Environment: Ubuntu 10.10
 Cassandra 0.7.0 (4 Nodes)
 Java:
 - java version 1.6.0_22
 - Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
 - Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03, mixed mode)
Reporter: Dan Hendry

 I have been having some serious data corruption issues in my cluster. I 
 suspect some deeper, more serious Cassandra bug, but I don't know what or where 
 it is and I have not found a way to reproduce the issues I have been having. 
 This ticket is for a behaviour I have observed where Cassandra starts 
 compacting a set of sstables, fails, does not clean up the tmp files, then 
 starts compacting the exact same set of sstables again (see logs below). 
 After a while, the node runs out of disk space and crashes. At the very least, 
 Cassandra should clean up temp files after a failed compaction. Better yet, 
 it should stop trying to compact that file and log what file the error 
 occurred for. The list of corrupt sstables does not even have to be 
 persistent, just an in memory list which gets wiped out on a restart.
 Here is a sample log: the same 4 sstables are being compacted, then failing, 
 then being compacted again. 
  INFO [CompactionExecutor:1] 2011-01-31 13:08:26,434 CompactionManager.java 
 (line 272) Compacting 
 [org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-562-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-692-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-773-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/kikmetrics/DeviceEventsByDevice-e-940-Data.db')]
  INFO [HintedHandoff:1] 2011-01-31 13:08:28,878 HintedHandOffManager.java 
 (line 226) Could not complete hinted handoff to /192.168.4.16
  INFO [HintedHandoff:1] 2011-01-31 13:08:28,879 ColumnFamilyStore.java (line 
 648) switching in a fresh Memtable for HintsColumnFamily at 
 CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1296500864696.log',
  position=104140211)
  INFO [HintedHandoff:1] 2011-01-31 13:08:28,879 ColumnFamilyStore.java (line 
 952) Enqueuing flush of Memtable-HintsColumnFamily@1652350488(1155546 bytes, 
 20839 operations)
  INFO [FlushWriter:1] 2011-01-31 13:08:28,879 Memtable.java (line 155) 
 Writing Memtable-HintsColumnFamily@1652350488(1155546 bytes, 20839 operations)
  INFO [FlushWriter:1] 2011-01-31 13:08:29,199 Memtable.java (line 162) 
 Completed flushing 
 /var/lib/cassandra/data/system/HintsColumnFamily-e-9-Data.db (1075487 bytes)
  INFO [GossipStage:1] 2011-01-31 13:08:45,508 Gossiper.java (line 569) 
 InetAddress /192.168.4.16 is now UP
  INFO [COMMIT-LOG-WRITER] 2011-01-31 13:08:59,736 CommitLogSegment.java (line 
 50) Creating new commitlog segment 
 /var/lib/cassandra/commitlog/CommitLog-1296500939735.log
  

[jira] Updated: (CASSANDRA-1941) Add distributed test doing reads during MovementTest

2011-01-31 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1941:


Issue Type: Test  (was: New Feature)

 Add distributed test doing reads during MovementTest
 

 Key: CASSANDRA-1941
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1941
 Project: Cassandra
  Issue Type: Test
  Components: Core
Reporter: Jonathan Ellis
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.8


 Following introduction of the distributed test framework in CASSANDRA-1859, 
 we should extend that to test reads while bootstrap happens (this is a 
 scenario that has had regressions in the past).
 See test/distributed/README.txt for intro.





[jira] Created: (CASSANDRA-2089) Distributed test for the dynamic snitch

2011-01-31 Thread Stu Hood (JIRA)
Distributed test for the dynamic snitch
---

 Key: CASSANDRA-2089
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2089
 Project: Cassandra
  Issue Type: Test
  Components: Core
Reporter: Stu Hood


The dynamic snitch has turned into an essential component in dealing with 
partially failed nodes: it would be great to have it fully tested before the 
0.8 release.

In order to implement a proper test of the snitch, it is necessary to be able 
to flip a switch to place a node in a degraded state.
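
A toy model of what such a switch might look like in a test is sketched below. It is deliberately simplistic and every name in it (`FakeNode`, `degraded`, `fastest`) is hypothetical; in particular, picking the node with the lowest reported latency is only a stand-in for whatever scoring the dynamic snitch actually applies.

```python
class FakeNode(object):
    """Hypothetical test double for a node whose latency can be degraded."""

    def __init__(self, name, base_latency_ms):
        self.name = name
        self.base_latency_ms = base_latency_ms
        self.degraded = False  # the "switch" a test would flip

    def latency_ms(self):
        # A degraded node answers, but far more slowly than usual.
        factor = 50 if self.degraded else 1
        return self.base_latency_ms * factor


def fastest(nodes):
    """Stand-in for snitch ordering: prefer the lowest observed latency."""
    return min(nodes, key=lambda n: n.latency_ms())


nodes = [FakeNode("a", 1.0), FakeNode("b", 2.0)]
print(fastest(nodes).name)  # a
nodes[0].degraded = True
print(fastest(nodes).name)  # b
```

A distributed test would flip the equivalent switch on a live node and then assert that reads are routed away from it.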





[jira] Updated: (CASSANDRA-2089) Distributed test for the dynamic snitch

2011-01-31 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-2089:


Fix Version/s: 0.8

 Distributed test for the dynamic snitch
 ---

 Key: CASSANDRA-2089
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2089
 Project: Cassandra
  Issue Type: Test
  Components: Core
Reporter: Stu Hood
  Labels: des
 Fix For: 0.8


 The dynamic snitch has turned into an essential component in dealing with 
 partially failed nodes: it would be great to have it fully tested before the 
 0.8 release.
 In order to implement a proper test of the snitch, it is necessary to be able 
 to flip a switch to place a node in a degraded state.
