[Cassandra Wiki] Trivial Update of Counters by Alexis Wilke

2012-08-29 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The Counters page has been changed by Alexis Wilke:
http://wiki.apache.org/cassandra/Counters?action=diff&rev1=16&rev2=17

Comment:
Fixed a couple of plural forms, and replaced insert with add, which is much 
more appropriate for a counter!

  {{{
  [default@test] create column family counterCF with 
default_validation_class=CounterColumnType and replicate_on_write=true;
  }}}
- Setting the `default_validation_class` to `CounterColumnType` indicates that 
the column will be counters. Setting `replicate_on_write=true` will be optional 
starting in 0.8.2, but a bug made it default to false in 0.8.0 and 0.8.1, which 
is unsafe.
+ Setting the `default_validation_class` to `CounterColumnType` indicates that 
the columns will be counters. Setting `replicate_on_write=true` will be 
optional starting in 0.8.2, but a bug made it default to false in 0.8.0 and 
0.8.1, which is unsafe.
  
   Incrementing and accessing counters 
  
@@ -86, +86 @@

  == Technical limitations ==
  
* If a write fails unexpectedly (timeout or loss of connection to the 
coordinator node) the client will not know if the operation has been performed. 
A retry can result in an over count 
[[https://issues.apache.org/jira/browse/CASSANDRA-2495|CASSANDRA-2495]].
-   * Counter removal is intrinsically limited. For instance, if you issue very 
quickly the sequence increment, remove, increment it is possible for the 
removal to be lost (if for some reason the remove happens to be the last 
received messages). Hence, removal of counters is provided for definitive 
removal only, that is when the deleted counter is not increment afterwards. 
This holds for row deletion too: if you delete a row of counters, incrementing 
any counter in that row (that existed before the deletion) will result in an 
undetermined behavior. Note that if you need to reset a counter, one option 
(that is unfortunately not concurrent safe) could be to read its ''value'' and 
insert ''-value''.
+   * Counter removal is intrinsically limited. For instance, if you issue the 
sequence increment, remove, increment in quick succession, it is possible for 
the removal to be lost (if for some reason the remove happens to be the last 
message received). Hence, removal of counters is provided for definitive 
removal only, that is, when the deleted counter is not incremented afterwards. 
This holds for row deletion too: if you delete a row of counters, incrementing 
any counter in that row (that existed before the deletion) will result in 
undefined behavior. Note that if you need to reset a counter, one option (that 
is unfortunately not concurrency-safe) could be to read its ''value'' and 
add ''-value''.
* `CounterColumnType` may only be set in the `default_validation_class`. A 
column family either contains only counters, or no counters at all.
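The over-count and removal caveats above both stem from counter adds not being idempotent. A toy Python sketch (purely illustrative, not Cassandra code) of why blindly retrying a timed-out increment can double-count, while an idempotent overwrite is safe to retry:

```python
# Toy model: why retrying a counter increment can over-count.
# A timed-out write may or may not have been applied; the client cannot know.

def apply_with_retry(state, op, first_attempt_applied):
    """Simulate a write whose first attempt timed out, possibly after applying."""
    if first_attempt_applied:
        state = op(state)   # the 'lost' ack: the write actually landed
    return op(state)        # client retries blindly

increment = lambda v: v + 1   # counter add: NOT idempotent
overwrite = lambda v: 42      # regular column write: idempotent

# If the first attempt really was applied, the retry double-counts:
assert apply_with_retry(0, increment, first_attempt_applied=True) == 2   # wanted 1
assert apply_with_retry(0, increment, first_attempt_applied=False) == 1  # correct

# An idempotent overwrite is safe to retry either way:
assert apply_with_retry(0, overwrite, first_attempt_applied=True) == 42
assert apply_with_retry(0, overwrite, first_attempt_applied=False) == 42
```

The same asymmetry explains why read-then-add-''-value'' is not a safe reset: a concurrent add between the read and the write is silently absorbed.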
  
  == Further reading ==
- See [[https://issues.apache.org/jira/browse/CASSANDRA-1072|CASSANDRA-1072]] 
and especially the 
[[https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf|design
 doc]] for further information about how this works internally (but note that 
some of the limitation fixed in these technical documents have been fixed since 
then, for instance all consistency level '''are''' supported, for both reads 
and writes).
+ See [[https://issues.apache.org/jira/browse/CASSANDRA-1072|CASSANDRA-1072]] 
and especially the 
[[https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf|design
 doc]] for further information about how this works internally (but note that 
some of the limitations described in these technical documents have been fixed 
since then; for instance, all consistency levels '''are''' supported, for both 
reads and writes).
  


[jira] [Updated] (CASSANDRA-4049) Add generic way of adding SSTable components required custom compaction strategy

2012-08-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Kołaczkowski updated CASSANDRA-4049:
--

Attachment: (was: pluggable_custom_components.patch)

 Add generic way of adding SSTable components required custom compaction 
 strategy
 

 Key: CASSANDRA-4049
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4049
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Piotr Kołaczkowski
Assignee: Piotr Kołaczkowski
Priority: Minor
  Labels: compaction
 Fix For: 1.1.5

 Attachments: pluggable_custom_components-1.1.4.patch


 CFS compaction strategy coming up in the next DSE release needs to store some 
 important information in Tombstones.db and RemovedKeys.db files, one per 
 sstable. However, currently Cassandra issues warnings when these files are 
 found in the data directory. Additionally, when switched to 
 SizeTieredCompactionStrategy, the files are left in the data directory after 
 compaction.
 The attached patch adds new components to the Component class so Cassandra 
 knows about those files.
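The patch's idea can be illustrated with a simplified model: teach the server the full set of per-sstable component suffixes, including strategy-specific ones, so such files are neither warned about nor orphaned after compaction. A hedged Python sketch (the suffix sets and filename format here are illustrative, not Cassandra's actual Component class):

```python
# Toy sketch: recognize custom per-sstable component files by suffix so they
# are neither warned about on startup nor left behind after compaction.
# Suffix sets and the filename format are illustrative assumptions.

CORE_COMPONENTS = {"Data.db", "Index.db", "Filter.db", "Statistics.db"}
CUSTOM_COMPONENTS = {"Tombstones.db", "RemovedKeys.db"}  # registered by a strategy

def classify(filename):
    """Classify an sstable file by its trailing component suffix."""
    suffix = filename.split("-")[-1]
    if suffix in CORE_COMPONENTS:
        return "core"
    if suffix in CUSTOM_COMPONENTS:
        return "custom"    # known: deleted along with the sstable, no warning
    return "unknown"       # previously: warning at startup + leftover file

assert classify("ks-cf-ia-1-Data.db") == "core"
assert classify("ks-cf-ia-1-Tombstones.db") == "custom"
assert classify("ks-cf-ia-1-Foo.db") == "unknown"
```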

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4049) Add generic way of adding SSTable components required custom compaction strategy

2012-08-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443899#comment-13443899
 ] 

Piotr Kołaczkowski commented on CASSANDRA-4049:
---

Ok, rebased. Actually only line numbers changed, there was no conflict.




[jira] [Updated] (CASSANDRA-4049) Add generic way of adding SSTable components required custom compaction strategy

2012-08-29 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Kołaczkowski updated CASSANDRA-4049:
--

Attachment: pluggable_custom_components-1.1.4.patch




[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-29 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-2897:
---

Attachment: 0003-CASSANDRA-2897.txt

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0003-CASSANDRA-2897.txt, 2897-apply-cleanup.txt, 41ec9fc-2897.txt


 Currently, secondary index updates require a read-before-write to maintain 
 the index consistency. Keeping the index consistent at all times is not 
 necessary, however. We could let the (secondary) index get inconsistent on 
 writes and repair it on reads. This would be easy because on reads we make 
 sure to request the indexed columns anyway, so we can just skip the rows 
 that are not needed and repair the index at the same time.
 This does trade work on writes for work on reads. However, read-before-write 
 is sufficiently costly that it will likely be a win overall.
 There are (at least) two small technical difficulties here, though:
 # If we repair on read, this will be racy with writes, so we'll probably have 
 to synchronize there.
 # We probably shouldn't rely only on reads to repair; we should also have a 
 task to repair the index for things that are rarely read. It's unclear how to 
 make that low-impact, though.
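The repair-on-read scheme described above can be sketched as: writes touch only the base data plus a possibly stale index entry, and reads filter index hits against the base row, repairing stale entries in passing. A simplified Python model (names are illustrative, not Cassandra's API):

```python
# Simplified model of 'repair index on read' for a secondary index.
# base: row -> indexed value (authoritative); index: value -> set of rows.

class Table:
    def __init__(self):
        self.base = {}    # authoritative data
        self.index = {}   # may temporarily contain stale entries

    def write(self, row, value):
        # No read-before-write: the old index entry is NOT removed here,
        # so the index can keep pointing old_value -> row (stale).
        self.base[row] = value
        self.index.setdefault(value, set()).add(row)

    def query(self, value):
        hits = self.index.get(value, set())
        live = {r for r in hits if self.base.get(r) == value}
        for r in hits - live:     # repair-on-read: drop stale entries
            hits.discard(r)
        return live

t = Table()
t.write("r1", "blue")
t.write("r1", "red")                   # index still has blue -> r1 (stale)
assert t.query("blue") == set()        # stale hit filtered and repaired
assert "r1" not in t.index.get("blue", set())
assert t.query("red") == {"r1"}
```

The race mentioned in point 1 shows up here too: a write landing between the filter and the repair could delete a freshly valid entry, hence the need for synchronization.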



[jira] [Updated] (CASSANDRA-4561) update column family fails

2012-08-29 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-4561:
---

Attachment: CASSANDRA-4561.patch

 update column family fails
 --

 Key: CASSANDRA-4561
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4561
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4
Reporter: Zenek Kraweznik
Assignee: Pavel Yaskevich
 Attachments: CASSANDRA-4561.patch


 [default@test] show schema;
 create column family Messages
   with column_type = 'Standard'
   and comparator = 'AsciiType'
   and default_validation_class = 'BytesType'
   and key_validation_class = 'AsciiType'
   and read_repair_chance = 0.1
   and dclocal_read_repair_chance = 0.0
   and gc_grace = 864000
   and min_compaction_threshold = 2
   and max_compaction_threshold = 4
   and replicate_on_write = true
   and compaction_strategy = 
 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
   and caching = 'KEYS_ONLY'
   and compaction_strategy_options = {'sstable_size_in_mb' : '1024'}
   and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' 
 : 'org.apache.cassandra.io.compress.DeflateCompressor'};
 [default@test] update column family Messages with min_compaction_threshold = 
 4 and  max_compaction_threshold = 32;
 a5b7544e-1ef5-3bfd-8770-c09594e37ec2
 Waiting for schema agreement...
 ... schemas agree across the cluster
 [default@test] show schema;
 create column family Messages
   with column_type = 'Standard'
   and comparator = 'AsciiType'
   and default_validation_class = 'BytesType'
   and key_validation_class = 'AsciiType'
   and read_repair_chance = 0.1
   and dclocal_read_repair_chance = 0.0
   and gc_grace = 864000
   and min_compaction_threshold = 2
   and max_compaction_threshold = 4
   and replicate_on_write = true
   and compaction_strategy = 
 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
   and caching = 'KEYS_ONLY'
   and compaction_strategy_options = {'sstable_size_in_mb' : '1024'}
   and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' 
 : 'org.apache.cassandra.io.compress.DeflateCompressor'};



[jira] [Commented] (CASSANDRA-4532) NPE when trying to select a slice from a composite table

2012-08-29 Thread basanth gowda (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444059#comment-13444059
 ] 

basanth gowda commented on CASSANDRA-4532:
--

I was trying to get a slice range, like you could do in thrift.

table defn :

create table schedules(status ascii, timecreated bigint, key ascii, nil ascii, 
PRIMARY KEY(status,timecreated,key));

for the same time there can be a lot of entries.

Lets suppose there are 50 entries that match where timecreated is  Ln

1st query : select * from schedules where timecreated = Ln limit 10;


2nd Query : select * from schedules where timecreated=L10 AND key=K10 and 
timecreatedLn.

In CQL terms I know this is a wrong query; basically I am not sure how to 
represent BETWEEN in CQL.


In Hector I would get a slice range limited to 10 the first time; for each 
subsequent query (until no more rows are returned) I would use the time and 
key returned by the last row as the start of the range. This is in production 
and works perfectly fine.
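The resume-where-you-left-off pattern described above is paging by the composite clustering key (timecreated, key) with lexicographic comparison. A small Python sketch of the idea (illustrative only; it does not work around the CQL limitation itself):

```python
# Page through rows ordered by (timecreated, key), resuming strictly after
# the last (timecreated, key) pair returned by the previous page.

rows = sorted([
    (12345678, "key1"), (12345678, "key2"),
    (12345679, "key3"), (12345679, "key4"), (12345679, "key5"),
    (12345680, "key6"),
])

def page(rows, after=None, limit=3):
    """Return up to `limit` rows strictly after the cursor `after`."""
    if after is not None:
        rows = [r for r in rows if r > after]   # lexicographic tuple compare
    return rows[:limit]

p1 = page(rows, limit=3)
assert p1 == [(12345678, "key1"), (12345678, "key2"), (12345679, "key3")]

# Resume from the last row of the previous page:
p2 = page(rows, after=p1[-1], limit=3)
assert p2 == [(12345679, "key4"), (12345679, "key5"), (12345680, "key6")]
```

This is essentially what the Hector slice-range approach does under the hood: the cursor is the full composite column name, not the timestamp alone, so rows sharing a timestamp are not skipped or repeated.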

 NPE when trying to select a slice from a composite table
 

 Key: CASSANDRA-4532
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4532
 Project: Cassandra
  Issue Type: Bug
  Components: API, Core
Affects Versions: 1.1.3
 Environment: Cassandra 1.1.3 (2 nodes) on a single host - mac osx
Reporter: basanth gowda
Priority: Minor
  Labels: Slice, cql, cql3

 I posted this question on StackOverflow, because i need a solution. 
 Created a table with :
 {noformat}
 create table compositetest(m_id ascii,i_id int,l_id ascii,body ascii, PRIMARY 
 KEY(m_id,i_id,l_id));
 {noformat}
 I wanted to slice the results returned, so I did something like below; not 
 sure if it's the right way. The first query returns data perfectly as 
 expected, but the second one, to get the next 3 columns, closes the 
 transport of my cqlsh.
 {noformat}
 cqlsh:testkeyspace1 select * from compositetest where i_id=3 limit 3;
  m_id | i_id | l_id | body
 --+--+--+--
m1 |1 |   l1 |   b1
m1 |2 |   l2 |   b2
m2 |1 |   l1 |   b1
 cqlsh:testkeyspace1 Was trying to write something for slice range.
 TSocket read 0 bytes
 {noformat}
 Is there a way to achieve what I am doing here? It would be good if a 
 meaningful error were sent back, instead of cqlsh closing the transport.
 On the server side I see the following error.
 {noformat}
 ERROR [Thrift:3] 2012-08-12 15:15:24,414 CustomTThreadPoolServer.java (line 
 204) Error occurred during processing of message.
 java.lang.NullPointerException
   at 
 org.apache.cassandra.cql3.statements.SelectStatement$Restriction.setBound(SelectStatement.java:1277)
   at 
 org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.updateRestriction(SelectStatement.java:1151)
   at 
 org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:1001)
   at 
 org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:215)
   at 
 org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:121)
   at 
 org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1237)
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3542)
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3530)
   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
   at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:680)
 {noformat}
 With ThriftClient I get :
 {noformat}
 org.apache.thrift.transport.TTransportException
   at 
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
   at 
 org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
   at 
 org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
   at 

[jira] [Updated] (CASSANDRA-2293) Rewrite nodetool help

2012-08-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2293:
--

Reviewer: amorton

 Rewrite nodetool help
 -

 Key: CASSANDRA-2293
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2293
 Project: Cassandra
  Issue Type: Improvement
  Components: Core, Documentation & website
Affects Versions: 0.8 beta 1
Reporter: Aaron Morton
Assignee: Jason Brown
Priority: Minor
 Fix For: 1.2.1

 Attachments: 0001-Jira-CASSANDRA-2293-Rewrite-nodetool-help.patch


 Once CASSANDRA-2008 is through and we are happy with the approach, I would 
 like to write similar help for nodetool: both command-line help of the form 
 nodetool help and nodetool help <command>.



[jira] [Commented] (CASSANDRA-4532) NPE when trying to select a slice from a composite table

2012-08-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444067#comment-13444067
 ] 

Jonathan Ellis commented on CASSANDRA-4532:
---

can you test against trunk?




[jira] [Commented] (CASSANDRA-4532) NPE when trying to select a slice from a composite table

2012-08-29 Thread basanth gowda (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444099#comment-13444099
 ] 

basanth gowda commented on CASSANDRA-4532:
--

No luck. See the last query closed the socket.

Here are the steps to reproduce :

cqlsh:testkeyspace1 create table compositetest(status ascii,ctime bigint,key 
ascii,nil ascii,PRIMARY KEY(status,ctime,key));

cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345678,'key1','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345678,'key2','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345679,'key3','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345679,'key4','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345679,'key5','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345680,'key6','');
cqlsh:testkeyspace1 select * from compositetest;
 status | ctime| key  | nil
+--+--+-
  C | 12345678 | key1 |
  C | 12345678 | key2 |
  C | 12345679 | key3 |
  C | 12345679 | key4 |
  C | 12345679 | key5 |
  C | 12345680 | key6 |

1st query of slice :

cqlsh:testkeyspace1 select * from compositetest where ctime=12345680 limit 3;
 status | ctime| key  | nil
+--+--+--
  C | 12345678 | key1 | 
  C | 12345678 | key2 | 
  C | 12345679 | key3 | null

Second Query : I want to get values where first one left off (Yes you could do 
this with hector) [Try 1]

cqlsh:testkeyspace1 select * from compositetest where ctime=12345679 and 
key='key3' and ctime=12345680 limit 3;
Bad Request: PRIMARY KEY part key cannot be restricted (preceding part ctime is 
either not restricted or by a non-EQ relation) [Try 2]
cqlsh:testkeyspace1 select * from compositetest where ctime=12345679 and 
key='key3' and ctime=12345680 limit 3;
TSocket read 0 bytes
cqlsh:testkeyspace1


[jira] [Comment Edited] (CASSANDRA-4532) NPE when trying to select a slice from a composite table

2012-08-29 Thread basanth gowda (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444099#comment-13444099
 ] 

basanth gowda edited comment on CASSANDRA-4532 at 8/30/12 1:21 AM:
---

No luck. See the last query closed the socket. I took the latest from git and 
compiled

Here are the steps to reproduce :

cqlsh:testkeyspace1 create table compositetest(status ascii,ctime bigint,key 
ascii,nil ascii,PRIMARY KEY(status,ctime,key));

cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345678,'key1','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345678,'key2','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345679,'key3','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345679,'key4','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345679,'key5','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES 
('C',12345680,'key6','');
cqlsh:testkeyspace1 select * from compositetest;
 status | ctime| key  | nil
+--+--+-
  C | 12345678 | key1 |
  C | 12345678 | key2 |
  C | 12345679 | key3 |
  C | 12345679 | key4 |
  C | 12345679 | key5 |
  C | 12345680 | key6 |

1st query of slice :

cqlsh:testkeyspace1 select * from compositetest where ctime=12345680 limit 3;
 status | ctime| key  | nil
+--+--+--
  C | 12345678 | key1 | 
  C | 12345678 | key2 | 
  C | 12345679 | key3 | null

Second Query : I want to get values where first one left off (Yes you could do 
this with hector) [Try 1]

cqlsh:testkeyspace1 select * from compositetest where ctime=12345679 and 
key='key3' and ctime=12345680 limit 3;
Bad Request: PRIMARY KEY part key cannot be restricted (preceding part ctime is 
either not restricted or by a non-EQ relation) [Try 2]
cqlsh:testkeyspace1 select * from compositetest where ctime=12345679 and 
key='key3' and ctime=12345680 limit 3;
TSocket read 0 bytes
cqlsh:testkeyspace1

 NPE when trying to select a slice from a composite table
 

 Key: CASSANDRA-4532
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4532
 Project: Cassandra
  Issue Type: Bug
  Components: API, Core
Affects Versions: 1.1.3
 Environment: Cassandra 1.1.3 (2 nodes) on a single host - mac osx
Reporter: basanth gowda
Priority: Minor
  Labels: Slice, cql, cql3

 I posted this question on StackOverflow, because I need a solution. 
 Created a table with :
 {noformat}
 create table compositetest(m_id ascii,i_id int,l_id ascii,body ascii, PRIMARY 
 KEY(m_id,i_id,l_id));
 {noformat}
 wanted to slice 

[jira] [Updated] (CASSANDRA-4292) Improve JBOD loadbalancing and reduce contention

2012-08-29 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-4292:
--

Attachment: 0001-Fix-writing-sstables-to-wrong-directory-when-compact.patch

Dave,

You are right. Attaching a fix to choose the right directory to write sstables to when compacting.

 Improve JBOD loadbalancing and reduce contention
 

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-Fix-writing-sstables-to-wrong-directory-when-compact.patch, 4292.txt, 
 4292-v2.txt, 4292-v3.txt, 4292-v4.txt


 As noted in CASSANDRA-809, we have a certain number of flush (and compaction) threads, which mix and match disk volumes indiscriminately.  It may be worth creating a tight thread-to-disk affinity, to prevent unnecessary contention at that level.
 OTOH as SSDs become more prevalent this becomes a non-issue.  Unclear how much pain this actually causes in practice in the meantime.
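The affinity idea can be sketched roughly as follows. This is a hypothetical illustration (the class names and the least-loaded selection heuristic are assumptions, not Cassandra's implementation):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of thread/disk affinity: each flush or compaction
// task is pinned to one data directory, chosen by least queued work,
// instead of threads mixing volumes indiscriminately.
class DiskAffinity {
    static final class DataDirectory {
        final String path;
        final AtomicLong queuedBytes = new AtomicLong();
        DataDirectory(String path) { this.path = path; }
    }

    private final List<DataDirectory> directories;

    DiskAffinity(List<DataDirectory> directories) {
        this.directories = directories;
    }

    // Pick the directory with the least queued bytes and account the
    // new task against it, so concurrent tasks spread across volumes.
    DataDirectory assign(long taskBytes) {
        DataDirectory best = directories.get(0);
        for (DataDirectory d : directories)
            if (d.queuedBytes.get() < best.queuedBytes.get()) best = d;
        best.queuedBytes.addAndGet(taskBytes);
        return best;
    }
}
```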

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-4583) Some nodes forget schema when 1 node fails

2012-08-29 Thread Edward Sargisson (JIRA)
Edward Sargisson created CASSANDRA-4583:
---

 Summary: Some nodes forget schema when 1 node fails
 Key: CASSANDRA-4583
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4583
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
 Environment: CentOS release 6.3 (Final)
Reporter: Edward Sargisson


At present we do not have a complete reproduction for this defect, but we are raising it as requested by Aaron Morton. We will update as we find out more. If any additional logging or tests are requested, we will do them if we can.

We have experienced 2 failures ascribed to this defect. On the cassandra user 
mailing list Peter Schuller (2012-08-28) describes an additional failure.

Reproduction steps as currently known:
1. Set up a cluster with 6 nodes (call them #1 through #6).
2. Have #5 fail completely. One failure occurred when the node was stopped to replace the battery in the hard disk cache. The second failure occurred when the hardware monitoring recorded a problem: CPU usage was increasing without explanation and the server console was frozen, so the machine was restarted.
3. Bring #5 back

Expected behaviour:
* #5 should rejoin the ring.

Actual behaviour (based on the incident we saw yesterday):
* #5 didn't rejoin the ring.
* We stopped all nodes and started them one by one.
* Nodes #2, #4, #6 had forgotten most of their column families. They had the keyspace but with only one column family instead of the usual 9 or so.
* We ran nodetool resetlocalschema on #2, #4 and #6.
* We ran nodetool repair -pr on #2, #4, #5 and #6
* On one of these nodes, nodetool repair appeared to crash, in that there were no messages in the logs from it for 10min+. Nodetool compactionstats and nodetool netstats showed no activity.
* Restarting nodetool repair -pr fixed the problem and ran to completion.





[jira] [Updated] (CASSANDRA-4583) Some nodes forget schema when 1 node fails

2012-08-29 Thread Edward Sargisson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Sargisson updated CASSANDRA-4583:


Description: 
At present we do not have a complete reproduction for this defect, but we are raising it as requested by Aaron Morton. We will update as we find out more. If any additional logging or tests are requested, we will do them if we can.

We have experienced 2 failures ascribed to this defect. On the cassandra user 
mailing list Peter Schuller (2012-08-28) describes an additional failure.

Reproduction steps as currently known:
1. Set up a cluster with 6 nodes (call them #1 through #6).
2. Have #5 fail completely. One failure occurred when the node was stopped to replace the battery in the hard disk cache. The second failure occurred when the hardware monitoring recorded a problem: CPU usage was increasing without explanation and the server console was frozen, so the machine was restarted.
3. Bring #5 back

Expected behaviour:
* #5 should rejoin the ring.

Actual behaviour (based on the incident we saw yesterday):
* #5 didn't rejoin the ring.
* We stopped all nodes and started them one by one.
* Nodes #2, #4, #6 had forgotten most of their column families. They had the keyspace but with only one column family instead of the usual 9 or so.
* We ran nodetool resetlocalschema on #2, #4 and #6.
* We ran nodetool repair -pr on #2, #4, #5 and #6
* On #2, nodetool repair appeared to crash, in that there were no messages in the logs from it for 10min+. Nodetool compactionstats and nodetool netstats showed no activity.
* Restarting nodetool repair -pr fixed the problem and ran to completion.



  was:
At present we do not have a complete reproduction for this defect but am 
raising this defect as request by Aaron Morton. We will update as we find out 
more. If any additional logging or tests are requested we will do them if we 
can. 

We have experienced 2 failures ascribed to this defect. On the cassandra user 
mailing list Peter Schuller (2012-08-28) describes an additional failure.

Reproduction steps as currently known:
1. Setup a cluster with 6 nodes (call them #1 through #6).
2. Have #5 fail completely. One failure was when the node was stopped to 
replace the battery in the hard disk cache. The second failure was when the 
hardware monitoring recorded a problem, CPU usage was increasing without 
explanation and the server console was frozen so the machine was restarted.
3. Bring #5 back

Expected behaviour:
* #5 should rejoin the ring.

Actual behaviour (based on the incident we saw yesterday):
* #5 didn't rejoin the ring.
* We stopped all nodes and started them one by one.
* Nodes #2, #4, #6 had forgotten most of their column families. They had the 
keys space but with only one column family instead of the usual 9 or so.
* We ran nodetool resetlocalschema on #2, #4 and #6.
* We ran nodetool repair -pr on #2, #4, #5 and #6
* On one of these nodes nodetool repair appeared to crash in that there were no 
messages in the logs from it for 10min+. Nodetool compactionstats and nodetool 
netstats showed no activity.
* Restarting nodetool repair -pr fixed the problem and ran to completion.




 Some nodes forget schema when 1 node fails
 --

 Key: CASSANDRA-4583
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4583
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
 Environment: CentOS release 6.3 (Final)
Reporter: Edward Sargisson

 At present we do not have a complete reproduction for this defect, but we are raising it as requested by Aaron Morton. We will update as we find out more. If any additional logging or tests are requested, we will do them if we can. 
 We have experienced 2 failures ascribed to this defect. On the cassandra user 
 mailing list Peter Schuller (2012-08-28) describes an additional failure.
 Reproduction steps as currently known:
 1. Setup a cluster with 6 nodes (call them #1 through #6).
 2. Have #5 fail completely. One failure was when the node was stopped to 
 replace the battery in the hard disk cache. The second failure was when the 
 hardware monitoring recorded a problem, CPU usage was increasing without 
 explanation and the server console was frozen so the machine was restarted.
 3. Bring #5 back
 Expected behaviour:
 * #5 should rejoin the ring.
 Actual behaviour (based on the incident we saw yesterday):
 * #5 didn't rejoin the ring.
 * We stopped all nodes and started them one by one.
 * Nodes #2, #4, #6 had forgotten most of their column families. They had the keyspace but with only one column family instead of the usual 9 or so.
 * We ran nodetool resetlocalschema on #2, #4 and #6.
 * We ran nodetool repair -pr on #2, #4, #5 and #6
 * On #2 nodetool repair appeared to crash in that 

[jira] [Updated] (CASSANDRA-4572) lost+found directory in the data dir causes problems again

2012-08-29 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-4572:
--

Attachment: 4572-1.1.txt

When you do File#listFiles on a lost+found directory, it returns null. I believe there are other cases where it returns null as well, so the attached patch simply checks for null after File#listFiles is called.
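A minimal sketch of the defensive pattern described, relying only on documented java.io behavior (File#listFiles returns null, rather than throwing, when the path is not a readable directory, e.g. a root-owned lost+found mount point):

```java
import java.io.File;

// Defensive wrapper around File#listFiles: normalize the null return
// (unreadable directory, non-directory, or I/O error) to an empty array
// so callers cannot hit a NullPointerException.
class SafeList {
    static File[] safeListFiles(File dir) {
        File[] files = dir.listFiles();
        return files == null ? new File[0] : files;
    }
}
```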

 lost+found directory in the data dir causes problems again
 --

 Key: CASSANDRA-4572
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4572
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
Reporter: Brandon Williams
Assignee: Yuki Morishita
 Fix For: 1.1.5

 Attachments: 4572-1.1.txt


 Looks like we've regressed from CASSANDRA-1547 and mounting a fs directly on 
 the data dir is a problem again.
 {noformat}
 INFO [main] 2012-08-22 23:30:03,710 Directories.java (line 475) Upgrade from pre-1.1 version detected: migrating sstables to new directory layout
 ERROR [main] 2012-08-22 23:30:03,712 AbstractCassandraDaemon.java (line 370) Exception encountered during startup
 java.lang.NullPointerException
 	at org.apache.cassandra.db.Directories.migrateSSTables(Directories.java:487)
 {noformat}



[jira] [Commented] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files

2012-08-29 Thread Steven Willcox (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444161#comment-13444161
 ] 

Steven Willcox commented on CASSANDRA-4571:
---

We are also seeing this bug and all nodes eventually run out of file 
descriptors and crash.

 Strange permament socket descriptors increasing leads to Too many open files
 --

 Key: CASSANDRA-4571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.1, 1.1.2, 1.1.3
 Environment: CentOS 5.8 Linux 2.6.18-308.13.1.el5 #1 SMP Tue Aug 21 
 17:10:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux. 
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Serg Shnerson
Priority: Critical

 On the two-node cluster we found a strange increase in socket descriptors. lsof -n | grep java shows many rows like
 java   8380 cassandra  113r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  114r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  115r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  116r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  117r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  118r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  119r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  120r unix 0x8101a374a080 938348482 socket
 The number of these rows is constantly increasing. After about 24 hours this situation leads to the error.
 We use the PHPCassa client. Load is not so high (around ~50kB/s on write). 
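As a diagnostic sketch (not part of Cassandra; the parsing is simplified and illustrative), one way to confirm this pattern is to count how many file descriptors in lsof output share the same DEVICE column, i.e. point at the same underlying socket object:

```java
import java.util.HashMap;
import java.util.Map;

// Count duplicate descriptors per socket in `lsof` output lines.
// Columns: COMMAND PID USER FD TYPE DEVICE NODE NAME; we group unix
// sockets by their DEVICE (kernel address) column.
class LsofDupCounter {
    static Map<String, Integer> countByDevice(String[] lsofLines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lsofLines) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length >= 6 && "unix".equals(cols[4]))
                counts.merge(cols[5], 1, Integer::sum); // DEVICE column
        }
        return counts;
    }
}
```

A count much greater than 1 for a single device, as in the listing above, indicates descriptors accumulating on one socket rather than many distinct connections.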



[jira] [Comment Edited] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files

2012-08-29 Thread Steven Willcox (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444161#comment-13444161
 ] 

Steven Willcox edited comment on CASSANDRA-4571 at 8/30/12 3:22 AM:


We are also seeing this bug and all nodes eventually run out of file 
descriptors and crash. It is a blocker for us.

  was (Author: swillcox):
We are also seeing this bug and all nodes eventually run out of file 
descriptors and crash.
  
 Strange permament socket descriptors increasing leads to Too many open files
 --

 Key: CASSANDRA-4571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.1, 1.1.2, 1.1.3
 Environment: CentOS 5.8 Linux 2.6.18-308.13.1.el5 #1 SMP Tue Aug 21 
 17:10:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux. 
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Serg Shnerson
Priority: Critical

 On the two-node cluster there was found strange socket descriptors 
 increasing. lsof -n | grep java shows many rows like
 java   8380 cassandra  113r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  114r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  115r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  116r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  117r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  118r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  119r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  120r unix 0x8101a374a080
 938348482 socket
  And number of this rows constantly increasing. After about 24 hours this 
 situation leads to error.
 We use PHPCassa client. Load is not so high (aroud ~50kb/s on write). 



[jira] [Updated] (CASSANDRA-1123) Allow tracing query details

2012-08-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1123:
--

Attachment: 1123-v9.txt

How about this?  v9 is v7 with an ExecuteOnlyExecutor to remind us in the 
future not to use submit on the trace stage.
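A sketch of the idea, under the assumption that the point of an ExecuteOnlyExecutor is to expose execute() but not submit(), so a task's exception cannot be silently swallowed by a discarded Future (names and structure are illustrative, not the patch's actual code):

```java
import java.util.concurrent.Executor;

// Wrap the trace stage in a type that only offers fire-and-forget
// execute(); since there is no submit(), exceptions propagate to the
// thread's uncaught-exception handler instead of dying inside a Future.
final class ExecuteOnlyExecutor implements Executor {
    private final Executor delegate;

    ExecuteOnlyExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    @Override
    public void execute(Runnable task) {
        delegate.execute(task);
    }
}
```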

 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2.0 beta 1

 Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch, 1123.patch, 
 1123.patch, 1123-v6.txt, 1123-v7.patch, 1123-v8.patch, 1123-v9.txt


 In the spirit of CASSANDRA-511, it would be useful to have tracing on queries to see where latency is coming from: how long did row cache lookup take?  key search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html



[jira] [Commented] (CASSANDRA-4383) Binary encoding of vnode tokens

2012-08-29 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444176#comment-13444176
 ] 

Eric Evans commented on CASSANDRA-4383:
---

The approach taken here continues to make me a little hesitant simply because, I think, it introduces for the first time a need for proper ordering of STATE transmission/reception.  I don't have a clear-enough understanding of how the underlying messaging works to know if we can firmly rely on that or not (I take you at your word that we can), but it is significant enough to warrant a mention, if only for the interface.

And on that note...

bq. Patch 0003 is something we can take or leave, it just seemed to make sense 
to have a way to atomically set two gossip states at once in case the gossiper 
fires in between adding the first and second state, however it's ok for us to 
gossip TOKENS without STATUS, since STATUS is what fires events. I'm also not 
100% certain it actually adds them atomically, since EndpointState is backed by 
NBHM.

What I had in mind was (at least) something like a 
{{sendState\{Normal,Bootstrap,...\}(Collection<Token> tokens)}} to encapsulate 
those operations sensitive to ordering.  
{{Gossiper.addLocalApplicationStates(...)}} still makes it too easy to do the 
wrong thing.
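A sketch of the suggested encapsulation, with all names illustrative (this is the reviewer's proposed shape, not existing Cassandra code): the order-sensitive pair of gossip updates, TOKENS followed by the STATUS that fires events, is hidden behind one method so callers cannot interleave them incorrectly.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical wrapper: callers announce a state transition with its
// tokens in one call, and the ordering (TOKENS before STATUS) is
// enforced in exactly one place.
class GossipStateSender {
    final List<String> applied = new ArrayList<>();

    void sendStateNormal(Collection<String> tokens) {
        // TOKENS must be visible before the STATUS that fires events.
        applied.add("TOKENS:" + tokens.size());
        applied.add("STATUS:NORMAL");
    }
}
```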

A couple of other things:

{{StorageService.getHostId(ep)}} creates a second means of obtaining a host ID, 
and it's not at all obvious that it should only be used in assigning the value 
returned by the method of the _same name_ in {{TokenMetadata}}.  At a minimum, 
I think this should be given a new name, one that makes obvious its purpose.  
However, wouldn't encapsulation be better anyway if this were a {{Gossiper}} 
method?

And in addition to {{getHostId}}, {{usesHostId}} also seems better suited to 
the {{Gossiper}}.

Finally, is there any reason that {{TokenSerializer.(de)serialize(...)}} 
shouldn't be static?  What is the instance buying us?



As for the whole ...using the presence of a hostID as an implicit indication of the version..., I was unaware of CASSANDRA-4317 (e6530cc3) and (somehow) mistakenly took that as part of this change.  Sorry about that.

bq. Well, it's six one way and a half dozen the other. We can look at 
NET_VERSION instead, but it was also introduced in 1.2, so it's effectively the 
same thing... and you could have the bug in the opposite direction

If we look for a NET_VERSION that's not there (and should be), then the error 
is a missing NET_VERSION.  If NET_VERSION is >= 1.2, or exists at all if you 
prefer the implicit (I don't), and there is no HOST_ID, then we have a bug in 
transmitting/reception of HOST_ID.

But, mostly I meant that it doesn't read as well as the old code that clearly 
did one thing when the version was < X, and another when it was >= Y.

Anyway, this ship has sailed as far as this ticket goes, so if you'd prefer to 
discuss it elsewhere then that's fine.  There is no need to hold any of this 
against this particular issue/change.

 Binary encoding of vnode tokens
 ---

 Key: CASSANDRA-4383
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4383
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Brandon Williams
Assignee: Brandon Williams
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-Add-HOST_ID-and-TOKENS-app-states-binary-serialization.txt, 
 0002-Fix-tests.txt, 0003-Add-tokens-and-status-atomically.txt


 Since after CASSANDRA-4317 we can know which version a remote node is using 
 (that is, whether it is vnode-aware or not), this is a good opportunity to 
 change the token encoding to binary, since with a default of 256 tokens per 
 node even a fixed-length 16-byte encoding per token provides a great deal of 
 savings in gossip traffic over a text representation.
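The arithmetic behind that claim can be made concrete. The text-encoding size here is an assumption for illustration: RandomPartitioner tokens are integers in [0, 2^127), i.e. up to 39 decimal digits, plus a separator byte per token.

```java
// Back-of-envelope comparison for tokens-per-node gossip payload:
// fixed 16-byte binary encoding vs. decimal-text encoding.
class TokenEncodingSize {
    static int binaryBytes(int tokenCount) {
        return tokenCount * 16; // fixed-length 16-byte encoding per token
    }

    static int textBytes(int tokenCount, int digitsPerToken) {
        return tokenCount * (digitsPerToken + 1); // digits plus separator
    }
}
```

With the default 256 tokens, binary is 4 KB per node versus roughly 10 KB of text, before any further compression.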



[jira] [Updated] (CASSANDRA-4583) Some nodes forget schema when 1 node fails

2012-08-29 Thread Edward Sargisson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Sargisson updated CASSANDRA-4583:


Attachment: cass-4583-5-system.log
cass-4583-2-system.log

cass-4583-5-system.log is an extract from #5 from the time of the incident.
Similarly, cass-4583-2-system.log is from #2.

#2 is 10.30.11.40
#5 is 10.30.11.43

 Some nodes forget schema when 1 node fails
 --

 Key: CASSANDRA-4583
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4583
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
 Environment: CentOS release 6.3 (Final)
Reporter: Edward Sargisson
 Attachments: cass-4583-2-system.log, cass-4583-5-system.log


 At present we do not have a complete reproduction for this defect, but we are raising it as requested by Aaron Morton. We will update as we find out more. If any additional logging or tests are requested, we will do them if we can. 
 We have experienced 2 failures ascribed to this defect. On the cassandra user 
 mailing list Peter Schuller (2012-08-28) describes an additional failure.
 Reproduction steps as currently known:
 1. Setup a cluster with 6 nodes (call them #1 through #6).
 2. Have #5 fail completely. One failure was when the node was stopped to 
 replace the battery in the hard disk cache. The second failure was when the 
 hardware monitoring recorded a problem, CPU usage was increasing without 
 explanation and the server console was frozen so the machine was restarted.
 3. Bring #5 back
 Expected behaviour:
 * #5 should rejoin the ring.
 Actual behaviour (based on the incident we saw yesterday):
 * #5 didn't rejoin the ring.
 * We stopped all nodes and started them one by one.
 * Nodes #2, #4, #6 had forgotten most of their column families. They had the keyspace but with only one column family instead of the usual 9 or so.
 * We ran nodetool resetlocalschema on #2, #4 and #6.
 * We ran nodetool repair -pr on #2, #4, #5 and #6
 * On #2 nodetool repair appeared to crash in that there were no messages in 
 the logs from it for 10min+. Nodetool compactionstats and nodetool netstats 
 showed no activity.
 * Restarting nodetool repair -pr fixed the problem and ran to completion.



[jira] [Commented] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files

2012-08-29 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444190#comment-13444190
 ] 

Per Otterström commented on CASSANDRA-4571:
---

To verify, we started from scratch: a new installation on 3 servers. And the FD 
leak is still there. So, with our particular setup, we are able to reproduce the 
bug.

These are the characteristics of our setup:
- We have one single CF.
- Rows are inserted in batches.
- Rows are read, updated and deleted in a random-like pattern.
- The FD leak seems to start during heavy read load (but can appear during mixed 
read/write/delete operations as well).
- We are using Hector to access this single CF.
- Cassandra configuration is basically standard.

The FD leak does not show immediately. It appears once there are ~60M rows in 
the CF.


 Strange permament socket descriptors increasing leads to Too many open files
 --

 Key: CASSANDRA-4571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.1, 1.1.2, 1.1.3
 Environment: CentOS 5.8 Linux 2.6.18-308.13.1.el5 #1 SMP Tue Aug 21 
 17:10:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux. 
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Serg Shnerson
Priority: Critical

 On the two-node cluster we found a strange increase in socket descriptors. lsof -n | grep java shows many rows like
 java   8380 cassandra  113r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  114r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  115r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  116r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  117r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  118r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  119r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  120r unix 0x8101a374a080 938348482 socket
 The number of these rows is constantly increasing. After about 24 hours this situation leads to the error.
 We use the PHPCassa client. Load is not so high (around ~50kB/s on write). 



[jira] [Commented] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files

2012-08-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444197#comment-13444197
 ] 

Jonathan Ellis commented on CASSANDRA-4571:
---

Are you sure you can't reproduce on a single-node cluster?

Because we're getting conflicting evidence here; on the one hand, strace 
indicates that the fd leakage is related to file i/o, but if so, you shouldn't 
need multiple nodes in the cluster to repro.

 Strange permament socket descriptors increasing leads to Too many open files
 --

 Key: CASSANDRA-4571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.1, 1.1.2, 1.1.3
 Environment: CentOS 5.8 Linux 2.6.18-308.13.1.el5 #1 SMP Tue Aug 21 
 17:10:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux. 
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Serg Shnerson
Priority: Critical

 On the two-node cluster we found a strange increase in socket descriptors. lsof -n | grep java shows many rows like
 java   8380 cassandra  113r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  114r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  115r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  116r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  117r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  118r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  119r unix 0x8101a374a080 938348482 socket
 java   8380 cassandra  120r unix 0x8101a374a080 938348482 socket
 The number of these rows is constantly increasing. After about 24 hours this situation leads to the error.
 We use the PHPCassa client. Load is not so high (around ~50kB/s on write). 



[jira] [Updated] (CASSANDRA-1123) Allow tracing query details

2012-08-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1123:
--

Attachment: 1123-v9.txt

 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2.0 beta 1

 Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch, 1123.patch, 
 1123.patch, 1123-v6.txt, 1123-v7.patch, 1123-v8.patch, 1123-v9.txt, 
 1123-v9.txt


 In the spirit of CASSANDRA-511, it would be useful to have tracing on queries to see where latency is coming from: how long did row cache lookup take?  key search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html



[jira] [Commented] (CASSANDRA-1123) Allow tracing query details

2012-08-29 Thread David Alves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444209#comment-13444209
 ] 

David Alves commented on CASSANDRA-1123:


+1, wfm

 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2.0 beta 1

 Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch, 1123.patch, 
 1123.patch, 1123-v6.txt, 1123-v7.patch, 1123-v8.patch, 1123-v9.txt, 
 1123-v9.txt


 In the spirit of CASSANDRA-511, it would be useful to have tracing on queries to see where latency is coming from: how long did row cache lookup take?  key search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html



[jira] [Resolved] (CASSANDRA-4583) Some nodes forget schema when 1 node fails

2012-08-29 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich resolved CASSANDRA-4583.


Resolution: Duplicate

This looks like it was caused by the same problem as CASSANDRA-4129 and 
timestamp problems related to nanoTime usage for schema, all of that is fixed 
in 1.1.4

 Some nodes forget schema when 1 node fails
 --

 Key: CASSANDRA-4583
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4583
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
 Environment: CentOS release 6.3 (Final)
Reporter: Edward Sargisson
 Attachments: cass-4583-2-system.log, cass-4583-5-system.log


 At present we do not have a complete reproduction for this defect, but I am 
 raising it as requested by Aaron Morton. We will update as we find out 
 more. If any additional logging or tests are requested, we will do them if we 
 can. 
 We have experienced 2 failures ascribed to this defect. On the cassandra user 
 mailing list Peter Schuller (2012-08-28) describes an additional failure.
 Reproduction steps as currently known:
 1. Set up a cluster with 6 nodes (call them #1 through #6).
 2. Have #5 fail completely. One failure was when the node was stopped to 
 replace the battery in the hard disk cache. The second failure was when the 
 hardware monitoring recorded a problem, CPU usage was increasing without 
 explanation and the server console was frozen so the machine was restarted.
 3. Bring #5 back
 Expected behaviour:
 * #5 should rejoin the ring.
 Actual behaviour (based on the incident we saw yesterday):
 * #5 didn't rejoin the ring.
 * We stopped all nodes and started them one by one.
 * Nodes #2, #4, #6 had forgotten most of their column families. They had the 
 keyspace but with only one column family instead of the usual 9 or so.
 * We ran nodetool resetlocalschema on #2, #4 and #6.
 * We ran nodetool repair -pr on #2, #4, #5 and #6
 * On #2 nodetool repair appeared to crash in that there were no messages in 
 the logs from it for 10min+. Nodetool compactionstats and nodetool netstats 
 showed no activity.
 * Restarting nodetool repair -pr fixed the problem and ran to completion.



[jira] [Comment Edited] (CASSANDRA-4583) Some nodes forget schema when 1 node fails

2012-08-29 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444210#comment-13444210
 ] 

Pavel Yaskevich edited comment on CASSANDRA-4583 at 8/30/12 4:09 AM:
-

This looks like it was caused by the same problem as CASSANDRA-4219 and 
timestamp problems related to nanoTime usage for schema, all of that is fixed 
in 1.1.4

  was (Author: xedin):
This looks like it was caused by the same problem as CASSANDRA-4129 and 
timestamp problems related to nanoTime usage for schema, all of that is fixed 
in 1.1.4
  



[jira] [Commented] (CASSANDRA-4583) Some nodes forget schema when 1 node fails

2012-08-29 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444224#comment-13444224
 ] 

Pavel Yaskevich commented on CASSANDRA-4583:


Additionally, there are CASSANDRA-4432 and CASSANDRA-4561 related to the 
timestamp problem.




[jira] [Commented] (CASSANDRA-4583) Some nodes forget schema when 1 node fails

2012-08-29 Thread Edward Sargisson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444257#comment-13444257
 ] 

Edward Sargisson commented on CASSANDRA-4583:
-

Hi Pavel,
Thanks for your quick reply - that seems reasonable until and unless we can 
show otherwise.
We'll schedule an upgrade to 1.1.4 and will report back if we see a recurrence 
afterwards.




[jira] [Created] (CASSANDRA-4584) Add CQL syntax to enable request tracing

2012-08-29 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-4584:
-

 Summary: Add CQL syntax to enable request tracing
 Key: CASSANDRA-4584
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4584
 Project: Cassandra
  Issue Type: Sub-task
Affects Versions: 1.2.0 beta 1
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
 Fix For: 1.2.0






[jira] [Created] (CASSANDRA-4585) Add cqlsh support for tracing results

2012-08-29 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-4585:
-

 Summary: Add cqlsh support for tracing results
 Key: CASSANDRA-4585
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4585
 Project: Cassandra
  Issue Type: Sub-task
  Components: Tools
Affects Versions: 1.2.0 beta 1
Reporter: Jonathan Ellis
Assignee: paul cannon
 Fix For: 1.2.0






[3/3] add request tracing patch by David Alves; reviewed by jbellis for CASSANDRA-1123

2012-08-29 Thread jbellis
http://git-wip-us.apache.org/repos/asf/cassandra/blob/5c94432b/src/java/org/apache/cassandra/tracing/Tracing.java
--
diff --git a/src/java/org/apache/cassandra/tracing/Tracing.java 
b/src/java/org/apache/cassandra/tracing/Tracing.java
new file mode 100644
index 000..7675d74
--- /dev/null
+++ b/src/java/org/apache/cassandra/tracing/Tracing.java
@@ -0,0 +1,256 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ *
+ */
+package org.apache.cassandra.tracing;
+
+import static com.google.common.base.Preconditions.checkState;
+import static org.apache.cassandra.utils.ByteBufferUtil.bytes;
+
+import java.net.InetAddress;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.UUID;
+
+import org.apache.cassandra.concurrent.Stage;
+import org.apache.cassandra.concurrent.StageManager;
+import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.cql3.ColumnNameBuilder;
+import org.apache.cassandra.db.ColumnFamily;
+import org.apache.cassandra.db.ExpiringColumn;
+import org.apache.cassandra.db.RowMutation;
+import org.apache.cassandra.db.marshal.InetAddressType;
+import org.apache.cassandra.db.marshal.LongType;
+import org.apache.cassandra.db.marshal.TimeUUIDType;
+import org.apache.cassandra.db.marshal.UTF8Type;
+import org.apache.cassandra.net.MessageIn;
+import org.apache.cassandra.service.StorageProxy;
+import org.apache.cassandra.thrift.ConsistencyLevel;
+import org.apache.cassandra.thrift.TimedOutException;
+import org.apache.cassandra.thrift.UnavailableException;
+import org.apache.cassandra.utils.ByteBufferUtil;
+import org.apache.cassandra.utils.FBUtilities;
+import org.apache.cassandra.utils.UUIDGen;
+import org.apache.cassandra.utils.WrappedRunnable;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A trace session context. Able to track and store trace sessions. A session is usually a user-initiated
+ * query, and may have multiple local and remote events before it is completed. All events and sessions
+ * are stored in a table.
+ */
+public class Tracing
+{
+public static final String TRACE_KS = "system_traces";
+public static final String EVENTS_CF = "events";
+public static final String SESSIONS_CF = "sessions";
+public static final String TRACE_HEADER = "TraceSession";
+
+private static final int TTL = 24 * 3600;
+
+private static Tracing instance = new Tracing();
+
+public static final Logger logger = LoggerFactory.getLogger(Tracing.class);
+
+/**
+ * Fetches and lazily initializes the trace context.
+ */
+public static Tracing instance()
+{
+return instance;
+}
+
+private InetAddress localAddress = FBUtilities.getLocalAddress();
+
+private final ThreadLocal<TraceState> state = new ThreadLocal<TraceState>();
+
+public static void addColumn(ColumnFamily cf, ByteBuffer name, Object 
value)
+{
+cf.addColumn(new ExpiringColumn(name, 
ByteBufferUtil.bytes(value.toString()), System.currentTimeMillis(), TTL));
+}
+
+public static void addColumn(ColumnFamily cf, ByteBuffer name, InetAddress 
address)
+{
+cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(address), 
System.currentTimeMillis(), TTL));
+}
+
+public static void addColumn(ColumnFamily cf, ByteBuffer name, int value)
+{
+cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(value), 
System.currentTimeMillis(), TTL));
+}
+
+public static void addColumn(ColumnFamily cf, ByteBuffer name, long value)
+{
+cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(value), 
System.currentTimeMillis(), TTL));
+}
+
+public static void addColumn(ColumnFamily cf, ByteBuffer name, String 
value)
+{
+cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(value), 
System.currentTimeMillis(), TTL));
+}
+
+private void addColumn(ColumnFamily cf, ByteBuffer name, ByteBuffer value)
+{
+cf.addColumn(new ExpiringColumn(name, value, 
System.currentTimeMillis(), TTL));
+}
+
+public void addParameterColumns(ColumnFamily cf, Map<String, String> rawPayload)
+{
+for 
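The addColumn helpers in the Tracing.java diff above wrap every trace cell in an ExpiringColumn with TTL = 24 * 3600 seconds, so trace data ages out after a day. A minimal standalone sketch of that expiry rule (the class and method names here are illustrative, not Cassandra's API):

```java
// Illustrative sketch of the TTL expiry rule used by the tracing tables:
// a cell written at time t with a TTL of n seconds is considered expired
// once now >= t + n. Names are hypothetical, not Cassandra's API.
public class TtlSketch {
    static final int TTL = 24 * 3600; // seconds, as in Tracing.java

    static boolean isExpired(long createdMillis, long nowMillis) {
        return nowMillis >= createdMillis + TTL * 1000L;
    }

    public static void main(String[] args) {
        long t = 0L;
        System.out.println(isExpired(t, 86_400_000L)); // exactly 24h later: true
        System.out.println(isExpired(t, 86_399_999L)); // just under 24h: false
    }
}
```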

[jira] [Commented] (CASSANDRA-4583) Some nodes forget schema when 1 node fails

2012-08-29 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444307#comment-13444307
 ] 

Pavel Yaskevich commented on CASSANDRA-4583:


Sounds good!




git commit: fix mispaste

2012-08-29 Thread jbellis
Updated Branches:
  refs/heads/trunk 5c94432b2 -> ad52ce4fa


fix mispaste


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ad52ce4f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ad52ce4f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ad52ce4f

Branch: refs/heads/trunk
Commit: ad52ce4fa303d2c63cbd9833b7245ab2cdff28b3
Parents: 5c94432
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Wed Aug 29 13:47:18 2012 -0500
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Wed Aug 29 13:47:18 2012 -0500

--
 src/java/org/apache/cassandra/tools/NodeCmd.java |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad52ce4f/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java 
b/src/java/org/apache/cassandra/tools/NodeCmd.java
index 8cb7dbe..554fba2 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -47,7 +47,7 @@ import org.apache.cassandra.thrift.InvalidRequestException;
 import org.apache.cassandra.utils.EstimatedHistogram;
 import org.apache.cassandra.utils.Pair;
 
-public class trace_next_queryNodeCmd
+public class NodeCmd
 {
 private static final Pair<String, String> SNAPSHOT_COLUMNFAMILY_OPT = new Pair<String, String>("cf", "column-family");
 private static final Pair<String, String> HOST_OPT = new Pair<String, String>("h", "host");
@@ -147,7 +147,7 @@ public class trace_next_queryNodeCmd
 // No args
 addCmdHelp(header, "ring", "Print information about the token ring");
 addCmdHelp(header, "join", "Join the ring");
-addCmdHelp(header, "igit nfo [-T/--tokens]", "Print node information (uptime, load, ...)");
+addCmdHelp(header, "info [-T/--tokens]", "Print node information (uptime, load, ...)");
 addCmdHelp(header, "status", "Print cluster information (state, load, IDs, ...)");
 addCmdHelp(header, "cfstats", "Print statistics on column families");
 addCmdHelp(header, "version", "Print cassandra version");



[jira] [Commented] (CASSANDRA-1123) Allow tracing query details

2012-08-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444338#comment-13444338
 ] 

Jonathan Ellis commented on CASSANDRA-1123:
---

Hmm, looks like this broke our test log4j config somehow.  ant test gives a lot 
of this:

{noformat}
[junit] ERROR 14:02:56,567 Fatal exception in thread 
Thread[MigrationStage:1,5,main]
[junit] java.lang.NullPointerException
[junit] at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1195)
[junit] at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1087)
[junit] at 
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1077)
[junit] at 
org.apache.cassandra.config.ColumnDefinition.readSchema(ColumnDefinition.java:247)
[junit] at 
org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1320)
[junit] at 
org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:293)
[junit] at 
org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:342)
[junit] at 
org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:255)
[junit] at 
org.apache.cassandra.service.MigrationManager$1.call(MigrationManager.java:202)
[junit] at 
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
[junit] at java.util.concurrent.FutureTask.run(FutureTask.java:138)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[junit] at java.lang.Thread.run(Thread.java:662)
{noformat}

where CFS:1195 is a logger.debug call.

ant test uses test/conf/log4j-server.properties, which just specifies a file 
and stdout at DEBUG.




[jira] [Reopened] (CASSANDRA-1123) Allow tracing query details

2012-08-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reopened CASSANDRA-1123:
---


Reverted pending tests fix.




[jira] [Updated] (CASSANDRA-1123) Allow tracing query details

2012-08-29 Thread David Alves (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Alves updated CASSANDRA-1123:
---

Attachment: 1123-v9.patch

Fixes the NPE. The problem was the result.getColumnCount call inside the logging 
statement (result may be null).
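The pattern behind this fix is worth spelling out: an argument expression passed to a logging call is evaluated eagerly, even when the log level is disabled, so a possibly-null value must be guarded before it is dereferenced. A minimal sketch of the hazard and the guard (all names here are hypothetical, not the actual patch):

```java
// Sketch of the NPE pattern described above: the argument to a log call is
// built eagerly, so result.getColumnCount() throws when result is null.
// Names are hypothetical; this is not the actual Cassandra patch.
public class NullGuardSketch {
    static class Row {
        int getColumnCount() { return 3; }
    }

    // Unsafe: dereferences result unconditionally while building the message.
    static String unsafeMessage(Row result) {
        return "read " + result.getColumnCount() + " columns";
    }

    // Safe: check for null before dereferencing inside the message.
    static String safeMessage(Row result) {
        return "read " + (result == null ? 0 : result.getColumnCount()) + " columns";
    }

    public static void main(String[] args) {
        System.out.println(safeMessage(null));      // read 0 columns
        System.out.println(safeMessage(new Row())); // read 3 columns
        try {
            unsafeMessage(null);
        } catch (NullPointerException e) {
            System.out.println("unsafe form throws NPE");
        }
    }
}
```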

 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2.0 beta 1

 Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch, 1123.patch, 
 1123.patch, 1123-v6.txt, 1123-v7.patch, 1123-v8.patch, 1123-v9.patch, 
 1123-v9.txt, 1123-v9.txt





[jira] [Commented] (CASSANDRA-4567) Error in log related to Murmur3Partitioner

2012-08-29 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1304#comment-1304
 ] 

Pavel Yaskevich commented on CASSANDRA-4567:


+1

 Error in log related to Murmur3Partitioner
 --

 Key: CASSANDRA-4567
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4567
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.2.0 beta 1
 Environment: Using ccm on ubuntu
Reporter: Tyler Patterson
Assignee: Vijay
 Fix For: 1.2.0 beta 1

 Attachments: 0001-CASSANDRA-4567.patch, 0001-CASSANDRA-4567-v2.patch, 
 0001-CASSANDRA-4567-v3.patch


 Start a 2-node cluster on cassandra-1.1. Bring down one node, upgrade it to 
 trunk, start it back up. The following error shows up in the log:
 {code}
 ...
  INFO [main] 2012-08-22 10:44:40,012 CacheService.java (line 170) Scheduling 
 row cache save to each 0 seconds (going to save all keys).
  INFO [SSTableBatchOpen:1] 2012-08-22 10:44:40,106 SSTableReader.java (line 
 164) Opening 
 /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-2
  (148 bytes)
  INFO [SSTableBatchOpen:2] 2012-08-22 10:44:40,106 SSTableReader.java (line 
 164) Opening 
 /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-1
  (226 bytes)
  INFO [SSTableBatchOpen:3] 2012-08-22 10:44:40,106 SSTableReader.java (line 
 164) Opening 
 /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-3
  (89 bytes)
 ERROR [SSTableBatchOpen:3] 2012-08-22 10:44:40,114 CassandraDaemon.java (line 
 131) Exception in thread Thread[SSTableBatchOpen:3,5,main]
 java.lang.RuntimeException: Cannot open 
 /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-3
  because partitioner does not match 
 org.apache.cassandra.dht.Murmur3Partitioner
 at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:175)
 at 
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149)
 at 
 org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:236)
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
ERROR [SSTableBatchOpen:2] 2012-08-22 10:44:40,114 CassandraDaemon.java (line 131) Exception in thread Thread[SSTableBatchOpen:2,5,main]
java.lang.RuntimeException: Cannot open /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-1 because partitioner does not match org.apache.cassandra.dht.Murmur3Partitioner
 at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:175)
 at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149)
 at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:236)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
ERROR [SSTableBatchOpen:1] 2012-08-22 10:44:40,114 CassandraDaemon.java (line 131) Exception in thread Thread[SSTableBatchOpen:1,5,main]
java.lang.RuntimeException: Cannot open /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-2 because partitioner does not match org.apache.cassandra.dht.Murmur3Partitioner
 at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:175)
 at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149)
 at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:236)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
  INFO 

[jira] [Commented] (CASSANDRA-4009) Increase usage of Metrics and flesh out o.a.c.metrics

2012-08-29 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1366#comment-1366
 ] 

Brandon Williams commented on CASSANDRA-4009:
-

+1

 Increase usage of Metrics and flesh out o.a.c.metrics
 -

 Key: CASSANDRA-4009
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4009
 Project: Cassandra
  Issue Type: Improvement
Reporter: Brandon Williams
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.2.0

 Attachments: 4009.txt, 4009.txt, 4009-v2.txt


 With CASSANDRA-3671 we have begun using the Metrics packages to expose stats 
 in a new JMX structure, intended to be more user-friendly (for example, you 
 don't need to know what a StorageProxy is or does.)  This ticket serves as a 
 parent for subtasks to finish fleshing out the rest of the enhanced metrics.
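The user-friendly JMX structure described above groups metrics under a dedicated domain rather than under internal class names. A minimal sketch of composing such a name with the standard javax.management API; the domain and key properties below are illustrative assumptions, not taken from the patches:

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

public class MetricNameSketch {
    // Compose a JMX ObjectName in the user-friendly layout the ticket
    // describes: a metrics domain plus type/name keys, with no internal
    // component (e.g. StorageProxy) leaking into the name.
    static ObjectName metricName(String type, String name) {
        try {
            return new ObjectName("org.apache.cassandra.metrics:type=" + type
                                  + ",name=" + name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // A reader of this name needs no knowledge of which class produced it.
        System.out.println(metricName("ClientRequest", "ReadLatency"));
    }
}
```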

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


git commit: fix NPE patch by David Alves for CASSANDRA-1123

2012-08-29 Thread jbellis
Updated Branches:
  refs/heads/trunk ad52ce4fa -> 5b6a2b11b


fix NPE
patch by David Alves for CASSANDRA-1123


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5b6a2b11
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5b6a2b11
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5b6a2b11

Branch: refs/heads/trunk
Commit: 5b6a2b11bc8a9499ac012d745869e3d814cc91ad
Parents: ad52ce4
Author: Jonathan Ellis jbel...@apache.org
Authored: Wed Aug 29 18:09:57 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Wed Aug 29 18:10:05 2012 -0500

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5b6a2b11/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java 
b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 8ef686b..ef0e55d 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1192,7 +1192,7 @@ public class ColumnFamilyStore implements 
ColumnFamilyStoreMBean
 readStats.addNano(System.nanoTime() - start);
 }
 
-logger.debug("Read {} columns", result.getColumnCount());
+logger.debug("Read {} columns", result == null ? 0 : result.getColumnCount());
 return result;
 }
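The one-line fix above is the classic null-guard for a log statement. The same defensive pattern, reduced to a self-contained sketch; the Result type here is a hypothetical stand-in for the real read result, not Cassandra's class:

```java
public class NullSafeCount {
    // Hypothetical stand-in for the object returned by a read.
    static final class Result {
        final int columnCount;
        Result(int columnCount) { this.columnCount = columnCount; }
        int getColumnCount() { return columnCount; }
    }

    // Mirrors the patched debug call: evaluate the count only when the
    // result is present, reporting 0 instead of throwing an NPE.
    static int safeColumnCount(Result result) {
        return result == null ? 0 : result.getColumnCount();
    }

    public static void main(String[] args) {
        System.out.println("Read " + safeColumnCount(null) + " columns");
        System.out.println("Read " + safeColumnCount(new Result(3)) + " columns");
    }
}
```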
 



[jira] [Commented] (CASSANDRA-4571) Strange permanent socket descriptors increasing leads to Too many open files

2012-08-29 Thread Serg Shnerson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444558#comment-13444558
 ] 

Serg Shnerson commented on CASSANDRA-4571:
--

bq.Are you sure you can't reproduce on a single-node cluster?

My mistake. The bug was also reproduced with a one-node cluster.

 Strange permanent socket descriptors increasing leads to Too many open files
 --

 Key: CASSANDRA-4571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.1, 1.1.2, 1.1.3
 Environment: CentOS 5.8 Linux 2.6.18-308.13.1.el5 #1 SMP Tue Aug 21 
 17:10:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux. 
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Serg Shnerson
Priority: Critical

 On a two-node cluster we found a strange increase in open socket 
 descriptors. lsof -n | grep java shows many rows like
 java   8380 cassandra  113r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  114r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  115r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  116r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  117r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  118r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  119r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  120r unix 0x8101a374a080
 938348482 socket
  The number of these rows increases constantly, and after about 24 hours 
 this leads to the error. We use the PHPCassa client. Load is not high 
 (around ~50 KB/s on writes). 
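One way to watch for the descriptor growth described above, without parsing lsof output, is to count the entries under /proc/&lt;pid&gt;/fd. A small Java sketch; the /proc layout is a Linux-specific assumption, unrelated to Cassandra itself:

```java
import java.io.File;

public class FdCountSketch {
    // Count open file descriptors for a PID via the Linux /proc filesystem.
    // Returns -1 when /proc is unavailable (non-Linux) or the PID is gone.
    static int openFdCount(String pid) {
        String[] entries = new File("/proc/" + pid + "/fd").list();
        return entries == null ? -1 : entries.length;
    }

    public static void main(String[] args) {
        // "self" resolves to the current JVM; pass the Cassandra PID instead
        // to track the leak (e.g. sample this in a loop and watch the trend).
        System.out.println("open fds: " + openFdCount("self"));
    }
}
```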



[jira] [Comment Edited] (CASSANDRA-4571) Strange permanent socket descriptors increasing leads to Too many open files

2012-08-29 Thread Serg Shnerson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444558#comment-13444558
 ] 

Serg Shnerson edited comment on CASSANDRA-4571 at 8/30/12 11:08 AM:


bq.Are you sure you can't reproduce on a single-node cluster?

My mistake. I've checked it again. The bug was also reproduced with a 
one-node cluster.

  was (Author: sergshne):
bq.Are you sure you can't reproduce on a single-node cluster?

My mistake. Bug also was reproduced with one-node cluster.
  
 Strange permanent socket descriptors increasing leads to Too many open files
 --

 Key: CASSANDRA-4571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4571
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.1, 1.1.2, 1.1.3
 Environment: CentOS 5.8 Linux 2.6.18-308.13.1.el5 #1 SMP Tue Aug 21 
 17:10:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux. 
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Serg Shnerson
Priority: Critical

 On a two-node cluster we found a strange increase in open socket 
 descriptors. lsof -n | grep java shows many rows like
 java   8380 cassandra  113r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  114r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  115r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  116r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  117r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  118r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  119r unix 0x8101a374a080
 938348482 socket
 java   8380 cassandra  120r unix 0x8101a374a080
 938348482 socket
  The number of these rows increases constantly, and after about 24 hours 
 this leads to the error. We use the PHPCassa client. Load is not high 
 (around ~50 KB/s on writes). 



git commit: log related to Murmur3Partitioner patch by vijay; reviewed by Pavel Yaskevich for CASSANDRA-4282

2012-08-29 Thread vijay
Updated Branches:
  refs/heads/trunk 5b6a2b11b -> 0525ae25f


log related to Murmur3Partitioner
patch by vijay; reviewed by Pavel Yaskevich for CASSANDRA-4282


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0525ae25
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0525ae25
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0525ae25

Branch: refs/heads/trunk
Commit: 0525ae25f82ea132727b395da973b06fd1733011
Parents: 5b6a2b1
Author: Vijay Parthasarathy vijay2...@gmail.com
Authored: Wed Aug 29 18:02:57 2012 -0700
Committer: Vijay Parthasarathy vijay2...@gmail.com
Committed: Wed Aug 29 18:02:57 2012 -0700

--
 .../cassandra/config/DatabaseDescriptor.java   |7 +
 .../org/apache/cassandra/gms/GossipDigestSyn.java  |   18 --
 .../cassandra/gms/GossipDigestSynVerbHandler.java  |6 +
 src/java/org/apache/cassandra/gms/Gossiper.java|4 ++-
 .../apache/cassandra/io/sstable/SSTableReader.java |9 +--
 test/data/serialization/1.2/gms.Gossip.bin |  Bin 109 -> 158 bytes
 .../apache/cassandra/gms/SerializationsTest.java   |2 +-
 7 files changed, 38 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/0525ae25/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 2e22e95..7533214 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -71,6 +71,7 @@ public class DatabaseDescriptor
 
 /* Hashing strategy Random or OPHF */
 private static IPartitioner<?> partitioner;
+private static String paritionerName;
 
 private static Config.DiskAccessMode indexAccessMode;
 
@@ -224,6 +225,7 @@ public class DatabaseDescriptor
 {
 throw new ConfigurationException("Invalid partitioner class " + conf.partitioner);
 }
+paritionerName = partitioner.getClass().getCanonicalName();
 
 /* phi convict threshold for FailureDetector */
 if (conf.phi_convict_threshold < 5 || conf.phi_convict_threshold > 16)
@@ -642,6 +644,11 @@ public class DatabaseDescriptor
 return partitioner;
 }
 
+public static String getPartitionerName()
+{
+return paritionerName;
+}
+
 /* For tests ONLY, don't use otherwise or all hell will break loose */
 public static void setPartitioner(IPartitioner<?> newPartitioner)
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0525ae25/src/java/org/apache/cassandra/gms/GossipDigestSyn.java
--
diff --git a/src/java/org/apache/cassandra/gms/GossipDigestSyn.java 
b/src/java/org/apache/cassandra/gms/GossipDigestSyn.java
index 8ce2257..24979f1 100644
--- a/src/java/org/apache/cassandra/gms/GossipDigestSyn.java
+++ b/src/java/org/apache/cassandra/gms/GossipDigestSyn.java
@@ -23,6 +23,7 @@ import java.util.List;
 
 import org.apache.cassandra.db.TypeSizes;
 import org.apache.cassandra.io.IVersionedSerializer;
+import org.apache.cassandra.net.MessagingService;
 
 /**
  * This is the first message that gets sent out as a start of the Gossip 
protocol in a
@@ -33,11 +34,13 @@ public class GossipDigestSyn
 public static final IVersionedSerializer<GossipDigestSyn> serializer = new GossipDigestSynSerializer();
 
 final String clusterId;
+final String partioner;
 final List<GossipDigest> gDigests;
 
-public GossipDigestSyn(String clusterId, List<GossipDigest> gDigests)
+public GossipDigestSyn(String clusterId, String partioner, List<GossipDigest> gDigests)
 {
 this.clusterId = clusterId;
+this.partioner = partioner;
 this.gDigests = gDigests;
 }
 
@@ -79,19 +82,28 @@ class GossipDigestSynSerializer implements IVersionedSerializer<GossipDigestSyn>
 public void serialize(GossipDigestSyn gDigestSynMessage, DataOutput dos, 
int version) throws IOException
 {
 dos.writeUTF(gDigestSynMessage.clusterId);
+if (version >= MessagingService.VERSION_12)
+dos.writeUTF(gDigestSynMessage.partioner);
 GossipDigestSerializationHelper.serialize(gDigestSynMessage.gDigests, 
dos, version);
 }
 
 public GossipDigestSyn deserialize(DataInput dis, int version) throws 
IOException
 {
 String clusterId = dis.readUTF();
+String partioner = null;
+if (version >= MessagingService.VERSION_12)
+partioner = dis.readUTF();
 List<GossipDigest> gDigests = GossipDigestSerializationHelper.deserialize(dis, 
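The hunk above (cut off by the archive) gates the new partitioner field on the messaging version, so older peers never receive bytes they cannot parse. The same version-gated-field idiom, sketched with plain java.io streams; the VERSION_12 constant below is a made-up value for illustration, not MessagingService's:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class VersionedFieldSketch {
    static final int VERSION_12 = 6; // stand-in for MessagingService.VERSION_12

    // Always write clusterId; write the partitioner only for new-enough peers.
    static byte[] serialize(String clusterId, String partitioner, int version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream dos = new DataOutputStream(bytes);
            dos.writeUTF(clusterId);
            if (version >= VERSION_12)
                dos.writeUTF(partitioner);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e); // unreachable for in-memory streams
        }
    }

    // Read back symmetrically: an old peer simply never sent the field.
    static String[] deserialize(byte[] in, int version) {
        try {
            DataInputStream dis = new DataInputStream(new ByteArrayInputStream(in));
            String clusterId = dis.readUTF();
            String partitioner = version >= VERSION_12 ? dis.readUTF() : null;
            return new String[] { clusterId, partitioner };
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] b = serialize("test-cluster", "Murmur3Partitioner", VERSION_12);
        String[] fields = deserialize(b, VERSION_12);
        System.out.println(fields[0] + " / " + fields[1]);
    }
}
```

Both sides must agree on the version for the framing to stay aligned, which is why the gossip handshake carries the version out of band.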

[jira] [Commented] (CASSANDRA-3979) Consider providing error code with exceptions (and documenting them)

2012-08-29 Thread paul cannon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444605#comment-13444605
 ] 

paul cannon commented on CASSANDRA-3979:


+1 on this monster. This will be fantastic for clients, and should encourage 
use of the binary protocol. I did have to make a really small change to the 
'stress' tool code to get a full successful build, as shown here: 
https://github.com/thepaul/cassandra/commit/7b1f71f6 , but that's a triviality.

The only other thing is that it would have been nice to include the extra 
information for Thrift clients too, even if it's just rendered into the error 
string. But maybe that would break super-fragile clients that depend on exact 
error messages?

 Consider providing error code with exceptions (and documenting them)
 

 Key: CASSANDRA-3979
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3979
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
  Labels: cql3
 Fix For: 1.2.0 beta 1


 It could be a good idea to assign documented error codes for the different 
 exceptions raised. Currently, one may have to parse the exception string (say, 
 if one wants to know whether a 'create keyspace' failed because the keyspace 
 already exists versus some other kind of exception), but it means we cannot 
 improve the error message without the risk of breaking client code. Adding 
 documented error codes with the message would avoid this.
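A documented-error-code scheme like the one proposed can be as small as a stable integer per failure class that clients switch on instead of parsing message text. A hypothetical sketch; the codes and names below are invented for illustration, not the ones the ticket adopted:

```java
import java.util.HashMap;
import java.util.Map;

public class ErrorCodeSketch {
    // Hypothetical stable codes. Clients branch on the code, never on the
    // human-readable message, so the message text stays free to improve.
    enum ErrorCode {
        SYNTAX_ERROR(0x2000),
        KEYSPACE_ALREADY_EXISTS(0x2400),
        UNAVAILABLE(0x1000);

        final int code;
        ErrorCode(int code) { this.code = code; }
    }

    private static final Map<Integer, ErrorCode> BY_CODE =
            new HashMap<Integer, ErrorCode>();
    static {
        for (ErrorCode e : ErrorCode.values())
            BY_CODE.put(e.code, e);
    }

    // Map a wire-level code back to a typed constant; null if unknown.
    static ErrorCode fromCode(int code) {
        return BY_CODE.get(code);
    }

    public static void main(String[] args) {
        // A client can now distinguish "keyspace exists" without text parsing.
        System.out.println(fromCode(0x2400));
    }
}
```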



[jira] [Commented] (CASSANDRA-4498) Remove openjdk-6-jre Cassandra APT dependencies

2012-08-29 Thread paul cannon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444606#comment-13444606
 ] 

paul cannon commented on CASSANDRA-4498:


If there are no other problems with installing openjdk-6-jre on the side, then 
definitely +1 for the status quo.

 Remove openjdk-6-jre Cassandra APT dependencies
 ---

 Key: CASSANDRA-4498
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4498
 Project: Cassandra
  Issue Type: Improvement
Reporter: Terrance Shepherd
Assignee: Brandon Williams
Priority: Minor
  Labels: debian
 Fix For: 1.2.0 beta 1

 Attachments: apache_cassandra_Packages.diff


 As is well known, the recommended JRE for Cassandra is Sun Java 1.6, but at 
 this point that package is no longer in the Debian or Ubuntu APT repos. To 
 run Cassandra with the Sun Java 1.6 JRE, it must be installed manually, 
 outside the repos. Because of this, installing Cassandra via the Apache or 
 DataStax APT repos also installs openjdk-6-jre, even though the Sun Java 1.6 
 JRE is already present.
 I would suggest that the Java APT dependencies be removed from the Depends 
 field in the package configuration and moved to either the Recommends or 
 Suggests field, so that openjdk is not downloaded when unnecessary, where it 
 could interfere with a pre-installed JRE.



[jira] [Commented] (CASSANDRA-3979) Consider providing error code with exceptions (and documenting them)

2012-08-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444622#comment-13444622
 ] 

Jonathan Ellis commented on CASSANDRA-3979:
---

bq. maybe that would break super-fragile clients that depend on exact error 
messages

That seems like a reasonable risk to take.

 Consider providing error code with exceptions (and documenting them)
 

 Key: CASSANDRA-3979
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3979
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
  Labels: cql3
 Fix For: 1.2.0 beta 1


 It could be a good idea to assign documented error codes for the different 
 exceptions raised. Currently, one may have to parse the exception string (say, 
 if one wants to know whether a 'create keyspace' failed because the keyspace 
 already exists versus some other kind of exception), but it means we cannot 
 improve the error message without the risk of breaking client code. Adding 
 documented error codes with the message would avoid this.



[jira] [Commented] (CASSANDRA-4480) Binary protocol: adds events push

2012-08-29 Thread paul cannon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444642#comment-13444642
 ] 

paul cannon commented on CASSANDRA-4480:


I don't think I like the split of 'control' and 'data' modes. Why can't the 
client library relegate the REGISTER/EVENT messages to a single connection, if 
the user wants it that way? This seems like it adds needless complexity on both 
sides.

In the very worst case, if we remove the restriction, a few dozen connections 
get an extra ~40-byte message once per (rare) topology/node-status change when 
only one would have sufficed. Is that really that much of a problem?

 Binary protocol: adds events push 
 --

 Key: CASSANDRA-4480
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4480
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 1.2.0

 Attachments: 4480.txt


 Clients need to know about a number of cluster changes (new/removed nodes 
 typically) to function properly. With the binary protocol we could start 
 pushing such events to the clients directly.
 The basic idea would be that a client registers for a number of events 
 and then receives notifications when those happen. I can see at least 
 the following events being useful to clients:
 * Addition and removal of nodes
 * Schema changes (otherwise clients would have to pull the schema all the 
 time to know that, say, a new column has been added)
 * Node up/down events (down events might not be too useful, but up events 
 could be helpful).
 The main problem I can see with that is that we want to make it clear that 
 clients are supposed to register for events on only one or two of their 
 connections (total, not per-host), otherwise it'll be just flooding. One 
 solution to make it much less likely that this happens could be to 
 distinguish two kinds of connections: Data and Control (it could be just a 
 simple flag in the startup message, for instance). Data connections would 
 not allow registering for events, and Control ones would allow it but 
 wouldn't allow queries. I.e. clients would have to dedicate a connection to 
 those events, but that's likely the only sane way to do it anyway.
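The register-then-push flow the description outlines, reduced to an in-process sketch; the event names and the listener shape here are illustrative, not the binary protocol's wire format:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EventPushSketch {
    // Event kinds from the ticket: topology, schema, and node-status changes.
    enum EventType { TOPOLOGY_CHANGE, SCHEMA_CHANGE, STATUS_CHANGE }

    // Stand-in for a registered client connection.
    interface Listener { void onEvent(EventType type, String detail); }

    // Server side: only connections that REGISTERed for a type get pushes,
    // so an unregistered (Data) connection never sees event traffic.
    private final Map<EventType, List<Listener>> registered =
            new HashMap<EventType, List<Listener>>();

    void register(EventType type, Listener l) {
        List<Listener> ls = registered.get(type);
        if (ls == null) {
            ls = new ArrayList<Listener>();
            registered.put(type, ls);
        }
        ls.add(l);
    }

    void push(EventType type, String detail) {
        List<Listener> ls = registered.get(type);
        if (ls == null) return; // nobody registered: the event is dropped
        for (Listener l : ls)
            l.onEvent(type, detail);
    }

    public static void main(String[] args) {
        EventPushSketch server = new EventPushSketch();
        server.register(EventType.TOPOLOGY_CHANGE, new Listener() {
            public void onEvent(EventType type, String detail) {
                System.out.println("pushed: " + type + " " + detail);
            }
        });
        server.push(EventType.TOPOLOGY_CHANGE, "NEW_NODE 10.0.0.2");
        server.push(EventType.STATUS_CHANGE, "UP 10.0.0.3"); // no listener
    }
}
```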



[jira] [Commented] (CASSANDRA-4292) Improve JBOD loadbalancing and reduce contention

2012-08-29 Thread Dave Brosius (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444641#comment-13444641
 ] 

Dave Brosius commented on CASSANDRA-4292:
-

+1 patch LGTM

 Improve JBOD loadbalancing and reduce contention
 

 Key: CASSANDRA-4292
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-Fix-writing-sstables-to-wrong-directory-when-compact.patch, 4292.txt, 
 4292-v2.txt, 4292-v3.txt, 4292-v4.txt


 As noted in CASSANDRA-809, we have a certain number of flush (and compaction) 
 threads, which mix and match disk volumes indiscriminately.  It may be worth 
 creating a tight thread-to-disk affinity, to prevent unnecessary conflict at 
 that level.
 OTOH, as SSDs become more prevalent this becomes a non-issue.  Unclear how 
 much pain this actually causes in practice in the meantime.



[jira] [Created] (CASSANDRA-4586) composite indexes do a linear search on all SecondaryIndex objects for any update

2012-08-29 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-4586:
-

 Summary: composite indexes do a linear search on all 
SecondaryIndex objects for any update
 Key: CASSANDRA-4586
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4586
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne


not much point in having a Map if we can't use it.



[jira] [Commented] (CASSANDRA-4586) composite indexes do a linear search on all SecondaryIndex objects for any update

2012-08-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444665#comment-13444665
 ] 

Jonathan Ellis commented on CASSANDRA-4586:
---

Seems like this would be more straightforward if we pulled out the cql3 name 
from the composite first; then the IndexManager could do a Map lookup again.
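The suggested fix, pulling the cql3 column name out of the composite first and then doing a map lookup, in toy form; the composite encoding here is a plain delimiter, standing in for Cassandra's length-prefixed composite type:

```java
import java.util.HashMap;
import java.util.Map;

public class IndexLookupSketch {
    // Indexes keyed by the bare cql3 column name (illustrative contents).
    private static final Map<String, String> INDEXES =
            new HashMap<String, String>();
    static {
        INDEXES.put("email", "users_email_idx");
    }

    // Toy composite: "clusteringKey:cql3Name". Real composites are
    // length-prefixed ByteBuffers; the delimiter stands in for that here.
    static String cql3Name(String composite) {
        return composite.substring(composite.lastIndexOf(':') + 1);
    }

    // O(1) map lookup instead of a linear scan over every SecondaryIndex.
    static String indexFor(String composite) {
        return INDEXES.get(cql3Name(composite));
    }

    public static void main(String[] args) {
        System.out.println(indexFor("row1:email"));
    }
}
```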

 composite indexes do a linear search on all SecondaryIndex objects for any 
 update
 -

 Key: CASSANDRA-4586
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4586
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne

 not much point in having a Map if we can't use it.



[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write

2012-08-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2897:
--

Attachment: 2897-v4.txt

v4 pushes *all* index updates into the helper closure, renamed to 
SecondaryIndexManager.Updater.  This cleans up Table.apply even more (no more 
looping to create a redundant Map of updated columns), and allows index 
maintenance during compaction relatively cleanly -- this is added for the first 
time here.

I note, for the record, that composite indexes make my head hurt 
(CASSANDRA-4586).

I further note that finding the wrong column value being used to create 
dummyColumn in the index-stale block was a *bitch*.  Not sure how your new 
tests passed with that.  Two bugs cancelling out, I guess.  (Similarly, 
dummyColumn needed to be introduced in KeysSearcher since just using the index 
column is wrong even for non-composites, since delete expects a base-data 
column.)

I await news of the new bugs I've introduced. :)

 Secondary indexes without read-before-write
 ---

 Key: CASSANDRA-2897
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sam Tunnicliffe
Priority: Minor
  Labels: secondary_index
 Fix For: 1.2.0 beta 1

 Attachments: 
 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 
 0003-CASSANDRA-2897.txt, 2897-apply-cleanup.txt, 2897-v4.txt, 41ec9fc-2897.txt


 Currently, secondary index updates require a read-before-write to maintain 
 the index consistency. Keeping the index consistent at all times is not 
 necessary, however. We could let the (secondary) index get inconsistent on 
 writes and repair it on reads. This would be easy because on reads we make 
 sure to request the indexed columns anyway, so we can just skip the rows 
 that are not needed and repair the index at the same time.
 This does trade work on writes for work on reads. However, read-before-write 
 is sufficiently costly that it will likely be a win overall.
 There are (at least) two small technical difficulties here though:
 # If we repair on read, this will be racy with writes, so we'll probably have 
 to synchronize there.
 # We probably shouldn't rely only on reads to repair, and we should also have 
 a task to repair the index for things that are rarely read. It's unclear how 
 to make that low impact though.
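The repair-on-read idea, letting the index go stale on writes and pruning stale entries whenever a read consults them, as a toy in-memory sketch; real Cassandra indexes are column families, not maps:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class RepairOnReadSketch {
    // Base data: rowKey -> indexed column value.
    private final Map<String, String> base = new HashMap<String, String>();
    // Secondary index: value -> row keys; allowed to go stale on writes.
    private final Map<String, List<String>> index =
            new HashMap<String, List<String>>();

    // No read-before-write: the old index entry for an overwritten
    // value is left behind on purpose.
    void write(String rowKey, String value) {
        base.put(rowKey, value);
        List<String> rows = index.get(value);
        if (rows == null) {
            rows = new ArrayList<String>();
            index.put(value, rows);
        }
        rows.add(rowKey);
    }

    // Read path: verify each candidate against base data and prune
    // stale hits, repairing the index as a side effect of the read.
    List<String> query(String value) {
        List<String> result = new ArrayList<String>();
        List<String> rows = index.get(value);
        if (rows == null) return result;
        for (Iterator<String> it = rows.iterator(); it.hasNext(); ) {
            String rowKey = it.next();
            if (value.equals(base.get(rowKey)))
                result.add(rowKey);
            else
                it.remove(); // stale entry: repair on read
        }
        return result;
    }

    public static void main(String[] args) {
        RepairOnReadSketch table = new RepairOnReadSketch();
        table.write("r1", "old");
        table.write("r1", "new"); // overwrite: "old" entry is now stale
        System.out.println(table.query("old")); // stale entry pruned here
        System.out.println(table.query("new"));
    }
}
```

As the ticket notes, a real implementation also needs synchronization against concurrent writes and a background task for rarely read entries.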



[jira] [Resolved] (CASSANDRA-1123) Allow tracing query details

2012-08-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-1123.
---

Resolution: Fixed

committed.  (turns out I didn't push the revert earlier, so I just left that 
out when I did push.)

 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2.0 beta 1

 Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch, 1123.patch, 
 1123.patch, 1123-v6.txt, 1123-v7.patch, 1123-v8.patch, 1123-v9.patch, 
 1123-v9.txt, 1123-v9.txt


 In the spirit of CASSANDRA-511, it would be useful to have tracing on queries 
 to see where latency is coming from: how long did the row cache lookup take?  
 the key search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html



[jira] [Commented] (CASSANDRA-4586) composite indexes do a linear search on all SecondaryIndex objects for any update

2012-08-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444706#comment-13444706
 ] 

Jonathan Ellis commented on CASSANDRA-4586:
---

yes, composites caused a world of pain for CASSANDRA-2897. :)

 composite indexes do a linear search on all SecondaryIndex objects for any 
 update
 -

 Key: CASSANDRA-4586
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4586
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne

 not much point in having a Map if we can't use it.



[jira] [Commented] (CASSANDRA-4586) composite indexes do a linear search on all SecondaryIndex objects for any update

2012-08-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444705#comment-13444705
 ] 

Jonathan Ellis commented on CASSANDRA-4586:
---

also on my hit list: having to be super careful to call makeIndexColumnName in 
the right places, with no type safety since it's BB in BB out.

 composite indexes do a linear search on all SecondaryIndex objects for any 
 update
 -

 Key: CASSANDRA-4586
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4586
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne

 not much point in having a Map if we can't use it.
