[Cassandra Wiki] Trivial Update of Counters by Alexis Wilke
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The Counters page has been changed by Alexis Wilke:
http://wiki.apache.org/cassandra/Counters?action=diff&rev1=16&rev2=17

Comment:
Fixed a couple of plural forms, and replaced insert with add which is much more appropriate for a counter!

  {{{
  [default@test] create column family counterCF with default_validation_class=CounterColumnType and replicate_on_write=true;
  }}}
- Setting the `default_validation_class` to `CounterColumnType` indicates that the column will be counters. Setting `replicate_on_write=true` will be optional starting in 0.8.2, but a bug made it default to false in 0.8.0 and 0.8.1, which is unsafe.
+ Setting the `default_validation_class` to `CounterColumnType` indicates that the columns will be counters. Setting `replicate_on_write=true` will be optional starting in 0.8.2, but a bug made it default to false in 0.8.0 and 0.8.1, which is unsafe.

  Incrementing and accessing counters

@@ -86, +86 @@

  == Technical limitations ==
  * If a write fails unexpectedly (timeout or loss of connection to the coordinator node) the client will not know if the operation has been performed. A retry can result in an over count [[https://issues.apache.org/jira/browse/CASSANDRA-2495|CASSANDRA-2495]].
- * Counter removal is intrinsically limited. For instance, if you issue very quickly the sequence increment, remove, increment it is possible for the removal to be lost (if for some reason the remove happens to be the last received messages). Hence, removal of counters is provided for definitive removal only, that is when the deleted counter is not increment afterwards. This holds for row deletion too: if you delete a row of counters, incrementing any counter in that row (that existed before the deletion) will result in an undetermined behavior. Note that if you need to reset a counter, one option (that is unfortunately not concurrent safe) could be to read its ''value'' and insert ''-value''.
+ * Counter removal is intrinsically limited. For instance, if you issue very quickly the sequence increment, remove, increment it is possible for the removal to be lost (if for some reason the remove happens to be the last received messages). Hence, removal of counters is provided for definitive removal only, that is when the deleted counter is not increment afterwards. This holds for row deletion too: if you delete a row of counters, incrementing any counter in that row (that existed before the deletion) will result in an undetermined behavior. Note that if you need to reset a counter, one option (that is unfortunately not concurrent safe) could be to read its ''value'' and add ''-value''.
  * `CounterColumnType` may only be set in the `default_validation_class`. A column family either contains only counters, or no counters at all.

  == Further reading ==
- See [[https://issues.apache.org/jira/browse/CASSANDRA-1072|CASSANDRA-1072]] and especially the [[https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf|design doc]] for further information about how this works internally (but note that some of the limitation fixed in these technical documents have been fixed since then, for instance all consistency level '''are''' supported, for both reads and writes).
+ See [[https://issues.apache.org/jira/browse/CASSANDRA-1072|CASSANDRA-1072]] and especially the [[https://issues.apache.org/jira/secure/attachment/12459754/Partitionedcountersdesigndoc.pdf|design doc]] for further information about how this works internally (but note that some of the limitations fixed in these technical documents have been fixed since then, for instance all consistency level '''are''' supported, for both reads and writes).
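
The reset option mentioned in the wiki text (read the ''value'', then add ''-value'') and the reason it is not concurrent safe can be illustrated with a small simulation. This is only a sketch: `Counter`, `reset` and the `interleaved_add` parameter are hypothetical names for a toy in-memory counter, not a Cassandra client API, and the concurrent writer is simulated deterministically rather than with real threads.

```python
class Counter:
    """Toy in-memory stand-in for one counter column (hypothetical, not a Cassandra API)."""

    def __init__(self):
        self.value = 0

    def add(self, delta):
        # Counters only support adding a (possibly negative) delta.
        self.value += delta
        return self.value


def reset(counter, interleaved_add=0):
    """Reset by reading the value and adding its negation.

    `interleaved_add` simulates another client's increment that lands
    between the read and the add; it is an assumption of this sketch.
    """
    observed = counter.value              # step 1: read the current value
    if interleaved_add:
        counter.add(interleaved_add)      # a concurrent increment arrives here
    counter.add(-observed)                # step 2: add -value
    return counter.value


quiet = Counter()
quiet.add(5)
after_quiet_reset = reset(quiet)                       # no concurrent writer

busy = Counter()
busy.add(5)
after_busy_reset = reset(busy, interleaved_add=2)      # one increment slips in
```

With no concurrent writer the counter ends at zero. When an increment slips in between the read and the add, the counter ends at that increment's value instead of zero, which is the hazard the wiki page warns about.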
[jira] [Updated] (CASSANDRA-4049) Add generic way of adding SSTable components required custom compaction strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Kołaczkowski updated CASSANDRA-4049:
------------------------------------------
    Attachment: (was: pluggable_custom_components.patch)

Add generic way of adding SSTable components required custom compaction strategy
    Key: CASSANDRA-4049
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4049
    Project: Cassandra
    Issue Type: New Feature
    Components: Core
    Reporter: Piotr Kołaczkowski
    Assignee: Piotr Kołaczkowski
    Priority: Minor
    Labels: compaction
    Fix For: 1.1.5
    Attachments: pluggable_custom_components-1.1.4.patch

CFS compaction strategy coming up in the next DSE release needs to store some important information in Tombstones.db and RemovedKeys.db files, one per sstable. However, currently Cassandra issues warnings when these files are found in the data directory. Additionally, when switched to SizeTieredCompactionStrategy, the files are left in the data directory after compaction.
The attached patch adds new components to the Component class so Cassandra knows about those files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-4049) Add generic way of adding SSTable components required custom compaction strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443899#comment-13443899 ]

Piotr Kołaczkowski commented on CASSANDRA-4049:
-----------------------------------------------
Ok, rebased. Actually only line numbers changed, there was no conflict.
[jira] [Updated] (CASSANDRA-4049) Add generic way of adding SSTable components required custom compaction strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Kołaczkowski updated CASSANDRA-4049:
------------------------------------------
    Attachment: pluggable_custom_components-1.1.4.patch
[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write
[ https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Tunnicliffe updated CASSANDRA-2897:
---------------------------------------
    Attachment: 0003-CASSANDRA-2897.txt

Secondary indexes without read-before-write
-------------------------------------------
    Key: CASSANDRA-2897
    URL: https://issues.apache.org/jira/browse/CASSANDRA-2897
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Affects Versions: 0.7.0
    Reporter: Sylvain Lebresne
    Assignee: Sam Tunnicliffe
    Priority: Minor
    Labels: secondary_index
    Fix For: 1.2.0 beta 1
    Attachments: 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 0003-CASSANDRA-2897.txt, 2897-apply-cleanup.txt, 41ec9fc-2897.txt

Currently, secondary index updates require a read-before-write to maintain index consistency. Keeping the index consistent at all times is not necessary, however. We could let the (secondary) index become inconsistent on writes and repair it on reads. This would be easy because on reads we make sure to request the indexed columns anyway, so we can just skip the rows that are not needed and repair the index at the same time.
This does trade work on writes for work on reads. However, read-before-write is sufficiently costly that it will likely be a win overall.
There are (at least) two small technical difficulties here though:
# If we repair on read, this will be racy with writes, so we'll probably have to synchronize there.
# We probably shouldn't rely only on reads to repair; we should also have a task to repair the index for things that are rarely read. It's unclear how to make that low impact though.
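
The idea in the CASSANDRA-2897 description (skip the read on write, tolerate stale index entries, and clean them up when an indexed read finds them) can be sketched with a toy in-memory model. This is an illustration only, not the attached patch: `Table`, `write` and `read_by_index` are hypothetical names, the model is single-threaded, and it does not address the write race or the background repair task listed as open difficulties.

```python
from collections import defaultdict


class Table:
    """Toy table with a secondary index on one column (hypothetical model)."""

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                    # row key -> {column: value}
        self.index = defaultdict(set)     # indexed value -> set of row keys

    def write(self, key, columns):
        # No read-before-write: a previous index entry for this row, if any,
        # is left behind and becomes stale.
        self.rows.setdefault(key, {}).update(columns)
        if self.indexed_column in columns:
            self.index[columns[self.indexed_column]].add(key)

    def read_by_index(self, value):
        results = []
        # Iterate over a sorted copy so entries can be discarded safely.
        for key in sorted(self.index.get(value, ())):
            row = self.rows.get(key)
            if row is not None and row.get(self.indexed_column) == value:
                results.append((key, dict(row)))
            else:
                # Stale entry: skip the row and repair the index in passing.
                self.index[value].discard(key)
        return results


users = Table("state")
users.write("u1", {"state": "TX"})
users.write("u2", {"state": "TX"})
users.write("u1", {"state": "CA"})          # leaves a stale TX entry for u1

stale_before_read = set(users.index["TX"])
texans = users.read_by_index("TX")
stale_after_read = set(users.index["TX"])
```

The write path never looks at the old value, so the stale `TX` entry for `u1` stays in the index until the first indexed read for `TX` filters it out and removes it.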
[jira] [Updated] (CASSANDRA-4561) update column family fails
[ https://issues.apache.org/jira/browse/CASSANDRA-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-4561:
---------------------------------------
    Attachment: CASSANDRA-4561.patch

update column family fails
--------------------------
    Key: CASSANDRA-4561
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4561
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Affects Versions: 1.1.0, 1.1.1, 1.1.2, 1.1.3, 1.1.4
    Reporter: Zenek Kraweznik
    Assignee: Pavel Yaskevich
    Attachments: CASSANDRA-4561.patch

[default@test] show schema;
create column family Messages
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 2
  and max_compaction_threshold = 4
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and compaction_strategy_options = {'sstable_size_in_mb' : '1024'}
  and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' : 'org.apache.cassandra.io.compress.DeflateCompressor'};
[default@test] update column family Messages with min_compaction_threshold = 4 and max_compaction_threshold = 32;
a5b7544e-1ef5-3bfd-8770-c09594e37ec2
Waiting for schema agreement...
... schemas agree across the cluster
[default@test] show schema;
create column family Messages
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 2
  and max_compaction_threshold = 4
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and compaction_strategy_options = {'sstable_size_in_mb' : '1024'}
  and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' : 'org.apache.cassandra.io.compress.DeflateCompressor'};
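
The symptom reported in CASSANDRA-4561 (the update is acknowledged, yet `show schema` still prints the old thresholds) can be checked mechanically by comparing the requested attributes with the printed statement. The helper below is a hypothetical sketch, not part of the CLI: `schema_attributes` and `unapplied` are made-up names, and the regular expression only assumes the `name = value` shape visible in the output quoted in this report.

```python
import re

# Matches "name = value" where value is quoted, a {...} map, or a bare token.
ATTRIBUTE = re.compile(r"(\w+)\s*=\s*('[^']*'|\{[^}]*\}|[\w.]+)")


def schema_attributes(statement):
    """Parse a printed 'create column family ...' statement into a dict of strings."""
    return {name: value.strip("'") for name, value in ATTRIBUTE.findall(statement)}


def unapplied(requested, shown_statement):
    """Return {attribute: (requested, shown)} for every attribute that did not take effect."""
    shown = schema_attributes(shown_statement)
    return {
        name: (wanted, shown.get(name))
        for name, wanted in requested.items()
        if shown.get(name) != str(wanted)
    }


# Shortened copy of the statement printed after the update in this report.
shown_after_update = (
    "create column family Messages with column_type = 'Standard' "
    "and gc_grace = 864000 and min_compaction_threshold = 2 "
    "and max_compaction_threshold = 4 "
    "and compaction_strategy_options = {'sstable_size_in_mb' : '1024'};"
)
requested_update = {"min_compaction_threshold": 4, "max_compaction_threshold": 32}
missing = unapplied(requested_update, shown_after_update)
```

For the session above, both requested thresholds come back as unapplied, which matches the bug as reported.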
[jira] [Commented] (CASSANDRA-4532) NPE when trying to select a slice from a composite table
[ https://issues.apache.org/jira/browse/CASSANDRA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444059#comment-13444059 ]

basanth gowda commented on CASSANDRA-4532:
------------------------------------------
I was trying to get a slice range, as you can in Thrift.

Table definition:
create tables schedules(status ascii, timecreated bigint, key ascii, nil ascii, PRIMARY KEY(status,timecreated,key));

There can be a lot of entries for the same time. Let's suppose there are 50 entries that match where timecreated is Ln

1st query : select * from schedules where timecreated = Ln limit 10;
2nd Query : select * from schedules where timecreated=L10 AND key=K10 and timecreatedLn.

I know this is a wrong query in CQL terms; basically I am not sure how to represent Between in CQL.

In Hector I would get a slice range limited to 10 the first time. For each following query (until no more rows are returned) I would use the time and the key returned by the last query as the start of the range. This is in production and works perfectly fine.

NPE when trying to select a slice from a composite table
    Key: CASSANDRA-4532
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4532
    Project: Cassandra
    Issue Type: Bug
    Components: API, Core
    Affects Versions: 1.1.3
    Environment: Cassandra 1.1.3 (2 nodes) on a single host - mac osx
    Reporter: basanth gowda
    Priority: Minor
    Labels: Slice, cql, cql3

I posted this question on StackOverflow, because I need a solution.
Created a table with :
{noformat}
create table compositetest(m_id ascii,i_id int,l_id ascii,body ascii, PRIMARY KEY(m_id,i_id,l_id));
{noformat}
I wanted to slice the results returned, so did something like below; not sure if it's the right way.
The first one returns data perfectly as expected; the second one, to get the next 3 columns, closes the transport of my cqlsh.
{noformat}
cqlsh:testkeyspace1 select * from compositetest where i_id=3 limit 3;
 m_id | i_id | l_id | body
------+------+------+------
   m1 |    1 |   l1 |   b1
   m1 |    2 |   l2 |   b2
   m2 |    1 |   l1 |   b1
cqlsh:testkeyspace1 Was trying to write something for slice range.
TSocket read 0 bytes
{noformat}
Is there a way to achieve what I am doing here? It would be good if some meaningful error were sent back, instead of cqlsh closing the transport.
On the server side I see the following error.
{noformat}
ERROR [Thrift:3] 2012-08-12 15:15:24,414 CustomTThreadPoolServer.java (line 204) Error occurred during processing of message.
java.lang.NullPointerException
    at org.apache.cassandra.cql3.statements.SelectStatement$Restriction.setBound(SelectStatement.java:1277)
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.updateRestriction(SelectStatement.java:1151)
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:1001)
    at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:215)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:121)
    at org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1237)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3542)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3530)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
{noformat}
With ThriftClient I get :
{noformat}
org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at
[jira] [Updated] (CASSANDRA-2293) Rewrite nodetool help
[ https://issues.apache.org/jira/browse/CASSANDRA-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2293:
--------------------------------------
    Reviewer: amorton

Rewrite nodetool help
---------------------
    Key: CASSANDRA-2293
    URL: https://issues.apache.org/jira/browse/CASSANDRA-2293
    Project: Cassandra
    Issue Type: Improvement
    Components: Core, Documentation website
    Affects Versions: 0.8 beta 1
    Reporter: Aaron Morton
    Assignee: Jason Brown
    Priority: Minor
    Fix For: 1.2.1
    Attachments: 0001-Jira-CASSANDRA-2293-Rewrite-nodetool-help.patch

Once CASSANDRA-2008 is through and we are happy with the approach I would like to write similar help for nodetool. Both command line help of the form nodetool help and nodetool help command.
[jira] [Commented] (CASSANDRA-4532) NPE when trying to select a slice from a composite table
[ https://issues.apache.org/jira/browse/CASSANDRA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444067#comment-13444067 ]

Jonathan Ellis commented on CASSANDRA-4532:
-------------------------------------------
can you test against trunk?

NPE when trying to select a slice from a composite table
    Key: CASSANDRA-4532
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4532
    Project: Cassandra
    Issue Type: Bug
    Components: API, Core
    Affects Versions: 1.1.3
    Environment: Cassandra 1.1.3 (2 nodes) on a single host - mac osx
    Reporter: basanth gowda
    Priority: Minor
    Labels: Slice, cql, cql3

I posted this question on StackOverflow, because I need a solution.
Created a table with :
{noformat}
create table compositetest(m_id ascii,i_id int,l_id ascii,body ascii, PRIMARY KEY(m_id,i_id,l_id));
{noformat}
I wanted to slice the results returned, so did something like below; not sure if it's the right way. The first one returns data perfectly as expected; the second one, to get the next 3 columns, closes the transport of my cqlsh.
{noformat}
cqlsh:testkeyspace1 select * from compositetest where i_id=3 limit 3;
 m_id | i_id | l_id | body
------+------+------+------
   m1 |    1 |   l1 |   b1
   m1 |    2 |   l2 |   b2
   m2 |    1 |   l1 |   b1
cqlsh:testkeyspace1 Was trying to write something for slice range.
TSocket read 0 bytes
{noformat}
Is there a way to achieve what I am doing here? It would be good if some meaningful error were sent back, instead of cqlsh closing the transport.
On the server side I see the following error.
{noformat}
ERROR [Thrift:3] 2012-08-12 15:15:24,414 CustomTThreadPoolServer.java (line 204) Error occurred during processing of message.
java.lang.NullPointerException
    at org.apache.cassandra.cql3.statements.SelectStatement$Restriction.setBound(SelectStatement.java:1277)
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.updateRestriction(SelectStatement.java:1151)
    at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:1001)
    at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:215)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:121)
    at org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1237)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3542)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3530)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
{noformat}
With ThriftClient I get :
{noformat}
org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql_query(Cassandra.java:1402)
    at org.apache.cassandra.thrift.Cassandra$Client.execute_cql_query(Cassandra.java:1388)
{noformat}
[jira] [Commented] (CASSANDRA-4532) NPE when trying to select a slice from a composite table
[ https://issues.apache.org/jira/browse/CASSANDRA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444099#comment-13444099 ]

basanth gowda commented on CASSANDRA-4532:
------------------------------------------
No luck. See the last query closed the socket.

Here are the steps to reproduce :

cqlsh:testkeyspace1 create table compositetest(status ascii,ctime bigint,key ascii,nil ascii,PRIMARY KEY(status,ctime,key));
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345678,'key1','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345678,'key2','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345679,'key3','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345679,'key4','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345679,'key5','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345680,'key6','');
cqlsh:testkeyspace1 select * from compositetest;
 status | ctime    | key  | nil
--------+----------+------+-----
      C | 12345678 | key1 |
      C | 12345678 | key2 |
      C | 12345679 | key3 |
      C | 12345679 | key4 |
      C | 12345679 | key5 |
      C | 12345680 | key6 |

1st query of slice :
cqlsh:testkeyspace1 select * from compositetest where ctime=12345680 limit 3;
 status | ctime    | key  | nil
--------+----------+------+------
      C | 12345678 | key1 |
      C | 12345678 | key2 |
      C | 12345679 | key3 | null

Second Query : I want to get values where first one left off (Yes you could do this with hector)

[Try 1]
cqlsh:testkeyspace1 select * from compositetest where ctime=12345679 and key='key3' and ctime=12345680 limit 3;
Bad Request: PRIMARY KEY part key cannot be restricted (preceding part ctime is either not restricted or by a non-EQ relation)

[Try 2]
cqlsh:testkeyspace1 select * from compositetest where ctime=12345679 and key='key3' and ctime=12345680 limit 3;
TSocket read 0 bytes
cqlsh:testkeyspace1
[jira] [Comment Edited] (CASSANDRA-4532) NPE when trying to select a slice from a composite table
[ https://issues.apache.org/jira/browse/CASSANDRA-4532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444099#comment-13444099 ]

basanth gowda edited comment on CASSANDRA-4532 at 8/30/12 1:21 AM:
-------------------------------------------------------------------
No luck. See the last query closed the socket. I took the latest from git and compiled

Here are the steps to reproduce :

cqlsh:testkeyspace1 create table compositetest(status ascii,ctime bigint,key ascii,nil ascii,PRIMARY KEY(status,ctime,key));
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345678,'key1','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345678,'key2','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345679,'key3','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345679,'key4','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345679,'key5','');
cqlsh:testkeyspace1 insert into compositetest(status,ctime,key,nil) VALUES ('C',12345680,'key6','');
cqlsh:testkeyspace1 select * from compositetest;
 status | ctime    | key  | nil
--------+----------+------+-----
      C | 12345678 | key1 |
      C | 12345678 | key2 |
      C | 12345679 | key3 |
      C | 12345679 | key4 |
      C | 12345679 | key5 |
      C | 12345680 | key6 |

1st query of slice :
cqlsh:testkeyspace1 select * from compositetest where ctime=12345680 limit 3;
 status | ctime    | key  | nil
--------+----------+------+------
      C | 12345678 | key1 |
      C | 12345678 | key2 |
      C | 12345679 | key3 | null

Second Query : I want to get values where first one left off (Yes you could do this with hector)

[Try 1]
cqlsh:testkeyspace1 select * from compositetest where ctime=12345679 and key='key3' and ctime=12345680 limit 3;
Bad Request: PRIMARY KEY part key cannot be restricted (preceding part ctime is either not restricted or by a non-EQ relation)

[Try 2]
cqlsh:testkeyspace1 select * from compositetest where ctime=12345679 and key='key3' and ctime=12345680 limit 3;
TSocket read 0 bytes
cqlsh:testkeyspace1
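
The paging pattern described in the CASSANDRA-4532 comments (fetch a limited page, then restart from the time and key of the last row returned) can be sketched over an in-memory list that is sorted the way the clustering columns `(ctime, key)` are. This is only a model of the intended result, under the assumption that rows compare as tuples; `page`, `after` and `upper_ctime` are hypothetical names, and the sketch is neither CQL nor a Hector or Thrift call.

```python
from bisect import bisect_right

# Rows from the reproduction steps, as (ctime, key), already in clustering order.
ROWS = [
    (12345678, "key1"),
    (12345678, "key2"),
    (12345679, "key3"),
    (12345679, "key4"),
    (12345679, "key5"),
    (12345680, "key6"),
]


def page(rows, limit, after=None, upper_ctime=None):
    """Return up to `limit` rows strictly after the `after` tuple.

    `after` is the (ctime, key) of the last row of the previous page, or None
    for the first page; `upper_ctime` is an optional inclusive upper bound.
    """
    start = 0 if after is None else bisect_right(rows, after)
    out = []
    for row in rows[start:]:
        if upper_ctime is not None and row[0] > upper_ctime:
            break
        out.append(row)
        if len(out) == limit:
            break
    return out


first = page(ROWS, 3, upper_ctime=12345680)
second = page(ROWS, 3, after=first[-1], upper_ctime=12345680)
third = page(ROWS, 3, after=second[-1], upper_ctime=12345680)
```

The second page starts at `key4` because the comparison is on the whole `(ctime, key)` tuple, not on each column independently; an empty third page signals that the range is exhausted.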
[jira] [Updated] (CASSANDRA-4292) Improve JBOD loadbalancing and reduce contention
[ https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4292:
--------------------------------------
    Attachment: 0001-Fix-writing-sstables-to-wrong-directory-when-compact.patch

Dave,
You are right. I am attaching a fix to choose the right directory to write the sstable to when compacting.

Improve JBOD loadbalancing and reduce contention
    Key: CASSANDRA-4292
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4292
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Jonathan Ellis
    Assignee: Yuki Morishita
    Fix For: 1.2.0 beta 1
    Attachments: 0001-Fix-writing-sstables-to-wrong-directory-when-compact.patch, 4292.txt, 4292-v2.txt, 4292-v3.txt, 4292-v4.txt

As noted in CASSANDRA-809, we have a certain number of flush (and compaction) threads, which mix and match disk volumes indiscriminately. It may be worth creating a tight thread - disk affinity, to prevent unnecessary conflict at that level.
OTOH as SSDs become more prevalent this becomes a non-issue. It is unclear how much pain this actually causes in practice in the meantime.
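
One way to picture the thread-to-disk affinity suggested in CASSANDRA-4292 is a dedicated single-threaded executor per data directory, so that work for one volume never runs on a thread that also serves another volume. The sketch below is an assumption-laden illustration, not the attached patches: `DiskAffinityFlusher` and `record_flush` are hypothetical names, directories are picked round-robin for simplicity, the paths are placeholders that are never touched on disk, and `submit` is meant to be called from a single thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


class DiskAffinityFlusher:
    """Routes each flush task to the single worker thread that owns one data directory."""

    def __init__(self, data_dirs):
        # One single-threaded executor per directory gives a fixed thread-to-disk mapping.
        self._executors = {
            data_dir: ThreadPoolExecutor(max_workers=1, thread_name_prefix=f"flush-{i}")
            for i, data_dir in enumerate(data_dirs)
        }
        self._next_dir = cycle(data_dirs)   # round-robin choice (an assumption of this sketch)

    def submit(self, flush_fn):
        data_dir = next(self._next_dir)
        return self._executors[data_dir].submit(flush_fn, data_dir)

    def shutdown(self):
        for executor in self._executors.values():
            executor.shutdown(wait=True)


def record_flush(data_dir):
    # Stand-in for writing an sstable: report which thread handled which directory.
    return data_dir, threading.current_thread().name


flusher = DiskAffinityFlusher(["/data1", "/data2"])   # placeholder paths
futures = [flusher.submit(record_flush) for _ in range(6)]
results = [future.result() for future in futures]
flusher.shutdown()

threads_by_dir = {}
for data_dir, thread_name in results:
    threads_by_dir.setdefault(data_dir, set()).add(thread_name)
```

After the run, `threads_by_dir` maps each directory to exactly one worker thread, which is the affinity property the issue asks about. A real implementation would also need to choose directories by free space or load rather than strict rotation.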
[jira] [Created] (CASSANDRA-4583) Some nodes forget schema when 1 node fails
Edward Sargisson created CASSANDRA-4583:

Summary: Some nodes forget schema when 1 node fails
Key: CASSANDRA-4583
URL: https://issues.apache.org/jira/browse/CASSANDRA-4583
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.2
Environment: CentOS release 6.3 (Final)
Reporter: Edward Sargisson

At present we do not have a complete reproduction for this defect, but we are raising it as requested by Aaron Morton. We will update as we find out more. If any additional logging or tests are requested we will do them if we can.

We have experienced 2 failures ascribed to this defect. On the cassandra user mailing list Peter Schuller (2012-08-28) describes an additional failure.

Reproduction steps as currently known:
1. Set up a cluster with 6 nodes (call them #1 through #6).
2. Have #5 fail completely. One failure was when the node was stopped to replace the battery in the hard disk cache. The second failure was when the hardware monitoring recorded a problem, CPU usage was increasing without explanation and the server console was frozen, so the machine was restarted.
3. Bring #5 back.

Expected behaviour:
* #5 should rejoin the ring.

Actual behaviour (based on the incident we saw yesterday):
* #5 didn't rejoin the ring.
* We stopped all nodes and started them one by one.
* Nodes #2, #4, #6 had forgotten most of their column families. They had the keyspace but with only one column family instead of the usual 9 or so.
* We ran nodetool resetlocalschema on #2, #4 and #6.
* We ran nodetool repair -pr on #2, #4, #5 and #6.
* On one of these nodes nodetool repair appeared to crash, in that there were no messages in the logs from it for 10min+. Nodetool compactionstats and nodetool netstats showed no activity.
* Restarting nodetool repair -pr fixed the problem and ran to completion.
[jira] [Updated] (CASSANDRA-4583) Some nodes forget schema when 1 node fails
[ https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Sargisson updated CASSANDRA-4583:

Description:

At present we do not have a complete reproduction for this defect, but we are raising it as requested by Aaron Morton. We will update as we find out more. If any additional logging or tests are requested we will do them if we can.

We have experienced 2 failures ascribed to this defect. On the cassandra user mailing list Peter Schuller (2012-08-28) describes an additional failure.

Reproduction steps as currently known:
1. Set up a cluster with 6 nodes (call them #1 through #6).
2. Have #5 fail completely. One failure was when the node was stopped to replace the battery in the hard disk cache. The second failure was when the hardware monitoring recorded a problem, CPU usage was increasing without explanation and the server console was frozen, so the machine was restarted.
3. Bring #5 back.

Expected behaviour:
* #5 should rejoin the ring.

Actual behaviour (based on the incident we saw yesterday):
* #5 didn't rejoin the ring.
* We stopped all nodes and started them one by one.
* Nodes #2, #4, #6 had forgotten most of their column families. They had the keyspace but with only one column family instead of the usual 9 or so.
* We ran nodetool resetlocalschema on #2, #4 and #6.
* We ran nodetool repair -pr on #2, #4, #5 and #6.
* On #2 nodetool repair appeared to crash, in that there were no messages in the logs from it for 10min+. Nodetool compactionstats and nodetool netstats showed no activity.
* Restarting nodetool repair -pr fixed the problem and ran to completion.

was:

At present we do not have a complete reproduction for this defect, but we are raising it as requested by Aaron Morton. We will update as we find out more. If any additional logging or tests are requested we will do them if we can.

We have experienced 2 failures ascribed to this defect. On the cassandra user mailing list Peter Schuller (2012-08-28) describes an additional failure.

Reproduction steps as currently known:
1. Set up a cluster with 6 nodes (call them #1 through #6).
2. Have #5 fail completely. One failure was when the node was stopped to replace the battery in the hard disk cache. The second failure was when the hardware monitoring recorded a problem, CPU usage was increasing without explanation and the server console was frozen, so the machine was restarted.
3. Bring #5 back.

Expected behaviour:
* #5 should rejoin the ring.

Actual behaviour (based on the incident we saw yesterday):
* #5 didn't rejoin the ring.
* We stopped all nodes and started them one by one.
* Nodes #2, #4, #6 had forgotten most of their column families. They had the keyspace but with only one column family instead of the usual 9 or so.
* We ran nodetool resetlocalschema on #2, #4 and #6.
* We ran nodetool repair -pr on #2, #4, #5 and #6.
* On one of these nodes nodetool repair appeared to crash, in that there were no messages in the logs from it for 10min+. Nodetool compactionstats and nodetool netstats showed no activity.
* Restarting nodetool repair -pr fixed the problem and ran to completion.
[jira] [Updated] (CASSANDRA-4572) lost+found directory in the data dir causes problems again
[ https://issues.apache.org/jira/browse/CASSANDRA-4572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-4572:
Attachment: 4572-1.1.txt

When you call File#listFiles on the lost+found directory, it returns null. I believe there are other cases in which it returns null, so the attached patch just checks for null after File#listFiles is performed.

lost+found directory in the data dir causes problems again

Key: CASSANDRA-4572
URL: https://issues.apache.org/jira/browse/CASSANDRA-4572
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.0
Reporter: Brandon Williams
Assignee: Yuki Morishita
Fix For: 1.1.5
Attachments: 4572-1.1.txt

Looks like we've regressed from CASSANDRA-1547 and mounting a fs directly on the data dir is a problem again.
{noformat}
 INFO [main] 2012-08-22 23:30:03,710 Directories.java (line 475) Upgrade from pre-1.1 version detected: migrating sstables to new directory layout
ERROR [main] 2012-08-22 23:30:03,712 AbstractCassandraDaemon.java (line 370) Exception encountered during startup
java.lang.NullPointerException
    at org.apache.cassandra.db.Directories.migrateSSTables(Directories.java:487)
{noformat}
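The attached patch is Java and is not reproduced here. As a rough Python analogue of the same defensive idea, the sketch below treats a directory that cannot be listed (for example a root-only lost+found) as empty instead of letting the failure abort the whole scan. Python's os.listdir raises an exception where Java's File#listFiles returns null, so the guard is a try/except rather than a null check. The helper names list_entries_or_empty and collect_files are hypothetical and do not come from the Cassandra code base.

```python
import os
import tempfile


def list_entries_or_empty(path):
    """Return the sorted entry names under path, or [] if it cannot be listed."""
    try:
        return sorted(os.listdir(path))
    except OSError:
        # Covers unreadable directories, missing paths and non-directories.
        return []


def collect_files(data_dir):
    """Collect 'subdir/file' names one level below data_dir, skipping unlistable entries."""
    found = []
    for name in list_entries_or_empty(data_dir):
        sub = os.path.join(data_dir, name)
        if os.path.isdir(sub):
            for entry in list_entries_or_empty(sub):
                found.append(os.path.join(name, entry))
    return found


if __name__ == "__main__":
    # Small demonstration on a throwaway directory.
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, "keyspace1"))
    open(os.path.join(root, "keyspace1", "data.db"), "w").close()
    print(collect_files(root))
```

The point of the sketch is only that every directory listing goes through one guarded helper, so a single unreadable entry cannot stop startup.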
[jira] [Commented] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files
[ https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444161#comment-13444161 ]

Steven Willcox commented on CASSANDRA-4571:

We are also seeing this bug and all nodes eventually run out of file descriptors and crash.

Strange permament socket descriptors increasing leads to Too many open files

Key: CASSANDRA-4571
URL: https://issues.apache.org/jira/browse/CASSANDRA-4571
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.1, 1.1.2, 1.1.3
Environment: CentOS 5.8 Linux 2.6.18-308.13.1.el5 #1 SMP Tue Aug 21 17:10:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux. java version 1.6.0_33 Java(TM) SE Runtime Environment (build 1.6.0_33-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Serg Shnerson
Priority: Critical

On a two-node cluster we found a strange, steady increase in socket descriptors. lsof -n | grep java shows many rows like
java 8380 cassandra 113r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 114r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 115r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 116r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 117r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 118r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 119r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 120r unix 0x8101a374a080 938348482 socket
And the number of these rows is constantly increasing. After about 24 hours this situation leads to an error. We use the PHPCassa client. Load is not so high (around 50kb/s on write).
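To watch a leak like the one reported above without re-running lsof by hand, one option is to count a process's open descriptors by kind. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/fd is readable by the caller; the function names classify_fd_target and count_fds are hypothetical helpers, not part of Cassandra or lsof.

```python
import os
from collections import Counter


def classify_fd_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse descriptor kind."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    return "file"


def count_fds(pid="self", proc_root="/proc"):
    """Count open descriptors of a process by kind (assumes Linux procfs)."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir and readlink.
            continue
        counts[classify_fd_target(target)] += 1
    return counts


if __name__ == "__main__":
    # Prints the counts for the current process; pass a Cassandra pid instead.
    print(dict(count_fds()))
```

Sampling count_fds for the Cassandra pid at intervals and comparing the "socket" and "file" counts would show which kind of descriptor is growing.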
[jira] [Comment Edited] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files
[ https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444161#comment-13444161 ] Steven Willcox edited comment on CASSANDRA-4571 at 8/30/12 3:22 AM: We are also seeing this bug and all nodes eventually run out of file descriptors and crash. It is a blocker for us. was (Author: swillcox): We are also seeing this bug and all nodes eventually run out of file descriptors and crash.
[jira] [Updated] (CASSANDRA-1123) Allow tracing query details
[ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1123:
Attachment: 1123-v9.txt

How about this? v9 is v7 with an ExecuteOnlyExecutor to remind us in the future not to use submit on the trace stage.

Allow tracing query details

Key: CASSANDRA-1123
URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
Fix For: 1.2.0 beta 1
Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch, 1123.patch, 1123.patch, 1123-v6.txt, 1123-v7.patch, 1123-v8.patch, 1123-v9.txt

In the spirit of CASSANDRA-511, it would be useful to have tracing on queries to see where latency is coming from: how long did row cache lookup take? key search in the index? merging the data from the sstables? etc.

The main difference vs setting debug logging is that debug logging is too big of a hammer; by turning on the flood of logging for everyone, you actually distort the information you're looking for. This would be something you could set per-query (or more likely per connection).

We don't need to be as sophisticated as the techniques discussed in the following papers but they are interesting reading:
http://research.google.com/pubs/pub36356.html
http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
http://www.usenix.org/event/nsdi07/tech/fonseca.html
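The v9 patch itself is Java and is not shown in this message. As an illustration of the idea only, here is a small Python analogue of an executor that accepts fire-and-forget tasks through execute() and rejects submit() outright. The class body, the single worker thread and the error handling are all assumptions made for the sketch; the comment does not say why submit should be avoided. One general fact that may be relevant is that Java's ExecutorService.submit returns a Future that holds a task's exception until get() is called, so failures can go unnoticed.

```python
import queue
import threading


class ExecuteOnlyExecutor:
    """Single-worker executor that only supports fire-and-forget execute()."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            task = self._tasks.get()
            if task is None:
                # Sentinel queued by shutdown(): stop the worker.
                break
            try:
                task()
            except Exception as exc:
                # Report failures immediately instead of hiding them in a future.
                print("task failed:", exc)

    def execute(self, task):
        """Queue a zero-argument callable; no result object is returned."""
        self._tasks.put(task)

    def submit(self, *args, **kwargs):
        """Deliberately unsupported, so callers cannot use it by accident."""
        raise NotImplementedError("use execute(); submit() is disabled here")

    def shutdown(self):
        """Run the tasks already queued, then stop the worker thread."""
        self._tasks.put(None)
        self._worker.join()
```

A caller would queue work with execute(lambda: ...) and call shutdown() when done; any call to submit() fails loudly at the call site.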
[jira] [Commented] (CASSANDRA-4383) Binary encoding of vnode tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444176#comment-13444176 ]

Eric Evans commented on CASSANDRA-4383:

The approach taken here continues to make me a little hesitant simply because, I think, it introduces for the first time a need for proper ordering of STATE transmission/reception. I don't have a clear-enough understanding of how the underlying messaging works to know if we can firmly rely on that or not (I take you at your word that we can), but it is significant enough to warrant a mention, if for the interface alone. And on that note...

bq. Patch 0003 is something we can take or leave, it just seemed to make sense to have a way to atomically set two gossip states at once in case the gossiper fires in between adding the first and second state, however it's ok for us to gossip TOKENS without STATUS, since STATUS is what fires events. I'm also not 100% certain it actually adds them atomically, since EndpointState is backed by NBHM.

What I had in mind was (at least) something like a {{sendState\{Normal,Bootstrap,...\}(Collection<Token> tokens)}} to encapsulate those operations sensitive to ordering. {{Gossiper.addLocalApplicationStates(...)}} still makes it too easy to do the wrong thing.

A couple of other things:

{{StorageService.getHostId(ep)}} creates a second means of obtaining a host ID, and it's not at all obvious that it should only be used in assigning the value returned by the method of the _same name_ in {{TokenMetadata}}. At a minimum, I think this should be given a new name, one that makes obvious its purpose. However, wouldn't encapsulation be better anyway if this were a {{Gossiper}} method?

And in addition to {{getHostId}}, {{usesHostId}} also seems better suited to the {{Gossiper}}.

Finally, is there any reason that {{TokenSerializer.(de)serialize(...)}} shouldn't be static? What is the instance buying us?

As for the whole "...using the presence of a hostID as an implicit indication of the version...", I was unaware of CASSANDRA-4317 (e6530cc3) and (somehow) mistakenly took that as part of this change. Sorry about that.

bq. Well, it's six one way and a half dozen the other. We can look at NET_VERSION instead, but it was also introduced in 1.2, so it's effectively the same thing... and you could have the bug in the opposite direction

If we look for a NET_VERSION that's not there (and should be), then the error is a missing NET_VERSION. If NET_VERSION is >= 1.2, or exists at all if you prefer the implicit (I don't), and there is no HOST_ID, then we have a bug in transmitting/reception of HOST_ID. But, mostly I meant that it doesn't read as well as the old code that clearly did one thing when the version was X, and another when it was >= Y.

Anyway, this ship has sailed as far as this ticket goes, so if you'd prefer to discuss it elsewhere then that's fine. There is no need to hold any of this against this particular issue/change.

Binary encoding of vnode tokens

Key: CASSANDRA-4383
URL: https://issues.apache.org/jira/browse/CASSANDRA-4383
Project: Cassandra
Issue Type: Sub-task
Reporter: Brandon Williams
Assignee: Brandon Williams
Fix For: 1.2.0 beta 1
Attachments: 0001-Add-HOST_ID-and-TOKENS-app-states-binary-serialization.txt, 0002-Fix-tests.txt, 0003-Add-tokens-and-status-atomically.txt

Since after CASSANDRA-4317 we can know which version a remote node is using (that is, whether it is vnode-aware or not) this is a good opportunity to change the token encoding to binary, since with a default of 256 tokens per node even a fixed-length 16 byte encoding per token provides a great deal of savings in gossip traffic over a text representation.
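The size argument in the issue description can be checked with a little arithmetic: 256 tokens at 16 bytes each is 4096 bytes. The sketch below compares that with a text form, assuming (as an illustration only) tokens in the range 0 to 2^127 written as comma-separated decimal strings. The actual gossip wire format is not described in this message, and the function names are hypothetical.

```python
import random

TOKEN_BYTES = 16  # fixed-length encoding per token, as in the description


def encode_binary(tokens):
    """Concatenate each token as a 16-byte big-endian unsigned integer."""
    return b"".join(t.to_bytes(TOKEN_BYTES, "big") for t in tokens)


def decode_binary(data):
    """Inverse of encode_binary: split into 16-byte chunks and parse each."""
    return [
        int.from_bytes(data[i:i + TOKEN_BYTES], "big")
        for i in range(0, len(data), TOKEN_BYTES)
    ]


def encode_text(tokens):
    """Assumed text form: decimal tokens joined by commas, ASCII-encoded."""
    return ",".join(str(t) for t in tokens).encode("ascii")


if __name__ == "__main__":
    rng = random.Random(4383)  # fixed seed so runs are repeatable
    tokens = [rng.randrange(2 ** 127) for _ in range(256)]
    print("binary bytes:", len(encode_binary(tokens)))
    print("text bytes:  ", len(encode_text(tokens)))
```

Under these assumptions the text form comes out at more than twice the size of the binary one, because a token near 2^127 needs up to 39 decimal digits plus a separator.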
[jira] [Updated] (CASSANDRA-4583) Some nodes forget schema when 1 node fails
[ https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Sargisson updated CASSANDRA-4583: Attachment: cass-4583-5-system.log cass-4583-2-system.log cass-4583-5-system.log is an extract from #5 from the time of the incident. Similarly, cass-4583-2-system.log is from #2. #2 is 10.30.11.40 #5 is 10.30.11.43
[jira] [Commented] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files
[ https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444190#comment-13444190 ]

Per Otterström commented on CASSANDRA-4571:

To verify, we started from scratch with a new installation on 3 servers, and the FD leak is still there. So, with our particular setup we are able to reproduce the bug.

These are the characteristics of our setup:
- We have one single CF.
- Rows are inserted in batches.
- Rows are read, updated and deleted in a random-like pattern.
- The FD leak seems to start during heavy read load (but can appear during mixed read/write/delete operations as well).
- We are using Hector to access this single CF.
- Cassandra configuration is basically standard.

The FD leak does not show immediately. It appears once there are ~60M rows in the CF.
[jira] [Commented] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files
[ https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444197#comment-13444197 ] Jonathan Ellis commented on CASSANDRA-4571: --- Are you sure you can't reproduce on a single-node cluster? Because we're getting conflicting evidence here; on the one hand, strace indicates that the fd leakage is related to file i/o, but if so, you shouldn't need multiple nodes in the cluster to repro.
[jira] [Updated] (CASSANDRA-1123) Allow tracing query details
[ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-1123: -- Attachment: 1123-v9.txt
[jira] [Commented] (CASSANDRA-1123) Allow tracing query details
[ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444209#comment-13444209 ] David Alves commented on CASSANDRA-1123: +1, wfm
[jira] [Resolved] (CASSANDRA-4583) Some nodes forget schema when 1 node fails
[ https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich resolved CASSANDRA-4583. Resolution: Duplicate This looks like it was caused by the same problem as CASSANDRA-4129 and timestamp problems related to nanoTime usage for schema, all of that is fixed in 1.1.4
[jira] [Comment Edited] (CASSANDRA-4583) Some nodes forget schema when 1 node fails
[ https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13444210#comment-13444210 ] Pavel Yaskevich edited comment on CASSANDRA-4583 at 8/30/12 4:09 AM: - This looks like it was caused by the same problem as CASSANDRA-4219 and timestamp problems related to nanoTime usage for schema, all of that is fixed in 1.1.4 was (Author: xedin): This looks like it was caused by the same problem as CASSANDRA-4129 and timestamp problems related to nanoTime usage for schema, all of that is fixed in 1.1.4
[jira] [Commented] (CASSANDRA-4583) Some nodes forget schema when 1 node fails
[ https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444224#comment-13444224 ]

Pavel Yaskevich commented on CASSANDRA-4583:

Additionally there are CASSANDRA-4432 and CASSANDRA-4561 related to timestamp problem
[jira] [Commented] (CASSANDRA-4583) Some nodes forget schema when 1 node fails
[ https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444257#comment-13444257 ]

Edward Sargisson commented on CASSANDRA-4583:

Hi Pavel,
Thanks for your quick reply - that seems reasonable until and unless we can show otherwise. We'll schedule an upgrade to 1.1.4 and will report back if we see a recurrence afterwards.
[jira] [Created] (CASSANDRA-4584) Add CQL syntax to enable request tracing
Jonathan Ellis created CASSANDRA-4584:

    Summary: Add CQL syntax to enable request tracing
    Key: CASSANDRA-4584
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4584
    Project: Cassandra
    Issue Type: Sub-task
    Affects Versions: 1.2.0 beta 1
    Reporter: Jonathan Ellis
    Assignee: Sylvain Lebresne
    Fix For: 1.2.0
[jira] [Created] (CASSANDRA-4585) Add cqlsh support for tracing results
Jonathan Ellis created CASSANDRA-4585:

    Summary: Add cqlsh support for tracing results
    Key: CASSANDRA-4585
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4585
    Project: Cassandra
    Issue Type: Sub-task
    Components: Tools
    Affects Versions: 1.2.0 beta 1
    Reporter: Jonathan Ellis
    Assignee: paul cannon
    Fix For: 1.2.0
[3/3] add request tracing patch by David Alves; reviewed by jbellis for CASSANDRA-1123
http://git-wip-us.apache.org/repos/asf/cassandra/blob/5c94432b/src/java/org/apache/cassandra/tracing/Tracing.java
--
diff --git a/src/java/org/apache/cassandra/tracing/Tracing.java b/src/java/org/apache/cassandra/tracing/Tracing.java
new file mode 100644
index 000..7675d74
--- /dev/null
+++ b/src/java/org/apache/cassandra/tracing/Tracing.java
@@ -0,0 +1,256 @@
+/*
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ *
+ */
+package org.apache.cassandra.tracing;
+
+import static com.google.common.base.Preconditions.checkState;
+import static org.apache.cassandra.utils.ByteBufferUtil.bytes;
+
+import java.net.InetAddress;
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.UUID;
+
+import org.apache.cassandra.concurrent.Stage;
+import org.apache.cassandra.concurrent.StageManager;
+import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.cql3.ColumnNameBuilder;
+import org.apache.cassandra.db.ColumnFamily;
+import org.apache.cassandra.db.ExpiringColumn;
+import org.apache.cassandra.db.RowMutation;
+import org.apache.cassandra.db.marshal.InetAddressType;
+import org.apache.cassandra.db.marshal.LongType;
+import org.apache.cassandra.db.marshal.TimeUUIDType;
+import org.apache.cassandra.db.marshal.UTF8Type;
+import org.apache.cassandra.net.MessageIn;
+import org.apache.cassandra.service.StorageProxy;
+import org.apache.cassandra.thrift.ConsistencyLevel;
+import org.apache.cassandra.thrift.TimedOutException;
+import org.apache.cassandra.thrift.UnavailableException;
+import org.apache.cassandra.utils.ByteBufferUtil;
+import org.apache.cassandra.utils.FBUtilities;
+import org.apache.cassandra.utils.UUIDGen;
+import org.apache.cassandra.utils.WrappedRunnable;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A trace session context. Able to track and store trace sessions. A session is usually a user initiated query, and may
+ * have multiple local and remote events before it is completed. All events and sessions are stored at table.
+ */
+public class Tracing
+{
+    public static final String TRACE_KS = "system_traces";
+    public static final String EVENTS_CF = "events";
+    public static final String SESSIONS_CF = "sessions";
+    public static final String TRACE_HEADER = "TraceSession";
+
+    private static final int TTL = 24 * 3600;
+
+    private static Tracing instance = new Tracing();
+
+    public static final Logger logger = LoggerFactory.getLogger(Tracing.class);
+
+    /**
+     * Fetches and lazy initializes the trace context.
+     */
+    public static Tracing instance()
+    {
+        return instance;
+    }
+
+    private InetAddress localAddress = FBUtilities.getLocalAddress();
+
+    private final ThreadLocal<TraceState> state = new ThreadLocal<TraceState>();
+
+    public static void addColumn(ColumnFamily cf, ByteBuffer name, Object value)
+    {
+        cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(value.toString()), System.currentTimeMillis(), TTL));
+    }
+
+    public static void addColumn(ColumnFamily cf, ByteBuffer name, InetAddress address)
+    {
+        cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(address), System.currentTimeMillis(), TTL));
+    }
+
+    public static void addColumn(ColumnFamily cf, ByteBuffer name, int value)
+    {
+        cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(value), System.currentTimeMillis(), TTL));
+    }
+
+    public static void addColumn(ColumnFamily cf, ByteBuffer name, long value)
+    {
+        cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(value), System.currentTimeMillis(), TTL));
+    }
+
+    public static void addColumn(ColumnFamily cf, ByteBuffer name, String value)
+    {
+        cf.addColumn(new ExpiringColumn(name, ByteBufferUtil.bytes(value), System.currentTimeMillis(), TTL));
+    }
+
+    private void addColumn(ColumnFamily cf, ByteBuffer name, ByteBuffer value)
+    {
+        cf.addColumn(new ExpiringColumn(name, value, System.currentTimeMillis(), TTL));
+    }
+
+    public void addParameterColumns(ColumnFamily cf, Map<String, String> rawPayload)
+    {
+        for
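The `state` field in the class above appears to hold trace state per thread, so that only threads working on a traced request record events. A minimal sketch of that pattern follows, assuming a session is identified by a `UUID`. `TraceContextSketch` and its methods are hypothetical names chosen for illustration and are not part of the Cassandra patch.

```java
import java.util.UUID;

// Hypothetical illustration only; not the Cassandra Tracing API.
final class TraceContextSketch
{
    // Each thread sees only its own session id; null means "not tracing".
    private static final ThreadLocal<UUID> SESSION = new ThreadLocal<UUID>();

    /** Starts a session on the calling thread and returns its id. */
    static UUID begin()
    {
        UUID id = UUID.randomUUID();
        SESSION.set(id);
        return id;
    }

    /** Returns the calling thread's session id, or null if it is not tracing. */
    static UUID current()
    {
        return SESSION.get();
    }

    static boolean isTracing()
    {
        return SESSION.get() != null;
    }

    /** Clears the calling thread's session so pooled threads do not leak it. */
    static void end()
    {
        SESSION.remove();
    }
}
```

Clearing the value in `end()` matters when threads are pooled, since a stale session id would otherwise be picked up by the next task that runs on the same thread.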
[jira] [Commented] (CASSANDRA-4583) Some nodes forget schema when 1 node fails
[ https://issues.apache.org/jira/browse/CASSANDRA-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444307#comment-13444307 ]

Pavel Yaskevich commented on CASSANDRA-4583:

Sounds good!
git commit: fix mispaste
Updated Branches:
  refs/heads/trunk 5c94432b2 -> ad52ce4fa

fix mispaste

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ad52ce4f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ad52ce4f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ad52ce4f

Branch: refs/heads/trunk
Commit: ad52ce4fa303d2c63cbd9833b7245ab2cdff28b3
Parents: 5c94432
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Wed Aug 29 13:47:18 2012 -0500
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Wed Aug 29 13:47:18 2012 -0500

--
 src/java/org/apache/cassandra/tools/NodeCmd.java |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ad52ce4f/src/java/org/apache/cassandra/tools/NodeCmd.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java b/src/java/org/apache/cassandra/tools/NodeCmd.java
index 8cb7dbe..554fba2 100644
--- a/src/java/org/apache/cassandra/tools/NodeCmd.java
+++ b/src/java/org/apache/cassandra/tools/NodeCmd.java
@@ -47,7 +47,7 @@ import org.apache.cassandra.thrift.InvalidRequestException;
 import org.apache.cassandra.utils.EstimatedHistogram;
 import org.apache.cassandra.utils.Pair;

-public class trace_next_queryNodeCmd
+public class NodeCmd
 {
     private static final Pair<String, String> SNAPSHOT_COLUMNFAMILY_OPT = new Pair<String, String>("cf", "column-family");
     private static final Pair<String, String> HOST_OPT = new Pair<String, String>("h", "host");
@@ -147,7 +147,7 @@ public class trace_next_queryNodeCmd
         // No args
         addCmdHelp(header, "ring", "Print information about the token ring");
         addCmdHelp(header, "join", "Join the ring");
-        addCmdHelp(header, "igit nfo [-T/--tokens]", "Print node information (uptime, load, ...)");
+        addCmdHelp(header, "info [-T/--tokens]", "Print node information (uptime, load, ...)");
         addCmdHelp(header, "status", "Print cluster information (state, load, IDs, ...)");
         addCmdHelp(header, "cfstats", "Print statistics on column families");
         addCmdHelp(header, "version", "Print cassandra version");
[jira] [Commented] (CASSANDRA-1123) Allow tracing query details
[ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444338#comment-13444338 ]

Jonathan Ellis commented on CASSANDRA-1123:

Hmm, looks like this broke our test log4j config somehow. ant test gives a lot of this:
{noformat}
[junit] ERROR 14:02:56,567 Fatal exception in thread Thread[MigrationStage:1,5,main]
[junit] java.lang.NullPointerException
[junit]     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1195)
[junit]     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1087)
[junit]     at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1077)
[junit]     at org.apache.cassandra.config.ColumnDefinition.readSchema(ColumnDefinition.java:247)
[junit]     at org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1320)
[junit]     at org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:293)
[junit]     at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:342)
[junit]     at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:255)
[junit]     at org.apache.cassandra.service.MigrationManager$1.call(MigrationManager.java:202)
[junit]     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
[junit]     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
[junit]     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
[junit]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
[junit]     at java.lang.Thread.run(Thread.java:662)
{noformat}
where CFS:1195 is a logger.debug call. ant test uses test/conf/log4j-server.properties, which just specifies a file and stdout at DEBUG.

Allow tracing query details

    Key: CASSANDRA-1123
    URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
    Project: Cassandra
    Issue Type: New Feature
    Components: Core
    Reporter: Jonathan Ellis
    Assignee: David Alves
    Fix For: 1.2.0 beta 1
    Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch, 1123.patch, 1123.patch, 1123-v6.txt, 1123-v7.patch, 1123-v8.patch, 1123-v9.txt, 1123-v9.txt

In the spirit of CASSANDRA-511, it would be useful to enable tracing on queries to see where latency is coming from: how long did row cache lookup take? key search in the index? merging the data from the sstables? etc.

The main difference vs setting debug logging is that debug logging is too big of a hammer; by turning on the flood of logging for everyone, you actually distort the information you're looking for. This would be something you could set per-query (or more likely per connection).

We don't need to be as sophisticated as the techniques discussed in the following papers but they are interesting reading:
http://research.google.com/pubs/pub36356.html
http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
http://www.usenix.org/event/nsdi07/tech/fonseca.html
[jira] [Reopened] (CASSANDRA-1123) Allow tracing query details
[ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reopened CASSANDRA-1123:

Reverted pending tests fix.
[jira] [Updated] (CASSANDRA-1123) Allow tracing query details
[ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Alves updated CASSANDRA-1123:

    Attachment: 1123-v9.patch

Fixes the NPE. The problem was the result.getColumnCount call inside the logging statement (result may be null).
[jira] [Commented] (CASSANDRA-4567) Error in log related to Murmur3Partitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1304#comment-1304 ]

Pavel Yaskevich commented on CASSANDRA-4567:

+1

Error in log related to Murmur3Partitioner

    Key: CASSANDRA-4567
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4567
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Affects Versions: 1.2.0 beta 1
    Environment: Using ccm on ubuntu
    Reporter: Tyler Patterson
    Assignee: Vijay
    Fix For: 1.2.0 beta 1
    Attachments: 0001-CASSANDRA-4567.patch, 0001-CASSANDRA-4567-v2.patch, 0001-CASSANDRA-4567-v3.patch

Start a 2-node cluster on cassandra-1.1. Bring down one node, upgrade it to trunk, start it back up. The following error shows up in the log:
{code}
...
INFO [main] 2012-08-22 10:44:40,012 CacheService.java (line 170) Scheduling row cache save to each 0 seconds (going to save all keys).
INFO [SSTableBatchOpen:1] 2012-08-22 10:44:40,106 SSTableReader.java (line 164) Opening /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-2 (148 bytes)
INFO [SSTableBatchOpen:2] 2012-08-22 10:44:40,106 SSTableReader.java (line 164) Opening /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-1 (226 bytes)
INFO [SSTableBatchOpen:3] 2012-08-22 10:44:40,106 SSTableReader.java (line 164) Opening /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-3 (89 bytes)
ERROR [SSTableBatchOpen:3] 2012-08-22 10:44:40,114 CassandraDaemon.java (line 131) Exception in thread Thread[SSTableBatchOpen:3,5,main]
java.lang.RuntimeException: Cannot open /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-3 because partitioner does not match org.apache.cassandra.dht.Murmur3Partitioner
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:175)
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149)
    at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
ERROR [SSTableBatchOpen:2] 2012-08-22 10:44:40,114 CassandraDaemon.java (line 131) Exception in thread Thread[SSTableBatchOpen:2,5,main]
java.lang.RuntimeException: Cannot open /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-1 because partitioner does not match org.apache.cassandra.dht.Murmur3Partitioner
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:175)
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149)
    at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
ERROR [SSTableBatchOpen:1] 2012-08-22 10:44:40,114 CassandraDaemon.java (line 131) Exception in thread Thread[SSTableBatchOpen:1,5,main]
java.lang.RuntimeException: Cannot open /tmp/dtest-IYHWfV/test/node1/data/system/LocationInfo/system-LocationInfo-he-2 because partitioner does not match org.apache.cassandra.dht.Murmur3Partitioner
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:175)
    at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:149)
    at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
INFO
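The "partitioner does not match" errors in the log above suggest a startup check that compares the partitioner recorded with each data file against the one the node is configured with. A rough sketch of such a check is shown below, assuming both are available as class-name strings. `PartitionerCheckSketch` and `validate` are hypothetical names for illustration; this is not the `SSTableReader` code.

```java
// Hypothetical illustration only; not the Cassandra SSTableReader code.
final class PartitionerCheckSketch
{
    /**
     * Throws if the partitioner recorded for a data file differs from the
     * partitioner the node is currently configured with.
     */
    static void validate(String path, String recordedPartitioner, String configuredPartitioner)
    {
        if (!configuredPartitioner.equals(recordedPartitioner))
        {
            throw new IllegalStateException(String.format(
                "Cannot open %s because partitioner does not match %s", path, configuredPartitioner));
        }
    }
}
```

Under this reading, the upgraded node in the report would fail the check for every file written before the upgrade if its configured partitioner changed between versions.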
[jira] [Commented] (CASSANDRA-4009) Increase usage of Metrics and flesh out o.a.c.metrics
[ https://issues.apache.org/jira/browse/CASSANDRA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1366#comment-1366 ]

Brandon Williams commented on CASSANDRA-4009:

+1

Increase usage of Metrics and flesh out o.a.c.metrics

    Key: CASSANDRA-4009
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4009
    Project: Cassandra
    Issue Type: Improvement
    Reporter: Brandon Williams
    Assignee: Yuki Morishita
    Priority: Minor
    Fix For: 1.2.0
    Attachments: 4009.txt, 4009.txt, 4009-v2.txt

With CASSANDRA-3671 we have begun using the Metrics packages to expose stats in a new JMX structure, intended to be more user-friendly (for example, you don't need to know what a StorageProxy is or does.) This ticket serves as a parent for subtasks to finish fleshing out the rest of the enhanced metrics.
git commit: fix NPE patch by David Alves for CASSANDRA-1123
Updated Branches:
  refs/heads/trunk ad52ce4fa -> 5b6a2b11b

fix NPE
patch by David Alves for CASSANDRA-1123

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5b6a2b11
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5b6a2b11
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5b6a2b11

Branch: refs/heads/trunk
Commit: 5b6a2b11bc8a9499ac012d745869e3d814cc91ad
Parents: ad52ce4
Author: Jonathan Ellis <jbel...@apache.org>
Authored: Wed Aug 29 18:09:57 2012 -0500
Committer: Jonathan Ellis <jbel...@apache.org>
Committed: Wed Aug 29 18:10:05 2012 -0500

--
 .../org/apache/cassandra/db/ColumnFamilyStore.java |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5b6a2b11/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 8ef686b..ef0e55d 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1192,7 +1192,7 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
             readStats.addNano(System.nanoTime() - start);
         }

-        logger.debug("Read {} columns", result.getColumnCount());
+        logger.debug("Read {} columns", result == null ? 0 : result.getColumnCount());
         return result;
     }
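The one-line change above guards a logging argument that is evaluated even when the read returned nothing. The same idea in isolation, as a small sketch: `NullSafeLogSketch` and `readMessage` are hypothetical names, and a plain `List` stands in for the column family result, so this is an illustration rather than the patched method.

```java
import java.util.List;

// Hypothetical illustration only; a List stands in for the read result.
final class NullSafeLogSketch
{
    /** Builds the log message without dereferencing a null result. */
    static String readMessage(List<String> result)
    {
        // Arguments to a log call are evaluated before the call itself,
        // so the null check has to happen in the argument expression.
        int count = (result == null) ? 0 : result.size();
        return "Read " + count + " columns";
    }
}
```

With this guard, a missing row is logged as zero columns instead of raising a `NullPointerException` inside the logging statement.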
[jira] [Commented] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files
[ https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444558#comment-13444558 ]

Serg Shnerson commented on CASSANDRA-4571:

bq. Are you sure you can't reproduce on a single-node cluster?
My mistake. The bug was also reproduced with a one-node cluster.

Strange permament socket descriptors increasing leads to Too many open files

    Key: CASSANDRA-4571
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4571
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Affects Versions: 1.1.1, 1.1.2, 1.1.3
    Environment: CentOS 5.8 Linux 2.6.18-308.13.1.el5 #1 SMP Tue Aug 21 17:10:18 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux. java version 1.6.0_33 Java(TM) SE Runtime Environment (build 1.6.0_33-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
    Reporter: Serg Shnerson
    Priority: Critical

On the two-node cluster we found a strange, steady increase in socket descriptors. lsof -n | grep java shows many rows like
java 8380 cassandra 113r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 114r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 115r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 116r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 117r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 118r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 119r unix 0x8101a374a080 938348482 socket
java 8380 cassandra 120r unix 0x8101a374a080 938348482 socket
The number of these rows is constantly increasing. After about 24 hours this situation leads to an error.
We use the PHPCassa client. Load is not so high (around ~50kb/s on write).
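Descriptor growth like the one reported here can also be watched from inside a JVM rather than through lsof. The sketch below assumes a HotSpot or OpenJDK runtime on a Unix-like system, where the platform MXBean implements `com.sun.management.UnixOperatingSystemMXBean`. `OpenFdSketch` is a hypothetical helper for illustration and is not part of Cassandra.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical illustration only; not part of Cassandra.
final class OpenFdSketch
{
    /**
     * Returns the number of file descriptors currently open in this JVM,
     * or -1 when the runtime does not expose that figure.
     */
    static long openFileDescriptors()
    {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean)
        {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1L;
    }
}
```

Sampling this value periodically and comparing successive readings would show whether the count keeps climbing under a steady load, which is the symptom described in the report.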
[jira] [Comment Edited] (CASSANDRA-4571) Strange permament socket descriptors increasing leads to Too many open files
[ https://issues.apache.org/jira/browse/CASSANDRA-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444558#comment-13444558 ]

Serg Shnerson edited comment on CASSANDRA-4571 at 8/30/12 11:08 AM:

bq.Are you sure you can't reproduce on a single-node cluster?
My mistake. I've checked it again. Bug also was reproduced with one-node cluster.

was (Author: sergshne):
bq.Are you sure you can't reproduce on a single-node cluster?
My mistake. Bug also was reproduced with one-node cluster.
git commit: log related to Murmur3Partitioner patch by vijay; reviewed by Pavel Yaskevich for CASSANDRA-4282
Updated Branches:
  refs/heads/trunk 5b6a2b11b -> 0525ae25f

log related to Murmur3Partitioner patch by vijay; reviewed by Pavel Yaskevich for CASSANDRA-4282

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0525ae25
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0525ae25
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0525ae25

Branch: refs/heads/trunk
Commit: 0525ae25f82ea132727b395da973b06fd1733011
Parents: 5b6a2b1
Author: Vijay Parthasarathy vijay2...@gmail.com
Authored: Wed Aug 29 18:02:57 2012 -0700
Committer: Vijay Parthasarathy vijay2...@gmail.com
Committed: Wed Aug 29 18:02:57 2012 -0700

--
 .../cassandra/config/DatabaseDescriptor.java | 7 +
 .../org/apache/cassandra/gms/GossipDigestSyn.java | 18 --
 .../cassandra/gms/GossipDigestSynVerbHandler.java | 6 +
 src/java/org/apache/cassandra/gms/Gossiper.java | 4 ++-
 .../apache/cassandra/io/sstable/SSTableReader.java | 9 +--
 test/data/serialization/1.2/gms.Gossip.bin | Bin 109 -> 158 bytes
 .../apache/cassandra/gms/SerializationsTest.java | 2 +-
 7 files changed, 38 insertions(+), 8 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0525ae25/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 2e22e95..7533214 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -71,6 +71,7 @@ public class DatabaseDescriptor
     /* Hashing strategy Random or OPHF */
     private static IPartitioner<?> partitioner;
+    private static String paritionerName;
     private static Config.DiskAccessMode indexAccessMode;
@@ -224,6 +225,7 @@ public class DatabaseDescriptor
             {
                 throw new ConfigurationException("Invalid partitioner class " + conf.partitioner);
             }
+            paritionerName = partitioner.getClass().getCanonicalName();
             /* phi convict threshold for FailureDetector */
             if (conf.phi_convict_threshold < 5 || conf.phi_convict_threshold > 16)
@@ -642,6 +644,11 @@ public class DatabaseDescriptor
         return partitioner;
     }
+    public static String getPartitionerName()
+    {
+        return paritionerName;
+    }
+
     /* For tests ONLY, don't use otherwise or all hell will break loose */
     public static void setPartitioner(IPartitioner<?> newPartitioner)
     {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/0525ae25/src/java/org/apache/cassandra/gms/GossipDigestSyn.java
--
diff --git a/src/java/org/apache/cassandra/gms/GossipDigestSyn.java b/src/java/org/apache/cassandra/gms/GossipDigestSyn.java
index 8ce2257..24979f1 100644
--- a/src/java/org/apache/cassandra/gms/GossipDigestSyn.java
+++ b/src/java/org/apache/cassandra/gms/GossipDigestSyn.java
@@ -23,6 +23,7 @@ import java.util.List;
 import org.apache.cassandra.db.TypeSizes;
 import org.apache.cassandra.io.IVersionedSerializer;
+import org.apache.cassandra.net.MessagingService;
 /**
  * This is the first message that gets sent out as a start of the Gossip protocol in a
@@ -33,11 +34,13 @@ public class GossipDigestSyn
     public static final IVersionedSerializer<GossipDigestSyn> serializer = new GossipDigestSynSerializer();
     final String clusterId;
+    final String partioner;
     final List<GossipDigest> gDigests;
-    public GossipDigestSyn(String clusterId, List<GossipDigest> gDigests)
+    public GossipDigestSyn(String clusterId, String partioner, List<GossipDigest> gDigests)
     {
         this.clusterId = clusterId;
+        this.partioner = partioner;
         this.gDigests = gDigests;
     }
@@ -79,19 +82,28 @@ class GossipDigestSynSerializer implements IVersionedSerializer<GossipDigestSyn>
     public void serialize(GossipDigestSyn gDigestSynMessage, DataOutput dos, int version) throws IOException
     {
         dos.writeUTF(gDigestSynMessage.clusterId);
+        if (version >= MessagingService.VERSION_12)
+            dos.writeUTF(gDigestSynMessage.partioner);
         GossipDigestSerializationHelper.serialize(gDigestSynMessage.gDigests, dos, version);
     }
     public GossipDigestSyn deserialize(DataInput dis, int version) throws IOException
     {
         String clusterId = dis.readUTF();
+        String partioner = null;
+        if (version >= MessagingService.VERSION_12)
+            partioner = dis.readUTF();
         List<GossipDigest> gDigests = GossipDigestSerializationHelper.deserialize(dis,
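The serializer change in this commit follows a common pattern: a new field is written and read only when the negotiated messaging version is new enough, so older peers still see the old wire layout. A minimal, self-contained sketch of that pattern follows. The class name `VersionedSynSketch` and the numeric values of the version constants are assumptions made for illustration; they are not Cassandra's real classes or values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a gossip SYN message; not Cassandra's real class. */
final class VersionedSynSketch
{
    // Assumed version numbers, chosen only for this illustration.
    static final int VERSION_11 = 4;
    static final int VERSION_12 = 5;

    final String clusterId;
    final String partitioner; // stays null when read from a pre-1.2 peer

    VersionedSynSketch(String clusterId, String partitioner)
    {
        this.clusterId = clusterId;
        this.partitioner = partitioner;
    }

    static byte[] serialize(VersionedSynSketch msg, int version)
    {
        try
        {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(msg.clusterId);
            // The new field goes on the wire only for peers that understand it.
            if (version >= VERSION_12)
                out.writeUTF(msg.partitioner);
            out.flush();
            return bytes.toByteArray();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    static VersionedSynSketch deserialize(byte[] data, int version)
    {
        try
        {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            String clusterId = in.readUTF();
            String partitioner = null;
            // Mirror the writer: read the field only if the sender wrote it.
            if (version >= VERSION_12)
                partitioner = in.readUTF();
            return new VersionedSynSketch(clusterId, partitioner);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```

The reader and writer must gate on the same version check; if only one side is gated, the remaining fields are read from the wrong offset.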
[jira] [Commented] (CASSANDRA-3979) Consider providing error code with exceptions (and documenting them)
[ https://issues.apache.org/jira/browse/CASSANDRA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444605#comment-13444605 ] paul cannon commented on CASSANDRA-3979: +1 on this monster. This will be fantastic for clients, and should encourage use of the binary protocol. I did have to make a really small change to the 'stress' tool code to get a full successful build, as shown here: https://github.com/thepaul/cassandra/commit/7b1f71f6 , but that's a triviality. The only other thing is that it would have been nice to include the extra information for Thrift clients too, even if it's just rendered into the error string. But maybe that would break super-fragile clients that depend on exact error messages? Consider providing error code with exceptions (and documenting them) Key: CASSANDRA-3979 URL: https://issues.apache.org/jira/browse/CASSANDRA-3979 Project: Cassandra Issue Type: Sub-task Components: API Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: cql3 Fix For: 1.2.0 beta 1 It could be a good idea to assign documented error codes to the different exceptions raised. Currently, one may have to parse the exception string (say if one wants to know if its 'create keyspace' failed because the keyspace already exists versus some other kind of exception), which means we cannot improve the error message without risking breaking client code. Adding documented error codes with the message would avoid this.
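To make the motivation concrete, here is a small sketch of a client that branches on an error code instead of parsing the message text. Everything in it is hypothetical: the enum `ErrorCodeSketch`, its numeric values, and the classes `CodedExceptionSketch` and `SchemaClientSketch` are invented for illustration and are not taken from the patch on this ticket.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical error codes; the numeric values are invented for this sketch. */
enum ErrorCodeSketch
{
    SERVER_ERROR(1000),
    ALREADY_EXISTS(1001);

    final int value;

    ErrorCodeSketch(int value)
    {
        this.value = value;
    }
}

/** Exception that carries a documented code next to the free-form message. */
final class CodedExceptionSketch extends RuntimeException
{
    final ErrorCodeSketch code;

    CodedExceptionSketch(ErrorCodeSketch code, String message)
    {
        super(message);
        this.code = code;
    }
}

/** Toy in-memory "server" plus a client helper that relies on the code only. */
final class SchemaClientSketch
{
    private final Set<String> keyspaces = new HashSet<>();

    void createKeyspace(String name)
    {
        if (!keyspaces.add(name))
            throw new CodedExceptionSketch(ErrorCodeSketch.ALREADY_EXISTS,
                                           "Keyspace " + name + " already exists");
    }

    /** Returns true if the keyspace was created, false if it already existed. */
    boolean createKeyspaceIfAbsent(String name)
    {
        try
        {
            createKeyspace(name);
            return true;
        }
        catch (CodedExceptionSketch e)
        {
            // No string matching: the message can be reworded freely.
            if (e.code == ErrorCodeSketch.ALREADY_EXISTS)
                return false;
            throw e;
        }
    }
}
```

Because the client inspects only `code`, the server side is free to improve the wording of the message without breaking callers, which is the point made in the issue description.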
[jira] [Commented] (CASSANDRA-4498) Remove openjdk-6-jre Cassandra APT dependencies
[ https://issues.apache.org/jira/browse/CASSANDRA-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444606#comment-13444606 ] paul cannon commented on CASSANDRA-4498: If there are no other problems with installing openjdk-6-jre on the side, then definitely +1 for status quo. Remove openjdk-6-jre Cassandra APT dependencies --- Key: CASSANDRA-4498 URL: https://issues.apache.org/jira/browse/CASSANDRA-4498 Project: Cassandra Issue Type: Improvement Reporter: Terrance Shepherd Assignee: Brandon Williams Priority: Minor Labels: debian Fix For: 1.2.0 beta 1 Attachments: apache_cassandra_Packages.diff As is well known, the recommended jre for Cassandra is sun java 1.6, but at this point that package is no longer in the debian or ubuntu apt repos. In order to run Cassandra with the sun java 1.6 jre, it must be installed manually, without the repos. Because of this, when you install cassandra via the apache or datastax apt repos, it must also install openjdk-6-jre even though the sun java 1.6 jre is already installed. I would suggest that the java apt dependencies be removed from the Depends field in the package configuration and moved to either the Recommends or Suggests field, so that openjdk is not downloaded when not necessary, possibly interfering with a pre-installed jre.
[jira] [Commented] (CASSANDRA-3979) Consider providing error code with exceptions (and documenting them)
[ https://issues.apache.org/jira/browse/CASSANDRA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444622#comment-13444622 ] Jonathan Ellis commented on CASSANDRA-3979: --- bq. maybe that would break super-fragile clients that depend on exact error messages That seems like a reasonable risk to take. Consider providing error code with exceptions (and documenting them) Key: CASSANDRA-3979 URL: https://issues.apache.org/jira/browse/CASSANDRA-3979 Project: Cassandra Issue Type: Sub-task Components: API Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: cql3 Fix For: 1.2.0 beta 1 It could be a good idea to assign documented error codes to the different exceptions raised. Currently, one may have to parse the exception string (say if one wants to know if its 'create keyspace' failed because the keyspace already exists versus some other kind of exception), which means we cannot improve the error message without risking breaking client code. Adding documented error codes with the message would avoid this.
[jira] [Commented] (CASSANDRA-4480) Binary protocol: adds events push
[ https://issues.apache.org/jira/browse/CASSANDRA-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444642#comment-13444642 ] paul cannon commented on CASSANDRA-4480: I don't think I like the split of 'control' and 'data' modes. Why can't the client library relegate the REGISTER/EVENT messages to a single connection, if the user wants it that way? This seems like it adds needless complexity on both sides. In the very worst case, if we remove the restriction, a few dozen connections get an extra ~40-byte message once per (rare) topology/node-status change when only one would have sufficed. Is that really that much of a problem? Binary protocol: adds events push -- Key: CASSANDRA-4480 URL: https://issues.apache.org/jira/browse/CASSANDRA-4480 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.2.0 Attachments: 4480.txt Clients need to know about a number of cluster changes (new/removed nodes typically) to function properly. With the binary protocol we could start pushing such events to the clients directly. The basic idea would be that a client would register for a number of events and would then receive notifications when those happened. I could see at least the following events being useful to clients: * Addition and removal of nodes * Schema changes (otherwise clients would have to pull the schema all the time to know that, say, a new column has been added) * Node up/down events (down events might not be too useful, but up events could be helpful). The main problem I can see with that is that we want to make it clear that clients are supposed to register for events on only one or two of their connections (total, not per-host), otherwise it'll be just flooding. One solution to make it much more unlikely that this happens could be to distinguish two kinds of connections: Data and Control (could just be a simple flag with the startup message for instance).
Data connections would not allow registering for events and Control ones would allow it but wouldn't allow queries. I.e. clients would have to dedicate a connection to those events, but that's likely the only sane way to do it anyway.
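The Data/Control proposal from the issue description can be sketched as a tiny server-side registry. This is only an illustration under stated assumptions: the names `EventPushSketch`, `Mode`, `EventType`, and `Connection` are hypothetical and do not come from the attached 4480.txt patch, and events are modelled as plain strings rather than protocol frames.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry that pushes cluster events to registered connections. */
final class EventPushSketch
{
    // Assumed event categories, mirroring the three bullet points in the issue.
    enum EventType { TOPOLOGY, STATUS, SCHEMA }

    // Assumed flag sent with the startup message.
    enum Mode { DATA, CONTROL }

    static final class Connection
    {
        final Mode mode;
        final List<String> received = new ArrayList<>();

        Connection(Mode mode)
        {
            this.mode = mode;
        }
    }

    private final Map<EventType, List<Connection>> subscribers = new EnumMap<>(EventType.class);

    /** Only Control connections may register; Data connections are rejected. */
    void register(Connection connection, EventType type)
    {
        if (connection.mode != Mode.CONTROL)
            throw new IllegalStateException("only control connections may register for events");
        subscribers.computeIfAbsent(type, t -> new ArrayList<>()).add(connection);
    }

    /** Delivers the event to every registered connection and returns how many got it. */
    int push(EventType type, String payload)
    {
        List<Connection> targets = subscribers.getOrDefault(type, Collections.<Connection>emptyList());
        for (Connection connection : targets)
            connection.received.add(type + ":" + payload);
        return targets.size();
    }
}
```

Dropping the mode check in `register` gives the alternative argued for in the comment: any connection may register, at the cost of one extra small message per registered connection on each event.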
[jira] [Commented] (CASSANDRA-4292) Improve JBOD loadbalancing and reduce contention
[ https://issues.apache.org/jira/browse/CASSANDRA-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444641#comment-13444641 ] Dave Brosius commented on CASSANDRA-4292: - +1 patch LGTM Improve JBOD loadbalancing and reduce contention Key: CASSANDRA-4292 URL: https://issues.apache.org/jira/browse/CASSANDRA-4292 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Yuki Morishita Fix For: 1.2.0 beta 1 Attachments: 0001-Fix-writing-sstables-to-wrong-directory-when-compact.patch, 4292.txt, 4292-v2.txt, 4292-v3.txt, 4292-v4.txt As noted in CASSANDRA-809, we have a certain amount of flush (and compaction) threads, which mix and match disk volumes indiscriminately. It may be worth creating a tight thread -> disk affinity, to prevent unnecessary conflict at that level. OTOH as SSDs become more prevalent this becomes a non-issue. Unclear how much pain this actually causes in practice in the meantime.
[jira] [Created] (CASSANDRA-4586) composite indexes do a linear search on all SecondaryIndex objects for any update
Jonathan Ellis created CASSANDRA-4586: - Summary: composite indexes do a linear search on all SecondaryIndex objects for any update Key: CASSANDRA-4586 URL: https://issues.apache.org/jira/browse/CASSANDRA-4586 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Sylvain Lebresne not much point in having a Map if we can't use it.
[jira] [Commented] (CASSANDRA-4586) composite indexes do a linear search on all SecondaryIndex objects for any update
[ https://issues.apache.org/jira/browse/CASSANDRA-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444665#comment-13444665 ] Jonathan Ellis commented on CASSANDRA-4586: --- seems like this would be more straightforward if we pulled out the cql3 name from the composite first, then the IndexManager could do a Map lookup again. composite indexes do a linear search on all SecondaryIndex objects for any update - Key: CASSANDRA-4586 URL: https://issues.apache.org/jira/browse/CASSANDRA-4586 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Sylvain Lebresne not much point in having a Map if we can't use it.
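The difference between the two lookup strategies discussed here can be shown with a toy model. In this sketch a composite column name is modelled as a string of the form `clustering:cql3name`, which is an assumption made for readability; the real code works on ByteBuffers and composite types. The class `IndexLookupSketch` and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical index registry comparing a linear scan with a keyed lookup. */
final class IndexLookupSketch
{
    static final class Index
    {
        final String columnName;

        Index(String columnName)
        {
            this.columnName = columnName;
        }

        /** Each index decides for itself whether it covers a composite name. */
        boolean indexes(String compositeName)
        {
            return lastComponent(compositeName).equals(columnName);
        }
    }

    private final Map<String, Index> byColumn = new HashMap<>();

    /** Assumed encoding: the CQL3 column name is the part after the last ':'. */
    static String lastComponent(String compositeName)
    {
        return compositeName.substring(compositeName.lastIndexOf(':') + 1);
    }

    void add(Index index)
    {
        byColumn.put(index.columnName, index);
    }

    /** O(number of indexes): asks every index in turn, ignoring the Map keys. */
    Index linearSearch(String compositeName)
    {
        for (Index index : byColumn.values())
            if (index.indexes(compositeName))
                return index;
        return null;
    }

    /** O(1) expected: extract the CQL3 name first, then use the Map as a Map. */
    Index mapLookup(String compositeName)
    {
        return byColumn.get(lastComponent(compositeName));
    }
}
```

Both methods return the same index for a given name; the second simply does the name extraction once, up front, so the Map can be used for what it is good at.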
[jira] [Updated] (CASSANDRA-2897) Secondary indexes without read-before-write
[ https://issues.apache.org/jira/browse/CASSANDRA-2897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2897: -- Attachment: 2897-v4.txt v4 pushes *all* index updates into the helper closure, renamed to SecondaryIndexManager.Updater. This cleans up Table.apply even more (no more looping to create a redundant Map of updated columns), and allows index maintenance during compaction relatively cleanly -- this is added for the first time here. I note, for the record, that composite indexes make my head hurt (CASSANDRA-4586). I further note that finding the wrong column value being used to create dummyColumn in the index-stale block was a *bitch*. Not sure how your new tests passed with that. Two bugs cancelling out, I guess. (Similarly, dummyColumn needed to be introduced in KeysSearcher since just using the index column is wrong even for non-composites, since delete expects a base-data column.) I await news of the new bugs I've introduced. :) Secondary indexes without read-before-write --- Key: CASSANDRA-2897 URL: https://issues.apache.org/jira/browse/CASSANDRA-2897 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.7.0 Reporter: Sylvain Lebresne Assignee: Sam Tunnicliffe Priority: Minor Labels: secondary_index Fix For: 1.2.0 beta 1 Attachments: 0001-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 0002-CASSANDRA-2897-Secondary-indexes-without-read-before-w.txt, 0003-CASSANDRA-2897.txt, 2897-apply-cleanup.txt, 2897-v4.txt, 41ec9fc-2897.txt Currently, secondary index updates require a read-before-write to maintain the index consistency. Keeping the index consistent at all times is not necessary however. We could let the (secondary) index get inconsistent on writes and repair those on reads. This would be easy because on reads, we make sure to request the indexed columns anyway, so we can just skip the rows that are not needed and repair the index at the same time.
This does trade work on writes for work on reads. However, read-before-write is sufficiently costly that it will likely be a win overall. There are (at least) two small technical difficulties here though: # If we repair on read, this will be racy with writes, so we'll probably have to synchronize there. # We probably shouldn't rely only on reads to repair, and we should also have a task to repair the index for things that are rarely read. It's unclear how to make that low impact though.
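The idea in the issue description, letting the index go stale on writes and repairing it on reads, can be illustrated with an in-memory toy. This is a sketch of the general technique only, not the attached patches: the class `LazyIndexSketch`, its one-column data model, and the use of `synchronized` to sidestep the read/write race mentioned above are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical single-column table with a lazily repaired secondary index. */
final class LazyIndexSketch
{
    private final Map<String, String> base = new HashMap<>();       // row key -> value
    private final Map<String, Set<String>> index = new HashMap<>(); // value -> row keys

    /** Write path: no lookup of the old value, so old index entries go stale. */
    synchronized void write(String rowKey, String value)
    {
        base.put(rowKey, value);
        index.computeIfAbsent(value, v -> new HashSet<>()).add(rowKey);
    }

    /** Read path: check each candidate against the base data and drop stale entries. */
    synchronized List<String> query(String value)
    {
        List<String> hits = new ArrayList<>();
        Set<String> candidates = index.getOrDefault(value, Collections.<String>emptySet());
        Iterator<String> it = candidates.iterator();
        while (it.hasNext())
        {
            String rowKey = it.next();
            if (value.equals(base.get(rowKey)))
                hits.add(rowKey);
            else
                it.remove(); // repair: this entry points at a row that has since changed
        }
        return hits;
    }

    /** Number of index entries currently stored for a value, stale ones included. */
    synchronized int indexEntries(String value)
    {
        Set<String> entries = index.get(value);
        return entries == null ? 0 : entries.size();
    }
}
```

A value that is never queried keeps its stale entries forever in this toy, which is the second difficulty listed in the description: something other than reads has to clean up rarely read data.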
[jira] [Resolved] (CASSANDRA-1123) Allow tracing query details
[ https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-1123. --- Resolution: Fixed committed. (turns out I didn't push the revert earlier, so I just left that out when I did push.) Allow tracing query details --- Key: CASSANDRA-1123 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jonathan Ellis Assignee: David Alves Fix For: 1.2.0 beta 1 Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch, 1123.patch, 1123.patch, 1123-v6.txt, 1123-v7.patch, 1123-v8.patch, 1123-v9.patch, 1123-v9.txt, 1123-v9.txt In the spirit of CASSANDRA-511, it would be useful to have tracing on queries to see where latency is coming from: how long did row cache lookup take? key search in the index? merging the data from the sstables? etc. The main difference vs setting debug logging is that debug logging is too big of a hammer; by turning on the flood of logging for everyone, you actually distort the information you're looking for. This would be something you could set per-query (or more likely per connection). We don't need to be as sophisticated as the techniques discussed in the following papers but they are interesting reading: http://research.google.com/pubs/pub36356.html http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/ http://www.usenix.org/event/nsdi07/tech/fonseca.html
[jira] [Commented] (CASSANDRA-4586) composite indexes do a linear search on all SecondaryIndex objects for any update
[ https://issues.apache.org/jira/browse/CASSANDRA-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444706#comment-13444706 ] Jonathan Ellis commented on CASSANDRA-4586: --- yes, composites caused a world of pain for CASSANDRA-2897. :) composite indexes do a linear search on all SecondaryIndex objects for any update - Key: CASSANDRA-4586 URL: https://issues.apache.org/jira/browse/CASSANDRA-4586 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Sylvain Lebresne not much point in having a Map if we can't use it.
[jira] [Commented] (CASSANDRA-4586) composite indexes do a linear search on all SecondaryIndex objects for any update
[ https://issues.apache.org/jira/browse/CASSANDRA-4586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444705#comment-13444705 ] Jonathan Ellis commented on CASSANDRA-4586: --- also on my hit list: having to be super careful to call makeIndexColumnName in the right places, with no type safety since it's BB in BB out. composite indexes do a linear search on all SecondaryIndex objects for any update - Key: CASSANDRA-4586 URL: https://issues.apache.org/jira/browse/CASSANDRA-4586 Project: Cassandra Issue Type: Bug Reporter: Jonathan Ellis Assignee: Sylvain Lebresne not much point in having a Map if we can't use it.