[jira] [Commented] (CASSANDRA-7020) Incorrect result of query WHERE token(key) < -9223372036854775808 when using Murmur3Partitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133637#comment-14133637 ]

Piotr Kołaczkowski commented on CASSANDRA-7020:
-----------------------------------------------

{quote}
-9223372036854775808 is min value for sstable scanned so putting it in where clause of select statement behaves as select * from test
{quote}

And this is illogical and surprising behavior, because the tokens of the returned rows do not satisfy the condition in the query. No token is ever smaller than the min token, therefore the only correct answer here is an empty row set. Nevertheless, if it were to wrap around, it would need to wrap around consistently in all cases, not just for the min token. So actually any query not restricting the token range from both sides should return all rows. Actually I like the idea of never wrapping around, because it doesn't make token comparisons special and treats them just as any other integer comparison.


Incorrect result of query WHERE token(key) < -9223372036854775808 when using Murmur3Partitioner
------------------------------------------------------------------------------------------------

    Key: CASSANDRA-7020
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7020
    Project: Cassandra
    Issue Type: Bug
    Environment: cassandra 2.0.6-snapshot
    Reporter: Piotr Kołaczkowski
    Assignee: Marko Denda

{noformat}
cqlsh:test1> select * from test where token(key) < -9223372036854775807;

(0 rows)

cqlsh:test1> select * from test where token(key) < -9223372036854775808;

 key | value
-----+-------
   5 |    ee
  10 |     j
   1 |
   8 |
   2 |   bbb
   4 |    dd
   7 |
   6 |   fff
   9 |
   3 |     c
{noformat}

Expected: empty result.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7020) Incorrect result of query WHERE token(key) < -9223372036854775808 when using Murmur3Partitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133637#comment-14133637 ]

Piotr Kołaczkowski edited comment on CASSANDRA-7020 at 9/15/14 7:07 AM:
------------------------------------------------------------------------

{quote}
-9223372036854775808 is min value for sstable scanned so putting it in where clause of select statement behaves as select * from test
{quote}

And this is illogical and surprising behavior, because the tokens of the returned rows do not satisfy the condition in the query. No token is ever smaller than the min token, therefore the only correct answer here is an empty row set. Nevertheless, if it were to wrap around, it would need to wrap around consistently in all cases, not just for the min token. So actually any query not restricting the token range from both sides should return all rows (it doesn't work like this now - the case described in the ticket is the only case in which token comparison wraps around). Actually I like the idea of never wrapping around in CQL, because it doesn't make token comparisons special and treats them just as any other integer comparison. Integer comparison is a pretty well-defined concept in mathematics - I see no reason it should work differently for tokens in Cassandra.


was (Author: pkolaczk):
{quote}
-9223372036854775808 is min value for sstable scanned so putting it in where clause of select statement behaves as select * from test
{quote}

And this is illogical and surprising behavior, because the tokens of the returned rows do not satisfy the condition in the query. No token is ever smaller than the min token, therefore the only correct answer here is an empty row set. Nevertheless, if it were to wrap around, it would need to wrap around consistently in all cases, not just for the min token. So actually any query not restricting the token range from both sides should return all rows. Actually I like the idea of never wrapping around, because it doesn't make token comparisons special and treats them just as any other integer comparison.


Incorrect result of query WHERE token(key) < -9223372036854775808 when using Murmur3Partitioner
------------------------------------------------------------------------------------------------

    Key: CASSANDRA-7020
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7020
    Project: Cassandra
    Issue Type: Bug
    Environment: cassandra 2.0.6-snapshot
    Reporter: Piotr Kołaczkowski
    Assignee: Marko Denda

{noformat}
cqlsh:test1> select * from test where token(key) < -9223372036854775807;

(0 rows)

cqlsh:test1> select * from test where token(key) < -9223372036854775808;

 key | value
-----+-------
   5 |    ee
  10 |     j
   1 |
   8 |
   2 |   bbb
   4 |    dd
   7 |
   6 |   fff
   9 |
   3 |     c
{noformat}

Expected: empty result.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
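[Editor's aside on the semantics argued above: the "never wrap around" model means the ring can be scanned as plain long ranges. Below is a minimal Java sketch, not taken from the ticket or any driver, of splitting the full Murmur3 token space into non-wrapping half-open (start, end] intervals, where each interval is queried with ordinary integer comparisons.]

{code}
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class TokenRanges {
    // Murmur3Partitioner tokens are longs; Long.MIN_VALUE is the reserved
    // minimum token that no key ever hashes to.
    static List<long[]> split(int n) {
        List<long[]> ranges = new ArrayList<>();
        BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
        BigInteger span = BigInteger.ONE.shiftLeft(64); // 2^64 long values in total
        long start = Long.MIN_VALUE;
        for (int i = 1; i <= n; i++) {
            // BigInteger avoids long overflow when interpolating across the ring.
            long end = (i == n)
                     ? Long.MAX_VALUE
                     : min.add(span.multiply(BigInteger.valueOf(i))
                                   .divide(BigInteger.valueOf(n))).longValueExact();
            ranges.add(new long[]{ start, end });
            start = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // The union of the half-open intervals covers every possible token
        // exactly once, with no wrap-around special case.
        for (long[] r : split(4))
            System.out.printf("SELECT * FROM test WHERE token(key) > %d AND token(key) <= %d;%n",
                              r[0], r[1]);
    }
}
{code}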
[jira] [Commented] (CASSANDRA-7724) Native-Transport threads get stuck in StorageProxy.preparePaxos with no one making progress
[ https://issues.apache.org/jira/browse/CASSANDRA-7724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133646#comment-14133646 ]

Anton Lebedevich commented on CASSANDRA-7724:
----------------------------------------------

That was a single node, so at least one of the threads should have been moving forward. But a thread dump is not expected to show all threads at exactly the same point in time, so it could have missed that single thread making progress. I dumped threads several times with the same results (all threads waiting).


Native-Transport threads get stuck in StorageProxy.preparePaxos with no one making progress
--------------------------------------------------------------------------------------------

    Key: CASSANDRA-7724
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7724
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: Linux 3.13.11-4 #4 SMP PREEMPT x86_64 Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz GenuineIntel
                 java version "1.8.0_05"
                 Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
                 Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
                 cassandra 2.0.9
    Reporter: Anton Lebedevich
    Attachments: cassandra.threads2

We've got a lot of write timeouts (cas) when running

INSERT INTO cas_demo(pri_id, sec_id, flag, something) VALUES(?, ?, ?, ?) IF NOT EXISTS

from 16 connections in parallel, using the same pri_id and different sec_id. Doing the same from 4 connections in parallel works ok. All configuration values are at their default values.

CREATE TABLE cas_demo (
    pri_id varchar,
    sec_id varchar,
    flag boolean,
    something set<varchar>,
    PRIMARY KEY (pri_id, sec_id)
);

CREATE INDEX cas_demo_flag ON cas_demo(flag);

Full thread dump is attached.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements
[ https://issues.apache.org/jira/browse/CASSANDRA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133692#comment-14133692 ]

Oded Peer commented on CASSANDRA-7304:
---------------------------------------

Pinging for code review.


Ability to distinguish between NULL and UNSET values in Prepared Statements
-----------------------------------------------------------------------------

    Key: CASSANDRA-7304
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7304
    Project: Cassandra
    Issue Type: Improvement
    Reporter: Drew Kutcharian
    Labels: cql, protocolv4
    Fix For: 3.0
    Attachments: 7304-2.patch, 7304.patch

Currently Cassandra inserts tombstones when a value of a column is bound to NULL in a prepared statement. At higher insert rates managing all these tombstones becomes an unnecessary overhead. This limits the usefulness of prepared statements, since developers have to either create multiple prepared statements (each with a different combination of column names, which at times is just unfeasible because of the sheer number of possible combinations) or fall back to using regular (non-prepared) statements.

This JIRA is here to explore the possibility of either:
A. Have a flag on prepared statements that, once set, tells Cassandra to ignore null columns
B. Have an UNSET value which makes Cassandra skip the null columns and not tombstone them

Basically, in the context of a prepared statement, a null value means delete, but we don't have anything that means ignore (besides creating a new prepared statement without the ignored column).

Please refer to the original conversation on the DataStax Java Driver mailing list for more background:
https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/cHE3OOSIXBU/discussion

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
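[Editor's aside to make the combinatorial problem in the description concrete: a hedged sketch of the workaround of preparing one statement per combination of non-null columns. The Session/PreparedStatement names follow the DataStax Java driver; the cache itself and the keyspace/table names are hypothetical. With N nullable columns the cache can grow to 2^N entries, which is exactly why the ticket calls this unfeasible.]

{code}
import java.util.Map;
import java.util.SortedMap;
import java.util.concurrent.ConcurrentHashMap;

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class PerCombinationCache {
    private final Session session;
    // Keyed by the exact set of columns being written; worst case 2^N entries.
    private final Map<String, PreparedStatement> cache = new ConcurrentHashMap<>();

    PerCombinationCache(Session session) { this.session = session; }

    PreparedStatement forColumns(SortedMap<String, Object> nonNullValues) {
        String cols = String.join(",", nonNullValues.keySet());
        return cache.computeIfAbsent(cols, k -> {
            String marks = k.replaceAll("[^,]+", "?"); // one bind marker per column
            return session.prepare("INSERT INTO ks.tbl (" + k + ") VALUES (" + marks + ")");
        });
    }
}
{code}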
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133715#comment-14133715 ]

Cyril Scetbon commented on CASSANDRA-7731:
-------------------------------------------

[~snazy] Your link exponentially-decaying-reservoirs is outdated. It seems that the project has been removed from github, or maybe renamed.

My tests are showing that the maximum value collected can persist for far more than 5 minutes. In the following example, I'm executing one CQL query that scans 2 tombstones and after that 1 CQL query each second that scans 0 tombstones. After more than 1300 queries, I still have the same max value. When I check the list of values, it doesn't seem to change, even if the mean changes.

{code}
val 545 = 0.0
val 546 = 2.0
count = 1330
max = 2.0
pmax = 2.0
mean = 0.0015037593984962407
min = 0.0
Median = 0.0
99p = 0.0
{code}

So even if the mean is calculated correctly, I can't understand why the max value is still the same after 20 minutes of queries scanning 0 tombstones. I have to confess that after 30 minutes, I get the expected behavior:

{code}
val 142 = 0.0
val 143 = 0.0
val 144 = 0.0
count = 1473
max = 2.0
pmax = 0.0
mean = 0.0013577732518669382
min = 0.0
Median = 0.0
99p = 0.0
{code}

However, I need to be sure that the problem is solved, and that it lasts only 5 minutes and not 30 minutes...


Get max values for live/tombstone cells per slice
---------------------------------------------------

    Key: CASSANDRA-7731
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7731
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Cyril Scetbon
    Assignee: Robert Stupp
    Priority: Minor
    Fix For: 2.1.1
    Attachments: 7731-2.0.txt, 7731-2.1.txt

I think you should not say that slice statistics are valid for the [last five minutes|https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in the CFSTATS command of nodetool. I've read the documentation from yammer for Histograms, and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it. The only thing I can see is that the last snapshot used to provide the median (or whatever you'd use instead) value is based on 1028 values.

I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query, and that's not possible for now without activating DEBUG for SliceQueryFilter, for example, and tweaking the threshold. Currently, since nodetool cfstats returns the median, if only a small fraction of the queries are scanning a lot of live/tombstone cells, we miss it!

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
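[Editor's aside: the sticky max is consistent with how metrics-core 2.x (bundled with these Cassandra versions) computes it. Histogram.max() is an all-time high-water mark that only clear() resets, while the median/percentiles come from a 1028-slot exponentially decaying reservoir whose old samples are evicted probabilistically, not on a fixed 5-minute schedule - which would explain both "max" staying at 2.0 forever and "pmax" dropping only after ~30 minutes. A sketch of the difference, with API names from yammer metrics 2.x recalled from memory, so treat signatures as approximate:]

{code}
import com.yammer.metrics.Metrics;
import com.yammer.metrics.core.Histogram;
import com.yammer.metrics.stats.Snapshot;

public class MaxDemo {
    public static void main(String[] args) {
        // biased = true -> exponentially decaying reservoir of 1028 samples
        Histogram h = Metrics.newHistogram(MaxDemo.class, "tombstones-per-slice", true);

        h.update(2);                  // the one query that scanned 2 tombstones
        for (int i = 0; i < 1330; i++)
            h.update(0);              // subsequent queries scanning 0 tombstones

        Snapshot s = h.getSnapshot(); // sorted reservoir contents
        double snapshotMax = s.getValues()[s.size() - 1]; // drops once 2 is evicted

        System.out.println("all-time max = " + h.max()); // stays 2.0 until clear()
        System.out.println("snapshot max = " + snapshotMax);
    }
}
{code}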
[jira] [Updated] (CASSANDRA-7724) Native-Transport threads get stuck in StorageProxy.preparePaxos with no one making progress
[ https://issues.apache.org/jira/browse/CASSANDRA-7724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-7724:
--------------------------------
    Attachment: aggregateddump.txt

Attached is your thread dump, reformatted to be easier to digest. Looking at it, there appears to be an incoming read on the IncomingTcpConnection, which is quite likely one of the prepare requests or responses being processed by the node, although it's hard to say for certain. There are multiple threads involved - the native transport requests start the work, however the mutation stage processes the paxos messages on receipt, and the incoming/outbound tcp connections deliver those messages. It's still a bit funny that you can never see these threads live, though, in any of your dumps, and I would be interested in getting a few more to double check. Either way, this raises the sensible prospect of optimising cas when RF=1.


Native-Transport threads get stuck in StorageProxy.preparePaxos with no one making progress
--------------------------------------------------------------------------------------------

    Key: CASSANDRA-7724
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7724
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: Linux 3.13.11-4 #4 SMP PREEMPT x86_64 Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz GenuineIntel
                 java version "1.8.0_05"
                 Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
                 Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
                 cassandra 2.0.9
    Reporter: Anton Lebedevich
    Attachments: aggregateddump.txt, cassandra.threads2

We've got a lot of write timeouts (cas) when running

INSERT INTO cas_demo(pri_id, sec_id, flag, something) VALUES(?, ?, ?, ?) IF NOT EXISTS

from 16 connections in parallel, using the same pri_id and different sec_id. Doing the same from 4 connections in parallel works ok. All configuration values are at their default values.

CREATE TABLE cas_demo (
    pri_id varchar,
    sec_id varchar,
    flag boolean,
    something set<varchar>,
    PRIMARY KEY (pri_id, sec_id)
);

CREATE INDEX cas_demo_flag ON cas_demo(flag);

Full thread dump is attached.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Created] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes
Piotr Kołaczkowski created CASSANDRA-7937:
---------------------------------------------

    Summary: Apply backpressure gently when overloaded with writes
    Key: CASSANDRA-7937
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: Cassandra 2.0
    Reporter: Piotr Kołaczkowski

When writing huge amounts of data into a C* cluster from analytic tools like Hadoop or Apache Spark, we can see that C* often can't keep up with the load. This is because analytic tools typically write data as fast as they can, in parallel, from many nodes, and they are not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of nodes doesn't really help, because in a collocated setup this also increases the number of Hadoop/Spark nodes (writers), and although the possible write performance is higher, the problem still remains.

We observe the following behavior:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. the available memory limit for memtables is reached and writes are no longer accepted
3. the application gets hit by write timeouts, and retries repeatedly, in vain
4. after several failed attempts to write, the job gets aborted

Desired behaviour:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. after exceeding some memtable fill threshold, C* applies rate limiting to writes - the more the buffers are filled up, the fewer writes/s are accepted, however writes still complete within the write timeout
3. thanks to the slowed-down data ingestion, flush can now happen before all the memory gets used

Of course the details of how rate limiting could be done are up for discussion. It may also be worth considering putting such logic into the driver, not the C* core, but then C* needs to expose at least the following information to the driver, so we could calculate the desired maximum data rate:
1. the current amount of memory available for writes before they would completely block
2. the total amount of data queued to be flushed and the flush progress (the amount of data remaining to flush for the memtable currently being flushed)
3. the average flush write speed

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
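[Editor's aside illustrating the requested behavior: purely a sketch, nothing like this exists in C* 2.0, and MAX_WRITES_PER_SEC / onMemtableOccupancyChanged below are made-up names. Write admission could be throttled through a rate limiter whose rate shrinks as memtable occupancy grows; Guava's RateLimiter, already bundled with Cassandra, would be a plausible building block.]

{code}
import com.google.common.util.concurrent.RateLimiter;

public class GentleBackpressure {
    private static final double MAX_WRITES_PER_SEC = 100_000; // hypothetical unthrottled ceiling
    private static final double THRESHOLD = 0.5;              // start throttling at 50% fill

    private final RateLimiter limiter = RateLimiter.create(MAX_WRITES_PER_SEC);

    /** @param occupancy fraction of the memtable memory limit currently in use, in [0, 1] */
    void onMemtableOccupancyChanged(double occupancy) {
        if (occupancy <= THRESHOLD) {
            limiter.setRate(MAX_WRITES_PER_SEC);
        } else {
            // The fuller the buffers, the fewer writes/s are accepted, down to a
            // slow drip, so each accepted write still fits in the write timeout.
            double remaining = (1.0 - occupancy) / (1.0 - THRESHOLD); // 1 at threshold, 0 when full
            limiter.setRate(Math.max(100.0, MAX_WRITES_PER_SEC * remaining));
        }
    }

    /** Called on the write path before appending to the memtable. */
    void admitWrite() {
        limiter.acquire(); // blocks briefly instead of letting the write time out later
    }
}
{code}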
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133844#comment-14133844 ]

Duncan Sands commented on CASSANDRA-7904:
------------------------------------------

With request_timeout_in_ms increased to 10 in cassandra.yaml on all nodes, repair completed successfully this weekend.


Repair hangs
------------

    Key: CASSANDRA-7904
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7904
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version "1.7.0_45"
    Reporter: Duncan Sands
    Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136

Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node.

Repair hangs for some of the nodes almost every weekend. It hung last weekend; here are the details.

In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly.

tpstats:
4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The nodes are:
192.168.21.13 (data centre R)
192.168.60.134 (data centre A)
192.168.60.136 (data centre A)
172.18.68.138 (data centre Z)

compactionstats:
No compactions. All nodes have:
pending tasks: 0
Active compaction remaining time: n/a

netstats:
All except one node have nothing. One node (192.168.60.131, not one of the nodes listed in the tpstats section above) has (note the Responses Pending value of 1):
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 4233
Mismatch (Blocking): 0
Mismatch (Background): 243
Pool Name     Active   Pending      Completed
Commands         n/a         0       34785445
Responses        n/a         1       38567167

Repair sessions:
I looked for repair sessions that failed to complete. On 3 of the 4 nodes mentioned in tpstats above, I found that they had sent merkle tree requests and got responses from all but one node. In the log file for the node that failed to respond, there is no sign that it ever received the request. On 1 node (172.18.68.138) it looks like responses were received from every node, some streaming was done, and then... nothing. Details:

Node 192.168.21.13 (data centre R): Sent merkle trees to /172.18.33.24, /192.168.60.140, /192.168.60.142, /172.18.68.139, /172.18.68.138, /172.18.33.22, /192.168.21.13 for table brokers, never got a response from /172.18.68.139. On /172.18.68.139, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table brokers.

Node 192.168.60.134 (data centre A): Sent merkle trees to /172.18.68.139, /172.18.68.138, /192.168.60.132, /192.168.21.14, /192.168.60.134 for table swxess_outbound, never got a response from /172.18.68.138. On /172.18.68.138, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table swxess_outbound.

Node 192.168.60.136 (data centre A): Sent merkle trees to /192.168.60.142, /172.18.68.139, /192.168.60.136 for table rollups7200, never got a response from /172.18.68.139. This repair session is never mentioned in the /172.18.68.139 log.

Node 172.18.68.138 (data centre Z): The issue here seems to be repair session #a55c16e1-35eb-11e4-8e7e-51c077eaf311. It got responses for all its merkle tree requests, did some streaming, but seems to have stopped after finishing with one table (rollups60). I found it as follows: it is the only repair for which there is no "session completed successfully" message in the log.

Some log file snippets are attached.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7131) Add command line option for cqlshrc file path
[ https://issues.apache.org/jira/browse/CASSANDRA-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-7131:
----------------------------------------
    Reviewer: Mikhail Stepura


Add command line option for cqlshrc file path
-----------------------------------------------

    Key: CASSANDRA-7131
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7131
    Project: Cassandra
    Issue Type: New Feature
    Components: Tools
    Reporter: Jeremiah Jordan
    Priority: Trivial
    Labels: cqlsh, lhf
    Attachments: CASSANDRA-2.1.1-7131.txt

It would be nice if you could specify the cqlshrc file location on the command line, so you don't have to jump through hoops when running it from a service user or something.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes
[ https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict updated CASSANDRA-7937:
--------------------------------
    Labels: performance  (was: )


Apply backpressure gently when overloaded with writes
-------------------------------------------------------

    Key: CASSANDRA-7937
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: Cassandra 2.0
    Reporter: Piotr Kołaczkowski
    Labels: performance

When writing huge amounts of data into a C* cluster from analytic tools like Hadoop or Apache Spark, we can see that C* often can't keep up with the load. This is because analytic tools typically write data as fast as they can, in parallel, from many nodes, and they are not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of nodes doesn't really help, because in a collocated setup this also increases the number of Hadoop/Spark nodes (writers), and although the possible write performance is higher, the problem still remains.

We observe the following behavior:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. the available memory limit for memtables is reached and writes are no longer accepted
3. the application gets hit by write timeouts, and retries repeatedly, in vain
4. after several failed attempts to write, the job gets aborted

Desired behaviour:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. after exceeding some memtable fill threshold, C* applies rate limiting to writes - the more the buffers are filled up, the fewer writes/s are accepted, however writes still complete within the write timeout
3. thanks to the slowed-down data ingestion, flush can now happen before all the memory gets used

Of course the details of how rate limiting could be done are up for discussion. It may also be worth considering putting such logic into the driver, not the C* core, but then C* needs to expose at least the following information to the driver, so we could calculate the desired maximum data rate:
1. the current amount of memory available for writes before they would completely block
2. the total amount of data queued to be flushed and the flush progress (the amount of data remaining to flush for the memtable currently being flushed)
3. the average flush write speed

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes
[ https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133859#comment-14133859 ]

Brandon Williams commented on CASSANDRA-7937:
----------------------------------------------

One easy way around this is to use the BulkOutputFormat, so it doesn't go through the write path.


Apply backpressure gently when overloaded with writes
-------------------------------------------------------

    Key: CASSANDRA-7937
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: Cassandra 2.0
    Reporter: Piotr Kołaczkowski
    Labels: performance

When writing huge amounts of data into a C* cluster from analytic tools like Hadoop or Apache Spark, we can see that C* often can't keep up with the load. This is because analytic tools typically write data as fast as they can, in parallel, from many nodes, and they are not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of nodes doesn't really help, because in a collocated setup this also increases the number of Hadoop/Spark nodes (writers), and although the possible write performance is higher, the problem still remains.

We observe the following behavior:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. the available memory limit for memtables is reached and writes are no longer accepted
3. the application gets hit by write timeouts, and retries repeatedly, in vain
4. after several failed attempts to write, the job gets aborted

Desired behaviour:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. after exceeding some memtable fill threshold, C* applies rate limiting to writes - the more the buffers are filled up, the fewer writes/s are accepted, however writes still complete within the write timeout
3. thanks to the slowed-down data ingestion, flush can now happen before all the memory gets used

Of course the details of how rate limiting could be done are up for discussion. It may also be worth considering putting such logic into the driver, not the C* core, but then C* needs to expose at least the following information to the driver, so we could calculate the desired maximum data rate:
1. the current amount of memory available for writes before they would completely block
2. the total amount of data queued to be flushed and the flush progress (the amount of data remaining to flush for the memtable currently being flushed)
3. the average flush write speed

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes
[ https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133862#comment-14133862 ]

Benedict commented on CASSANDRA-7937:
--------------------------------------

This should certainly be dealt with by the cluster. We cannot rely on well behaved clients, and clients cannot easily calculate a safe data-rate across the cluster, so any client change would at best help direct writes only, which with RF>1 is not much help. Nor could it be as responsive.

My preferred solution to this is CASSANDRA-6812, which should keep the server responding to writes within the timeout window even as it blocks for lengthy flushes, but during these windows writes would be acked much more slowly, at a steady drip. This solution won't make it into 2.0 or 2.1, and possibly not even 3.0, though.


Apply backpressure gently when overloaded with writes
-------------------------------------------------------

    Key: CASSANDRA-7937
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: Cassandra 2.0
    Reporter: Piotr Kołaczkowski
    Labels: performance

When writing huge amounts of data into a C* cluster from analytic tools like Hadoop or Apache Spark, we can see that C* often can't keep up with the load. This is because analytic tools typically write data as fast as they can, in parallel, from many nodes, and they are not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of nodes doesn't really help, because in a collocated setup this also increases the number of Hadoop/Spark nodes (writers), and although the possible write performance is higher, the problem still remains.

We observe the following behavior:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. the available memory limit for memtables is reached and writes are no longer accepted
3. the application gets hit by write timeouts, and retries repeatedly, in vain
4. after several failed attempts to write, the job gets aborted

Desired behaviour:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. after exceeding some memtable fill threshold, C* applies rate limiting to writes - the more the buffers are filled up, the fewer writes/s are accepted, however writes still complete within the write timeout
3. thanks to the slowed-down data ingestion, flush can now happen before all the memory gets used

Of course the details of how rate limiting could be done are up for discussion. It may also be worth considering putting such logic into the driver, not the C* core, but then C* needs to expose at least the following information to the driver, so we could calculate the desired maximum data rate:
1. the current amount of memory available for writes before they would completely block
2. the total amount of data queued to be flushed and the flush progress (the amount of data remaining to flush for the memtable currently being flushed)
3. the average flush write speed

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes
[ https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133861#comment-14133861 ]

Piotr Kołaczkowski commented on CASSANDRA-7937:
------------------------------------------------

Indeed, but it is not yet supported in Spark, and not everyone uses it with Hive/Hadoop.


Apply backpressure gently when overloaded with writes
-------------------------------------------------------

    Key: CASSANDRA-7937
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: Cassandra 2.0
    Reporter: Piotr Kołaczkowski
    Labels: performance

When writing huge amounts of data into a C* cluster from analytic tools like Hadoop or Apache Spark, we can see that C* often can't keep up with the load. This is because analytic tools typically write data as fast as they can, in parallel, from many nodes, and they are not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of nodes doesn't really help, because in a collocated setup this also increases the number of Hadoop/Spark nodes (writers), and although the possible write performance is higher, the problem still remains.

We observe the following behavior:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. the available memory limit for memtables is reached and writes are no longer accepted
3. the application gets hit by write timeouts, and retries repeatedly, in vain
4. after several failed attempts to write, the job gets aborted

Desired behaviour:
1. data is ingested at an extremely fast pace into memtables and the flush queue fills up
2. after exceeding some memtable fill threshold, C* applies rate limiting to writes - the more the buffers are filled up, the fewer writes/s are accepted, however writes still complete within the write timeout
3. thanks to the slowed-down data ingestion, flush can now happen before all the memory gets used

Of course the details of how rate limiting could be done are up for discussion. It may also be worth considering putting such logic into the driver, not the C* core, but then C* needs to expose at least the following information to the driver, so we could calculate the desired maximum data rate:
1. the current amount of memory available for writes before they would completely block
2. the total amount of data queued to be flushed and the flush progress (the amount of data remaining to flush for the memtable currently being flushed)
3. the average flush write speed

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes
[ https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piotr Kołaczkowski updated CASSANDRA-7937:
--------------------------------------------
    Description:

When writing huge amounts of data into C* cluster from analytic tools like Hadoop or Apache Spark, we can see that often C* can't keep up with the load. This is because analytic tools typically write data as fast as they can in parallel, from many nodes and they are not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of nodes doesn't really help, because in a collocated setup this also increases number of Hadoop/Spark nodes (writers) and although possible write performance is higher, the problem still remains.

We observe the following behavior:
1. data is ingested at an extreme fast pace into memtables and flush queue fills up
2. the available memory limit for memtables is reached and writes are no longer accepted
3. the application gets hit by write timeout, and retries repeatedly, in vain
4. after several failed attempts to write, the job gets aborted

Desired behaviour:
1. data is ingested at an extreme fast pace into memtables and flush queue fills up
2. after exceeding some memtable fill threshold, C* applies adaptive rate limiting to writes - the more the buffers are filled-up, the less writes/s are accepted, however writes still occur within the write timeout.
3. thanks to slowed down data ingestion, now flush can finish before all the memory gets used

Of course the details how rate limiting could be done are up for a discussion. It may be also worth considering putting such logic into the driver, not C* core, but then C* needs to expose at least the following information to the driver, so we could calculate the desired maximum data rate:
1. current amount of memory available for writes before they would completely block
2. total amount of data queued to be flushed and flush progress (amount of data to flush remaining for the memtable currently being flushed)
3. average flush write speed

    was:

When writing huge amounts of data into C* cluster from analytic tools like Hadoop or Apache Spark, we can see that often C* can't keep up with the load. This is because analytic tools typically write data as fast as they can in parallel, from many nodes and they are not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of nodes doesn't really help, because in a collocated setup this also increases number of Hadoop/Spark nodes (writers) and although possible write performance is higher, the problem still remains.

We observe the following behavior:
1. data is ingested at an extreme fast pace into memtables and flush queue fills up
2. the available memory limit for memtables is reached and writes are no longer accepted
3. the application gets hit by write timeout, and retries repeatedly, in vain
4. after several failed attempts to write, the job gets aborted

Desired behaviour:
1. data is ingested at an extreme fast pace into memtables and flush queue fills up
2. after exceeding some memtable fill threshold, C* applies rate limiting to writes - the more the buffers are filled-up, the less writes/s are accepted, however writes still occur within the write timeout.
3. thanks to slowed down data ingestion, now flush can happen before all the memory gets used

Of course the details how rate limiting could be done are up for a discussion. It may be also worth considering putting such logic into the driver, not C* core, but then C* needs to expose at least the following information to the driver, so we could calculate the desired maximum data rate:
1. current amount of memory available for writes before they would completely block
2. total amount of data queued to be flushed and flush progress (amount of data to flush remaining for the memtable currently being flushed)
3. average flush write speed


Apply backpressure gently when overloaded with writes
-------------------------------------------------------

    Key: CASSANDRA-7937
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: Cassandra 2.0
    Reporter: Piotr Kołaczkowski
    Labels: performance

When writing huge amounts of data into C* cluster from analytic tools like Hadoop or Apache Spark, we can see that often C* can't keep up with the load. This is because analytic tools typically write data as fast as they can in parallel, from many nodes and they are not artificially rate-limited, so C* is the bottleneck here. Also, increasing the number of nodes doesn't really help, because in a collocated setup this also increases number of Hadoop/Spark nodes (writers) and although possible write performance is higher, the problem still remains.
[jira] [Commented] (CASSANDRA-3017) add a Message size limit
[ https://issues.apache.org/jira/browse/CASSANDRA-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133863#comment-14133863 ]

Benedict commented on CASSANDRA-3017:
--------------------------------------

This is definitely a good idea. At the same time, I think it might be worth considering introducing an upper limit on either the total size of requests we've currently got in flight for MessagingService, or the total number, or possibly both. Once the threshold is exceeded, we stop consuming input from all IncomingTcpConnection(s). This is not dramatically different from our imposition of a max rpc count, but it stops a single server being overloaded by a hotspot of queries driven by non-token-aware clients (and also by only a slight variant on the malicious oversized payload attack).


add a Message size limit
------------------------

    Key: CASSANDRA-3017
    URL: https://issues.apache.org/jira/browse/CASSANDRA-3017
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Jonathan Ellis
    Priority: Minor
    Labels: lhf
    Attachments: 0001-use-the-thrift-max-message-size-for-inter-node-messa.patch, trunk-3017.txt

We protect the server from allocating huge buffers for malformed messages with the Thrift frame size (CASSANDRA-475). But we don't have similar protection for the inter-node Message objects. Adding this would be good to deal with malicious adversaries as well as a malfunctioning cluster participant.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
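[Editor's aside sketching the in-flight cap idea from the comment. This is hypothetical code, not Cassandra's: a shared byte budget that connections draw from before deserializing a message, so that when the budget is exhausted the reader threads block and the TCP connections naturally stop consuming input.]

{code}
import java.util.concurrent.Semaphore;

public class InflightLimiter {
    // One permit per byte of in-flight inter-node request payload.
    private final Semaphore bytes;

    InflightLimiter(int maxInflightBytes) {
        bytes = new Semaphore(maxInflightBytes);
    }

    /** Called by a connection before deserializing a message of the given size;
     *  blocks (so the connection stops reading its socket) once the cap is hit. */
    void acquire(int messageSize) throws InterruptedException {
        bytes.acquire(messageSize);
    }

    /** Called when the message has been fully processed. */
    void release(int messageSize) {
        bytes.release(messageSize);
    }
}
{code}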
[jira] [Commented] (CASSANDRA-7056) Add RAMP transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133886#comment-14133886 ]

T Jake Luciani commented on CASSANDRA-7056:
---------------------------------------------

I've been thinking about how to implement this, and a couple of ideas come to mind:

* We would use the existing batchlog and use this as the prepare pass of the transaction (RAMP-Fast)
* Since we will use TimeUUID as the timestamp, we can also use this as the batchlog id
* We add a way to find and read from the batchlog for a given batchlog id
* If the coordinator gets the results from two partitions and the timeuuids don't match, it would read the later timeuuid from the batchlog and fix the data

Some concerns:

* Let's assume we query from partition A and B, and we see the results don't match timestamps. We would pull the latest batchlog assuming they are from the same batch, but let's say they in fact are not. In this case we wasted a lot of time, so my question is: should we only do this if the user supplies a new CL type? I think Peter was suggesting this in his preso: READ_ATOMIC.
* In the case of a global index, we plan on reading the data *after* reading the index. The data query might reveal the indexed value is stale. We would need to apply the batchlog and fix the index; would we then restart the entire query? Or maybe overquery, assuming some index values will be stale? Either way, this query looks different from the above scenario.


Add RAMP transactions
---------------------

    Key: CASSANDRA-7056
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7056
    Project: Cassandra
    Issue Type: Wish
    Components: Core
    Reporter: Tupshin Harper
    Priority: Minor

We should take a look at [RAMP|http://www.bailis.org/blog/scalable-atomic-visibility-with-ramp-transactions/] transactions, and figure out if they can be used to provide more efficient LWT (or LWT-like) operations.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
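[Editor's aside: a very rough coordinator-side sketch of the RAMP-Fast read path described in the comment. Row, Batch, and every method below are hypothetical placeholders, not Cassandra APIs; writes are assumed to stamp rows with the TimeUUID that doubles as the batchlog id.]

{code}
import java.util.UUID;

public abstract class RampFastRead {
    interface Row { UUID writeId(); }  // the TimeUUID stamped on the last write
    interface Batch { }

    abstract Row readRow(String partition);
    abstract Batch readBatchlog(UUID batchId); // null if the batch is gone
    abstract void applyBatch(Batch batch);     // replay mutations, fixing stale data

    Row[] readAtomic(String partitionA, String partitionB) {
        Row a = readRow(partitionA);
        Row b = readRow(partitionB);
        if (!a.writeId().equals(b.writeId())) {
            // Assume both rows were written by the same batch: replay the later
            // write's batch to repair the stale partition. If they actually came
            // from unrelated batches, this lookup is wasted work - hence the idea
            // of gating it behind an opt-in CL such as READ_ATOMIC.
            UUID later = a.writeId().timestamp() > b.writeId().timestamp()
                       ? a.writeId() : b.writeId();
            Batch batch = readBatchlog(later);
            if (batch != null) {
                applyBatch(batch);
                a = readRow(partitionA);
                b = readRow(partitionB);
            }
        }
        return new Row[] { a, b };
    }
}
{code}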
[jira] [Commented] (CASSANDRA-7838) Warn user when disks are network/ebs mounted
[ https://issues.apache.org/jira/browse/CASSANDRA-7838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133892#comment-14133892 ]

T Jake Luciani commented on CASSANDRA-7838:
---------------------------------------------

Let's come up with other metrics that make a good "Hey, you will have performance problems" check.

Ones I can think of:

* Swap enabled
* High disk latency
* Check ulimits are sensible (sigar has a ulimit check, I think)


Warn user when disks are network/ebs mounted
--------------------------------------------

    Key: CASSANDRA-7838
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7838
    Project: Cassandra
    Issue Type: Improvement
    Reporter: T Jake Luciani
    Priority: Minor
    Labels: lhf
    Fix For: 3.0

The Sigar project lets you probe os/cpu/filesystems across the major platforms.

https://github.com/hyperic/sigar

It would be nice on start-up to use this to warn users if they are running with settings that will make them sad, like a network drive or EBS on EC2.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
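[Editor's aside showing what such startup checks might look like with the Sigar library the ticket proposes. Method names are from the hyperic/sigar 1.6.x API as best recalled, so treat signatures as approximate; the warning thresholds are arbitrary.]

{code}
import org.hyperic.sigar.FileSystem;
import org.hyperic.sigar.Sigar;
import org.hyperic.sigar.SigarException;

public class StartupChecks {
    public static void main(String[] args) throws SigarException {
        Sigar sigar = new Sigar();

        // Swap enabled: memory pressure will page instead of failing fast.
        if (sigar.getSwap().getTotal() > 0)
            System.out.println("WARN: swap is enabled; expect latency spikes under memory pressure");

        // Network-mounted data directories (NFS, EBS-style remote volumes).
        for (FileSystem fs : sigar.getFileSystemList())
            if (fs.getType() == FileSystem.TYPE_NETWORK)
                System.out.printf("WARN: %s is a network filesystem (%s)%n",
                                  fs.getDirName(), fs.getSysTypeName());

        // Ulimits: a low open-file limit is a classic foot-gun.
        long maxFiles = sigar.getResourceLimit().getOpenFilesMax();
        if (maxFiles != -1 && maxFiles < 100_000)
            System.out.println("WARN: open file limit is low: " + maxFiles);
    }
}
{code}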
[jira] [Commented] (CASSANDRA-7934) Remove FBUtilities.threadLocalRandom
[ https://issues.apache.org/jira/browse/CASSANDRA-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133896#comment-14133896 ]

T Jake Luciani commented on CASSANDRA-7934:
---------------------------------------------

TIL. +1


Remove FBUtilities.threadLocalRandom
------------------------------------

    Key: CASSANDRA-7934
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7934
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Benedict
    Assignee: Benedict
    Priority: Minor
    Labels: lhf
    Fix For: 2.1.1

We should use ThreadLocalRandom.current() instead, as it is not only more standard, it is considerably faster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
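[Editor's aside: the replacement the ticket describes is the JDK-standard per-thread PRNG, used like this.]

{code}
import java.util.concurrent.ThreadLocalRandom;

public class Example {
    public static void main(String[] args) {
        // ThreadLocalRandom.current() returns a per-thread instance: no lock
        // contention and no shared-seed bottleneck, unlike a shared
        // java.util.Random or a hand-rolled ThreadLocal<Random> holder.
        System.out.println(ThreadLocalRandom.current().nextInt(0, 256));
    }
}
{code}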
[jira] [Updated] (CASSANDRA-7930) Warn when evicting prepared statements from cache
[ https://issues.apache.org/jira/browse/CASSANDRA-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Strickland updated CASSANDRA-7930:
------------------------------------------
    Attachment: cassandra-2.0-v5.txt

Changed scheduler to use StorageService instead of new executor.


Warn when evicting prepared statements from cache
---------------------------------------------------

    Key: CASSANDRA-7930
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7930
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Robbie Strickland
    Assignee: Robbie Strickland
    Attachments: cassandra-2.0-v2.txt, cassandra-2.0-v3.txt, cassandra-2.0-v4.txt, cassandra-2.0-v5.txt, cassandra-2.0.txt, cassandra-2.1.txt

The prepared statement cache is an LRU, with a max size of maxMemory / 256. There is currently no warning when statements are evicted, which could be problematic if the user is unaware that this is happening. At the very least, we should provide a JMX metric and possibly a log message indicating this is happening. At some point it may also be worthwhile to make this tunable for users with large numbers of statements.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
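[Editor's aside: a hedged sketch of the warning mechanism under discussion, not the attached patch. It assumes the ConcurrentLinkedHashMap builder API (the library Cassandra uses for this cache); the counter is the sort of thing a JMX metric would expose, and String stands in for the real key/statement types.]

{code}
import java.util.concurrent.atomic.AtomicLong;

import com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap;
import com.googlecode.concurrentlinkedhashmap.EvictionListener;

public class PreparedCache {
    static final AtomicLong evictions = new AtomicLong(); // exposed via JMX

    static final ConcurrentLinkedHashMap<String, String> cache =
        new ConcurrentLinkedHashMap.Builder<String, String>()
            // matches the description: an LRU capped at maxMemory / 256
            .maximumWeightedCapacity(Runtime.getRuntime().maxMemory() / 256)
            .listener((EvictionListener<String, String>) (id, stmt) -> {
                // Count every eviction, but log only periodically so heavy churn
                // doesn't flood the log with one line per evicted statement.
                long n = evictions.incrementAndGet();
                if (n % 100 == 1)
                    System.err.println("WARN: evicted " + n + " prepared statements so far");
            })
            .build();
}
{code}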
[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133899#comment-14133899 ]

T Jake Luciani commented on CASSANDRA-7069:
---------------------------------------------

+1


Prevent operator mistakes due to simultaneous bootstrap
--------------------------------------------------------

    Key: CASSANDRA-7069
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
    Project: Cassandra
    Issue Type: New Feature
    Components: Core
    Reporter: Brandon Williams
    Assignee: Brandon Williams
    Priority: Minor
    Fix For: 3.0
    Attachments: 7069.txt

Cassandra has always had the '2 minute rule' between beginning topology changes, to ensure the range announcement is known to all nodes before the next change begins. Trying to bootstrap a bunch of nodes simultaneously is a common mistake and seems to be on the rise as of late.

We can prevent users from shooting themselves in the foot this way by looking for other joining nodes in the shadow round, then comparing their generation against our own, and if there isn't a large enough difference, bailing out or sleeping until it is large enough.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133901#comment-14133901 ]

Michael Shuler commented on CASSANDRA-7904:
---------------------------------------------

Thanks for the update, [~baldrick]! Considering that this ticket appears to not really be a bug report at this point, can we call this closed?


Repair hangs
------------

    Key: CASSANDRA-7904
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7904
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version "1.7.0_45"
    Reporter: Duncan Sands
    Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136

Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node.

Repair hangs for some of the nodes almost every weekend. It hung last weekend; here are the details.

In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly.

tpstats:
4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The nodes are:
192.168.21.13 (data centre R)
192.168.60.134 (data centre A)
192.168.60.136 (data centre A)
172.18.68.138 (data centre Z)

compactionstats:
No compactions. All nodes have:
pending tasks: 0
Active compaction remaining time: n/a

netstats:
All except one node have nothing. One node (192.168.60.131, not one of the nodes listed in the tpstats section above) has (note the Responses Pending value of 1):
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 4233
Mismatch (Blocking): 0
Mismatch (Background): 243
Pool Name     Active   Pending      Completed
Commands         n/a         0       34785445
Responses        n/a         1       38567167

Repair sessions:
I looked for repair sessions that failed to complete. On 3 of the 4 nodes mentioned in tpstats above, I found that they had sent merkle tree requests and got responses from all but one node. In the log file for the node that failed to respond, there is no sign that it ever received the request. On 1 node (172.18.68.138) it looks like responses were received from every node, some streaming was done, and then... nothing. Details:

Node 192.168.21.13 (data centre R): Sent merkle trees to /172.18.33.24, /192.168.60.140, /192.168.60.142, /172.18.68.139, /172.18.68.138, /172.18.33.22, /192.168.21.13 for table brokers, never got a response from /172.18.68.139. On /172.18.68.139, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table brokers.

Node 192.168.60.134 (data centre A): Sent merkle trees to /172.18.68.139, /172.18.68.138, /192.168.60.132, /192.168.21.14, /192.168.60.134 for table swxess_outbound, never got a response from /172.18.68.138. On /172.18.68.138, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table swxess_outbound.

Node 192.168.60.136 (data centre A): Sent merkle trees to /192.168.60.142, /172.18.68.139, /192.168.60.136 for table rollups7200, never got a response from /172.18.68.139. This repair session is never mentioned in the /172.18.68.139 log.

Node 172.18.68.138 (data centre Z): The issue here seems to be repair session #a55c16e1-35eb-11e4-8e7e-51c077eaf311. It got responses for all its merkle tree requests, did some streaming, but seems to have stopped after finishing with one table (rollups60). I found it as follows: it is the only repair for which there is no "session completed successfully" message in the log.

Some log file snippets are attached.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133904#comment-14133904 ]

T Jake Luciani commented on CASSANDRA-7069:
---------------------------------------------

Why not 2.1.1?


Prevent operator mistakes due to simultaneous bootstrap
--------------------------------------------------------

    Key: CASSANDRA-7069
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
    Project: Cassandra
    Issue Type: New Feature
    Components: Core
    Reporter: Brandon Williams
    Assignee: Brandon Williams
    Priority: Minor
    Fix For: 3.0
    Attachments: 7069.txt

Cassandra has always had the '2 minute rule' between beginning topology changes, to ensure the range announcement is known to all nodes before the next change begins. Trying to bootstrap a bunch of nodes simultaneously is a common mistake and seems to be on the rise as of late.

We can prevent users from shooting themselves in the foot this way by looking for other joining nodes in the shadow round, then comparing their generation against our own, and if there isn't a large enough difference, bailing out or sleeping until it is large enough.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[2/3] git commit: Don't allow simultaneous bootstrap when cassandra.consistent.rangemovement is true
Don't allow simultaneous bootstrap when cassandra.consistent.rangemovement is true

Patch by brandonwilliams, reviewed by tjake for CASSANDRA-7069

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/094aa8ef
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/094aa8ef
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/094aa8ef

Branch: refs/heads/trunk
Commit: 094aa8ef2be1f1c832b6baa7e2f25a1f220b279e
Parents: db3cc3e
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Mon Sep 15 06:52:35 2014 +0000
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Mon Sep 15 06:52:35 2014 +0000

----------------------------------------------------------------------
 CHANGES.txt                                                | 1 +
 src/java/org/apache/cassandra/service/StorageService.java  | 8 ++++++++
 2 files changed, 9 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/094aa8ef/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 7e18719..608e4b1 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.1
+ * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
  * cassandra-stress supports whitelist mode for node config
  * GCInspector more closely tracks GC; cassandra-stress and nodetool report it
  * nodetool won't output bogus ownership info without a keyspace (CASSANDRA-7173)


http://git-wip-us.apache.org/repos/asf/cassandra/blob/094aa8ef/src/java/org/apache/cassandra/service/StorageService.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index 86412ba..1aa3b24 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -748,6 +748,14 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
         if (logger.isDebugEnabled())
             logger.debug("... got ring + schema info");

+        if (Boolean.parseBoolean(System.getProperty("cassandra.consistent.rangemovement", "true"))
+            && (
+                tokenMetadata.getBootstrapTokens().valueSet().size() > 0 ||
+                tokenMetadata.getLeavingEndpoints().size() > 0 ||
+                tokenMetadata.getMovingEndpoints().size() > 0
+            ))
+            throw new UnsupportedOperationException("Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true");
+
         if (!DatabaseDescriptor.isReplacing())
         {
             if (tokenMetadata.isMember(FBUtilities.getBroadcastAddress()))
[1/3] git commit: Don't allow simultaneous bootstrap when cassandra.consistent.rangemovement is true
Repository: cassandra

Updated Branches:
  refs/heads/cassandra-2.1 db3cc3e65 -> 094aa8ef2
  refs/heads/trunk fb6c28514 -> e225176b3


Don't allow simultaneous bootstrap when cassandra.consistent.rangemovement is true

Patch by brandonwilliams, reviewed by tjake for CASSANDRA-7069

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/094aa8ef
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/094aa8ef
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/094aa8ef

Branch: refs/heads/cassandra-2.1
Commit: 094aa8ef2be1f1c832b6baa7e2f25a1f220b279e
Parents: db3cc3e
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Mon Sep 15 06:52:35 2014 +0000
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Mon Sep 15 06:52:35 2014 +0000

----------------------------------------------------------------------
 CHANGES.txt                                                | 1 +
 src/java/org/apache/cassandra/service/StorageService.java  | 8 ++++++++
 2 files changed, 9 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/094aa8ef/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 7e18719..608e4b1 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.1
+ * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
  * cassandra-stress supports whitelist mode for node config
  * GCInspector more closely tracks GC; cassandra-stress and nodetool report it
  * nodetool won't output bogus ownership info without a keyspace (CASSANDRA-7173)


http://git-wip-us.apache.org/repos/asf/cassandra/blob/094aa8ef/src/java/org/apache/cassandra/service/StorageService.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java
index 86412ba..1aa3b24 100644
--- a/src/java/org/apache/cassandra/service/StorageService.java
+++ b/src/java/org/apache/cassandra/service/StorageService.java
@@ -748,6 +748,14 @@ public class StorageService extends NotificationBroadcasterSupport implements IE
         if (logger.isDebugEnabled())
             logger.debug("... got ring + schema info");

+        if (Boolean.parseBoolean(System.getProperty("cassandra.consistent.rangemovement", "true"))
+            && (
+                tokenMetadata.getBootstrapTokens().valueSet().size() > 0 ||
+                tokenMetadata.getLeavingEndpoints().size() > 0 ||
+                tokenMetadata.getMovingEndpoints().size() > 0
+            ))
+            throw new UnsupportedOperationException("Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true");
+
         if (!DatabaseDescriptor.isReplacing())
         {
             if (tokenMetadata.isMember(FBUtilities.getBroadcastAddress()))
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e225176b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e225176b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e225176b

Branch: refs/heads/trunk
Commit: e225176b31413c3c5f31885eac364fa06deaca2e
Parents: fb6c285 094aa8e
Author: Brandon Williams <brandonwilli...@apache.org>
Authored: Mon Sep 15 06:53:18 2014 +0000
Committer: Brandon Williams <brandonwilli...@apache.org>
Committed: Mon Sep 15 06:53:18 2014 +0000

----------------------------------------------------------------------
 CHANGES.txt                                                | 1 +
 src/java/org/apache/cassandra/service/StorageService.java  | 8 ++++++++
 2 files changed, 9 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/e225176b/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index 1e08d3c,608e4b1..bf3567f
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,26 -1,5 +1,27 @@@
 +3.0
 + * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917)
 + * Support Java source code for user-defined functions (CASSANDRA-7562)
 + * Require arg types to disambiguate UDF drops (CASSANDRA-7812)
 + * Do anticompaction in groups (CASSANDRA-6851)
 + * Verify that UDF class methods are static (CASSANDRA-7781)
 + * Support pure user-defined functions (CASSANDRA-7395, 7740)
 + * Permit configurable timestamps with cassandra-stress (CASSANDRA-7416)
 + * Move sstable RandomAccessReader to nio2, which allows using the
 +   FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050)
 + * Remove CQL2 (CASSANDRA-5918)
 + * Add Thrift get_multi_slice call (CASSANDRA-6757)
 + * Optimize fetching multiple cells by name (CASSANDRA-6933)
 + * Allow compilation in java 8 (CASSANDRA-7028)
 + * Make incremental repair default (CASSANDRA-7250)
 + * Enable code coverage thru JaCoCo (CASSANDRA-7226)
 + * Switch external naming of 'column families' to 'tables' (CASSANDRA-4369)
 + * Shorten SSTable path (CASSANDRA-6962)
 + * Use unsafe mutations for most unit tests (CASSANDRA-6969)
 + * Fix race condition during calculation of pending ranges (CASSANDRA-7390)
 +
  2.1.1
+  * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
   * cassandra-stress supports whitelist mode for node config
   * GCInspector more closely tracks GC; cassandra-stress and nodetool report it
   * nodetool won't output bogus ownership info without a keyspace (CASSANDRA-7173)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e225176b/src/java/org/apache/cassandra/service/StorageService.java
----------------------------------------------------------------------
[jira] [Reopened] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams reopened CASSANDRA-7069:
------------------------------------------

Hmm, it just occurred to me that this prevents bootstrapping even after the two minute rule has been followed.


Prevent operator mistakes due to simultaneous bootstrap
--------------------------------------------------------

    Key: CASSANDRA-7069
    URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
    Project: Cassandra
    Issue Type: New Feature
    Components: Core
    Reporter: Brandon Williams
    Assignee: Brandon Williams
    Priority: Minor
    Fix For: 2.1.1, 3.0
    Attachments: 7069.txt

Cassandra has always had the '2 minute rule' between beginning topology changes, to ensure the range announcement is known to all nodes before the next change begins. Trying to bootstrap a bunch of nodes simultaneously is a common mistake and seems to be on the rise as of late.

We can prevent users from shooting themselves in the foot this way by looking for other joining nodes in the shadow round, then comparing their generation against our own, and if there isn't a large enough difference, bailing out or sleeping until it is large enough.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133912#comment-14133912 ]

T Jake Luciani commented on CASSANDRA-7069:
-------------------------------------------

Perhaps a dtest then :)

Prevent operator mistakes due to simultaneous bootstrap
-------------------------------------------------------

Key: CASSANDRA-7069
URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
Fix For: 2.1.1, 3.0
Attachments: 7069.txt

Cassandra has always had the '2 minute rule' between beginning topology changes, to ensure the range announcement is known to all nodes before the next one begins. Trying to bootstrap a bunch of nodes simultaneously is a common mistake and seems to be on the rise as of late. We can prevent users from shooting themselves in the foot this way by looking for other joining nodes in the shadow round, then comparing their generation against our own; if there isn't a large enough difference, bail out or sleep until it is large enough.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[3/5] git commit: ninja-fix CHANGES.txt
ninja-fix CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1d7691e2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1d7691e2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1d7691e2

Branch: refs/heads/trunk
Commit: 1d7691e251e3d170916f138b0980d6a9003cf7db
Parents: 028880e
Author: Benedict Elliott Smith <bened...@apache.org>
Authored: Mon Sep 15 14:57:46 2014 +0100
Committer: Benedict Elliott Smith <bened...@apache.org>
Committed: Mon Sep 15 15:02:32 2014 +0100
--
 CHANGES.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1d7691e2/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index ffa2b71..4c0dbf9 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,8 +1,8 @@
 2.1.1
  * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
  * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
- * cassandra-stress supports whitelist mode for node config
- * GCInspector more closely tracks GC; cassandra-stress and nodetool report it
+ * cassandra-stress supports whitelist mode for node config (CASSANDRA-7658)
+ * GCInspector more closely tracks GC; cassandra-stress and nodetool report it (CASSANDRA-7916)
  * nodetool won't output bogus ownership info without a keyspace (CASSANDRA-7173)
  * Add human readable option to nodetool commands (CASSANDRA-5433)
  * Don't try to set repairedAt on old sstables (CASSANDRA-7913)
[5/5] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Conflicts:
    src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
    src/java/org/apache/cassandra/service/QueryState.java

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/35af28e5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/35af28e5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/35af28e5

Branch: refs/heads/trunk
Commit: 35af28e550cd80be7df36283fe283454977a2a37
Parents: e225176 1d7691e
Author: Benedict Elliott Smith <bened...@apache.org>
Authored: Mon Sep 15 15:03:29 2014 +0100
Committer: Benedict Elliott Smith <bened...@apache.org>
Committed: Mon Sep 15 15:03:29 2014 +0100
--
 CHANGES.txt                                           |  5 +++--
 src/java/org/apache/cassandra/config/CFMetaData.java  |  3 ++-
 src/java/org/apache/cassandra/db/BatchlogManager.java |  2 +-
 .../org/apache/cassandra/dht/Murmur3Partitioner.java  |  3 ++-
 .../io/compress/CompressedRandomAccessReader.java     |  3 ++-
 src/java/org/apache/cassandra/service/QueryState.java |  3 ++-
 .../org/apache/cassandra/service/StorageProxy.java    |  8
 .../streaming/compress/CompressedInputStream.java     |  3 ++-
 src/java/org/apache/cassandra/utils/FBUtilities.java  | 14 --
 9 files changed, 18 insertions(+), 26 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35af28e5/CHANGES.txt
--
diff --cc CHANGES.txt
index bf3567f,4c0dbf9..b6e9165
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,29 -1,8 +1,30 @@@
 +3.0
 + * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917)
 + * Support Java source code for user-defined functions (CASSANDRA-7562)
 + * Require arg types to disambiguate UDF drops (CASSANDRA-7812)
 + * Do anticompaction in groups (CASSANDRA-6851)
 + * Verify that UDF class methods are static (CASSANDRA-7781)
 + * Support pure user-defined functions (CASSANDRA-7395, 7740)
 + * Permit configurable timestamps with cassandra-stress (CASSANDRA-7416)
 + * Move sstable RandomAccessReader to nio2, which allows using the
 +   FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050)
 + * Remove CQL2 (CASSANDRA-5918)
 + * Add Thrift get_multi_slice call (CASSANDRA-6757)
 + * Optimize fetching multiple cells by name (CASSANDRA-6933)
 + * Allow compilation in java 8 (CASSANDRA-7028)
 + * Make incremental repair default (CASSANDRA-7250)
 + * Enable code coverage thru JaCoCo (CASSANDRA-7226)
 + * Switch external naming of 'column families' to 'tables' (CASSANDRA-4369)
 + * Shorten SSTable path (CASSANDRA-6962)
 + * Use unsafe mutations for most unit tests (CASSANDRA-6969)
 + * Fix race condition during calculation of pending ranges (CASSANDRA-7390)
 +
+ 2.1.1
+  * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
   * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
-  * cassandra-stress supports whitelist mode for node config
-  * GCInspector more closely tracks GC; cassandra-stress and nodetool report it
+  * cassandra-stress supports whitelist mode for node config (CASSANDRA-7658)
+  * GCInspector more closely tracks GC; cassandra-stress and nodetool report it (CASSANDRA-7916)
   * nodetool won't output bogus ownership info without a keyspace (CASSANDRA-7173)
   * Add human readable option to nodetool commands (CASSANDRA-5433)
   * Don't try to set repairedAt on old sstables (CASSANDRA-7913)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/35af28e5/src/java/org/apache/cassandra/config/CFMetaData.java
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/35af28e5/src/java/org/apache/cassandra/db/BatchlogManager.java
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/35af28e5/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
--
diff --cc src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
index d71964c,4521c19..dca5ade
--- a/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
+++ b/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
@@@ -90,69 -85,65 +91,69 @@@ public class CompressedRandomAccessRead
  {
      try
      {
-         decompressChunk(metadata.chunkFor(current));
-     }
-     catch (CorruptBlockException e)
-     {
-         throw new CorruptSSTableException(e, getPath());
-     }
-     catch (IOException e)
-     {
-         throw new FSReadError(e, getPath());
-     }
- }
+ long
[2/5] git commit: Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom
Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom

patch by benedict; reviewed by tjake for CASSANDRA-7934

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/028880e7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/028880e7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/028880e7

Branch: refs/heads/trunk
Commit: 028880e74ceef57b33b858fbd78d8aa9ac3b9680
Parents: 094aa8e
Author: Benedict Elliott Smith <bened...@apache.org>
Authored: Mon Sep 15 14:56:39 2014 +0100
Committer: Benedict Elliott Smith <bened...@apache.org>
Committed: Mon Sep 15 15:02:06 2014 +0100
--
 CHANGES.txt                                           |  1 +
 src/java/org/apache/cassandra/config/CFMetaData.java  |  3 ++-
 src/java/org/apache/cassandra/db/BatchlogManager.java |  2 +-
 .../org/apache/cassandra/dht/Murmur3Partitioner.java  |  3 ++-
 .../io/compress/CompressedRandomAccessReader.java     |  3 ++-
 src/java/org/apache/cassandra/service/QueryState.java |  3 ++-
 .../org/apache/cassandra/service/StorageProxy.java    |  8
 .../streaming/compress/CompressedInputStream.java     |  3 ++-
 src/java/org/apache/cassandra/utils/FBUtilities.java  | 14 --
 9 files changed, 16 insertions(+), 24 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/028880e7/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 608e4b1..ffa2b71 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.1
+ * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
  * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
  * cassandra-stress supports whitelist mode for node config
  * GCInspector more closely tracks GC; cassandra-stress and nodetool report it

http://git-wip-us.apache.org/repos/asf/cassandra/blob/028880e7/src/java/org/apache/cassandra/config/CFMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java
index a5d328a..1d6e3a4 100644
--- a/src/java/org/apache/cassandra/config/CFMetaData.java
+++ b/src/java/org/apache/cassandra/config/CFMetaData.java
@@ -35,6 +35,7 @@ import java.util.Set;
 import java.util.SortedSet;
 import java.util.TreeSet;
 import java.util.UUID;
+import java.util.concurrent.ThreadLocalRandom;

 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Objects;
@@ -703,7 +704,7 @@ public final class CFMetaData
     public ReadRepairDecision newReadRepairDecision()
     {
-        double chance = FBUtilities.threadLocalRandom().nextDouble();
+        double chance = ThreadLocalRandom.current().nextDouble();
         if (getReadRepairChance() > chance)
             return ReadRepairDecision.GLOBAL;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/028880e7/src/java/org/apache/cassandra/db/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java b/src/java/org/apache/cassandra/db/BatchlogManager.java
index d49c620..7f8d355 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -527,7 +527,7 @@ public class BatchlogManager implements BatchlogManagerMBean
     @VisibleForTesting
     protected int getRandomInt(int bound)
     {
-        return FBUtilities.threadLocalRandom().nextInt(bound);
+        return ThreadLocalRandom.current().nextInt(bound);
     }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/028880e7/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
--
diff --git a/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java b/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
index 5a3c4bb..2bb0423 100644
--- a/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
+++ b/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
@@ -24,6 +24,7 @@ import java.util.HashMap;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
+import java.util.concurrent.ThreadLocalRandom;

 import org.apache.cassandra.db.BufferDecoratedKey;
 import org.apache.cassandra.db.DecoratedKey;
@@ -105,7 +106,7 @@ public class Murmur3Partitioner extends AbstractPartitioner<LongToken>
     public LongToken getRandomToken()
     {
-        return new LongToken(normalize(FBUtilities.threadLocalRandom().nextLong()));
+        return new LongToken(normalize(ThreadLocalRandom.current().nextLong()));
     }

     private long normalize(long v)
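For background on the commit above: java.util.concurrent.ThreadLocalRandom (standard since Java 7) gives each thread its own generator, so ThreadLocalRandom.current() removes both the contention of a shared java.util.Random and the need for a hand-rolled holder like FBUtilities.threadLocalRandom(). A minimal usage sketch:

{code}
import java.util.concurrent.ThreadLocalRandom;

public class ThreadLocalRandomDemo
{
    public static void main(String[] args)
    {
        // current() returns the PRNG bound to the calling thread; never share the
        // instance across threads, and never call setSeed() on it (it throws).
        long token = ThreadLocalRandom.current().nextLong();
        double chance = ThreadLocalRandom.current().nextDouble(); // uniform in [0, 1)
        int bounded = ThreadLocalRandom.current().nextInt(10);    // uniform in [0, 10)
        System.out.printf("%d %f %d%n", token, chance, bounded);
    }
}
{code}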
[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133915#comment-14133915 ]

Brandon Williams commented on CASSANDRA-7069:
---------------------------------------------

Wait, can we even do multiple bootstraps following the 2 minute rule and get consistent range movement?

Prevent operator mistakes due to simultaneous bootstrap
-------------------------------------------------------

Key: CASSANDRA-7069
URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
Fix For: 2.1.1, 3.0
Attachments: 7069.txt

Cassandra has always had the '2 minute rule' between beginning topology changes, to ensure the range announcement is known to all nodes before the next one begins. Trying to bootstrap a bunch of nodes simultaneously is a common mistake and seems to be on the rise as of late. We can prevent users from shooting themselves in the foot this way by looking for other joining nodes in the shadow round, then comparing their generation against our own; if there isn't a large enough difference, bail out or sleep until it is large enough.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[4/5] git commit: ninja-fix CHANGES.txt
ninja-fix CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1d7691e2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1d7691e2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1d7691e2

Branch: refs/heads/cassandra-2.1
Commit: 1d7691e251e3d170916f138b0980d6a9003cf7db
Parents: 028880e
Author: Benedict Elliott Smith <bened...@apache.org>
Authored: Mon Sep 15 14:57:46 2014 +0100
Committer: Benedict Elliott Smith <bened...@apache.org>
Committed: Mon Sep 15 15:02:32 2014 +0100
--
 CHANGES.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1d7691e2/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index ffa2b71..4c0dbf9 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,8 +1,8 @@
 2.1.1
  * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
  * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
- * cassandra-stress supports whitelist mode for node config
- * GCInspector more closely tracks GC; cassandra-stress and nodetool report it
+ * cassandra-stress supports whitelist mode for node config (CASSANDRA-7658)
+ * GCInspector more closely tracks GC; cassandra-stress and nodetool report it (CASSANDRA-7916)
  * nodetool won't output bogus ownership info without a keyspace (CASSANDRA-7173)
  * Add human readable option to nodetool commands (CASSANDRA-5433)
  * Don't try to set repairedAt on old sstables (CASSANDRA-7913)
[1/5] git commit: Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 094aa8ef2 -> 1d7691e25
  refs/heads/trunk e225176b3 -> 35af28e55

Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom

patch by benedict; reviewed by tjake for CASSANDRA-7934

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/028880e7
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/028880e7
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/028880e7

Branch: refs/heads/cassandra-2.1
Commit: 028880e74ceef57b33b858fbd78d8aa9ac3b9680
Parents: 094aa8e
Author: Benedict Elliott Smith <bened...@apache.org>
Authored: Mon Sep 15 14:56:39 2014 +0100
Committer: Benedict Elliott Smith <bened...@apache.org>
Committed: Mon Sep 15 15:02:06 2014 +0100
--
 CHANGES.txt                                           |  1 +
 src/java/org/apache/cassandra/config/CFMetaData.java  |  3 ++-
 src/java/org/apache/cassandra/db/BatchlogManager.java |  2 +-
 .../org/apache/cassandra/dht/Murmur3Partitioner.java  |  3 ++-
 .../io/compress/CompressedRandomAccessReader.java     |  3 ++-
 src/java/org/apache/cassandra/service/QueryState.java |  3 ++-
 .../org/apache/cassandra/service/StorageProxy.java    |  8
 .../streaming/compress/CompressedInputStream.java     |  3 ++-
 src/java/org/apache/cassandra/utils/FBUtilities.java  | 14 --
 9 files changed, 16 insertions(+), 24 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/028880e7/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 608e4b1..ffa2b71 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.1
+ * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
  * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
  * cassandra-stress supports whitelist mode for node config
  * GCInspector more closely tracks GC; cassandra-stress and nodetool report it

http://git-wip-us.apache.org/repos/asf/cassandra/blob/028880e7/src/java/org/apache/cassandra/config/CFMetaData.java
--
diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java
index a5d328a..1d6e3a4 100644
--- a/src/java/org/apache/cassandra/config/CFMetaData.java
+++ b/src/java/org/apache/cassandra/config/CFMetaData.java
@@ -35,6 +35,7 @@ import java.util.Set;
 import java.util.SortedSet;
 import java.util.TreeSet;
 import java.util.UUID;
+import java.util.concurrent.ThreadLocalRandom;

 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Objects;
@@ -703,7 +704,7 @@ public final class CFMetaData
     public ReadRepairDecision newReadRepairDecision()
     {
-        double chance = FBUtilities.threadLocalRandom().nextDouble();
+        double chance = ThreadLocalRandom.current().nextDouble();
         if (getReadRepairChance() > chance)
             return ReadRepairDecision.GLOBAL;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/028880e7/src/java/org/apache/cassandra/db/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java b/src/java/org/apache/cassandra/db/BatchlogManager.java
index d49c620..7f8d355 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -527,7 +527,7 @@ public class BatchlogManager implements BatchlogManagerMBean
     @VisibleForTesting
     protected int getRandomInt(int bound)
     {
-        return FBUtilities.threadLocalRandom().nextInt(bound);
+        return ThreadLocalRandom.current().nextInt(bound);
     }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/028880e7/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
--
diff --git a/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java b/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
index 5a3c4bb..2bb0423 100644
--- a/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
+++ b/src/java/org/apache/cassandra/dht/Murmur3Partitioner.java
@@ -24,6 +24,7 @@ import java.util.HashMap;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Map;
+import java.util.concurrent.ThreadLocalRandom;

 import org.apache.cassandra.db.BufferDecoratedKey;
 import org.apache.cassandra.db.DecoratedKey;
@@ -105,7 +106,7 @@ public class Murmur3Partitioner extends AbstractPartitioner<LongToken>
     public LongToken getRandomToken()
     {
-        return new
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133925#comment-14133925 ]

Duncan Sands commented on CASSANDRA-7904:
-----------------------------------------

Hi Michael, the workaround for this issue was effective, but I still think there is a problem here; in fact, two problems: (1) repair hangs rather than failing when too many repair messages time out; (2) the hang is silent: there is nothing in the logs saying that there is a problem (unless you turn on a special debug option, see the previous comment).

Repair hangs
------------

Key: CASSANDRA-7904
URL: https://issues.apache.org/jira/browse/CASSANDRA-7904
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version 1.7.0_45
Reporter: Duncan Sands
Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136

Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node. Repair hangs for some of the nodes almost every weekend. It hung last weekend, here are the details:

In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly.

tpstats: 4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The nodes are:
  192.168.21.13 (data centre R)
  192.168.60.134 (data centre A)
  192.168.60.136 (data centre A)
  172.18.68.138 (data centre Z)

compactionstats: No compactions. All nodes have:
  pending tasks: 0
  Active compaction remaining time : n/a

netstats: All except one node have nothing. One node (192.168.60.131, not one of the nodes listed in the tpstats section above) has (note the Responses Pending value of 1):
  Mode: NORMAL
  Not sending any streams.
  Read Repair Statistics:
  Attempted: 4233
  Mismatch (Blocking): 0
  Mismatch (Background): 243
  Pool Name    Active   Pending   Completed
  Commands     n/a      0         34785445
  Responses    n/a      1         38567167

Repair sessions: I looked for repair sessions that failed to complete. On 3 of the 4 nodes mentioned in tpstats above I found that they had sent merkle tree requests and got responses from all but one node. In the log file for the node that failed to respond there is no sign that it ever received the request. On 1 node (172.18.68.138) it looks like responses were received from every node, some streaming was done, and then... nothing. Details:

Node 192.168.21.13 (data centre R): Sent merkle trees to /172.18.33.24, /192.168.60.140, /192.168.60.142, /172.18.68.139, /172.18.68.138, /172.18.33.22, /192.168.21.13 for table brokers, never got a response from /172.18.68.139. On /172.18.68.139, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table brokers.

Node 192.168.60.134 (data centre A): Sent merkle trees to /172.18.68.139, /172.18.68.138, /192.168.60.132, /192.168.21.14, /192.168.60.134 for table swxess_outbound, never got a response from /172.18.68.138. On /172.18.68.138, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table swxess_outbound.

Node 192.168.60.136 (data centre A): Sent merkle trees to /192.168.60.142, /172.18.68.139, /192.168.60.136 for table rollups7200, never got a response from /172.18.68.139. This repair session is never mentioned in the /172.18.68.139 log.

Node 172.18.68.138 (data centre Z): The issue here seems to be repair session #a55c16e1-35eb-11e4-8e7e-51c077eaf311. It got responses for all its merkle tree requests, did some streaming, but seems to have stopped after finishing with one table (rollups60). I found it as follows: it is the only repair for which there is no "session completed successfully" message in the log. Some log file snippets are attached.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133931#comment-14133931 ]

Chris Lohfink commented on CASSANDRA-7731:
------------------------------------------

Cassandra is using an old (few years) version of the metrics library, which has gone through a couple of name changes since. The non-histogram values (sum, count, min, max, etc.) cover the length of the application. Only the histogram uses the reservoir; the weighted one is the default, but you can specify one that covers the length of the application: https://github.com/dropwizard/metrics/blob/v2.2.0/metrics-core/src/main/java/com/yammer/metrics/core/Histogram.java

Get max values for live/tombstone cells per slice
-------------------------------------------------

Key: CASSANDRA-7731
URL: https://issues.apache.org/jira/browse/CASSANDRA-7731
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Cyril Scetbon
Assignee: Robert Stupp
Priority: Minor
Fix For: 2.1.1
Attachments: 7731-2.0.txt, 7731-2.1.txt

I think you should not say that slice statistics are valid for the [last five minutes|https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in the CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes, except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it. The only thing I can see is that the last snapshot used to provide the median (or whatever you'd use instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query, and that's not possible for now without activating DEBUG for SliceQueryFilter, for example, and tweaking the threshold. Currently, as nodetool cfstats returns the median, if only a small fraction of the queries scan a lot of live/tombstone cells we miss it!

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
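To make the reservoir distinction concrete, here is a small sketch against the newer Dropwizard Metrics 3.x API (the yammer 2.x API Cassandra used at the time spells these choices as Histogram.SampleType.BIASED and UNIFORM instead): the default exponentially decaying reservoir weights snapshots toward roughly the last five minutes, while a uniform reservoir samples the whole run.

{code}
import com.codahale.metrics.ExponentiallyDecayingReservoir;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.UniformReservoir;

public class ReservoirDemo
{
    public static void main(String[] args)
    {
        // Default choice: biased toward recent data (forward-decaying sample).
        Histogram recent = new Histogram(new ExponentiallyDecayingReservoir());
        // Alternative: uniform random sample over the lifetime of the process.
        Histogram lifetime = new Histogram(new UniformReservoir());

        for (int i = 0; i < 10000; i++)
        {
            recent.update(i);
            lifetime.update(i);
        }

        // Percentiles come from the reservoir snapshot; getCount() covers the whole run.
        System.out.println("recent median:   " + recent.getSnapshot().getMedian());
        System.out.println("lifetime median: " + lifetime.getSnapshot().getMedian());
        System.out.println("count: " + recent.getCount());
    }
}
{code}

This is exactly the tension in the ticket: neither reservoir gives a hard "last five minutes" window, so a decayed median can hide a small fraction of pathological slices.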
[jira] [Resolved] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams resolved CASSANDRA-7069.
-----------------------------------------
    Resolution: Fixed

Discussing this offline with Jake, we decided it's still possible to violate consistent range movement even following the 2 minute rule, so leaving this as-is. If people don't care, they can simply disable consistent range movement.

Prevent operator mistakes due to simultaneous bootstrap
-------------------------------------------------------

Key: CASSANDRA-7069
URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
Fix For: 2.1.1, 3.0
Attachments: 7069.txt

Cassandra has always had the '2 minute rule' between beginning topology changes, to ensure the range announcement is known to all nodes before the next one begins. Trying to bootstrap a bunch of nodes simultaneously is a common mistake and seems to be on the rise as of late. We can prevent users from shooting themselves in the foot this way by looking for other joining nodes in the shadow round, then comparing their generation against our own; if there isn't a large enough difference, bail out or sleep until it is large enough.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133942#comment-14133942 ]

Brandon Williams commented on CASSANDRA-7904:
---------------------------------------------

Dramatically increasing rpc_timeout isn't the most desirable workaround, either.

Repair hangs
------------

Key: CASSANDRA-7904
URL: https://issues.apache.org/jira/browse/CASSANDRA-7904
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version 1.7.0_45
Reporter: Duncan Sands
Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136

Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node. Repair hangs for some of the nodes almost every weekend. It hung last weekend, here are the details:

In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly.

tpstats: 4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The nodes are:
  192.168.21.13 (data centre R)
  192.168.60.134 (data centre A)
  192.168.60.136 (data centre A)
  172.18.68.138 (data centre Z)

compactionstats: No compactions. All nodes have:
  pending tasks: 0
  Active compaction remaining time : n/a

netstats: All except one node have nothing. One node (192.168.60.131, not one of the nodes listed in the tpstats section above) has (note the Responses Pending value of 1):
  Mode: NORMAL
  Not sending any streams.
  Read Repair Statistics:
  Attempted: 4233
  Mismatch (Blocking): 0
  Mismatch (Background): 243
  Pool Name    Active   Pending   Completed
  Commands     n/a      0         34785445
  Responses    n/a      1         38567167

Repair sessions: I looked for repair sessions that failed to complete. On 3 of the 4 nodes mentioned in tpstats above I found that they had sent merkle tree requests and got responses from all but one node. In the log file for the node that failed to respond there is no sign that it ever received the request. On 1 node (172.18.68.138) it looks like responses were received from every node, some streaming was done, and then... nothing. Details:

Node 192.168.21.13 (data centre R): Sent merkle trees to /172.18.33.24, /192.168.60.140, /192.168.60.142, /172.18.68.139, /172.18.68.138, /172.18.33.22, /192.168.21.13 for table brokers, never got a response from /172.18.68.139. On /172.18.68.139, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table brokers.

Node 192.168.60.134 (data centre A): Sent merkle trees to /172.18.68.139, /172.18.68.138, /192.168.60.132, /192.168.21.14, /192.168.60.134 for table swxess_outbound, never got a response from /172.18.68.138. On /172.18.68.138, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table swxess_outbound.

Node 192.168.60.136 (data centre A): Sent merkle trees to /192.168.60.142, /172.18.68.139, /192.168.60.136 for table rollups7200, never got a response from /172.18.68.139. This repair session is never mentioned in the /172.18.68.139 log.

Node 172.18.68.138 (data centre Z): The issue here seems to be repair session #a55c16e1-35eb-11e4-8e7e-51c077eaf311. It got responses for all its merkle tree requests, did some streaming, but seems to have stopped after finishing with one table (rollups60). I found it as follows: it is the only repair for which there is no "session completed successfully" message in the log. Some log file snippets are attached.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133959#comment-14133959 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~kohlisankalp]: Thanks for the reference to CASSANDRA-6747. This seems to be exactly what I am talking about. Doesn't the patch from CASSANDRA-6747 handle TOEs already? (Sorry, I haven't studied the patch yet.)

TombstoneOverwhelmingException should not wait for timeout
----------------------------------------------------------

Key: CASSANDRA-7886
URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Tested with Cassandra 2.0.8
Reporter: Christian Spriegel
Priority: Minor
Fix For: 3.0

*Issue*
When you have TombstoneOverwhelmingExceptions occurring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead, the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-(

*Proposed solution*
I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133974#comment-14133974 ]

Christian Spriegel commented on CASSANDRA-7886:
-----------------------------------------------

[~kohlisankalp]: Hi again! Sorry for all the mails, but I just had a look at your 2.1 patch: I think removing the try-catch in ReadVerbHandler should do the trick, right? Then TOEs would be handled by your code in the MessageDeliveryTask?

ReadVerbHandler:
{code}
         Row row;
-        try
-        {
             row = command.getRow(keyspace);
-        }
-        catch (TombstoneOverwhelmingException e)
-        {
-            // error already logged. Drop the request
-            return;
-        }
{code}

TombstoneOverwhelmingException should not wait for timeout
----------------------------------------------------------

Key: CASSANDRA-7886
URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Tested with Cassandra 2.0.8
Reporter: Christian Spriegel
Priority: Minor
Fix For: 3.0

*Issue*
When you have TombstoneOverwhelmingExceptions occurring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead, the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-(

*Proposed solution*
I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
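To picture the shape of the change being discussed (a hedged sketch only, not the actual 6747/7886 patch; makeErrorResponse is a hypothetical helper), the verb handler would reply to the coordinator with an explicit error instead of dropping the request, so the coordinator can fail fast rather than wait out the timeout:

{code}
public void doVerb(MessageIn<ReadCommand> message, int id)
{
    ReadCommand command = message.payload;
    Keyspace keyspace = Keyspace.open(command.ksName);
    try
    {
        Row row = command.getRow(keyspace);
        MessageOut<ReadResponse> reply = new MessageOut<>(MessagingService.Verb.REQUEST_RESPONSE,
                                                          getResponse(command, row),
                                                          ReadResponse.serializer);
        MessagingService.instance().sendReply(reply, id, message.from);
    }
    catch (TombstoneOverwhelmingException e)
    {
        // Hypothetical: answer with an error message instead of silently dropping,
        // so the coordinator surfaces the failure immediately.
        MessagingService.instance().sendReply(makeErrorResponse(e), id, message.from);
    }
}
{code}

As the follow-up comment below notes, a real fix also needs coordinator-side handling (and a protocol-level error path), so this only illustrates the data-node half.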
[jira] [Issue Comment Deleted] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Razi Khaja updated CASSANDRA-7904:
----------------------------------
    Comment: was deleted

(was: I think this might be related since it mentions *RepairJob* ... I hope it is helpful.
{code}
 INFO [AntiEntropyStage:1] 2014-09-12 16:36:29,536 RepairSession.java (line 166) [repair #ec6b4340-3abd-11e4-b32d-db378a0ca7f3] Received merkle tree for genome_protein_v10 from /XXX.XXX.XXX.XXX
ERROR [MiscStage:58] 2014-09-12 16:36:29,537 CassandraDaemon.java (line 199) Exception in thread Thread[MiscStage:58,5,main]
java.lang.IllegalArgumentException: Unknown keyspace/cf pair (megalink.probe_gene_v24)
    at org.apache.cassandra.db.Keyspace.getColumnFamilyStore(Keyspace.java:171)
    at org.apache.cassandra.service.SnapshotVerbHandler.doVerb(SnapshotVerbHandler.java:42)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
ERROR [RepairJobTask:7] 2014-09-12 16:36:29,537 RepairJob.java (line 125) Error occurred during snapshot phase
java.lang.RuntimeException: Could not create snapshot at /XXX.XXX.XXX.XXX
    at org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:81)
    at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:47)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
ERROR [AntiEntropySessions:73] 2014-09-12 16:36:29,540 RepairSession.java (line 288) [repair #ec6b4340-3abd-11e4-b32d-db378a0ca7f3] session completed with the following error
java.io.IOException: Failed during snapshot creation.
    at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:323)
    at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:126)
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1160)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
ERROR [AntiEntropySessions:73] 2014-09-12 16:36:29,543 CassandraDaemon.java (line 199) Exception in thread Thread[AntiEntropySessions:73,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Failed during snapshot creation.
    at com.google.common.base.Throwables.propagate(Throwables.java:160)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed during snapshot creation.
    at org.apache.cassandra.repair.RepairSession.failedSnapshot(RepairSession.java:323)
    at org.apache.cassandra.repair.RepairJob$2.onFailure(RepairJob.java:126)
    at com.google.common.util.concurrent.Futures$4.run(Futures.java:1160)
    ... 3 more
{code})

Repair hangs
------------

Key: CASSANDRA-7904
URL: https://issues.apache.org/jira/browse/CASSANDRA-7904
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version 1.7.0_45
Reporter: Duncan Sands
Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136

Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node. Repair hangs for some of the nodes almost every weekend. It hung last weekend, here are the details:

In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly.

tpstats: 4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The
[jira] [Comment Edited] (CASSANDRA-7886) TombstoneOverwhelmingException should not wait for timeout
[ https://issues.apache.org/jira/browse/CASSANDRA-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133974#comment-14133974 ]

Christian Spriegel edited comment on CASSANDRA-7886 at 9/15/14 3:07 PM:
------------------------------------------------------------------------

[~kohlisankalp]: Hi again! Sorry for all the mails, but I just had a look at your 2.1 patch: I think removing the try-catch in ReadVerbHandler should do the trick, right? Then TOEs would be handled by your code in the MessageDeliveryTask?

ReadVerbHandler:
{code}
         Row row;
-        try
-        {
             row = command.getRow(keyspace);
-        }
-        catch (TombstoneOverwhelmingException e)
-        {
-            // error already logged. Drop the request
-            return;
-        }
{code}

Edit: Looking a bit closer, I think it's missing a few more pieces. But in my naive mind it does not look like a big protocol change. I would like to hear your opinion.

was (Author: christianmovi):
[~kohlisankalp]: Hi again! Sorry for all the mails, but I just had a look at your 2.1 patch: I think removing the try-catch in ReadVerbHandler should do the trick, right? Then TOEs would be handled by your code in the MessageDeliveryTask?

ReadVerbHandler:
{code}
         Row row;
-        try
-        {
             row = command.getRow(keyspace);
-        }
-        catch (TombstoneOverwhelmingException e)
-        {
-            // error already logged. Drop the request
-            return;
-        }
{code}

TombstoneOverwhelmingException should not wait for timeout
----------------------------------------------------------

Key: CASSANDRA-7886
URL: https://issues.apache.org/jira/browse/CASSANDRA-7886
Project: Cassandra
Issue Type: Improvement
Components: Core
Environment: Tested with Cassandra 2.0.8
Reporter: Christian Spriegel
Priority: Minor
Fix For: 3.0

*Issue*
When you have TombstoneOverwhelmingExceptions occurring in queries, this will cause the query to be simply dropped on every data-node, but no response is sent back to the coordinator. Instead, the coordinator waits for the specified read_request_timeout_in_ms. On the application side this can cause memory issues, since the application is waiting for the timeout interval for every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions, then (sooner or later) our entire application cluster goes down :-(

*Proposed solution*
I think the data nodes should send an error message to the coordinator when they run into a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout-interval.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133990#comment-14133990 ]

Razi Khaja commented on CASSANDRA-7904:
---------------------------------------

I increased my request_timeout_in_ms from 2 to 18 and repair is working now, so far for 2 hours, without *Lost notification*. In the comment I made above, for my keyspace megalink, repair command #10 lost notification within 4 minutes, so the fact that my current repair is still running for 2 hours is a good sign.

Repair hangs
------------

Key: CASSANDRA-7904
URL: https://issues.apache.org/jira/browse/CASSANDRA-7904
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version 1.7.0_45
Reporter: Duncan Sands
Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136

Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node. Repair hangs for some of the nodes almost every weekend. It hung last weekend, here are the details:

In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly.

tpstats: 4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The nodes are:
  192.168.21.13 (data centre R)
  192.168.60.134 (data centre A)
  192.168.60.136 (data centre A)
  172.18.68.138 (data centre Z)

compactionstats: No compactions. All nodes have:
  pending tasks: 0
  Active compaction remaining time : n/a

netstats: All except one node have nothing. One node (192.168.60.131, not one of the nodes listed in the tpstats section above) has (note the Responses Pending value of 1):
  Mode: NORMAL
  Not sending any streams.
  Read Repair Statistics:
  Attempted: 4233
  Mismatch (Blocking): 0
  Mismatch (Background): 243
  Pool Name    Active   Pending   Completed
  Commands     n/a      0         34785445
  Responses    n/a      1         38567167

Repair sessions: I looked for repair sessions that failed to complete. On 3 of the 4 nodes mentioned in tpstats above I found that they had sent merkle tree requests and got responses from all but one node. In the log file for the node that failed to respond there is no sign that it ever received the request. On 1 node (172.18.68.138) it looks like responses were received from every node, some streaming was done, and then... nothing. Details:

Node 192.168.21.13 (data centre R): Sent merkle trees to /172.18.33.24, /192.168.60.140, /192.168.60.142, /172.18.68.139, /172.18.68.138, /172.18.33.22, /192.168.21.13 for table brokers, never got a response from /172.18.68.139. On /172.18.68.139, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table brokers.

Node 192.168.60.134 (data centre A): Sent merkle trees to /172.18.68.139, /172.18.68.138, /192.168.60.132, /192.168.21.14, /192.168.60.134 for table swxess_outbound, never got a response from /172.18.68.138. On /172.18.68.138, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table swxess_outbound.

Node 192.168.60.136 (data centre A): Sent merkle trees to /192.168.60.142, /172.18.68.139, /192.168.60.136 for table rollups7200, never got a response from /172.18.68.139. This repair session is never mentioned in the /172.18.68.139 log.

Node 172.18.68.138 (data centre Z): The issue here seems to be repair session #a55c16e1-35eb-11e4-8e7e-51c077eaf311. It got responses for all its merkle tree requests, did some streaming, but seems to have stopped after finishing with one table (rollups60). I found it as follows: it is the only repair for which there is no "session completed successfully" message in the log. Some log file snippets are attached.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
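For reference, the setting being adjusted here is presumably the coordinator timeout in cassandra.yaml (the numeric values in the comment appear to have lost digits in the archive extraction); a hedged example of the relevant knobs:

{code}
# cassandra.yaml -- how long the coordinator waits for replica responses
# before timing a request out. Values are illustrative, not a recommendation.
read_request_timeout_in_ms: 10000
request_timeout_in_ms: 180000
{code}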
[jira] [Comment Edited] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133990#comment-14133990 ]

Razi Khaja edited comment on CASSANDRA-7904 at 9/15/14 3:11 PM:
-----------------------------------------------------------------

I increased my request_timeout_in_ms from 2 to 18 and repair is working now, so far for 2 hours, without *Lost notification*. In the comment I made above, for my keyspace megalink, repair command #10 lost notification within 4 minutes, so the fact that my current repair is still running for 2 hours is a good sign.

was (Author: razi.kh...@gmail.com):
I increase my request_timeout_in_ms from 2 to 18 and repair is working now, so far for 2 hours, without *Lost notification*. In the comment I made above, for my keyspace megalink, repair command #10 lost notification within 4 minutes, so the fact that my current repair is still running for 2 hours is a good sign.

Repair hangs
------------

Key: CASSANDRA-7904
URL: https://issues.apache.org/jira/browse/CASSANDRA-7904
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version 1.7.0_45
Reporter: Duncan Sands
Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136

Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node. Repair hangs for some of the nodes almost every weekend. It hung last weekend, here are the details:

In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly.

tpstats: 4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The nodes are:
  192.168.21.13 (data centre R)
  192.168.60.134 (data centre A)
  192.168.60.136 (data centre A)
  172.18.68.138 (data centre Z)

compactionstats: No compactions. All nodes have:
  pending tasks: 0
  Active compaction remaining time : n/a

netstats: All except one node have nothing. One node (192.168.60.131, not one of the nodes listed in the tpstats section above) has (note the Responses Pending value of 1):
  Mode: NORMAL
  Not sending any streams.
  Read Repair Statistics:
  Attempted: 4233
  Mismatch (Blocking): 0
  Mismatch (Background): 243
  Pool Name    Active   Pending   Completed
  Commands     n/a      0         34785445
  Responses    n/a      1         38567167

Repair sessions: I looked for repair sessions that failed to complete. On 3 of the 4 nodes mentioned in tpstats above I found that they had sent merkle tree requests and got responses from all but one node. In the log file for the node that failed to respond there is no sign that it ever received the request. On 1 node (172.18.68.138) it looks like responses were received from every node, some streaming was done, and then... nothing. Details:

Node 192.168.21.13 (data centre R): Sent merkle trees to /172.18.33.24, /192.168.60.140, /192.168.60.142, /172.18.68.139, /172.18.68.138, /172.18.33.22, /192.168.21.13 for table brokers, never got a response from /172.18.68.139. On /172.18.68.139, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table brokers.

Node 192.168.60.134 (data centre A): Sent merkle trees to /172.18.68.139, /172.18.68.138, /192.168.60.132, /192.168.21.14, /192.168.60.134 for table swxess_outbound, never got a response from /172.18.68.138. On /172.18.68.138, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table swxess_outbound.

Node 192.168.60.136 (data centre A): Sent merkle trees to /192.168.60.142, /172.18.68.139, /192.168.60.136 for table rollups7200, never got a response from /172.18.68.139. This repair session is never mentioned in the /172.18.68.139 log.

Node 172.18.68.138 (data centre Z): The issue here seems to be repair session #a55c16e1-35eb-11e4-8e7e-51c077eaf311. It got responses for all its merkle tree requests, did some streaming, but seems to have stopped after finishing with one table (rollups60). I found it as follows: it is the only repair for which there is no "session completed successfully" message in the log. Some log file snippets are attached.
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133997#comment-14133997 ]

Cyril Scetbon commented on CASSANDRA-7731:
------------------------------------------

bq. the weighted one is the default but you can specify it to length of application

What do you mean? We don't want to change it to be the maximum value since the application started. The only concern is that in some cases it's not the maximum for the last 5 minutes, but can be for the last 20 minutes, as in my case. Do you think the latest version of metrics could enforce it to correspond to the last 5 minutes?

Get max values for live/tombstone cells per slice
-------------------------------------------------

Key: CASSANDRA-7731
URL: https://issues.apache.org/jira/browse/CASSANDRA-7731
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Cyril Scetbon
Assignee: Robert Stupp
Priority: Minor
Fix For: 2.1.1
Attachments: 7731-2.0.txt, 7731-2.1.txt

I think you should not say that slice statistics are valid for the [last five minutes|https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in the CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes, except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it. The only thing I can see is that the last snapshot used to provide the median (or whatever you'd use instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query, and that's not possible for now without activating DEBUG for SliceQueryFilter, for example, and tweaking the threshold. Currently, as nodetool cfstats returns the median, if only a small fraction of the queries scan a lot of live/tombstone cells we miss it!

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7916) Stress should collect and cross-cluster GC statistics
[ https://issues.apache.org/jira/browse/CASSANDRA-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14134014#comment-14134014 ]

T Jake Luciani commented on CASSANDRA-7916:
-------------------------------------------

Just tested this and the summary stats are not working for GC:

{code}
Results:
op rate                   : 96
partition rate            : 2389
row rate                  : 105789
latency mean              : 166.1
latency median            : 141.6
latency 95th percentile   : 399.9
latency 99th percentile   : 512.2
latency 99.9th percentile : 858.7
latency max               : 962.3
total gc count            : 0
total gc mb               : 0
total gc time (s)         : 0
avg gc time(ms)           : NaN
stdev gc time(ms)         : 0
Total operation time      : 00:00:39
{code}

Stress should collect and cross-cluster GC statistics
-----------------------------------------------------

Key: CASSANDRA-7916
URL: https://issues.apache.org/jira/browse/CASSANDRA-7916
Project: Cassandra
Issue Type: Improvement
Components: Tools
Reporter: Benedict
Assignee: Benedict
Priority: Minor
Fix For: 2.1.1

It would be useful to see stress outputs deliver cross-cluster statistics, the most useful being GC data. Some simple changes to GCInspector collect the data, and can deliver it to a nodetool request or to stress over JMX.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7716) cassandra-stress: provide better error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani updated CASSANDRA-7716:
--------------------------------------
    Attachment: 7166v2.txt

v2 against latest stress changes

cassandra-stress: provide better error messages
------------------------------------------------

Key: CASSANDRA-7716
URL: https://issues.apache.org/jira/browse/CASSANDRA-7716
Project: Cassandra
Issue Type: Improvement
Reporter: Robert Stupp
Assignee: T Jake Luciani
Priority: Trivial
Fix For: 2.1.1
Attachments: 7166v2.txt, 7716.txt

Just tried the new stress tool. It would be great if the stress tool gave better error messages by telling the user which option or config parameter/value caused an error. YAML parse errors are meaningful (they give code snippets etc). Examples are:

{noformat}
WARN  16:59:39 Setting caching options with deprecated syntax.
Exception in thread "main" java.lang.NullPointerException
    at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
    at java.util.regex.Matcher.reset(Matcher.java:308)
    at java.util.regex.Matcher.<init>(Matcher.java:228)
    at java.util.regex.Pattern.matcher(Pattern.java:1088)
    at org.apache.cassandra.stress.settings.OptionDistribution.get(OptionDistribution.java:67)
    at org.apache.cassandra.stress.StressProfile.init(StressProfile.java:151)
    at org.apache.cassandra.stress.StressProfile.load(StressProfile.java:482)
    at org.apache.cassandra.stress.settings.SettingsCommandUser.<init>(SettingsCommandUser.java:53)
    at org.apache.cassandra.stress.settings.SettingsCommandUser.build(SettingsCommandUser.java:114)
    at org.apache.cassandra.stress.settings.SettingsCommand.get(SettingsCommand.java:134)
    at org.apache.cassandra.stress.settings.StressSettings.get(StressSettings.java:218)
    at org.apache.cassandra.stress.settings.StressSettings.parse(StressSettings.java:206)
    at org.apache.cassandra.stress.Stress.main(Stress.java:58)
{noformat}

When the table definition is wrong:

{noformat}
Exception in thread "main" java.lang.RuntimeException: org.apache.cassandra.exceptions.SyntaxException: line 6:14 mismatched input '(' expecting ')'
    at org.apache.cassandra.config.CFMetaData.compile(CFMetaData.java:550)
    at org.apache.cassandra.stress.StressProfile.init(StressProfile.java:134)
    at org.apache.cassandra.stress.StressProfile.load(StressProfile.java:482)
    at org.apache.cassandra.stress.settings.SettingsCommandUser.<init>(SettingsCommandUser.java:53)
    at org.apache.cassandra.stress.settings.SettingsCommandUser.build(SettingsCommandUser.java:114)
    at org.apache.cassandra.stress.settings.SettingsCommand.get(SettingsCommand.java:134)
    at org.apache.cassandra.stress.settings.StressSettings.get(StressSettings.java:218)
    at org.apache.cassandra.stress.settings.StressSettings.parse(StressSettings.java:206)
    at org.apache.cassandra.stress.Stress.main(Stress.java:58)
Caused by: org.apache.cassandra.exceptions.SyntaxException: line 6:14 mismatched input '(' expecting ')'
    at org.apache.cassandra.cql3.CqlParser.throwLastRecognitionError(CqlParser.java:273)
    at org.apache.cassandra.cql3.QueryProcessor.parseStatement(QueryProcessor.java:456)
    at org.apache.cassandra.config.CFMetaData.compile(CFMetaData.java:541)
    ... 8 more
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7402) limit the on heap memory available to requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

T Jake Luciani updated CASSANDRA-7402:
--------------------------------------
    Labels: ops  (was: )

limit the on heap memory available to requests
----------------------------------------------

Key: CASSANDRA-7402
URL: https://issues.apache.org/jira/browse/CASSANDRA-7402
Project: Cassandra
Issue Type: Improvement
Reporter: T Jake Luciani
Labels: ops
Fix For: 3.0

When running a production cluster, one common operational issue is quantifying GC pauses caused by ongoing requests. Since different queries return varying amounts of data, you can easily get yourself into a situation where a couple of bad actors stop the world for the whole system. Or, more likely, the aggregate garbage generated on a single node across all in-flight requests causes a GC. We should be able to set a limit on the max heap we can allocate to all outstanding requests, and track the garbage per request, to stop this from happening. It should increase a single node's availability substantially.

In the yaml this would be:

{code}
total_request_memory_space_mb: 400
{code}

It would also be nice to have a log of the queries which generate the most garbage, so operators can track this. Also a histogram.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
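As a thought experiment on how per-request garbage could be measured at all (purely illustrative, not from this ticket or any patch), HotSpot exposes per-thread allocation counters through com.sun.management.ThreadMXBean, so a request's allocation can be sampled before and after execution and then charged against a budget or fed into a histogram. A minimal sketch, assuming the request runs on a single thread:

{code}
import java.lang.management.ManagementFactory;

public class RequestGarbageTracker
{
    // HotSpot-specific: the platform ThreadMXBean is castable to the
    // com.sun.management variant, which exposes allocation counters.
    private static final com.sun.management.ThreadMXBean THREADS =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    /** Runs a request and returns roughly how many bytes it allocated on this thread. */
    public static long measureAllocation(Runnable request)
    {
        long tid = Thread.currentThread().getId();
        long before = THREADS.getThreadAllocatedBytes(tid);
        request.run();
        return THREADS.getThreadAllocatedBytes(tid) - before;
    }

    public static void main(String[] args)
    {
        long garbage = measureAllocation(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++)
                sb.append(i); // deliberately allocation-heavy
        });
        System.out.println("request allocated ~" + garbage + " bytes");
    }
}
{code}

A global cap like the proposed total_request_memory_space_mb would then amount to debiting these per-request samples from a shared budget and shedding load once it is exhausted; that bookkeeping is the hard part the ticket is really about.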
[jira] [Updated] (CASSANDRA-7402) limit the on heap memory available to requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-7402: Labels: ops performance stability (was: ops)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/3] git commit: ninja-fix cassandra-stress totalGcStats metric
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 1d7691e25 -> 6bff5a331
  refs/heads/trunk 35af28e55 -> 969967cf9

ninja-fix cassandra-stress totalGcStats metric

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6bff5a33
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6bff5a33
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6bff5a33

Branch: refs/heads/cassandra-2.1
Commit: 6bff5a3318c46d09f1338665d0d0251bf5bda3a1
Parents: 1d7691e
Author: Benedict Elliott Smith bened...@apache.org
Authored: Mon Sep 15 16:56:45 2014 +0100
Committer: Benedict Elliott Smith bened...@apache.org
Committed: Mon Sep 15 16:56:45 2014 +0100

 tools/stress/src/org/apache/cassandra/stress/StressMetrics.java | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6bff5a33/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java

diff --git a/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java b/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java
index 9e8e961..6d5f387 100644
--- a/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java
+++ b/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java
@@ -22,6 +22,7 @@ package org.apache.cassandra.stress;

 import java.io.PrintStream;
+import java.util.Arrays;
 import java.util.List;
 import java.util.concurrent.Callable;
 import java.util.concurrent.CountDownLatch;
@@ -56,16 +57,15 @@ public class StressMetrics
     {
         this.output = output;
         Callable<JmxCollector.GcStats> gcStatsCollector;
+        totalGcStats = new JmxCollector.GcStats(0);
         try
         {
             gcStatsCollector = new JmxCollector(settings.node.nodes, settings.port.jmxPort);
-            totalGcStats = new JmxCollector.GcStats(0);
         }
         catch (Throwable t)
         {
             t.printStackTrace();
             System.err.println("Failed to connect over JMX; not collecting these stats");
-            totalGcStats = new JmxCollector.GcStats(Double.POSITIVE_INFINITY);
             gcStatsCollector = new Callable<JmxCollector.GcStats>()
             {
                 public JmxCollector.GcStats call() throws Exception
@@ -149,6 +149,7 @@ public class StressMetrics
     private void update() throws InterruptedException
     {
         Timing.TimingResult<JmxCollector.GcStats> result = timing.snap(gcStatsCollector);
+        totalGcStats = JmxCollector.GcStats.aggregate(Arrays.asList(totalGcStats, result.extra));
         if (result.timing.partitionCount != 0)
             printRow("", result.timing, timing.getHistory(), result.extra, rowRateUncertainty, output);
         rowRateUncertainty.update(result.timing.adjustedRowRate());
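Restated as a self-contained pattern (GcStats below is a stand-in type; the real JmxCollector.GcStats is internal to cassandra-stress): initialize the running total unconditionally, so the failure path where no JMX connection is available leaves a valid zero total rather than a poisoned infinite one, then fold each sampling interval into it.

{code}
import java.util.Arrays;

public class RunningTotalExample
{
    // Stand-in for JmxCollector.GcStats: an additive stats holder.
    static final class GcStats
    {
        final double count;
        GcStats(double count) { this.count = count; }
        static GcStats aggregate(Iterable<GcStats> all)
        {
            double sum = 0;
            for (GcStats s : all)
                sum += s.count;
            return new GcStats(sum);
        }
    }

    // Initialized before any fallible setup, exactly like the fix above.
    private GcStats total = new GcStats(0);

    // Called once per reporting interval with that interval's sample.
    void update(GcStats interval)
    {
        total = GcStats.aggregate(Arrays.asList(total, interval));
    }
}
{code}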
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/969967cf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/969967cf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/969967cf Branch: refs/heads/trunk Commit: 969967cf9f5597a32d1a8d2575434f8088add87e Parents: 35af28e 6bff5a3 Author: Benedict Elliott Smith bened...@apache.org Authored: Mon Sep 15 16:56:50 2014 +0100 Committer: Benedict Elliott Smith bened...@apache.org Committed: Mon Sep 15 16:56:50 2014 +0100 -- tools/stress/src/org/apache/cassandra/stress/StressMetrics.java | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/969967cf/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java --
[2/3] git commit: ninja-fix cassandra-stress totalGcStats metric
ninja-fix cassandra-stress totalGcStats metric

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6bff5a33
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6bff5a33
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6bff5a33

Branch: refs/heads/trunk
Commit: 6bff5a3318c46d09f1338665d0d0251bf5bda3a1
Parents: 1d7691e
Author: Benedict Elliott Smith bened...@apache.org
Authored: Mon Sep 15 16:56:45 2014 +0100
Committer: Benedict Elliott Smith bened...@apache.org
Committed: Mon Sep 15 16:56:45 2014 +0100

 tools/stress/src/org/apache/cassandra/stress/StressMetrics.java | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6bff5a33/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java

diff --git a/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java b/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java
index 9e8e961..6d5f387 100644
--- a/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java
+++ b/tools/stress/src/org/apache/cassandra/stress/StressMetrics.java
@@ -22,6 +22,7 @@ package org.apache.cassandra.stress;

 import java.io.PrintStream;
+import java.util.Arrays;
 import java.util.List;
 import java.util.concurrent.Callable;
 import java.util.concurrent.CountDownLatch;
@@ -56,16 +57,15 @@ public class StressMetrics
     {
         this.output = output;
         Callable<JmxCollector.GcStats> gcStatsCollector;
+        totalGcStats = new JmxCollector.GcStats(0);
         try
         {
             gcStatsCollector = new JmxCollector(settings.node.nodes, settings.port.jmxPort);
-            totalGcStats = new JmxCollector.GcStats(0);
         }
         catch (Throwable t)
         {
             t.printStackTrace();
             System.err.println("Failed to connect over JMX; not collecting these stats");
-            totalGcStats = new JmxCollector.GcStats(Double.POSITIVE_INFINITY);
             gcStatsCollector = new Callable<JmxCollector.GcStats>()
             {
                 public JmxCollector.GcStats call() throws Exception
@@ -149,6 +149,7 @@ public class StressMetrics
     private void update() throws InterruptedException
     {
         Timing.TimingResult<JmxCollector.GcStats> result = timing.snap(gcStatsCollector);
+        totalGcStats = JmxCollector.GcStats.aggregate(Arrays.asList(totalGcStats, result.extra));
         if (result.timing.partitionCount != 0)
             printRow("", result.timing, timing.getHistory(), result.extra, rowRateUncertainty, output);
         rowRateUncertainty.update(result.timing.adjustedRowRate());
svn commit: r1625089 - in /cassandra/site: publish/index.html src/content/index.html
Author: jbellis
Date: Mon Sep 15 15:56:48 2014
New Revision: 1625089

URL: http://svn.apache.org/r1625089
Log: update w/ Apple's deployment summary

Modified:
  cassandra/site/publish/index.html
  cassandra/site/src/content/index.html

Modified: cassandra/site/publish/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1625089&r1=1625088&r2=1625089&view=diff
==============================================================================
--- cassandra/site/publish/index.html (original)
+++ cassandra/site/publish/index.html Mon Sep 15 15:56:48 2014
@@ -112,7 +112,7 @@
 <a href="http://planetcassandra.org/blog/post/reddit-upvotes-apache-cassandras-horizontal-scaling-managing-1700-votes-daily/">Reddit</a>,
 <a href="http://planetcassandra.org/blog/post/make-it-rain-apache-cassandra-at-the-weather-channel-for-severe-weather-alerts/">The Weather Channel</a>, and
 <a href="http://planetcassandra.org/companies/">over 1500 more companies</a> that have large, active data sets.
 </p>
-<p>One of the largest production deployments consists of over 15,000 nodes storing over 4 PB of data. Other large Cassandra installations include Netflix (2,500 nodes, 420 TB, over 1 trillion requests per day), Chinese search engine Easou (270 nodes, 300 TB, over 800 million reqests per day), and eBay (over 100 nodes, 250 TB).
+<p>One of the largest production deployments is Apple's, with over 75,000 nodes storing over 10 PB of data. Other large Cassandra installations include Netflix (2,500 nodes, 420 TB, over 1 trillion requests per day), Chinese search engine Easou (270 nodes, 300 TB, over 800 million reqests per day), and eBay (over 100 nodes, 250 TB).
 </p>
 </li>
 <li>

Modified: cassandra/site/src/content/index.html
URL: http://svn.apache.org/viewvc/cassandra/site/src/content/index.html?rev=1625089&r1=1625088&r2=1625089&view=diff
==============================================================================
--- cassandra/site/src/content/index.html (original)
+++ cassandra/site/src/content/index.html Mon Sep 15 15:56:48 2014
@@ -58,7 +58,7 @@
 <a href="http://planetcassandra.org/blog/post/reddit-upvotes-apache-cassandras-horizontal-scaling-managing-1700-votes-daily/">Reddit</a>,
 <a href="http://planetcassandra.org/blog/post/make-it-rain-apache-cassandra-at-the-weather-channel-for-severe-weather-alerts/">The Weather Channel</a>, and
 <a href="http://planetcassandra.org/companies/">over 1500 more companies</a> that have large, active data sets.
 </p>
-<p>One of the largest production deployments consists of over 15,000 nodes storing over 4 PB of data. Other large Cassandra installations include Netflix (2,500 nodes, 420 TB, over 1 trillion requests per day), Chinese search engine Easou (270 nodes, 300 TB, over 800 million reqests per day), and eBay (over 100 nodes, 250 TB).
+<p>One of the largest production deployments is Apple's, with over 75,000 nodes storing over 10 PB of data. Other large Cassandra installations include Netflix (2,500 nodes, 420 TB, over 1 trillion requests per day), Chinese search engine Easou (270 nodes, 300 TB, over 800 million reqests per day), and eBay (over 100 nodes, 250 TB).
 </p>
 </li>
 <li>
[jira] [Commented] (CASSANDRA-7916) Stress should collect and cross-cluster GC statistics
[ https://issues.apache.org/jira/browse/CASSANDRA-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134042#comment-14134042 ] Benedict commented on CASSANDRA-7916: - ninja fixed Stress should collect and cross-cluster GC statistics - Key: CASSANDRA-7916 URL: https://issues.apache.org/jira/browse/CASSANDRA-7916 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 2.1.1 It would be useful to see stress outputs deliver cross-cluster statistics, the most useful being GC data. Some simple changes to GCInspector collect the data, and can deliver to a nodetool request or to stress over JMX. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
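For context on mechanics: GC counters can be sampled from each node with the standard platform MXBeans over remote JMX. A minimal sketch of that generic approach (this is plain JMX, not the GCInspector changes the ticket actually made):

{code}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class GcPoller
{
    // Total milliseconds spent in GC across all collectors on one node.
    public static long totalGcMillis(String host, int jmxPort) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://" + host + ":" + jmxPort + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            long total = 0;
            for (GarbageCollectorMXBean gc :
                 ManagementFactory.getPlatformMXBeans(mbs, GarbageCollectorMXBean.class))
                total += gc.getCollectionTime(); // cumulative ms for this collector
            return total;
        }
        finally
        {
            connector.close();
        }
    }
}
{code}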
[jira] [Commented] (CASSANDRA-7716) cassandra-stress: provide better error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134049#comment-14134049 ] Benedict commented on CASSANDRA-7716: - I think the old behaviour of StressProfile.select() makes more sense...? If you've specified a property key but no value, that should probably fail rather than use the default, don't you think? Otherwise LGTM, and no really strong feeling on that issue.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14133997#comment-14133997 ] Cyril Scetbon edited comment on CASSANDRA-7731 at 9/15/14 4:06 PM: ---

bq. the weighted one is the default but you can specify it to length of application

What do you mean? We don't want to change it to be the maximum value since the application started. The only issue is that in some cases it's not the maximum for the last 5 minutes, but can be for the last 20 minutes, as in my case. Do you think the latest version of Metrics could enforce it to correspond to the last 5 minutes? AFAIU the documentation, since [exponentially decaying reservoirs|https://dropwizard.github.io/metrics/3.1.0/manual/core/#exponentially-decaying-reservoirs] use a forward-decaying priority reservoir, they should represent the recent data.

was (Author: cscetbon):
bq. the weighted one is the default but you can specify it to length of application

What do you mean? We don't want to change it to be the maximum value since the application started. The only issue is that in some cases it's not the maximum for the last 5 minutes, but can be for the last 20 minutes, as in my case. Do you think the latest version of Metrics could enforce it to correspond to the last 5 minutes?

Get max values for live/tombstone cells per slice
-
Key: CASSANDRA-7731
URL: https://issues.apache.org/jira/browse/CASSANDRA-7731
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Cyril Scetbon
Assignee: Robert Stupp
Priority: Minor
Fix For: 2.1.1
Attachments: 7731-2.0.txt, 7731-2.1.txt

I think you should not say that slice statistics are valid for the [last five minutes|https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in the CFSTATS command of nodetool. I've read the documentation from yammer for Histograms, and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it. The only thing I can see is that the last snapshot, used to provide the median (or whatever you'd use instead), is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query, and that's not possible for now without activating DEBUG for SliceQueryFilter, for example, and tweaking the threshold. Currently, as nodetool cfstats returns the median, if only a small share of the queries scan a lot of live/tombstone cells we miss it!

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7716) cassandra-stress: provide better error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134055#comment-14134055 ] Benedict commented on CASSANDRA-7716: - I figure default is only if it's not provided; if it's provided but empty they perhaps messed up. But either is totally reasonable, so not going to quibble.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
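The two policies being debated, sketched as a hypothetical helper (select() above is StressProfile's; lenient() and strict() here are illustrative stand-ins, not the tool's actual methods):

{code}
import java.util.Map;

public final class OptionPolicies
{
    // Lenient (roughly the patched behaviour): absent OR empty falls back to the default.
    static String lenient(Map<String, String> conf, String key, String dflt)
    {
        String v = conf.get(key);
        return (v == null || v.isEmpty()) ? dflt : v;
    }

    // Strict (the old behaviour Benedict prefers): only an absent key gets the
    // default; a key supplied without a value is probably a user mistake, so fail.
    static String strict(Map<String, String> conf, String key, String dflt)
    {
        if (!conf.containsKey(key))
            return dflt;
        String v = conf.get(key);
        if (v == null || v.isEmpty())
            throw new IllegalArgumentException("Option '" + key + "' was given without a value");
        return v;
    }
}
{code}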
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134057#comment-14134057 ] Yuki Morishita commented on CASSANDRA-7904: --- *Lost notification* just indicates that JMX lost some notification. It has nothing to do with repair hanging (btw, I created CASSANDRA-7909 for not exiting in this situation). You should check your system.log for repair completion in that case.

Repair hangs
Key: CASSANDRA-7904
URL: https://issues.apache.org/jira/browse/CASSANDRA-7904
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version 1.7.0_45
Reporter: Duncan Sands
Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136

Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node. Repair hangs for some of the nodes almost every weekend. It hung last weekend; here are the details:

In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly.

tpstats: 4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The nodes are:
192.168.21.13 (data centre R)
192.168.60.134 (data centre A)
192.168.60.136 (data centre A)
172.18.68.138 (data centre Z)

compactionstats: no compactions. All nodes have:
pending tasks: 0
Active compaction remaining time: n/a

netstats: all except one node have nothing. One node (192.168.60.131, not one of the nodes listed in the tpstats section above) has (note the Responses Pending value of 1):
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 4233
Mismatch (Blocking): 0
Mismatch (Background): 243

Pool Name     Active   Pending   Completed
Commands      n/a      0         34785445
Responses     n/a      1         38567167

Repair sessions: I looked for repair sessions that failed to complete. On 3 of the 4 nodes mentioned in tpstats above, I found that they had sent merkle tree requests and got responses from all but one node. In the log file for the node that failed to respond there is no sign that it ever received the request. On 1 node (172.18.68.138) it looks like responses were received from every node, some streaming was done, and then... nothing. Details:

Node 192.168.21.13 (data centre R): sent merkle trees to /172.18.33.24, /192.168.60.140, /192.168.60.142, /172.18.68.139, /172.18.68.138, /172.18.33.22, /192.168.21.13 for table brokers; never got a response from /172.18.68.139. On /172.18.68.139, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table brokers.

Node 192.168.60.134 (data centre A): sent merkle trees to /172.18.68.139, /172.18.68.138, /192.168.60.132, /192.168.21.14, /192.168.60.134 for table swxess_outbound; never got a response from /172.18.68.138. On /172.18.68.138, just before this time it sent a response for the same repair session but a different table, and there is no record of it receiving a request for table swxess_outbound.

Node 192.168.60.136 (data centre A): sent merkle trees to /192.168.60.142, /172.18.68.139, /192.168.60.136 for table rollups7200; never got a response from /172.18.68.139. This repair session is never mentioned in the /172.18.68.139 log.

Node 172.18.68.138 (data centre Z): the issue here seems to be repair session #a55c16e1-35eb-11e4-8e7e-51c077eaf311. It got responses for all its merkle tree requests, did some streaming, but seems to have stopped after finishing with one table (rollups60). I found it as follows: it is the only repair for which there is no "session completed successfully" message in the log.

Some log file snippets are attached.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7716) cassandra-stress: provide better error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134053#comment-14134053 ] T Jake Luciani commented on CASSANDRA-7716: --- Just thinking in terms of the user's intention: an empty value means default, to me.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14133997#comment-14133997 ] Cyril Scetbon edited comment on CASSANDRA-7731 at 9/15/14 4:12 PM: ---

bq. the weighted one is the default but you can specify it to length of application

What do you mean? We don't want to change it to be the maximum value since the application started. The only issue is that in some cases it's not the maximum for the last 5 minutes, but can be for the last 20 minutes, as in my case. Do you think the latest version of Metrics could enforce it to correspond to the last 5 minutes? AFAIU the documentation, since [exponentially decaying reservoirs|https://dropwizard.github.io/metrics/2.2.0/manual/core/#biased-histograms] use a forward-decaying priority reservoir, they should represent the recent data.

was (Author: cscetbon):
bq. the weighted one is the default but you can specify it to length of application

What do you mean? We don't want to change it to be the maximum value since the application started. The only issue is that in some cases it's not the maximum for the last 5 minutes, but can be for the last 20 minutes, as in my case. Do you think the latest version of Metrics could enforce it to correspond to the last 5 minutes? AFAIU the documentation, since [exponentially decaying reservoirs|https://dropwizard.github.io/metrics/3.1.0/manual/core/#exponentially-decaying-reservoirs] use a forward-decaying priority reservoir, they should represent the recent data.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134064#comment-14134064 ] Chris Lohfink commented on CASSANDRA-7731: --

bq. we don't want to change it to be the maximum value since the application started

The maximum value since the application started is the only option. The only thing the reservoir is used for is the percentiles (50th, 75th, 90th, etc.). The only thing you can change, in theory, is to use a uniform reservoir instead of an EWMA reservoir, making the percentiles cover the same window (since the start of the app instead of the last 5 min). The min, max, count, sum, std dev, etc. are all based on values since C* started. In the newer versions of Metrics all the values would be computed from the reservoir snapshot, so they would follow the 5-min-ish result.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
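The distinction Chris describes, sketched with Dropwizard Metrics 3.x (the "newer versions of Metrics"; Cassandra 2.x still bundles the older yammer 2.x API, so treat this as illustrative rather than what the server runs):

{code}
import com.codahale.metrics.ExponentiallyDecayingReservoir;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.Snapshot;
import com.codahale.metrics.UniformReservoir;

public class ReservoirDemo
{
    public static void main(String[] args)
    {
        // Percentiles biased toward roughly the most recent five minutes of samples:
        Histogram recent = new Histogram(new ExponentiallyDecayingReservoir());
        // Percentiles over a uniform sample of everything since creation:
        Histogram sinceStart = new Histogram(new UniformReservoir());

        for (long cells = 1; cells <= 10000; cells++)
        {
            recent.update(cells);
            sinceStart.update(cells);
        }

        Snapshot s = recent.getSnapshot();
        System.out.println("p99 (recency-biased): " + s.get99thPercentile());
        // The plain count (like min/max/mean in yammer 2.x) is cumulative
        // since creation, regardless of which reservoir backs the histogram:
        System.out.println("count: " + recent.getCount());
    }
}
{code}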
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134065#comment-14134065 ] Duncan Sands commented on CASSANDRA-7904: - Razi, please open a different JIRA for your issue.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7402) Add metrics to track memory used by client requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-7402: -- Summary: Add metrics to track memory used by client requests (was: limit the on heap memory available to requests)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134064#comment-14134064 ] Chris Lohfink edited comment on CASSANDRA-7731 at 9/15/14 4:18 PM: ---

bq. we don't want to change it to be the maximum value since the application started

The maximum value since the application started is the only option. The only thing the reservoir is used for is the percentiles (50th, 75th, 90th, etc.). The only thing you can change, in theory, is to use a uniform reservoir instead of a decaying reservoir, making the percentiles cover the same window (since the start of the app instead of the last 5 min). The min, max, count, sum, std dev, etc. are all based on values since C* started. In the newer versions of Metrics all the values would be computed from the reservoir snapshot, so they would follow the 5-min-ish result.

was (Author: cnlwsu):
bq. we don't want to change it to be the maximum value since the application started

The maximum value since the application started is the only option. The only thing the reservoir is used for is the percentiles (50th, 75th, 90th, etc.). The only thing you can change, in theory, is to use a uniform reservoir instead of an EWMA reservoir, making the percentiles cover the same window (since the start of the app instead of the last 5 min). The min, max, count, sum, std dev, etc. are all based on values since C* started. In the newer versions of Metrics all the values would be computed from the reservoir snapshot, so they would follow the 5-min-ish result.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
git commit: Better error checking of stress profile
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 6bff5a331 -> b3573f3d3

Better error checking of stress profile

patch by tjake; reviewed by belliottsmith for CASSANDRA-7716

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b3573f3d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b3573f3d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b3573f3d

Branch: refs/heads/cassandra-2.1
Commit: b3573f3d3bc8ab18ee82534625868dae12f7234c
Parents: 6bff5a3
Author: Jake Luciani j...@apache.org
Authored: Mon Sep 15 12:20:43 2014 -0400
Committer: Jake Luciani j...@apache.org
Committed: Mon Sep 15 12:20:43 2014 -0400

 CHANGES.txt | 1 +
 .../apache/cassandra/stress/StressProfile.java | 29
 .../settings/OptionRatioDistribution.java | 2 +-
 3 files changed, 26 insertions(+), 6 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b3573f3d/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index 4c0dbf9..8fe4253 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.1
+ * Add better error checking of new stress profile (CASSANDRA-7716)
  * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
  * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
  * cassandra-stress supports whitelist mode for node config (CASSANDRA-7658)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b3573f3d/tools/stress/src/org/apache/cassandra/stress/StressProfile.java

diff --git a/tools/stress/src/org/apache/cassandra/stress/StressProfile.java b/tools/stress/src/org/apache/cassandra/stress/StressProfile.java
index bd873e8..b0a149c 100644
--- a/tools/stress/src/org/apache/cassandra/stress/StressProfile.java
+++ b/tools/stress/src/org/apache/cassandra/stress/StressProfile.java
@@ -31,6 +31,7 @@
 import org.apache.cassandra.cql3.QueryProcessor;
 import org.apache.cassandra.cql3.statements.CreateKeyspaceStatement;
 import org.apache.cassandra.exceptions.RequestValidationException;
+import org.apache.cassandra.exceptions.SyntaxException;
 import org.apache.cassandra.stress.generate.Distribution;
 import org.apache.cassandra.stress.generate.DistributionFactory;
 import org.apache.cassandra.stress.generate.PartitionGenerator;
@@ -124,8 +125,15 @@ public class StressProfile implements Serializable
         if (keyspaceCql != null && keyspaceCql.length() > 0)
         {
-            String name = ((CreateKeyspaceStatement) QueryProcessor.parseStatement(keyspaceCql)).keyspace();
-            assert name.equalsIgnoreCase(keyspaceName) : "Name in keyspace_definition doesn't match keyspace property: '" + name + "' != '" + keyspaceName + "'";
+            try
+            {
+                String name = ((CreateKeyspaceStatement) QueryProcessor.parseStatement(keyspaceCql)).keyspace();
+                assert name.equalsIgnoreCase(keyspaceName) : "Name in keyspace_definition doesn't match keyspace property: '" + name + "' != '" + keyspaceName + "'";
+            }
+            catch (SyntaxException e)
+            {
+                throw new IllegalArgumentException("There was a problem parsing the keyspace cql: " + e.getMessage());
+            }
         }
         else
         {
@@ -134,8 +142,15 @@ public class StressProfile implements Serializable
         if (tableCql != null && tableCql.length() > 0)
         {
-            String name = CFMetaData.compile(tableCql, keyspaceName).cfName;
-            assert name.equalsIgnoreCase(tableName) : "Name in table_definition doesn't match table property: '" + name + "' != '" + tableName + "'";
+            try
+            {
+                String name = CFMetaData.compile(tableCql, keyspaceName).cfName;
+                assert name.equalsIgnoreCase(tableName) : "Name in table_definition doesn't match table property: '" + name + "' != '" + tableName + "'";
+            }
+            catch (RuntimeException e)
+            {
+                throw new IllegalArgumentException("There was a problem parsing the table cql: " + e.getCause().getMessage());
+            }
         }
         else
         {
@@ -217,6 +232,9 @@ public class StressProfile implements Serializable
                 .getKeyspace(keyspaceName)
                 .getTable(tableName);

+        if (metadata == null)
+            throw new RuntimeException("Unable to find table " + keyspaceName + "." + tableName);
+
         //Fill in missing column configs
         for (ColumnMetadata col : metadata.getColumns())
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134069#comment-14134069 ] Duncan Sands commented on CASSANDRA-7904: - Brandon, it may not have been necessary to increase rpc_timeout so much, I didn't experiment to find out what is really needed.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[1/2] git commit: Better error checking of stress profile
Repository: cassandra
Updated Branches:
  refs/heads/trunk 969967cf9 -> 7b5164f39

Better error checking of stress profile

patch by tjake; reviewed by belliottsmith for CASSANDRA-7716

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b3573f3d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b3573f3d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b3573f3d

Branch: refs/heads/trunk
Commit: b3573f3d3bc8ab18ee82534625868dae12f7234c
Parents: 6bff5a3
Author: Jake Luciani j...@apache.org
Authored: Mon Sep 15 12:20:43 2014 -0400
Committer: Jake Luciani j...@apache.org
Committed: Mon Sep 15 12:20:43 2014 -0400

 CHANGES.txt | 1 +
 .../apache/cassandra/stress/StressProfile.java | 29
 .../settings/OptionRatioDistribution.java | 2 +-
 3 files changed, 26 insertions(+), 6 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b3573f3d/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index 4c0dbf9..8fe4253 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.1
+ * Add better error checking of new stress profile (CASSANDRA-7716)
  * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
  * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
  * cassandra-stress supports whitelist mode for node config (CASSANDRA-7658)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b3573f3d/tools/stress/src/org/apache/cassandra/stress/StressProfile.java

diff --git a/tools/stress/src/org/apache/cassandra/stress/StressProfile.java b/tools/stress/src/org/apache/cassandra/stress/StressProfile.java
index bd873e8..b0a149c 100644
--- a/tools/stress/src/org/apache/cassandra/stress/StressProfile.java
+++ b/tools/stress/src/org/apache/cassandra/stress/StressProfile.java
@@ -31,6 +31,7 @@
 import org.apache.cassandra.cql3.QueryProcessor;
 import org.apache.cassandra.cql3.statements.CreateKeyspaceStatement;
 import org.apache.cassandra.exceptions.RequestValidationException;
+import org.apache.cassandra.exceptions.SyntaxException;
 import org.apache.cassandra.stress.generate.Distribution;
 import org.apache.cassandra.stress.generate.DistributionFactory;
 import org.apache.cassandra.stress.generate.PartitionGenerator;
@@ -124,8 +125,15 @@ public class StressProfile implements Serializable
         if (keyspaceCql != null && keyspaceCql.length() > 0)
         {
-            String name = ((CreateKeyspaceStatement) QueryProcessor.parseStatement(keyspaceCql)).keyspace();
-            assert name.equalsIgnoreCase(keyspaceName) : "Name in keyspace_definition doesn't match keyspace property: '" + name + "' != '" + keyspaceName + "'";
+            try
+            {
+                String name = ((CreateKeyspaceStatement) QueryProcessor.parseStatement(keyspaceCql)).keyspace();
+                assert name.equalsIgnoreCase(keyspaceName) : "Name in keyspace_definition doesn't match keyspace property: '" + name + "' != '" + keyspaceName + "'";
+            }
+            catch (SyntaxException e)
+            {
+                throw new IllegalArgumentException("There was a problem parsing the keyspace cql: " + e.getMessage());
+            }
         }
         else
         {
@@ -134,8 +142,15 @@ public class StressProfile implements Serializable
         if (tableCql != null && tableCql.length() > 0)
         {
-            String name = CFMetaData.compile(tableCql, keyspaceName).cfName;
-            assert name.equalsIgnoreCase(tableName) : "Name in table_definition doesn't match table property: '" + name + "' != '" + tableName + "'";
+            try
+            {
+                String name = CFMetaData.compile(tableCql, keyspaceName).cfName;
+                assert name.equalsIgnoreCase(tableName) : "Name in table_definition doesn't match table property: '" + name + "' != '" + tableName + "'";
+            }
+            catch (RuntimeException e)
+            {
+                throw new IllegalArgumentException("There was a problem parsing the table cql: " + e.getCause().getMessage());
+            }
         }
         else
         {
@@ -217,6 +232,9 @@ public class StressProfile implements Serializable
                 .getKeyspace(keyspaceName)
                 .getTable(tableName);

+        if (metadata == null)
+            throw new RuntimeException("Unable to find table " + keyspaceName + "." + tableName);
+
         //Fill in missing column configs
         for (ColumnMetadata col : metadata.getColumns())
         {
@@ -391,9
[2/2] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7b5164f3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7b5164f3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7b5164f3

Branch: refs/heads/trunk
Commit: 7b5164f39845d2329b06f1348157d233adeb7453
Parents: 969967c b3573f3
Author: Jake Luciani j...@apache.org
Authored: Mon Sep 15 12:22:09 2014 -0400
Committer: Jake Luciani j...@apache.org
Committed: Mon Sep 15 12:22:09 2014 -0400

 CHANGES.txt | 1 +
 .../apache/cassandra/stress/StressProfile.java | 29
 .../settings/OptionRatioDistribution.java | 2 +-
 3 files changed, 26 insertions(+), 6 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7b5164f3/CHANGES.txt

diff --cc CHANGES.txt
index b6e9165,8fe4253..9a43511
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,26 -1,5 +1,27 @@@
 +3.0
 + * Remove YamlFileNetworkTopologySnitch (CASSANDRA-7917)
 + * Support Java source code for user-defined functions (CASSANDRA-7562)
 + * Require arg types to disambiguate UDF drops (CASSANDRA-7812)
 + * Do anticompaction in groups (CASSANDRA-6851)
 + * Verify that UDF class methods are static (CASSANDRA-7781)
 + * Support pure user-defined functions (CASSANDRA-7395, 7740)
 + * Permit configurable timestamps with cassandra-stress (CASSANDRA-7416)
 + * Move sstable RandomAccessReader to nio2, which allows using the
 +   FILE_SHARE_DELETE flag on Windows (CASSANDRA-4050)
 + * Remove CQL2 (CASSANDRA-5918)
 + * Add Thrift get_multi_slice call (CASSANDRA-6757)
 + * Optimize fetching multiple cells by name (CASSANDRA-6933)
 + * Allow compilation in java 8 (CASSANDRA-7028)
 + * Make incremental repair default (CASSANDRA-7250)
 + * Enable code coverage thru JaCoCo (CASSANDRA-7226)
 + * Switch external naming of 'column families' to 'tables' (CASSANDRA-4369)
 + * Shorten SSTable path (CASSANDRA-6962)
 + * Use unsafe mutations for most unit tests (CASSANDRA-6969)
 + * Fix race condition during calculation of pending ranges (CASSANDRA-7390)
 +
  2.1.1
+ * Add better error checking of new stress profile (CASSANDRA-7716)
  * Use ThreadLocalRandom and remove FBUtilities.threadLocalRandom (CASSANDRA-7934)
  * Prevent operator mistakes due to simultaneous bootstrap (CASSANDRA-7069)
  * cassandra-stress supports whitelist mode for node config (CASSANDRA-7658)
[jira] [Commented] (CASSANDRA-7402) Add metrics to track memory used by client requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134071#comment-14134071 ] Benedict commented on CASSANDRA-7402: - I'm not convinced tracking the per-client/per-query statistics is likely to be very viable. Once queries cross the MessagingService (MS) boundary the information isn't available to us, and making it available could be costly. We could probably serialize the prepared statement id over the wire, and wire that up as the data is requested in nodetool, say, by attempting to locate a server with the statement. I think tracking and reporting this data in this manner should be a separate ticket from constraining it, however; constraining it is a much more concretely beneficial and achievable goal. Add metrics to track memory used by client requests --- Key: CASSANDRA-7402 URL: https://issues.apache.org/jira/browse/CASSANDRA-7402 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Labels: ops, performance, stability Fix For: 3.0 When running a production cluster one common operational issue is quantifying GC pauses caused by ongoing requests. Since different queries return varying amount of data you can easily get your self into a situation where you Stop the world from a couple of bad actors in the system. Or more likely the aggregate garbage generated on a single node across all in flight requests causes a GC. We should be able to set a limit on the max heap we can allocate to all outstanding requests and track the garbage per requests to stop this from happening. It should increase a single nodes availability substantially. In the yaml this would be {code} total_request_memory_space_mb: 400 {code} It would also be nice to have either a log of queries which generate the most garbage so operators can track this. Also a histogram. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
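The constraint half of the ticket amounts to one global account of heap reserved by in-flight requests, capped by the proposed yaml setting. A minimal sketch of that idea in plain Java (class and method names are hypothetical, not Cassandra code; only the total_request_memory_space_mb option name comes from the ticket):

{code}
import java.util.concurrent.atomic.AtomicLong;

public final class RequestMemoryTracker
{
    private final long limitBytes;
    private final AtomicLong inFlightBytes = new AtomicLong();

    public RequestMemoryTracker(long totalRequestMemorySpaceMb)
    {
        this.limitBytes = totalRequestMemorySpaceMb * 1024L * 1024L;
    }

    /** Reserve before materializing a response; false means shed or queue the request. */
    public boolean tryReserve(long bytes)
    {
        while (true)
        {
            long current = inFlightBytes.get();
            if (current + bytes > limitBytes)
                return false;
            if (inFlightBytes.compareAndSet(current, current + bytes))
                return true;
        }
    }

    /** Release once the response has been written back to the client. */
    public void release(long bytes)
    {
        inFlightBytes.addAndGet(-bytes);
    }
}
{code}

A request that cannot reserve its estimated response size is failed or deferred rather than being allowed to push the node into a stop-the-world GC, which is the availability win the description aims at.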
[jira] [Commented] (CASSANDRA-7402) Add metrics to track memory used by client requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134075#comment-14134075 ] T Jake Luciani commented on CASSANDRA-7402: --- I wasn't planning on adding stats for each client, simply the aggregate of all. And perhaps a worst offenders list of queries/users Add metrics to track memory used by client requests --- Key: CASSANDRA-7402 URL: https://issues.apache.org/jira/browse/CASSANDRA-7402 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Labels: ops, performance, stability Fix For: 3.0 When running a production cluster one common operational issue is quantifying GC pauses caused by ongoing requests. Since different queries return varying amount of data you can easily get your self into a situation where you Stop the world from a couple of bad actors in the system. Or more likely the aggregate garbage generated on a single node across all in flight requests causes a GC. We should be able to set a limit on the max heap we can allocate to all outstanding requests and track the garbage per requests to stop this from happening. It should increase a single nodes availability substantially. In the yaml this would be {code} total_request_memory_space_mb: 400 {code} It would also be nice to have either a log of queries which generate the most garbage so operators can track this. Also a histogram. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7402) Add metrics to track memory used by client requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134076#comment-14134076 ] Benedict commented on CASSANDRA-7402: - It's the queries we cannot easily track, aggregated or not. At least, not those we received over MessagingService, or not easily and cheaply. Add metrics to track memory used by client requests --- Key: CASSANDRA-7402 URL: https://issues.apache.org/jira/browse/CASSANDRA-7402 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Labels: ops, performance, stability Fix For: 3.0 When running a production cluster one common operational issue is quantifying GC pauses caused by ongoing requests. Since different queries return varying amount of data you can easily get your self into a situation where you Stop the world from a couple of bad actors in the system. Or more likely the aggregate garbage generated on a single node across all in flight requests causes a GC. We should be able to set a limit on the max heap we can allocate to all outstanding requests and track the garbage per requests to stop this from happening. It should increase a single nodes availability substantially. In the yaml this would be {code} total_request_memory_space_mb: 400 {code} It would also be nice to have either a log of queries which generate the most garbage so operators can track this. Also a histogram. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7028) Allow C* to compile under java 8
[ https://issues.apache.org/jira/browse/CASSANDRA-7028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134079#comment-14134079 ] Charles Cao commented on CASSANDRA-7028: Can we also compile and run C* 2.0.x under JDK 8? Allow C* to compile under java 8 Key: CASSANDRA-7028 URL: https://issues.apache.org/jira/browse/CASSANDRA-7028 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Dave Brosius Assignee: Aleksey Yeschenko Priority: Minor Fix For: 2.1.1, 3.0 Attachments: 7028.txt, 7028_v2.txt, 7028_v3.txt, 7028_v4.txt, 7028_v5.patch antlr 3.2 has a problem with java 8, as described here: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8015656 updating to antlr 3.5.2 solves this, however they have split up the jars differently, which adds some changes, but also the generation of CqlParser.java causes a method to be too large, so i needed to split that method to reduce the size of it. (patch against trunk) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7402) Add metrics to track memory used by client requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134081#comment-14134081 ] T Jake Luciani commented on CASSANDRA-7402: --- The queries could be tracked just on the coordinators. There can be a per-node stat as well; we should be able to gather enough stats to be useful to an operator, per keyspace/table. Add metrics to track memory used by client requests --- Key: CASSANDRA-7402 URL: https://issues.apache.org/jira/browse/CASSANDRA-7402 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Labels: ops, performance, stability Fix For: 3.0 When running a production cluster one common operational issue is quantifying GC pauses caused by ongoing requests. Since different queries return varying amount of data you can easily get your self into a situation where you Stop the world from a couple of bad actors in the system. Or more likely the aggregate garbage generated on a single node across all in flight requests causes a GC. We should be able to set a limit on the max heap we can allocate to all outstanding requests and track the garbage per requests to stop this from happening. It should increase a single nodes availability substantially. In the yaml this would be {code} total_request_memory_space_mb: 400 {code} It would also be nice to have either a log of queries which generate the most garbage so operators can track this. Also a histogram. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134088#comment-14134088 ] Cyril Scetbon commented on CASSANDRA-7731: -- For the 2.1 patch I understand that it could not work as expected as it's not using a percentile when it calls [HistogramMBean.getMax|https://github.com/dropwizard/metrics/blob/v2.2.0/metrics-core/src/main/java/com/yammer/metrics/reporting/JmxReporter.java#L210-L212]. However, I'm using the [2.0 patch|https://issues.apache.org/jira/secure/attachment/12661546/7731-2.0.txt] which internally uses metric.liveScannedHistogram.cf.getSnapshot().getValue(1d) which gets the maximum from a percentile. However, as you saw in my logs, it doesn't work better and returns an old maximum Get max values for live/tombstone cells per slice - Key: CASSANDRA-7731 URL: https://issues.apache.org/jira/browse/CASSANDRA-7731 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cyril Scetbon Assignee: Robert Stupp Priority: Minor Fix For: 2.1.1 Attachments: 7731-2.0.txt, 7731-2.1.txt I think you should not say that slice statistics are valid for the [last five minutes |https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it . The only thing I can see is that the last snapshot used to provide the median (or whatever you'd used instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query and that's not possible for now without activating DEBUG for SliceQueryFilter for example and by tweaking the threshold. Currently as nodetool cfstats returns the median if a low part of the queries are scanning a lot of live/tombstone cells we miss it ! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
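The distinction under debate here: HistogramMBean.getMax reports an all-time maximum accumulated since startup, while getSnapshot().getValue(1d) reports the maximum of whatever samples remain in the reservoir, which with an exponentially decaying reservoir should eventually age out. A small illustration against the yammer metrics 2.x API these patches reference (treat the exact factory calls as assumptions from that library's docs):

{code}
import com.yammer.metrics.Metrics;
import com.yammer.metrics.core.Histogram;
import com.yammer.metrics.core.MetricName;

public class SliceHistogramDemo
{
    public static void main(String[] args)
    {
        // biased = true selects the exponentially decaying reservoir
        Histogram h = Metrics.newHistogram(new MetricName("cf", "ColumnFamily", "LiveScannedHistogram"), true);

        h.update(100000);              // one old, pathological slice
        for (int i = 0; i < 100000; i++)
            h.update(10);              // many recent, healthy slices

        // All-time maximum since startup -- what HistogramMBean.getMax exposes:
        System.out.println(h.max());

        // Maximum over the samples still held in the reservoir -- what the
        // 2.0 patch reads via getSnapshot().getValue(1d); old extremes are
        // supposed to decay out of the reservoir over time:
        System.out.println(h.getSnapshot().getValue(1d));
    }
}
{code}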
[jira] [Resolved] (CASSANDRA-7716) cassandra-stress: provide better error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani resolved CASSANDRA-7716. --- Resolution: Fixed Reviewer: Benedict (was: Robert Stupp) committed cassandra-stress: provide better error messages --- Key: CASSANDRA-7716 URL: https://issues.apache.org/jira/browse/CASSANDRA-7716 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: T Jake Luciani Priority: Trivial Fix For: 2.1.1 Attachments: 7166v2.txt, 7716.txt Just tried new stress tool. It would be great if the stress tool gives better error messages by telling the user what option or config parameter/value caused an error. YAML parse errors are meaningful (gives code snippets etc). Examples are:
{noformat}
WARN 16:59:39 Setting caching options with deprecated syntax.
Exception in thread "main" java.lang.NullPointerException
	at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
	at java.util.regex.Matcher.reset(Matcher.java:308)
	at java.util.regex.Matcher.<init>(Matcher.java:228)
	at java.util.regex.Pattern.matcher(Pattern.java:1088)
	at org.apache.cassandra.stress.settings.OptionDistribution.get(OptionDistribution.java:67)
	at org.apache.cassandra.stress.StressProfile.init(StressProfile.java:151)
	at org.apache.cassandra.stress.StressProfile.load(StressProfile.java:482)
	at org.apache.cassandra.stress.settings.SettingsCommandUser.init(SettingsCommandUser.java:53)
	at org.apache.cassandra.stress.settings.SettingsCommandUser.build(SettingsCommandUser.java:114)
	at org.apache.cassandra.stress.settings.SettingsCommand.get(SettingsCommand.java:134)
	at org.apache.cassandra.stress.settings.StressSettings.get(StressSettings.java:218)
	at org.apache.cassandra.stress.settings.StressSettings.parse(StressSettings.java:206)
	at org.apache.cassandra.stress.Stress.main(Stress.java:58)
{noformat}
When table-definition is wrong:
{noformat}
Exception in thread "main" java.lang.RuntimeException: org.apache.cassandra.exceptions.SyntaxException: line 6:14 mismatched input '(' expecting ')'
	at org.apache.cassandra.config.CFMetaData.compile(CFMetaData.java:550)
	at org.apache.cassandra.stress.StressProfile.init(StressProfile.java:134)
	at org.apache.cassandra.stress.StressProfile.load(StressProfile.java:482)
	at org.apache.cassandra.stress.settings.SettingsCommandUser.init(SettingsCommandUser.java:53)
	at org.apache.cassandra.stress.settings.SettingsCommandUser.build(SettingsCommandUser.java:114)
	at org.apache.cassandra.stress.settings.SettingsCommand.get(SettingsCommand.java:134)
	at org.apache.cassandra.stress.settings.StressSettings.get(StressSettings.java:218)
	at org.apache.cassandra.stress.settings.StressSettings.parse(StressSettings.java:206)
	at org.apache.cassandra.stress.Stress.main(Stress.java:58)
Caused by: org.apache.cassandra.exceptions.SyntaxException: line 6:14 mismatched input '(' expecting ')'
	at org.apache.cassandra.cql3.CqlParser.throwLastRecognitionError(CqlParser.java:273)
	at org.apache.cassandra.cql3.QueryProcessor.parseStatement(QueryProcessor.java:456)
	at org.apache.cassandra.config.CFMetaData.compile(CFMetaData.java:541)
	... 8 more
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7402) Add metrics to track memory used by client requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134093#comment-14134093 ] Benedict commented on CASSANDRA-7402: - Makes sense. If we're going to be grabbing the data cross-cluster (necessary for delivering per-node stats) then tracking based on statement id would be sufficient also, since we could populate the map in nodetool from the whole cluster, so users can drill down into hotspot nodes Add metrics to track memory used by client requests --- Key: CASSANDRA-7402 URL: https://issues.apache.org/jira/browse/CASSANDRA-7402 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Labels: ops, performance, stability Fix For: 3.0 When running a production cluster one common operational issue is quantifying GC pauses caused by ongoing requests. Since different queries return varying amount of data you can easily get your self into a situation where you Stop the world from a couple of bad actors in the system. Or more likely the aggregate garbage generated on a single node across all in flight requests causes a GC. We should be able to set a limit on the max heap we can allocate to all outstanding requests and track the garbage per requests to stop this from happening. It should increase a single nodes availability substantially. In the yaml this would be {code} total_request_memory_space_mb: 400 {code} It would also be nice to have either a log of queries which generate the most garbage so operators can track this. Also a histogram. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134088#comment-14134088 ] Cyril Scetbon edited comment on CASSANDRA-7731 at 9/15/14 4:34 PM: --- For the 2.1 patch I understand that it could not work as expected as it's not using a percentile when it calls [HistogramMBean.getMax|https://github.com/dropwizard/metrics/blob/v2.2.0/metrics-core/src/main/java/com/yammer/metrics/reporting/JmxReporter.java#L210-L212] and you said that non percentile functions return values since the application started. However, I'm using the [2.0 patch|https://issues.apache.org/jira/secure/attachment/12661546/7731-2.0.txt] which internally uses metric.liveScannedHistogram.cf.getSnapshot().getValue(1d) which gets the maximum from a percentile. However, as you saw in my logs, it doesn't work better and returns an old maximum was (Author: cscetbon): For the 2.1 patch I understand that it could not work as expected as it's not using a percentile when it calls [HistogramMBean.getMax|https://github.com/dropwizard/metrics/blob/v2.2.0/metrics-core/src/main/java/com/yammer/metrics/reporting/JmxReporter.java#L210-L212]. However, I'm using the [2.0 patch|https://issues.apache.org/jira/secure/attachment/12661546/7731-2.0.txt] which internally uses metric.liveScannedHistogram.cf.getSnapshot().getValue(1d) which gets the maximum from a percentile. However, as you saw in my logs, it doesn't work better and returns an old maximum Get max values for live/tombstone cells per slice - Key: CASSANDRA-7731 URL: https://issues.apache.org/jira/browse/CASSANDRA-7731 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cyril Scetbon Assignee: Robert Stupp Priority: Minor Fix For: 2.1.1 Attachments: 7731-2.0.txt, 7731-2.1.txt I think you should not say that slice statistics are valid for the [last five minutes |https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it . The only thing I can see is that the last snapshot used to provide the median (or whatever you'd used instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query and that's not possible for now without activating DEBUG for SliceQueryFilter for example and by tweaking the threshold. Currently as nodetool cfstats returns the median if a low part of the queries are scanning a lot of live/tombstone cells we miss it ! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7402) Add metrics to track memory used by client requests
[ https://issues.apache.org/jira/browse/CASSANDRA-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134095#comment-14134095 ] T Jake Luciani commented on CASSANDRA-7402: --- Good idea. Add metrics to track memory used by client requests --- Key: CASSANDRA-7402 URL: https://issues.apache.org/jira/browse/CASSANDRA-7402 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Labels: ops, performance, stability Fix For: 3.0 When running a production cluster one common operational issue is quantifying GC pauses caused by ongoing requests. Since different queries return varying amount of data you can easily get your self into a situation where you Stop the world from a couple of bad actors in the system. Or more likely the aggregate garbage generated on a single node across all in flight requests causes a GC. We should be able to set a limit on the max heap we can allocate to all outstanding requests and track the garbage per requests to stop this from happening. It should increase a single nodes availability substantially. In the yaml this would be {code} total_request_memory_space_mb: 400 {code} It would also be nice to have either a log of queries which generate the most garbage so operators can track this. Also a histogram. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134104#comment-14134104 ] Chris Lohfink commented on CASSANDRA-7731: -- May be hitting an issue where getting the 1.0 quantile (the max) from the exp. weighted reservoir looks like it's coming from a uniform reservoir: https://github.com/dropwizard/metrics/pull/421 Get max values for live/tombstone cells per slice - Key: CASSANDRA-7731 URL: https://issues.apache.org/jira/browse/CASSANDRA-7731 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cyril Scetbon Assignee: Robert Stupp Priority: Minor Fix For: 2.1.1 Attachments: 7731-2.0.txt, 7731-2.1.txt I think you should not say that slice statistics are valid for the [last five minutes |https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it . The only thing I can see is that the last snapshot used to provide the median (or whatever you'd used instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query and that's not possible for now without activating DEBUG for SliceQueryFilter for example and by tweaking the threshold. Currently as nodetool cfstats returns the median if a low part of the queries are scanning a lot of live/tombstone cells we miss it ! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cyril Scetbon reopened CASSANDRA-7731: -- Reproduced In: 2.0.9, 1.2.18 (was: 1.2.18, 2.0.9) Get max values for live/tombstone cells per slice - Key: CASSANDRA-7731 URL: https://issues.apache.org/jira/browse/CASSANDRA-7731 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cyril Scetbon Assignee: Robert Stupp Priority: Minor Fix For: 2.1.1 Attachments: 7731-2.0.txt, 7731-2.1.txt I think you should not say that slice statistics are valid for the [last five minutes |https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it . The only thing I can see is that the last snapshot used to provide the median (or whatever you'd used instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query and that's not possible for now without activating DEBUG for SliceQueryFilter for example and by tweaking the threshold. Currently as nodetool cfstats returns the median if a low part of the queries are scanning a lot of live/tombstone cells we miss it ! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134124#comment-14134124 ] Cyril Scetbon commented on CASSANDRA-7731: -- I hope it's not that one, because the bug is still open :( Get max values for live/tombstone cells per slice - Key: CASSANDRA-7731 URL: https://issues.apache.org/jira/browse/CASSANDRA-7731 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cyril Scetbon Assignee: Robert Stupp Priority: Minor Fix For: 2.1.1 Attachments: 7731-2.0.txt, 7731-2.1.txt I think you should not say that slice statistics are valid for the [last five minutes |https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it . The only thing I can see is that the last snapshot used to provide the median (or whatever you'd used instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query and that's not possible for now without activating DEBUG for SliceQueryFilter for example and by tweaking the threshold. Currently as nodetool cfstats returns the median if a low part of the queries are scanning a lot of live/tombstone cells we miss it ! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134129#comment-14134129 ] Chris Lohfink commented on CASSANDRA-7731: -- It's been merged into the 3.1 maintenance branch. Seems to make sense though: the lower the percentile, the longer old values will affect it outside the window. Get max values for live/tombstone cells per slice - Key: CASSANDRA-7731 URL: https://issues.apache.org/jira/browse/CASSANDRA-7731 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cyril Scetbon Assignee: Robert Stupp Priority: Minor Fix For: 2.1.1 Attachments: 7731-2.0.txt, 7731-2.1.txt I think you should not say that slice statistics are valid for the [last five minutes |https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it . The only thing I can see is that the last snapshot used to provide the median (or whatever you'd used instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query and that's not possible for now without activating DEBUG for SliceQueryFilter for example and by tweaking the threshold. Currently as nodetool cfstats returns the median if a low part of the queries are scanning a lot of live/tombstone cells we miss it ! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7131) Add command line option for cqlshrc file path
[ https://issues.apache.org/jira/browse/CASSANDRA-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Stepura updated CASSANDRA-7131: --- Fix Version/s: 2.1.1 Add command line option for cqlshrc file path - Key: CASSANDRA-7131 URL: https://issues.apache.org/jira/browse/CASSANDRA-7131 Project: Cassandra Issue Type: New Feature Components: Tools Reporter: Jeremiah Jordan Priority: Trivial Labels: cqlsh, lhf Fix For: 2.1.1 Attachments: CASSANDRA-2.1.1-7131.txt It would be nice if you could specify the cqlshrc file location on the command line, so you don't have to jump through hoops when running it from a service user or something. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7731) Get max values for live/tombstone cells per slice
[ https://issues.apache.org/jira/browse/CASSANDRA-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134150#comment-14134150 ] Cyril Scetbon commented on CASSANDRA-7731: -- you're right for the merge, I read too fast. [~snazy] what do you think about this issue ? Get max values for live/tombstone cells per slice - Key: CASSANDRA-7731 URL: https://issues.apache.org/jira/browse/CASSANDRA-7731 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Cyril Scetbon Assignee: Robert Stupp Priority: Minor Fix For: 2.1.1 Attachments: 7731-2.0.txt, 7731-2.1.txt I think you should not say that slice statistics are valid for the [last five minutes |https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/tools/NodeCmd.java#L955-L956] in CFSTATS command of nodetool. I've read the documentation from yammer for Histograms and there is no way to force values to expire after x minutes except by [clearing|http://grepcode.com/file/repo1.maven.org/maven2/com.yammer.metrics/metrics-core/2.1.2/com/yammer/metrics/core/Histogram.java#96] it . The only thing I can see is that the last snapshot used to provide the median (or whatever you'd used instead) value is based on 1028 values. I think we should also be able to detect that some requests are accessing a lot of live/tombstone cells per query and that's not possible for now without activating DEBUG for SliceQueryFilter for example and by tweaking the threshold. Currently as nodetool cfstats returns the median if a low part of the queries are scanning a lot of live/tombstone cells we miss it ! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-5687) Move C* Python Thrift tests into dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan McGuire updated CASSANDRA-5687: Labels: qa-resolved (was: ) Move C* Python Thrift tests into dtests --- Key: CASSANDRA-5687 URL: https://issues.apache.org/jira/browse/CASSANDRA-5687 Project: Cassandra Issue Type: Test Reporter: Ryan McGuire Assignee: Ryan McGuire Priority: Minor Labels: qa-resolved Fix For: 2.1.0 Attachments: 5687.nosetests.log, 5687.nosetets.2.0.log There's several good tests currently sitting in the C* source tree under test/system/test_thrift_server.py - these tests are not run via 'ant test' and no buildbot is regularly running them. Let's move them into cassandra-dtest so that we're not wasting valid tests. It appears they will need some refactoring to automatically start clusters via ccm, like the rest of dtests do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-7938) Releases prior to 2.1 gratuitously invalidate buffer cache
Matt Stump created CASSANDRA-7938: - Summary: Releases prior to 2.1 gratuitously invalidate buffer cache Key: CASSANDRA-7938 URL: https://issues.apache.org/jira/browse/CASSANDRA-7938 Project: Cassandra Issue Type: Bug Components: Core Reporter: Matt Stump RandomAccessReader gratuitously invalidates the buffer cache in releases prior to 2.1. Additionally, Linux 3.X kernels spend 30% of CPU time in book keeping for the invalidated pages as captured by CPU flame graphs. fadvise DONT_NEED should never be called for files other than the commit log segments. https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/util/RandomAccessReader.java#L168 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
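For reference, the syscall at issue, sketched via JNA direct mapping. Cassandra's actual wrapper is CLibrary.trySkipCache; the binding below and the Linux-specific constant are illustrative assumptions, not the project's code:

{code}
import com.sun.jna.LastErrorException;
import com.sun.jna.Native;

public final class PageCacheAdvice
{
    static { Native.register("c"); }

    private static final int POSIX_FADV_DONTNEED = 4; // Linux value

    private static native int posix_fadvise(int fd, long offset, long len, int advice)
        throws LastErrorException;

    /** Ask the kernel to drop a file's pages; len == 0 means "to end of file".
     *  Per this ticket, issuing this for sstables (rather than only for commit
     *  log segments) evicts the hot working set on every read path. */
    public static void dontNeed(int fd)
    {
        posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    }
}
{code}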
[jira] [Updated] (CASSANDRA-7938) Releases prior to 2.0 gratuitously invalidate buffer cache
[ https://issues.apache.org/jira/browse/CASSANDRA-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-7938: -- Description: RandomAccessReader gratuitously invalidates the buffer cache in releases prior to 2.0. Additionally, Linux 3.X kernels spend 30% of CPU time in book keeping for the invalidated pages as captured by CPU flame graphs. fadvise DONT_NEED should never be called for files other than the commit log segments. https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/util/RandomAccessReader.java#L168 was: RandomAccessReader gratuitously invalidates the buffer cache in releases prior to 2.1. Additionally, Linux 3.X kernels spend 30% of CPU time in book keeping for the invalidated pages as captured by CPU flame graphs. fadvise DONT_NEED should never be called for files other than the commit log segments. https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/util/RandomAccessReader.java#L168 Releases prior to 2.0 gratuitously invalidate buffer cache -- Key: CASSANDRA-7938 URL: https://issues.apache.org/jira/browse/CASSANDRA-7938 Project: Cassandra Issue Type: Bug Components: Core Reporter: Matt Stump RandomAccessReader gratuitously invalidates the buffer cache in releases prior to 2.0. Additionally, Linux 3.X kernels spend 30% of CPU time in book keeping for the invalidated pages as captured by CPU flame graphs. fadvise DONT_NEED should never be called for files other than the commit log segments. https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/util/RandomAccessReader.java#L168 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7938) Releases prior to 2.0 gratuitously invalidate buffer cache
[ https://issues.apache.org/jira/browse/CASSANDRA-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Stump updated CASSANDRA-7938: -- Summary: Releases prior to 2.0 gratuitously invalidate buffer cache (was: Releases prior to 2.1 gratuitously invalidate buffer cache) Releases prior to 2.0 gratuitously invalidate buffer cache -- Key: CASSANDRA-7938 URL: https://issues.apache.org/jira/browse/CASSANDRA-7938 Project: Cassandra Issue Type: Bug Components: Core Reporter: Matt Stump RandomAccessReader gratuitously invalidates the buffer cache in releases prior to 2.1. Additionally, Linux 3.X kernels spend 30% of CPU time in book keeping for the invalidated pages as captured by CPU flame graphs. fadvise DONT_NEED should never be called for files other than the commit log segments. https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/util/RandomAccessReader.java#L168 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7938) Releases prior to 2.0 gratuitously invalidate buffer cache
[ https://issues.apache.org/jira/browse/CASSANDRA-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-7938: -- Fix Version/s: 1.2.19 Releases prior to 2.0 gratuitously invalidate buffer cache -- Key: CASSANDRA-7938 URL: https://issues.apache.org/jira/browse/CASSANDRA-7938 Project: Cassandra Issue Type: Bug Components: Core Reporter: Matt Stump Fix For: 1.2.19 RandomAccessReader gratuitously invalidates the buffer cache in releases prior to 2.0. Additionally, Linux 3.X kernels spend 30% of CPU time in book keeping for the invalidated pages as captured by CPU flame graphs. fadvise DONT_NEED should never be called for files other than the commit log segments. https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/util/RandomAccessReader.java#L168 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7818) Improve compaction logging
[ https://issues.apache.org/jira/browse/CASSANDRA-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7818: -- [~krummas] to review Improve compaction logging -- Key: CASSANDRA-7818 URL: https://issues.apache.org/jira/browse/CASSANDRA-7818 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Assignee: Mihai Suteu Priority: Minor Labels: compaction, lhf Fix For: 2.1.1 Attachments: cassandra-7818.patch We should log more information about compactions to be able to debug issues more efficiently * give each CompactionTask an id that we log (so that you can relate the start-compaction-messages to the finished-compaction ones) * log what level the sstables are taken from -- This message was sent by Atlassian JIRA (v6.3.4#6332)
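The ticket's two bullet points reduce to tagging both ends of a compaction with a shared identifier and recording the source level. A hypothetical sketch of what such log statements could look like (names and message formats are illustrative, not the attached cassandra-7818.patch):

{code}
import java.util.UUID;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class CompactionTaskLogging
{
    private static final Logger logger = LoggerFactory.getLogger(CompactionTaskLogging.class);

    void runCompaction(int level, Iterable<String> sstables)
    {
        UUID taskId = UUID.randomUUID();
        // the same id appears in both messages, so start/finish pairs can be correlated
        logger.info("Compacting ({}) level {} sstables: {}", taskId, level, sstables);
        // ... perform the compaction ...
        logger.info("Compacted ({}) level {}", taskId, level);
    }
}
{code}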
[jira] [Updated] (CASSANDRA-7818) Improve compaction logging
[ https://issues.apache.org/jira/browse/CASSANDRA-7818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7818: -- Reviewer: Marcus Eriksson Improve compaction logging -- Key: CASSANDRA-7818 URL: https://issues.apache.org/jira/browse/CASSANDRA-7818 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Assignee: Mihai Suteu Priority: Minor Labels: compaction, lhf Fix For: 2.1.1 Attachments: cassandra-7818.patch We should log more information about compactions to be able to debug issues more efficiently * give each CompactionTask an id that we log (so that you can relate the start-compaction-messages to the finished-compaction ones) * log what level the sstables are taken from -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7276) Include keyspace and table names in logs where possible
[ https://issues.apache.org/jira/browse/CASSANDRA-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7276: -- Reviewer: Yuki Morishita (was: Tyler Hobbs) [~yukim] to review Include keyspace and table names in logs where possible --- Key: CASSANDRA-7276 URL: https://issues.apache.org/jira/browse/CASSANDRA-7276 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Tyler Hobbs Assignee: Nitzan Volman Priority: Minor Labels: bootcamp, lhf Attachments: 2.1-CASSANDRA-7276-v1.txt, cassandra-2.1-7276-compaction.txt, cassandra-2.1-7276.txt Most error messages and stacktraces give you no clue as to what keyspace or table was causing the problem. For example:
{noformat}
ERROR [MutationStage:61648] 2014-05-20 12:05:45,145 CassandraDaemon.java (line 198) Exception in thread Thread[MutationStage:61648,5,main]
java.lang.IllegalArgumentException
	at java.nio.Buffer.limit(Unknown Source)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:63)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:98)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
	at edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538)
	at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108)
	at edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059)
	at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023)
	at edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985)
	at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:328)
	at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:200)
	at org.apache.cassandra.db.Memtable.resolve(Memtable.java:226)
	at org.apache.cassandra.db.Memtable.put(Memtable.java:173)
	at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:893)
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
	at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
	at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:206)
	at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:56)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
{noformat}
We should try to include info on the keyspace and column family in the error messages or logs whenever possible. This includes reads, writes, compactions, flushes, repairs, and probably more. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7938) Releases prior to 2.0 gratuitously invalidate buffer cache
[ https://issues.apache.org/jira/browse/CASSANDRA-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134273#comment-14134273 ] T Jake Luciani commented on CASSANDRA-7938: --- The only thing that sets skipCache to true is the compaction scanner and commit log. Releases prior to 2.0 gratuitously invalidate buffer cache -- Key: CASSANDRA-7938 URL: https://issues.apache.org/jira/browse/CASSANDRA-7938 Project: Cassandra Issue Type: Bug Components: Core Reporter: Matt Stump Fix For: 1.2.19 RandomAccessReader gratuitously invalidates the buffer cache in releases prior to 2.0. Additionally, Linux 3.X kernels spend 30% of CPU time in book keeping for the invalidated pages as captured by CPU flame graphs. fadvise DONT_NEED should never be called for files other than the commit log segments. https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/util/RandomAccessReader.java#L168 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7938) Releases prior to 2.0 gratuitously invalidate buffer cache
[ https://issues.apache.org/jira/browse/CASSANDRA-7938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134334#comment-14134334 ] Benedict commented on CASSANDRA-7938: - Also file streaming Releases prior to 2.0 gratuitously invalidate buffer cache -- Key: CASSANDRA-7938 URL: https://issues.apache.org/jira/browse/CASSANDRA-7938 Project: Cassandra Issue Type: Bug Components: Core Reporter: Matt Stump Fix For: 1.2.19 RandomAccessReader gratuitously invalidates the buffer cache in releases prior to 2.0. Additionally, Linux 3.X kernels spend 30% of CPU time in book keeping for the invalidated pages as captured by CPU flame graphs. fadvise DONT_NEED should never be called for files other than the commit log segments. https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/io/util/RandomAccessReader.java#L168 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7928) Digest queries do not require alder32 checks
[ https://issues.apache.org/jira/browse/CASSANDRA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134458#comment-14134458 ] Brandon Williams commented on CASSANDRA-7928: - I've heard that in profiling CPU usage for adler32 was ~23% while decompression was ~12%, which doesn't make a lot of sense to me. What do you think [~benedict]? Digest queries do not require alder32 checks Key: CASSANDRA-7928 URL: https://issues.apache.org/jira/browse/CASSANDRA-7928 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor While reading data from sstables, C* does Alder32 checks for any data being read. We have seen that this causes higher CPU usage while doing kernel profiling. These checks might not be useful for digest queries as they will have a different digest in case of corruption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
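For context, the per-read cost under discussion is roughly this: every compressed chunk read from an sstable is checksummed before it is trusted. A minimal sketch using java.util.zip.Adler32 (the surrounding read loop and field names in the real reader differ; this only shows the hot operation):

{code}
import java.util.zip.Adler32;

public class ChunkChecksum
{
    /** Returns true when the chunk read from disk matches its stored checksum. */
    static boolean verify(byte[] compressedChunk, long storedChecksum)
    {
        Adler32 adler = new Adler32();
        adler.update(compressedChunk, 0, compressedChunk.length);
        return adler.getValue() == storedChecksum;
    }
}
{code}

The ticket's point is that a digest read does not need this guard: a corrupt chunk would change the computed digest anyway, so the mismatch surfaces through the normal digest-comparison path.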
[jira] [Commented] (CASSANDRA-7904) Repair hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134473#comment-14134473 ] Razi Khaja commented on CASSANDRA-7904: --- Duncan, Sorry for my misunderstanding, but the linked ticket: https://issues.apache.org/jira/browse/CASSANDRA-6651?focusedCommentId=13892345page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13892345 states: {quote} Thunder Stumpges added a comment - 05/Feb/14 12:45 FWIW we have this exact same issue. We are running 2.0.3 on a 3 node cluster. It has happened multiple times, and happens more times than not when running nodetool repair. There is nearly always one or more AntiEntropySessions remaining according to tpstats. One strange thing about the behavior I see is that the output of nodetool compactionstats returns 0 active compactions, yet when restarting, we get the exception about Unfinished compactions reference missing sstables. It does seem like these two issues are related. Another thing I see sometimes in the ouput from nodetool repair is the following message: [2014-02-04 14:07:30,858] Starting repair command #7, repairing 768 ranges for keyspace thunder_test [2014-02-04 14:08:30,862] Lost notification. You should check server log for repair status of keyspace thunder_test [2014-02-04 14:08:30,870] Starting repair command #8, repairing 768 ranges for keyspace doan_synset [2014-02-04 14:09:30,874] Lost notification. You should check server log for repair status of keyspace doan_synset When this happens, it starts the next repair session immediately rather than waiting for the current one to finish. This doesn't however seem to always correlate to a hung session. My logs don't look much/any different from the OP, but I'd be glad to provide any more details that might be helpful. We will be upgrading to 2.0.4 in the next couple days and I will report back if we see any difference in behavior. {quote} Which is why I believe I had the same issue. I'll move my discussion to the newly created ticket CASSANDRA-7909 Repair hangs Key: CASSANDRA-7904 URL: https://issues.apache.org/jira/browse/CASSANDRA-7904 Project: Cassandra Issue Type: Bug Components: Core Environment: C* 2.0.10, ubuntu 14.04, Java HotSpot(TM) 64-Bit Server, java version 1.7.0_45 Reporter: Duncan Sands Attachments: ls-172.18.68.138, ls-192.168.21.13, ls-192.168.60.134, ls-192.168.60.136 Cluster of 22 nodes spread over 4 data centres. Not used on the weekend, so repair is run on all nodes (in a staggered fashion) on the weekend. Nodetool options: -par -pr. There is usually some overlap in the repairs: repair on one node may well still be running when repair is started on the next node. Repair hangs for some of the nodes almost every weekend. It hung last weekend, here are the details: In the whole cluster, only one node had an exception since C* was last restarted. This node is 192.168.60.136 and the exception is harmless: a client disconnected abruptly. tpstats 4 nodes have a non-zero value for active or pending in AntiEntropySessions. These nodes all have Active = 1 and Pending = 1. The nodes are: 192.168.21.13 (data centre R) 192.168.60.134 (data centre A) 192.168.60.136 (data centre A) 172.18.68.138 (data centre Z) compactionstats: No compactions. All nodes have: pending tasks: 0 Active compaction remaining time :n/a netstats: All except one node have nothing. 
One node (192.168.60.131, not one of the nodes listed in the tpstats section above) has (note the Responses Pending value of 1): Mode: NORMAL Not sending any streams. Read Repair Statistics: Attempted: 4233 Mismatch (Blocking): 0 Mismatch (Background): 243 Pool NameActive Pending Completed Commandsn/a 0 34785445 Responses n/a 1 38567167 Repair sessions I looked for repair sessions that failed to complete. On 3 of the 4 nodes mentioned in tpstats above I found that they had sent merkle tree requests and got responses from all but one node. In the log file for the node that failed to respond there is no sign that it ever received the request. On 1 node (172.18.68.138) it looks like responses were received from every node, some streaming was done, and then... nothing. Details: Node 192.168.21.13 (data centre R): Sent merkle trees to /172.18.33.24, /192.168.60.140, /192.168.60.142, /172.18.68.139, /172.18.68.138, /172.18.33.22, /192.168.21.13 for table brokers, never got a response from /172.18.68.139. On /172.18.68.139, just before this time it sent a response for the same repair session but a different
[jira] [Commented] (CASSANDRA-7928) Digest queries do not require alder32 checks
[ https://issues.apache.org/jira/browse/CASSANDRA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134612#comment-14134612 ] Benedict commented on CASSANDRA-7928: - Regrettably this is very plausible, and adds credence to CASSANDRA-7130, which we should consider reopening. This ticket is also a sensible idea to help mitigate the issue. I knocked up a quick benchmark, results show lz4 being consistently at least twice as fast. It's actually quite easy to explain: if the data is compressed, there is actually less data to operate over; if it is not easily compressed (say, it is highly random), it degrades itself to a simple copy to avoid wasting work (as demonstrated in the benchmark - it's 5 times faster over completely random data than partially random data).
{noformat}
Benchmark            (duplicateLookback) (pageSize) (randomRatio) (randomRunLength) (uniquePages)  Mode  Samples    Score  Score error  Units
Compression.adler32               4..128      65536             0             4..16          8192 thrpt        5   16.476        1.954 ops/ms
Compression.adler32               4..128      65536             0          128..512          8192 thrpt        5   16.720        0.230 ops/ms
Compression.adler32               4..128      65536           0.1             4..16          8192 thrpt        5   16.269        2.118 ops/ms
Compression.adler32               4..128      65536           0.1          128..512          8192 thrpt        5   16.665        0.246 ops/ms
Compression.adler32               4..128      65536           1.0             4..16          8192 thrpt        5   16.653        0.147 ops/ms
Compression.adler32               4..128      65536           1.0          128..512          8192 thrpt        5   16.686        0.214 ops/ms
Compression.lz4                   4..128      65536             0             4..16          8192 thrpt        5   28.275        0.265 ops/ms
Compression.lz4                   4..128      65536             0          128..512          8192 thrpt        5  232.602       48.279 ops/ms
Compression.lz4                   4..128      65536           0.1             4..16          8192 thrpt        5   34.081        0.337 ops/ms
Compression.lz4                   4..128      65536           0.1          128..512          8192 thrpt        5  130.857       18.157 ops/ms
Compression.lz4                   4..128      65536           1.0             4..16          8192 thrpt        5  187.992        9.190 ops/ms
Compression.lz4                   4..128      65536           1.0          128..512          8192 thrpt        5  186.054        2.267 ops/ms
{noformat}
Digest queries do not require alder32 checks Key: CASSANDRA-7928 URL: https://issues.apache.org/jira/browse/CASSANDRA-7928 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor While reading data from sstables, C* does Alder32 checks for any data being read. We have seen that this causes higher CPU usage while doing kernel profiling. These checks might not be useful for digest queries as they will have a different digest in case of corruption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7928) Digest queries do not require alder32 checks
[ https://issues.apache.org/jira/browse/CASSANDRA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134625#comment-14134625 ] Benedict commented on CASSANDRA-7928: - For reference, I have uploaded the benchmark [here|https://github.com/belliottsmith/bench] Digest queries do not require alder32 checks Key: CASSANDRA-7928 URL: https://issues.apache.org/jira/browse/CASSANDRA-7928 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor While reading data from sstables, C* does Alder32 checks for any data being read. We have seen that this causes higher CPU usage while doing kernel profiling. These checks might not be useful for digest queries as they will have a different digest in case of corruption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
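For anyone who wants the gist without running the linked JMH suite, the two operations being compared are, in simplified form, the following (one-shot timing like this is statistically naive next to the real benchmark; the lz4-java API shown is the net.jpountz library Cassandra bundles, and the page setup is an assumption):

{code}
import java.util.Random;
import java.util.zip.Adler32;
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;
import net.jpountz.lz4.LZ4FastDecompressor;

public class Adler32VsLz4
{
    public static void main(String[] args)
    {
        byte[] page = new byte[65536];
        new Random(42).nextBytes(page); // fully random: worst case for the compressor

        LZ4Factory factory = LZ4Factory.fastestInstance();
        LZ4Compressor compressor = factory.fastCompressor();
        byte[] compressed = compressor.compress(page);

        // Operation 1: checksum the page, as the read path does today.
        long t0 = System.nanoTime();
        Adler32 adler = new Adler32();
        adler.update(page, 0, page.length);
        long adlerNanos = System.nanoTime() - t0;

        // Operation 2: decompress the page; incompressible input degrades
        // to a near-memcpy, which is why lz4 wins hardest on random data.
        LZ4FastDecompressor decompressor = factory.fastDecompressor();
        byte[] restored = new byte[page.length];
        long t1 = System.nanoTime();
        decompressor.decompress(compressed, 0, restored, 0, page.length);
        long lz4Nanos = System.nanoTime() - t1;

        System.out.printf("adler32: %d ns, lz4 decompress: %d ns%n", adlerNanos, lz4Nanos);
    }
}
{code}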
[jira] [Commented] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134635#comment-14134635 ] graham sanderson commented on CASSANDRA-7546: - Finally getting back to this, been doing other things (this slightly lower priority as we have it in production already)... I just realized that the version c6a2c65a75ade being voted on for 2.1.0 that I deployed is not the same as 2.1.0 released. I am now upgrading, since cassandra-stress changes snuck in. Note that I plan to stress using 1024, 256, 16, 1 partitions, with all 5 nodes up, and then with 4 nodes up and one down to test the effect of hinting (note repl factor of 3 and cl=LOCAL_QUORUM). I want to do one cell insert per batch... I'm upgrading in part because of the new visit/revisit stuff - I'm not 100% sure how to use them correctly; I'll keep playing, but you may answer before I have finished upgrading and tried with this. My first attempt on the original 2.1.0 revision ended up with only one clustering key value per partition, which is not what I wanted (because it'll make trees small). Sample YAML for 1024 partitions:
{code}
#
# This is an example YAML profile for cassandra-stress
#
# insert data
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1)
#
# read, using query simple1:
# cassandra-stress profile=/home/jake/stress1.yaml ops(simple1=1)
#
# mixed workload (90/10)
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1,simple1=9)

#
# Keyspace info
#
keyspace: stresscql

#
# The CQL for creating a keyspace (optional if it already exists)
#
keyspace_definition: |
  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

#
# Table info
#
table: testtable

#
# The CQL for creating a table you wish to stress (optional if it already exists)
#
table_definition: |
  CREATE TABLE testtable (
        p text,
        c text,
        v blob,
        PRIMARY KEY(p, c)
  ) WITH COMPACT STORAGE
    AND compaction = { 'class':'LeveledCompactionStrategy' }
    AND comment='TestTable'

#
# Optional meta information on the generated columns in the above table
# The min and max only apply to text and blob types
# The distribution field represents the total unique population
# distribution of that column across rows.  Supported types are
#
#      EXP(min..max)                   An exponential distribution over the range [min..max]
#      EXTREME(min..max,shape)         An extreme value (Weibull) distribution over the range [min..max]
#      GAUSSIAN(min..max,stdvrng)      A gaussian/normal distribution, where mean=(min+max)/2, and stdev is (mean-min)/stdvrng
#      GAUSSIAN(min..max,mean,stdev)   A gaussian/normal distribution, with explicitly defined mean and stdev
#      UNIFORM(min..max)               A uniform distribution over the range [min, max]
#      FIXED(val)                      A fixed distribution, always returning the same value
#      Aliases: extr, gauss, normal, norm, weibull
#
#      If preceded by ~, the distribution is inverted
#
# Defaults for all columns are size: uniform(4..8), population: uniform(1..100B), cluster: fixed(1)
#
columnspec:
  - name: p
    size: fixed(16)
    population: uniform(1..1024) # the range of unique values to select for the field (default is 100Billion)
  - name: c
    size: fixed(26)
    #cluster: uniform(1..100B)
  - name: v
    size: gaussian(50..250)

insert:
  partitions: fixed(1)  # number of unique partitions to update in a single operation
                        # if batchcount > 1, multiple batches will be used but all partitions will
                        # occur in all batches (unless they finish early); only the row counts will vary
  batchtype: LOGGED     # type of batch to use
  visits: fixed(10M)    # not sure about this

queries:
   simple1: select * from testtable where k = ? and v = ? LIMIT 10
{code}
Command-line
{code}
./cassandra-stress user profile=~/cqlstress-1024.yaml ops\(insert=1\) cl=LOCAL_QUORUM -node $NODES -mode native prepared cql3 | tee results/results-2.1.0-p1024-a.txt
{code}
AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory - Key: CASSANDRA-7546 URL: https://issues.apache.org/jira/browse/CASSANDRA-7546 Project: Cassandra Issue Type: Bug Components: Core Reporter: graham sanderson Assignee: graham sanderson Fix For: 2.1.1 Attachments: 7546.20.txt, 7546.20_2.txt, 7546.20_3.txt, 7546.20_4.txt, 7546.20_5.txt, 7546.20_6.txt, 7546.20_7.txt, 7546.20_7b.txt, 7546.20_alt.txt, 7546.20_async.txt, 7546.21_v1.txt, hint_spikes.png, suggestion1.txt, suggestion1_21.txt,
[jira] [Comment Edited] (CASSANDRA-7546) AtomicSortedColumns.addAllWithSizeDelta has a spin loop that allocates memory
[ https://issues.apache.org/jira/browse/CASSANDRA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134635#comment-14134635 ] graham sanderson edited comment on CASSANDRA-7546 at 9/15/14 10:50 PM: --- Finally getting back to this, been doing other things (this slightly lower priority as we have it in production already) as well as breaking myself physically, requiring orthopedic visits! I just realized that the version c6a2c65a75ade being voted on for 2.1.0 that I deployed is not the same as 2.1.0 released. I am now upgrading, since cassandra-stress changes snuck in. Note that I plan to stress using 1024, 256, 16, 1 partitions, with all 5 nodes up, and then with 4 nodes up and one down to test the effect of hinting (note repl factor of 3 and cl=LOCAL_QUORUM), as well as with at least memtable_allocation_type = heap_buffers and off_heap_buffers. I want to do one cell insert per batch... I'm upgrading in part because of the new visit/revisit stuff - I'm not 100% sure how to use them correctly; I'll keep playing, but you may answer before I have finished upgrading and tried with this. My first attempt on the original 2.1.0 revision ended up with only one clustering key value per partition, which is not what I wanted (because it'll make trees small). Sample YAML for 1024 partitions:
{code}
#
# This is an example YAML profile for cassandra-stress
#
# insert data
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1)
#
# read, using query simple1:
# cassandra-stress profile=/home/jake/stress1.yaml ops(simple1=1)
#
# mixed workload (90/10)
# cassandra-stress user profile=/home/jake/stress1.yaml ops(insert=1,simple1=9)

#
# Keyspace info
#
keyspace: stresscql

#
# The CQL for creating a keyspace (optional if it already exists)
#
keyspace_definition: |
  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

#
# Table info
#
table: testtable

#
# The CQL for creating a table you wish to stress (optional if it already exists)
#
table_definition: |
  CREATE TABLE testtable (
        p text,
        c text,
        v blob,
        PRIMARY KEY(p, c)
  ) WITH COMPACT STORAGE
    AND compaction = { 'class':'LeveledCompactionStrategy' }
    AND comment='TestTable'

#
# Optional meta information on the generated columns in the above table
# The min and max only apply to text and blob types
# The distribution field represents the total unique population
# distribution of that column across rows.  Supported types are
#
#      EXP(min..max)                   An exponential distribution over the range [min..max]
#      EXTREME(min..max,shape)         An extreme value (Weibull) distribution over the range [min..max]
#      GAUSSIAN(min..max,stdvrng)      A gaussian/normal distribution, where mean=(min+max)/2, and stdev is (mean-min)/stdvrng
#      GAUSSIAN(min..max,mean,stdev)   A gaussian/normal distribution, with explicitly defined mean and stdev
#      UNIFORM(min..max)               A uniform distribution over the range [min, max]
#      FIXED(val)                      A fixed distribution, always returning the same value
#      Aliases: extr, gauss, normal, norm, weibull
#
#      If preceded by ~, the distribution is inverted
#
# Defaults for all columns are size: uniform(4..8), population: uniform(1..100B), cluster: fixed(1)
#
columnspec:
  - name: p
    size: fixed(16)
    population: uniform(1..1024) # the range of unique values to select for the field (default is 100Billion)
  - name: c
    size: fixed(26)
    #cluster: uniform(1..100B)
  - name: v
    size: gaussian(50..250)

insert:
  partitions: fixed(1)  # number of unique partitions to update in a single operation
                        # if batchcount > 1, multiple batches will be used but all partitions will
                        # occur in all batches (unless they finish early); only the row counts will vary
  batchtype: LOGGED     # type of batch to use
  visits: fixed(10M)    # not sure about this

queries:
   simple1: select * from testtable where k = ? and v = ? LIMIT 10
{code}
Command-line
{code}
./cassandra-stress user profile=~/cqlstress-1024.yaml ops\(insert=1\) cl=LOCAL_QUORUM -node $NODES -mode native prepared cql3 | tee results/results-2.1.0-p1024-a.txt
{code}
was (Author: graham sanderson): Finally getting back to this, been doing other things (this slightly lower priority as we have it in production already)... I just realized that the version c6a2c65a75ade being voted on for 2.1.0 that I deployed is not the same as 2.1.0 released. I am now upgrading, since cassandra-stress changes snuck in. Note that I plan to stress using 1024, 256, 16, 1 partitions, with all 5 nodes up, and then with 4 nodes up and one down to test effect of hinting, (note repl
[jira] [Commented] (CASSANDRA-7928) Digest queries do not require alder32 checks
[ https://issues.apache.org/jira/browse/CASSANDRA-7928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14134642#comment-14134642 ] Brandon Williams commented on CASSANDRA-7928: - Removing the adler32 checksum for digests seems like the most reasonable thing we can do in a minor release, but CASSANDRA-7130 sounds like a more complete solution. Digest queries do not require alder32 checks Key: CASSANDRA-7928 URL: https://issues.apache.org/jira/browse/CASSANDRA-7928 Project: Cassandra Issue Type: Improvement Reporter: sankalp kohli Priority: Minor Fix For: 2.1.1 While reading data from sstables, C* does Alder32 checks for any data being read. We have seen that this causes higher CPU usage while doing kernel profiling. These checks might not be useful for digest queries as they will have a different digest in case of corruption. -- This message was sent by Atlassian JIRA (v6.3.4#6332)