[jira] [Updated] (CASSANDRA-4277) hsha default thread limits make no sense, and yaml comments look confused
[ https://issues.apache.org/jira/browse/CASSANDRA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4277:
--
Fix Version/s: 1.2

hsha default thread limits make no sense, and yaml comments look confused
-
Key: CASSANDRA-4277
URL: https://issues.apache.org/jira/browse/CASSANDRA-4277
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Fix For: 1.2

The cassandra.yaml states with respect to {{rpc_max_threads}}:

{code}
# For the Hsha server, the min and max both default to quadruple the number of
# CPU cores.
{code}

The code does indeed seem to do this. But as far as I can tell this makes no sense whatsoever, since the number of concurrent RPC threads you need is a function of the throughput and the average latency of requests (which includes synchronously waiting on network traffic). Defaulting to anything having to do with CPU cores seems inherently wrong. If a default is to be non-static, a closer guess might be to look at thread stack size and heap size and infer what might be reasonable.

*NOTE*: The effect of having this too low is strange (if you don't know what's going on) latencies observed from the client on all thrift requests (*any* thrift request, including e.g. {{describe_ring()}}), which are not visible in any latency metric exposed by Cassandra. This is why I consider this major, since unwitting users may be seeing detrimental performance for no good reason.

In addition, I read this about async:

{code}
# async - Nonblocking server implementation with one thread to serve
# rpc connections. This is not recommended for high throughput use
# cases. Async has been tested to be about 50% slower than sync
# or hsha and is deprecated: it will be removed in the next major release.
{code}

This makes even less sense. Running with *one* rpc thread limits you to a single concurrent request. How was that 50% number even attained? By single-node testing that was completely CPU bound locally on a node? The actual effect should be stupidly slow in any real situation with lots of requests on a cluster of many nodes and network traffic (though I didn't test that), especially in the event of any kind of hiccup like a node doing GC. I agree that if the above is true, async should *definitely* be deprecated, but the reasons seem *much* stronger than implied.

I may be missing something here, in which case I apologize, but I specifically double-checked after I fixed this setting on our clusters after seeing exactly the expected side-effect of having it be too low. I always was under the impression that rpc_max_threads affects the number of RPC requests running concurrently, and code inspection (it being used for the worker thread limit) plus the effects on client-observed latency are consistent with my understanding.

I suspect the setting was set strangely by someone because the phrasing of the comments in {{cassandra.yaml}} strongly suggests that this should be tied to CPU cores, hiding the fact that this really has to do with the number of requests that can be serviced concurrently, regardless of implementation details of thrift/networking being sync/async/etc.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
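The sizing argument in the CASSANDRA-4277 description above (the thread limit should follow from throughput and average latency, not from core count) can be illustrated with Little's Law. This is only a sketch: the class name, the method names and the workload figures are assumptions made up for illustration and are not taken from Cassandra or from the ticket; only the "quadruple the number of CPU cores" rule comes from the quoted yaml comment.

```java
final class RpcThreadEstimate
{
    /**
     * Little's Law: average number of requests in flight equals
     * arrival rate multiplied by average time spent per request.
     * Integer arithmetic is used so the result is rounded up exactly.
     */
    static long requiredConcurrency(long requestsPerSecond, long avgLatencyMillis)
    {
        return (requestsPerSecond * avgLatencyMillis + 999) / 1000;
    }

    /** The hsha default as described by the quoted cassandra.yaml comment. */
    static int hshaDefault(int cpuCores)
    {
        return 4 * cpuCores;
    }

    public static void main(String[] args)
    {
        // Hypothetical workload: 5000 requests/s, 20 ms average latency, 4 cores.
        long needed = requiredConcurrency(5000, 20);
        int configured = hshaDefault(4);
        System.out.println("requests in flight on average: " + needed);
        System.out.println("threads allowed by 4 x cores:  " + configured);
    }
}
```

Under these assumed numbers the workload keeps about 100 requests in flight while the core-based default allows 16 worker threads, which is the kind of mismatch the description says shows up as unexplained client-side latency.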
[jira] [Updated] (CASSANDRA-4277) hsha default thread limits make no sense, and yaml comments look confused
[ https://issues.apache.org/jira/browse/CASSANDRA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4277:
--
Attachment: CASSANDRA-4277-trunk.txt

Attaching suggested patch against trunk. Since the pre-existing comments claim async will be removed in the next major release, I removed it entirely from comments (but not the code). I re-phrased some of the stuff and added an attempted explanation for the user as to how to figure out what limit to set. As usual I think I may be too verbose; maybe it's better to just refer to a separate wiki page than to try to explain inline?

As an aside, I'd favor making hsha the default despite it being slower on Windows, though that's a concern not within the scope of this ticket. I didn't make that change in the patch.
[jira] [Updated] (CASSANDRA-4277) hsha default thread limits make no sense, and yaml comments look confused
[ https://issues.apache.org/jira/browse/CASSANDRA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4277:
--
Reviewer: slebresne
[jira] [Created] (CASSANDRA-4282) Improve startup time by making row cache population multi threaded
Marcus Eriksson created CASSANDRA-4282:
--
Summary: Improve startup time by making row cache population multi threaded
Key: CASSANDRA-4282
URL: https://issues.apache.org/jira/browse/CASSANDRA-4282
Project: Cassandra
Issue Type: Improvement
Reporter: Marcus Eriksson
Priority: Minor
Fix For: 1.2

The attached patch multi-threads the row cache population. My small tests show improvements, at least:

{code}
- 4 cores
- 11G stress-generated data
- page cache dropped between runs

single platter spinning disk:

single threaded:
 INFO 10:18:21,365 completed loading (245562 ms; 311381 keys) row cache for Keyspace1.Standard1
 INFO 10:32:43,738 completed loading (255106 ms; 311381 keys) row cache for Keyspace1.Standard1

multi threaded:
 INFO 10:22:47,567 completed loading (213905 ms; 311381 keys) row cache for Keyspace1.Standard1
 INFO 10:27:26,873 completed loading (214514 ms; 311381 keys) row cache for Keyspace1.Standard1

ssd:

single threaded:
 INFO 10:40:49,079 completed loading (103883 ms; 311381 keys) row cache for Keyspace1.Standard1
 INFO 10:43:45,799 completed loading (106913 ms; 311381 keys) row cache for Keyspace1.Standard1

multi threaded:
 INFO 10:38:20,798 completed loading (57617 ms; 311381 keys) row cache for Keyspace1.Standard1
 INFO 10:47:20,339 completed loading (56682 ms; 311381 keys) row cache for Keyspace1.Standard1
{code}
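The general idea behind CASSANDRA-4282 above, reading the saved cache keys with several threads instead of one, might look roughly like the following. This is not the attached patch: the class name, the method name, the generic key/row types and the use of a fixed thread pool are all assumptions chosen to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ParallelCacheLoader
{
    /**
     * Populates a cache by reading every saved key on a pool of worker threads.
     * readRow stands in for the (hypothetical) disk read and must not return null,
     * because ConcurrentHashMap rejects null values.
     */
    static <K, V> Map<K, V> load(List<K> savedKeys, Function<K, V> readRow, int threads)
    {
        Map<K, V> cache = new ConcurrentHashMap<>();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try
        {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (K key : savedKeys)
            {
                tasks.add(() -> {
                    cache.put(key, readRow.apply(key));
                    return null;
                });
            }
            // invokeAll blocks until all reads are done; get() surfaces any failure.
            for (Future<Void> result : executor.invokeAll(tasks))
                result.get();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while loading cache", e);
        }
        catch (ExecutionException e)
        {
            throw new IllegalStateException("row read failed", e.getCause());
        }
        finally
        {
            executor.shutdown();
        }
        return cache;
    }

    public static void main(String[] args)
    {
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++)
            keys.add(i);
        Map<Integer, Integer> cache = load(keys, k -> k * 2, 4);
        System.out.println("loaded " + cache.size() + " keys");
    }
}
```

In the numbers reported above the gain is much larger on SSD (roughly 105 s down to 57 s) than on the spinning disk (roughly 250 s down to 214 s), which is what one would expect if a single disk head limits how much concurrent reads can help.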
[jira] [Commented] (CASSANDRA-4282) Improve startup time by making row cache population multi threaded
[ https://issues.apache.org/jira/browse/CASSANDRA-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283226#comment-13283226 ]

Marcus Eriksson commented on CASSANDRA-4282:

btw, unsure how this relates to CASSANDRA-3762; there might be conflicts.
[jira] [Updated] (CASSANDRA-4282) Improve startup time by making row cache population multi threaded
[ https://issues.apache.org/jira/browse/CASSANDRA-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-4282:
---
Attachment: 0001-improve-startup-time-by-multi-threading-row-cache-re.patch
[jira] [Commented] (CASSANDRA-3794) Avoid ID conflicts from concurrent schema changes
[ https://issues.apache.org/jira/browse/CASSANDRA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283235#comment-13283235 ]

Sylvain Lebresne commented on CASSANDRA-3794:
-
Looking more closely, there are actually two problems with respect to rolling upgrades:

# Because a newly created CF won't have an old-format id, people shouldn't create any CF in a mixed-version cluster. That would clearly be fine for a major upgrade; it's more annoying to roll this out in a minor upgrade though, imo. I don't think there is anything we can do about that.
# As is, streaming won't work in a mixed-version cluster (as is the case in a major upgrade) by virtue of the following code in IncomingTcpConnection:
{noformat}
if (version == MessagingService.version_)
{
}
else
{
    // streaming connections are per-session and have a fixed version. we can't do anything with a wrong-version stream connection, so drop it.
    logger.error("Received stream using protocol version {} (my version {}). Terminating connection", version, MessagingService.version_);
}
{noformat}
We could avoid that by, say, adding some isStreamingCompatible(v1, v2) method that would return true for VERSION_11 and VERSION_111, since after all there is no change to the stream format. However, the patch also needs to version StreamRequestMessage correctly for it to work.

Overall, this is not a small patch, and it will induce more limited rolling upgrade behavior than is the norm in a minor version, so I'll admit I'm personally growing more in favor of solution #2 above (postpone to 1.2).

That being said, on the patch itself:

* In RowCacheKey.compareTo(), == is used instead of equals().
* In Schema, we can remove the cfIdGen field MIN_CF_ID.
* nameUUIDFromBytes already does an md5 internally, so we should just pass the concatenation of the ksName and cfName bytes (doubling the md5 slightly increases the chance of collisions).
* When writing the schema, for the id column, the code writes a string/UUID (toSchemaNoColumns) but expects an int when reading (fromSchemaNoColumns). The fact is, we don't need to save the new-style id in the schema since we can recompute it. So we should keep the id column for the oldId (if it exists). Also, when writing a CF schema, we should check whether it has an associated old cfId and write it if it does (i.e. we should preserve the old id mapping, when it exists, for now; we'll drop that in a future version).
* Schema.addOldCfIdMapping should check for a null value for the oldId and ignore it, since in fromSchemaNoColumns, result.getInt(id) will return null for a new CF.
* ColumnFamilySerializer needs to version the serialized format when we talk to an old node (same in the RowMutation serialize method). Of course, when a CF doesn't have an old id, we'll have to throw an exception instead (that's the 'user shouldn't create CF in a mixed cluster' case).
* StreamRequestMessage should version cfId correctly.
* In SchemaLoader, I'm not sure we want to always assign an old-style id to the CF. Instead, it would probably be better to add a few specific tests (serialization tests?) that validate that the old ids are correctly handled.
* OCD nit: convertOldCFId could be renamed to convertOldCfId for consistency with the rest (i.e. the 'F' could be lowercased) :P

Avoid ID conflicts from concurrent schema changes
-
Key: CASSANDRA-3794
URL: https://issues.apache.org/jira/browse/CASSANDRA-3794
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Fix For: 1.1.1
Attachments: CASSANDRA-3794.patch

Change ColumnFamily identifiers to be UUIDs instead of sequential Integers. This would be useful in the situation where nodes are simultaneously trying to create ColumnFamilies.
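Two of the suggestions in the CASSANDRA-3794 comment above, the isStreamingCompatible(v1, v2) check and deriving the new-style id from the keyspace and column family names with a single md5, can be sketched as follows. This is only an illustration: the class name, the cfId method name, the numeric values of the version constants and the choice of UTF-8 are assumptions; only isStreamingCompatible, VERSION_11, VERSION_111 and nameUUIDFromBytes are named in the comment.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class Cassandra3794Sketch
{
    // Hypothetical numeric values; the comment only names the constants.
    static final int VERSION_11 = 4;
    static final int VERSION_111 = 5;

    /** Versions that, per the comment, share an unchanged stream format. */
    private static boolean usesCommonStreamFormat(int version)
    {
        return version == VERSION_11 || version == VERSION_111;
    }

    /** True when a stream written by one version can be read by the other. */
    static boolean isStreamingCompatible(int v1, int v2)
    {
        return v1 == v2 || (usesCommonStreamFormat(v1) && usesCommonStreamFormat(v2));
    }

    /**
     * Name-based (type 3) UUID over the concatenated names.
     * UUID.nameUUIDFromBytes applies MD5 itself, so the input is not pre-hashed.
     */
    static UUID cfId(String ksName, String cfName)
    {
        byte[] name = (ksName + cfName).getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(name);
    }

    public static void main(String[] args)
    {
        System.out.println(isStreamingCompatible(VERSION_11, VERSION_111));
        System.out.println(cfId("Keyspace1", "Standard1"));
    }
}
```

Because the id is a pure function of the two names, every node computes the same value without coordination, which is why the comment notes that the new-style id does not need to be stored in the schema.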
git commit: Fix dataSize() in DeletionInfo
Updated Branches:
  refs/heads/trunk 2979820e5 -> 5ab69b62c

Fix dataSize() in DeletionInfo

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5ab69b62
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5ab69b62
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5ab69b62

Branch: refs/heads/trunk
Commit: 5ab69b62cce884266e0e1ba5a8a0e64137b1098f
Parents: 2979820
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Fri May 25 11:29:47 2012 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Fri May 25 11:29:47 2012 +0200

--
 src/java/org/apache/cassandra/db/DeletionInfo.java | 5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5ab69b62/src/java/org/apache/cassandra/db/DeletionInfo.java
--
diff --git a/src/java/org/apache/cassandra/db/DeletionInfo.java b/src/java/org/apache/cassandra/db/DeletionInfo.java
index e0fbb79..97fd4d6 100644
--- a/src/java/org/apache/cassandra/db/DeletionInfo.java
+++ b/src/java/org/apache/cassandra/db/DeletionInfo.java
@@ -205,7 +205,10 @@ public class DeletionInfo
     {
         int size = TypeSizes.NATIVE.sizeof(topLevel.markedForDeleteAt);
         for (RangeTombstone r : ranges)
-            size += r.data.markedForDeleteAt;
+        {
+            size += r.min.remaining() + r.max.remaining();
+            size += TypeSizes.NATIVE.sizeof(r.data.markedForDeleteAt);
+        }
         return size;
     }
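The effect of the dataSize() change in the commit above can be reproduced in isolation. The sketch below is not Cassandra code: the Tombstone class is a simplified, hypothetical stand-in for RangeTombstone, and Long.BYTES is assumed to play the role of TypeSizes.NATIVE.sizeof() applied to a long. It shows that the old line added the timestamp value itself to the size, while the new lines add the bytes of both bounds plus the 8 bytes the timestamp occupies.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

final class DataSizeSketch
{
    /** Simplified stand-in for a range tombstone: two bounds and a deletion timestamp. */
    static final class Tombstone
    {
        final ByteBuffer min;
        final ByteBuffer max;
        final long markedForDeleteAt;

        Tombstone(ByteBuffer min, ByteBuffer max, long markedForDeleteAt)
        {
            this.min = min;
            this.max = max;
            this.markedForDeleteAt = markedForDeleteAt;
        }
    }

    /** Old behaviour: the timestamp value (not its size) is added, and bounds are ignored. */
    static int sizeBeforeFix(List<Tombstone> ranges)
    {
        int size = Long.BYTES; // top-level markedForDeleteAt
        for (Tombstone r : ranges)
            size += r.markedForDeleteAt; // compiles because += narrows implicitly
        return size;
    }

    /** New behaviour: bytes of both bounds plus the size of the timestamp. */
    static int sizeAfterFix(List<Tombstone> ranges)
    {
        int size = Long.BYTES; // top-level markedForDeleteAt
        for (Tombstone r : ranges)
        {
            size += r.min.remaining() + r.max.remaining();
            size += Long.BYTES;
        }
        return size;
    }

    public static void main(String[] args)
    {
        List<Tombstone> ranges = Arrays.asList(
            new Tombstone(ByteBuffer.allocate(3), ByteBuffer.allocate(5), 1000L),
            new Tombstone(ByteBuffer.allocate(3), ByteBuffer.allocate(5), 2000L));
        System.out.println("before fix: " + sizeBeforeFix(ranges)); // prints "before fix: 3008"
        System.out.println("after fix:  " + sizeAfterFix(ranges));  // prints "after fix:  40"
    }
}
```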
[jira] [Resolved] (CASSANDRA-3708) Support composite prefix tombstones
[ https://issues.apache.org/jira/browse/CASSANDRA-3708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne resolved CASSANDRA-3708.
-
Resolution: Fixed

Right, I've been arguably a little too hasty with my rebase-before-commit. Fixed in commit 5ab69b6, thanks.

Support composite prefix tombstones
-
Key: CASSANDRA-3708
URL: https://issues.apache.org/jira/browse/CASSANDRA-3708
Project: Cassandra
Issue Type: Sub-task
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Fix For: 1.2
[jira] [Commented] (CASSANDRA-4277) hsha default thread limits make no sense, and yaml comments look confused
[ https://issues.apache.org/jira/browse/CASSANDRA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283264#comment-13283264 ]

Sylvain Lebresne commented on CASSANDRA-4277:
-
I do think it's maybe a bit too much info for a config file :). I'd remove the two last paragraphs and maybe rewrite the second one with something along the lines of: "The default is unlimited and thus provides no protection against clients overwhelming the server. You are encouraged to set a maximum that makes sense for you in production, but do keep in mind that rpc_max_threads represents the maximum number of client requests this server may execute concurrently."

Also, there's a typo: "Regardless of your *cohice*".

And for "# rpc_max_threads: (unlimited)", it suggests that '(unlimited)' is a valid value, which is not the case. I'd prefer leaving the 2048 value; it does not have to represent the actual default (it doesn't always for other configs).
[jira] [Updated] (CASSANDRA-3881) reduce computational complexity of processing topology changes
[ https://issues.apache.org/jira/browse/CASSANDRA-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sam Overton updated CASSANDRA-3881:
---
Description:
This constitutes follow-up work from CASSANDRA-3831 where a partial improvement was committed, but the fundamental issue was not fixed. The maximum practical cluster size was significantly improved, but further work is expected to be necessary as cluster sizes grow.

_Edit0: Appended patch information._

h3. Patches
||Compare||Raw diff||Description||
|[00_snitch_topology|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/00_snitch_topology...p/3881/00_snitch_topology]|[00_snitch_topology.patch|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/00_snitch_topology...p/3881/00_snitch_topology.diff]|Adds some functionality to TokenMetadata to track which endpoints and racks exist in a DC.|
|[01_calc_natural_endpoints|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/01_calc_natural_endpoints...p/3881/01_calc_natural_endpoints]|[01_calc_natural_endpoints.patch|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/01_calc_natural_endpoints...p/3881/01_calc_natural_endpoints.diff]|Rewritten O(logN) implementation of calculateNaturalEndpoints using the topology information from the tokenMetadata.|

_Note: These are branches managed with TopGit. If you are applying the patch output manually, you will either need to filter the TopGit metadata files (i.e. {{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove them afterward ({{rm .topmsg .topdeps}})._

was:
This constitutes follow-up work from CASSANDRA-3831 where a partial improvement was committed, but the fundamental issue was not fixed. The maximum practical cluster size was significantly improved, but further work is expected to be necessary as cluster sizes grow.

_Edit0: Appended patch information._

h3. Patches
||Compare||Raw diff||Description||
|[00_snitch_topology|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/00_snitch_topology...p/3881/00_snitch_topology]|[00_snitch_topology.patch|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/00_snitch_topology...p/3881/00_snitch_topology.diff]|Adds some functionality to AbstractEndpointSnitch to track which endpoints and racks exist in a DC.|
|[01_calc_natural_endpoints|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/01_calc_natural_endpoints...p/3881/01_calc_natural_endpoints]|[01_calc_natural_endpoints.patch|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/01_calc_natural_endpoints...p/3881/01_calc_natural_endpoints.diff]|Rewritten O(logN) implementation of calculateNaturalEndpoints using the topology information from the snitch.|

_Note: These are branches managed with TopGit. If you are applying the patch output manually, you will either need to filter the TopGit metadata files (i.e. {{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove them afterward ({{rm .topmsg .topdeps}})._

reduce computational complexity of processing topology changes
--
Key: CASSANDRA-3881
URL: https://issues.apache.org/jira/browse/CASSANDRA-3881
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Sam Overton
Labels: vnodes
[jira] [Commented] (CASSANDRA-3881) reduce computational complexity of processing topology changes
[ https://issues.apache.org/jira/browse/CASSANDRA-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283289#comment-13283289 ] Sam Overton commented on CASSANDRA-3881: The original approach was not quite there. The snitch was tracking the topology of nodes in NORMAL state for the benefit of NTS.calculateNaturalEndpoints, but calculateNaturalEndpoints is called with modified TokenMetadata (eg, with leaving nodes removed, or a bootstrapped node added or some other modification) to calculate ranges for some future state of the ring, not the current state as tracked by the snitch. The correct solution is to have TokenMetadata track the topology of the nodes which it considers to be part of the ring, so that when a tokenMetadata is cloned and modified it also updates its view of the topology. This is also much simpler and cleaner. Patches above are updated. reduce computational complexity of processing topology changes -- Key: CASSANDRA-3881 URL: https://issues.apache.org/jira/browse/CASSANDRA-3881 Project: Cassandra Issue Type: Bug Components: Core Reporter: Peter Schuller Assignee: Sam Overton Labels: vnodes This constitutes follow-up work from CASSANDRA-3831 where a partial improvement was committed, but the fundamental issue was not fixed. The maximum practical cluster size was significantly improved, but further work is expected to be necessary as cluster sizes grow. _Edit0: Appended patch information._ h3. 
Patches ||Compare||Raw diff||Description|| |[00_snitch_topology|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/00_snitch_topology...p/3881/00_snitch_topology]|[00_snitch_topology.patch|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/00_snitch_topology...p/3881/00_snitch_topology.diff]|Adds some functionality to AbstractEndpointSnitch to track which endpoints and racks exist in a DC.| |[01_calc_natural_endpoints|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/01_calc_natural_endpoints...p/3881/01_calc_natural_endpoints]|[01_calc_natural_endpoints.patch|https://github.com/acunu/cassandra/compare/refs/top-bases/p/3881/01_calc_natural_endpoints...p/3881/01_calc_natural_endpoints.diff]|Rewritten O(logN) implementation of calculateNaturalEndpoints using the topology information from the snitch.| _Note: These are branches managed with TopGit. If you are applying the patch output manually, you will either need to filter the TopGit metadata files (i.e. {{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove them afterward ({{rm .topmsg .topdeps}})._ -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
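The idea in the comment above, that the ring metadata itself should own the topology view so that a cloned and modified copy stays self-consistent, can be sketched roughly as follows. This is only an illustration: the class `RingView` and its methods are hypothetical and are not Cassandra's actual `TokenMetadata` API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for ring metadata that tracks its own topology.
final class RingView {
    // datacenter name -> endpoints currently considered part of the ring
    private final Map<String, Set<String>> endpointsByDc = new HashMap<>();

    void addEndpoint(String dc, String endpoint) {
        endpointsByDc.computeIfAbsent(dc, k -> new HashSet<>()).add(endpoint);
    }

    void removeEndpoint(String dc, String endpoint) {
        Set<String> endpoints = endpointsByDc.get(dc);
        if (endpoints == null) return;
        endpoints.remove(endpoint);
        if (endpoints.isEmpty()) endpointsByDc.remove(dc);
    }

    // Returns a defensive copy so callers cannot mutate internal state.
    Set<String> endpointsIn(String dc) {
        return new HashSet<>(endpointsByDc.getOrDefault(dc, new HashSet<>()));
    }

    // Deep copy: changes to the copy (e.g. removing a leaving node)
    // do not affect the view of the current ring.
    RingView copy() {
        RingView clone = new RingView();
        for (Map.Entry<String, Set<String>> e : endpointsByDc.entrySet()) {
            clone.endpointsByDc.put(e.getKey(), new HashSet<>(e.getValue()));
        }
        return clone;
    }
}
```

With this shape, a "future ring" is obtained by calling `copy()` and then removing or adding endpoints on the copy; the topology seen by an endpoint calculation always matches the ring it was given.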
[jira] [Updated] (CASSANDRA-4238) Pig secondary index usage could be improved
[ https://issues.apache.org/jira/browse/CASSANDRA-4238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-4238: Reviewer: xedin Pig secondary index usage could be improved --- Key: CASSANDRA-4238 URL: https://issues.apache.org/jira/browse/CASSANDRA-4238 Project: Cassandra Issue Type: Improvement Components: Hadoop Affects Versions: 1.1.0 Reporter: Brandon Williams Assignee: Brandon Williams Attachments: 4238-v2.txt, 4238-v3.txt, 4238.txt As Dmitriy suggested on CASSANDRA-2246, CassandraStorage could implement LoadMetadata.getPartitionKeys and LoadMetadata.setPartitionFilter to automatically apply secondary indexes.
[jira] [Commented] (CASSANDRA-2478) Custom CQL protocol/transport
[ https://issues.apache.org/jira/browse/CASSANDRA-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283379#comment-13283379 ] Sylvain Lebresne commented on CASSANDRA-2478: - bq. The ResultSetMetaData interface provides methods for getSchemaName(column) and getTableName(column) on a column-by-column basis Which raises the question: do we also want to allow per-column keyspace/table names? In C*'s current state this is not needed, since one can only query one table at a time. But wiring that into the protocol could be limiting in the future. On the other hand, it will be simpler and more compact to allow only one keyspace/table name, and adding queries on multiple tables, if we ever do it, won't be a small addition, so maybe we're fine with having it trigger a bump in the protocol version when that happens. I suppose we could support both versions through a simple flag that says whether there is just one keyspace/table pair or one per column, but that complicates the protocol for something that may well never be useful. Opinions? bq. use this as an opportunity to get rid of our custom authentication/authorization, and add hooks for SASL instead I'm not against that in theory. But I'll admit to not knowing all the nuts and bolts of SASL. From an initial read, it seems the protocol part is fairly simple: it's just a couple of simple messages carrying strings to support. However, what's less clear to me is how to wire that in on the Cassandra side, and in particular how to ensure some form of compatibility with our current IAuthenticator interface. 
Custom CQL protocol/transport - Key: CASSANDRA-2478 URL: https://issues.apache.org/jira/browse/CASSANDRA-2478 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Eric Evans Assignee: Sylvain Lebresne Priority: Minor Labels: cql Attachments: cql_binary_protocol A custom wire protocol would give us the flexibility to optimize for our specific use-cases, and eliminate a troublesome dependency (I'm referring to Thrift, but none of the others would be significantly better). Additionally, RPC is a bad fit here, and we'd do better to move in the direction of something that natively supports streaming. I don't think this is as daunting as it might seem initially. Utilizing an existing server framework like Netty, combined with some copy-and-paste of bits from other FLOSS projects, would probably get us 80% of the way there.
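The flag-based option floated in the comment above (one keyspace/table pair for the whole result, or one pair per column) could look roughly like the following sketch. The byte layout, the flag value and all names here are assumptions made for illustration; none of them are taken from the attached protocol draft.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical result-metadata codec; not part of any real CQL protocol.
final class ResultMetadataCodec {
    // Flag bit: a single keyspace/table pair applies to every column.
    static final int GLOBAL_TABLE_SPEC = 0x01;

    static byte[] encode(String[] keyspaces, String[] tables, String[] columns) {
        boolean global = true;
        for (int i = 1; i < columns.length; i++) {
            if (!keyspaces[i].equals(keyspaces[0]) || !tables[i].equals(tables[0])) global = false;
        }
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(global ? GLOBAL_TABLE_SPEC : 0);
            out.writeInt(columns.length);
            if (global && columns.length > 0) {
                out.writeUTF(keyspaces[0]);
                out.writeUTF(tables[0]);
            }
            for (int i = 0; i < columns.length; i++) {
                if (!global) {
                    out.writeUTF(keyspaces[i]);
                    out.writeUTF(tables[i]);
                }
                out.writeUTF(columns[i]);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
    }

    // Decodes to fully qualified "keyspace.table.column" names.
    static String[] decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            boolean global = (in.readByte() & GLOBAL_TABLE_SPEC) != 0;
            int count = in.readInt();
            String ks = null, table = null;
            if (global && count > 0) {
                ks = in.readUTF();
                table = in.readUTF();
            }
            String[] names = new String[count];
            for (int i = 0; i < count; i++) {
                if (!global) {
                    ks = in.readUTF();
                    table = in.readUTF();
                }
                names[i] = ks + "." + table + "." + in.readUTF();
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The sketch shows the trade-off discussed in the comment: the common single-table case stays compact, at the cost of a second code path in every encoder and decoder.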
[jira] [Updated] (CASSANDRA-1974) PFEPS-like snitch that uses gossip instead of a property file
[ https://issues.apache.org/jira/browse/CASSANDRA-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-1974: Reviewer: vijay2...@yahoo.com PFEPS-like snitch that uses gossip instead of a property file - Key: CASSANDRA-1974 URL: https://issues.apache.org/jira/browse/CASSANDRA-1974 Project: Cassandra Issue Type: New Feature Reporter: Brandon Williams Assignee: Brandon Williams Priority: Minor Attachments: 1974.txt Now that we have an ec2 snitch that propagates its rack/dc info via gossip from CASSANDRA-1654, it doesn't make a lot of sense to use PFEPS where you have to rsync the property file across all the machines when you add a node. Instead, we could have a snitch where you specify its rack/dc in a property file, and propagate this via gossip like the ec2 snitch. In order to not break PFEPS, this should probably be a new snitch.
[jira] [Updated] (CASSANDRA-3794) Avoid ID conflicts from concurrent schema changes
[ https://issues.apache.org/jira/browse/CASSANDRA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-3794: --- Attachment: CASSANDRA-3794-v2.patch v2 that addresses all points. Avoid ID conflicts from concurrent schema changes - Key: CASSANDRA-3794 URL: https://issues.apache.org/jira/browse/CASSANDRA-3794 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Fix For: 1.1.1 Attachments: CASSANDRA-3794-v2.patch, CASSANDRA-3794.patch Change ColumnFamily identifiers to be UUIDs instead of sequential Integers. This would be useful in situations where nodes are simultaneously trying to create ColumnFamilies.
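Why sequential integers conflict and UUIDs do not can be shown with a very small sketch. The class and method names are hypothetical, and random (type 4) UUIDs are used purely as an assumption; the attached patches may derive their UUIDs differently.

```java
import java.util.UUID;

// Hypothetical ID allocation helpers; not Cassandra's schema code.
final class SchemaIds {
    // Sequential scheme: each node picks "highest ID I know of" + 1.
    // Two nodes with the same view of the schema pick the same value.
    static int nextSequentialId(int highestKnownId) {
        return highestKnownId + 1;
    }

    // UUID scheme: no coordination between nodes is needed.
    static UUID newColumnFamilyId() {
        return UUID.randomUUID();
    }
}
```

Two nodes that both know of IDs up to 1000 would each allocate 1001 under the sequential scheme, whereas two independently generated UUIDs collide only with negligible probability.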
[jira] [Commented] (CASSANDRA-3127) Message (inter-node) compression
[ https://issues.apache.org/jira/browse/CASSANDRA-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283474#comment-13283474 ] Marcus Eriksson commented on CASSANDRA-3127: Could we perhaps always compress the message and check if the resulting message is smaller than the original one? MySQL does that when using client-server compression for example. I'll assign this to myself and start poking around a bit. Message (inter-node) compression Key: CASSANDRA-3127 URL: https://issues.apache.org/jira/browse/CASSANDRA-3127 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Priority: Minor CASSANDRA-3015 adds compression of streams. But it could be useful to also compress some messages. Compressing messages is easy, but what may be a little bit trickier is deciding when and which messages to compress to get the best performance. The simple solution would be to just have it either always on or always off. But for very small messages (gossip?) that may be counter-productive. On the other side of the spectrum, compression is likely always a good choice for, say, the exchange of merkle trees across data-centers. We could maybe define a message size above which we start to compress. Maybe the option to only compress cross data-center messages would be useful too (but I may also just be getting carried away).
[jira] [Assigned] (CASSANDRA-3127) Message (inter-node) compression
[ https://issues.apache.org/jira/browse/CASSANDRA-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson reassigned CASSANDRA-3127: -- Assignee: Marcus Eriksson Message (inter-node) compression Key: CASSANDRA-3127 URL: https://issues.apache.org/jira/browse/CASSANDRA-3127 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Assignee: Marcus Eriksson Priority: Minor CASSANDRA-3015 adds compression of streams. But it could be useful to also compress some messages. Compressing messages is easy, but what may be a little bit trickier is deciding when and which messages to compress to get the best performance. The simple solution would be to just have it either always on or always off. But for very small messages (gossip?) that may be counter-productive. On the other side of the spectrum, compression is likely always a good choice for, say, the exchange of merkle trees across data-centers. We could maybe define a message size above which we start to compress. Maybe the option to only compress cross data-center messages would be useful too (but I may also just be getting carried away).
[jira] [Comment Edited] (CASSANDRA-3127) Message (inter-node) compression
[ https://issues.apache.org/jira/browse/CASSANDRA-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283474#comment-13283474 ] Marcus Eriksson edited comment on CASSANDRA-3127 at 5/25/12 2:32 PM: - Could we perhaps always compress the message and check if the resulting message is smaller than the original one? And then of course send the smallest one over the wire. MySQL does that when using client-server compression for example. I'll assign this to myself and start poking around a bit. was (Author: krummas): Could we perhaps always compress the message and check if the resulting message is smaller than the original one? MySQL does that when using client-server compression for example. I'll assign this to me and start poking around a bit Message (inter-node) compression Key: CASSANDRA-3127 URL: https://issues.apache.org/jira/browse/CASSANDRA-3127 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Assignee: Marcus Eriksson Priority: Minor CASSANDRA-3015 adds compression of streams. But it could be useful to also compress some messages. Compressing messages is easy, but what may be a little bit trickier is deciding when and which messages to compress to get the best performance. The simple solution would be to just have it either always on or always off. But for very small messages (gossip?) that may be counter-productive. On the other side of the spectrum, compression is likely always a good choice for, say, the exchange of merkle trees across data-centers. We could maybe define a message size above which we start to compress. Maybe the option to only compress cross data-center messages would be useful too (but I may also just be getting carried away).
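The "compress, compare, send the smaller one" idea from the comment above can be sketched as follows. This is a minimal illustration using the JDK's `Deflater`/`Inflater`; the one-byte frame marker and the class name are assumptions, not Cassandra's messaging format.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical framing: first byte says whether the body is compressed.
final class SmallestOfCodec {
    static final byte RAW = 0;
    static final byte DEFLATED = 1;

    static byte[] encode(byte[] payload) {
        Deflater deflater = new Deflater();
        deflater.setInput(payload);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        byte[] compressed = out.toByteArray();

        // Keep the compressed form only if it actually saves bytes.
        boolean useCompressed = compressed.length < payload.length;
        byte[] body = useCompressed ? compressed : payload;
        byte[] framed = new byte[body.length + 1];
        framed[0] = useCompressed ? DEFLATED : RAW;
        System.arraycopy(body, 0, framed, 1, body.length);
        return framed;
    }

    static byte[] decode(byte[] framed) {
        if (framed[0] == RAW) return Arrays.copyOfRange(framed, 1, framed.length);
        Inflater inflater = new Inflater();
        inflater.setInput(framed, 1, framed.length - 1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or corrupt frame");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

Note the cost this approach accepts: every message pays the CPU price of compression, even the small ones that end up being sent raw, which is the concern the issue description raises for gossip-sized messages.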
[jira] [Created] (CASSANDRA-4283) CQL3: dates are not handled correctly in slices
Sylvain Lebresne created CASSANDRA-4283: --- Summary: CQL3: dates are not handled correctly in slices Key: CASSANDRA-4283 URL: https://issues.apache.org/jira/browse/CASSANDRA-4283 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.1.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 1.1.2 Our timestamp type allows timestamps to be input as dates like '2012-06-06'. However, those don't work as expected in slice queries. For instance: {noformat} SELECT * FROM timeline WHERE k = ... AND time > '2012-06-06' AND time <= '2012-06-09' {noformat} will return timestamps from '2012-06-06' and not those from '2012-06-09'. The reason, of course, is that we always translate a date the same way, using 0 for whichever part is not specified. A reasonably simple fix could be to add a new fromString(String s, boolean gt) method to AbstractType that is used when the string should be interpreted in an inequality (the boolean gt would then say which kind of inequality).
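One possible reading of the proposed fix is sketched below: a date-only string is resolved to either the first or the last millisecond of that day, depending on the inequality it appears in. This is only an illustration under stated assumptions: the class name, the meaning given to the boolean, millisecond precision, and the use of UTC are all choices made here, not details from the issue.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical helper; not the AbstractType.fromString proposed in the issue.
final class DateBounds {
    /**
     * Resolves a date such as "2012-06-06" to epoch milliseconds (UTC).
     *
     * useEndOfDay = false: first millisecond of the day (suits ">=" and "<").
     * useEndOfDay = true:  last millisecond of the day (suits ">" and "<=").
     */
    static long toMillis(String date, boolean useEndOfDay) {
        LocalDate day = LocalDate.parse(date);
        long start = day.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        if (!useEndOfDay) return start;
        long nextStart = day.plusDays(1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
        return nextStart - 1;
    }
}
```

Under this reading, `time > '2012-06-06'` compares against the end of 6 June and so excludes that whole day, while `time <= '2012-06-09'` compares against the end of 9 June and so includes that whole day.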
[jira] [Commented] (CASSANDRA-4275) Oracle Java 1.7 u4 does not allow Xss128k
[ https://issues.apache.org/jira/browse/CASSANDRA-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283483#comment-13283483 ] Edward Capriolo commented on CASSANDRA-4275: I understand wanting to keep this setting low. But there must be some fairly major underlying factor that made the JVM developers decide to raise the minimum allowable from whatever it was to 160. While the increase is large (125K-250K), the net effect on 1000 connections is not a big deal: 125MB vs 250MB. This is stack space, not heap, correct? Will using 100MB more stack be a problem in the grand scheme of Cassandra memory management? Also, we do not want some minimum setting that is likely to blow up on someone at the first sign of load. I do not think it is a bad idea to dig in and determine if we can safely code/tune Cassandra to lower the setting. But I will un-assign myself if that is the road we are going to take, because that type of analysis is not my bread-and-butter skill set. Oracle Java 1.7 u4 does not allow Xss128k - Key: CASSANDRA-4275 URL: https://issues.apache.org/jira/browse/CASSANDRA-4275 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.9, 1.1.0 Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: trunk-cassandra-4275.1.patch.txt Problem: This happens when you try to start it with default Xss setting of 128k === The stack size specified is too small, Specify at least 160k Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. Solution === Set -Xss to 256k Problem: This happens when you try to start it with Xss = 160k ERROR [Thrift:14] 2012-05-22 14:42:40,479 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thrift:14,5,main] java.lang.StackOverflowError Solution === Set -Xss to 256k
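The arithmetic behind the 125MB vs 250MB figures in the comment above is simply thread count times per-thread stack size. A small sketch, assuming one thread per connection (as with the sync Thrift server) and treating the result as an upper bound on reserved stack space rather than memory actually in use:

```java
// Hypothetical helper for back-of-the-envelope stack sizing.
final class StackBudget {
    // Total stack reservation in MiB for `threads` threads at -Xss<xssKiB>k.
    static long totalStackMiB(int threads, int xssKiB) {
        return (long) threads * xssKiB / 1024;
    }

    public static void main(String[] args) {
        System.out.println(totalStackMiB(1000, 128)); // 1000 threads at -Xss128k
        System.out.println(totalStackMiB(1000, 256)); // 1000 threads at -Xss256k
    }
}
```

For 1000 threads this gives 125 MiB at 128k and 250 MiB at 256k, matching the numbers quoted in the comment.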
[jira] [Assigned] (CASSANDRA-4275) Oracle Java 1.7 u4 does not allow Xss128k
[ https://issues.apache.org/jira/browse/CASSANDRA-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo reassigned CASSANDRA-4275: -- Assignee: (was: Edward Capriolo) Oracle Java 1.7 u4 does not allow Xss128k - Key: CASSANDRA-4275 URL: https://issues.apache.org/jira/browse/CASSANDRA-4275 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.9, 1.1.0 Reporter: Edward Capriolo Attachments: trunk-cassandra-4275.1.patch.txt Problem: This happens when you try to start it with default Xss setting of 128k === The stack size specified is too small, Specify at least 160k Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. Solution === Set -Xss to 256k Problem: This happens when you try to start it with Xss = 160k ERROR [Thrift:14] 2012-05-22 14:42:40,479 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thrift:14,5,main] java.lang.StackOverflowError Solution === Set -Xss to 256k
[jira] [Created] (CASSANDRA-4284) Improve timeuuid - date relationship
Sylvain Lebresne created CASSANDRA-4284: --- Summary: Improve timeuuid - date relationship Key: CASSANDRA-4284 URL: https://issues.apache.org/jira/browse/CASSANDRA-4284 Project: Cassandra Issue Type: Improvement Components: API Affects Versions: 1.1.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.2 We added timeuuid to CQL3, whose purpose is basically to provide a collision-free timestamp, and as a convenience such a timeuuid can be input as a date. However, two things seem non-optimal to me: * When one inserts a timeuuid using a date format, we always pick *the* UUID corresponding to this date with every other part of the UUID set to 0. This kind of defeats the purpose of a collision-free timestamp and thus greatly limits the usefulness of the date syntax. * When cqlsh prints timeuuids, it prints them as dates. But in doing so some information is lost, which can be problematic (you can't update a specific column based on the returned value). In a way, this is a cqlsh limitation, since Cassandra returns the UUID bytes. Yet it also emphasizes that, from the point of view of using them, timeuuids are more UUID than really time. For the first point, it would make more sense that when inserting a date, we actually pick a UUID with the corresponding timestamp *but* with the rest of the UUID being random. It's not quite that simple, because we don't want that randomness when dates are used in a select query, but that's roughly the same problem as CASSANDRA-4283 (and we can thus use the same solution). The second point gives an idea. We could extend the date syntax to allow it to uniquely represent a type 1 UUID. Typically, we could allow something like: '2012-06-06 12:03:00+ %a2fc07', where the part after the '%' character would be hexadecimal for the non-timestamp part of the UUID. Understanding this syntax would make it possible to work with timeuuids using only meaningful date strings, which I think would be neat. But maybe that's a crazy idea, opinions?
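The difference between the two behaviors discussed above (a fixed UUID per date versus a UUID with the same timestamp but a random remainder) can be sketched with `java.util.UUID`. The class and method names are hypothetical, and the sketch simplifies the type 1 layout: it fills the clock-sequence and node fields with random bits and keeps only the variant bits fixed, which a production generator would handle more carefully.

```java
import java.security.SecureRandom;
import java.util.UUID;

// Hypothetical helpers; not Cassandra's UUID generation code.
final class TimeUuids {
    // 100 ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
    private static final long GREGORIAN_OFFSET = 0x01B21DD213814000L;
    private static final SecureRandom RANDOM = new SecureRandom();

    // Packs a millisecond timestamp into the most significant 64 bits
    // of a version 1 UUID: time_low | time_mid | version + time_hi.
    static long msbFor(long epochMillis) {
        long ts = epochMillis * 10_000 + GREGORIAN_OFFSET;
        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHi = (ts >>> 48) & 0x0FFFL;
        return (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi;
    }

    // Fixed UUID for a date: everything except timestamp and variant is 0.
    static UUID minForTime(long epochMillis) {
        return new UUID(msbFor(epochMillis), 0x8000000000000000L);
    }

    // Same timestamp, random clock sequence and node bits.
    static UUID randomForTime(long epochMillis) {
        long lsb = (RANDOM.nextLong() & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L;
        return new UUID(msbFor(epochMillis), lsb);
    }
}
```

With this shape, `minForTime` suits bounds in a select query (it is deterministic for a given date), while `randomForTime` suits inserts, where two writes with the same date should not overwrite each other.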
[jira] [Updated] (CASSANDRA-4221) Error while deleting a columnfamily that is being compacted.
[ https://issues.apache.org/jira/browse/CASSANDRA-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-4221: --- Attachment: CASSANDRA-4221-v2.patch Error while deleting a columnfamily that is being compacted. Key: CASSANDRA-4221 URL: https://issues.apache.org/jira/browse/CASSANDRA-4221 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.0 Environment: ccm, dtest, cassandra-1.1. The error does not happen in cassandra-1.0. Reporter: Tyler Patterson Assignee: Pavel Yaskevich Fix For: 1.1.1 Attachments: CASSANDRA-4221-logging.patch, CASSANDRA-4221-v2.patch, CASSANDRA-4221.patch, system.log The following dtest command produces an error: {code}export CASSANDRA_VERSION=git:cassandra-1.1; nosetests --nocapture --nologcapture concurrent_schema_changes_test.py:TestConcurrentSchemaChanges.load_test{code} Here is the error: {code} Error occured during compaction java.util.concurrent.ExecutionException: java.io.IOError: java.io.FileNotFoundException: /tmp/dtest-6ECMgy/test/node1/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-47-Data.db (No such file or directory) at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:252) at java.util.concurrent.FutureTask.get(FutureTask.java:111) at org.apache.cassandra.db.compaction.CompactionManager.performMaximal(CompactionManager.java:239) at org.apache.cassandra.db.ColumnFamilyStore.forceMajorCompaction(ColumnFamilyStore.java:1580) at org.apache.cassandra.service.StorageService.forceTableCompaction(StorageService.java:1770) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:111) at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:45) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:226) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:251) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:857) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:795) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1450) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:90) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1285) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1383) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:807) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:679) Caused by: java.io.IOError: 
java.io.FileNotFoundException: /tmp/dtest-6ECMgy/test/node1/data/Keyspace1/Standard1/Keyspace1-Standard1-hc-47-Data.db (No such file or directory) at org.apache.cassandra.io.sstable.SSTableScanner.init(SSTableScanner.java:61) at org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:839) at org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:851) at
[jira] [Commented] (CASSANDRA-4283) CQL3: dates are not handled correctly in slices
[ https://issues.apache.org/jira/browse/CASSANDRA-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283529#comment-13283529 ] Jonathan Ellis commented on CASSANDRA-4283: --- Is the problem that it doesn't use the correct operation marker at the end of the CompositeType? CQL3: dates are not handled correctly in slices Key: CASSANDRA-4283 URL: https://issues.apache.org/jira/browse/CASSANDRA-4283 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.1.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: cql3 Fix For: 1.1.2 Our timestamp type allows timestamps to be input as dates like '2012-06-06'. However, those don't work as expected in slice queries. For instance: {noformat} SELECT * FROM timeline WHERE k = ... AND time > '2012-06-06' AND time <= '2012-06-09' {noformat} will return timestamps from '2012-06-06' and not those from '2012-06-09'. The reason, of course, is that we always translate a date the same way, using 0 for whichever part is not specified. A reasonably simple fix could be to add a new fromString(String s, boolean gt) method to AbstractType that is used when the string should be interpreted in an inequality (the boolean gt would then say which kind of inequality).
[jira] [Commented] (CASSANDRA-3794) Avoid ID conflicts from concurrent schema changes
[ https://issues.apache.org/jira/browse/CASSANDRA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283528#comment-13283528 ] Sylvain Lebresne commented on CASSANDRA-3794: - Two small nits while looking at v2 quickly: * In fromSchemaNoColumns, the try catch is not useful. * In ColumnFamilySerializer, when we serialize for an old version and can't find an oldId, it would probably be better to use a more user-friendly message like "Cannot send column family X to Y as its version is pre-1.1.1. Please update the whole cluster to 1.1.1 first." But I think we agreed that it's safer to target this at 1.2 and acknowledge that concurrent table creation will only be supported then, so this will need a rebase to trunk :) Avoid ID conflicts from concurrent schema changes - Key: CASSANDRA-3794 URL: https://issues.apache.org/jira/browse/CASSANDRA-3794 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Fix For: 1.1.1 Attachments: CASSANDRA-3794-v2.patch, CASSANDRA-3794.patch Change ColumnFamily identifiers to be UUIDs instead of sequential Integers. This would be useful in situations where nodes are simultaneously trying to create ColumnFamilies.
[jira] [Commented] (CASSANDRA-4275) Oracle Java 1.7 u4 does not allow Xss128k
[ https://issues.apache.org/jira/browse/CASSANDRA-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283533#comment-13283533 ] Jonathan Ellis commented on CASSANDRA-4275: --- bq. there must be some fairly major underlying factor that made the JVM developers decide to raise the minimum allowable from whatever it was to 160 Doubt it. Old minimum was 128, so it's not *that* big a change. I've asked Jake Farrell to have a look at Thrift/Java7 to see if anything odd is going on here. Oracle Java 1.7 u4 does not allow Xss128k - Key: CASSANDRA-4275 URL: https://issues.apache.org/jira/browse/CASSANDRA-4275 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0.9, 1.1.0 Reporter: Edward Capriolo Attachments: trunk-cassandra-4275.1.patch.txt Problem: This happens when you try to start it with default Xss setting of 128k === The stack size specified is too small, Specify at least 160k Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. Solution === Set -Xss to 256k Problem: This happens when you try to start it with Xss = 160k ERROR [Thrift:14] 2012-05-22 14:42:40,479 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thrift:14,5,main] java.lang.StackOverflowError Solution === Set -Xss to 256k
[jira] [Commented] (CASSANDRA-4283) CQL3: dates are not handled correctly in slices
[ https://issues.apache.org/jira/browse/CASSANDRA-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283538#comment-13283538 ] Sylvain Lebresne commented on CASSANDRA-4283: - No, the problem is that '2012-06-06' is always translated as '2012-06-06 00:00:00', but when you ask for {{time > '2012-06-06'}} you clearly don't expect to get something dated 6 June at one minute past midnight. In other words, '2012-06-06' denotes a time range more than a precise timestamp, and we don't handle that. CQL3: dates are not handled correctly in slices Key: CASSANDRA-4283 URL: https://issues.apache.org/jira/browse/CASSANDRA-4283 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.1.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: cql3 Fix For: 1.1.2 Our timestamp type allows timestamps to be input as dates like '2012-06-06'. However, those don't work as expected in slice queries. For instance: {noformat} SELECT * FROM timeline WHERE k = ... AND time > '2012-06-06' AND time <= '2012-06-09' {noformat} will return timestamps from '2012-06-06' and not those from '2012-06-09'. The reason, of course, is that we always translate a date the same way, using 0 for whichever part is not specified. A reasonably simple fix could be to add a new fromString(String s, boolean gt) method to AbstractType that is used when the string should be interpreted in an inequality (the boolean gt would then say which kind of inequality).
[jira] [Commented] (CASSANDRA-4283) CQL3: dates are not handled correctly in slices
[ https://issues.apache.org/jira/browse/CASSANDRA-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283555#comment-13283555 ]

Jonathan Ellis commented on CASSANDRA-4283:
---

Two points.

First, I don't think trying to guess what the user wants here is the right solution. They could mean:
# any time after 2012-06-06 00:00:00
# at least a day after time Y
# time X is a *different* day than time Y (different from #2 when Y is not 00:00:00)

Second, for the query you give, #1 is normal and expected behavior for every relational database.

The SQL standard specifies an {{interval}} type that can be used to solve #2: {{SELECT x FROM foo WHERE x > y + '1 day'::interval}}

The easiest way to do #3 is to force everything to truncate to the {{date}} type: {{SELECT x FROM foo WHERE x::date > y::date}}
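The three readings listed in this comment give different answers for the same pair of timestamps. A small Python sketch, assuming the comparison in question is "greater than" and using made-up values for x and y:

```python
from datetime import datetime, timedelta

x = datetime(2012, 6, 7, 8, 0)    # candidate row timestamp (illustrative)
y = datetime(2012, 6, 6, 9, 30)   # reference timestamp, deliberately not at midnight

# Reading 1: plain timestamp comparison.
after_instant = x > y

# Reading 2: at least a day later, the analogue of y + '1 day'::interval.
a_day_later = x > y + timedelta(days=1)

# Reading 3: a later calendar day, the analogue of x::date > y::date.
later_day = x.date() > y.date()
```

Here x falls on the following calendar day but less than 24 hours after y, so readings 2 and 3 disagree, which is the case the comment singles out.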
[jira] [Commented] (CASSANDRA-4283) CQL3: dates are not handled correctly in slices
[ https://issues.apache.org/jira/browse/CASSANDRA-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283576#comment-13283576 ]

Sylvain Lebresne commented on CASSANDRA-4283:
-

bq. First, I don't think trying to guess what the user wants here is the right solution.

Maybe I'm wrong about what the SQL standard actually does, and that's another problem, but at least in principle I really don't think that what I was suggesting is in any way ambiguous. If you use '2012-06-06', it feels natural and straightforward that you expect day precision, and that if you want any time after '2012-06-06 00:00:00' you'd use {{time > '2012-06-06 00:00:00'}}. And if you say {{time > '2012'}}, you likely want something after 2012. I mean, it reads like that.

That being said, I haven't checked the SQL standard, and if it prescribes that the query in the description of this ticket is equivalent to:
{noformat}
SELECT * FROM timeline WHERE k = ... AND time > '2012-06-06 00:00:00' AND time <= '2012-06-09 00:00:00'
{noformat}
and thus will return everything done on 6 June 2012, then so be it. That's not the most natural definition, but so be it.
[jira] [Commented] (CASSANDRA-4283) CQL3: dates are not handled correctly in slices
[ https://issues.apache.org/jira/browse/CASSANDRA-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283581#comment-13283581 ]

Jonathan Ellis commented on CASSANDRA-4283:
---

Example from postgresql:
{code}
# select '2012-06-06 00:00:01'::timestamp > '2012-06-06'::timestamp;
 ?column?
----------
 t
(1 row)

# select '2012-06-06 00:00:01'::date > '2012-06-06'::date;
 ?column?
----------
 f
(1 row)
{code}
[jira] [Updated] (CASSANDRA-4199) There should be an easy way to find out which sstables a key lives in
[ https://issues.apache.org/jira/browse/CASSANDRA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-4199: Attachment: 4199.txt Patch to add a getsstables command to nodetool. There should be an easy way to find out which sstables a key lives in - Key: CASSANDRA-4199 URL: https://issues.apache.org/jira/browse/CASSANDRA-4199 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brandon Williams Labels: lhf Fix For: 1.1.1 Attachments: 4199.txt When debugging, often times on a live server you want to extract a certain key with sst2j, but unfortunately you can't know which sstable(s) you need to run this on, causing you to iterate over much more data than necessary.
[jira] [Commented] (CASSANDRA-4283) CQL3: dates are not handled correctly in slices
[ https://issues.apache.org/jira/browse/CASSANDRA-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283591#comment-13283591 ]

Sylvain Lebresne commented on CASSANDRA-4283:
-

To be clear, I did not claim you were wrong about how SQL works, and I'm on board with the argument that we should continue to do what SQL defines. I was just finding it a bit unfair to claim that, setting aside what SQL actually defines, expecting {{SELECT * FROM timeline WHERE time > '2012-06-06'}} not to return values dated 6 June amounts to guessing what the user wants.
[jira] [Resolved] (CASSANDRA-4283) CQL3: dates are not handled correctly in slices
[ https://issues.apache.org/jira/browse/CASSANDRA-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-4283. - Resolution: Not A Problem
[jira] [Created] (CASSANDRA-4285) Atomic, eventually-consistent batches
Jonathan Ellis created CASSANDRA-4285: - Summary: Atomic, eventually-consistent batches Key: CASSANDRA-4285 URL: https://issues.apache.org/jira/browse/CASSANDRA-4285 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis I discussed this in the context of triggers (CASSANDRA-1311) but it's useful as a standalone feature as well.
[jira] [Assigned] (CASSANDRA-4262) 1.1 does not preserve compatibility w/ index queries against 1.0 nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne reassigned CASSANDRA-4262: --- Assignee: Sylvain Lebresne 1.1 does not preserve compatibility w/ index queries against 1.0 nodes -- Key: CASSANDRA-4262 URL: https://issues.apache.org/jira/browse/CASSANDRA-4262 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.1.0 Reporter: Jonathan Ellis Assignee: Sylvain Lebresne Priority: Critical Fix For: 1.1.1 1.1 merged index + seq scan paths into RangeSliceCommand. 1.1 StorageProxy always sends a RSC for either scan type. But 1.0 RSVH only does seq scans.
[jira] [Updated] (CASSANDRA-4262) 1.1 does not preserve compatibility w/ index queries against 1.0 nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-4262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-4262: Attachment: 4262.txt Attached patch to generate IndexScanCommand message when appropriate. I've pushed a test in dtests too (in rolling_upgrade_test.py). Fails before the patch, works after.
[jira] [Created] (CASSANDRA-4286) Enhance Write Survey mode to use a custom range
Benjamin Coverston created CASSANDRA-4286: - Summary: Enhance Write Survey mode to use a custom range Key: CASSANDRA-4286 URL: https://issues.apache.org/jira/browse/CASSANDRA-4286 Project: Cassandra Issue Type: New Feature Reporter: Benjamin Coverston Write survey mode could be more useful for sizing and tuning if you could define a custom range to survey.
[jira] [Commented] (CASSANDRA-1974) PFEPS-like snitch that uses gossip instead of a property file
[ https://issues.apache.org/jira/browse/CASSANDRA-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283639#comment-13283639 ] Vijay commented on CASSANDRA-1974: -- +1 PFEPS-like snitch that uses gossip instead of a property file - Key: CASSANDRA-1974 URL: https://issues.apache.org/jira/browse/CASSANDRA-1974 Project: Cassandra Issue Type: New Feature Reporter: Brandon Williams Assignee: Brandon Williams Priority: Minor Attachments: 1974.txt Now that we have an ec2 snitch that propagates its rack/dc info via gossip from CASSANDRA-1654, it doesn't make a lot of sense to use PFEPS where you have to rsync the property file across all the machines when you add a node. Instead, we could have a snitch where you specify its rack/dc in a property file, and propagate this via gossip like the ec2 snitch. In order to not break PFEPS, this should probably be a new snitch.
[jira] [Commented] (CASSANDRA-2478) Custom CQL protocol/transport
[ https://issues.apache.org/jira/browse/CASSANDRA-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283664#comment-13283664 ]

Rick Shaw commented on CASSANDRA-2478:
--

My personal opinion is that we will never get to a point where we need multiple KSs. But multiple CFs (tables), yes. The point is that, in the current state of the returned metadata, we do not know _which_ KS or CF, even if there is only one, and I contend we have client calls that want to know that info.

Bikeshedding on my part will now end. :)

Custom CQL protocol/transport
-
Key: CASSANDRA-2478
URL: https://issues.apache.org/jira/browse/CASSANDRA-2478
Project: Cassandra
Issue Type: New Feature
Components: API, Core
Reporter: Eric Evans
Assignee: Sylvain Lebresne
Priority: Minor
Labels: cql
Attachments: cql_binary_protocol

A custom wire protocol would give us the flexibility to optimize for our specific use-cases, and eliminate a troublesome dependency (I'm referring to Thrift, but none of the others would be significantly better). Additionally, RPC is a bad fit here, and we'd do better to move in the direction of something that natively supports streaming.

I don't think this is as daunting as it might seem initially. Utilizing an existing server framework like Netty, combined with some copy-and-paste of bits from other FLOSS projects, would probably get us 80% of the way there.
[jira] [Commented] (CASSANDRA-2478) Custom CQL protocol/transport
[ https://issues.apache.org/jira/browse/CASSANDRA-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283667#comment-13283667 ] Jeremy Hanna commented on CASSANDRA-2478: - Rick: do you have specific tools which would benefit from that metadata?
[jira] [Commented] (CASSANDRA-2478) Custom CQL protocol/transport
[ https://issues.apache.org/jira/browse/CASSANDRA-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13283703#comment-13283703 ]

Rick Shaw commented on CASSANDRA-2478:
--

Much of the pressure in the corporate world I live in is from teams that are trying to use the JDBC driver for C* in the same fashion and with the same BI, ETL and statistics tools (Talend, Informatica, Datastage, SPSS, SAS) as they do for relational solutions. Many of these tools are quite comprehensive in the information they can display and use, so they are heavy users of the metadata features of the driver. Returning no data for some functions often causes the tooling to just give up rather than use what it has.
[jira] [Updated] (CASSANDRA-4277) hsha default thread limits make no sense, and yaml comments look confused
[ https://issues.apache.org/jira/browse/CASSANDRA-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-4277:
--

Attachment: CASSANDRA-4277-trunk-v2.txt

Attached v2 that incorporates those changes.

hsha default thread limits make no sense, and yaml comments look confused
-
Key: CASSANDRA-4277
URL: https://issues.apache.org/jira/browse/CASSANDRA-4277
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Peter Schuller
Assignee: Peter Schuller
Fix For: 1.2
Attachments: CASSANDRA-4277-trunk-v2.txt, CASSANDRA-4277-trunk.txt

The cassandra.yaml states with respect to {{rpc_max_threads}}:
{code}
# For the Hsha server, the min and max both default to quadruple the number of
# CPU cores.
{code}
The code seems to indeed do this. But this makes, as far as I can tell, no sense whatsoever, since the number of concurrent RPC threads you need is a function of the throughput and the average latency of requests (that includes synchronously waiting on network traffic). Defaulting to anything having to do with CPU cores seems inherently wrong. If a default is non-static, a closer guess might be to look at thread stack size and heap size and infer what might be reasonable.

*NOTE*: The effect of having this too low is strange (if you don't know what's going on) latencies observed from the client on all thrift requests (*any* thrift request, including e.g. {{describe_ring()}}), which are not visible in any latency metric exposed by Cassandra. This is why I consider this major, since unwitting users may be seeing detrimental performance for no good reason.

In addition, I read this about async:
{code}
# async - Nonblocking server implementation with one thread to serve
# rpc connections. This is not recommended for high throughput use
# cases. Async has been tested to be about 50% slower than sync
# or hsha and is deprecated: it will be removed in the next major release.
{code}
This makes even less sense. Running with *one* rpc thread limits you to a single concurrent request. How was that 50% number even attained? By single-node testing being completely CPU bound locally on a node? The actual effect should be stupidly slow in any real situation with lots of requests on a cluster of many nodes and network traffic (though I didn't test that) - especially in the event of any kind of hiccup like a node doing GC. I agree that if the above is true, async should *definitely* be deprecated, but the reasons seem *much* stronger than implied.

I may be missing something here, in which case I apologize, but I specifically double-checked after I fixed this setting on our clusters after seeing exactly the expected side-effect of having it be too low. I always was under the impression that rpc_max_threads affects the number of RPC requests running concurrently, and code inspection (it being used for the worker thread limit) plus the effects on client-observed latency are consistent with my understanding. I suspect the setting was set strangely by someone because the phrasing of the comments in {{cassandra.yaml}} strongly suggests that this should be tied to CPU cores, hiding the fact that this really has to do with the number of requests that can be serviced concurrently regardless of implementation details of thrift/networking being sync/async/etc.
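The sizing argument in this ticket is essentially Little's law: the number of requests in flight equals throughput multiplied by average latency. A rough worked example in Python follows; the core count, request rate and latency are illustrative assumptions, not figures from the ticket, and rpc_threads_needed is a hypothetical helper.

```python
import math

def rpc_threads_needed(requests_per_second, avg_latency_ms):
    """Little's law: concurrent requests = arrival rate * time spent in the system."""
    return math.ceil(requests_per_second * avg_latency_ms / 1000)

# Hypothetical node: 8 cores, 5000 requests/s, 20 ms average request latency
# (including time spent waiting on other nodes).
cores = 8
needed = rpc_threads_needed(5000, 20)

# The default described in cassandra.yaml: quadruple the number of CPU cores.
cpu_based_default = 4 * cores
```

Under these assumptions about 100 requests are in flight at any moment while the core-based default allows 32 worker threads, so the remainder would queue, and that queueing delay would be visible only to the client.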
[jira] [Created] (CASSANDRA-4287) SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables
Jonathan Ellis created CASSANDRA-4287: - Summary: SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables Key: CASSANDRA-4287 URL: https://issues.apache.org/jira/browse/CASSANDRA-4287 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jonathan Ellis Priority: Minor Fix For: 1.0.11
[jira] [Updated] (CASSANDRA-4287) SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-4287: -- Description: getBuckets first sorts the sstables by size (N log N) then adds each sstable to a bucket (N**2 in the worst case of all sstables the same size, because we use the bucket's contents as a hash key). Assignee: Jonathan Ellis Labels: compaction (was: ) SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables -- Key: CASSANDRA-4287 URL: https://issues.apache.org/jira/browse/CASSANDRA-4287 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Labels: compaction Fix For: 1.0.11 getBuckets first sorts the sstables by size (N log N) then adds each sstable to a bucket (N**2 in the worst case of all sstables the same size, because we use the bucket's contents as a hash key).
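One way to avoid the quadratic step is to stop keying buckets by their contents and keep a running total per bucket instead. The Python sketch below illustrates that idea only; it is not the attached fix, and it makes two simplifying assumptions: sstables are represented by their sizes alone, and a sorted sstable is compared only against the most recent bucket. The low and high ratios are illustrative parameters.

```python
def get_buckets(sizes, low=0.5, high=1.5):
    """Group sizes into buckets of similar size; O(N log N) including the sort."""
    buckets = []  # each entry is [running_total, members]
    for size in sorted(sizes):
        if buckets:
            total, members = buckets[-1]
            avg = total / len(members)
            if low * avg <= size <= high * avg:
                # Constant-time update: no rehashing of the bucket's contents.
                members.append(size)
                buckets[-1][0] = total + size
                continue
        buckets.append([size, [size]])
    return [members for _, members in buckets]

grouped = get_buckets([10, 12, 11, 100, 110, 1000])
same_size = get_buckets([5] * 1000)
```

In the worst case named in the ticket, where every sstable has the same size, each element is appended to a single bucket in constant time.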
[jira] [Created] (CASSANDRA-4288) prevent thrift server from starting before gossip has settled
Peter Schuller created CASSANDRA-4288:
-

Summary: prevent thrift server from starting before gossip has settled
Key: CASSANDRA-4288
URL: https://issues.apache.org/jira/browse/CASSANDRA-4288
Project: Cassandra
Issue Type: Improvement
Reporter: Peter Schuller
Attachments: CASSANDRA-4288-trunk.txt

A serious problem is that there is no co-ordination whatsoever between gossip and the consumers of gossip. In particular, on a large cluster with hundreds of nodes, it takes several seconds for gossip to settle because the gossip stage is CPU bound. This leads to a node starting up and accepting thrift traffic long before it has any clue of what is up and down. This leads to client-visible timeouts (for nodes that are down but not identified as such) and UnavailableException (for nodes that are up but not yet identified as such). This is really bad in general, but in particular for clients doing non-idempotent writes (counter increments).

I was going to fix this as part of more significant re-writing in other tickets having to do with gossip/topology/etc, but that's not going to happen. So, the attached patch is roughly what we're running with in production now to make restarts bearable. The minimum wait time is both for ensuring that gossip has time to start becoming CPU bound if it will be, and the reason it's large is to allow for down nodes to be identified as such in most typical cases with a default phi conviction threshold (untested, we actually ran with a smaller number of 5 seconds minimum, but from past experience I believe 15 seconds is enough).

The patch is tested on our 1.1 branch. It applies on trunk, and the diff is against trunk, but I have not tested it against trunk.
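The general shape of such a startup gate can be sketched as follows. This Python sketch is an assumption about the approach, not a rendering of the attached patch: pending_tasks is a hypothetical callable reporting the gossip stage backlog, the quiet-poll rule is invented for illustration, and only the 15-second minimum comes from the description above.

```python
import time

def wait_for_gossip_to_settle(pending_tasks, min_wait=15.0, poll_interval=1.0,
                              required_quiet_polls=3, sleep=time.sleep):
    """Block until gossip looks idle; returns the number of seconds waited.

    Waits at least min_wait seconds, then keeps polling until the backlog
    has been empty for required_quiet_polls consecutive polls.
    """
    waited = 0.0
    quiet_polls = 0
    while waited < min_wait or quiet_polls < required_quiet_polls:
        sleep(poll_interval)
        waited += poll_interval
        quiet_polls = quiet_polls + 1 if pending_tasks() == 0 else 0
    return waited

# Usage with a fake backlog and a no-op sleep so the example runs instantly.
backlog = iter([5, 3, 0, 0, 0])
waited = wait_for_gossip_to_settle(lambda: next(backlog), min_wait=2.0,
                                   sleep=lambda _: None)
```

The RPC server would be started only after this function returns. Injecting sleep keeps the sketch testable without real delays.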
[jira] [Updated] (CASSANDRA-4288) prevent thrift server from starting before gossip has settled
[ https://issues.apache.org/jira/browse/CASSANDRA-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Schuller updated CASSANDRA-4288: -- Attachment: CASSANDRA-4288-trunk.txt
[jira] [Updated] (CASSANDRA-3794) Avoid ID conflicts from concurrent schema changes
[ https://issues.apache.org/jira/browse/CASSANDRA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-3794:
---

Attachment: CASSANDRA-3794-trunk.patch

Rebased with the latest trunk and fixed nit.

Avoid ID conflicts from concurrent schema changes
-
Key: CASSANDRA-3794
URL: https://issues.apache.org/jira/browse/CASSANDRA-3794
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Fix For: 1.1.1
Attachments: CASSANDRA-3794-trunk.patch, CASSANDRA-3794-v2.patch, CASSANDRA-3794.patch

Change ColumnFamily identifiers to be UUIDs instead of sequential Integers. This would be useful when nodes simultaneously try to create ColumnFamilies.
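The conflict this ticket addresses is easy to see in miniature. The Python sketch below is a toy model, not Cassandra's schema code: the Node class, the id values and the use of random UUIDs are all assumptions made for illustration.

```python
import uuid

class Node:
    """A node that allocates column family ids from its local view of the schema."""

    def __init__(self, known_ids):
        self.known_ids = set(known_ids)

    def next_sequential_id(self):
        # Sequential scheme: one more than the highest id this node knows about.
        return max(self.known_ids, default=999) + 1

    def next_uuid(self):
        # UUID scheme: no coordination with other nodes is needed.
        return uuid.uuid4()

# Two nodes with the same schema view create a column family at the same time.
a = Node({1000, 1001})
b = Node({1000, 1001})

sequential_conflict = a.next_sequential_id() == b.next_sequential_id()
uuid_conflict = a.next_uuid() == b.next_uuid()
```

Both nodes pick 1002 under the sequential scheme, whereas independently generated UUIDs collide only with negligible probability.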
[1/3] git commit: Merge branch 'cassandra-1.1' into trunk
Updated Branches:
  refs/heads/cassandra-1.1 cbf043618 -> cf9a581bf
  refs/heads/trunk 5ab69b62c -> 782b1561f

Merge branch 'cassandra-1.1' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/782b1561
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/782b1561
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/782b1561

Branch: refs/heads/trunk
Commit: 782b1561fad6b7339e62161232d2a0e421cd6f22
Parents: 5ab69b6 cf9a581
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri May 25 15:22:08 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri May 25 15:22:08 2012 -0500

--
 conf/cassandra.yaml                            |    5 +
 .../locator/GossipingPropertyFileSnitch.java   |  140 +++
 .../cassandra/locator/PropertyFileSnitch.java  |    4 +-
 3 files changed, 147 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/782b1561/src/java/org/apache/cassandra/locator/PropertyFileSnitch.java
--
[2/3] git commit: Add GossipingPropertyFileSnitch. Patch by brandonwilliams reviewed by Vijay for CASSANDRA-1974
Add GossipingPropertyFileSnitch. Patch by brandonwilliams reviewed by Vijay for CASSANDRA-1974

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cf9a581b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cf9a581b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cf9a581b

Branch: refs/heads/trunk
Commit: cf9a581bf7957d9404a6a5f33f77cf32468c6336
Parents: cbf0436
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri May 25 15:21:27 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri May 25 15:21:27 2012 -0500

--
 conf/cassandra.yaml                            |    5 +
 .../locator/GossipingPropertyFileSnitch.java   |  140 +++
 .../cassandra/locator/PropertyFileSnitch.java  |    4 +-
 3 files changed, 147 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cf9a581b/conf/cassandra.yaml
--
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index 14cc525..d64e2c9 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -437,6 +437,11 @@ rpc_timeout_in_ms: 1
 #  - PropertyFileSnitch:
 #    Proximity is determined by rack and data center, which are
 #    explicitly configured in cassandra-topology.properties.
+#  - GossipingPropertyFileSnitch
+#    The rack and datacenter for the local node are defined in
+#    cassandra-rackdc.properties and propagated to other nodes via gossip. If
+#    cassandra-topology.properties exists, it is used as a fallback, allowing
+#    migration from the PropertyFileSnitch.
 #  - RackInferringSnitch:
 #    Proximity is determined by rack and data center, which are
 #    assumed to correspond to the 3rd and 2nd octet of each node's

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cf9a581b/src/java/org/apache/cassandra/locator/GossipingPropertyFileSnitch.java
--
diff --git a/src/java/org/apache/cassandra/locator/GossipingPropertyFileSnitch.java b/src/java/org/apache/cassandra/locator/GossipingPropertyFileSnitch.java
new file mode 100644
index 000..9a22544
--- /dev/null
+++ b/src/java/org/apache/cassandra/locator/GossipingPropertyFileSnitch.java
@@ -0,0 +1,140 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.locator;
+
+import org.apache.cassandra.config.ConfigurationException;
+import org.apache.cassandra.gms.ApplicationState;
+import org.apache.cassandra.gms.EndpointState;
+import org.apache.cassandra.gms.Gossiper;
+import org.apache.cassandra.io.util.FileUtils;
+import org.apache.cassandra.service.StorageService;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.InputStream;
+import java.net.InetAddress;
+import java.util.Map;
+import java.util.Properties;
+
+public class GossipingPropertyFileSnitch extends AbstractNetworkTopologySnitch
+{
+    private static final Logger logger = LoggerFactory.getLogger(GossipingPropertyFileSnitch.class);
+
+    public static final String RACKDC_PROPERTY_FILENAME = "cassandra-rackdc.properties";
+    private PropertyFileSnitch psnitch;
+    private String myDC;
+    private String myRack;
+
+    public GossipingPropertyFileSnitch() throws ConfigurationException
+    {
+        try
+        {
+            loadConfiguration();
+        }
+        catch (ConfigurationException e)
+        {
+            throw new RuntimeException("Unable to load " + RACKDC_PROPERTY_FILENAME + " : ", e);
+        }
+        try
+        {
+            psnitch = new PropertyFileSnitch();
+            logger.info("Loaded " + PropertyFileSnitch.RACK_PROPERTY_FILENAME + " for compatibility");
+        }
+        catch (ConfigurationException e)
+        {
+            logger.info("Unable to load " + PropertyFileSnitch.RACK_PROPERTY_FILENAME + "; compatibility mode disabled");
+        }
+    }
+
+    private void loadConfiguration() throws ConfigurationException
+    {
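The excerpt above is cut off before the body of loadConfiguration(), so the following is only a rough sketch of what reading a cassandra-rackdc.properties style file could look like. The property keys dc and rack, and the RackDcSketch class itself, are assumptions for illustration; the committed implementation is not shown in this message.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sketch; not the committed loadConfiguration() implementation. */
final class RackDcSketch
{
    final String dc;
    final String rack;

    private RackDcSketch(String dc, String rack)
    {
        this.dc = dc;
        this.rack = rack;
    }

    /** Parses properties-file text; the keys "dc" and "rack" are assumed here. */
    static RackDcSketch parse(String contents)
    {
        Properties props = new Properties();
        try
        {
            props.load(new StringReader(contents));
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        String dc = props.getProperty("dc");
        String rack = props.getProperty("rack");
        if (dc == null || rack == null)
            throw new IllegalArgumentException("both dc and rack must be set");
        return new RackDcSketch(dc.trim(), rack.trim());
    }

    public static void main(String[] args)
    {
        RackDcSketch local = parse("dc=DC1\nrack=RAC1\n");
        System.out.println(local.dc + ":" + local.rack);
    }
}
```

A snitch built this way would then publish the parsed values through gossip so that other nodes learn the local node's datacenter and rack, as the cassandra.yaml comment in the diff describes.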
[3/3] git commit: Add GossipingPropertyFileSnitch. Patch by brandonwilliams reviewed by Vijay for CASSANDRA-1974
Add GossipingPropertyFileSnitch. Patch by brandonwilliams reviewed by Vijay for CASSANDRA-1974

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cf9a581b
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cf9a581b
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cf9a581b

Branch: refs/heads/cassandra-1.1
Commit: cf9a581bf7957d9404a6a5f33f77cf32468c6336
Parents: cbf0436
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri May 25 15:21:27 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri May 25 15:21:27 2012 -0500

--
 conf/cassandra.yaml                            |    5 +
 .../locator/GossipingPropertyFileSnitch.java   |  140 +++
 .../cassandra/locator/PropertyFileSnitch.java  |    4 +-
 3 files changed, 147 insertions(+), 2 deletions(-)
--
[jira] [Commented] (CASSANDRA-4287) SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283734#comment-13283734 ]

Jonathan Ellis commented on CASSANDRA-4287:

Changesets up on https://github.com/jbellis/cassandra/branches/4287 and https://github.com/jbellis/cassandra/branches/4287-1.0.8 (build against 1.0.8 for convenience)

SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables
    Key: CASSANDRA-4287
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4287
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Reporter: Jonathan Ellis
    Assignee: Jonathan Ellis
    Priority: Minor
    Labels: compaction
    Fix For: 1.0.11

getBuckets first sorts the sstables by size (N log N) then adds each sstable to a bucket (N**2 in the worst case of all sstables the same size, because we use the bucket's contents as a hash key).
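One way to avoid the quadratic step described in CASSANDRA-4287 is sketched below. This is not the changeset on the linked branches: it is a simplified, hypothetical version that works on plain sizes, takes the bucket bounds as parameters, and relies on the input being sorted so that each size only needs to be compared against the most recent bucket.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of size-tiered bucketing; not the actual Cassandra code. */
final class BucketSketch
{
    /**
     * Groups sizes into buckets where each member lies within
     * [average * low, average * high] of the bucket's running average.
     */
    static List<List<Long>> getBuckets(List<Long> sizes, double low, double high)
    {
        List<Long> sorted = new ArrayList<>(sizes);
        Collections.sort(sorted);                        // O(N log N)

        List<List<Long>> buckets = new ArrayList<>();
        List<Long> current = null;
        double currentTotal = 0;

        for (long size : sorted)                         // single O(N) pass
        {
            if (current != null)
            {
                double average = currentTotal / current.size();
                if (size >= average * low && size <= average * high)
                {
                    current.add(size);
                    currentTotal += size;
                    continue;
                }
            }
            // Sizes are ascending, so a size that misses the latest bucket starts a new one.
            current = new ArrayList<>();
            current.add(size);
            currentTotal = size;
            buckets.add(current);
        }
        return buckets;
    }
}
```

The key point is that no bucket is ever used as a hash key, so adding an sstable costs constant time instead of time proportional to the bucket's contents.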
[jira] [Comment Edited] (CASSANDRA-4287) SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283734#comment-13283734 ]

Jonathan Ellis edited comment on CASSANDRA-4287 at 5/25/12 8:25 PM:

Changesets up on https://github.com/jbellis/cassandra/branches/4287

was (Author: jbellis):
Changesets up on https://github.com/jbellis/cassandra/branches/4287 and https://github.com/jbellis/cassandra/branches/4287-1.0.8 (build against 1.0.8 for convenience)

SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables
    Key: CASSANDRA-4287
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4287
    Project: Cassandra
    Issue Type: Bug
    Components: Core
    Reporter: Jonathan Ellis
    Assignee: Jonathan Ellis
    Priority: Minor
    Labels: compaction
    Fix For: 1.0.11
[jira] [Commented] (CASSANDRA-4288) prevent thrift server from starting before gossip has settled
[ https://issues.apache.org/jira/browse/CASSANDRA-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283736#comment-13283736 ]

Brandon Williams commented on CASSANDRA-4288:

It seems like defining a new random delay is the wrong thing to do instead of using RING_DELAY, which is actually how we define gossip as 'settled.' However, after completing one round of gossip the node should know which nodes are up or down, so a hook for that may be better than a fixed delay.

prevent thrift server from starting before gossip has settled
    Key: CASSANDRA-4288
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4288
    Project: Cassandra
    Issue Type: Improvement
    Reporter: Peter Schuller
    Attachments: CASSANDRA-4288-trunk.txt
[jira] [Updated] (CASSANDRA-4199) There should be an easy way to find out which sstables a key lives in
[ https://issues.apache.org/jira/browse/CASSANDRA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4199:
    Reviewer: yukim

There should be an easy way to find out which sstables a key lives in
    Key: CASSANDRA-4199
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4199
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Brandon Williams
    Assignee: Brandon Williams
    Labels: lhf
    Fix For: 1.1.1
    Attachments: 4199.txt

When debugging, often times on a live server you want to extract a certain key with sst2j, but unfortunately you can't know which sstable(s) you need to run this on, causing you to iterate over much more data than necessary.
[jira] [Updated] (CASSANDRA-4199) There should be an easy way to find out which sstables a key lives in
[ https://issues.apache.org/jira/browse/CASSANDRA-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4199:
    Priority: Minor (was: Major)

There should be an easy way to find out which sstables a key lives in
    Key: CASSANDRA-4199
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4199
    Project: Cassandra
    Issue Type: Improvement
    Components: Core
    Reporter: Brandon Williams
    Assignee: Brandon Williams
    Priority: Minor
    Labels: lhf
    Fix For: 1.1.1
    Attachments: 4199.txt
[jira] [Commented] (CASSANDRA-4288) prevent thrift server from starting before gossip has settled
[ https://issues.apache.org/jira/browse/CASSANDRA-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283743#comment-13283743 ]

Peter Schuller commented on CASSANDRA-4288:

I completely agree that it's the wrong approach. Abstractions need to change and things like that need to be in there. This is a hack that we're running with which fixes the main symptom on restart (but doesn't e.g. fix it on initial bootstrap).

I don't agree that RING_DELAY is the right solution; that itself is IMO a hack, at least when used to combat CPU bound churning in gossip as opposed to actual legitimate probability driven propagation delay in a cluster.

prevent thrift server from starting before gossip has settled
    Key: CASSANDRA-4288
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4288
    Project: Cassandra
    Issue Type: Improvement
    Reporter: Peter Schuller
    Attachments: CASSANDRA-4288-trunk.txt
[jira] [Commented] (CASSANDRA-4288) prevent thrift server from starting before gossip has settled
[ https://issues.apache.org/jira/browse/CASSANDRA-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283745#comment-13283745 ]

Peter Schuller commented on CASSANDRA-4288:

(Feel free to substitute my use of the term "settled" for something else. I don't mean to imply overlap with the pre-existing condition that RING_DELAY tries to solve.)

prevent thrift server from starting before gossip has settled
    Key: CASSANDRA-4288
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4288
    Project: Cassandra
    Issue Type: Improvement
    Reporter: Peter Schuller
    Attachments: CASSANDRA-4288-trunk.txt
[jira] [Commented] (CASSANDRA-4284) Improve timeuuid - date relationship
[ https://issues.apache.org/jira/browse/CASSANDRA-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283746#comment-13283746 ]

Jonathan Ellis commented on CASSANDRA-4284:

bq. it would make more sense that when inserting a date, we actually pick a uuid with the corresponding timestamp but with the rest of the UUID being random

+1

bq. We could extend the date syntax to allow it to uniquely represent a type 1 UUID. Typically, we could allow something like: '2012-06-06 12:03:00+ %a2fc07', where the part after the '%' character would be hexadecimal for the non-timestamp part of the UUID

+0, sounds messy wrt the general to/from string api in AbstractType.

Improve timeuuid - date relationship
    Key: CASSANDRA-4284
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4284
    Project: Cassandra
    Issue Type: Improvement
    Components: API
    Affects Versions: 1.1.0
    Reporter: Sylvain Lebresne
    Assignee: Sylvain Lebresne
    Priority: Minor
    Labels: cql3
    Fix For: 1.2

We added timeuuid to CQL3, whose purpose is basically to provide a collision-free timestamp, and as a convenience such a timeuuid can be input as a date. However, two things seem non-optimal to me:
* When one inserts a timeuuid using a date format, we always pick *the* UUID corresponding to this date with every other part of the UUID set to 0. This kind of defeats the purpose of a collision-free timestamp and thus greatly limits the usefulness of the date syntax.
* When cqlsh prints timeuuids, it prints them as dates. But that way some information is lost, which can be problematic (you can't update a specific column based on what is returned). In a way, this is a cqlsh limitation, since cassandra returns the UUID bytes. Yet it also emphasizes that, from the point of view of using them, timeuuids are more UUID than really time.

For the first point, it would make more sense that when inserting a date, we actually pick a UUID with the corresponding timestamp *but* with the rest of the UUID being random. It's not completely that simple because we don't want that randomness when the dates are used in a select query, but that's roughly the same problem as CASSANDRA-4283 (and we can thus use the same solution).

The second point gives an idea. We could extend the date syntax to allow it to uniquely represent a type 1 UUID. Typically, we could allow something like: '2012-06-06 12:03:00+ %a2fc07', where the part after the '%' character would be hexadecimal for the non-timestamp part of the UUID. Understanding this syntax would allow working with timeuuids using only meaningful date strings, which I think would be neat. But maybe that's a crazy idea, opinions?
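The first proposal in CASSANDRA-4284 (keep the timestamp from the date, randomize the rest) can be sketched with the JDK's java.util.UUID. This is only an illustration of the idea under the RFC 4122 version 1 layout; TimeUuidSketch and fromUnixMillis are hypothetical names, not Cassandra or CQL APIs, and the Random source is passed in so the behaviour can be checked deterministically.

```java
import java.util.Random;
import java.util.UUID;

/** Hypothetical sketch; not Cassandra's UUID generation code. */
final class TimeUuidSketch
{
    /** Number of 100ns intervals between the UUID epoch (1582-10-15) and the Unix epoch. */
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    /** Builds a version 1 UUID whose timestamp comes from the date and whose remaining bits are random. */
    static UUID fromUnixMillis(long unixMillis, Random random)
    {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET;

        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHi = (ts >>> 48) & 0x0FFFL;
        long msb = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi;   // 0x1000 sets version 1

        long lsb = (random.nextLong() & 0x3FFFFFFFFFFFFFFFL)   // random clock sequence and node
                 | 0x8000000000000000L                          // RFC 4122 variant bits
                 | 0x0000010000000000L;                         // multicast bit marks a random node id
        return new UUID(msb, lsb);
    }

    public static void main(String[] args)
    {
        long millis = 1338984180000L;
        UUID a = fromUnixMillis(millis, new Random());
        UUID b = fromUnixMillis(millis, new Random());
        System.out.println(a + " and " + b + " share timestamp " + a.timestamp());
    }
}
```

Two inserts for the same date then get the same timestamp but different UUIDs. As the ticket notes, a select query would still need a deterministic bound rather than a random value, which this sketch does not address.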
[jira] [Commented] (CASSANDRA-4288) prevent thrift server from starting before gossip has settled
[ https://issues.apache.org/jira/browse/CASSANDRA-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283744#comment-13283744 ]

Peter Schuller commented on CASSANDRA-4288:

Btw, it's not a random delay. It waits until gossip has settled by looking at the active/pending in the gossip stage. It's still a hack, but it's not a random delay.

prevent thrift server from starting before gossip has settled
    Key: CASSANDRA-4288
    URL: https://issues.apache.org/jira/browse/CASSANDRA-4288
    Project: Cassandra
    Issue Type: Improvement
    Reporter: Peter Schuller
    Attachments: CASSANDRA-4288-trunk.txt
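The approach described in the comment (a minimum wait, then polling the active and pending counts of the gossip stage) might look roughly like the sketch below. The attached patch is not reproduced in this thread, so everything here is an assumption: the stage is modelled as a plain ThreadPoolExecutor, and the class, method, and parameter names are hypothetical.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of "wait until the stage is quiet"; not the attached patch. */
final class GossipSettleSketch
{
    /**
     * Sleeps for minWaitMillis, then polls until the executor has had no active
     * and no pending tasks for requiredQuietPolls consecutive polls.
     * Returns the number of polls that were needed.
     */
    static int waitUntilQuiet(ThreadPoolExecutor stage, long minWaitMillis, long pollMillis, int requiredQuietPolls)
    {
        int polls = 0;
        int quietInARow = 0;
        try
        {
            Thread.sleep(minWaitMillis);
            while (quietInARow < requiredQuietPolls)
            {
                Thread.sleep(pollMillis);
                polls++;
                boolean quiet = stage.getActiveCount() == 0 && stage.getQueue().isEmpty();
                quietInARow = quiet ? quietInARow + 1 : 0;   // any activity resets the streak
            }
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the stage to go quiet", e);
        }
        return polls;
    }

    public static void main(String[] args)
    {
        // Stand-in for the gossip stage; the wait values are illustrative, not the patch's values.
        ThreadPoolExecutor stage = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        int polls = waitUntilQuiet(stage, 100, 10, 3);
        System.out.println("stage quiet after " + polls + " polls");
        stage.shutdown();
    }
}
```

Requiring several consecutive quiet polls, rather than a single one, reduces the chance of declaring the stage idle during a brief gap between bursts of gossip work.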
git commit: update NEWS.txt to warn users about risk of simultaneously creating CFs on different nodes before version 1.2
Updated Branches:
  refs/heads/cassandra-1.1 cf9a581bf -> a7277178a

update NEWS.txt to warn users about risk of simultaneously creating CFs on different nodes before version 1.2

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a7277178
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a7277178
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a7277178

Branch: refs/heads/cassandra-1.1
Commit: a7277178a15ce899a9fc248e3ed1fbaecdec3e01
Parents: cf9a581
Author: Pavel Yaskevich xe...@apache.org
Authored: Fri May 25 23:48:53 2012 +0300
Committer: Pavel Yaskevich xe...@apache.org
Committed: Fri May 25 23:49:57 2012 +0300

--
 NEWS.txt |    6 --
 1 files changed, 4 insertions(+), 2 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a7277178/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index b87f05c..3368d98 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -68,8 +68,10 @@ Upgrading
 Features
     - Concurrent schema updates are now supported, with any conflicts
-      automatically resolved. This makes temporary columnfamilies and
-      other uses of dynamic schema appropriate to use in applications.
+      automatically resolved. Please note that simultaneously running
+      ‘CREATE COLUMN FAMILY’ operation on the different nodes wouldn’t
+      be safe until version 1.2 due to the nature of ColumnFamily
+      identifier generation, for more details see CASSANDRA-3794.
     - The CQL language has undergone a major revision, CQL3, the
       highlights of which are covered at [1]. CQL3 is not
       backwards-compatibile with CQL2, so we've introduced a
git commit: kick off background compaction when min/max thresholds change patch by jbellis; reviewed by slebresne for CASSANDRA-4279
Updated Branches:
  refs/heads/cassandra-1.0 f89b9aecc -> 00a553438

kick off background compaction when min/max thresholds change
patch by jbellis; reviewed by slebresne for CASSANDRA-4279

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/00a55343
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/00a55343
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/00a55343

Branch: refs/heads/cassandra-1.0
Commit: 00a553438623945117066d4adfc2826c17d59ccb
Parents: f89b9ae
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:49:49 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 15:49:49 2012 -0500

--
 CHANGES.txt                                         |    2 ++
 .../org/apache/cassandra/db/ColumnFamilyStore.java  |    8 
 2 files changed, 6 insertions(+), 4 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/00a55343/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 15f3c8a..404a744 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,6 +1,8 @@
 1.0.11
  * synchronize LCS getEstimatedTasks to avoid CME (CASSANDRA-4255)
  * ensure unique streaming session id's (CASSANDRA-4223)
+ * kick off background compaction when min/max thresholds change
+   (CASSANDRA-4279)
 
 
 1.0.10

http://git-wip-us.apache.org/repos/asf/cassandra/blob/00a55343/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index b7d74bc..56de67e 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1727,10 +1727,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMinimumCompactionThreshold(int minCompactionThreshold)
     {
         if ((minCompactionThreshold > this.maxCompactionThreshold.value()) && this.maxCompactionThreshold.value() != 0)
-        {
             throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
-        }
+
         this.minCompactionThreshold.set(minCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1741,10 +1741,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMaximumCompactionThreshold(int maxCompactionThreshold)
     {
         if (maxCompactionThreshold > 0 && maxCompactionThreshold < this.minCompactionThreshold.value())
-        {
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
-        }
+
         this.maxCompactionThreshold.set(maxCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public boolean isCompactionDisabled()
[2/11] git commit: kick off background compaction when min/max thresholds change patch by jbellis; reviewed by slebresne for CASSANDRA-4279
kick off background compaction when min/max thresholds change
patch by jbellis; reviewed by slebresne for CASSANDRA-4279

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/9383aebf
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/9383aebf
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/9383aebf

Branch: refs/heads/trunk
Commit: 9383aebfcb911ff64787be750815193715f14900
Parents: c4d6f78
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:49:49 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 15:50:57 2012 -0500

----------------------------------------------------------------------
 CHANGES.txt                                        |    2 ++
 .../org/apache/cassandra/db/ColumnFamilyStore.java |    8 ++++----
 2 files changed, 6 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9383aebf/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 91161b1..4b6ac77 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -85,6 +85,8 @@ Merged from 1.0:
    (CASSANDRA-3985)
  * synchronize LCS getEstimatedTasks to avoid CME (CASSANDRA-4255)
  * ensure unique streaming session id's (CASSANDRA-4223)
+ * kick off background compaction when min/max thresholds change
+   (CASSANDRA-4279)
 
 
 1.1.0-final

http://git-wip-us.apache.org/repos/asf/cassandra/blob/9383aebf/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index e549506..48604a8 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1823,10 +1823,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMinimumCompactionThreshold(int minCompactionThreshold)
     {
         if ((minCompactionThreshold > this.maxCompactionThreshold.value()) && this.maxCompactionThreshold.value() != 0)
-        {
             throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
-        }
+
         this.minCompactionThreshold.set(minCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1837,10 +1837,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMaximumCompactionThreshold(int maxCompactionThreshold)
     {
         if (maxCompactionThreshold > 0 && maxCompactionThreshold < this.minCompactionThreshold.value())
-        {
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
-        }
+
         this.maxCompactionThreshold.set(maxCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public boolean isCompactionDisabled()
[1/11] git commit: merge from 1.1
Updated Branches: refs/heads/cassandra-1.1 a7277178a - 1bfb68518 refs/heads/trunk 782b1561f - 2155250fd merge from 1.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2155250f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2155250f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2155250f Branch: refs/heads/trunk Commit: 2155250fd8b5643b7a838db7d8f83793a549473a Parents: 9383aeb 1bfb685 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:52:23 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 15:52:23 2012 -0500 -- NEWS.txt | 29 ++--- 1 files changed, 10 insertions(+), 19 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/2155250f/NEWS.txt -- diff --cc NEWS.txt index 73ec79b,3a4f222..3445d3f --- a/NEWS.txt +++ b/NEWS.txt @@@ -9,27 -9,6 +9,29 @@@ upgrade, just in case you need to roll by version X, but the inverse is not necessarily the case.) +1.2 +=== - +Upgrading +- - - Network compatibility for versions older than 1.0 has been removed. - (Disk compatibility is retained.) If you want to do node-at-a-time, - zero-downtime upgrades to 1.2, you must be on 1.0 or 1.1 first. ++- 1.2 is NOT network-compatible with versions older than 1.0. That ++ means if you want to do a rolling, zero-downtime upgrade, you'll need ++ to upgrade first to 1.0.x or 1.1.x, and then to 1.2. 1.2 retains ++ the ability to read data files from Cassandra versions at least ++ back to 0.6, so a non-rolling upgrade remains possible with just ++ one step. +- The hints schema was changed from 1.1 to 1.2. Cassandra automatically + snapshots and then truncates the hints column family as part of + starting up 1.2 for the first time. Additionally, upgraded nodes + will not store new hints destined for older (pre-1.2) nodes. It is + therefore recommended that you perform a cluster upgrade when all + nodes are up. 
+- The `nodetool removetoken` command (and corresponding JMX operation) + have been renamed to `nodetool removenode`. This function is + incompatible with the earlier `nodetool removetoken`, and attempts to + remove nodes in this way with a mixed 1.1 (or lower) / 1.2 cluster, + is not supported. + + 1.1.1 =
[3/11] git commit: r/m duplicate entry for #3912
r/m duplicate entry for #3912 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c4d6f781 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c4d6f781 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c4d6f781 Branch: refs/heads/trunk Commit: c4d6f781e7473cbcc3715c3e1dd51de5a66c1216 Parents: 6b29f91 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:41:11 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 15:50:56 2012 -0500 -- CHANGES.txt |1 - 1 files changed, 0 insertions(+), 1 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c4d6f781/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 2ae5f1a..91161b1 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -30,7 +30,6 @@ * Open 1 sstableScanner per level for leveled compaction (CASSANDRA-4142) * Optimize reads when row deletion timestamps allow us to restrict the set of sstables we check (CASSANDRA-4116) - * incremental repair by token range (CASSANDRA-3912) * add support for commitlog archiving and point-in-time recovery (CASSANDRA-3690) * avoid generating redundant compaction tasks during streaming
[4/11] git commit: fix merge
fix merge Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6b29f91d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6b29f91d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6b29f91d Branch: refs/heads/trunk Commit: 6b29f91dc9e2c6c1dfe1b713d62dd49fed07c07e Parents: 782b156 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:40:26 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 15:50:55 2012 -0500 -- NEWS.txt |7 --- 1 files changed, 0 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/6b29f91d/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index 400b29b..73ec79b 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -8,13 +8,6 @@ upgrade, just in case you need to roll back to the previous version. (Cassandra version X + 1 will always be able to read data files created by version X, but the inverse is not necessarily the case.) -1.0.10 -== - -Upgrading -- -- Nothing specific to 1.0.10 - 1.2 ===
[5/11] git commit: kick off background compaction when min/max thresholds change patch by jbellis; reviewed by slebresne for CASSANDRA-4279
kick off background compaction when min/max thresholds change
patch by jbellis; reviewed by slebresne for CASSANDRA-4279

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1bfb6851
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1bfb6851
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1bfb6851

Branch: refs/heads/cassandra-1.1
Commit: 1bfb68518a20abdfd7fd1c3e3422df9214df6b05
Parents: 0f439b0
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:49:49 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 15:50:48 2012 -0500

----------------------------------------------------------------------
 CHANGES.txt                                        |    2 ++
 .../org/apache/cassandra/db/ColumnFamilyStore.java |    8 ++++----
 2 files changed, 6 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1bfb6851/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 3abd9fc..2700661 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -73,6 +73,8 @@ Merged from 1.0:
    (CASSANDRA-3985)
  * synchronize LCS getEstimatedTasks to avoid CME (CASSANDRA-4255)
  * ensure unique streaming session id's (CASSANDRA-4223)
+ * kick off background compaction when min/max thresholds change
+   (CASSANDRA-4279)
 
 
 1.1.0-final

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1bfb6851/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index c55d924..4a49d10 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1824,10 +1824,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMinimumCompactionThreshold(int minCompactionThreshold)
     {
         if ((minCompactionThreshold > this.maxCompactionThreshold.value()) && this.maxCompactionThreshold.value() != 0)
-        {
             throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
-        }
+
         this.minCompactionThreshold.set(minCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1838,10 +1838,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMaximumCompactionThreshold(int maxCompactionThreshold)
     {
         if (maxCompactionThreshold > 0 && maxCompactionThreshold < this.minCompactionThreshold.value())
-        {
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
-        }
+
         this.maxCompactionThreshold.set(maxCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public boolean isCompactionDisabled()
[6/11] git commit: kick off background compaction when min/max thresholds change patch by jbellis; reviewed by slebresne for CASSANDRA-4279
kick off background compaction when min/max thresholds change
patch by jbellis; reviewed by slebresne for CASSANDRA-4279

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1bfb6851
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1bfb6851
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1bfb6851

Branch: refs/heads/trunk
Commit: 1bfb68518a20abdfd7fd1c3e3422df9214df6b05
Parents: 0f439b0
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:49:49 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 15:50:48 2012 -0500

----------------------------------------------------------------------
 CHANGES.txt                                        |    2 ++
 .../org/apache/cassandra/db/ColumnFamilyStore.java |    8 ++++----
 2 files changed, 6 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1bfb6851/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 3abd9fc..2700661 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -73,6 +73,8 @@ Merged from 1.0:
    (CASSANDRA-3985)
  * synchronize LCS getEstimatedTasks to avoid CME (CASSANDRA-4255)
  * ensure unique streaming session id's (CASSANDRA-4223)
+ * kick off background compaction when min/max thresholds change
+   (CASSANDRA-4279)
 
 
 1.1.0-final

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1bfb6851/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index c55d924..4a49d10 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1824,10 +1824,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMinimumCompactionThreshold(int minCompactionThreshold)
     {
         if ((minCompactionThreshold > this.maxCompactionThreshold.value()) && this.maxCompactionThreshold.value() != 0)
-        {
             throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
-        }
+
         this.minCompactionThreshold.set(minCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1838,10 +1838,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMaximumCompactionThreshold(int maxCompactionThreshold)
     {
         if (maxCompactionThreshold > 0 && maxCompactionThreshold < this.minCompactionThreshold.value())
-        {
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
-        }
+
         this.maxCompactionThreshold.set(maxCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public boolean isCompactionDisabled()
[7/11] git commit: r/m duplicate entry for #3912
r/m duplicate entry for #3912 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0f439b03 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0f439b03 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0f439b03 Branch: refs/heads/trunk Commit: 0f439b0390b9ceaad17be77e146aca2243e9a846 Parents: 06b2467 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:41:11 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 15:50:47 2012 -0500 -- CHANGES.txt |1 - 1 files changed, 0 insertions(+), 1 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0f439b03/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 8c58af7..3abd9fc 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -18,7 +18,6 @@ * Open 1 sstableScanner per level for leveled compaction (CASSANDRA-4142) * Optimize reads when row deletion timestamps allow us to restrict the set of sstables we check (CASSANDRA-4116) - * incremental repair by token range (CASSANDRA-3912) * add support for commitlog archiving and point-in-time recovery (CASSANDRA-3690) * avoid generating redundant compaction tasks during streaming
[8/11] git commit: r/m duplicate entry for #3912
r/m duplicate entry for #3912 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0f439b03 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0f439b03 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0f439b03 Branch: refs/heads/cassandra-1.1 Commit: 0f439b0390b9ceaad17be77e146aca2243e9a846 Parents: 06b2467 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:41:11 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 15:50:47 2012 -0500 -- CHANGES.txt |1 - 1 files changed, 0 insertions(+), 1 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0f439b03/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 8c58af7..3abd9fc 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -18,7 +18,6 @@ * Open 1 sstableScanner per level for leveled compaction (CASSANDRA-4142) * Optimize reads when row deletion timestamps allow us to restrict the set of sstables we check (CASSANDRA-4116) - * incremental repair by token range (CASSANDRA-3912) * add support for commitlog archiving and point-in-time recovery (CASSANDRA-3690) * avoid generating redundant compaction tasks during streaming
[9/11] git commit: fix merge
fix merge Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/06b24679 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/06b24679 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/06b24679 Branch: refs/heads/trunk Commit: 06b2467942ea4bbcde6341eeff5a66695f8ec2cd Parents: a727717 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:40:26 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 15:50:46 2012 -0500 -- NEWS.txt |7 --- 1 files changed, 0 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/06b24679/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index 3368d98..3a4f222 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -8,13 +8,6 @@ upgrade, just in case you need to roll back to the previous version. (Cassandra version X + 1 will always be able to read data files created by version X, but the inverse is not necessarily the case.) -1.0.10 -== - -Upgrading -- -- Nothing specific to 1.0.10 - 1.1.1 =
[11/11] git commit: update NEWS.txt to warn users about risk of simultaneously creating CFs on different nodes before version 1.2
update NEWS.txt to warn users about risk of simultaneously creating CFs on different nodes before version 1.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a7277178 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a7277178 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a7277178 Branch: refs/heads/trunk Commit: a7277178a15ce899a9fc248e3ed1fbaecdec3e01 Parents: cf9a581 Author: Pavel Yaskevich xe...@apache.org Authored: Fri May 25 23:48:53 2012 +0300 Committer: Pavel Yaskevich xe...@apache.org Committed: Fri May 25 23:49:57 2012 +0300 -- NEWS.txt |6 -- 1 files changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a7277178/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index b87f05c..3368d98 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -68,8 +68,10 @@ Upgrading Features - Concurrent schema updates are now supported, with any conflicts - automatically resolved. This makes temporary columnfamilies and - other uses of dynamic schema appropriate to use in applications. + automatically resolved. Please note that simultaneously running + ‘CREATE COLUMN FAMILY’ operation on the different nodes wouldn’t + be safe until version 1.2 due to the nature of ColumnFamily + identifier generation, for more details see CASSANDRA-3794. - The CQL language has undergone a major revision, CQL3, the highlights of which are covered at [1]. CQL3 is not backwards-compatibile with CQL2, so we've introduced a
[10/11] git commit: fix merge
fix merge Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/06b24679 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/06b24679 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/06b24679 Branch: refs/heads/cassandra-1.1 Commit: 06b2467942ea4bbcde6341eeff5a66695f8ec2cd Parents: a727717 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:40:26 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 15:50:46 2012 -0500 -- NEWS.txt |7 --- 1 files changed, 0 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/06b24679/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index 3368d98..3a4f222 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -8,13 +8,6 @@ upgrade, just in case you need to roll back to the previous version. (Cassandra version X + 1 will always be able to read data files created by version X, but the inverse is not necessarily the case.) -1.0.10 -== - -Upgrading -- -- Nothing specific to 1.0.10 - 1.1.1 =
[jira] [Commented] (CASSANDRA-4287) SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283749#comment-13283749 ]

Tyler Hobbs commented on CASSANDRA-4287:
----------------------------------------

+1

SizeTieredCompactionStrategy.getBuckets is quadradic in the number of sstables
------------------------------------------------------------------------------

Key: CASSANDRA-4287
URL: https://issues.apache.org/jira/browse/CASSANDRA-4287
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
Labels: compaction
Fix For: 1.0.11

getBuckets first sorts the sstables by size (N log N) then adds each sstable to a bucket (N**2 in the worst case of all sstables the same size, because we use the bucket's contents as a hash key).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
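The N**2 term in the description can be made concrete with a small experiment. The sketch below is not Cassandra code: `BucketKeyCost` and `CountingList` are hypothetical names, and the remove/add/put loop is only an assumed approximation of how a map keyed by the bucket itself would be maintained. It counts how many elements get walked when a growing list is re-hashed on every insertion, and contrasts that with keying the map by a single Long (for example the bucket's average size).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

public class BucketKeyCost {
    // Total number of elements walked while hashing list-valued keys.
    static long visited = 0;

    // A list that records how much work each hashCode() call does.
    static final class CountingList extends ArrayList<Long> {
        @Override
        public int hashCode() {
            visited += size(); // hashing a list touches every element
            return super.hashCode();
        }
    }

    // Worst case from the ticket: n sstables of identical size all land in
    // one bucket, and the bucket's contents are used as the hash key.
    public static long elementsHashedWithListKey(int n) {
        visited = 0;
        Map<CountingList, Long> buckets = new HashMap<>();
        CountingList bucket = new CountingList();
        for (int i = 0; i < n; i++) {
            buckets.remove(bucket);    // hashes the bucket at its current size
            bucket.add(100L);          // every sstable has the same size
            buckets.put(bucket, 100L); // hashes it again, one element larger
        }
        return visited; // i + (i + 1) per step, which sums to n * n
    }

    // Keyed by a Long instead: each map operation hashes one number,
    // no matter how large the bucket has grown.
    public static int bucketSizeWithLongKey(int n) {
        Map<Long, ArrayList<Long>> buckets = new HashMap<>();
        for (int i = 0; i < n; i++) {
            ArrayList<Long> bucket = buckets.remove(100L);
            if (bucket == null)
                bucket = new ArrayList<>();
            bucket.add(100L);
            buckets.put(100L, bucket); // the average of equal sizes stays 100
        }
        return buckets.get(100L).size();
    }
}
```

Doubling n quadruples the first count, while the second variant does a constant amount of hashing work per sstable.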
[4/22] git commit: fix NPE from circular dependency on compaction strategy
fix NPE from circular dependency on compaction strategy

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/043d1808
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/043d1808
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/043d1808

Branch: refs/heads/trunk
Commit: 043d1808366a40b81d5275090060b7372ae4cbf5
Parents: 853a759
Author: Jonathan Ellis jbel...@apache.org
Authored: Sat May 26 00:22:00 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Sat May 26 00:22:00 2012 -0500

----------------------------------------------------------------------
 .../org/apache/cassandra/db/ColumnFamilyStore.java |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/043d1808/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 56de67e..b3da68e 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1730,7 +1730,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
             throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
 
         this.minCompactionThreshold.set(minCompactionThreshold);
-        CompactionManager.instance.submitBackground(this);
+
+        // this is called as part of CompactionStrategy constructor; avoid circular dependency by checking for null
+        if (compactionStrategy != null)
+            CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1744,7 +1747,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
 
         this.maxCompactionThreshold.set(maxCompactionThreshold);
-        CompactionManager.instance.submitBackground(this);
+
+        // this is called as part of CompactionStrategy constructor; avoid circular dependency by checking for null
+        if (compactionStrategy != null)
+            CompactionManager.instance.submitBackground(this);
     }
 
     public boolean isCompactionDisabled()
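The failure mode this commit guards against can be reproduced in miniature. The following is a sketch under assumptions, not the actual Cassandra classes: `Store` and `Strategy` are hypothetical stand-ins for ColumnFamilyStore and the compaction strategy, and `submitBackground` simply dereferences the strategy the way a real background check would need to.

```java
public class CircularInitSketch {
    // Hypothetical strategy that configures its store from inside its constructor.
    static final class Strategy {
        Strategy(Store store) {
            store.setMinThreshold(4); // runs before Store.strategy has been assigned
        }
    }

    static final class Store {
        private final boolean guarded;
        private Strategy strategy; // still null while new Strategy(this) is running
        private int minThreshold;
        private int submitted = 0;

        Store(boolean guarded) {
            this.guarded = guarded;
            this.strategy = new Strategy(this);
        }

        void setMinThreshold(int min) {
            this.minThreshold = min;
            // The guard mirrors the "compactionStrategy != null" check in the patch.
            if (!guarded || strategy != null)
                submitBackground();
        }

        // Stand-in for the background check, which needs the strategy.
        private void submitBackground() {
            strategy.hashCode(); // NullPointerException while strategy is null
            submitted++;
        }
    }

    // True if building a Store blows up with an NPE.
    public static boolean constructionThrowsNpe(boolean guarded) {
        try {
            new Store(guarded);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // With the guard in place, updates after construction still submit work.
    public static int submissionsAfterUpdate() {
        Store store = new Store(true);
        store.setMinThreshold(8);
        return store.submitted;
    }
}
```

The guard skips the submission only during construction; once the strategy field is assigned, threshold changes trigger background checks as before.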
[8/22] git commit: merge from 1.0
merge from 1.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/64a5e70e Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/64a5e70e Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/64a5e70e Branch: refs/heads/cassandra-1.1 Commit: 64a5e70ef5aacdfe244baa91b5698a326173082f Parents: 1bfb685 853a759 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 16:23:52 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 16:23:52 2012 -0500 -- .../db/compaction/AbstractCompactionTask.java |5 ++ .../cassandra/db/compaction/CompactionManager.java | 21 +++- .../cassandra/db/compaction/LeveledManifest.java | 10 +++- .../compaction/SizeTieredCompactionStrategy.java | 40 +++ 4 files changed, 51 insertions(+), 25 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/64a5e70e/src/java/org/apache/cassandra/db/compaction/CompactionManager.java -- diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java index f8bab20,872ce0b..e09a012 --- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java @@@ -111,31 -109,54 +111,50 @@@ public class CompactionManager implemen * It's okay to over-call (within reason) since the compactions are single-threaded, * and if a call is unnecessary, it will just be no-oped in the bucketing phase. */ -public FutureInteger submitBackground(final ColumnFamilyStore cfs) +public Future? 
submitBackground(final ColumnFamilyStore cfs) { + logger.debug(Scheduling a background task check for {}.{} with {}, + new Object[] {cfs.table.name, +cfs.columnFamily, + cfs.getCompactionStrategy().getClass().getSimpleName()}); -CallableInteger callable = new CallableInteger() +Runnable runnable = new WrappedRunnable() { -public Integer call() throws IOException +protected void runMayThrow() throws IOException { compactionLock.readLock().lock(); try { + logger.debug(Checking {}.{}, cfs.table.name, cfs.columnFamily); // log after we get the lock so we can see delays from that if any + if (!cfs.isValid()) + { + logger.debug(Aborting compaction for dropped CF); -return 0; ++return; + } + -boolean taskExecuted = false; AbstractCompactionStrategy strategy = cfs.getCompactionStrategy(); -ListAbstractCompactionTask tasks = strategy.getBackgroundTasks(getDefaultGcBefore(cfs)); -logger.debug({} minor compaction tasks available, tasks.size()); -for (AbstractCompactionTask task : tasks) +AbstractCompactionTask task = strategy.getNextBackgroundTask(getDefaultGcBefore(cfs)); - if (task == null || !task.markSSTablesForCompaction()) ++if (task == null) + { -if (!task.markSSTablesForCompaction()) -{ -logger.debug(Skipping {}; sstables are busy, task); -continue; -} - -taskExecuted = true; -try -{ -task.execute(executor); -} -finally -{ -task.unmarkSSTables(); -} ++logger.debug(No tasks available); +return; ++} ++if (!task.markSSTablesForCompaction()) ++{ ++logger.debug(Unable to mark SSTables for {}, task); ++return; + } -// newly created sstables might have made other compactions eligible -if (taskExecuted) -submitBackground(cfs); +try +{ +task.execute(executor); +} +finally +{ +task.unmarkSSTables(); +} +submitBackground(cfs); }
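The shape of the merged `submitBackground` above (ask for one task, run it, then schedule another check) can be sketched in isolation. This is a simplified illustration, not the real CompactionManager: `BackgroundCheckSketch` is a hypothetical class, the queue stands in for whatever `strategy.getNextBackgroundTask(...)` would return, the compaction lock and the sstable marking step are omitted, and the counters are not thread-safe, so it assumes a single-threaded executor.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.concurrent.Executor;

public class BackgroundCheckSketch {
    // Stand-in for the tasks the compaction strategy would offer, one at a time.
    private final Deque<Runnable> pending;
    private final Executor executor;
    private int checks = 0;
    private int executed = 0;

    public BackgroundCheckSketch(Executor executor, Collection<Runnable> tasks) {
        this.executor = executor;
        this.pending = new ArrayDeque<>(tasks);
    }

    public void submitBackground() {
        executor.execute(() -> {
            checks++;
            Runnable task = pending.poll(); // at most one task per check
            if (task == null)
                return;                     // nothing to do; stop rescheduling
            task.run();
            executed++;
            // A finished task may have made more work eligible, so check again.
            submitBackground();
        });
    }

    public int checks() {
        return checks;
    }

    public int executed() {
        return executed;
    }
}
```

Each check handles a single task and reschedules itself, so the chain ends with one final check that finds no task.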
[1/22] git commit: merge from 1.1
Updated Branches: refs/heads/cassandra-1.0 00a553438 - 043d18083 refs/heads/cassandra-1.1 1bfb68518 - dad8d80fe refs/heads/trunk 2155250fd - 15d690fdc merge from 1.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/15d690fd Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/15d690fd Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/15d690fd Branch: refs/heads/trunk Commit: 15d690fdc2362e93beeddc364dbc63a56ab5ec44 Parents: 2155250 dad8d80 Author: Jonathan Ellis jbel...@apache.org Authored: Sat May 26 00:25:25 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Sat May 26 00:25:25 2012 -0500 -- .../org/apache/cassandra/db/ColumnFamilyStore.java | 10 +++- .../db/compaction/AbstractCompactionTask.java |5 ++ .../cassandra/db/compaction/CompactionManager.java | 21 +++- .../cassandra/db/compaction/LeveledManifest.java | 10 +++- .../compaction/SizeTieredCompactionStrategy.java | 41 +++ 5 files changed, 60 insertions(+), 27 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/15d690fd/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/15d690fd/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java -- diff --cc src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java index 9826941,1449c87..49c98d2 --- a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java +++ b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java @@@ -90,15 -85,8 +90,20 @@@ public abstract class AbstractCompactio // execute (if sstable can't be marked successfully) protected void cancel() {} +public AbstractCompactionTask isUserDefined(boolean isUserDefined) +{ +this.isUserDefined = isUserDefined; +return this; +} + +public AbstractCompactionTask setCompactionType(OperationType compactionType) +{ +this.compactionType = compactionType; +return this; +} 
++ + public String toString() + { + return CompactionTask( + sstables + ); + } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/15d690fd/src/java/org/apache/cassandra/db/compaction/CompactionManager.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/15d690fd/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/15d690fd/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java --
[2/22] git commit: Merge branch 'cassandra-1.0' into cassandra-1.1
Merge branch 'cassandra-1.0' into cassandra-1.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dad8d80f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dad8d80f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dad8d80f Branch: refs/heads/cassandra-1.1 Commit: dad8d80fe5b48c56215dcc596866f8610909dfa8 Parents: 64a5e70 043d180 Author: Jonathan Ellis jbel...@apache.org Authored: Sat May 26 00:23:17 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Sat May 26 00:23:17 2012 -0500 -- .../org/apache/cassandra/db/ColumnFamilyStore.java | 10 -- 1 files changed, 8 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/dad8d80f/src/java/org/apache/cassandra/db/ColumnFamilyStore.java --
[6/22] git commit: fix NPE from circular dependency on compaction strategy
fix NPE from circular dependency on compaction strategy

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/043d1808
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/043d1808
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/043d1808

Branch: refs/heads/cassandra-1.1
Commit: 043d1808366a40b81d5275090060b7372ae4cbf5
Parents: 853a759
Author: Jonathan Ellis jbel...@apache.org
Authored: Sat May 26 00:22:00 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Sat May 26 00:22:00 2012 -0500

----------------------------------------------------------------------
 .../org/apache/cassandra/db/ColumnFamilyStore.java |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/043d1808/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 56de67e..b3da68e 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1730,7 +1730,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
             throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
 
         this.minCompactionThreshold.set(minCompactionThreshold);
-        CompactionManager.instance.submitBackground(this);
+
+        // this is called as part of CompactionStrategy constructor; avoid circular dependency by checking for null
+        if (compactionStrategy != null)
+            CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1744,7 +1747,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
 
         this.maxCompactionThreshold.set(maxCompactionThreshold);
-        CompactionManager.instance.submitBackground(this);
+
+        // this is called as part of CompactionStrategy constructor; avoid circular dependency by checking for null
+        if (compactionStrategy != null)
+            CompactionManager.instance.submitBackground(this);
     }
 
     public boolean isCompactionDisabled()
[3/22] git commit: Merge branch 'cassandra-1.0' into cassandra-1.1
Merge branch 'cassandra-1.0' into cassandra-1.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dad8d80f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dad8d80f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dad8d80f Branch: refs/heads/trunk Commit: dad8d80fe5b48c56215dcc596866f8610909dfa8 Parents: 64a5e70 043d180 Author: Jonathan Ellis jbel...@apache.org Authored: Sat May 26 00:23:17 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Sat May 26 00:23:17 2012 -0500 -- .../org/apache/cassandra/db/ColumnFamilyStore.java | 10 -- 1 files changed, 8 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/dad8d80f/src/java/org/apache/cassandra/db/ColumnFamilyStore.java --
[9/22] git commit: rewrite to avoid bFound variable
rewrite to avoid bFound variable

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/853a7593
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/853a7593
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/853a7593

Branch: refs/heads/trunk
Commit: 853a75936c886136e9c7d5e0be42583e0305a6bd
Parents: 2717610
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:17:21 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 16:15:24 2012 -0500

--
 .../compaction/SizeTieredCompactionStrategy.java | 15 ++-
 1 files changed, 6 insertions(+), 9 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/853a7593/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
index 747edcc..636d6ba 100644
--- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
@@ -124,11 +124,11 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy

         Map<Long, List<T>> buckets = new HashMap<Long, List<T>>();

+        outer:
         for (Pair<T, Long> pair: sortedFiles)
         {
             long size = pair.right;

-            boolean bFound = false;
             // look for a bucket containing similar-sized files:
             // group in the same bucket if it's w/in 50% of the average for this bucket,
             // or this file and the bucket are all considered small (less than `minSSTableSize`)
@@ -145,17 +145,14 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
                     long newAverageSize = (totalSize + size) / (bucket.size() + 1);
                     bucket.add(pair.left);
                     buckets.put(newAverageSize, bucket);
-                    bFound = true;
-                    break;
+                    continue outer;
                 }
             }
+
             // no similar bucket found; put it in a new one
-            if (!bFound)
-            {
-                ArrayList<T> bucket = new ArrayList<T>();
-                bucket.add(pair.left);
-                buckets.put(size, bucket);
-            }
+            ArrayList<T> bucket = new ArrayList<T>();
+            bucket.add(pair.left);
+            buckets.put(size, bucket);
         }

         return new ArrayList<List<T>>(buckets.values());
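The control-flow change in this commit (a labeled `continue` replacing a found flag plus `break`) can be illustrated on its own. The sketch below is not the Cassandra bucketing code: `LabeledContinueSketch`, `group`, and the `tolerance` parameter are hypothetical, and the matching rule (distance from the group's first element) is a simplification chosen only to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LabeledContinueSketch
{
    /**
     * Puts each value into the first group whose first element is within
     * `tolerance` of it, or into a new group if no such group exists.
     */
    static List<List<Integer>> group(List<Integer> values, int tolerance)
    {
        List<List<Integer>> groups = new ArrayList<List<Integer>>();

        outer:
        for (int value : values)
        {
            for (List<Integer> candidate : groups)
            {
                if (Math.abs(candidate.get(0) - value) <= tolerance)
                {
                    candidate.add(value);
                    // Jump straight to the next value; no "found" flag is needed.
                    continue outer;
                }
            }

            // Reached only when no existing group matched.
            List<Integer> fresh = new ArrayList<Integer>();
            fresh.add(value);
            groups.add(fresh);
        }
        return groups;
    }

    public static void main(String[] args)
    {
        System.out.println(group(Arrays.asList(1, 2, 10, 11, 3), 2));
    }
}
```

Because the fall-through code after the inner loop runs only when nothing matched, the flag, the `break`, and the trailing `if` block all disappear, which is the same simplification the diff above makes.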
[12/22] git commit: rewrite to avoid bFound variable
rewrite to avoid bFound variable Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/853a7593 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/853a7593 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/853a7593 Branch: refs/heads/cassandra-1.0 Commit: 853a75936c886136e9c7d5e0be42583e0305a6bd Parents: 2717610 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:17:21 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 16:15:24 2012 -0500 -- .../compaction/SizeTieredCompactionStrategy.java | 15 ++- 1 files changed, 6 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/853a7593/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java index 747edcc..636d6ba 100644 --- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java +++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java @@ -124,11 +124,11 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy MapLong, ListT buckets = new HashMapLong, ListT(); +outer: for (PairT, Long pair: sortedFiles) { long size = pair.right; -boolean bFound = false; // look for a bucket containing similar-sized files: // group in the same bucket if it's w/in 50% of the average for this bucket, // or this file and the bucket are all considered small (less than `minSSTableSize`) @@ -145,17 +145,14 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy long newAverageSize = (totalSize + size) / (bucket.size() + 1); bucket.add(pair.left); buckets.put(newAverageSize, bucket); -bFound = true; -break; +continue outer; } } + // no similar bucket found; put it in a new one -if (!bFound) -{ 
-ArrayListT bucket = new ArrayListT(); -bucket.add(pair.left); -buckets.put(size, bucket); -} +ArrayListT bucket = new ArrayListT(); +bucket.add(pair.left); +buckets.put(size, bucket); } return new ArrayListListT(buckets.values());
[10/22] git commit: rewrite to avoid bFound variable
rewrite to avoid bFound variable Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/853a7593 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/853a7593 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/853a7593 Branch: refs/heads/cassandra-1.1 Commit: 853a75936c886136e9c7d5e0be42583e0305a6bd Parents: 2717610 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:17:21 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 16:15:24 2012 -0500 -- .../compaction/SizeTieredCompactionStrategy.java | 15 ++- 1 files changed, 6 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/853a7593/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java index 747edcc..636d6ba 100644 --- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java +++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java @@ -124,11 +124,11 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy MapLong, ListT buckets = new HashMapLong, ListT(); +outer: for (PairT, Long pair: sortedFiles) { long size = pair.right; -boolean bFound = false; // look for a bucket containing similar-sized files: // group in the same bucket if it's w/in 50% of the average for this bucket, // or this file and the bucket are all considered small (less than `minSSTableSize`) @@ -145,17 +145,14 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy long newAverageSize = (totalSize + size) / (bucket.size() + 1); bucket.add(pair.left); buckets.put(newAverageSize, bucket); -bFound = true; -break; +continue outer; } } + // no similar bucket found; put it in a new one -if (!bFound) -{ 
-ArrayListT bucket = new ArrayListT(); -bucket.add(pair.left); -buckets.put(size, bucket); -} +ArrayListT bucket = new ArrayListT(); +bucket.add(pair.left); +buckets.put(size, bucket); } return new ArrayListListT(buckets.values());
[5/22] git commit: fix NPE from circular dependency on compaction strategy
fix NPE from circular dependency on compaction strategy Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/043d1808 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/043d1808 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/043d1808 Branch: refs/heads/cassandra-1.0 Commit: 043d1808366a40b81d5275090060b7372ae4cbf5 Parents: 853a759 Author: Jonathan Ellis jbel...@apache.org Authored: Sat May 26 00:22:00 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Sat May 26 00:22:00 2012 -0500 -- .../org/apache/cassandra/db/ColumnFamilyStore.java | 10 -- 1 files changed, 8 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/043d1808/src/java/org/apache/cassandra/db/ColumnFamilyStore.java -- diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java index 56de67e..b3da68e 100644 --- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java +++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java @@ -1730,7 +1730,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean throw new RuntimeException(The min_compaction_threshold cannot be larger than the max.); this.minCompactionThreshold.set(minCompactionThreshold); -CompactionManager.instance.submitBackground(this); + +// this is called as part of CompactionStrategy constructor; avoid circular dependency by checking for null +if (compactionStrategy != null) +CompactionManager.instance.submitBackground(this); } public int getMaximumCompactionThreshold() @@ -1744,7 +1747,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean throw new RuntimeException(The max_compaction_threshold cannot be smaller than the min.); this.maxCompactionThreshold.set(maxCompactionThreshold); -CompactionManager.instance.submitBackground(this); + +// this is called as part of CompactionStrategy constructor; avoid 
circular dependency by checking for null +if (compactionStrategy != null) +CompactionManager.instance.submitBackground(this); } public boolean isCompactionDisabled()
[7/22] git commit: merge from 1.0
merge from 1.0 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/64a5e70e Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/64a5e70e Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/64a5e70e Branch: refs/heads/trunk Commit: 64a5e70ef5aacdfe244baa91b5698a326173082f Parents: 1bfb685 853a759 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 16:23:52 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 16:23:52 2012 -0500 -- .../db/compaction/AbstractCompactionTask.java |5 ++ .../cassandra/db/compaction/CompactionManager.java | 21 +++- .../cassandra/db/compaction/LeveledManifest.java | 10 +++- .../compaction/SizeTieredCompactionStrategy.java | 40 +++ 4 files changed, 51 insertions(+), 25 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/64a5e70e/src/java/org/apache/cassandra/db/compaction/CompactionManager.java -- diff --cc src/java/org/apache/cassandra/db/compaction/CompactionManager.java index f8bab20,872ce0b..e09a012 --- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java +++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java @@@ -111,31 -109,54 +111,50 @@@ public class CompactionManager implemen * It's okay to over-call (within reason) since the compactions are single-threaded, * and if a call is unnecessary, it will just be no-oped in the bucketing phase. */ -public FutureInteger submitBackground(final ColumnFamilyStore cfs) +public Future? 
submitBackground(final ColumnFamilyStore cfs) { + logger.debug(Scheduling a background task check for {}.{} with {}, + new Object[] {cfs.table.name, +cfs.columnFamily, + cfs.getCompactionStrategy().getClass().getSimpleName()}); -CallableInteger callable = new CallableInteger() +Runnable runnable = new WrappedRunnable() { -public Integer call() throws IOException +protected void runMayThrow() throws IOException { compactionLock.readLock().lock(); try { + logger.debug(Checking {}.{}, cfs.table.name, cfs.columnFamily); // log after we get the lock so we can see delays from that if any + if (!cfs.isValid()) + { + logger.debug(Aborting compaction for dropped CF); -return 0; ++return; + } + -boolean taskExecuted = false; AbstractCompactionStrategy strategy = cfs.getCompactionStrategy(); -ListAbstractCompactionTask tasks = strategy.getBackgroundTasks(getDefaultGcBefore(cfs)); -logger.debug({} minor compaction tasks available, tasks.size()); -for (AbstractCompactionTask task : tasks) +AbstractCompactionTask task = strategy.getNextBackgroundTask(getDefaultGcBefore(cfs)); - if (task == null || !task.markSSTablesForCompaction()) ++if (task == null) + { -if (!task.markSSTablesForCompaction()) -{ -logger.debug(Skipping {}; sstables are busy, task); -continue; -} - -taskExecuted = true; -try -{ -task.execute(executor); -} -finally -{ -task.unmarkSSTables(); -} ++logger.debug(No tasks available); +return; ++} ++if (!task.markSSTablesForCompaction()) ++{ ++logger.debug(Unable to mark SSTables for {}, task); ++return; + } -// newly created sstables might have made other compactions eligible -if (taskExecuted) -submitBackground(cfs); +try +{ +task.execute(executor); +} +finally +{ +task.unmarkSSTables(); +} +submitBackground(cfs); } -
[20/22] git commit: debug logging
debug logging Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/99ad7d60 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/99ad7d60 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/99ad7d60 Branch: refs/heads/cassandra-1.0 Commit: 99ad7d60012c05e98754ca8c5730d865ffe024e8 Parents: 00a5534 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 16:15:19 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 16:15:19 2012 -0500 -- .../db/compaction/AbstractCompactionTask.java |5 + .../cassandra/db/compaction/CompactionManager.java | 12 .../cassandra/db/compaction/LeveledManifest.java | 10 +++--- .../compaction/SizeTieredCompactionStrategy.java |3 ++- 4 files changed, 26 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java index dbcd8cd..1449c87 100644 --- a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java +++ b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java @@ -84,4 +84,9 @@ public abstract class AbstractCompactionTask // Can be overriden for action that need to be performed if the task won't // execute (if sstable can't be marked successfully) protected void cancel() {} + +public String toString() +{ +return CompactionTask( + sstables + ); +} } http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/CompactionManager.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java index 55fab3c..872ce0b 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java @@ -111,6 +111,10 @@ public class CompactionManager implements CompactionManagerMBean */ public FutureInteger submitBackground(final ColumnFamilyStore cfs) { +logger.debug(Scheduling a background task check for {}.{} with {}, + new Object[] {cfs.table.name, + cfs.columnFamily, + cfs.getCompactionStrategy().getClass().getSimpleName()}); CallableInteger callable = new CallableInteger() { public Integer call() throws IOException @@ -118,16 +122,24 @@ public class CompactionManager implements CompactionManagerMBean compactionLock.readLock().lock(); try { +logger.debug(Checking {}.{}, cfs.table.name, cfs.columnFamily); // log after we get the lock so we can see delays from that if any if (!cfs.isValid()) +{ +logger.debug(Aborting compaction for dropped CF); return 0; +} boolean taskExecuted = false; AbstractCompactionStrategy strategy = cfs.getCompactionStrategy(); ListAbstractCompactionTask tasks = strategy.getBackgroundTasks(getDefaultGcBefore(cfs)); +logger.debug({} minor compaction tasks available, tasks.size()); for (AbstractCompactionTask task : tasks) { if (!task.markSSTablesForCompaction()) +{ +logger.debug(Skipping {}; sstables are busy, task); continue; +} taskExecuted = true; try http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java index 4e13640..7e06848 100644 --- a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java +++ b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java @@ -448,13 +448,17 @@ public class LeveledManifest public synchronized int getEstimatedTasks() { long tasks = 0; +long[] estimated = new long[generations.length]; +
[11/22] git commit: switch to MapLong, List to avoid re-hashing the list with every addition patch by jbellis; reviewed by thobbs for CASSANDRA-4287
switch to Map<Long, List> to avoid re-hashing the list with every addition
patch by jbellis; reviewed by thobbs for CASSANDRA-4287

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/27176103
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/27176103
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/27176103

Branch: refs/heads/trunk
Commit: 271761038ce1b77eea05b96f1226016b45f5f3d0
Parents: 6ab02c5
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:16:52 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 16:15:24 2012 -0500

--
 CHANGES.txt | 2 +
 .../compaction/SizeTieredCompactionStrategy.java | 26 +++---
 2 files changed, 15 insertions(+), 13 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/27176103/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 404a744..b566c6f 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -3,6 +3,8 @@
  * ensure unique streaming session id's (CASSANDRA-4223)
  * kick off background compaction when min/max thresholds change
    (CASSANDRA-4279)
+ * improve ability of STCS.getBuckets to deal with 100s of 1000s of
+   sstables, such as when convertinb back from LCS (CASSANDRA-4287)


 1.0.10

http://git-wip-us.apache.org/repos/asf/cassandra/blob/27176103/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
index 380982f..747edcc 100644
--- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
@@ -122,7 +122,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             }
         });

-        Map<List<T>, Long> buckets = new HashMap<List<T>, Long>();
+        Map<Long, List<T>> buckets = new HashMap<Long, List<T>>();

         for (Pair<T, Long> pair: sortedFiles)
         {
@@ -132,19 +132,19 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             // look for a bucket containing similar-sized files:
             // group in the same bucket if it's w/in 50% of the average for this bucket,
             // or this file and the bucket are all considered small (less than `minSSTableSize`)
-            for (Entry<List<T>, Long> entry : buckets.entrySet())
+            for (Entry<Long, List<T>> entry : buckets.entrySet())
             {
-                List<T> bucket = entry.getKey();
-                long averageSize = entry.getValue();
-                if ((size > (averageSize / 2) && size < (3 * averageSize) / 2)
-                    || (size < minSSTableSize && averageSize < minSSTableSize))
+                List<T> bucket = entry.getValue();
+                long oldAverageSize = entry.getKey();
+                if ((size > (oldAverageSize / 2) && size < (3 * oldAverageSize) / 2)
+                    || (size < minSSTableSize && oldAverageSize < minSSTableSize))
                 {
-                    // remove and re-add because adding changes the hash
-                    buckets.remove(bucket);
-                    long totalSize = bucket.size() * averageSize;
-                    averageSize = (totalSize + size) / (bucket.size() + 1);
+                    // remove and re-add under new new average size
+                    buckets.remove(oldAverageSize);
+                    long totalSize = bucket.size() * oldAverageSize;
+                    long newAverageSize = (totalSize + size) / (bucket.size() + 1);
                     bucket.add(pair.left);
-                    buckets.put(bucket, averageSize);
+                    buckets.put(newAverageSize, bucket);
                     bFound = true;
                     break;
                 }
@@ -154,11 +154,11 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             {
                 ArrayList<T> bucket = new ArrayList<T>();
                 bucket.add(pair.left);
-                buckets.put(bucket, size);
+                buckets.put(size, bucket);
             }
         }

-        return new ArrayList<List<T>>(buckets.keySet());
+        return new ArrayList<List<T>>(buckets.values());
     }

     private void updateEstimatedCompactionsByTasks(List<AbstractCompactionTask> tasks)
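To make the motivation for this change concrete, the following sketch counts how many element hash computations each map layout performs while a single bucket grows. It is an illustration only: `KeyCostSketch`, `Item`, `growListKeyed`, and `growLongKeyed` are hypothetical names, and the counting `hashCode` is an instrument added for the example, not something present in Cassandra.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class KeyCostSketch
{
    /** Number of times an Item was hashed since the last reset. */
    static int elementHashCalls = 0;

    /** Hypothetical stand-in for an sstable; hashing it bumps the counter. */
    static final class Item
    {
        @Override
        public int hashCode()
        {
            elementHashCalls++;
            return 1;
        }
    }

    /** Old layout: the bucket list itself is the map key. */
    static int growListKeyed(int additions)
    {
        elementHashCalls = 0;
        Map<List<Item>, Long> buckets = new HashMap<List<Item>, Long>();
        List<Item> bucket = new ArrayList<Item>();
        buckets.put(bucket, 0L);
        for (int i = 0; i < additions; i++)
        {
            buckets.remove(bucket);        // hashes every element already in the list
            bucket.add(new Item());
            buckets.put(bucket, (long) i); // hashes every element again, plus the new one
        }
        return elementHashCalls;
    }

    /** New layout: a Long is the key, the bucket list is the value. */
    static int growLongKeyed(int additions)
    {
        elementHashCalls = 0;
        Map<Long, List<Item>> buckets = new HashMap<Long, List<Item>>();
        List<Item> bucket = new ArrayList<Item>();
        long key = 0;
        buckets.put(key, bucket);
        for (int i = 0; i < additions; i++)
        {
            buckets.remove(key);           // hashes only the Long key
            bucket.add(new Item());
            key = i + 1;
            buckets.put(key, bucket);      // the list is never hashed
        }
        return elementHashCalls;
    }

    public static void main(String[] args)
    {
        System.out.println("list-keyed element hashes: " + growListKeyed(1000));
        System.out.println("long-keyed element hashes: " + growLongKeyed(1000));
    }
}
```

With a list as the key, every remove and put walks the whole list to compute its hash, so the work grows quadratically with bucket size; with a `Long` key the list contents are never hashed, which is consistent with the CHANGES.txt note about handling very large numbers of sstables.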
[18/22] git commit: debug logging
debug logging Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/99ad7d60 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/99ad7d60 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/99ad7d60 Branch: refs/heads/trunk Commit: 99ad7d60012c05e98754ca8c5730d865ffe024e8 Parents: 00a5534 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 16:15:19 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 16:15:19 2012 -0500 -- .../db/compaction/AbstractCompactionTask.java |5 + .../cassandra/db/compaction/CompactionManager.java | 12 .../cassandra/db/compaction/LeveledManifest.java | 10 +++--- .../compaction/SizeTieredCompactionStrategy.java |3 ++- 4 files changed, 26 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java index dbcd8cd..1449c87 100644 --- a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java +++ b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java @@ -84,4 +84,9 @@ public abstract class AbstractCompactionTask // Can be overriden for action that need to be performed if the task won't // execute (if sstable can't be marked successfully) protected void cancel() {} + +public String toString() +{ +return CompactionTask( + sstables + ); +} } http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/CompactionManager.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java index 55fab3c..872ce0b 100644 --- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java +++ 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java @@ -111,6 +111,10 @@ public class CompactionManager implements CompactionManagerMBean */ public FutureInteger submitBackground(final ColumnFamilyStore cfs) { +logger.debug(Scheduling a background task check for {}.{} with {}, + new Object[] {cfs.table.name, + cfs.columnFamily, + cfs.getCompactionStrategy().getClass().getSimpleName()}); CallableInteger callable = new CallableInteger() { public Integer call() throws IOException @@ -118,16 +122,24 @@ public class CompactionManager implements CompactionManagerMBean compactionLock.readLock().lock(); try { +logger.debug(Checking {}.{}, cfs.table.name, cfs.columnFamily); // log after we get the lock so we can see delays from that if any if (!cfs.isValid()) +{ +logger.debug(Aborting compaction for dropped CF); return 0; +} boolean taskExecuted = false; AbstractCompactionStrategy strategy = cfs.getCompactionStrategy(); ListAbstractCompactionTask tasks = strategy.getBackgroundTasks(getDefaultGcBefore(cfs)); +logger.debug({} minor compaction tasks available, tasks.size()); for (AbstractCompactionTask task : tasks) { if (!task.markSSTablesForCompaction()) +{ +logger.debug(Skipping {}; sstables are busy, task); continue; +} taskExecuted = true; try http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java index 4e13640..7e06848 100644 --- a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java +++ b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java @@ -448,13 +448,17 @@ public class LeveledManifest public synchronized int getEstimatedTasks() { long tasks = 0; +long[] estimated = new long[generations.length]; + for
[13/22] git commit: switch to MapLong, List to avoid re-hashing the list with every addition patch by jbellis; reviewed by thobbs for CASSANDRA-4287
switch to MapLong, List to avoid re-hashing the list with every addition patch by jbellis; reviewed by thobbs for CASSANDRA-4287 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/27176103 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/27176103 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/27176103 Branch: refs/heads/cassandra-1.1 Commit: 271761038ce1b77eea05b96f1226016b45f5f3d0 Parents: 6ab02c5 Author: Jonathan Ellis jbel...@apache.org Authored: Fri May 25 15:16:52 2012 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Fri May 25 16:15:24 2012 -0500 -- CHANGES.txt|2 + .../compaction/SizeTieredCompactionStrategy.java | 26 +++--- 2 files changed, 15 insertions(+), 13 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/27176103/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 404a744..b566c6f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -3,6 +3,8 @@ * ensure unique streaming session id's (CASSANDRA-4223) * kick off background compaction when min/max thresholds change (CASSANDRA-4279) + * improve ability of STCS.getBuckets to deal with 100s of 1000s of + sstables, such as when convertinb back from LCS (CASSANDRA-4287) 1.0.10 http://git-wip-us.apache.org/repos/asf/cassandra/blob/27176103/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java index 380982f..747edcc 100644 --- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java +++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java @@ -122,7 +122,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy } }); -MapListT, Long buckets = new HashMapListT, Long(); +MapLong, ListT buckets = new HashMapLong, 
ListT(); for (PairT, Long pair: sortedFiles) { @@ -132,19 +132,19 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy // look for a bucket containing similar-sized files: // group in the same bucket if it's w/in 50% of the average for this bucket, // or this file and the bucket are all considered small (less than `minSSTableSize`) -for (EntryListT, Long entry : buckets.entrySet()) +for (EntryLong, ListT entry : buckets.entrySet()) { -ListT bucket = entry.getKey(); -long averageSize = entry.getValue(); -if ((size (averageSize / 2) size (3 * averageSize) / 2) -|| (size minSSTableSize averageSize minSSTableSize)) +ListT bucket = entry.getValue(); +long oldAverageSize = entry.getKey(); +if ((size (oldAverageSize / 2) size (3 * oldAverageSize) / 2) +|| (size minSSTableSize oldAverageSize minSSTableSize)) { -// remove and re-add because adding changes the hash -buckets.remove(bucket); -long totalSize = bucket.size() * averageSize; -averageSize = (totalSize + size) / (bucket.size() + 1); +// remove and re-add under new new average size +buckets.remove(oldAverageSize); +long totalSize = bucket.size() * oldAverageSize; +long newAverageSize = (totalSize + size) / (bucket.size() + 1); bucket.add(pair.left); -buckets.put(bucket, averageSize); +buckets.put(newAverageSize, bucket); bFound = true; break; } @@ -154,11 +154,11 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy { ArrayListT bucket = new ArrayListT(); bucket.add(pair.left); -buckets.put(bucket, size); +buckets.put(size, bucket); } } -return new ArrayListListT(buckets.keySet()); +return new ArrayListListT(buckets.values()); } private void updateEstimatedCompactionsByTasks(ListAbstractCompactionTask tasks)
[15/22] git commit: switch to ArrayList
switch to ArrayList

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ab02c59
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ab02c59
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ab02c59

Branch: refs/heads/cassandra-1.1
Commit: 6ab02c5997f2eed760d371e9d9049fc01c4c5952
Parents: 99ad7d6
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:03:30 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 16:15:23 2012 -0500

--
 .../compaction/SizeTieredCompactionStrategy.java | 6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ab02c59/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
--
diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
index 15cc01d..380982f 100644
--- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
@@ -56,7 +56,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             return Collections.emptyList();
         }

-        List<AbstractCompactionTask> tasks = new LinkedList<AbstractCompactionTask>();
+        List<AbstractCompactionTask> tasks = new ArrayList<AbstractCompactionTask>();
         List<List<SSTableReader>> buckets = getBuckets(createSSTableAndLengthPairs(cfs.getSSTables()), minSSTableSize);
         logger.debug("Compaction buckets are {}", buckets);

@@ -81,7 +81,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy

     public List<AbstractCompactionTask> getMaximalTasks(final int gcBefore)
     {
-        List<AbstractCompactionTask> tasks = new LinkedList<AbstractCompactionTask>();
+        List<AbstractCompactionTask> tasks = new ArrayList<AbstractCompactionTask>();
         if (!cfs.getSSTables().isEmpty())
             tasks.add(new CompactionTask(cfs, cfs.getSSTables(), gcBefore));
         return tasks;
@@ -158,7 +158,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             }
         }

-        return new LinkedList<List<T>>(buckets.keySet());
+        return new ArrayList<List<T>>(buckets.keySet());
     }

     private void updateEstimatedCompactionsByTasks(List<AbstractCompactionTask> tasks)
[19/22] git commit: debug logging
debug logging

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/99ad7d60
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/99ad7d60
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/99ad7d60

Branch: refs/heads/cassandra-1.1
Commit: 99ad7d60012c05e98754ca8c5730d865ffe024e8
Parents: 00a5534
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 16:15:19 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 16:15:19 2012 -0500

----------------------------------------------------------------------
 .../db/compaction/AbstractCompactionTask.java      |  5 +
 .../cassandra/db/compaction/CompactionManager.java | 12
 .../cassandra/db/compaction/LeveledManifest.java   | 10 +++---
 .../compaction/SizeTieredCompactionStrategy.java   |  3 ++-
 4 files changed, 26 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
index dbcd8cd..1449c87 100644
--- a/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
+++ b/src/java/org/apache/cassandra/db/compaction/AbstractCompactionTask.java
@@ -84,4 +84,9 @@ public abstract class AbstractCompactionTask
     // Can be overriden for action that need to be performed if the task won't
     // execute (if sstable can't be marked successfully)
     protected void cancel() {}
+
+    public String toString()
+    {
+        return "CompactionTask(" + sstables + ")";
+    }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 55fab3c..872ce0b 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -111,6 +111,10 @@ public class CompactionManager implements CompactionManagerMBean
      */
     public Future<Integer> submitBackground(final ColumnFamilyStore cfs)
     {
+        logger.debug("Scheduling a background task check for {}.{} with {}",
+                     new Object[] {cfs.table.name,
+                                   cfs.columnFamily,
+                                   cfs.getCompactionStrategy().getClass().getSimpleName()});
         Callable<Integer> callable = new Callable<Integer>()
         {
             public Integer call() throws IOException
@@ -118,16 +122,24 @@ public class CompactionManager implements CompactionManagerMBean
                 compactionLock.readLock().lock();
                 try
                 {
+                    logger.debug("Checking {}.{}", cfs.table.name, cfs.columnFamily); // log after we get the lock so we can see delays from that if any
                     if (!cfs.isValid())
+                    {
+                        logger.debug("Aborting compaction for dropped CF");
                         return 0;
+                    }
 
                     boolean taskExecuted = false;
                     AbstractCompactionStrategy strategy = cfs.getCompactionStrategy();
                     List<AbstractCompactionTask> tasks = strategy.getBackgroundTasks(getDefaultGcBefore(cfs));
+                    logger.debug("{} minor compaction tasks available", tasks.size());
                     for (AbstractCompactionTask task : tasks)
                     {
                         if (!task.markSSTablesForCompaction())
+                        {
+                            logger.debug("Skipping {}; sstables are busy", task);
                             continue;
+                        }
 
                         taskExecuted = true;
                         try

http://git-wip-us.apache.org/repos/asf/cassandra/blob/99ad7d60/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
index 4e13640..7e06848 100644
--- a/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
+++ b/src/java/org/apache/cassandra/db/compaction/LeveledManifest.java
@@ -448,13 +448,17 @@ public class LeveledManifest
     public synchronized int getEstimatedTasks()
     {
         long tasks = 0;
+        long[] estimated = new long[generations.length];
+
[21/22] git commit: kick off background compaction when min/max thresholds change patch by jbellis; reviewed by slebresne for CASSANDRA-4279
kick off background compaction when min/max thresholds change
patch by jbellis; reviewed by slebresne for CASSANDRA-4279

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/00a55343
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/00a55343
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/00a55343

Branch: refs/heads/trunk
Commit: 00a553438623945117066d4adfc2826c17d59ccb
Parents: f89b9ae
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:49:49 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 15:49:49 2012 -0500

----------------------------------------------------------------------
 CHANGES.txt                                        | 2 ++
 .../org/apache/cassandra/db/ColumnFamilyStore.java | 8
 2 files changed, 6 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/00a55343/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 15f3c8a..404a744 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,6 +1,8 @@
 1.0.11
  * synchronize LCS getEstimatedTasks to avoid CME (CASSANDRA-4255)
  * ensure unique streaming session id's (CASSANDRA-4223)
+ * kick off background compaction when min/max thresholds change
+   (CASSANDRA-4279)
 
 
 1.0.10

http://git-wip-us.apache.org/repos/asf/cassandra/blob/00a55343/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index b7d74bc..56de67e 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1727,10 +1727,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMinimumCompactionThreshold(int minCompactionThreshold)
     {
         if ((minCompactionThreshold > this.maxCompactionThreshold.value()) && this.maxCompactionThreshold.value() != 0)
-        {
             throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
-        }
+
         this.minCompactionThreshold.set(minCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1741,10 +1741,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMaximumCompactionThreshold(int maxCompactionThreshold)
     {
         if (maxCompactionThreshold > 0 && maxCompactionThreshold < this.minCompactionThreshold.value())
-        {
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
-        }
+
         this.maxCompactionThreshold.set(maxCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public boolean isCompactionDisabled()
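The pattern in the ColumnFamilyStore change (validate the new threshold, store it, then ask for a background compaction check) can be sketched in isolation. This is only an illustration, not the Cassandra source: the class `ThresholdSketch`, the counter `backgroundChecks`, the private `submitBackground()` helper, and the starting values 4 and 32 are all hypothetical stand-ins.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThresholdSketch
{
    // sample starting values, chosen for illustration only
    private volatile int minThreshold = 4;
    private volatile int maxThreshold = 32;

    // hypothetical stand-in for CompactionManager.instance.submitBackground(this):
    // it only counts how many background checks were requested
    final AtomicInteger backgroundChecks = new AtomicInteger();

    private void submitBackground()
    {
        backgroundChecks.incrementAndGet();
    }

    public void setMinimumThreshold(int min)
    {
        // as in the diff, the comparison is skipped when the max is 0
        if (min > maxThreshold && maxThreshold != 0)
            throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");

        minThreshold = min;
        submitBackground(); // re-check for work now that the threshold changed
    }

    public void setMaximumThreshold(int max)
    {
        // as in the diff, a max of 0 bypasses the comparison with the min
        if (max > 0 && max < minThreshold)
            throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");

        maxThreshold = max;
        submitBackground();
    }

    public static void main(String[] args)
    {
        ThresholdSketch s = new ThresholdSketch();
        s.setMinimumThreshold(8);
        System.out.println("background checks requested: " + s.backgroundChecks.get());
    }
}
```

A rejected value throws before the field is updated, so no background check is requested for it; only an accepted change triggers one.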
[16/22] git commit: switch to ArrayList
switch to ArrayList

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ab02c59
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ab02c59
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ab02c59

Branch: refs/heads/cassandra-1.0
Commit: 6ab02c5997f2eed760d371e9d9049fc01c4c5952
Parents: 99ad7d6
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:03:30 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 16:15:23 2012 -0500

----------------------------------------------------------------------
 .../compaction/SizeTieredCompactionStrategy.java | 6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ab02c59/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
index 15cc01d..380982f 100644
--- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
@@ -56,7 +56,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             return Collections.emptyList();
         }
 
-        List<AbstractCompactionTask> tasks = new LinkedList<AbstractCompactionTask>();
+        List<AbstractCompactionTask> tasks = new ArrayList<AbstractCompactionTask>();
         List<List<SSTableReader>> buckets = getBuckets(createSSTableAndLengthPairs(cfs.getSSTables()), minSSTableSize);
         logger.debug("Compaction buckets are {}", buckets);
 
@@ -81,7 +81,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
 
     public List<AbstractCompactionTask> getMaximalTasks(final int gcBefore)
     {
-        List<AbstractCompactionTask> tasks = new LinkedList<AbstractCompactionTask>();
+        List<AbstractCompactionTask> tasks = new ArrayList<AbstractCompactionTask>();
         if (!cfs.getSSTables().isEmpty())
             tasks.add(new CompactionTask(cfs, cfs.getSSTables(), gcBefore));
         return tasks;
@@ -158,7 +158,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             }
         }
 
-        return new LinkedList<List<T>>(buckets.keySet());
+        return new ArrayList<List<T>>(buckets.keySet());
     }
 
     private void updateEstimatedCompactionsByTasks(List<AbstractCompactionTask> tasks)
[14/22] git commit: switch to Map<Long, List> to avoid re-hashing the list with every addition patch by jbellis; reviewed by thobbs for CASSANDRA-4287
switch to Map<Long, List> to avoid re-hashing the list with every addition
patch by jbellis; reviewed by thobbs for CASSANDRA-4287

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/27176103
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/27176103
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/27176103

Branch: refs/heads/cassandra-1.0
Commit: 271761038ce1b77eea05b96f1226016b45f5f3d0
Parents: 6ab02c5
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:16:52 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 16:15:24 2012 -0500

----------------------------------------------------------------------
 CHANGES.txt                                      |  2 +
 .../compaction/SizeTieredCompactionStrategy.java | 26 +++---
 2 files changed, 15 insertions(+), 13 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/27176103/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 404a744..b566c6f 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -3,6 +3,8 @@
  * ensure unique streaming session id's (CASSANDRA-4223)
  * kick off background compaction when min/max thresholds change
    (CASSANDRA-4279)
+ * improve ability of STCS.getBuckets to deal with 100s of 1000s of
+   sstables, such as when convertinb back from LCS (CASSANDRA-4287)
 
 
 1.0.10

http://git-wip-us.apache.org/repos/asf/cassandra/blob/27176103/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
index 380982f..747edcc 100644
--- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
@@ -122,7 +122,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             }
         });
 
-        Map<List<T>, Long> buckets = new HashMap<List<T>, Long>();
+        Map<Long, List<T>> buckets = new HashMap<Long, List<T>>();
 
         for (Pair<T, Long> pair: sortedFiles)
         {
@@ -132,19 +132,19 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             // look for a bucket containing similar-sized files:
             // group in the same bucket if it's w/in 50% of the average for this bucket,
             // or this file and the bucket are all considered small (less than `minSSTableSize`)
-            for (Entry<List<T>, Long> entry : buckets.entrySet())
+            for (Entry<Long, List<T>> entry : buckets.entrySet())
             {
-                List<T> bucket = entry.getKey();
-                long averageSize = entry.getValue();
-                if ((size > (averageSize / 2) && size < (3 * averageSize) / 2)
-                    || (size < minSSTableSize && averageSize < minSSTableSize))
+                List<T> bucket = entry.getValue();
+                long oldAverageSize = entry.getKey();
+                if ((size > (oldAverageSize / 2) && size < (3 * oldAverageSize) / 2)
+                    || (size < minSSTableSize && oldAverageSize < minSSTableSize))
                 {
-                    // remove and re-add because adding changes the hash
-                    buckets.remove(bucket);
-                    long totalSize = bucket.size() * averageSize;
-                    averageSize = (totalSize + size) / (bucket.size() + 1);
+                    // remove and re-add under new new average size
+                    buckets.remove(oldAverageSize);
+                    long totalSize = bucket.size() * oldAverageSize;
+                    long newAverageSize = (totalSize + size) / (bucket.size() + 1);
                     bucket.add(pair.left);
-                    buckets.put(bucket, averageSize);
+                    buckets.put(newAverageSize, bucket);
                     bFound = true;
                     break;
                 }
@@ -154,11 +154,11 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             {
                 ArrayList<T> bucket = new ArrayList<T>();
                 bucket.add(pair.left);
-                buckets.put(bucket, size);
+                buckets.put(size, bucket);
             }
         }
 
-        return new ArrayList<List<T>>(buckets.keySet());
+        return new ArrayList<List<T>>(buckets.values());
     }
 
     private void updateEstimatedCompactionsByTasks(List<AbstractCompactionTask> tasks)
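The idea behind this commit can be sketched outside Cassandra. When a `List` is the map key, its `hashCode()` is computed over all elements, so every remove and re-insert hashes the whole, growing list; keying by the bucket's average size makes the key a single `Long`. The code below is a minimal sketch under stated assumptions, not the project's implementation: the class name `BucketSketch`, the use of plain `long` sizes in place of `Pair<T, Long>`, and the sample input are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BucketSketch
{
    // Groups sizes into buckets keyed by each bucket's current average size.
    // Assumes sortedSizes is already in ascending order.
    static List<List<Long>> getBuckets(List<Long> sortedSizes, long minSize)
    {
        Map<Long, List<Long>> buckets = new HashMap<Long, List<Long>>();
        for (long size : sortedSizes)
        {
            boolean found = false;
            for (Map.Entry<Long, List<Long>> entry : buckets.entrySet())
            {
                long oldAverage = entry.getKey();
                List<Long> bucket = entry.getValue();
                // same bucket if within 50% of its average, or if both are "small"
                if ((size > oldAverage / 2 && size < (3 * oldAverage) / 2)
                    || (size < minSize && oldAverage < minSize))
                {
                    // re-key the bucket under its new average; modifying the map
                    // here is safe only because the loop breaks immediately
                    buckets.remove(oldAverage);
                    long total = bucket.size() * oldAverage;
                    long newAverage = (total + size) / (bucket.size() + 1);
                    bucket.add(size);
                    buckets.put(newAverage, bucket);
                    found = true;
                    break;
                }
            }
            if (!found)
            {
                // no similar bucket: start a new one keyed by this size
                List<Long> bucket = new ArrayList<Long>();
                bucket.add(size);
                buckets.put(size, bucket);
            }
        }
        return new ArrayList<List<Long>>(buckets.values());
    }

    public static void main(String[] args)
    {
        List<Long> sizes = Arrays.asList(10L, 11L, 12L, 100L, 110L, 1000L);
        System.out.println(getBuckets(sizes, 5L).size() + " buckets");
    }
}
```

One caveat of this sketch: because the average is the key, two buckets that arrive at the same average would collide and one would overwrite the other. The order of the returned buckets is also unspecified, since it comes from `HashMap.values()`.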
[17/22] git commit: switch to ArrayList
switch to ArrayList

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ab02c59
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ab02c59
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ab02c59

Branch: refs/heads/trunk
Commit: 6ab02c5997f2eed760d371e9d9049fc01c4c5952
Parents: 99ad7d6
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:03:30 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 16:15:23 2012 -0500

----------------------------------------------------------------------
 .../compaction/SizeTieredCompactionStrategy.java | 6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ab02c59/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
index 15cc01d..380982f 100644
--- a/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
+++ b/src/java/org/apache/cassandra/db/compaction/SizeTieredCompactionStrategy.java
@@ -56,7 +56,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             return Collections.emptyList();
         }
 
-        List<AbstractCompactionTask> tasks = new LinkedList<AbstractCompactionTask>();
+        List<AbstractCompactionTask> tasks = new ArrayList<AbstractCompactionTask>();
         List<List<SSTableReader>> buckets = getBuckets(createSSTableAndLengthPairs(cfs.getSSTables()), minSSTableSize);
         logger.debug("Compaction buckets are {}", buckets);
 
@@ -81,7 +81,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
 
     public List<AbstractCompactionTask> getMaximalTasks(final int gcBefore)
     {
-        List<AbstractCompactionTask> tasks = new LinkedList<AbstractCompactionTask>();
+        List<AbstractCompactionTask> tasks = new ArrayList<AbstractCompactionTask>();
         if (!cfs.getSSTables().isEmpty())
             tasks.add(new CompactionTask(cfs, cfs.getSSTables(), gcBefore));
         return tasks;
@@ -158,7 +158,7 @@ public class SizeTieredCompactionStrategy extends AbstractCompactionStrategy
             }
         }
 
-        return new LinkedList<List<T>>(buckets.keySet());
+        return new ArrayList<List<T>>(buckets.keySet());
     }
 
     private void updateEstimatedCompactionsByTasks(List<AbstractCompactionTask> tasks)
[22/22] git commit: kick off background compaction when min/max thresholds change patch by jbellis; reviewed by slebresne for CASSANDRA-4279
kick off background compaction when min/max thresholds change
patch by jbellis; reviewed by slebresne for CASSANDRA-4279

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/00a55343
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/00a55343
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/00a55343

Branch: refs/heads/cassandra-1.1
Commit: 00a553438623945117066d4adfc2826c17d59ccb
Parents: f89b9ae
Author: Jonathan Ellis jbel...@apache.org
Authored: Fri May 25 15:49:49 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Fri May 25 15:49:49 2012 -0500

----------------------------------------------------------------------
 CHANGES.txt                                        | 2 ++
 .../org/apache/cassandra/db/ColumnFamilyStore.java | 8
 2 files changed, 6 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/00a55343/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 15f3c8a..404a744 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,6 +1,8 @@
 1.0.11
  * synchronize LCS getEstimatedTasks to avoid CME (CASSANDRA-4255)
  * ensure unique streaming session id's (CASSANDRA-4223)
+ * kick off background compaction when min/max thresholds change
+   (CASSANDRA-4279)
 
 
 1.0.10

http://git-wip-us.apache.org/repos/asf/cassandra/blob/00a55343/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index b7d74bc..56de67e 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -1727,10 +1727,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMinimumCompactionThreshold(int minCompactionThreshold)
     {
         if ((minCompactionThreshold > this.maxCompactionThreshold.value()) && this.maxCompactionThreshold.value() != 0)
-        {
             throw new RuntimeException("The min_compaction_threshold cannot be larger than the max.");
-        }
+
         this.minCompactionThreshold.set(minCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public int getMaximumCompactionThreshold()
@@ -1741,10 +1741,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
     public void setMaximumCompactionThreshold(int maxCompactionThreshold)
     {
         if (maxCompactionThreshold > 0 && maxCompactionThreshold < this.minCompactionThreshold.value())
-        {
             throw new RuntimeException("The max_compaction_threshold cannot be smaller than the min.");
-        }
+
         this.maxCompactionThreshold.set(maxCompactionThreshold);
+        CompactionManager.instance.submitBackground(this);
     }
 
     public boolean isCompactionDisabled()