[jira] [Commented] (CASSANDRA-7225) > is >= and < is <= in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997922#comment-13997922 ] Robert Stupp commented on CASSANDRA-7225:
-
It's about the sentence from {{help select_where}} in cqlsh:
{code}
cqlsh:demo> help select_where

SELECT: Filtering rows

  SELECT ... WHERE key = keyname AND name1 = value1
  SELECT ... WHERE key >= startkey and key <= endkey AND name1 = value1
  SELECT ... WHERE key IN ('key', 'key', 'key', ...)
...
Note: The greater-than and less-than operators ( > and < ) result in key
ranges that are inclusive of the terms. There is no supported notion of
strictly greater-than or less-than; these operators are merely supported
as aliases to >= and <=.
{code}

> is >= and < is <= in CQL
--
Key: CASSANDRA-7225
URL: https://issues.apache.org/jira/browse/CASSANDRA-7225
Project: Cassandra
Issue Type: Bug
Reporter: Robert Stupp

Just a small line of text in the cqlsh help command indicates that > is >= and < is <= in CQL. This is confusing to many people (including me :) ) because I did not expect > to return the equals portion. Please allow distinct behaviours for >, >=, <, and <= in CQL queries. Maybe in combination with CASSANDRA-5184 and/or CASSANDRA-4914
--
This message was sent by Atlassian JIRA (v6.2#6252)
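The surprise the reporter describes can be sketched in a few lines. This is an illustrative model (not Cassandra code): under the aliasing rule from the help text, `>` and `<` on keys behave as `>=` and `<=`, so range endpoints are always included.

```python
def key_range(keys, start, end):
    """Model of 'WHERE key > start AND key < end' under the documented
    aliasing rule: both comparisons are treated as inclusive."""
    return [k for k in sorted(keys) if start <= k <= end]

keys = [10, 20, 30, 40]
# A user writing 'key > 10 AND key < 30' might expect only [20],
# but with the aliasing described above the endpoints are included:
print(key_range(keys, 10, 30))  # -> [10, 20, 30]
```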
[jira] [Updated] (CASSANDRA-7233) Dropping a keyspace fails to purge the Key Cache resulting in SSTable Corruption during searches
[ https://issues.apache.org/jira/browse/CASSANDRA-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoo Ganesh updated CASSANDRA-7233:

Description:
Dropping a keyspace fails to purge the Key Cache, resulting in SSTable corruption during searches. Dropping a full keyspace (with column families) and re-creating that same keyspace (with column families) without restarting Cassandra causes searches to print out CorruptSSTable messages. This has to do with the fact that keys that reference the deleted data persist in the key cache after the data has been deleted. Could we have the key cache automatically invalidated for data that is dropped?

was:
Dropping a keyspace fails to purge the Key Cache, resulting in SSTable corruption during searches. One of our workflows involves dropping a full keyspace (with column families) and re-creating it all without restarting Cassandra. When data is dropped from Cassandra, it doesn't look like the key cache is invalidated, which causes searches to print out CorruptSSTable messages. At an initial glance, it looks like the issue we're seeing has to do with the fact that the Descriptor passed into KeyCacheKey's constructor checks directory, generation, ksname, cfname, and temp. In our workflow, when the new keyspace is created, generation restarts at 1, which creates issues. We're not sure if it makes a lot of sense to try and preserve the generation during the deletion/recreation process (and we're not sure where Cassandra would even save this), but that would be a fix for our workflow. Additionally, making the actual Column Family UUIDs unique would be great as well. It looks like in RowKeyCache, the UUIDs are just made up of the keyspace name and column family.
Dropping a keyspace fails to purge the Key Cache resulting in SSTable Corruption during searches
--
Key: CASSANDRA-7233
URL: https://issues.apache.org/jira/browse/CASSANDRA-7233
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Vinoo Ganesh
Fix For: 1.2.17

Dropping a keyspace fails to purge the Key Cache, resulting in SSTable corruption during searches. Dropping a full keyspace (with column families) and re-creating that same keyspace (with column families) without restarting Cassandra causes searches to print out CorruptSSTable messages. This has to do with the fact that keys that reference the deleted data persist in the key cache after the data has been deleted. Could we have the key cache automatically invalidated for data that is dropped?
--
This message was sent by Atlassian JIRA (v6.2#6252)
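The collision the reporter describes can be modeled simply. This is a hypothetical sketch (our own names, not Cassandra's implementation): a cache keyed on (keyspace, cf, sstable generation) returns a stale entry after the keyspace is dropped and re-created, because generation numbering restarts at 1 and nothing purged the old entries.

```python
# Toy model of a key cache keyed on (keyspace, cf, generation, row key).
key_cache = {}

def cache_put(ks, cf, generation, row_key, position):
    key_cache[(ks, cf, generation, row_key)] = position

def cache_get(ks, cf, generation, row_key):
    return key_cache.get((ks, cf, generation, row_key))

cache_put("demo", "users", 1, "alice", 4096)   # populated before the drop
# ... DROP KEYSPACE demo; CREATE KEYSPACE demo; (cache never invalidated) ...
# The new keyspace's first sstable is also generation 1, so a lookup hits
# the stale offset, which now points into a completely different file:
print(cache_get("demo", "users", 1, "alice"))  # -> 4096 (stale position)
```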
[jira] [Comment Edited] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997999#comment-13997999 ] Benedict edited comment on CASSANDRA-4718 at 5/14/14 9:14 PM:
--
Thanks [~enigmacurry]! Those graphs all look pretty good to me. Think it's time to run some of the longer tests to see that performance is still good for other workloads. Let's drop thrift from the equation now. I'd suggest something like:

write n=24 -key populate=1..24
force major compaction
for each thread count/branch: read n=10 -key dist=extr(1..24,2)

and warm up with one (any) read test run before the rest, so that they all are playing from a roughly level page cache point.

This should create a dataset in the region of 440Gb, but around 75% of requests will be to ~160Gb of it, which should be in the region of the amount of page cache available to the EC2 systems after bloom filters etc. are accounted for

NB: if you want to play with different distributions, cassandra-stress print lets you see what a spec would yield

was (Author: benedict):
Thanks [~enigmacurry]! Those graphs all look pretty good to me. Think it's time to run some of the longer tests to see that performance is still good for other workloads. Let's drop thrift from the equation now. I'd suggest something like:

write n=6 -key populate=1..6
force major compaction
for each thread count/branch: read n=1 -key dist=extr(1..6,2)

and warm up with one (any) read test run before the rest, so that they all are playing from a roughly level page cache point.

This should create a dataset in the region of 110Gb, but around 75% of requests will be to ~40Gb of it, which should be in the region of the amount of page cache available to the EC2 systems after bloom filters etc.
are accounted for NB: if you want to play with different distributions, cassandra-stress print lets you see what a spec would yield More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Benedict Priority: Minor Labels: performance Fix For: 2.1.0 Attachments: 4718-v1.patch, PerThreadQueue.java, aws.svg, aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, jason_write.svg, op costs of various queues.ods, stress op rate with various queues.ods, v1-stress.out Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.2#6252)
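The ticket's core idea, draining a batch of tasks per dequeue instead of one, can be sketched outside Java. This is our own illustrative model (Java's `BlockingQueue.drainTo` is approximated here with a small helper): the consumer pays one queue round-trip per batch rather than one per task.

```python
# Sketch of bulk-dequeue consumption: drain up to max_tasks items at once,
# mirroring the role BlockingQueue.drainTo(collection, int) plays in the
# proposal. Names (drain_to, buf) are ours, not Cassandra's.
import queue

def drain_to(q, buf, max_tasks):
    """Move up to max_tasks items from q into buf; return how many moved."""
    moved = 0
    while moved < max_tasks:
        try:
            buf.append(q.get_nowait())
        except queue.Empty:
            break
        moved += 1
    return moved

q = queue.Queue()
for i in range(10):
    q.put(i)

processed = []
buf = []
while drain_to(q, buf, 4):      # one dequeue round-trip per batch of 4,
    processed.extend(buf)       # instead of one per task
    buf.clear()

print(processed)  # -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The win in the real system comes from reduced producer/consumer contention on the queue's internal locks, which is exactly the overhead the batched drain amortizes.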
[jira] [Updated] (CASSANDRA-5663) Add write batching for the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-5663: Reviewer: T Jake Luciani Add write batching for the native protocol -- Key: CASSANDRA-5663 URL: https://issues.apache.org/jira/browse/CASSANDRA-5663 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benedict Labels: performance Fix For: 2.1 rc1 Attachments: 5663.txt As discussed in CASSANDRA-5422, adding write batching to the native protocol implementation is likely to improve throughput in a number of cases. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997809#comment-13997809 ] Ryan McGuire edited comment on CASSANDRA-4718 at 5/14/14 7:45 PM:
--
[~benedict] more short tests. Updated here as they complete:

EC2 c3.8xlarge, cql native:
* [810 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-810-cql3_native_prepared.json]
* [270 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-270-cql3_native_prepared.json]

bdplab, cql native:
* [810 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-810-cql3_native_prepared.json]
* [270 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-270-cql3_native_prepared.json]
* [90 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-90-cql3_native_prepared.json]

EC2 c3.8xlarge, thrift smart:
* [810 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-810-thrift_smart.json]
* [270 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-270-thrift_smart.json]

bdplab, thrift smart:
* [810 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-810-thrift_smart.json]
* [270 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-270-thrift_smart.json]
* [90 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-90-thrift_smart.json]

was (Author: enigmacurry): [~benedict] more short tests.
Updated here as they complete:

EC2 c3.8xlarge, cql native:
* [810 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-810-cql3_native_prepared.json]
* [270 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-270-cql3_native_prepared.json]

bdplab, cql native:
* [810 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-810-cql3_native_prepared.json]
* [270 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-270-cql3_native_prepared.json]
* [90 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-90-cql3_native_prepared.json]

EC2 c3.8xlarge, thrift smart:
* [810 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-810-thrift_smart.json]
* [270 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-270-thrift_smart.json]

bdplab, stress smart:
* [810 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-810-thrift_smart.json]
* [270 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-270-thrift_smart.json]
* [90 threads|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-90-thrift_smart.json]

More-efficient ExecutorService for improved throughput
--
Key: CASSANDRA-4718
URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
Project: Cassandra
Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Benedict
Priority: Minor
Labels: performance
Fix For: 2.1.0
Attachments: 4718-v1.patch, PerThreadQueue.java, aws.svg, aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png,
belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, jason_write.svg, op costs of various queues.ods, stress op rate with various queues.ods, v1-stress.out Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using
[jira] [Commented] (CASSANDRA-6134) Asynchronous batchlog replay
[ https://issues.apache.org/jira/browse/CASSANDRA-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998148#comment-13998148 ] Aleksey Yeschenko commented on CASSANDRA-6134:
--
Created CASSANDRA-7237 for #1.

Asynchronous batchlog replay
--
Key: CASSANDRA-6134
URL: https://issues.apache.org/jira/browse/CASSANDRA-6134
Project: Cassandra
Issue Type: Improvement
Reporter: Oleg Anastasyev
Assignee: Oleg Anastasyev
Priority: Minor
Fix For: 2.1 rc1
Attachments: 6134-async.txt, 6134-cleanup.txt, BatchlogManager.txt

As we discussed earlier in CASSANDRA-6079, this is the new BatchlogManager. It stores batch records in
{code}
CREATE TABLE batchlog (
  id_partition int,
  id timeuuid,
  data blob,
  PRIMARY KEY (id_partition, id)
) WITH COMPACT STORAGE AND CLUSTERING ORDER BY (id DESC)
{code}
where id_partition is the minute-since-epoch of the id uuid. So when it scans for batches to replay, it scans within a single partition for a slice of ids from the last processed date until now minus the write timeout. So no full batchlog CF scan, and no flood of random reads, is made on a normal cycle.

Other improvements:
1. It runs every 1/2 of the write timeout and replays all batches written within 0.9 * write timeout from now. This way we ensure that batched updates will be replayed by the moment the client times out on the coordinator.
2. It submits all mutations from a single batch in parallel (like StorageProxy does). The old implementation played them one-by-one, so a client could see half-applied batches in the CF for a long time (depending on the size of the batch).
3. It fixes a subtle race bug with incorrect hint TTL calculation.
--
This message was sent by Atlassian JIRA (v6.2#6252)
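The partitioning scheme above (bucketing batches by the minute-since-epoch of their timeuuid) can be sketched concretely. This is our own helper, under the assumption that the bucket is derived from the version-1 UUID's embedded timestamp; the constant converts from the UUID epoch (1582-10-15) to the Unix epoch.

```python
# Sketch of minute-since-epoch bucketing for timeuuids, so that replay can
# scan one small partition per minute instead of the whole batchlog CF.
import time
import uuid

# 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch)
UUID_EPOCH_OFFSET = 0x01B21DD213814000

def minute_partition(u):
    """Minute-since-epoch bucket for a version-1 (time-based) UUID."""
    unix_100ns = u.time - UUID_EPOCH_OFFSET   # u.time is 100-ns ticks since 1582
    seconds = unix_100ns // 10_000_000
    return int(seconds // 60)

u = uuid.uuid1()
p = minute_partition(u)
# The bucket should match the wall clock's current minute (within a minute):
print(abs(p - int(time.time() // 60)) <= 1)  # -> True
```

With this layout, a replay pass only needs a slice query inside the current few minute-partitions, which is what eliminates the full-CF scan the description mentions.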
[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998146#comment-13998146 ] Jonathan Ellis commented on CASSANDRA-4718:
---
Granted that a new ExecutorService won't help i/o bound workloads, but I knew that when I created the ticket, and "must be significantly better for all workloads" is an unrealistically high bar for optimization work. This gives us a pretty huge benefit on at least some workloads ([1|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.bdplab.may12.threads-810-cql3_native_prepared.json&metric=op_rate&operation=4_read&smoothing=1&xmin=0&xmax=141.13&ymin=0&ymax=238843], [2|http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.4718.ec2.may12.threads-810-cql3_native_prepared.json&metric=op_rate&operation=4_read&smoothing=1&xmin=0&xmax=134.31&ymin=0&ymax=340354.3]) and a smaller benefit on others, which I'm quite happy with. Unless the longer benchmarks Ryan is running show dramatically different results, I'm +1.

I also note that the work here is almost entirely self-contained, with the major exception being some new code in Message.Dispatcher. So while it's not as simple as dropping in LTQ or BAQ or FJP, the results are absolutely good enough to be worth a new Executor implementation.

More-efficient ExecutorService for improved throughput
--
Key: CASSANDRA-4718
URL: https://issues.apache.org/jira/browse/CASSANDRA-4718
Project: Cassandra
Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Benedict
Priority: Minor
Labels: performance
Fix For: 2.1.0
Attachments: 4718-v1.patch, PerThreadQueue.java, aws.svg, aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, jason_write.svg, op costs of various queues.ods, stress op rate with various queues.ods, v1-stress.out

Currently all our execution stages dequeue tasks one at a time.
This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.2#6252)
git commit: Make batchlog replay asynchronous
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 e7b3deee6 -> 92c38c0e6

Make batchlog replay asynchronous

patch by Oleg Anastasyev; reviewed by Aleksey Yeschenko for CASSANDRA-6134

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/92c38c0e
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/92c38c0e
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/92c38c0e
Branch: refs/heads/cassandra-2.1
Commit: 92c38c0e6a5e23bdb77c23073a28f118a9f23add
Parents: e7b3dee
Author: Aleksey Yeschenko alek...@apache.org
Authored: Thu May 15 01:13:09 2014 +0300
Committer: Aleksey Yeschenko alek...@apache.org
Committed: Thu May 15 01:13:09 2014 +0300
--
 CHANGES.txt                                  |   1 +
 .../apache/cassandra/db/BatchlogManager.java | 287 ---
 2 files changed, 188 insertions(+), 100 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/92c38c0e/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 3dd47a1..d43a0f5 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -12,6 +12,7 @@
  * Fix repair hang when given CF does not exist (CASSANDRA-7189)
  * Allow c* to be shutdown in an embedded mode (CASSANDRA-5635)
  * Add server side batching to native transport (CASSANDRA-5663)
+ * Make batchlog replay asynchronous (CASSANDRA-6134)
 Merged from 2.0:
  * (Hadoop) Close java driver Cluster in CQLRR.close (CASSANDRA-7228)
  * Warn when 'USING TIMESTAMP' is used on a CAS BATCH (CASSANDRA-7067)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/92c38c0e/src/java/org/apache/cassandra/db/BatchlogManager.java
--
diff --git a/src/java/org/apache/cassandra/db/BatchlogManager.java b/src/java/org/apache/cassandra/db/BatchlogManager.java
index 3ffc7a7..1a441f6 100644
--- a/src/java/org/apache/cassandra/db/BatchlogManager.java
+++ b/src/java/org/apache/cassandra/db/BatchlogManager.java
@@ -48,6 +48,8 @@
 import org.apache.cassandra.gms.FailureDetector;
 import org.apache.cassandra.io.sstable.Descriptor;
 import org.apache.cassandra.io.sstable.SSTableReader;
 import org.apache.cassandra.io.util.DataOutputBuffer;
+import org.apache.cassandra.net.MessageIn;
+import org.apache.cassandra.net.MessageOut;
 import org.apache.cassandra.net.MessagingService;
 import org.apache.cassandra.service.StorageProxy;
 import org.apache.cassandra.service.StorageService;
@@ -193,162 +195,247 @@ public class BatchlogManager implements BatchlogManagerMBean
         logger.debug("Finished replayAllFailedBatches");
     }

-    // returns the UUID of the last seen batch
+    private void deleteBatch(UUID id)
+    {
+        Mutation mutation = new Mutation(Keyspace.SYSTEM_KS, UUIDType.instance.decompose(id));
+        mutation.delete(SystemKeyspace.BATCHLOG_CF, FBUtilities.timestampMicros());
+        mutation.apply();
+    }
+
     private UUID processBatchlogPage(UntypedResultSet page, RateLimiter rateLimiter)
     {
         UUID id = null;
+        ArrayList<Batch> batches = new ArrayList<>(page.size());
+
+        // Sending out batches for replay without waiting for them, so that one stuck batch doesn't affect others
         for (UntypedResultSet.Row row : page)
         {
             id = row.getUUID("id");
             long writtenAt = row.getLong("written_at");
-            int version = row.has("version") ? row.getInt("version") : MessagingService.VERSION_12;
             // enough time for the actual write + batchlog entry mutation delivery (two separate requests).
             long timeout = DatabaseDescriptor.getWriteRpcTimeout() * 2; // enough time for the actual write + BM removal mutation
             if (System.currentTimeMillis() < writtenAt + timeout)
                 continue; // not ready to replay yet, might still get a deletion.
-            replayBatch(id, row.getBytes("data"), writtenAt, version, rateLimiter);
+
+            int version = row.has("version") ? row.getInt("version") : MessagingService.VERSION_12;
+            Batch batch = new Batch(id, writtenAt, row.getBytes("data"), version);
+            try
+            {
+                if (batch.replay(rateLimiter) > 0)
+                {
+                    batches.add(batch);
+                }
+                else
+                {
+                    deleteBatch(id); // no write mutations were sent (either expired or all CFs involved truncated).
+                    totalBatchesReplayed.incrementAndGet();
+                }
+            }
+            catch (IOException e)
+            {
+                logger.warn("Skipped batch replay of {} due to {}", id, e);
+                deleteBatch(id);
+            }
+        }
+
+        // now waiting
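The comment in the diff ("Sending out batches for replay without waiting for them, so that one stuck batch doesn't affect others") describes a fan-out-then-wait pattern. A minimal sketch in our own code, not the patch itself: all batch replays are submitted first and a single wait happens at the end, so one slow batch delays only that final wait instead of serializing everything behind it.

```python
# Sketch of asynchronous batch replay: submit everything, then wait once.
from concurrent.futures import ThreadPoolExecutor, wait
import time

def replay_batch(batch_id, delay):
    time.sleep(delay)          # stand-in for sending the batch's mutations
    return batch_id

with ThreadPoolExecutor(max_workers=8) as pool:
    # one batch is "stuck" (slow); the others are not held up behind it
    futures = [pool.submit(replay_batch, i, 0.5 if i == 0 else 0.01)
               for i in range(8)]
    wait(futures)              # single wait at the end, as in the patch

replayed = sorted(f.result() for f in futures)
print(replayed)  # -> [0, 1, 2, 3, 4, 5, 6, 7]
```

Under the old one-by-one scheme the total time is the sum of all batch latencies; under the fan-out scheme it approaches the maximum of them.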
[jira] [Created] (CASSANDRA-7238) Nodetool Status performance is much slower with VNodes On
Russell Alexander Spitzer created CASSANDRA-7238:

Summary: Nodetool Status performance is much slower with VNodes On
Key: CASSANDRA-7238
URL: https://issues.apache.org/jira/browse/CASSANDRA-7238
Project: Cassandra
Issue Type: Bug
Components: Tools
Environment: 1000 M1.Large Ubuntu 12.04
Reporter: Russell Alexander Spitzer
Priority: Minor
Fix For: 2.1 beta2

Nodetool status on a 1000-node cluster without vnodes returns in several seconds. With vnodes on (256) there are OOM errors with the default XMX of 32. Adjusting the XMX to 128 allows nodetool status to complete, but the execution takes roughly 10 minutes.

Tested:
{code}
XMX  | Status
32   | OOM
64   | OOM: GC Overhead
128  | Finishes in ~10 minutes
500  | Finishes in ~10 minutes
1000 | Finishes in ~10 minutes
{code}
--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7069) Prevent operator mistakes due to simultaneous bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13997941#comment-13997941 ] Robert Coli commented on CASSANDRA-7069:

{quote}We know from experience that telling people "don't do that" isn't good enough... what I'm proposing here is to either not allow it, or sleep long enough that it avoids any issues.{quote}
+1 this, a lot.

Prevent operator mistakes due to simultaneous bootstrap
---
Key: CASSANDRA-7069
URL: https://issues.apache.org/jira/browse/CASSANDRA-7069
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
Fix For: 3.0

Cassandra has always had the '2 minute rule' between beginning topology changes, to ensure the range announcement is known to all nodes before the next one begins. Trying to bootstrap a bunch of nodes simultaneously is a common mistake and seems to be on the rise as of late. We can prevent users from shooting themselves in the foot this way by looking for other joining nodes in the shadow round, then comparing their generation against our own, and if there isn't a large enough difference, bail out or sleep until it is large enough.
--
This message was sent by Atlassian JIRA (v6.2#6252)
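The safeguard described in the ticket can be sketched as a simple generation-gap check. This is a hypothetical model (function name, threshold handling, and the generations-as-timestamps assumption are ours): during the shadow round, compare each other joining node's gossip generation with our own, and if any join started too recently, sleep out the remaining difference instead of proceeding.

```python
# Sketch of the proposed "2 minute rule" enforcement before bootstrap.
MIN_GAP_SECONDS = 120  # the 2 minute rule

def seconds_to_wait(own_generation, other_joining_generations):
    """Return how long to sleep before proceeding with bootstrap (0 = go).
    Generations are modeled as per-node startup timestamps in seconds."""
    wait = 0
    for gen in other_joining_generations:
        gap = abs(own_generation - gen)
        if gap < MIN_GAP_SECONDS:
            # another join began too recently; wait out the remainder
            wait = max(wait, MIN_GAP_SECONDS - gap)
    return wait

now = 1_400_000_000
print(seconds_to_wait(now, []))           # -> 0   (no other joining nodes)
print(seconds_to_wait(now, [now - 30]))   # -> 90  (a join began 30s ago)
print(seconds_to_wait(now, [now - 300]))  # -> 0   (old enough to be safe)
```

The "bail out" variant would simply raise an error instead of returning a sleep duration; either way the operator can no longer start two bootstraps inside the window by accident.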
[2/2] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
	CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8dab582c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8dab582c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8dab582c
Branch: refs/heads/cassandra-2.1
Commit: 8dab582c2421b21f8d87a26f55941f1ee1f8b516
Parents: 3b299c4 d6f32e4
Author: Sylvain Lebresne sylv...@datastax.com
Authored: Thu May 8 18:05:13 2014 +0200
Committer: Sylvain Lebresne sylv...@datastax.com
Committed: Thu May 8 18:05:13 2014 +0200
--
 CHANGES.txt                                                    | 1 +
 src/java/org/apache/cassandra/db/marshal/AbstractType.java     | 3 +--
 .../apache/cassandra/serializers/InetAddressSerializer.java    | 2 +-
 .../org/apache/cassandra/serializers/IntegerSerializer.java    | 6 +++---
 src/java/org/apache/cassandra/serializers/LongSerializer.java  | 2 +-
 .../org/apache/cassandra/serializers/TimestampSerializer.java  | 2 +-
 src/java/org/apache/cassandra/serializers/UUIDSerializer.java  | 2 +-
 7 files changed, 9 insertions(+), 9 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8dab582c/CHANGES.txt
--
diff --cc CHANGES.txt
index 5afe800,a6cbc18..057fd34
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,20 -1,27 +1,21 @@@
-2.0.9
- * Warn when 'USING TIMESTAMP' is used on a CAS BATCH (CASSANDRA-7067)
- * Starting threads in OutboundTcpConnectionPool constructor causes race conditions (CASSANDRA-7177)
- * return all cpu values from BackgroundActivityMonitor.readAndCompute (CASSANDRA-7183)
- * fix c* launch issues on Russian os's due to output of linux 'free' cmd (CASSANDRA-6162)
- * Fix disabling autocompaction (CASSANDRA-7187)
- * Fix potential NumberFormatException when deserializing IntegerType (CASSANDRA-7088)
-
-2.0.8
+2.1.0-rc1
+ * Fix marking commitlogsegments clean (CASSANDRA-6959)
+ * Add snapshot manifest describing files included (CASSANDRA-6326)
+ * Parallel streaming for sstableloader (CASSANDRA-3668)
+ * Fix bugs in supercolumns handling (CASSANDRA-7138)
+ * Fix ClassCastException on composite dense tables (CASSANDRA-7112)
+ * Cleanup and optimize collation and slice iterators (CASSANDRA-7107)
+ * Upgrade NBHM lib (CASSANDRA-7128)
+ * Optimize netty server (CASSANDRA-6861)
+Merged from 2.0:
  * Correctly delete scheduled range xfers (CASSANDRA-7143)
  * Make batchlog replica selection rack-aware (CASSANDRA-6551)
- * Allow overriding cassandra-rackdc.properties file (CASSANDRA-7072)
- * Set JMX RMI port to 7199 (CASSANDRA-7087)
- * Use LOCAL_QUORUM for data reads at LOCAL_SERIAL (CASSANDRA-6939)
- * Log a warning for large batches (CASSANDRA-6487)
- * Queries on compact tables can return more rows that requested (CASSANDRA-7052)
- * USING TIMESTAMP for batches does not work (CASSANDRA-7053)
- * Fix performance regression from CASSANDRA-5614 (CASSANDRA-6949)
- * Merge groupable mutations in TriggerExecutor#execute() (CASSANDRA-7047)
- * Fix CFMetaData#getColumnDefinitionFromColumnName() (CASSANDRA-7074)
- * Plug holes in resource release when wiring up StreamSession (CASSANDRA-7073)
- * Re-add parameter columns to tracing session (CASSANDRA-6942)
- * Fix writetime/ttl functions for static columns (CASSANDRA-7081)
  * Suggest CTRL-C or semicolon after three blank lines in cqlsh (CASSANDRA-7142)
+ * return all cpu values from BackgroundActivityMonitor.readAndCompute (CASSANDRA-7183)
+ * reduce garbage creation in calculatePendingRanges (CASSANDRA-7191)
+ * fix c* launch issues on Russian os's due to output of linux 'free' cmd (CASSANDRA-6162)
+ * Fix disabling autocompaction (CASSANDRA-7187)
++ * Fix potential NumberFormatException when deserializing IntegerType (CASSANDRA-7088)
 Merged from 1.2:
  * Add Cloudstack snitch (CASSANDRA-7147)
  * Update system.peers correctly when relocating tokens (CASSANDRA-7126)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8dab582c/src/java/org/apache/cassandra/db/marshal/AbstractType.java
--
[01/10] git commit: cqlsh should return a non-zero error code if a query fails
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-1.2 d839350f4 -> 8abe9f6f5
  refs/heads/cassandra-2.0 d6f32e4fc -> 51f9e9804
  refs/heads/cassandra-2.1 8dab582c2 -> a0d096b03
  refs/heads/trunk 28497fd11 -> 310d6e4ef

cqlsh should return a non-zero error code if a query fails

patch by Branden Visser and Mikhail Stepura; reviewed by Mikhail Stepura for CASSANDRA-6344

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8abe9f6f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8abe9f6f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8abe9f6f
Branch: refs/heads/cassandra-1.2
Commit: 8abe9f6f522146dc478f006a8160b4db1c233169
Parents: d839350
Author: Mikhail Stepura mish...@apache.org
Authored: Thu May 8 13:20:35 2014 -0700
Committer: Mikhail Stepura mish...@apache.org
Committed: Thu May 8 13:27:41 2014 -0700
--
 CHANGES.txt | 1 +
 bin/cqlsh   | 4
 2 files changed, 5 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8abe9f6f/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 312cf06..7021e7b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -23,6 +23,7 @@
  * remove duplicate query for local tokens (CASSANDRA-7182)
  * raise streaming phi convict threshold level (CASSANDRA-7063)
  * reduce garbage creation in calculatePendingRanges (CASSANDRA-7191)
+ * exit CQLSH with error status code if script fails (CASSANDRA-6344)

 1.2.16
  * Add UNLOGGED, COUNTER options to BATCH documentation (CASSANDRA-6816)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8abe9f6f/bin/cqlsh
--
diff --git a/bin/cqlsh b/bin/cqlsh
index 8e1e0e2..24cb3b8 100755
--- a/bin/cqlsh
+++ b/bin/cqlsh
@@ -548,6 +548,7 @@ class Shell(cmd.Cmd):
         self.show_line_nums = True
         self.stdin = stdin
         self.query_out = sys.stdout
+        self.statement_error = False

     def set_expanded_cql_version(self, ver):
         ver, vertuple = full_cql_version(ver)
@@ -2175,6 +2176,7 @@ class Shell(cmd.Cmd):
         self.query_out.flush()

     def printerr(self, text, color=RED, newline=True, shownum=None):
+        self.statement_error = True
         if shownum is None:
             shownum = self.show_line_nums
         if shownum:
@@ -2404,6 +2406,8 @@ def main(options, hostname, port):
         shell.cmdloop()
     save_history()
+    if options.file and shell.statement_error:
+        sys.exit(2)

 if __name__ == '__main__':
     main(*read_options(sys.argv[1:], os.environ))
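The mechanism in the patch is small enough to model end to end. This is our own minimal reconstruction (class and method names mirror the diff; the script-running logic is invented for illustration): any call to `printerr` flips a flag, and a scripted run exits with status 2 if the flag was set.

```python
# Minimal model of the CASSANDRA-6344 fix: remember whether any statement
# errored, and report a non-zero status at the end of a scripted (-f) run.
import sys

class Shell:
    def __init__(self):
        self.statement_error = False

    def printerr(self, text):
        self.statement_error = True          # any error message flips the flag
        print(text, file=sys.stderr)

    def run_script(self, statements):
        # stand-in for cqlsh's statement loop; "BAD" marks a failing statement
        for stmt in statements:
            if stmt.startswith("BAD"):
                self.printerr("error: " + stmt)

shell = Shell()
shell.run_script(["SELECT 1;", "BAD SYNTAX;"])
exit_code = 2 if shell.statement_error else 0  # what main() does for -f runs
print(exit_code)  # -> 2
```

This is the change that makes `cqlsh -f script.cql` usable in shell pipelines and CI, where the caller branches on the exit status.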
[1/6] git commit: Support authentication in CqlRecordReader
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 b927f790a - 569177fb9 refs/heads/cassandra-2.1 92c38c0e6 - 6be62c2c4 refs/heads/trunk eea5c3748 - 072283739 Support authentication in CqlRecordReader Patch by Jacek Lewandowski, reviewed by Alex Liu for CASSANDRA-7221 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/569177fb Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/569177fb Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/569177fb Branch: refs/heads/cassandra-2.0 Commit: 569177fb9e7c2b7935ff2e7f8b7c0b10806b8f50 Parents: b927f79 Author: Brandon Williams brandonwilli...@apache.org Authored: Wed May 14 17:35:38 2014 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed May 14 17:35:38 2014 -0500 -- CHANGES.txt | 1 + .../cassandra/hadoop/cql3/CqlConfigHelper.java | 32 ++-- 2 files changed, 30 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/569177fb/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9a43040..285efd1 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.9 + * (Hadoop) support authentication in CqlRecordReader (CASSANDRA-7221) * (Hadoop) Close java driver Cluster in CQLRR.close (CASSANDRA-7228) * Fix potential SlabAllocator yield-starvation (CASSANDRA-7133) * Warn when 'USING TIMESTAMP' is used on a CAS BATCH (CASSANDRA-7067) http://git-wip-us.apache.org/repos/asf/cassandra/blob/569177fb/src/java/org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java -- diff --git a/src/java/org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java b/src/java/org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java index a4a9c44..a2cf1e7 100644 --- a/src/java/org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java +++ b/src/java/org/apache/cassandra/hadoop/cql3/CqlConfigHelper.java @@ -40,9 +40,11 @@ import javax.net.ssl.TrustManagerFactory; import 
org.apache.cassandra.hadoop.ConfigHelper; import org.apache.cassandra.io.util.FileUtils; +import org.apache.commons.lang3.StringUtils; import org.apache.hadoop.conf.Configuration; import com.datastax.driver.core.AuthProvider; +import com.datastax.driver.core.PlainTextAuthProvider; import com.datastax.driver.core.Cluster; import com.datastax.driver.core.Host; import com.datastax.driver.core.HostDistance; @@ -64,6 +66,9 @@ public class CqlConfigHelper private static final String INPUT_CQL_WHERE_CLAUSE_CONFIG = "cassandra.input.where.clause"; private static final String INPUT_CQL = "cassandra.input.cql"; +private static final String USERNAME = "cassandra.username"; +private static final String PASSWORD = "cassandra.password"; + private static final String INPUT_NATIVE_PORT = "cassandra.input.native.port"; private static final String INPUT_NATIVE_CORE_CONNECTIONS_PER_HOST = "cassandra.input.native.core.connections.per.host"; private static final String INPUT_NATIVE_MAX_CONNECTIONS_PER_HOST = "cassandra.input.native.max.connections.per.host"; @@ -152,6 +157,16 @@ public class CqlConfigHelper conf.set(INPUT_CQL, cql); } +public static void setUserNameAndPassword(Configuration conf, String username, String password) +{ +if (StringUtils.isNotBlank(username)) +{ +conf.set(INPUT_NATIVE_AUTH_PROVIDER, PlainTextAuthProvider.class.getName()); +conf.set(USERNAME, username); +conf.set(PASSWORD, password); +} +} + public static Optional<Integer> getInputCoreConnections(Configuration conf) { return getIntSetting(INPUT_NATIVE_CORE_CONNECTIONS_PER_HOST, conf); @@ -547,7 +562,7 @@ public class CqlConfigHelper if (!authProvider.isPresent()) return Optional.absent(); -return Optional.of(getClientAuthProvider(authProvider.get())); +return Optional.of(getClientAuthProvider(authProvider.get(), conf)); } private static Optional<SSLOptions> getSSLOptions(Configuration conf) @@ -602,11 +617,22 @@ public class CqlConfigHelper return Optional.of(setting); } -private static AuthProvider getClientAuthProvider(String factoryClassName) +private static AuthProvider getClientAuthProvider(String factoryClassName, Configuration conf) { try { -return (AuthProvider) Class.forName(factoryClassName).newInstance(); +Class<?> c = Class.forName(factoryClassName); +if (PlainTextAuthProvider.class.equals(c)) +{ +String username = getStringSetting(USERNAME, conf).or(""); +
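The hunk above only wires up PlainTextAuthProvider when a non-blank username is supplied. A minimal standalone sketch of that guard, using a plain Map in place of Hadoop's Configuration so it compiles on its own; the auth-provider key name and the class name string are assumptions for illustration, not copied from the patch:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the patch's guard logic with a Map standing in for Configuration.
// The USERNAME/PASSWORD key names match the patch; AUTH_PROVIDER is assumed.
class AuthConfigSketch {
    static final String AUTH_PROVIDER = "cassandra.input.native.auth.provider"; // assumed key
    static final String USERNAME = "cassandra.username";
    static final String PASSWORD = "cassandra.password";

    static void setUserNameAndPassword(Map<String, String> conf, String user, String pass) {
        // only configure plain-text auth when a non-blank username was given
        if (user != null && !user.trim().isEmpty()) {
            conf.put(AUTH_PROVIDER, "com.datastax.driver.core.PlainTextAuthProvider");
            conf.put(USERNAME, user);
            conf.put(PASSWORD, pass);
        }
    }
}
```

A blank or null username leaves the configuration untouched, so unauthenticated clusters keep working unchanged.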
[4/4] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/67ed3375 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/67ed3375 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/67ed3375 Branch: refs/heads/trunk Commit: 67ed3375b26286a1c767644283374e784df4563f Parents: e48a00b 773b95e Author: Dave Brosius dbros...@mebigfatguy.com Authored: Wed May 7 21:23:38 2014 -0400 Committer: Dave Brosius dbros...@mebigfatguy.com Committed: Wed May 7 21:23:38 2014 -0400 -- CHANGES.txt| 3 ++- .../cassandra/service/PendingRangeCalculatorService.java | 6 +++--- 2 files changed, 5 insertions(+), 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/67ed3375/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/67ed3375/src/java/org/apache/cassandra/service/PendingRangeCalculatorService.java --
[2/3] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1b1f0b07 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1b1f0b07 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1b1f0b07 Branch: refs/heads/trunk Commit: 1b1f0b0790cf4f36fddd6815634261ace6cb588e Parents: 773b95e 16fd1a4 Author: Dave Brosius dbros...@mebigfatguy.com Authored: Thu May 8 00:39:13 2014 -0400 Committer: Dave Brosius dbros...@mebigfatguy.com Committed: Thu May 8 00:39:13 2014 -0400 -- CHANGES.txt | 3 ++- conf/cassandra-env.sh | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1f0b07/CHANGES.txt -- diff --cc CHANGES.txt index 4a0548a,05cc193..f57b649 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,17 -1,25 +1,18 @@@ -2.0.9 - * Warn when 'USING TIMESTAMP' is used on a CAS BATCH (CASSANDRA-7067) - * Starting threads in OutboundTcpConnectionPool constructor causes race conditions (CASSANDRA-7177) - * return all cpu values from BackgroundActivityMonitor.readAndCompute (CASSANDRA-7183) - * fix c* launch issues on Russian os's due to output of linux 'free' cmd (CASSANDRA-6162) - -2.0.8 +2.1.0-rc1 + * Add snapshot manifest describing files included (CASSANDRA-6326) + * Parallel streaming for sstableloader (CASSANDRA-3668) + * Fix bugs in supercolumns handling (CASSANDRA-7138) + * Fix ClassClassException on composite dense tables (CASSANDRA-7112) + * Cleanup and optimize collation and slice iterators (CASSANDRA-7107) + * Upgrade NBHM lib (CASSANDRA-7128) + * Optimize netty server (CASSANDRA-6861) +Merged from 2.0: * Correctly delete scheduled range xfers (CASSANDRA-7143) * Make batchlog replica selection rack-aware (CASSANDRA-6551) - * Allow overriding cassandra-rackdc.properties file (CASSANDRA-7072) - * Set JMX RMI port to 7199 (CASSANDRA-7087) - * Use LOCAL_QUORUM for data reads at LOCAL_SERIAL 
(CASSANDRA-6939) - * Log a warning for large batches (CASSANDRA-6487) - * Queries on compact tables can return more rows that requested (CASSANDRA-7052) - * USING TIMESTAMP for batches does not work (CASSANDRA-7053) - * Fix performance regression from CASSANDRA-5614 (CASSANDRA-6949) - * Merge groupable mutations in TriggerExecutor#execute() (CASSANDRA-7047) - * Fix CFMetaData#getColumnDefinitionFromColumnName() (CASSANDRA-7074) - * Plug holes in resource release when wiring up StreamSession (CASSANDRA-7073) - * Re-add parameter columns to tracing session (CASSANDRA-6942) - * Fix writetime/ttl functions for static columns (CASSANDRA-7081) * Suggest CTRL-C or semicolon after three blank lines in cqlsh (CASSANDRA-7142) - * return all cpu values from BackgroundActivityMonitor.readAndCompute (CASSANDRA-7183) ++ * return all cpu values from BackgroundActivityMonitor.readAndCompute (CASSANDRA-7183) + * reduce garbage creation in calculatePendingRanges (CASSANDRA-7191) ++ * fix c* launch issues on Russian os's due to output of linux 'free' cmd (CASSANDRA-6162) Merged from 1.2: * Add Cloudstack snitch (CASSANDRA-7147) * Update system.peers correctly when relocating tokens (CASSANDRA-7126) http://git-wip-us.apache.org/repos/asf/cassandra/blob/1b1f0b07/conf/cassandra-env.sh --
git commit: Fix marking commitlog segments clean patch by bes; reviewed by jbellis for CASSANDRA-6959
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 af802014d -> 7da562053 Fix marking commitlog segments clean patch by bes; reviewed by jbellis for CASSANDRA-6959 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7da56205 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7da56205 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7da56205 Branch: refs/heads/cassandra-2.1 Commit: 7da562053fe729adb41061e52bfda17837f77d62 Parents: af80201 Author: Jonathan Ellis jbel...@apache.org Authored: Thu May 8 10:51:36 2014 -0500 Committer: Jonathan Ellis jbel...@apache.org Committed: Thu May 8 10:51:59 2014 -0500 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7da56205/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 714a475..5afe800 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.1.0-rc1 + * Fix marking commitlog segments clean (CASSANDRA-6959) * Add snapshot manifest describing files included (CASSANDRA-6326) * Parallel streaming for sstableloader (CASSANDRA-3668) * Fix bugs in supercolumns handling (CASSANDRA-7138) http://git-wip-us.apache.org/repos/asf/cassandra/blob/7da56205/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java -- diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java index e5c9b3e..3830966 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java @@ -469,7 +469,7 @@ public class CommitLogSegment UUID cfId = clean.getKey(); AtomicInteger cleanPos = clean.getValue(); AtomicInteger dirtyPos = cfDirty.get(cfId); -if (dirtyPos != null && dirtyPos.intValue() < cleanPos.intValue()) +if (dirtyPos != null && dirtyPos.intValue() <= cleanPos.intValue()) { cfDirty.remove(cfId); iter.remove(); @@ -482,9 +482,9 @@ */ public synchronized Collection<UUID> getDirtyCFIDs() { -removeCleanFromDirty(); if (cfClean.isEmpty() || cfDirty.isEmpty()) return cfDirty.keySet(); + List<UUID> r = new ArrayList<>(cfDirty.size()); for (Map.Entry<UUID, AtomicInteger> dirty : cfDirty.entrySet()) {
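The one-character change in the hunk above (`<` to `<=`) is easier to see in isolation: a table is fully clean once the clean marker has reached or passed its last dirty position, so the equal case must also unpin the segment. A toy model of that comparison, not the real CommitLogSegment:

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the CASSANDRA-6959 comparison. With the old strict <,
// a table whose clean marker exactly equals its dirty position was
// never removed, leaving the segment pinned as dirty forever.
class SegmentMarkers {
    final Map<UUID, AtomicInteger> cfDirty = new HashMap<>();
    final Map<UUID, AtomicInteger> cfClean = new HashMap<>();

    void removeCleanFromDirty() {
        Iterator<Map.Entry<UUID, AtomicInteger>> iter = cfClean.entrySet().iterator();
        while (iter.hasNext()) {
            Map.Entry<UUID, AtomicInteger> clean = iter.next();
            AtomicInteger dirtyPos = cfDirty.get(clean.getKey());
            // <= rather than <: equal positions mean everything dirty is clean
            if (dirtyPos != null && dirtyPos.intValue() <= clean.getValue().intValue()) {
                cfDirty.remove(clean.getKey());
                iter.remove();
            }
        }
    }
}
```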
[jira] [Commented] (CASSANDRA-6065) Use CQL3 internally in schema code and HHOM
[ https://issues.apache.org/jira/browse/CASSANDRA-6065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999125#comment-13999125 ] Aleksey Yeschenko commented on CASSANDRA-6065: -- For the reasons listed in https://issues.apache.org/jira/browse/CASSANDRA-6975?focusedCommentId=13999108page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13999108, I propose to re-purpose this issue and make it about 2.1 HHOM conversion of 1.2-serialised mutations on startup to 2.1-encoded ones. Ideally - just once in the 2.1 node's lifetime. Use CQL3 internally in schema code and HHOM --- Key: CASSANDRA-6065 URL: https://issues.apache.org/jira/browse/CASSANDRA-6065 Project: Cassandra Issue Type: Improvement Reporter: Aleksey Yeschenko Assignee: Mikhail Stepura Priority: Minor Fix For: 3.0 We mostly use CQL3 internally everywhere now, except HHOM and schema-related code. We should switch to CQL3+the new paging for HHOM to replace the current ugliness and to CQL3 for all schema-related serialization and deserialization. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7245) Out-of-Order keys with stress + CQL3
[ https://issues.apache.org/jira/browse/CASSANDRA-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999536#comment-13999536 ] T Jake Luciani commented on CASSANDRA-7245: --- Well an easy check would be roll back 6861 locally and see if the issue goes away. Out-of-Order keys with stress + CQL3 Key: CASSANDRA-7245 URL: https://issues.apache.org/jira/browse/CASSANDRA-7245 Project: Cassandra Issue Type: Bug Components: Core Reporter: Pavel Yaskevich We have been generating data (stress with CQL3 prepared) for CASSANDRA-4718 and found following problem almost in every SSTable generated (~200 GB of data and 821 SSTables). We set up they keys to be 10 bytes in size (default) and population between 1 and 6. Once I ran 'sstablekeys' on the generated SSTable files I got following exceptions: _There is a problem with sorting of normal looking keys:_ 30303039443538353645 30303039443745364242 java.io.IOException: Key out of order! DecoratedKey(-217680888487824985, *30303039443745364242*) DecoratedKey(-1767746583617597213, *30303039443437454333*) 0a30303033343933 3734441388343933 java.io.IOException: Key out of order! DecoratedKey(5440473860101999581, *3734441388343933*) DecoratedKey(-7565486415339257200, *30303033344639443137*) 30303033354244363031 30303033354133423742 java.io.IOException: Key out of order! DecoratedKey(2687072396429900180, *30303033354133423742*) DecoratedKey(-7838239767410066684, *30303033354145344534*) 30303034313442354137 3034313635363334 java.io.IOException: Key out of order! DecoratedKey(1516003874415400462, *3034313635363334*) DecoratedKey(-9106177395653818217, *3030303431444238*) 30303035373044373435 30303035373044334631 java.io.IOException: Key out of order! DecoratedKey(-3645715702154616540, *30303035373044334631*) DecoratedKey(-4296696226469000945, *30303035373132364138*) _And completely different ones:_ 30303041333745373543 7cd045c59a90d7587d8d java.io.IOException: Key out of order! 
DecoratedKey(-3595402345023230196, *7cd045c59a90d7587d8d*) DecoratedKey(-5146766422778260690, *30303041333943303232*) 3030303332314144 30303033323346343932 java.io.IOException: Key out of order! DecoratedKey(7071845511166615635, *30303033323346343932*) DecoratedKey(5233296131921119414, *53d83e0012287e03*) 30303034314531374431 3806734b256c27e41ec2 java.io.IOException: Key out of order! DecoratedKey(-7720474642702543193, *3806734b256c27e41ec2*) DecoratedKey(-8072288379146044663, *30303034314136413343*) _And sometimes there is no problem at all:_ 30303033353144463637 002a31b3b31a1c2f 5d616dd38211ebb5d6ec 444236451388 1388138844463744 30303033353143394343 It's worth to mention that we have got 22 timeout exceptions but number of out-of-order keys is much larger than that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7222) cqlsh does not wait for tracing to complete before printing
[ https://issues.apache.org/jira/browse/CASSANDRA-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999587#comment-13999587 ] Mikhail Stepura commented on CASSANDRA-7222: We can't fully rely on {{duration}}, see CASSANDRA-6317, but I'll take a look at what can be improved. cqlsh does not wait for tracing to complete before printing --- Key: CASSANDRA-7222 URL: https://issues.apache.org/jira/browse/CASSANDRA-7222 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Tyler Hobbs Assignee: Mikhail Stepura Fix For: 2.0.9 cqlsh currently sleeps 0.5 seconds before fetching trace info. Instead, it should fetch the {{system_traces.sessions}} row in a loop until the {{duration}} column is set (maybe with a total timeout of 30s). After the {{duration}} column is set, it's safe to fetch the rows from {{system_traces.events}}. This is already fixed in 2.1 by the switch to the python driver, we just need to fix 2.0. -- This message was sent by Atlassian JIRA (v6.2#6252)
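The proposed poll-until-duration-is-set behaviour can be sketched generically. The Supplier below stands in for a query like "SELECT duration FROM system_traces.sessions WHERE session_id = ?" and is purely illustrative; a null result means the trace has not finished yet:

```java
import java.util.function.Supplier;

// Sketch of the proposed fix: poll until the trace's duration column is
// non-null, with an overall timeout, instead of a fixed 0.5s sleep.
class TracePoller {
    static boolean waitForTrace(Supplier<Long> fetchDuration, long timeoutMs, long intervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (fetchDuration.get() != null)
                return true;          // trace complete; safe to read events
            Thread.sleep(intervalMs); // not done yet; back off and retry
        }
        return false;                 // timed out; caller prints what exists
    }
}
```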
[jira] [Comment Edited] (CASSANDRA-7247) Provide top ten most frequent keys per column family
[ https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999621#comment-13999621 ] Chris Lohfink edited comment on CASSANDRA-7247 at 5/16/14 5:54 AM: --- Problem is StreamSummary is not thread safe. There is a ConcurrentStreamSummary, which I found in this implementation to be ~5x slower than a synchronized block around the offer of the non-thread-safe one. Concurrent did perform similarly when also wrapped in a synchronized block, which I will show below, but because it would lose any benefit of being a concurrent implementation when access is serialized, I think the faster impl is best. Done on 2013 retina MBP with 500gb ssd against trunk:
{code:title=No Changes}
id, ops ,op/s, key/s,mean, med, .95, .99,.999, max, time, stderr
4 threadCount, 634450, 21692, 21692, 0.2, 0.2, 0.2, 0.2, 0.4, 740.1, 29.2, 0.01188
8 threadCount, 886600, 29762, 29762, 0.3, 0.2, 0.3, 0.4, 1.3, 1007.3, 29.8, 0.01220
16 threadCount, 912050, 29035, 29035, 0.5, 0.3, 0.9, 2.5,11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250 , 32681, 32681, 0.7, 0.5, 1.0, 2.9,13.5, 1126.5, 31.3, 0.00923
36 threadCount, 946550, 30900, 30900, 1.2, 0.8, 1.4, 3.0,22.5, 1369.2, 30.6, 0.01089
{code}
{code:title=With Patch}
id, ops ,op/s, key/s,mean, med, .95, .99,.999, max, time, stderr
4 threadCount, 643900, 21700, 21700, 0.2, 0.2, 0.2, 0.2, 0.9, 941.1, 29.7, 0.01079
8 threadCount, 942100, 32300, 32300, 0.2, 0.2, 0.3, 0.3, 1.2, 849.5, 29.2, 0.01519
16 threadCount, 907400, 30650, 30650, 0.5, 0.3, 0.8, 1.9,10.7, 1124.0, 29.6, 0.01112
24 threadCount, 1026150 , 31753, 31753, 0.7, 0.5, 0.9, 3.3,20.6, 1299.0, 32.3, 0.01295
36 threadCount, 980600, 30077, 30077, 1.2, 0.8, 1.3, 2.7,24.9, 1394.3, 32.6, 0.01747
{code}
{code:title=ConcurrentStreamSummary with sync}
4 threadCount, 494350, 16643, 16643, 0.2, 0.2, 0.3, 0.3, 1.0, 943.6, 29.7, 0.01286
8 threadCount, 812950, 26358, 26358, 0.3, 0.2, 0.3, 0.5, 1.4, 1488.9, 30.8, 0.01909
16 threadCount, 877500, 27396, 27396, 0.6, 0.3, 1.0, 2.2,12.1, 1299.2, 32.0, 0.01824
24 threadCount, 837550, 25345, 25345, 0.9, 0.4, 1.2, 3.7,84.2, 2123.6, 33.0, 0.02437
36 threadCount, 910200, 28008, 28008, 1.3, 0.6, 2.8, 9.2,32.2, 1212.8, 32.5, 0.01654
{code}
{code:title=ConcurrentStreamSummary no blocking}
id, ops ,op/s, key/s,mean, med, .95, .99,.999, max, time, stderr
4 threadCount, 183600,6145,6145, 0.6, 0.6, 0.8, 1.0, 2.6, 354.5, 29.9, 0.01063
8 threadCount, 197200,6593,6593, 1.2, 1.1, 1.4, 1.8, 3.3, 413.5, 29.9, 0.00716
16 threadCount, 203200,6794,6794, 2.3, 2.2, 2.6, 3.5,12.1, 649.1, 29.9, 0.01096
24 threadCount, 198000,6615,6615, 3.6, 3.3, 4.2, 4.9,44.2, 570.4, 29.9, 0.00894
36 threadCount, 199800,6627,6627, 5.4, 4.9, 6.5, 8.0, 110.8, 272.3, 30.1, 0.01452
{code}
was (Author: cnlwsu): Problem is StreamSummary is not thread safe. There is a ConcurrentStreamSummary, which I found in this implementation to be ~5x slower than a synchronized block around the offer of the non-thread-safe one. Concurrent did perform similarly when also wrapped in a synchronized block, which I will show below, but because it would lose any benefit of being a concurrent implementation when access is serialized, I think the faster impl is best. Done on 2013 retina MBP with 500gb ssd against trunk:
{code:title=No Changes}
id, ops ,op/s, key/s,mean, med, .95, .99,.999, max, time, stderr
4 threadCount, 634450, 21692, 21692, 0.2, 0.2, 0.2, 0.2, 0.4, 740.1, 29.2, 0.01188
8 threadCount, 886600, 29762, 29762, 0.3, 0.2, 0.3, 0.4, 1.3, 1007.3, 29.8, 0.01220
16 threadCount, 912050, 29035, 29035, 0.5, 0.3, 0.9, 2.5,11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250 , 32681, 32681, 0.7, 0.5, 1.0, 2.9,13.5, 1126.5, 31.3, 0.00923
36 threadCount, 946550, 30900, 30900, 1.2, 0.8, 1.4, 3.0,22.5, 1369.2, 30.6, 0.01089
{code}
{code:title=With Patch}
[3/6] git commit: remove duplicate queries for local tokens
remove duplicate queries for local tokens patch by dbrosius reviewed by ayeschenko for cassandra-7182 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0132e546 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0132e546 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0132e546 Branch: refs/heads/trunk Commit: 0132e546b55b67f68fca230c9e0ca1ccef6aa273 Parents: c7e472e Author: Dave Brosius dbros...@mebigfatguy.com Authored: Wed May 7 01:34:02 2014 -0400 Committer: Dave Brosius dbros...@mebigfatguy.com Committed: Wed May 7 01:34:02 2014 -0400 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/service/StorageService.java | 5 +++-- 2 files changed, 4 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0132e546/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 8c1d234..d7b7f00 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -20,6 +20,7 @@ * fix npe when doing -Dcassandra.fd_initial_value_ms (CASSANDRA-6751) * Preserves CQL metadata when updating table from thrift (CASSANDRA-6831) * fix time conversion to milliseconds in SimpleCondition.await (CASSANDRA-7149) + * remove duplicate query for local tokens (CASSANDRA-7182) 1.2.16 http://git-wip-us.apache.org/repos/asf/cassandra/blob/0132e546/src/java/org/apache/cassandra/service/StorageService.java -- diff --git a/src/java/org/apache/cassandra/service/StorageService.java b/src/java/org/apache/cassandra/service/StorageService.java index ed6d031..7cecec9 100644 --- a/src/java/org/apache/cassandra/service/StorageService.java +++ b/src/java/org/apache/cassandra/service/StorageService.java @@ -209,8 +209,9 @@ public class StorageService extends NotificationBroadcasterSupport implements IE SystemTable.updateTokens(tokens); tokenMetadata.updateNormalTokens(tokens, FBUtilities.getBroadcastAddress()); // order is important here, the gossiper can fire in between adding these two states. 
It's ok to send TOKENS without STATUS, but *not* vice versa. -Gossiper.instance.addLocalApplicationState(ApplicationState.TOKENS, valueFactory.tokens(getLocalTokens())); -Gossiper.instance.addLocalApplicationState(ApplicationState.STATUS, valueFactory.normal(getLocalTokens())); +Collection<Token> localTokens = getLocalTokens(); +Gossiper.instance.addLocalApplicationState(ApplicationState.TOKENS, valueFactory.tokens(localTokens)); +Gossiper.instance.addLocalApplicationState(ApplicationState.STATUS, valueFactory.normal(localTokens)); setMode(Mode.NORMAL, false); }
[jira] [Comment Edited] (CASSANDRA-7199) [dtest] snapshot_test hung on 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993570#comment-13993570 ] Michael Shuler edited comment on CASSANDRA-7199 at 5/9/14 2:13 PM: --- http://cassci.datastax.com/job/scratch-2.1_dtest/1/ 0b26c778bcdbef83cb5e0480a8c7b38d58d2aec6 http://cassci.datastax.com/job/scratch-2.1_dtest/2/ b2dd6a7f6c5d3751df9483a557f7ec8d54901e4b (those commits are not the cause of the error, just when the jobs ran) was (Author: mshuler): happened on Revision: 0b26c778bcdbef83cb5e0480a8c7b38d58d2aec6 and b2dd6a7f6c5d3751df9483a557f7ec8d54901e4b [dtest] snapshot_test hung on 2.1 - Key: CASSANDRA-7199 URL: https://issues.apache.org/jira/browse/CASSANDRA-7199 Project: Cassandra Issue Type: Test Components: Tests Reporter: Michael Shuler Assignee: Benedict Priority: Minor Labels: qa-resolved Fix For: 2.1 rc1 Attachments: 7199.txt, jenkins-scratch-2.1_dtest-failed-snapshot_test-dtestdir.tar.gz Test hung twice on 2.1 in the same manner while trying a new ccm branch as a scratch jenkins job {noformat} 11:57:44 dont_test_archive_commitlog (snapshot_test.TestArchiveCommitlog) ... Requested creating snapshot(s) for [ks] with snapshot name [basic] 11:58:03 Snapshot directory: basic 11:58:41 Established connection to initial hosts 11:58:41 Opening sstables and calculating sections to stream 11:58:41 Streaming relevant part of /tmp/tmpgTsloD/ks/cf/ks-cf-ka-1-Data.db to [/127.0.0.1] 11:58:41 progress: [/127.0.0.1]0:1/1 100% total: 100% 0 MB/s(avg: 0 MB/s) progress: [/127.0.0.1]0:1/1 100% total: 100% 0 MB/s(avg: 0 MB/s) 11:58:42 Summary statistics: 11:58:42Connections per host: : 1 11:58:42Total files transferred: : 1 11:58:42Total bytes transferred: : 527659 11:58:42Total duration (ms): : 2384 11:58:42Average transfer rate (MB/s): : 0 11:58:42Peak transfer rate (MB/s):: 0 11:58:42 11:58:59 ok 11:58:59 test_archive_commitlog (snapshot_test.TestArchiveCommitlog) ... 
rm: cannot remove `/tmp/tmp6c0qNr/*': No such file or directory 12:14:15 Build timed out (after 15 minutes). Marking the build as aborted. {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7226) get code coverage working again (cobertura or other)
[ https://issues.apache.org/jira/browse/CASSANDRA-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russ Hatch updated CASSANDRA-7226: -- Attachment: trunk-7226.txt Attaching patch to add jacoco for build tests (functional tests aren't sorted out yet but should be possible). To try it out, get the latest stuff:

  ant maven-ant-tasks-retrieve-build

Run the unit tests:

  ant jacoco-run

Run more tests if you want. The execution info is appended by default, so you can run many tasks for a combined report. If you don't want the unit tests, provide an ant task name to invoke. As long as the task runs through the testmacro, the jacoco java agent will be included and coverage data will be appended:

  ant jacoco-run -Dtaskname=pig-tests

Run a coverage report on the test(s) you have run:

  ant jacoco-report

To run tests and the report in one shot:

  ant codecoverage

or:

  ant codecoverage -Dtaskname=pig-tests

To view the report, open up build/jacoco/index.html in your browser. When you are done with the coverage data and report, run:

  ant jacoco-cleanup

get code coverage working again (cobertura or other) Key: CASSANDRA-7226 URL: https://issues.apache.org/jira/browse/CASSANDRA-7226 Project: Cassandra Issue Type: Test Reporter: Russ Hatch Assignee: Russ Hatch Attachments: coverage.png, trunk-7226.txt We need to sort out code coverage again, for unit and cassandra-dtest tests. Preferably the same tool for both. Seems like cobertura project activity has dwindled. Jacoco might be a viable alternative to cobertura. Jacoco can instrument running bytecode so I think it could also work for dtests (does require an agent, not sure if that's a problem yet). If using an agent is problematic, looks like it can also work with offline bytecode, though I don't see how that could benefit dtests. Project seems pretty active, with a release just last week. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7244) Don't allocate a Codec.Flag enum value array on every read
Benedict created CASSANDRA-7244: --- Summary: Don't allocate a Codec.Flag enum value array on every read Key: CASSANDRA-7244 URL: https://issues.apache.org/jira/browse/CASSANDRA-7244 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 2.1 rc1 Attachments: 7244.txt In QueryOptions.Codec.Flag.deserialize() we call Flag.values(), which constructs a copy of the enum array each time. Since we only use this to look up the enums, and since it happens _often_, this is pretty wasteful. It seems to make a few % difference to throughput. -- This message was sent by Atlassian JIRA (v6.2#6252)
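The usual fix for this pattern is to cache the values() array once at class-init time, since Enum.values() clones its backing array on every call. A minimal sketch; the flag names below are illustrative, not the actual Codec.Flag constants:

```java
// Sketch of the CASSANDRA-7244 fix pattern: cache the values() array once.
enum Flag {
    SKIP_METADATA, PAGE_SIZE, PAGING_STATE; // illustrative names only

    // one shared copy, allocated exactly once when the enum class loads
    private static final Flag[] ALL_VALUES = Flag.values();

    static Flag fromOrdinal(int ordinal) {
        return ALL_VALUES[ordinal]; // no per-call array allocation
    }
}
```

On a hot path like message deserialization this removes one short-lived array allocation per call, which is where the few-percent throughput difference comes from.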
[jira] [Updated] (CASSANDRA-7225) cqlsh help for CQL3 is often incorrect and should be modernized
[ https://issues.apache.org/jira/browse/CASSANDRA-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-7225: Summary: cqlsh help for CQL3 is often incorrect and should be modernized (was: > is >= and < is <= in CQL) cqlsh help for CQL3 is often incorrect and should be modernized --- Key: CASSANDRA-7225 URL: https://issues.apache.org/jira/browse/CASSANDRA-7225 Project: Cassandra Issue Type: Bug Reporter: Robert Stupp Priority: Trivial Labels: cqlsh Just a small line of text in cqlsh help command indicates that > is >= and < is <= in CQL. This is confusing to many people (including me :) ) because I did not expect > to return the equals portion. Please allow distinct behaviours for > and <, >=, and <= in CQL queries. Maybe in combination with CASSANDRA-5184 and/or CASSANDRA-4914 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6568) sstables incorrectly getting marked as not live
[ https://issues.apache.org/jira/browse/CASSANDRA-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998794#comment-13998794 ] Chris Burroughs commented on CASSANDRA-6568: Whoops let that slip by. I'd still like to give the patch a try. sstables incorrectly getting marked as not live - Key: CASSANDRA-6568 URL: https://issues.apache.org/jira/browse/CASSANDRA-6568 Project: Cassandra Issue Type: Bug Components: Core Environment: 1.2.12 with several 1.2.13 patches Reporter: Chris Burroughs Assignee: Marcus Eriksson Fix For: 2.0.8 Attachments: 0001-add-jmx-method-to-get-non-active-sstables.patch {noformat} -rw-rw-r-- 14 cassandra cassandra 1.4G Nov 25 19:46 /data/sstables/data/ks/cf/ks-cf-ic-402383-Data.db -rw-rw-r-- 14 cassandra cassandra 13G Nov 26 00:04 /data/sstables/data/ks/cf/ks-cf-ic-402430-Data.db -rw-rw-r-- 14 cassandra cassandra 13G Nov 26 05:03 /data/sstables/data/ks/cf/ks-cf-ic-405231-Data.db -rw-rw-r-- 31 cassandra cassandra 21G Nov 26 08:38 /data/sstables/data/ks/cf/ks-cf-ic-405232-Data.db -rw-rw-r-- 2 cassandra cassandra 2.6G Dec 3 13:44 /data/sstables/data/ks/cf/ks-cf-ic-434662-Data.db -rw-rw-r-- 14 cassandra cassandra 1.5G Dec 5 09:05 /data/sstables/data/ks/cf/ks-cf-ic-438698-Data.db -rw-rw-r-- 2 cassandra cassandra 3.1G Dec 6 12:10 /data/sstables/data/ks/cf/ks-cf-ic-440983-Data.db -rw-rw-r-- 2 cassandra cassandra 96M Dec 8 01:52 /data/sstables/data/ks/cf/ks-cf-ic-444041-Data.db -rw-rw-r-- 2 cassandra cassandra 3.3G Dec 9 16:37 /data/sstables/data/ks/cf/ks-cf-ic-451116-Data.db -rw-rw-r-- 2 cassandra cassandra 876M Dec 10 11:23 /data/sstables/data/ks/cf/ks-cf-ic-453552-Data.db -rw-rw-r-- 2 cassandra cassandra 891M Dec 11 03:21 /data/sstables/data/ks/cf/ks-cf-ic-454518-Data.db -rw-rw-r-- 2 cassandra cassandra 102M Dec 11 12:27 /data/sstables/data/ks/cf/ks-cf-ic-455429-Data.db -rw-rw-r-- 2 cassandra cassandra 906M Dec 11 23:54 /data/sstables/data/ks/cf/ks-cf-ic-455533-Data.db -rw-rw-r-- 1 cassandra cassandra 214M 
Dec 12 05:02 /data/sstables/data/ks/cf/ks-cf-ic-456426-Data.db -rw-rw-r-- 1 cassandra cassandra 203M Dec 12 10:49 /data/sstables/data/ks/cf/ks-cf-ic-456879-Data.db -rw-rw-r-- 1 cassandra cassandra 49M Dec 12 12:03 /data/sstables/data/ks/cf/ks-cf-ic-456963-Data.db -rw-rw-r-- 18 cassandra cassandra 20G Dec 25 01:09 /data/sstables/data/ks/cf/ks-cf-ic-507770-Data.db -rw-rw-r-- 3 cassandra cassandra 12G Jan 8 04:22 /data/sstables/data/ks/cf/ks-cf-ic-567100-Data.db -rw-rw-r-- 3 cassandra cassandra 957M Jan 8 22:51 /data/sstables/data/ks/cf/ks-cf-ic-569015-Data.db -rw-rw-r-- 2 cassandra cassandra 923M Jan 9 17:04 /data/sstables/data/ks/cf/ks-cf-ic-571303-Data.db -rw-rw-r-- 1 cassandra cassandra 821M Jan 10 08:20 /data/sstables/data/ks/cf/ks-cf-ic-574642-Data.db -rw-rw-r-- 1 cassandra cassandra 18M Jan 10 08:48 /data/sstables/data/ks/cf/ks-cf-ic-574723-Data.db {noformat} I tried to do a user defined compaction on sstables from November and got it is not an active sstable. Live sstable count from jmx was about 7 while on disk there were over 20. Live vs total size showed about a ~50 GiB difference. Forcing a gc from jconsole had no effect. However, restarting the node resulted in live sstables/bytes *increasing* to match what was on disk. User compaction could now compact the November sstables. This cluster was last restarted in mid December. I'm not sure what affect not live had on other operations of the cluster. From the logs it seems that the files were sent at least at some point as part of repair, but I don't know if they were being being used for read requests or not. Because the problem that got me looking in the first place was poor performance I suspect they were used for reads (and the reads were slow because so many sstables were being read). I presume based on their age at the least they were being excluded from compaction. I'm not aware of any isLive() or getRefCount() to problematically confirm which nodes have this problem. 
In this cluster almost all columns have a 14 day TTL, based on the number of nodes with November sstables it appears to be occurring on a significant fraction of the nodes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7247) Provide top ten most frequent keys per column family
Chris Lohfink created CASSANDRA-7247: Summary: Provide top ten most frequent keys per column family Key: CASSANDRA-7247 URL: https://issues.apache.org/jira/browse/CASSANDRA-7247 Project: Cassandra Issue Type: Improvement Reporter: Chris Lohfink Priority: Minor Attachments: patch.diff Since we already have the nice addthis stream library, we can use it to keep track of the most frequent DecoratedKeys that come through the system using StreamSummaries ([nice explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]). Then provide a new metric to access them via JMX. -- This message was sent by Atlassian JIRA (v6.2#6252)
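For reference, the space-saving algorithm behind stream-lib's StreamSummary can be sketched in a few lines. This is an illustrative reimplementation under simplifying assumptions (String keys, linear min-scan on eviction), not the library's actual code:

```java
import java.util.*;

// Minimal space-saving sketch: keep at most `capacity` counters; when full,
// evict the smallest counter and let the newcomer inherit its count, which is
// what makes the reported heavy hitters approximate rather than exact.
class SpaceSaving {
    private final int capacity;
    private final Map<String, Long> counts = new HashMap<>();

    SpaceSaving(int capacity) { this.capacity = capacity; }

    void offer(String key) {
        Long c = counts.get(key);
        if (c != null) { counts.put(key, c + 1); return; }
        if (counts.size() < capacity) { counts.put(key, 1L); return; }
        // full: evict the minimum; newcomer inherits its count + 1
        String minKey = null;
        long min = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : counts.entrySet())
            if (e.getValue() < min) { min = e.getValue(); minKey = e.getKey(); }
        counts.remove(minKey);
        counts.put(key, min + 1);
    }

    List<String> topK(int k) {
        List<String> keys = new ArrayList<>(counts.keySet());
        keys.sort((a, b) -> Long.compare(counts.get(b), counts.get(a)));
        return keys.subList(0, Math.min(k, keys.size()));
    }
}
```

As the comment thread above notes, an implementation like this is not thread safe; offers from multiple threads would need a synchronized wrapper or the library's ConcurrentStreamSummary.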
[jira] [Updated] (CASSANDRA-6602) Compaction improvements to optimize time series data
[ https://issues.apache.org/jira/browse/CASSANDRA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Björn Hegerfors updated CASSANDRA-6602: --- Attachment: cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy.txt Compaction improvements to optimize time series data Key: CASSANDRA-6602 URL: https://issues.apache.org/jira/browse/CASSANDRA-6602 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Tupshin Harper Assignee: Marcus Eriksson Labels: compaction, performance Fix For: 3.0 Attachments: cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy.txt, cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v2.txt There are some unique characteristics of many/most time series use cases that both provide challenges, as well as provide unique opportunities for optimizations. One of the major challenges is in compaction. The existing compaction strategies will tend to re-compact data on disk at least a few times over the lifespan of each data point, greatly increasing the cpu and IO costs of that write. Compaction exists to 1) ensure that there aren't too many files on disk 2) ensure that data that should be contiguous (part of the same partition) is laid out contiguously 3) deleting data due to ttls or tombstones The special characteristics of time series data allow us to optimize away all three. Time series data 1) tends to be delivered in time order, with relatively constrained exceptions 2) often has a pre-determined and fixed expiration date 3) Never gets deleted prior to TTL 4) Has relatively predictable ingestion rates Note that I filed CASSANDRA-5561 and this ticket potentially replaces or lowers the need for it. In that ticket, jbellis reasonably asks, how that compaction strategy is better than disabling compaction. Taking that to heart, here is a compaction-strategy-less approach that could be extremely efficient for time-series use cases that follow the above pattern. 
(For context, I'm thinking of an example use case involving lots of streams of time-series data with a 5GB per day ingestion rate, and a 1000 day retention with TTL, resulting in an eventual steady state of 5TB per node) 1) You have an extremely large memtable (preferably off heap, if/when doable) for the table, and that memtable is sized to be able to hold a lengthy window of time. A typical period might be one day. At the end of that period, you flush the contents of the memtable to an sstable and move to the next one. This is basically identical to current behaviour, but with thresholds adjusted so that you can ensure flushing at predictable intervals. (Open question is whether predictable intervals is actually necessary, or whether just waiting until the huge memtable is nearly full is sufficient) 2) Combine the behaviour with CASSANDRA-5228 so that sstables will be efficiently dropped once all of the columns have expired. (Another side note, it might be valuable to have a modified version of CASSANDRA-3974 that doesn't bother storing per-column TTL since it is required that all columns have the same TTL) 3) Be able to mark column families as read/write only (no explicit deletes), so no tombstones. 4) Optionally add back an additional type of delete that would delete all data earlier than a particular timestamp, resulting in immediate dropping of obsoleted sstables. The result is that for in-order delivered data, every cell will be laid out optimally on disk on the first pass, and over the course of 1000 days and 5TB of data, there will only be 1000 5GB sstables, so the number of filehandles will be reasonable. For exceptions (out-of-order delivery), most cases will be caught by the extended (24 hour+) memtable flush times and merged correctly automatically. 
For those that were slightly askew at flush time, or were delivered so far out of order that they go in the wrong sstable, there is relatively low overhead to reading from two sstables for a time slice, instead of one, and that overhead would be incurred relatively rarely unless out-of-order delivery was the common case, in which case this strategy should not be used. Another possible optimization to address out-of-order delivery would be to maintain more than one time-centric memtable in memory at a time (e.g. two 12 hour ones), and then you always insert into whichever one of the two owns the appropriate range of time. By delaying flushing the ahead one until we are ready to roll writes over to a third one, we are able to avoid any fragmentation as long as all deliveries come in no more than 12 hours late (in this example, presumably tunable). Anything that triggers compactions will have to be looked at, since there won't be any.
[jira] [Resolved] (CASSANDRA-7192) QueryTrace for a paginated query exists only for the first element of the list returned by getAllExecutionInfo()
[ https://issues.apache.org/jira/browse/CASSANDRA-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-7192. - Resolution: Not a Problem Definitely a Java Driver problem. If you could open an issue on the driver JIRA, that would be perfect. QueryTrace for a paginated query exists only for the first element of the list returned by getAllExecutionInfo() Key: CASSANDRA-7192 URL: https://issues.apache.org/jira/browse/CASSANDRA-7192 Project: Cassandra Issue Type: Bug Components: Core Environment: A Cassandra 2.0.6 cluster of 16 nodes running on Ubuntu 12.04.2 LTS, using the Java Driver in the client. Reporter: Roger Hernandez Priority: Minor Within the Java Driver, with tracing enabled, I execute a large query that benefits from automatic pagination (with fetchSize=10). I make sure to go through all of the ResultSet, and by the end of the query I call getAllExecutionInfo() on the ResultSet. This returns an ArrayList of 9 ExecutionInfo elements (the number of pages it requested from Cassandra). When accessing the QueryTrace in the ExecutionInfo from the ArrayList at index 0, I can retrieve the information without issues. However, the first is the only one that has QueryTrace information; every other ExecutionInfo in the array returns a null QueryTrace object.
[3/6] git commit: Backported CASSANDRA-5818 to cassandra-2.0
Backported CASSANDRA-5818 to cassandra-2.0 patch by Lyuben Todorov; reviewed by Mikhail Stepura Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5d187fb8 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5d187fb8 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5d187fb8 Branch: refs/heads/trunk Commit: 5d187fb82a804f7c9b29c1cf7322436d5488ba4d Parents: a01eb5a Author: Lyuben Todorov ltodo...@datastax.com Authored: Thu May 15 20:22:58 2014 -0700 Committer: Mikhail Stepura mish...@apache.org Committed: Thu May 15 20:22:58 2014 -0700 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/Directories.java| 72 .../cassandra/service/CassandraDaemon.java | 21 +- 3 files changed, 91 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/5d187fb8/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 30e45c1..ab663eb 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -13,6 +13,7 @@ * Fix IllegalStateException in CqlPagingRecordReader (CASSANDRA-7198) * Fix the InvertedIndex trigger example (CASSANDRA-7211) * Add --resolve-ip option to 'nodetool ring' (CASSANDRA-7210) + * Fix duplicated error messages on directory creation error at startup (CASSANDRA-5818) 2.0.8 http://git-wip-us.apache.org/repos/asf/cassandra/blob/5d187fb8/src/java/org/apache/cassandra/db/Directories.java -- diff --git a/src/java/org/apache/cassandra/db/Directories.java b/src/java/org/apache/cassandra/db/Directories.java index 2db4d9b..e118f86 100644 --- a/src/java/org/apache/cassandra/db/Directories.java +++ b/src/java/org/apache/cassandra/db/Directories.java @@ -80,6 +80,78 @@ public class Directories dataFileLocations[i] = new DataDirectory(new File(locations[i])); } +/** + * Checks whether Cassandra has RWX permissions to the specified directory. + * + * @param dir File object of the directory. 
+ * @param dataDir String representation of the directory's location
+ * @return status representing Cassandra's RWX permissions to the supplied folder location.
+ */
+public static boolean hasFullPermissions(File dir, String dataDir)
+{
+    if (!dir.isDirectory())
+    {
+        logger.error("Not a directory {}", dataDir);
+        return false;
+    }
+    else if (!FileAction.hasPrivilege(dir, FileAction.X))
+    {
+        logger.error("Doesn't have execute permissions for {} directory", dataDir);
+        return false;
+    }
+    else if (!FileAction.hasPrivilege(dir, FileAction.R))
+    {
+        logger.error("Doesn't have read permissions for {} directory", dataDir);
+        return false;
+    }
+    else if (dir.exists() && !FileAction.hasPrivilege(dir, FileAction.W))
+    {
+        logger.error("Doesn't have write permissions for {} directory", dataDir);
+        return false;
+    }
+
+    return true;
+}
+
+public enum FileAction
+{
+    X, W, XW, R, XR, RW, XRW;
+
+    private FileAction()
+    {
+    }
+
+    public static boolean hasPrivilege(File file, FileAction action)
+    {
+        boolean privilege = false;
+
+        switch (action) {
+            case X:
+                privilege = file.canExecute();
+                break;
+            case W:
+                privilege = file.canWrite();
+                break;
+            case XW:
+                privilege = file.canExecute() && file.canWrite();
+                break;
+            case R:
+                privilege = file.canRead();
+                break;
+            case XR:
+                privilege = file.canExecute() && file.canRead();
+                break;
+            case RW:
+                privilege = file.canRead() && file.canWrite();
+                break;
+            case XRW:
+                privilege = file.canExecute() && file.canRead() && file.canWrite();
+                break;
+        }
+        return privilege;
+    }
+}
+
 private final String keyspacename;
 private final String cfname;
 private final File[] sstableDirectories;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5d187fb8/src/java/org/apache/cassandra/service/CassandraDaemon.java -- diff --git a/src/java/org/apache/cassandra/service/CassandraDaemon.java b/src/java/org/apache/cassandra/service/CassandraDaemon.java index 9c7cc94..89d2bb0 100644
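As a standalone illustration of the RWX check this patch adds (the class name here is invented for the example, and logging is omitted; the real code lives in Directories and logs via slf4j), the whole check reduces to java.io.File's permission probes:

```java
import java.io.File;

// Hedged sketch of the patch's directory permission check: a data
// directory is usable only if the process can execute (traverse),
// read (list), and write (create/flush sstables) it.
class PermissionCheck
{
    public static boolean hasFullPermissions(File dir)
    {
        return dir.isDirectory()
            && dir.canExecute()   // traverse the directory
            && dir.canRead()      // list its contents
            && dir.canWrite();    // create new files in it
    }
}
```

Checking `dir.exists()` before `canWrite()` (as the patch does) matters when the directory is about to be created: a missing directory should fail the "is a directory" test, not the write test.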
[jira] [Updated] (CASSANDRA-6602) Compaction improvements to optimize time series data
[ https://issues.apache.org/jira/browse/CASSANDRA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-6602: --- Labels: compaction performance (was: performance) Compaction improvements to optimize time series data Key: CASSANDRA-6602 URL: https://issues.apache.org/jira/browse/CASSANDRA-6602 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Tupshin Harper Assignee: Marcus Eriksson Labels: compaction, performance Fix For: 3.0 There are some unique characteristics of many/most time series use cases that both provide challenges, as well as provide unique opportunities for optimizations. One of the major challenges is in compaction. The existing compaction strategies will tend to re-compact data on disk at least a few times over the lifespan of each data point, greatly increasing the cpu and IO costs of that write. Compaction exists to 1) ensure that there aren't too many files on disk 2) ensure that data that should be contiguous (part of the same partition) is laid out contiguously 3) deleting data due to ttls or tombstones The special characteristics of time series data allow us to optimize away all three. Time series data 1) tends to be delivered in time order, with relatively constrained exceptions 2) often has a pre-determined and fixed expiration date 3) Never gets deleted prior to TTL 4) Has relatively predictable ingestion rates Note that I filed CASSANDRA-5561 and this ticket potentially replaces or lowers the need for it. In that ticket, jbellis reasonably asks, how that compaction strategy is better than disabling compaction. Taking that to heart, here is a compaction-strategy-less approach that could be extremely efficient for time-series use cases that follow the above pattern. 
(For context, I'm thinking of an example use case involving lots of streams of time-series data with a 5GB per day ingestion rate, and a 1000 day retention with TTL, resulting in an eventual steady state of 5TB per node) 1) You have an extremely large memtable (preferably off heap, if/when doable) for the table, and that memtable is sized to be able to hold a lengthy window of time. A typical period might be one day. At the end of that period, you flush the contents of the memtable to an sstable and move to the next one. This is basically identical to current behaviour, but with thresholds adjusted so that you can ensure flushing at predictable intervals. (Open question is whether predictable intervals is actually necessary, or whether just waiting until the huge memtable is nearly full is sufficient) 2) Combine the behaviour with CASSANDRA-5228 so that sstables will be efficiently dropped once all of the columns have expired. (Another side note, it might be valuable to have a modified version of CASSANDRA-3974 that doesn't bother storing per-column TTL since it is required that all columns have the same TTL) 3) Be able to mark column families as read/write only (no explicit deletes), so no tombstones. 4) Optionally add back an additional type of delete that would delete all data earlier than a particular timestamp, resulting in immediate dropping of obsoleted sstables. The result is that for in-order delivered data, every cell will be laid out optimally on disk on the first pass, and over the course of 1000 days and 5TB of data, there will only be 1000 5GB sstables, so the number of filehandles will be reasonable. For exceptions (out-of-order delivery), most cases will be caught by the extended (24 hour+) memtable flush times and merged correctly automatically. 
For those that were slightly askew at flush time, or were delivered so far out of order that they go in the wrong sstable, there is relatively low overhead to reading from two sstables for a time slice, instead of one, and that overhead would be incurred relatively rarely unless out-of-order delivery was the common case, in which case this strategy should not be used. Another possible optimization to address out-of-order delivery would be to maintain more than one time-centric memtable in memory at a time (e.g. two 12 hour ones), and then you always insert into whichever one of the two owns the appropriate range of time. By delaying flushing the ahead one until we are ready to roll writes over to a third one, we are able to avoid any fragmentation as long as all deliveries come in no more than 12 hours late (in this example, presumably tunable). Anything that triggers compactions will have to be looked at, since there won't be any. The one concern I have is the ramification of repair. Initially, at least, I think it would be acceptable to just write one sstable per
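The two-memtable idea described above can be sketched as simple window arithmetic (a hypothetical WindowRouter class, not part of any attached patch; the 12-hour window size from the example is hard-coded):

```java
// Sketch of routing writes between two adjacent time-window memtables:
// a write is fragmentation-free as long as its cell timestamp falls in
// the current window or the immediately preceding (not yet flushed) one.
class WindowRouter
{
    static final long WINDOW_MILLIS = 12L * 60 * 60 * 1000; // 12 hours

    // Index of the window a timestamp falls into.
    public static long windowOf(long timestampMillis)
    {
        return Math.floorDiv(timestampMillis, WINDOW_MILLIS);
    }

    // True if a (possibly late) write still lands in one of the two
    // open windows and therefore causes no sstable fragmentation.
    public static boolean fitsOpenWindows(long timestampMillis, long nowMillis)
    {
        long w = windowOf(timestampMillis);
        long current = windowOf(nowMillis);
        return w == current || w == current - 1;
    }
}
```

Anything that fails `fitsOpenWindows` is exactly the "slightly askew" case the paragraph accepts paying a two-sstable read penalty for.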
[5/8] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b9c9039a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b9c9039a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b9c9039a Branch: refs/heads/trunk Commit: b9c9039a1965730c294b179934e82cfdfdeb3662 Parents: 33bc1e8 a01eb5a Author: Mikhail Stepura mish...@apache.org Authored: Thu May 15 19:27:32 2014 -0700 Committer: Mikhail Stepura mish...@apache.org Committed: Thu May 15 19:28:47 2014 -0700 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/tools/NodeTool.java | 7 +-- 2 files changed, 6 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/b9c9039a/CHANGES.txt -- diff --cc CHANGES.txt index 4b9085a,30e45c1..223f331 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -32,6 -12,25 +32,7 @@@ Merged from 2.0 * cqlsh: Accept and execute CQL statement(s) from command-line parameter (CASSANDRA-7172) * Fix IllegalStateException in CqlPagingRecordReader (CASSANDRA-7198) * Fix the InvertedIndex trigger example (CASSANDRA-7211) + * Add --resolve-ip option to 'nodetool ring' (CASSANDRA-7210) - - -2.0.8 - * Correctly delete scheduled range xfers (CASSANDRA-7143) - * Make batchlog replica selection rack-aware (CASSANDRA-6551) - * Allow overriding cassandra-rackdc.properties file (CASSANDRA-7072) - * Set JMX RMI port to 7199 (CASSANDRA-7087) - * Use LOCAL_QUORUM for data reads at LOCAL_SERIAL (CASSANDRA-6939) - * Log a warning for large batches (CASSANDRA-6487) - * Queries on compact tables can return more rows that requested (CASSANDRA-7052) - * USING TIMESTAMP for batches does not work (CASSANDRA-7053) - * Fix performance regression from CASSANDRA-5614 (CASSANDRA-6949) - * Merge groupable mutations in TriggerExecutor#execute() (CASSANDRA-7047) - * Fix CFMetaData#getColumnDefinitionFromColumnName() (CASSANDRA-7074) - * Plug holes in resource release when wiring 
up StreamSession (CASSANDRA-7073) - * Re-add parameter columns to tracing session (CASSANDRA-6942) - * Fix writetime/ttl functions for static columns (CASSANDRA-7081) - * Suggest CTRL-C or semicolon after three blank lines in cqlsh (CASSANDRA-7142) Merged from 1.2: * Add Cloudstack snitch (CASSANDRA-7147) * Update system.peers correctly when relocating tokens (CASSANDRA-7126)
[2/4] git commit: Starting threads in the OutboundTcpConnectionPool constructor causes race conditions
Starting threads in the OutboundTcpConnectionPool constructor causes race conditions patch by sbtourist; reviewed by jasobrown for CASSANDRA-7177 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/05bacaea Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/05bacaea Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/05bacaea Branch: refs/heads/cassandra-2.1 Commit: 05bacaeabc96a6d85fbf908dce8474acffcab730 Parents: 2e61cd5 Author: Jason Brown jasobr...@apple.com Authored: Wed May 7 11:58:56 2014 -0700 Committer: Jason Brown jasobr...@apple.com Committed: Wed May 7 11:58:56 2014 -0700 -- CHANGES.txt | 2 +- .../apache/cassandra/net/MessagingService.java | 6 +-- .../net/OutboundTcpConnectionPool.java | 41 +--- 3 files changed, 40 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/05bacaea/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index fc192ef..65ee6cf 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,6 +1,6 @@ 2.0.9 * Warn when 'USING TIMESTAMP' is used on a CAS BATCH (CASSANDRA-7067) - + * Starting threads in OutboundTcpConnectionPool constructor causes race conditions (CASSANDRA-7177) 2.0.8 * Correctly delete scheduled range xfers (CASSANDRA-7143) http://git-wip-us.apache.org/repos/asf/cassandra/blob/05bacaea/src/java/org/apache/cassandra/net/MessagingService.java -- diff --git a/src/java/org/apache/cassandra/net/MessagingService.java b/src/java/org/apache/cassandra/net/MessagingService.java index cccf698..dbd76d6 100644 --- a/src/java/org/apache/cassandra/net/MessagingService.java +++ b/src/java/org/apache/cassandra/net/MessagingService.java @@ -498,11 +498,11 @@ public final class MessagingService implements MessagingServiceMBean cp = new OutboundTcpConnectionPool(to); OutboundTcpConnectionPool existingPool = connectionManagers.putIfAbsent(to, cp); if (existingPool != null) -{ -cp.close(); cp = existingPool; -} +else 
+cp.start(); } +cp.waitForStarted(); return cp; } http://git-wip-us.apache.org/repos/asf/cassandra/blob/05bacaea/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java -- diff --git a/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java b/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java index 81168c6..c45fc53 100644 --- a/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java +++ b/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java @@ -22,6 +22,8 @@ import java.net.InetAddress; import java.net.InetSocketAddress; import java.net.Socket; import java.nio.channels.SocketChannel; +import java.util.concurrent.CountDownLatch; +import java.util.concurrent.TimeUnit; import org.apache.cassandra.concurrent.Stage; import org.apache.cassandra.config.Config; @@ -36,6 +38,7 @@ public class OutboundTcpConnectionPool { // pointer for the real Address. private final InetAddress id; +private final CountDownLatch started; public final OutboundTcpConnection cmdCon; public final OutboundTcpConnection ackCon; // pointer to the reseted Address. 
@@ -46,13 +49,10 @@ public class OutboundTcpConnectionPool
 {
     id = remoteEp;
     resetedEndpoint = SystemKeyspace.getPreferredIP(remoteEp);
+    started = new CountDownLatch(1);

     cmdCon = new OutboundTcpConnection(this);
-    cmdCon.start();
     ackCon = new OutboundTcpConnection(this);
-    ackCon.start();
-
-    metrics = new ConnectionMetrics(id, this);
 }

 /**
@@ -167,14 +167,45 @@ public class OutboundTcpConnectionPool
     }
     return true;
 }
+
+public void start()
+{
+    cmdCon.start();
+    ackCon.start();
+
+    metrics = new ConnectionMetrics(id, this);
+
+    started.countDown();
+}
+
+public void waitForStarted()
+{
+    if (started.getCount() == 0)
+        return;
+
+    boolean error = false;
+    try
+    {
+        if (!started.await(1, TimeUnit.MINUTES))
+            error = true;
+    }
+    catch (InterruptedException e)
+    {
+        Thread.currentThread().interrupt();
+        error = true;
+    }
+    if (error)
+        throw new IllegalStateException(String.format("Connections to %s are not started!", id.getHostAddress()));
+}
- public void
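Distilled from the diff above, the fix is the classic "never start threads in a constructor" pattern: publish the object first (via putIfAbsent), let only the winning thread call start(), and make every caller block on a latch until startup completes. A self-contained sketch (illustrative StartablePool class, not the actual Cassandra code):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Construct → publish → start → wait: the constructor only builds
// state; start() is called exactly once, by the thread that won the
// putIfAbsent race; everyone else blocks on the latch.
class StartablePool
{
    private final CountDownLatch started = new CountDownLatch(1);
    private volatile boolean running;

    public void start()
    {
        running = true;       // e.g. start connection threads here
        started.countDown();  // release any waiters
    }

    public boolean waitForStarted() throws InterruptedException
    {
        // bounded wait, mirroring the patch's timeout-then-fail behaviour
        return started.await(5, TimeUnit.SECONDS) && running;
    }

    private static final ConcurrentMap<String, StartablePool> pools = new ConcurrentHashMap<>();

    // Race-free get-or-create for a shared pool instance.
    public static StartablePool getPool(String key) throws InterruptedException
    {
        StartablePool p = new StartablePool();
        StartablePool existing = pools.putIfAbsent(key, p);
        if (existing != null)
            p = existing;     // lost the race: use the published pool
        else
            p.start();        // won the race: we start it
        p.waitForStarted();   // either way, block until started
        return p;
    }
}
```

The original bug was that threads started inside the constructor could observe the object before its final fields were fully published; moving start() after putIfAbsent also guarantees losers of the race never start (and then have to tear down) a throwaway pool.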
[03/10] git commit: cqlsh should return a non-zero error code if a query fails
cqlsh should return a non-zero error code if a query fails patch by Branden Visser and Mikhail Stepura; reviewed by Mikhail Stepura for CASSANDRA-6344 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8abe9f6f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8abe9f6f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8abe9f6f Branch: refs/heads/cassandra-2.1 Commit: 8abe9f6f522146dc478f006a8160b4db1c233169 Parents: d839350 Author: Mikhail Stepura mish...@apache.org Authored: Thu May 8 13:20:35 2014 -0700 Committer: Mikhail Stepura mish...@apache.org Committed: Thu May 8 13:27:41 2014 -0700 -- CHANGES.txt | 1 + bin/cqlsh | 4 2 files changed, 5 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8abe9f6f/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 312cf06..7021e7b 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -23,6 +23,7 @@ * remove duplicate query for local tokens (CASSANDRA-7182) * raise streaming phi convict threshold level (CASSANDRA-7063) * reduce garbage creation in calculatePendingRanges (CASSANDRA-7191) + * exit CQLSH with error status code if script fails (CASSANDRA-6344) 1.2.16 * Add UNLOGGED, COUNTER options to BATCH documentation (CASSANDRA-6816) http://git-wip-us.apache.org/repos/asf/cassandra/blob/8abe9f6f/bin/cqlsh -- diff --git a/bin/cqlsh b/bin/cqlsh index 8e1e0e2..24cb3b8 100755 --- a/bin/cqlsh +++ b/bin/cqlsh @@ -548,6 +548,7 @@ class Shell(cmd.Cmd): self.show_line_nums = True self.stdin = stdin self.query_out = sys.stdout +self.statement_error = False def set_expanded_cql_version(self, ver): ver, vertuple = full_cql_version(ver) @@ -2175,6 +2176,7 @@ class Shell(cmd.Cmd): self.query_out.flush() def printerr(self, text, color=RED, newline=True, shownum=None): +self.statement_error = True if shownum is None: shownum = self.show_line_nums if shownum: @@ -2404,6 +2406,8 @@ def main(options, hostname, port): 
shell.cmdloop() save_history() +if options.file and shell.statement_error: +sys.exit(2) if __name__ == '__main__': main(*read_options(sys.argv[1:], os.environ))
[3/3] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/05a54e0c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/05a54e0c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/05a54e0c Branch: refs/heads/trunk Commit: 05a54e0ce6715c15821764b9ec21480088669de5 Parents: ac1a9cd 65a4626 Author: Brandon Williams brandonwilli...@apache.org Authored: Fri May 9 13:29:08 2014 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Fri May 9 13:29:08 2014 -0500 -- src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --
[5/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6be62c2c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6be62c2c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6be62c2c Branch: refs/heads/cassandra-2.1 Commit: 6be62c2c46de06170dd4a10327ecab4ab7a41d78 Parents: 92c38c0 569177f Author: Brandon Williams brandonwilli...@apache.org Authored: Wed May 14 17:36:48 2014 -0500 Committer: Brandon Williams brandonwilli...@apache.org Committed: Wed May 14 17:36:48 2014 -0500 -- CHANGES.txt | 1 + .../cassandra/hadoop/cql3/CqlConfigHelper.java | 32 ++-- 2 files changed, 30 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/6be62c2c/CHANGES.txt -- diff --cc CHANGES.txt index d43a0f5,285efd1..450e337 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,20 -1,7 +1,21 @@@ -2.0.9 +2.1.0-rc1 + * Add PowerShell Windows launch scripts (CASSANDRA-7001) + * Make commitlog archive+restore more robust (CASSANDRA-6974) + * Fix marking commitlogsegments clean (CASSANDRA-6959) + * Add snapshot manifest describing files included (CASSANDRA-6326) + * Parallel streaming for sstableloader (CASSANDRA-3668) + * Fix bugs in supercolumns handling (CASSANDRA-7138) + * Fix ClassCastException on composite dense tables (CASSANDRA-7112) + * Cleanup and optimize collation and slice iterators (CASSANDRA-7107) + * Upgrade NBHM lib (CASSANDRA-7128) + * Optimize netty server (CASSANDRA-6861) + * Fix repair hang when given CF does not exist (CASSANDRA-7189) + * Allow c* to be shutdown in an embedded mode (CASSANDRA-5635) + * Add server side batching to native transport (CASSANDRA-5663) + * Make batchlog replay asynchronous (CASSANDRA-6134) +Merged from 2.0: + * (Hadoop) support authentication in CqlRecordReader (CASSANDRA-7221) * (Hadoop) Close java driver Cluster in CQLRR.close (CASSANDRA-7228) - * Fix potential 
SlabAllocator yield-starvation (CASSANDRA-7133) * Warn when 'USING TIMESTAMP' is used on a CAS BATCH (CASSANDRA-7067) * Starting threads in OutboundTcpConnectionPool constructor causes race conditions (CASSANDRA-7177) * return all cpu values from BackgroundActivityMonitor.readAndCompute (CASSANDRA-7183)
[4/4] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Conflicts: src/java/org/apache/cassandra/cql3/statements/ModificationStatement.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fe2d7dda Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fe2d7dda Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fe2d7dda Branch: refs/heads/trunk Commit: fe2d7ddafa29b100b54aefd730d2855f84cd174e Parents: adea05f 0cb1db6 Author: Jake Luciani j...@apache.org Authored: Wed May 7 15:22:00 2014 -0400 Committer: Jake Luciani j...@apache.org Committed: Wed May 7 15:22:00 2014 -0400 -- CHANGES.txt | 1 + .../cql3/statements/ModificationStatement.java | 3 ++ .../apache/cassandra/net/MessagingService.java | 6 +-- .../net/OutboundTcpConnectionPool.java | 41 +--- 4 files changed, 43 insertions(+), 8 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/fe2d7dda/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/fe2d7dda/src/java/org/apache/cassandra/cql3/statements/ModificationStatement.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/fe2d7dda/src/java/org/apache/cassandra/net/MessagingService.java --
[2/3] git commit: Remove duplicate entries in changelog
Remove duplicate entries in changelog Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/19ff1932 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/19ff1932 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/19ff1932 Branch: refs/heads/cassandra-2.1 Commit: 19ff1932cd3275f114a8820d3a5d0cdf176aa261 Parents: 32e1b16 Author: Sylvain Lebresne sylv...@datastax.com Authored: Wed May 7 10:30:14 2014 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Wed May 7 10:30:14 2014 +0200 -- CHANGES.txt | 4 1 file changed, 4 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/19ff1932/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 1e60fc3..fc192ef 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -39,10 +39,6 @@ Merged from 1.2: 2.0.7 * Put nodes in hibernate when join_ring is false (CASSANDRA-6961) - * Continue assassinating even if the endpoint vanishes (CASSANDRA-6787) - * Non-droppable verbs shouldn't be dropped from OTC (CASSANDRA-6980) - * Shutdown batchlog executor in SS#drain() (CASSANDRA-7025) - * Schedule schema pulls on change (CASSANDRA-6971) * Avoid early loading of non-system keyspaces before compaction-leftovers cleanup at startup (CASSANDRA-6913) * Restrict Windows to parallel repairs (CASSANDRA-6907)
svn commit: r1595030 - in /cassandra/site: publish/index.html publish/media/summit14.jpg src/content/index.html src/media/summit14.jpg
Author: jbellis Date: Thu May 15 20:14:02 2014 New Revision: 1595030 URL: http://svn.apache.org/r1595030 Log: link Cassandra Summit 2014 Added: cassandra/site/publish/media/summit14.jpg (with props) cassandra/site/src/media/summit14.jpg (with props) Modified: cassandra/site/publish/index.html cassandra/site/src/content/index.html Modified: cassandra/site/publish/index.html URL: http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1595030r1=1595029r2=1595030view=diff == --- cassandra/site/publish/index.html (original) +++ cassandra/site/publish/index.html Thu May 15 20:14:02 2014 @@ -85,6 +85,12 @@ /div +div class=span-24 +a href=http://www.cvent.com/events/cassandra-summit-2014/event-summary-176f061a4b144525ae05f18cd9a9cb12.aspx/; + img src=/media/img/summit14.jpg +/a +/div + div id=features class=container span-24 h2 class=hdrOverview/h2 div class=span-8 Added: cassandra/site/publish/media/summit14.jpg URL: http://svn.apache.org/viewvc/cassandra/site/publish/media/summit14.jpg?rev=1595030view=auto == Binary file - no diff available. Propchange: cassandra/site/publish/media/summit14.jpg -- svn:mime-type = application/octet-stream Modified: cassandra/site/src/content/index.html URL: http://svn.apache.org/viewvc/cassandra/site/src/content/index.html?rev=1595030r1=1595029r2=1595030view=diff == --- cassandra/site/src/content/index.html (original) +++ cassandra/site/src/content/index.html Thu May 15 20:14:02 2014 @@ -31,6 +31,12 @@ {% include skeleton/_download.html %} +div class=span-24 +a href=http://www.cvent.com/events/cassandra-summit-2014/event-summary-176f061a4b144525ae05f18cd9a9cb12.aspx/; + img src=/media/img/summit14.jpg +/a +/div + div id=features class=container span-24 h2 class=hdrOverview/h2 div class=span-8 Added: cassandra/site/src/media/summit14.jpg URL: http://svn.apache.org/viewvc/cassandra/site/src/media/summit14.jpg?rev=1595030view=auto == Binary file - no diff available. 
Propchange: cassandra/site/src/media/summit14.jpg -- svn:mime-type = application/octet-stream
git commit: add the sample properties file to the example trigger jar so it can be loaded
Repository: cassandra Updated Branches: refs/heads/trunk 401ed436a - a1f9b7c71 add the sample properties file to the example trigger jar so it can be loaded Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a1f9b7c7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a1f9b7c7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a1f9b7c7 Branch: refs/heads/trunk Commit: a1f9b7c71e6ba549e61a1d5f5214796f355347a9 Parents: 401ed43 Author: Dave Brosius dbros...@mebigfatguy.com Authored: Thu May 15 23:07:35 2014 -0400 Committer: Dave Brosius dbros...@mebigfatguy.com Committed: Thu May 15 23:07:35 2014 -0400 -- examples/triggers/build.xml | 4 1 file changed, 4 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a1f9b7c7/examples/triggers/build.xml -- diff --git a/examples/triggers/build.xml b/examples/triggers/build.xml index 293b08d..450def6 100644 --- a/examples/triggers/build.xml +++ b/examples/triggers/build.xml @@ -24,6 +24,7 @@ property name=cassandra.classes value=${cassandra.dir}/build/classes/main / property name=build.src value=${basedir}/src / property name=build.dir value=${basedir}/build / + property name=conf.dir value=${basedir}/conf / property name=build.classes value=${build.dir}/classes / property name=final.name value=trigger-example / @@ -50,6 +51,9 @@ target name=jar depends=build jar jarfile=${build.dir}/${final.name}.jar fileset dir=${build.classes} / + fileset dir=${conf.dir} + include name=**/*.properties / + /fileset /jar /target
[jira] [Commented] (CASSANDRA-7245) Out-of-Order keys with stress + CQL3
[ https://issues.apache.org/jira/browse/CASSANDRA-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999572#comment-13999572 ]

Brandon Williams commented on CASSANDRA-7245:
---------------------------------------------

Any chance you're using HSHA? CASSANDRA-6285, which is still not yet committed (I have a reason for that), could be a factor in that case.

Out-of-Order keys with stress + CQL3
------------------------------------
Key: CASSANDRA-7245
URL: https://issues.apache.org/jira/browse/CASSANDRA-7245
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Pavel Yaskevich

We have been generating data (stress with CQL3 prepared) for CASSANDRA-4718 and found the following problem in almost every SSTable generated (~200 GB of data and 821 SSTables). We set up the keys to be 10 bytes in size (default) and population between 1 and 6. Once I ran 'sstablekeys' on the generated SSTable files I got the following exceptions:

_There is a problem with sorting of normal looking keys:_

30303039443538353645 30303039443745364242
java.io.IOException: Key out of order! DecoratedKey(-217680888487824985, *30303039443745364242*) DecoratedKey(-1767746583617597213, *30303039443437454333*)

0a30303033343933 3734441388343933
java.io.IOException: Key out of order! DecoratedKey(5440473860101999581, *3734441388343933*) DecoratedKey(-7565486415339257200, *30303033344639443137*)

30303033354244363031 30303033354133423742
java.io.IOException: Key out of order! DecoratedKey(2687072396429900180, *30303033354133423742*) DecoratedKey(-7838239767410066684, *30303033354145344534*)

30303034313442354137 3034313635363334
java.io.IOException: Key out of order! DecoratedKey(1516003874415400462, *3034313635363334*) DecoratedKey(-9106177395653818217, *3030303431444238*)

30303035373044373435 30303035373044334631
java.io.IOException: Key out of order! DecoratedKey(-3645715702154616540, *30303035373044334631*) DecoratedKey(-4296696226469000945, *30303035373132364138*)

_And completely different ones:_

30303041333745373543 7cd045c59a90d7587d8d
java.io.IOException: Key out of order! DecoratedKey(-3595402345023230196, *7cd045c59a90d7587d8d*) DecoratedKey(-5146766422778260690, *30303041333943303232*)

3030303332314144 30303033323346343932
java.io.IOException: Key out of order! DecoratedKey(7071845511166615635, *30303033323346343932*) DecoratedKey(5233296131921119414, *53d83e0012287e03*)

30303034314531374431 3806734b256c27e41ec2
java.io.IOException: Key out of order! DecoratedKey(-7720474642702543193, *3806734b256c27e41ec2*) DecoratedKey(-8072288379146044663, *30303034314136413343*)

_And sometimes there is no problem at all:_

30303033353144463637 002a31b3b31a1c2f 5d616dd38211ebb5d6ec 444236451388 1388138844463744 30303033353143394343

It's worth mentioning that we have got 22 timeout exceptions, but the number of out-of-order keys is much larger than that.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
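The "Key out of order!" errors above come from a sequential scan that enforces the SSTable ordering invariant: each decorated key's token must compare no less than the previous one. Below is a simplified, self-contained sketch of that invariant check — the `DecoratedKey` class here is a stand-in for illustration, not Cassandra's actual implementation:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class KeyOrderCheck
{
    // Minimal stand-in for a decorated key: a token plus the raw key bytes (hex).
    static class DecoratedKey implements Comparable<DecoratedKey>
    {
        final long token;
        final String hexKey;

        DecoratedKey(long token, String hexKey)
        {
            this.token = token;
            this.hexKey = hexKey;
        }

        // Keys sort by token (Cassandra additionally breaks ties on the key bytes).
        public int compareTo(DecoratedKey o) { return Long.compare(token, o.token); }

        public String toString() { return "DecoratedKey(" + token + ", " + hexKey + ")"; }
    }

    // Sequential scan enforcing non-decreasing key order, as sstablekeys does.
    static void verify(List<DecoratedKey> keys) throws IOException
    {
        DecoratedKey last = null;
        for (DecoratedKey key : keys)
        {
            if (last != null && last.compareTo(key) > 0)
                throw new IOException("Key out of order! " + last + " then " + key);
            last = key;
        }
    }

    public static void main(String[] args)
    {
        // First pair from the report: token -217680888487824985 is greater than
        // the token that follows it, which violates the ordering invariant.
        List<DecoratedKey> bad = Arrays.asList(
            new DecoratedKey(-217680888487824985L, "30303039443745364242"),
            new DecoratedKey(-1767746583617597213L, "30303039443437454333"));
        try
        {
            verify(bad);
            System.out.println("in order");
        }
        catch (IOException e)
        {
            System.out.println(e.getMessage());
        }
    }
}
```

Running this against the first out-of-order pair from the report reproduces the shape of the error message.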
[jira] [Commented] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
[ https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999554#comment-13999554 ] Yasuharu Goto commented on CASSANDRA-7210: -- [~mishail] Thank you for your review and commit ! Add --resolve-ip option on 'nodetool ring' -- Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Trivial Fix For: 2.0.9, 2.1 rc1 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210-2.txt, trunk-7210.txt Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[1/3] git commit: Raise the phi convict threshold check in 1.2
Repository: cassandra Updated Branches: refs/heads/trunk 6cecbf914 - adea05f58 Raise the phi convict threshold check in 1.2 patch by tjake; reviewed by jbellis for CASSANDRA-7063 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/21b3a679 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/21b3a679 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/21b3a679 Branch: refs/heads/trunk Commit: 21b3a679f0dbb3f9219bba4624dff0be36d9fe8e Parents: 0132e54 Author: Jake Luciani j...@apache.org Authored: Wed May 7 13:12:55 2014 -0400 Committer: Jake Luciani j...@apache.org Committed: Wed May 7 13:17:33 2014 -0400 -- CHANGES.txt| 2 +- src/java/org/apache/cassandra/streaming/AbstractStreamSession.java | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/21b3a679/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index d7b7f00..8533e64 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -21,7 +21,7 @@ * Preserves CQL metadata when updating table from thrift (CASSANDRA-6831) * fix time conversion to milliseconds in SimpleCondition.await (CASSANDRA-7149) * remove duplicate query for local tokens (CASSANDRA-7182) - + * raise streaming phi convict threshold level (CASSANDRA-7063) 1.2.16 * Add UNLOGGED, COUNTER options to BATCH documentation (CASSANDRA-6816) http://git-wip-us.apache.org/repos/asf/cassandra/blob/21b3a679/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java -- diff --git a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java index f8de827..77dbcd6 100644 --- a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java +++ b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java @@ -102,7 +102,7 @@ public abstract class AbstractStreamSession implements IEndpointStateChangeSubsc return; // We want a higher 
confidence in the failure detection than usual because failing a streaming wrongly has a high cost.
-        if (phi < 2 * DatabaseDescriptor.getPhiConvictThreshold())
+        if (phi < 100 * DatabaseDescriptor.getPhiConvictThreshold())
             return;
         logger.error("Stream failed because {} died or was restarted/removed (streams may still be active
git commit: Raise the phi convict threshold check in 1.2
Repository: cassandra Updated Branches: refs/heads/cassandra-1.2 0132e546b - 21b3a679f Raise the phi convict threshold check in 1.2 patch by tjake; reviewed by jbellis for CASSANDRA-7063 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/21b3a679 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/21b3a679 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/21b3a679 Branch: refs/heads/cassandra-1.2 Commit: 21b3a679f0dbb3f9219bba4624dff0be36d9fe8e Parents: 0132e54 Author: Jake Luciani j...@apache.org Authored: Wed May 7 13:12:55 2014 -0400 Committer: Jake Luciani j...@apache.org Committed: Wed May 7 13:17:33 2014 -0400 -- CHANGES.txt| 2 +- src/java/org/apache/cassandra/streaming/AbstractStreamSession.java | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/21b3a679/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index d7b7f00..8533e64 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -21,7 +21,7 @@ * Preserves CQL metadata when updating table from thrift (CASSANDRA-6831) * fix time conversion to milliseconds in SimpleCondition.await (CASSANDRA-7149) * remove duplicate query for local tokens (CASSANDRA-7182) - + * raise streaming phi convict threshold level (CASSANDRA-7063) 1.2.16 * Add UNLOGGED, COUNTER options to BATCH documentation (CASSANDRA-6816) http://git-wip-us.apache.org/repos/asf/cassandra/blob/21b3a679/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java -- diff --git a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java index f8de827..77dbcd6 100644 --- a/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java +++ b/src/java/org/apache/cassandra/streaming/AbstractStreamSession.java @@ -102,7 +102,7 @@ public abstract class AbstractStreamSession implements IEndpointStateChangeSubsc return; // We 
want a higher confidence in the failure detection than usual because failing a streaming wrongly has a high cost.
-        if (phi < 2 * DatabaseDescriptor.getPhiConvictThreshold())
+        if (phi < 100 * DatabaseDescriptor.getPhiConvictThreshold())
             return;
         logger.error("Stream failed because {} died or was restarted/removed (streams may still be active
[jira] [Updated] (CASSANDRA-7242) More compaction visibility into thread pool and per CF
[ https://issues.apache.org/jira/browse/CASSANDRA-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Lohfink updated CASSANDRA-7242:
-------------------------------------
Attachment: 7242_jmxify_compactionpool.txt

More compaction visibility into thread pool and per CF
------------------------------------------------------
Key: CASSANDRA-7242
URL: https://issues.apache.org/jira/browse/CASSANDRA-7242
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Chris Lohfink
Priority: Minor
Attachments: 7242_jmxify_compactionpool.txt, 7242_per_cf_compactionstats.txt

Two parts to this to help diagnose compaction issues/bottlenecks. They could be two different issues but are pretty closely related.

First is adding per-column-family pending compactions. When there's a lot of backed-up compactions but multiple ones currently being compacted, it's hard to identify which CF is causing the backlog. The patch provided doesn't cover the compactions in the thread pool's queue like compactionstats does, but I'm not sure how big that queue ever gets or whether it needs to be covered... which brings me to the second idea.

Second is to change compactionExecutor to extend JMXEnabledThreadPoolExecutor. The big difference there would be the blocking rejection handler. With a 2^31 pending queue, the blocking becoming an issue is a pretty extreme case in itself that would most likely OOM the server. So the different rejection policy shouldn't cause much of an issue, but if it does we can always override it to use the default behavior. It would help identify scenarios where corrupted sstables, unhandled exceptions, etc. kill the compactions and lead to a large backlog with nothing actively working. It also adds visibility into this from tpstats.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
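For context on the rejection-policy concern above: a blocking rejection handler makes the producer wait for queue space instead of throwing RejectedExecutionException. A minimal stand-alone sketch of such a policy (an illustration of the general technique, not the actual JMXEnabledThreadPoolExecutor code):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BlockingPolicyDemo
{
    // Rejection handler that blocks the submitting thread until queue space
    // frees up, rather than throwing RejectedExecutionException.
    static class BlockingPolicy implements RejectedExecutionHandler
    {
        public void rejectedExecution(Runnable task, ThreadPoolExecutor executor)
        {
            try
            {
                executor.getQueue().put(task); // blocks while the queue is full
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException(e);
            }
        }
    }

    public static void main(String[] args) throws Exception
    {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 1, 0, TimeUnit.SECONDS,
            new ArrayBlockingQueue<Runnable>(2), // tiny queue to force rejections
            new BlockingPolicy());

        // 10 slow tasks against a 1-thread pool with a 2-slot queue: without the
        // blocking policy most submissions would be rejected; with it they wait.
        for (int i = 0; i < 10; i++)
            pool.execute(() -> { try { Thread.sleep(10); } catch (InterruptedException ignored) {} });

        pool.shutdown();
        boolean done = pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("all tasks accepted, finished=" + done);
    }
}
```

With a practically unbounded (2^31) queue, as the comment notes, the handler would essentially never block, which is why swapping it in is low-risk.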
[jira] [Commented] (CASSANDRA-7232) Enable live replay of commit logs
[ https://issues.apache.org/jira/browse/CASSANDRA-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999218#comment-13999218 ] Tyler Hobbs commented on CASSANDRA-7232: To preemptively bikeshed, I would call this nodetool replaycommitlogs instead. Refresh doesn't have a clear meaning here. Enable live replay of commit logs - Key: CASSANDRA-7232 URL: https://issues.apache.org/jira/browse/CASSANDRA-7232 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Patrick McFadin Assignee: Lyuben Todorov Priority: Minor Fix For: 2.0.9 Replaying commit logs takes a restart but restoring sstables can be an online operation with refresh. In order to restore a point-in-time without a restart, the node needs to live replay the commit logs from JMX and a nodetool command. nodetool refreshcommitlogs keyspace table -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6397) removenode outputs confusing non-error
[ https://issues.apache.org/jira/browse/CASSANDRA-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-6397:
--------------------------------------
Reviewer: Brandon Williams
Reproduced In: 2.0.3, 1.2.12 (was: 1.2.12, 2.0.3)

[~brandon.williams] to review

removenode outputs confusing non-error
--------------------------------------
Key: CASSANDRA-6397
URL: https://issues.apache.org/jira/browse/CASSANDRA-6397
Project: Cassandra
Issue Type: Bug
Components: Tools
Reporter: Ryan McGuire
Assignee: Kirk True
Priority: Trivial
Labels: lhf
Fix For: 2.0.8
Attachments: trunk-6397.txt

*{{nodetool removenode force}}* outputs a slightly confusing error message when there is nothing for it to do.

* Start a cluster, then kill one of the nodes.
* Run *{{nodetool removenode}}* on the node you killed.
* Simultaneously, in another shell, run *{{nodetool removenode force}}*; see that it outputs a simple message regarding its status.
* Run *{{nodetool removenode force}}* again after the first removenode command finishes; you'll see this message and traceback:

{code}
$ ~/.ccm/test/node1/bin/nodetool -p 7100 removenode force
RemovalStatus: No token removals in process.
Exception in thread "main" java.lang.UnsupportedOperationException: No tokens to force removal on, call 'removetoken' first
	at org.apache.cassandra.service.StorageService.forceRemoveCompletion(StorageService.java:3140)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:111)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:45)
	at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:235)
	at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
	at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:250)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:791)
	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1486)
	at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:96)
	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1327)
	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1419)
	at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:847)
	at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
	at sun.rmi.transport.Transport$1.run(Transport.java:177)
	at sun.rmi.transport.Transport$1.run(Transport.java:174)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
	at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
	at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:722)
{code}

Two issues I see with this traceback:

* "No tokens to force removal on" tells me the same thing as the message before it ("RemovalStatus: No token removals in process."), so the entire traceback is redundant.
* "call 'removetoken' first" - removetoken has been deprecated according to the message output by removenode, so there is inconsistency in the directions given to the user.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (CASSANDRA-6975) Allow usage of QueryOptions in CQLStatement.executeInternal
[ https://issues.apache.org/jira/browse/CASSANDRA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999838#comment-13999838 ]

Aleksey Yeschenko commented on CASSANDRA-6975:
----------------------------------------------

bq. Just to be clear, I'm not too fussed about the fact that the paging call is not used; it's just convenient to have it here for when we'll need it, and I'm sure we will at some point.

Oh, we will - in at least two places in 2.1.x and in BM in 3.0.

Allow usage of QueryOptions in CQLStatement.executeInternal
-----------------------------------------------------------
Key: CASSANDRA-6975
URL: https://issues.apache.org/jira/browse/CASSANDRA-6975
Project: Cassandra
Issue Type: Improvement
Reporter: Mikhail Stepura
Assignee: Mikhail Stepura
Priority: Minor
Fix For: 2.1 rc1
Attachments: cassandra-2.1-executeInternal.patch

The current implementations of {{CQLStatement.executeInternal}} accept only {{QueryState}} as a parameter. That means it's impossible to use prepared statements with variables for internal calls (you can only pass the variables via {{QueryOptions}}). We also can't use the internal paging in internal SELECT statements for the very same reason.

I'm attaching a patch which implements that. [~slebresne] [~iamaleksey] what do you think, guys?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
[ https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yasuharu Goto updated CASSANDRA-7210: - Attachment: trunk-7210-2.txt [~mishail] Oops, I fixed. Add --resolve-ip option on 'nodetool ring' -- Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Trivial Fix For: 2.0.9, 2.1 rc1 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210-2.txt, trunk-7210.txt Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7093) ConfigHelper.setInputColumnFamily incompatible with upper-cased keyspaces since 2.0.7
[ https://issues.apache.org/jira/browse/CASSANDRA-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998797#comment-13998797 ]

Piotr Kołaczkowski commented on CASSANDRA-7093:
-----------------------------------------------

+1

ConfigHelper.setInputColumnFamily incompatible with upper-cased keyspaces since 2.0.7
-------------------------------------------------------------------------------------
Key: CASSANDRA-7093
URL: https://issues.apache.org/jira/browse/CASSANDRA-7093
Project: Cassandra
Issue Type: Bug
Reporter: Maxime Nay
Assignee: Alex Liu
Attachments: 7093.txt

Hi,

We have a keyspace starting with an upper-case character: Visitors. We are trying to run a map reduce job on one of the column families of this keyspace. To specify the keyspace it seems we have to use: org.apache.cassandra.hadoop.ConfigHelper.setInputColumnFamily(conf, keyspace, columnFamily);

If we do: ConfigHelper.setInputColumnFamily(conf, "Visitors", columnFamily); we get:

com.datastax.driver.core.exceptions.InvalidQueryException: Keyspace 'visitors' does not exist
	at com.datastax.driver.core.exceptions.InvalidQueryException.copy(InvalidQueryException.java:35)
	at com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:256)
	at com.datastax.driver.core.SessionManager.setKeyspace(SessionManager.java:335)
	...

And if we do: ConfigHelper.setInputColumnFamily(conf, "\"Visitors\"", columnFamily); we get:

Exception in thread "main" java.lang.RuntimeException: InvalidRequestException(why:No such keyspace: Visitors)
	at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getRangeMap(AbstractColumnFamilyInputFormat.java:339)
	at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:125)
	at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
	...

This is working just fine if the keyspace is lowercase. And it was working just fine with Cassandra 2.0.6.
But with Cassandra 2.0.7, and the addition of Datastax's java driver in the dependencies, I am getting this error. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998750#comment-13998750 ] Sylvain Lebresne commented on CASSANDRA-4718: - bq. The batching has already been committed to tip, so the 2.1-batchnetty is essentially this. And 4718-sep is essentially 2.1-batchnetty + the patch for this, right? More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Benedict Priority: Minor Labels: performance Fix For: 2.1.0 Attachments: 4718-v1.patch, PerThreadQueue.java, aws.svg, aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, jason_write.svg, op costs of various queues.ods, stress op rate with various queues.ods, v1-stress.out Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. 
(Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.2#6252)
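The batch-dequeue idea described in CASSANDRA-4718 can be illustrated with {{BlockingQueue.drainTo(collection, maxElements)}}: one lock acquisition moves a whole batch of tasks to a consumer thread, instead of one lock acquisition per task. A minimal stand-alone sketch of the consumer side (not the actual Cassandra executor):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

public class DrainToDemo
{
    public static void main(String[] args) throws Exception
    {
        LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 100; i++)
            queue.put(() -> { /* pretend to execute a stage task */ });

        int batches = 0, executed = 0;
        List<Runnable> batch = new ArrayList<>(32);
        while (!queue.isEmpty())
        {
            // One drain moves up to 32 tasks at once, reducing producer/consumer
            // contention compared to 32 separate take() calls.
            queue.drainTo(batch, 32);
            for (Runnable task : batch)
            {
                task.run();
                executed++;
            }
            batch.clear();
            batches++;
        }
        System.out.println(executed + " tasks in " + batches + " batches");
    }
}
```

As the ticket notes, no JDK ExecutorService drives its workers through drainTo, which is why a custom executor is needed to wire this into the read and write stages.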
git commit: fix typo
Repository: cassandra
Updated Branches:
  refs/heads/trunk d2b6063ad -> 401ed436a

fix typo

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/401ed436
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/401ed436
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/401ed436

Branch: refs/heads/trunk
Commit: 401ed436ab0ca1f9294fbdd26d3da78ee70e9c3f
Parents: d2b6063
Author: Dave Brosius <dbros...@mebigfatguy.com>
Authored: Thu May 15 22:32:41 2014 -0400
Committer: Dave Brosius <dbros...@mebigfatguy.com>
Committed: Thu May 15 22:32:41 2014 -0400

----------------------------------------------------------------------
 .../triggers/src/org/apache/cassandra/triggers/InvertedIndex.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/401ed436/examples/triggers/src/org/apache/cassandra/triggers/InvertedIndex.java
----------------------------------------------------------------------
diff --git a/examples/triggers/src/org/apache/cassandra/triggers/InvertedIndex.java b/examples/triggers/src/org/apache/cassandra/triggers/InvertedIndex.java
index fa90053..cdcb962 100644
--- a/examples/triggers/src/org/apache/cassandra/triggers/InvertedIndex.java
+++ b/examples/triggers/src/org/apache/cassandra/triggers/InvertedIndex.java
@@ -42,7 +42,7 @@ public class InvertedIndex implements ITrigger
         List<Mutation> mutations = new ArrayList<>(update.getColumnCount());
 
         String indexKeySpace = properties.getProperty("keyspace");
-        String indexColumnFamily = properties.getProperty("columnfamily")
+        String indexColumnFamily = properties.getProperty("columnfamily");
 
         for (Cell cell : update)
         {
             // Skip the row marker and other empty values, since they lead to an empty key.
[jira] [Updated] (CASSANDRA-7239) Nodetool Status Reports Negative Load With VNodes Disabled
[ https://issues.apache.org/jira/browse/CASSANDRA-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Russell Alexander Spitzer updated CASSANDRA-7239: - Reproduced In: 2.1 beta2 (was: 2.0 beta 2) Nodetool Status Reports Negative Load With VNodes Disabled -- Key: CASSANDRA-7239 URL: https://issues.apache.org/jira/browse/CASSANDRA-7239 Project: Cassandra Issue Type: Bug Components: Tools Environment: 1000 Nodes EC2 m1.large ubuntu 12.04 Reporter: Russell Alexander Spitzer Priority: Minor When I run stress on a large cluster without vnodes (num_token =1 initial token set) The loads reported by nodetool status are negative, or become negative after stress is run. {code} UN 10.97.155.31-447426217 bytes 1 0.2% 8d40568c-044c-4753-be26-4ab62710beba rack1 UN 10.9.132.53 -447342449 bytes 1 0.2% 58e7f255-803d-493b-a19e-58137466fb78 rack1 UN 10.37.151.202 -447298672 bytes 1 0.2% ba29b1f1-186f-45d0-9e59-6a528db8df5d rack1 {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
git commit: Limit user types to the keyspace they are defined in
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 ada8f1257 - c045690b1 Limit user types to the keyspace they are defined in patch by slebresne; reviewed by iamaleksey for CASSANDRA-6643 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c045690b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c045690b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c045690b Branch: refs/heads/cassandra-2.1 Commit: c045690b126e6b1f594291c6eff3957a6d9079ea Parents: ada8f12 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri May 16 15:56:05 2014 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri May 16 15:56:05 2014 +0200 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/cql3/CQL3Type.java | 17 ++--- .../cql3/statements/CreateTypeStatement.java | 3 --- .../cql3/statements/DropTypeStatement.java | 3 --- 4 files changed, 15 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c045690b/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 223f331..d12acc9 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -14,6 +14,7 @@ * Add server side batching to native transport (CASSANDRA-5663) * Make batchlog replay asynchronous (CASSANDRA-6134) * remove unused classes (CASSANDRA-7197) + * Limit user types to the keyspace they are defined in (CASSANDRA-6643) Merged from 2.0: * (Hadoop) support authentication in CqlRecordReader (CASSANDRA-7221) * (Hadoop) Close java driver Cluster in CQLRR.close (CASSANDRA-7228) http://git-wip-us.apache.org/repos/asf/cassandra/blob/c045690b/src/java/org/apache/cassandra/cql3/CQL3Type.java -- diff --git a/src/java/org/apache/cassandra/cql3/CQL3Type.java b/src/java/org/apache/cassandra/cql3/CQL3Type.java index 8ce89ca..f07bc19 100644 --- a/src/java/org/apache/cassandra/cql3/CQL3Type.java +++ b/src/java/org/apache/cassandra/cql3/CQL3Type.java @@ -347,7 +347,6 @@ public interface CQL3Type 
private static class RawUT extends Raw
     {
-
         private final UTName name;
 
         private RawUT(UTName name)
@@ -357,7 +356,19 @@ public interface CQL3Type
 
         public CQL3Type prepare(String keyspace) throws InvalidRequestException
         {
-            name.setKeyspace(keyspace);
+            if (name.hasKeyspace())
+            {
+                // The provided keyspace is the one of the current statement this is part of. If it's different from the keyspace of
+                // the UTName, we reject since we want to limit user types to their own keyspace (see #6643)
+                if (!keyspace.equals(name.getKeyspace()))
+                    throw new InvalidRequestException(String.format("Statement on keyspace %s cannot refer to a user type in keyspace %s; "
+                                                                    + "user types can only be used in the keyspace they are defined in",
+                                                                    keyspace, name.getKeyspace()));
+            }
+            else
+            {
+                name.setKeyspace(keyspace);
+            }
 
             KSMetaData ksm = Schema.instance.getKSMetaData(name.getKeyspace());
             if (ksm == null)
@@ -374,6 +385,6 @@ public interface CQL3Type
         {
             return name.toString();
         }
-    }
+    }
 }
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c045690b/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java b/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java
index de7ce56..aa8b769 100644
--- a/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java
@@ -50,9 +50,6 @@ public class CreateTypeStatement extends SchemaAlteringStatement
     {
         if (!name.hasKeyspace())
             name.setKeyspace(state.getKeyspace());
-
-        if (name.getKeyspace() == null)
-            throw new InvalidRequestException("You need to be logged in a keyspace or use a fully qualified user type name");
     }
 
     public void addDefinition(ColumnIdentifier name, CQL3Type.Raw type)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c045690b/src/java/org/apache/cassandra/cql3/statements/DropTypeStatement.java
[jira] [Created] (CASSANDRA-7245) Out-of-Order keys with stress + CQL3
Pavel Yaskevich created CASSANDRA-7245:
--------------------------------------

Summary: Out-of-Order keys with stress + CQL3
Key: CASSANDRA-7245
URL: https://issues.apache.org/jira/browse/CASSANDRA-7245
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Pavel Yaskevich

We have been generating data (stress with CQL3 prepared) for CASSANDRA-4718 and found the following problem in almost every SSTable generated (~200 GB of data and 821 SSTables). We set up the keys to be 10 bytes in size (default) and population between 1 and 6. Once I ran 'sstablekeys' on the generated SSTable files I get the following exceptions:

30303041333745373543 7cd045c59a90d7587d8d
java.io.IOException: Key out of order! DecoratedKey(-3595402345023230196, 7cd045c59a90d7587d8d) DecoratedKey(-5146766422778260690, 30303041333943303232)

3030303332314144 30303033323346343932
java.io.IOException: Key out of order! DecoratedKey(7071845511166615635, 30303033323346343932) DecoratedKey(5233296131921119414, 53d83e0012287e03)

30303034314531374431 3806734b256c27e41ec2
java.io.IOException: Key out of order! DecoratedKey(-7720474642702543193, 3806734b256c27e41ec2) DecoratedKey(-8072288379146044663, 30303034314136413343)

There is also a problem with sorting of normal looking keys:

30303039443538353645 30303039443745364242
java.io.IOException: Key out of order! DecoratedKey(-217680888487824985, 30303039443745364242) DecoratedKey(-1767746583617597213, 30303039443437454333)

0a30303033343933 3734441388343933
java.io.IOException: Key out of order! DecoratedKey(5440473860101999581, 3734441388343933) DecoratedKey(-7565486415339257200, 30303033344639443137)

30303033354244363031 30303033354133423742
java.io.IOException: Key out of order! DecoratedKey(2687072396429900180, 30303033354133423742) DecoratedKey(-7838239767410066684, 30303033354145344534)

30303034313442354137 3034313635363334
java.io.IOException: Key out of order! DecoratedKey(1516003874415400462, 3034313635363334) DecoratedKey(-9106177395653818217, 3030303431444238)

And sometimes there is no problem at all:

30303033353144463637 002a31b3b31a1c2f 5d616dd38211ebb5d6ec 444236451388 1388138844463744 30303033353143394343

It's worth mentioning that we have got 22 timeout exceptions, but the number of out-of-order keys is much larger than that.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (CASSANDRA-5483) Repair tracing
[ https://issues.apache.org/jira/browse/CASSANDRA-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Chan updated CASSANDRA-5483: Attachment: 5483-v13-608fb03-May-14-trace-formatting-changes.patch May 14 formatting changes in [^5483-v13-01-608fb03-May-14-trace-formatting-changes.patch] (based off of commit 608fb03). {quote} I think the session log messages are still confusing, especially since we use the same term for repairing a subrange and streaming data. {quote} Currently the session terminology is baked into the source code, in {{StreamSession.java}} and {{RepairSession.java}}. If the messages are changed to reflect different terminology, hopefully the source code can eventually be changed to match (fewer special cases to remember). Perhaps the best thing is to always qualify them, e.g. stream session and repair session? {quote} I don't actually see the session uuid being used in the logs except at start/finish. {quote} Sorry, that was another inadvertent mixing of nodetool messages and trace output. {{\[2014-05-13 23:49:52,283] Repair session cd6aad80-db1a-11e3-b0e7-f94811c7b860 for range (3074457345618258602,-9223372036854775808] finished}} is not a trace, but a separate (pre-patch) sendNotification in {{StorageService.java}}. This message (and some of the error messages, I think) is redundant when combined with trace output. It should have been either one or the other, not both. In the trace proper, the session UUID only shows up at the start. But note: not all nodetool messages are rendered redundant by trace output. Since we can't just suppress all non-trace sendNotification, how can we unambiguously tell nodetool trace output from normal sendNotification messages? I'm currently leaning towards just marking all sendNotification trace output with a {{TRACE:}} tag. The repair session UUIDs used to be prepended to everything, but were removed in [^5483-v08-11-Shorten-trace-messages.-Use-Tracing-begin.patch]. 
Without them, things are less verbose, but it's sometimes hard to unambiguously follow traces for concurrent repair sessions. To make the point clearer, I've marked each sub-task graphically in the nodetool trace output below (I've cross-checked this with the logs, which do retain the UUIDs). If you cover up the left side, it's harder to figure out which trace goes with which sub-task. Real-world repair traces will probably be even more confusing. Note: indentation here does not denote nesting; the column roughly indicates task identity, though I reuse columns when it's not ambiguous. {noformat}
1 [2014-05-15 11:31:37,839] Starting repair command #1, repairing 3 ranges for s1.users (seq=true, full=true)
x [2014-05-15 11:31:37,922] Syncing range (-3074457345618258603,3074457345618258602]
x [2014-05-15 11:31:38,108] Requesting merkle trees for users from [/127.0.0.2, /127.0.0.3, /127.0.0.1]
x [2014-05-15 11:31:38,833] /127.0.0.2: Sending completed merkle tree to /127.0.0.1 for s1.users
x [2014-05-15 11:31:39,953] Received merkle tree for users from /127.0.0.2
x [2014-05-15 11:31:40,939] /127.0.0.3: Sending completed merkle tree to /127.0.0.1 for s1.users
x [2014-05-15 11:31:41,279] Received merkle tree for users from /127.0.0.3
x [2014-05-15 11:31:42,632] Received merkle tree for users from /127.0.0.1
x [2014-05-15 11:31:42,671] Syncing range (-9223372036854775808,-3074457345618258603]
x [2014-05-15 11:31:42,766] Requesting merkle trees for users from [/127.0.0.2, /127.0.0.3, /127.0.0.1]
x [2014-05-15 11:31:42,905] Endpoint /127.0.0.2 is consistent with /127.0.0.3 for users
x [2014-05-15 11:31:43,044] Endpoint /127.0.0.2 is consistent with /127.0.0.1 for users
x [2014-05-15 11:31:43,047] Endpoint /127.0.0.3 is consistent with /127.0.0.1 for users
x [2014-05-15 11:31:43,084] Completed sync of range (-3074457345618258603,3074457345618258602]
x [2014-05-15 11:31:43,251] /127.0.0.2: Sending completed merkle tree to /127.0.0.1 for s1.users
x [2014-05-15 11:31:43,422] Received merkle tree for users from /127.0.0.2
x [2014-05-15 11:31:44,495] /127.0.0.3: Sending completed merkle tree to /127.0.0.1 for s1.users
x [2014-05-15 11:31:44,637] Received merkle tree for users from /127.0.0.3
x [2014-05-15 11:31:45,474] Received merkle tree for users from /127.0.0.1
x [2014-05-15 11:31:45,494] Syncing range (3074457345618258602,-9223372036854775808]
x [2014-05-15 11:31:45,499] Endpoint /127.0.0.3 is consistent with /127.0.0.1 for users
x [2014-05-15 11:31:45,520] Endpoint /127.0.0.2 is consistent with /127.0.0.1 for users
x [2014-05-15 11:31:45,544] Endpoint /127.0.0.2 is consistent with /127.0.0.3 for users
x [2014-05-15 11:31:45,564] Completed sync of range
[jira] [Updated] (CASSANDRA-7231) Support more concurrent requests per native transport connection
[ https://issues.apache.org/jira/browse/CASSANDRA-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7231: -- Reviewer: Aleksey Yeschenko Support more concurrent requests per native transport connection Key: CASSANDRA-7231 URL: https://issues.apache.org/jira/browse/CASSANDRA-7231 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Sylvain Lebresne Priority: Minor Fix For: 2.1.0 Attachments: 7231.txt Right now we only support 127 concurrent requests against a given native transport connection. This causes us to waste file handles opening multiple connections, increases driver complexity and dilutes writes across multiple connections so that batching cannot easily be performed. I propose raising this limit substantially, to somewhere in the region of 16-64K, and that this is a good time to do it since we're already bumping the protocol version. -- This message was sent by Atlassian JIRA (v6.2#6252)
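The 127-request ceiling comes from the native protocol v1/v2 frame header, which carries the stream id in a single signed byte (non-negative ids for client-initiated requests, hence 0..127). Widening the field to a signed short, as protocol v3 did, lifts the ceiling into the proposed 16-64K region. A minimal sketch of the idea (class and method names are illustrative, not Cassandra's actual framing code):

```java
import java.nio.ByteBuffer;

public class FrameHeader {
    // v1/v2: the stream id is one signed byte; non-negative ids give 0..127.
    public static final int MAX_STREAMS_V2 = Byte.MAX_VALUE + 1;   // 128 ids

    // v3-style: a signed short gives 0..32767 concurrent streams per connection.
    public static final int MAX_STREAMS_V3 = Short.MAX_VALUE + 1;  // 32768 ids

    // Pack a stream id into a 2-byte header field.
    public static ByteBuffer encodeStreamId(int streamId) {
        if (streamId < 0 || streamId >= MAX_STREAMS_V3)
            throw new IllegalArgumentException("stream id out of range: " + streamId);
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putShort((short) streamId);
        buf.flip();
        return buf;
    }

    // Read the stream id back out of the header field.
    public static int decodeStreamId(ByteBuffer buf) {
        return buf.getShort();
    }
}
```

The extra byte per frame is a trivial cost next to the saved file handles and the batching opportunities a single fat connection enables.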
[4/6] git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7f3d07ac Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7f3d07ac Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7f3d07ac Branch: refs/heads/trunk Commit: 7f3d07ac02178c3d3012557f4df18fe116e5ec11 Parents: 361ad68 7484bd4 Author: Yuki Morishita yu...@apache.org Authored: Fri May 9 10:41:36 2014 -0500 Committer: Yuki Morishita yu...@apache.org Committed: Fri May 9 10:41:36 2014 -0500 -- .../org/apache/cassandra/streaming/StreamSession.java | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/7f3d07ac/src/java/org/apache/cassandra/streaming/StreamSession.java --
[jira] [Commented] (CASSANDRA-7198) CqlPagingRecordReader throws IllegalStateException
[ https://issues.apache.org/jira/browse/CASSANDRA-7198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993398#comment-13993398 ] Brent Theisen commented on CASSANDRA-7198: -- Ahh, of course. Attached a patch that catches it. CqlPagingRecordReader throws IllegalStateException -- Key: CASSANDRA-7198 URL: https://issues.apache.org/jira/browse/CASSANDRA-7198 Project: Cassandra Issue Type: Bug Components: Hadoop Environment: Spark with Calliope EA against Cassandra 2.0.7 Reporter: Brent Theisen Priority: Trivial Fix For: 2.0.9 Attachments: trunk-7198-2.txt Getting the following exception when running a Spark job that does *not* specify cassandra.input.page.row.size: {code}
14/05/08 14:30:43 ERROR executor.Executor: Exception in task ID 12
java.lang.IllegalStateException: Optional.get() cannot be called on an absent value
    at com.google.common.base.Absent.get(Absent.java:47)
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:120)
    at com.tuplejump.calliope.cql3.Cql3CassandraRDD$$anon$1.init(Cql3CassandraRDD.scala:65)
    at com.tuplejump.calliope.cql3.Cql3CassandraRDD.compute(Cql3CassandraRDD.scala:53)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:109)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
14/05/08 14:30:43 ERROR executor.Executor: Exception in task ID 21
java.lang.IllegalStateException: Optional.get() cannot be called on an absent value
    at com.google.common.base.Absent.get(Absent.java:47)
    at org.apache.cassandra.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:120)
    at com.tuplejump.calliope.cql3.Cql3CassandraRDD$$anon$1.init(Cql3CassandraRDD.scala:65)
    at com.tuplejump.calliope.cql3.Cql3CassandraRDD.compute(Cql3CassandraRDD.scala:53)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:109)
    at org.apache.spark.scheduler.Task.run(Task.scala:53)
    at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{code} The reason is that CqlPagingRecordReader catches the wrong exception type. Patch attached. -- This message was sent by Atlassian JIRA (v6.2#6252)
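The failure above comes from calling get() on an absent Guava Optional, which throws IllegalStateException (unlike java.util.Optional, whose get() throws NoSuchElementException) — an easy exception type to catch incorrectly. A hedged sketch of the defensive alternative, checking presence instead of catching; the class name and default value below are hypothetical, not the actual CqlPagingRecordReader code:

```java
import java.util.Optional;

public class PageRowSizeConfig {
    // Hypothetical fallback when cassandra.input.page.row.size is not set.
    static final int DEFAULT_PAGE_ROW_SIZE = 1000;

    // Resolve the configured value without ever calling get() on an
    // absent Optional, so no exception type needs to be caught at all.
    public static int pageRowSize(Optional<Integer> configured) {
        return configured.orElse(DEFAULT_PAGE_ROW_SIZE);
    }
}
```

Checking presence up front sidesteps the whole class of bugs where the catch block names the wrong exception.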
[jira] [Updated] (CASSANDRA-7244) Don't allocate a Codec.Flag enum value array on every read
[ https://issues.apache.org/jira/browse/CASSANDRA-7244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-7244: Attachment: 7244.txt Don't allocate a Codec.Flag enum value array on every read -- Key: CASSANDRA-7244 URL: https://issues.apache.org/jira/browse/CASSANDRA-7244 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Labels: performance Fix For: 2.1 rc1 Attachments: 7244.txt In QueryOptions.Codec.Flag.deserialize() we call Flag.values(), which constructs a copy of the Enum array each time. Since we only use this to look up the enums, and since it happens _often_, this is pretty wasteful. It seems to make a few % difference to throughput. -- This message was sent by Atlassian JIRA (v6.2#6252)
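The pattern behind this ticket is general Java advice: Enum.values() clones its backing array on every call, so a hot deserialization path should cache the array once. A minimal sketch of the fix (the enum constants and class name are illustrative, not Cassandra's actual Flag definition):

```java
import java.util.EnumSet;

public class FlagCodec {
    public enum Flag { COMPRESSED, TRACING }

    // Flag.values() returns a fresh defensive copy each call;
    // caching it once avoids an allocation per read.
    private static final Flag[] ALL = Flag.values();

    // Decode a bitmask into the set of flags it represents.
    public static EnumSet<Flag> deserialize(int flags) {
        EnumSet<Flag> set = EnumSet.noneOf(Flag.class);
        for (int i = 0; i < ALL.length; i++)
            if ((flags & (1 << i)) != 0)
                set.add(ALL[i]);
        return set;
    }
}
```

The cached array is safe to share because it is never exposed or mutated; only the lookup loop reads it.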
[jira] [Comment Edited] (CASSANDRA-6454) Pig support for hadoop CqlInputFormat
[ https://issues.apache.org/jira/browse/CASSANDRA-6454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999765#comment-13999765 ] Shridhar edited comment on CASSANDRA-6454 at 5/16/14 12:11 PM: --- [~alexliu68] I tried to run a pig script to load data with CqlNativeStorage but am getting problems; it looks like jar conflicts, or maybe something else. Can you please let me know the required jars and their versions needed to run CqlNativeStorage? Below are the things I tried. 1. Applied this patch on top of Cassandra 2.0.7. 2. When I tried to run the pig script with CqlNativeStorage it threw an ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. com/datastax/driver/core/policies/LoadBalancingPolicy exception. 3. Then I added cassandra-driver-core-2.0.1.jar to my classpath; after this I got an exception ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: com.codahale.metrics.Metric ... Caused by: java.lang.ClassNotFoundException: com.codahale.metrics.Metric 4. Then I added metrics-core-3.0.2.jar. After adding this jar file I was able to run the job, but it failed, and my hadoop log shows this exception: Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: slave1.tpgsi.com/.yyy.zzz.aa (com.datastax.driver.core.TransportException: [slave1/.yyy.zzz.aa] Error writing)) NOTE: metrics-core-2.2.0.jar already exists in C* 2.0.7 lib folder. was (Author: shridharb): [~alexliu68] I tried to run pig script to load data with CqlNativeStorage but getting problems, looks like jar conflicts or may be something else can you please let me know the required jars and their version needed to run CqlNativeStorage . Below are the things i tried. 1.Applied this patch on top of Cassandra 2.0.07. 2.When i tried to run pig script with CqlNativeStorage it threw me ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error.
com/datastax/driver/core/policies/LoadBalancingPolicy exception. 3.Then added cassandra-driver-core-2.0.1.jar in my classpath after this i got an exception ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR: com.codahale.metrics.Metric ... Caused by: java.lang.ClassNotFoundException: com.codahale.metrics.Metric 4.Then added metrics-core-3.0.2.jar. After adding this jar file i was able to run the job but failed and my hadoop log shows me this exception NOTE: metrics-core-2.2.0.jar already exists in C* 2.0.7 lib folder. Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: slave1.tpgsi.com/.yyy.zzz.aa (com.datastax.driver.core.TransportException: [slave1/.yyy.zzz.aa] Error writing)) Pig support for hadoop CqlInputFormat - Key: CASSANDRA-6454 URL: https://issues.apache.org/jira/browse/CASSANDRA-6454 Project: Cassandra Issue Type: Bug Components: Hadoop Reporter: Alex Liu Assignee: Alex Liu Fix For: 2.0.9 Attachments: 6454-2.0-branch.txt CASSANDRA-6311 adds new CqlInputFormat, we need add the Pig support for it -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7250) Make incremental repair default in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-7250: --- Attachment: 0001-make-incremental-repair-default-in-trunk.patch Make incremental repair default in 3.0 -- Key: CASSANDRA-7250 URL: https://issues.apache.org/jira/browse/CASSANDRA-7250 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Marcus Eriksson Priority: Minor Labels: repair Fix For: 3.0 Attachments: 0001-make-incremental-repair-default-in-trunk.patch To get more testing etc, we should make incremental repair default in trunk -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7232) Enable live replay of commit logs
[ https://issues.apache.org/jira/browse/CASSANDRA-7232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999843#comment-13999843 ] Jonathan Ellis commented on CASSANDRA-7232: --- +1 replaycommitlogs Enable live replay of commit logs - Key: CASSANDRA-7232 URL: https://issues.apache.org/jira/browse/CASSANDRA-7232 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Patrick McFadin Assignee: Lyuben Todorov Priority: Minor Fix For: 2.0.9 Replaying commit logs takes a restart but restoring sstables can be an online operation with refresh. In order to restore a point-in-time without a restart, the node needs to live replay the commit logs from JMX and a nodetool command. nodetool refreshcommitlogs keyspace table -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Issue Comment Deleted] (CASSANDRA-7242) More compaction visibility into thread pool and per CF
[ https://issues.apache.org/jira/browse/CASSANDRA-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Lohfink updated CASSANDRA-7242: - Comment: was deleted (was: patch for cassandra-2.0 branch) More compaction visibility into thread pool and per CF -- Key: CASSANDRA-7242 URL: https://issues.apache.org/jira/browse/CASSANDRA-7242 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Chris Lohfink Priority: Minor Attachments: 7242_jmxify_compactionpool.txt, 7242_per_cf_compactionstats.txt Two parts to this to help diagnose compaction issues/bottlenecks. Could be two different issues but they're pretty closely related. First is adding per-column-family pending compactions. When there's a lot of backed-up compactions but multiple ones currently being compacted, it's hard to identify which CF is causing the backlog. The patch provided doesn't cover the compactions in the thread pool's queue like compactionstats does, but I'm not sure how big that ever gets or if it needs to be covered... which brings me to the second idea. Second is to change compactionExecutor to extend JMXEnabledThreadPoolExecutor. The big difference there would be the blocking rejection handler. With a 2^31 pending queue, the blocking becoming an issue is a pretty extreme case in itself that would most likely OOM the server. So the different rejection policy shouldn't cause much of an issue, but if it does we can always override it to use the default behavior. This would help identify scenarios where corrupted sstables or unhandled exceptions etc. killing the compactions lead to a large backlog with nothing actively working. Also just for added visibility into this from tpstats. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7247) Provide top ten most frequent keys per column family
[ https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999849#comment-13999849 ] Chris Lohfink commented on CASSANDRA-7247: -- Another option might be to spin off this and other metrics into the MiscStage; it only has a single thread, so no synchronization is required, and it won't be as bad to put additional metrics in there for additional visibility, like topK size in bytes, worst latencies and such. I wouldn't expect much difference performance-wise with just the one stream summary above, since enqueuing onto the LinkedBlockingQueue should have similar locking performance (synchronization on putLock), but then reading the metric would never cause contention (albeit very small) on the write path. If there's any interest I can give it a shot though, and maybe throw in some additional metrics. Provide top ten most frequent keys per column family Key: CASSANDRA-7247 URL: https://issues.apache.org/jira/browse/CASSANDRA-7247 Project: Cassandra Issue Type: Improvement Reporter: Chris Lohfink Priority: Minor Attachments: patch.diff Since we already have the nice addthis stream library, we can use it to keep track of the most frequent DecoratedKeys that come through the system using StreamSummaries ([nice explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]). Then provide a new metric to access them via JMX. -- This message was sent by Atlassian JIRA (v6.2#6252)
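The Space-Saving algorithm behind StreamSummary can be condensed to a few lines: keep at most k counters; when a full sketch sees a new key, evict the smallest counter and credit the newcomer with min+1, which bounds the possible over-count. A toy Java version for intuition (this illustrates the algorithm only; it is not stream-lib's actual StreamSummary implementation, which uses a linked "stream-summary" structure for O(1) updates):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SpaceSaving<T> {
    private final int capacity;                       // max tracked keys (k)
    private final Map<T, Long> counts = new HashMap<>();

    public SpaceSaving(int capacity) { this.capacity = capacity; }

    public void offer(T item) {
        Long c = counts.get(item);
        if (c != null) { counts.put(item, c + 1); return; }
        if (counts.size() < capacity) { counts.put(item, 1L); return; }
        // At capacity: evict the minimum counter and give the newcomer
        // min+1 — the over-estimate that bounds Space-Saving's error.
        T min = null;
        long minCount = Long.MAX_VALUE;
        for (Map.Entry<T, Long> e : counts.entrySet())
            if (e.getValue() < minCount) { min = e.getKey(); minCount = e.getValue(); }
        counts.remove(min);
        counts.put(item, minCount + 1);
    }

    public List<T> topK(int k) {
        List<T> keys = new ArrayList<>(counts.keySet());
        keys.sort((a, b) -> Long.compare(counts.get(b), counts.get(a)));
        return keys.subList(0, Math.min(k, keys.size()));
    }
}
```

With a fixed k, memory stays bounded no matter how many distinct keys stream through, which is what makes it cheap enough to sit on the write path.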
[jira] [Updated] (CASSANDRA-7225) cqlsh help for CQL3 is often incorrect and should be modernized
[ https://issues.apache.org/jira/browse/CASSANDRA-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7225: -- Component/s: Tools Documentation website Assignee: Mikhail Stepura cqlsh help for CQL3 is often incorrect and should be modernized --- Key: CASSANDRA-7225 URL: https://issues.apache.org/jira/browse/CASSANDRA-7225 Project: Cassandra Issue Type: Bug Components: Documentation website, Tools Reporter: Robert Stupp Assignee: Mikhail Stepura Priority: Trivial Labels: cqlsh Just a small line of text in the cqlsh help command indicates that < is <= and > is >= in CQL. This is confusing to many people (including me :) ) because I did not expect > to return the equals portion. Please allow distinct behaviours for <, <=, and >= in CQL queries. Maybe in combination with CASSANDRA-5184 and/or CASSANDRA-4914 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7209) Consider changing UDT serialization format before 2.1 release.
[ https://issues.apache.org/jira/browse/CASSANDRA-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999845#comment-13999845 ] Aleksey Yeschenko commented on CASSANDRA-7209: -- This is going to break cqlsh support, obviously (aka python-driver support), so I'm reopening CASSANDRA-6305, /cc [~thobbs] [~mishail] Consider changing UDT serialization format before 2.1 release. -- Key: CASSANDRA-7209 URL: https://issues.apache.org/jira/browse/CASSANDRA-7209 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 2.1 rc1 Attachments: 0001-7209.txt, 0002-Rename-column_names-types-to-field_names-types.txt The current serialization format of UDT is the one of CompositeType. This was initially done on purpose, so that users that were using CompositeType for values in their thrift schema could migrate smoothly to UDT (it was also convenient code-wise, but that's a weak point). I'm having serious doubts about this being wise, however, for 2 reasons: * for each component, CompositeType stores an additional byte (the end-of-component) for reasons that only pertain to querying. This byte is basically wasted for UDT and makes no sense. I'll note that outside the inefficiency, there is also the fact that it will likely be pretty surprising/error-prone for driver authors. * it uses an unsigned short for the length of each component. While it's certainly not advisable in the current implementation to use values too big inside an UDT, having this limitation hard-coded in the serialization format is wrong, and we've been bitten by this with collections already, which we've had to fix in protocol v3. It's probably worth not making that mistake again. Furthermore, if we use an int for the size, we can use a negative size to represent a null value (the main point being that it's consistent with how we serialize values in the native protocol), which can be useful (CASSANDRA-7206).
Of course, if we change that serialization format, we'd better do it before the 2.1 release. But I think the advantages outweigh the cons especially in the long run so I think we should do it. I'll try to work out a patch quickly so if you have a problem with the principle of this issue, it would be nice to voice it quickly. I'll note that doing that change will mean existing CompositeType values won't be able to be migrated transparently to UDT. I think this was anecdotal in the first place at best, I don't think using CompositeType for values is that popular in thrift tbh. Besides, if we really really want to, it might not be too hard to re-introduce that compatibility later by having some protocol level trick. We can't change the serialization format without breaking people however. -- This message was sent by Atlassian JIRA (v6.2#6252)
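The proposed format can be made concrete: each UDT field serialized as a 4-byte signed length followed by the field's bytes, with a negative length denoting null, mirroring how the native protocol serializes values. A sketch under those assumptions (illustrative code, not the actual patch attached to this ticket):

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class UdtValueCodec {
    // Each field: 4-byte signed length, then the bytes; length < 0 means null.
    public static ByteBuffer serialize(List<byte[]> fields) {
        int size = 0;
        for (byte[] f : fields)
            size += 4 + (f == null ? 0 : f.length);
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] f : fields) {
            if (f == null) { out.putInt(-1); continue; }
            out.putInt(f.length);
            out.put(f);
        }
        out.flip();
        return out;
    }

    public static List<byte[]> deserialize(ByteBuffer in) {
        List<byte[]> fields = new ArrayList<>();
        while (in.hasRemaining()) {
            int len = in.getInt();
            if (len < 0) { fields.add(null); continue; }
            byte[] f = new byte[len];
            in.get(f);
            fields.add(f);
        }
        return fields;
    }
}
```

Compared to the CompositeType encoding, this drops the end-of-component byte, lifts the 64KB-per-field ceiling of the unsigned short, and gains a natural null representation.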
[jira] [Updated] (CASSANDRA-6563) TTL histogram compactions not triggered at high Estimated droppable tombstones rate
[ https://issues.apache.org/jira/browse/CASSANDRA-6563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Ricardo Motta Gomes updated CASSANDRA-6563: - Attachment: patch-v1-range1.png patch-v2-range3.png patch-v1-range2.png TTL histogram compactions not triggered at high Estimated droppable tombstones rate - Key: CASSANDRA-6563 URL: https://issues.apache.org/jira/browse/CASSANDRA-6563 Project: Cassandra Issue Type: Bug Components: Core Environment: 1.2.12ish Reporter: Chris Burroughs Assignee: Paulo Ricardo Motta Gomes Fix For: 1.2.17, 2.0.8 Attachments: 1.2.16-CASSANDRA-6563.txt, 2.0.7-CASSANDRA-6563.txt, patch-v1-range1.png, patch-v1-range2.png, patch-v2-range3.png, patched-droppadble-ratio.png, patched-storage-load.png, patched1-compacted-bytes.png, patched2-compacted-bytes.png, unpatched-droppable-ratio.png, unpatched-storage-load.png, unpatched1-compacted-bytes.png, unpatched2-compacted-bytes.png I have several column families in a largish cluster where virtually all columns are written with a (usually the same) TTL. My understanding of CASSANDRA-3442 is that sstables that have a high (> 20%) estimated percentage of droppable tombstones should be individually compacted. This does not appear to be occurring with size tiered compaction.
Example from one node: {noformat}
$ ll /data/sstables/data/ks/Cf/*Data.db
-rw-rw-r-- 31 cassandra cassandra 26651211757 Nov 26 22:59 /data/sstables/data/ks/Cf/ks-Cf-ic-295562-Data.db
-rw-rw-r-- 31 cassandra cassandra  6272641818 Nov 27 02:51 /data/sstables/data/ks/Cf/ks-Cf-ic-296121-Data.db
-rw-rw-r-- 31 cassandra cassandra  1814691996 Dec  4 21:50 /data/sstables/data/ks/Cf/ks-Cf-ic-320449-Data.db
-rw-rw-r-- 30 cassandra cassandra 10909061157 Dec 11 17:31 /data/sstables/data/ks/Cf/ks-Cf-ic-340318-Data.db
-rw-rw-r-- 29 cassandra cassandra   459508942 Dec 12 10:37 /data/sstables/data/ks/Cf/ks-Cf-ic-342259-Data.db
-rw-rw-r--  1 cassandra cassandra      336908 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342307-Data.db
-rw-rw-r--  1 cassandra cassandra     2063935 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342309-Data.db
-rw-rw-r--  1 cassandra cassandra         409 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342314-Data.db
-rw-rw-r--  1 cassandra cassandra    31180007 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342319-Data.db
-rw-rw-r--  1 cassandra cassandra     2398345 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342322-Data.db
-rw-rw-r--  1 cassandra cassandra       21095 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342331-Data.db
-rw-rw-r--  1 cassandra cassandra       81454 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342335-Data.db
-rw-rw-r--  1 cassandra cassandra     1063718 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342339-Data.db
-rw-rw-r--  1 cassandra cassandra      127004 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342344-Data.db
-rw-rw-r--  1 cassandra cassandra      146785 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342346-Data.db
-rw-rw-r--  1 cassandra cassandra      697338 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342351-Data.db
-rw-rw-r--  1 cassandra cassandra     3921428 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342367-Data.db
-rw-rw-r--  1 cassandra cassandra      240332 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342370-Data.db
-rw-rw-r--  1 cassandra cassandra       45669 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342374-Data.db
-rw-rw-r--  1 cassandra cassandra    53127549 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342375-Data.db
-rw-rw-r-- 16 cassandra cassandra 12466853166 Dec 25 22:40 /data/sstables/data/ks/Cf/ks-Cf-ic-396473-Data.db
-rw-rw-r-- 12 cassandra cassandra  3903237198 Dec 29 19:42 /data/sstables/data/ks/Cf/ks-Cf-ic-408926-Data.db
-rw-rw-r--  7 cassandra cassandra  3692260987 Jan  3 08:25 /data/sstables/data/ks/Cf/ks-Cf-ic-427733-Data.db
-rw-rw-r--  4 cassandra cassandra  3971403602 Jan  6 20:50 /data/sstables/data/ks/Cf/ks-Cf-ic-437537-Data.db
-rw-rw-r--  3 cassandra cassandra  1007832224 Jan  7 15:19 /data/sstables/data/ks/Cf/ks-Cf-ic-440331-Data.db
-rw-rw-r--  2 cassandra cassandra   896132537 Jan  8 11:05 /data/sstables/data/ks/Cf/ks-Cf-ic-447740-Data.db
-rw-rw-r--  1 cassandra cassandra   963039096 Jan  9 04:59 /data/sstables/data/ks/Cf/ks-Cf-ic-449425-Data.db
-rw-rw-r--  1 cassandra cassandra   232168351 Jan  9 10:14 /data/sstables/data/ks/Cf/ks-Cf-ic-450287-Data.db
-rw-rw-r--  1 cassandra cassandra    73126319 Jan  9 11:28 /data/sstables/data/ks/Cf/ks-Cf-ic-450307-Data.db
-rw-rw-r--  1 cassandra cassandra    40921916 Jan  9 12:08
[jira] [Updated] (CASSANDRA-7249) Too many threads associated with parallel compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cyril Scetbon updated CASSANDRA-7249: - Description: We have a lot of threads on some nodes as you can see : node001: 560 node002: 529 node003: 4350 node004: 552 node005: 547 node006: 554 node007: 572 node008: 1444 <== node009: 540 node010: 13691 <== node011: 577 node012: 536 node013: 448 node014: 10295 <== node015: 452 node016: 576 When I check what those threads are, I see a lot of sstable Deserializer threads. Enabling DEBUG mode shows that a lot of the actions are about parallel compaction. What is really surprising is that it tries to deserialize each sstable a huge number of times, even though we only have 8 files for the concerned column family : 512690 /data/ks1/cf1/ks1-cf1-ic-616-Data.db 296623 /data/ks1/cf1/ks1-cf1-ic-637-Data.db 311904 /data/ks1/cf1/ks1-cf1-ic-642-Data.db 127061 /data/ks1/cf1/ks1-cf1-ic-643-Data.db 126921 /data/ks1/cf1/ks1-cf1-ic-644-Data.db 129815 /data/ks1/cf1/ks1-cf1-ic-645-Data.db 127862 /data/ks1/cf1/ks1-cf1-ic-646-Data.db 317069 /data/ks1/cf1/ks1-cf1-ic-647-Data.db so, in a minute Cassandra executes the following code 2 million times : {code} else { logger.debug("parallel eager deserialize from " + iter.getPath()); queue.put(new RowContainer(new Row(iter.getKey(), iter.getColumnFamilyWithColumns(ArrayBackedSortedColumns.factory)))); } {code} It seems to be related to [CASSANDRA-5720|https://issues.apache.org/jira/browse/CASSANDRA-5720] because we got the same error on the concerned column families before the number of threads rose. Upgrading to 2.0 is not a solution for now :( was: We have a lot of threads on some nodes as you can see : node001: 560 node002: 529 node003: 4350 node004: 552 node005: 547 node006: 554 node007: 572 node008: 1444 <== node009: 540 node010: 13691 <== node011: 577 node012: 536 node013: 448 node014: 10295 <== node015: 452 node016: 576 When I check what those threads are, I see a lot of sstable Deserializer threads.
Enabling DEBUG mode shows that a lot of the actions are about parallel compaction. What is really surprising is that it tries to deserialize each sstable a huge number of times, even though we only have 8 files for the concerned column family : 512690 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-616-Data.db 296623 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-637-Data.db 311904 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-642-Data.db 127061 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-643-Data.db 126921 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-644-Data.db 129815 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-645-Data.db 127862 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-646-Data.db 317069 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-647-Data.db so, in a minute Cassandra executes the following code 2 million times : {code} else { logger.debug("parallel eager deserialize from " + iter.getPath()); queue.put(new RowContainer(new Row(iter.getKey(), iter.getColumnFamilyWithColumns(ArrayBackedSortedColumns.factory)))); } {code} It seems to be related to [CASSANDRA-5720|https://issues.apache.org/jira/browse/CASSANDRA-5720] because we got the same error on the concerned column families before the number of threads rose. Upgrading to 2.0 is not a solution for now :( Too many threads associated with parallel compaction Key: CASSANDRA-7249 URL: https://issues.apache.org/jira/browse/CASSANDRA-7249 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 12.04.3 LTS 24 CPUs (hyper threading enabled) Reporter: Cyril Scetbon Labels: compaction, parallel, threads We have a lot of threads on some nodes as you can see : node001: 560 node002: 529 node003: 4350 node004: 552 node005: 547 node006: 554 node007: 572 node008: 1444 <== node009: 540 node010: 13691 <== node011: 577 node012: 536 node013: 448 node014: 10295 <== node015: 452 node016: 576 When I check what those threads are, I see a lot of sstable Deserializer threads.
Enabling DEBUG mode shows that a lot of the actions are about parallel compaction. What is really surprising is that it tries to deserialize each sstable a huge number of times, even though we only have 8 files for the concerned column family : 512690 /data/ks1/cf1/ks1-cf1-ic-616-Data.db 296623 /data/ks1/cf1/ks1-cf1-ic-637-Data.db 311904 /data/ks1/cf1/ks1-cf1-ic-642-Data.db 127061 /data/ks1/cf1/ks1-cf1-ic-643-Data.db 126921 /data/ks1/cf1/ks1-cf1-ic-644-Data.db 129815 /data/ks1/cf1/ks1-cf1-ic-645-Data.db 127862 /data/ks1/cf1/ks1-cf1-ic-646-Data.db 317069 /data/ks1/cf1/ks1-cf1-ic-647-Data.db so, in a minute Cassandra executes the following code 2 million times : {code} else { logger.debug(parallel eager deserialize from
[jira] [Updated] (CASSANDRA-6602) Compaction improvements to optimize time series data
[ https://issues.apache.org/jira/browse/CASSANDRA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6602: -- Reviewer: Marcus Eriksson Assignee: Björn Hegerfors (was: Marcus Eriksson) Compaction improvements to optimize time series data Key: CASSANDRA-6602 URL: https://issues.apache.org/jira/browse/CASSANDRA-6602 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Tupshin Harper Assignee: Björn Hegerfors Labels: compaction, performance Fix For: 3.0 Attachments: cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy.txt, cassandra-2.0-CASSANDRA-6602-DateTieredCompactionStrategy_v2.txt There are some unique characteristics of many/most time series use cases that both provide challenges, as well as provide unique opportunities for optimizations. One of the major challenges is in compaction. The existing compaction strategies will tend to re-compact data on disk at least a few times over the lifespan of each data point, greatly increasing the cpu and IO costs of that write. Compaction exists to 1) ensure that there aren't too many files on disk, 2) ensure that data that should be contiguous (part of the same partition) is laid out contiguously, and 3) delete data due to TTLs or tombstones. The special characteristics of time series data allow us to optimize away all three. Time series data 1) tends to be delivered in time order, with relatively constrained exceptions 2) often has a pre-determined and fixed expiration date 3) Never gets deleted prior to TTL 4) Has relatively predictable ingestion rates Note that I filed CASSANDRA-5561 and this ticket potentially replaces or lowers the need for it. In that ticket, jbellis reasonably asks how that compaction strategy is better than disabling compaction. Taking that to heart, here is a compaction-strategy-less approach that could be extremely efficient for time-series use cases that follow the above pattern.
(For context, I'm thinking of an example use case involving lots of streams of time-series data with a 5GB per day ingestion rate, and a 1000 day retention with TTL, resulting in an eventual steady state of 5TB per node) 1) You have an extremely large memtable (preferably off heap, if/when doable) for the table, and that memtable is sized to be able to hold a lengthy window of time. A typical period might be one day. At the end of that period, you flush the contents of the memtable to an sstable and move to the next one. This is basically identical to current behaviour, but with thresholds adjusted so that you can ensure flushing at predictable intervals. (Open question is whether predictable intervals are actually necessary, or whether just waiting until the huge memtable is nearly full is sufficient) 2) Combine the behaviour with CASSANDRA-5228 so that sstables will be efficiently dropped once all of the columns have expired. (Another side note, it might be valuable to have a modified version of CASSANDRA-3974 that doesn't bother storing per-column TTL since it is required that all columns have the same TTL) 3) Be able to mark column families as read/write only (no explicit deletes), so no tombstones. 4) Optionally add back an additional type of delete that would delete all data earlier than a particular timestamp, resulting in immediate dropping of obsoleted sstables. The result is that for in-order delivered data, every cell will be laid out optimally on disk on the first pass, and over the course of 1000 days and 5TB of data, there will only be 1000 5GB sstables, so the number of filehandles will be reasonable. For exceptions (out-of-order delivery), most cases will be caught by the extended (24 hour+) memtable flush times and merged correctly automatically. 
For those that were slightly askew at flush time, or were delivered so far out of order that they go in the wrong sstable, there is relatively low overhead to reading from two sstables for a time slice, instead of one, and that overhead would be incurred relatively rarely unless out-of-order delivery was the common case, in which case, this strategy should not be used. Another possible optimization to address out-of-order delivery would be to maintain more than one time-centric memtable in memory at a time (e.g. two 12 hour ones), and then you always insert into whichever one of the two owns the appropriate range of time. By delaying flushing the earlier one until we are ready to roll writes over to a third one, we are able to avoid any fragmentation as long as all deliveries come in no more than 12 hours late (in this example, presumably tunable). Anything that triggers compactions will have to be
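The window-per-memtable idea above can be sketched as a simple bucketing function. This is an illustrative sketch only, not the attached DateTieredCompactionStrategy patch; the class and method names here are made up for the example.

```java
// Illustrative sketch of time-window bucketing for the two-memtable scheme
// described above. All names are hypothetical, not Cassandra APIs.
class TimeWindows {
    // Map a write timestamp to the index of the time window (and hence the
    // sstable) it belongs to.
    static long windowFor(long timestampMillis, long windowSizeMillis) {
        return timestampMillis / windowSizeMillis;
    }

    // With two open windows (e.g. two 12-hour memtables), a late write still
    // lands cleanly as long as it is no more than one window behind "now".
    // Returns 0 for the current memtable, 1 for the previous one, and -1 when
    // the write is so late it would fragment into the wrong sstable.
    static int targetMemtable(long ts, long now, long windowSizeMillis) {
        long current = windowFor(now, windowSizeMillis);
        long w = windowFor(ts, windowSizeMillis);
        if (w >= current) return 0;
        if (w == current - 1) return 1;
        return -1;
    }

    public static void main(String[] args) {
        long day = 86_400_000L;
        System.out.println(windowFor(10 * day + 5, day));                    // 10
        System.out.println(targetMemtable(9 * day + 1, 10 * day + 5, day));  // 1
        System.out.println(targetMemtable(5, 10 * day, day));                // -1
    }
}
```

With this layout, in-order data always hits memtable 0, up-to-12-hours-late data hits memtable 1, and only the rare very-late write pays the two-sstables-per-time-slice read cost discussed above.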
[jira] [Commented] (CASSANDRA-7241) Pig test fails on 2.1 branch
[ https://issues.apache.org/jira/browse/CASSANDRA-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999898#comment-13999898 ] Sylvain Lebresne commented on CASSANDRA-7241: - The output of those tests is a bit of a mess, but basically I get both failing with: {noformat} [junit] java.lang.ExceptionInInitializerError [junit] at org.apache.cassandra.pig.PigTestBase.startHadoopCluster(PigTestBase.java:109) [junit] at org.apache.cassandra.pig.CqlTableDataTypeTest.setup(CqlTableDataTypeTest.java:216) [junit] Caused by: java.lang.NullPointerException [junit] at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:422) [junit] at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:280) [junit] at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:124) [junit] at org.apache.pig.test.MiniCluster.setupMiniDfsAndMrClusters(MiniCluster.java:51) [junit] at org.apache.pig.test.MiniGenericCluster.<init>(MiniGenericCluster.java:49) [junit] at org.apache.pig.test.MiniCluster.<init>(MiniCluster.java:32) [junit] at org.apache.pig.test.MiniGenericCluster.<clinit>(MiniGenericCluster.java:45) {noformat} but I doubt CASSANDRA-5417 is the source of that. Is there any configuration needed to get those tests working? Pig test fails on 2.1 branch Key: CASSANDRA-7241 URL: https://issues.apache.org/jira/browse/CASSANDRA-7241 Project: Cassandra Issue Type: Bug Reporter: Alex Liu Assignee: Sylvain Lebresne run ant pig-test on cassandra-2.1 branch. There are many tests failed. I traced it a little and found out the Pig test failures start from the https://github.com/apache/cassandra/commit/362cc05352ec67e707e0ac790732e96a15e63f6b commit. It looks like storage changes break Pig tests. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7195) ColumnFamilyStoreTest unit tests fails on Windows
[ https://issues.apache.org/jira/browse/CASSANDRA-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-7195: --- Attachment: 7195_v1.txt Trivial patch to clear Keyspace1 on prior test that's not releasing reference to file that's being deleted. Also renamed tada test. ColumnFamilyStoreTest unit tests fails on Windows - Key: CASSANDRA-7195 URL: https://issues.apache.org/jira/browse/CASSANDRA-7195 Project: Cassandra Issue Type: Sub-task Reporter: Joshua McKenzie Assignee: Joshua McKenzie Priority: Minor Labels: Windows Fix For: 3.0 Attachments: 7195_v1.txt Looks like files aren't getting deleted correctly during testLoadNewSSTablesAvoidsOverwrites during a sanity check. Test passes on linux. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (CASSANDRA-6305) cqlsh support for User Types
[ https://issues.apache.org/jira/browse/CASSANDRA-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko reopened CASSANDRA-6305: -- See CASSANDRA-7209 cqlsh support for User Types Key: CASSANDRA-6305 URL: https://issues.apache.org/jira/browse/CASSANDRA-6305 Project: Cassandra Issue Type: New Feature Reporter: Aleksey Yeschenko Assignee: Mikhail Stepura Labels: cqlsh Fix For: 2.1 beta1 Attachments: trunk-6305-1.patch, trunk-6305-2.patch, trunk-6305-3.patch We need cqlsh support for several things: 1. Autocomplete for UPDATE/INSERT/SELECT 2. Autocomplete for ALTER TYPE/CREATE TYPE/DROP TYPE 3. Proper decoding of UserType values (currently showing the encoded blob) 4. Support UserTypes in DESCRIBE 5. Separate DESCRIBE TYPES|TYPE name cmds -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6975) Allow usage of QueryOptions in CQLStatement.executeInternal
[ https://issues.apache.org/jira/browse/CASSANDRA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-6975: - Reviewer: Mikhail Stepura (was: Sylvain Lebresne) Allow usage of QueryOptions in CQLStatement.executeInternal --- Key: CASSANDRA-6975 URL: https://issues.apache.org/jira/browse/CASSANDRA-6975 Project: Cassandra Issue Type: Improvement Reporter: Mikhail Stepura Assignee: Sylvain Lebresne Priority: Minor Fix For: 2.1 rc1 Attachments: cassandra-2.1-executeInternal.patch The current implementations of {{CQLStatement.executeInternal}} accept only {{QueryState}} as a parameter. That means it's impossible to use prepared statements with variables for internal calls (you can only pass the variables via {{QueryOptions}}). We also can't use the internal paging in internal SELECT statements for the very same reason. I'm attaching the patch which implements that. [~slebresne] [~iamaleksey] what do you think guys? -- This message was sent by Atlassian JIRA (v6.2#6252)
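The limitation described in CASSANDRA-6975 can be illustrated with a minimal sketch of the signature change: today `executeInternal` accepts only the query state, so internal callers have nowhere to put bound variables or a page size. The types below are stand-ins, not the real Cassandra classes; only the shape of the overload is the point.

```java
import java.util.List;

// Stand-ins for the real QueryState/QueryOptions types (hypothetical fields).
class QueryState { }

class QueryOptions {
    final List<Object> values;   // bound variables for a prepared statement
    final int pageSize;          // -1 = no internal paging
    QueryOptions(List<Object> values, int pageSize) { this.values = values; this.pageSize = pageSize; }
}

class ExecuteInternalDemo {
    // Proposed shape: an overload that carries options for internal calls,
    // so prepared statements with variables (and internal paging) work.
    static String executeInternal(QueryState state, QueryOptions options) {
        return "executed with " + options.values.size() + " bound value(s)";
    }

    // Existing shape can simply delegate with empty options.
    static String executeInternal(QueryState state) {
        return executeInternal(state, new QueryOptions(List.of(), -1));
    }

    public static void main(String[] args) {
        System.out.println(executeInternal(new QueryState(), new QueryOptions(List.of("pk1"), 100)));
    }
}
```

The delegation pattern keeps existing callers of the old signature working unchanged while new internal callers can pass variables and a page size.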
[jira] [Created] (CASSANDRA-7249) Too many threads associated with parallel compaction
Cyril Scetbon created CASSANDRA-7249: Summary: Too many threads associated with parallel compaction Key: CASSANDRA-7249 URL: https://issues.apache.org/jira/browse/CASSANDRA-7249 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 12.04.3 LTS 24 CPUs (hyper threading enabled) Reporter: Cyril Scetbon We have a lot of threads on some nodes as you can see : node001: 560 node002: 529 node003: 4350 node004: 552 node005: 547 node006: 554 node007: 572 node008: 1444 <== node009: 540 node010: 13691 <== node011: 577 node012: 536 node013: 448 node014: 10295 <== node015: 452 node016: 576 When I check what those threads are, I see a lot of sstable Deserializer threads. Enabling DEBUG mode shows that a lot of actions are about parallel compaction. What is really surprising is that it tries to deserialize each sstable a huge number of times even though we only have 8 files for the concerned column family : 512690 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-616-Data.db 296623 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-637-Data.db 311904 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-642-Data.db 127061 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-643-Data.db 126921 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-644-Data.db 129815 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-645-Data.db 127862 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-646-Data.db 317069 /data/pns_fr_prod/syndic/pns_fr_prod-syndic-ic-647-Data.db so, in a minute, Cassandra executes the following code about 2 million times:
{code}
else
{
    logger.debug("parallel eager deserialize from " + iter.getPath());
    queue.put(new RowContainer(new Row(iter.getKey(), iter.getColumnFamilyWithColumns(ArrayBackedSortedColumns.factory()))));
}
{code}
It seems to be related to [CASSANDRA-5720|https://issues.apache.org/jira/browse/CASSANDRA-5720] because we got the same error on the concerned column families before the number of threads rose. 
Upgrading to 2.0 is not a solution for now :( -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7209) Consider changing UDT serialization format before 2.1 release.
[ https://issues.apache.org/jira/browse/CASSANDRA-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-7209: - Reviewer: Aleksey Yeschenko Consider changing UDT serialization format before 2.1 release. -- Key: CASSANDRA-7209 URL: https://issues.apache.org/jira/browse/CASSANDRA-7209 Project: Cassandra Issue Type: Bug Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 2.1 rc1 Attachments: 0001-7209.txt, 0002-Rename-column_names-types-to-field_names-types.txt The current serialization format of UDT is the one of CompositeType. This was initially done on purpose, so that users that were using CompositeType for values in their thrift schema could migrate smoothly to UDT (it was also convenient code wise but that's a weak point). I'm having serious doubts about this being wise however, for 2 reasons: * for each component, CompositeType stores an additional byte (the end-of-component) for reasons that only pertain to querying. This byte is basically wasted for UDT and makes no sense. I'll note that outside the inefficiency, there is also the fact that it will likely be pretty surprising/error-prone for driver authors. * it uses an unsigned short for the length of each component. While it's certainly not advisable in the current implementation to use values too big inside an UDT, having this limitation hard-coded in the serialization format is wrong, and we've been bitten by this with collections already, which we've had to fix in protocol v3. It's probably worth not making that mistake again. Furthermore, if we use an int for the size, we can use a negative size to represent a null value (the main point being that it's consistent with how we serialize values in the native protocol), which can be useful (CASSANDRA-7206). Of course, if we change that serialization format, we'd better do it before the 2.1 release. But I think the advantages outweigh the cons, especially in the long run, so I think we should do it. 
I'll try to work out a patch quickly so if you have a problem with the principle of this issue, it would be nice to voice it quickly. I'll note that doing that change will mean existing CompositeType values won't be able to be migrated transparently to UDT. I think this was anecdotal in the first place at best, I don't think using CompositeType for values is that popular in thrift tbh. Besides, if we really really want to, it might not be too hard to re-introduce that compatibility later by having some protocol level trick. We can't change the serialization format without breaking people however. -- This message was sent by Atlassian JIRA (v6.2#6252)
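The size/null argument above can be made concrete with a sketch: an unsigned-short length caps each field at 65535 bytes and has no way to express "null", while a signed int length can use a negative value for null, matching the native protocol's `[bytes]` convention mentioned in the ticket. This is an illustrative encoding only, not the final on-disk format or actual Cassandra code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class UdtFieldEncodingSketch {
    // Write one UDT field with a signed 4-byte length prefix; a length of -1
    // encodes null. With an unsigned short prefix, neither null nor values
    // over 65535 bytes could be represented.
    static ByteBuffer writeField(byte[] value) {
        if (value == null) {
            ByteBuffer b = ByteBuffer.allocate(4);
            b.putInt(-1);
            b.flip();
            return b;
        }
        ByteBuffer b = ByteBuffer.allocate(4 + value.length);
        b.putInt(value.length);
        b.put(value);
        b.flip();
        return b;
    }

    static byte[] readField(ByteBuffer in) {
        int len = in.getInt();
        if (len < 0)
            return null;            // negative length = null field
        byte[] out = new byte[len];
        in.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer enc = writeField("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(readField(enc), StandardCharsets.UTF_8)); // hello
        System.out.println(readField(writeField(null)) == null);                // true
    }
}
```

Note there is also no per-field end-of-component byte here, which is the other saving the ticket describes relative to the CompositeType format.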
[2/2] git commit: Merge branch 'cassandra-2.1' into trunk
Merge branch 'cassandra-2.1' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ac88ab1 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ac88ab1 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ac88ab1 Branch: refs/heads/trunk Commit: 6ac88ab16a62cf6e9cc2a10f5dfc958fad6074c8 Parents: 7879e7f c045690 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri May 16 15:56:54 2014 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri May 16 15:56:54 2014 +0200 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/cql3/CQL3Type.java | 17 ++--- .../cql3/statements/CreateTypeStatement.java | 3 --- .../cql3/statements/DropTypeStatement.java | 3 --- 4 files changed, 15 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ac88ab1/CHANGES.txt --
[1/2] git commit: Limit user types to the keyspace they are defined in
Repository: cassandra Updated Branches: refs/heads/trunk 7879e7f9b - 6ac88ab16 Limit user types to the keyspace they are defined in patch by slebresne; reviewed by iamaleksey for CASSANDRA-6643 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c045690b Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c045690b Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c045690b Branch: refs/heads/trunk Commit: c045690b126e6b1f594291c6eff3957a6d9079ea Parents: ada8f12 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri May 16 15:56:05 2014 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri May 16 15:56:05 2014 +0200 -- CHANGES.txt| 1 + src/java/org/apache/cassandra/cql3/CQL3Type.java | 17 ++--- .../cql3/statements/CreateTypeStatement.java | 3 --- .../cql3/statements/DropTypeStatement.java | 3 --- 4 files changed, 15 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/c045690b/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 223f331..d12acc9 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -14,6 +14,7 @@ * Add server side batching to native transport (CASSANDRA-5663) * Make batchlog replay asynchronous (CASSANDRA-6134) * remove unused classes (CASSANDRA-7197) + * Limit user types to the keyspace they are defined in (CASSANDRA-6643) Merged from 2.0: * (Hadoop) support authentication in CqlRecordReader (CASSANDRA-7221) * (Hadoop) Close java driver Cluster in CQLRR.close (CASSANDRA-7228) http://git-wip-us.apache.org/repos/asf/cassandra/blob/c045690b/src/java/org/apache/cassandra/cql3/CQL3Type.java -- diff --git a/src/java/org/apache/cassandra/cql3/CQL3Type.java b/src/java/org/apache/cassandra/cql3/CQL3Type.java index 8ce89ca..f07bc19 100644 --- a/src/java/org/apache/cassandra/cql3/CQL3Type.java +++ b/src/java/org/apache/cassandra/cql3/CQL3Type.java @@ -347,7 +347,6 @@ public interface CQL3Type private static class 
RawUT extends Raw { - private final UTName name; private RawUT(UTName name) @@ -357,7 +356,19 @@ public interface CQL3Type public CQL3Type prepare(String keyspace) throws InvalidRequestException { -name.setKeyspace(keyspace); +if (name.hasKeyspace()) +{ +// The provided keyspace is the one of the current statement this is part of. If it's different from the keyspace of +// the UTName, we reject since we want to limit user types to their own keyspace (see #6643) +if (!keyspace.equals(name.getKeyspace())) +throw new InvalidRequestException(String.format("Statement on keyspace %s cannot refer to a user type in keyspace %s; " ++ "user types can only be used in the keyspace they are defined in", + keyspace, name.getKeyspace())); +} +else +{ +name.setKeyspace(keyspace); +} KSMetaData ksm = Schema.instance.getKSMetaData(name.getKeyspace()); if (ksm == null) @@ -374,6 +385,6 @@ public interface CQL3Type { return name.toString(); } -} +} } } http://git-wip-us.apache.org/repos/asf/cassandra/blob/c045690b/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java -- diff --git a/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java b/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java index de7ce56..aa8b769 100644 --- a/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java +++ b/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java @@ -50,9 +50,6 @@ public class CreateTypeStatement extends SchemaAlteringStatement { if (!name.hasKeyspace()) name.setKeyspace(state.getKeyspace()); - -if (name.getKeyspace() == null) -throw new InvalidRequestException("You need to be logged in a keyspace or use a fully qualified user type name"); } public void addDefinition(ColumnIdentifier name, CQL3Type.Raw type) http://git-wip-us.apache.org/repos/asf/cassandra/blob/c045690b/src/java/org/apache/cassandra/cql3/statements/DropTypeStatement.java
[jira] [Commented] (CASSANDRA-6950) Secondary index query fails with tc range query when ordered by DESC
[ https://issues.apache.org/jira/browse/CASSANDRA-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998724#comment-13998724 ] Sylvain Lebresne commented on CASSANDRA-6950: - bq. Actually, I think that would fix thrift, because the current behavior could be considered wrong. Well, this is the whole question: what behavior is right for thrift :). In thrift we never (to the best of my knowledge) special-case code for a specific AbstractType, and starting to do so now in that case makes me slightly uncomfortable. In particular, the order in thrift is so far always directly the one of said AbstractType. So does it make more sense to ignore ReversedType as defaultValidator/column validator or not? I'm not sure (and I don't think there is an absolute right answer), but doing so would risk breaking users that rely on the current behavior (which has been here basically forever). I'd agree though that using ReversedType for a validator is somewhat whack and probably almost no-one does that, but I'll admit that I'm leaning towards keeping things the way they are as far as thrift goes, just in case someone relies on it (and if someone using this is bugged by it, he can easily remove the ReversedType use). I don't feel extremely strongly though; if we prefer going with changing thrift too, I'll just blame it on you if some user complains that we've broken his code in a minor release :). Secondary index query fails with tc range query when ordered by DESC Key: CASSANDRA-6950 URL: https://issues.apache.org/jira/browse/CASSANDRA-6950 Project: Cassandra Issue Type: Bug Components: Core Environment: RHEL 6.3 virtual guest, apache-cassandra-2.0.6-SNAPSHOT-src.tar.gz from build #284 (also tried with 2.0.5 with CASSANDRA- patch custom-applied with same result). 
Reporter: Andre Campeau Assignee: Sylvain Lebresne Fix For: 2.0.8 Attachments: 6950-pycassa-repro.py, 6950.txt create table test4 ( name text, lname text, tc bigint, record text, PRIMARY KEY ((name, lname), tc)) WITH CLUSTERING ORDER BY (tc DESC) AND compaction={'class': 'LeveledCompactionStrategy'}; create index test4_index ON test4(lname); Populate it with some data and non-zero tc values, then try: select * from test4 where lname='blah' and tc > 0 allow filtering; And, (0 rows) returned, even though there are rows which should be found. When I create the table using CLUSTERING ORDER BY (tc ASC), the above query works. Rows are correctly returned based on the range check. Tried various combinations, but with descending order on tc nothing works. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Reopened] (CASSANDRA-6974) Replaying archived commitlogs isn't working
[ https://issues.apache.org/jira/browse/CASSANDRA-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler reopened CASSANDRA-6974: --- Replaying archived commitlogs isn't working --- Key: CASSANDRA-6974 URL: https://issues.apache.org/jira/browse/CASSANDRA-6974 Project: Cassandra Issue Type: Bug Reporter: Ryan McGuire Assignee: Benedict Fix For: 2.1 rc1 Attachments: 2.0.system.log, 2.1.system.log I have a test for restoring archived commitlogs, which is not working in 2.1 HEAD. My commitlogs consist of 30,000 inserts, but system.log indicates there were only 2 mutations replayed: {code} INFO [main] 2014-04-02 11:49:54,173 CommitLog.java:115 - Log replay complete, 2 replayed mutations {code} There are several warnings in the logs about bad headers and invalid CRCs: {code} WARN [main] 2014-04-02 11:49:54,156 CommitLogReplayer.java:138 - Encountered bad header at position 0 of commit log /tmp/dtest-mZIlPE/test/node1/commitlogs/CommitLog-4-1396453793570.log, with invalid CRC. The end of segment marker should be zero. {code} Compare that to the same test run on 2.0, where it replayed many more mutations: {code} INFO [main] 2014-04-02 11:49:04,673 CommitLog.java (line 132) Log replay complete, 35960 replayed mutations {code} I'll attach the system logs for reference. [Here is the dtest to reproduce this|https://github.com/riptano/cassandra-dtest/blob/master/snapshot_test.py#L75] - (This currently relies on the fix for snapshots available in CASSANDRA-6965.) -- This message was sent by Atlassian JIRA (v6.2#6252)
[02/10] git commit: cqlsh should return a non-zero error code if a query fails
cqlsh should return a non-zero error code if a query fails patch by Branden Visser and Mikhail Stepura; reviewed by Mikhail Stepura for CASSANDRA-6344 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8abe9f6f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8abe9f6f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8abe9f6f Branch: refs/heads/cassandra-2.0 Commit: 8abe9f6f522146dc478f006a8160b4db1c233169 Parents: d839350 Author: Mikhail Stepura mish...@apache.org Authored: Thu May 8 13:20:35 2014 -0700 Committer: Mikhail Stepura mish...@apache.org Committed: Thu May 8 13:27:41 2014 -0700 -- CHANGES.txt | 1 + bin/cqlsh | 4 2 files changed, 5 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8abe9f6f/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 312cf06..7021e7b 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -23,6 +23,7 @@ * remove duplicate query for local tokens (CASSANDRA-7182) * raise streaming phi convict threshold level (CASSANDRA-7063) * reduce garbage creation in calculatePendingRanges (CASSANDRA-7191) + * exit CQLSH with error status code if script fails (CASSANDRA-6344) 1.2.16 * Add UNLOGGED, COUNTER options to BATCH documentation (CASSANDRA-6816) http://git-wip-us.apache.org/repos/asf/cassandra/blob/8abe9f6f/bin/cqlsh -- diff --git a/bin/cqlsh b/bin/cqlsh index 8e1e0e2..24cb3b8 100755 --- a/bin/cqlsh +++ b/bin/cqlsh @@ -548,6 +548,7 @@ class Shell(cmd.Cmd): self.show_line_nums = True self.stdin = stdin self.query_out = sys.stdout +self.statement_error = False def set_expanded_cql_version(self, ver): ver, vertuple = full_cql_version(ver) @@ -2175,6 +2176,7 @@ class Shell(cmd.Cmd): self.query_out.flush() def printerr(self, text, color=RED, newline=True, shownum=None): +self.statement_error = True if shownum is None: shownum = self.show_line_nums if shownum: @@ -2404,6 +2406,8 @@ def main(options, hostname, port): 
shell.cmdloop() save_history() +if options.file and shell.statement_error: +sys.exit(2) if __name__ == '__main__': main(*read_options(sys.argv[1:], os.environ))
[jira] [Created] (CASSANDRA-7195) ColumnFamilyStoreTest unit tests fails on Windows
Joshua McKenzie created CASSANDRA-7195: -- Summary: ColumnFamilyStoreTest unit tests fails on Windows Key: CASSANDRA-7195 URL: https://issues.apache.org/jira/browse/CASSANDRA-7195 Project: Cassandra Issue Type: Sub-task Reporter: Joshua McKenzie Assignee: Joshua McKenzie Priority: Minor Looks like files aren't getting deleted correctly during testLoadNewSSTablesAvoidsOverwrites during a sanity check. Test passes on linux. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-7196) Select query with IN restriction times out in CQLSH
[ https://issues.apache.org/jira/browse/CASSANDRA-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] T Jake Luciani updated CASSANDRA-7196: -- Attachment: 7196-v2.txt v2 avoids trying to write an empty buffer. Sounds like there is a bug in the off heap pool in netty for this call. I'll try and repro in a test and submit an issue. Select query with IN restriction times out in CQLSH --- Key: CASSANDRA-7196 URL: https://issues.apache.org/jira/browse/CASSANDRA-7196 Project: Cassandra Issue Type: Bug Reporter: Mikhail Stepura Assignee: T Jake Luciani Labels: regression Fix For: 2.1 rc1 Attachments: 7196-v2.txt, 7196-v3.txt, 7196.txt, init_bug.cql I've noticed that {{pylib.cqlshlib.test.test_cqlsh_output.TestCqlshOutput#test_numeric_output}} tests fails on the current 2.1 branch, which wasn't the case before. Here are the steps to reproduce. I'm attaching the script to populate schema. {code} mstepura-mac:cassandra mikhail$ bin/cqlsh -f path_to/init_bug.cql mstepura-mac:cassandra mikhail$ bin/cqlsh Connected to Test Cluster at 127.0.0.1:9042. [cqlsh 5.0.0 | Cassandra 2.1.0-beta2-SNAPSHOT | CQL spec 3.1.6 | Native protocol v2] Use HELP for help. cqlsh use test; cqlsh:test select intcol, bigintcol, varintcol from has_all_types where num in (0, 1, 2, 3, 4); errors={}, last_host=127.0.0.1 cqlsh:test {code} That works perfectly on 2.0 branch. And there are no errors in the logs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (CASSANDRA-7250) Make incremental repair default in 3.0
Marcus Eriksson created CASSANDRA-7250: -- Summary: Make incremental repair default in 3.0 Key: CASSANDRA-7250 URL: https://issues.apache.org/jira/browse/CASSANDRA-7250 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Marcus Eriksson Priority: Minor Fix For: 3.0 To get more testing etc, we should make incremental repair default in trunk -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (CASSANDRA-6563) TTL histogram compactions not triggered at high Estimated droppable tombstones rate
[ https://issues.apache.org/jira/browse/CASSANDRA-6563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Ricardo Motta Gomes updated CASSANDRA-6563: - Attachment: (was: patch-v1-range1.png) TTL histogram compactions not triggered at high Estimated droppable tombstones rate - Key: CASSANDRA-6563 URL: https://issues.apache.org/jira/browse/CASSANDRA-6563 Project: Cassandra Issue Type: Bug Components: Core Environment: 1.2.12ish Reporter: Chris Burroughs Assignee: Paulo Ricardo Motta Gomes Fix For: 1.2.17, 2.0.8 Attachments: 1.2.16-CASSANDRA-6563.txt, 2.0.7-CASSANDRA-6563.txt, patch-v1-range1.png, patch-v1-range2.png, patch-v2-range3.png, patched-droppadble-ratio.png, patched-storage-load.png, patched1-compacted-bytes.png, patched2-compacted-bytes.png, unpatched-droppable-ratio.png, unpatched-storage-load.png, unpatched1-compacted-bytes.png, unpatched2-compacted-bytes.png I have several column families in a largish cluster where virtually all columns are written with a (usually the same) TTL. My understanding of CASSANDRA-3442 is that sstables that have a high (> 20%) estimated percentage of droppable tombstones should be individually compacted. This does not appear to be occurring with size tiered compaction. 
Example from one node: {noformat} $ ll /data/sstables/data/ks/Cf/*Data.db -rw-rw-r-- 31 cassandra cassandra 26651211757 Nov 26 22:59 /data/sstables/data/ks/Cf/ks-Cf-ic-295562-Data.db -rw-rw-r-- 31 cassandra cassandra 6272641818 Nov 27 02:51 /data/sstables/data/ks/Cf/ks-Cf-ic-296121-Data.db -rw-rw-r-- 31 cassandra cassandra 1814691996 Dec 4 21:50 /data/sstables/data/ks/Cf/ks-Cf-ic-320449-Data.db -rw-rw-r-- 30 cassandra cassandra 10909061157 Dec 11 17:31 /data/sstables/data/ks/Cf/ks-Cf-ic-340318-Data.db -rw-rw-r-- 29 cassandra cassandra 459508942 Dec 12 10:37 /data/sstables/data/ks/Cf/ks-Cf-ic-342259-Data.db -rw-rw-r-- 1 cassandra cassandra 336908 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342307-Data.db -rw-rw-r-- 1 cassandra cassandra 2063935 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342309-Data.db -rw-rw-r-- 1 cassandra cassandra 409 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342314-Data.db -rw-rw-r-- 1 cassandra cassandra 31180007 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342319-Data.db -rw-rw-r-- 1 cassandra cassandra 2398345 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342322-Data.db -rw-rw-r-- 1 cassandra cassandra 21095 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342331-Data.db -rw-rw-r-- 1 cassandra cassandra 81454 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342335-Data.db -rw-rw-r-- 1 cassandra cassandra 1063718 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342339-Data.db -rw-rw-r-- 1 cassandra cassandra 127004 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342344-Data.db -rw-rw-r-- 1 cassandra cassandra 146785 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342346-Data.db -rw-rw-r-- 1 cassandra cassandra 697338 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342351-Data.db -rw-rw-r-- 1 cassandra cassandra 3921428 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342367-Data.db -rw-rw-r-- 1 cassandra cassandra 240332 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342370-Data.db -rw-rw-r-- 1 cassandra cassandra 45669 Dec 12 
12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342374-Data.db -rw-rw-r-- 1 cassandra cassandra 53127549 Dec 12 12:03 /data/sstables/data/ks/Cf/ks-Cf-ic-342375-Data.db -rw-rw-r-- 16 cassandra cassandra 12466853166 Dec 25 22:40 /data/sstables/data/ks/Cf/ks-Cf-ic-396473-Data.db -rw-rw-r-- 12 cassandra cassandra 3903237198 Dec 29 19:42 /data/sstables/data/ks/Cf/ks-Cf-ic-408926-Data.db -rw-rw-r-- 7 cassandra cassandra 3692260987 Jan 3 08:25 /data/sstables/data/ks/Cf/ks-Cf-ic-427733-Data.db -rw-rw-r-- 4 cassandra cassandra 3971403602 Jan 6 20:50 /data/sstables/data/ks/Cf/ks-Cf-ic-437537-Data.db -rw-rw-r-- 3 cassandra cassandra 1007832224 Jan 7 15:19 /data/sstables/data/ks/Cf/ks-Cf-ic-440331-Data.db -rw-rw-r-- 2 cassandra cassandra 896132537 Jan 8 11:05 /data/sstables/data/ks/Cf/ks-Cf-ic-447740-Data.db -rw-rw-r-- 1 cassandra cassandra 963039096 Jan 9 04:59 /data/sstables/data/ks/Cf/ks-Cf-ic-449425-Data.db -rw-rw-r-- 1 cassandra cassandra 232168351 Jan 9 10:14 /data/sstables/data/ks/Cf/ks-Cf-ic-450287-Data.db -rw-rw-r-- 1 cassandra cassandra 73126319 Jan 9 11:28 /data/sstables/data/ks/Cf/ks-Cf-ic-450307-Data.db -rw-rw-r-- 1 cassandra cassandra 40921916 Jan 9 12:08 /data/sstables/data/ks/Cf/ks-Cf-ic-450336-Data.db
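The trigger described in CASSANDRA-3442 - compact an sstable on its own when its estimated droppable-tombstone ratio crosses a threshold - can be sketched as follows. The names are illustrative stand-ins, not the actual SSTableReader/compaction-strategy API; in Cassandra the droppable estimate comes from per-sstable TTL histograms rather than exact counts.

```java
// Illustrative sketch of the single-sstable tombstone-compaction decision
// discussed above. Hypothetical names; not the Cassandra code base.
class TombstoneCompactionSketch {
    // The ~20% threshold mentioned in the ticket.
    static final double DROPPABLE_RATIO_THRESHOLD = 0.20;

    // Estimated fraction of the sstable's columns whose tombstones/TTLs have
    // already expired (stand-in for the histogram-based estimate).
    static double estimatedDroppableRatio(long droppableColumns, long totalColumns) {
        return totalColumns == 0 ? 0.0 : (double) droppableColumns / totalColumns;
    }

    // An sstable whose estimated droppable ratio exceeds the threshold is a
    // candidate for compaction by itself, without waiting for size tiering
    // to pick it up alongside similarly sized neighbours.
    static boolean worthSingleSSTableCompaction(long droppable, long total) {
        return estimatedDroppableRatio(droppable, total) > DROPPABLE_RATIO_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(worthSingleSSTableCompaction(30, 100)); // true
        System.out.println(worthSingleSSTableCompaction(10, 100)); // false
    }
}
```

The bug report above is that sstables like these - whose contents are virtually all TTL'd - sit untouched even though this check should fire for them.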
[3/8] git commit: Add --resolve-ip option to 'nodetool ring'
Add --resolve-ip option to 'nodetool ring' patch by Yasuharu Goto; revised by Mikhail Stepura for CASSANDRA-7210 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a01eb5a9 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a01eb5a9 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a01eb5a9 Branch: refs/heads/trunk Commit: a01eb5a9458e03e7e53509d738b8717d0975141b Parents: 569177f Author: Yasuharu Goto matope@gmail.com Authored: Thu May 15 19:21:20 2014 -0700 Committer: Mikhail Stepura mish...@apache.org Committed: Thu May 15 19:27:01 2014 -0700 -- CHANGES.txt | 1 + src/java/org/apache/cassandra/tools/NodeCmd.java | 13 +++-- 2 files changed, 8 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a01eb5a9/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 285efd1..30e45c1 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -12,6 +12,7 @@ * cqlsh: Accept and execute CQL statement(s) from command-line parameter (CASSANDRA-7172) * Fix IllegalStateException in CqlPagingRecordReader (CASSANDRA-7198) * Fix the InvertedIndex trigger example (CASSANDRA-7211) + * Add --resolve-ip option to 'nodetool ring' (CASSANDRA-7210) 2.0.8 http://git-wip-us.apache.org/repos/asf/cassandra/blob/a01eb5a9/src/java/org/apache/cassandra/tools/NodeCmd.java -- diff --git a/src/java/org/apache/cassandra/tools/NodeCmd.java b/src/java/org/apache/cassandra/tools/NodeCmd.java index b79a037..1af1ec8 100644 --- a/src/java/org/apache/cassandra/tools/NodeCmd.java +++ b/src/java/org/apache/cassandra/tools/NodeCmd.java @@ -250,7 +250,7 @@ public class NodeCmd * @param outs *the stream to write to */ -public void printRing(PrintStream outs, String keyspace) +public void printRing(PrintStream outs, String keyspace, boolean resolveIp) { Map<String, String> tokensToEndpoints = probe.getTokenToEndpointMap(); LinkedHashMultimap<String, String> endpointsToTokens = 
LinkedHashMultimap.create(); @@ -285,7 +285,7 @@ public class NodeCmd try { outs.println(); -for (EntryString, SetHostStat entry : getOwnershipByDc(false, tokensToEndpoints, ownerships).entrySet()) +for (EntryString, SetHostStat entry : getOwnershipByDc(resolveIp, tokensToEndpoints, ownerships).entrySet()) printDc(outs, format, entry.getKey(), endpointsToTokens, keyspaceSelected, entry.getValue()); } catch (UnknownHostException e) @@ -362,7 +362,7 @@ public class NodeCmd ? loadMap.get(endpoint) : ?; String owns = stat.owns != null ? new DecimalFormat(##0.00%).format(stat.owns) : ?; -outs.printf(format, endpoint, rack, status, state, load, owns, stat.token); +outs.printf(format, stat.ipOrDns(), rack, status, state, load, owns, stat.token); } outs.println(); } @@ -1216,8 +1216,9 @@ public class NodeCmd switch (command) { case RING : -if (arguments.length 0) { nodeCmd.printRing(System.out, arguments[0]); } -else { nodeCmd.printRing(System.out, null); }; +boolean resolveIp = cmd.hasOption(RESOLVE_IP.left); +if (arguments.length 0) { nodeCmd.printRing(System.out, arguments[0], resolveIp); } +else { nodeCmd.printRing(System.out, null, resolveIp); }; break; case INFO: nodeCmd.printInfo(System.out, cmd); break; @@ -1257,7 +1258,7 @@ public class NodeCmd break; case STATUS : -boolean resolveIp = cmd.hasOption(RESOLVE_IP.left); +resolveIp = cmd.hasOption(RESOLVE_IP.left); if (arguments.length 0) nodeCmd.printClusterStatus(System.out, arguments[0], resolveIp); else nodeCmd.printClusterStatus(System.out, null, resolveIp); break;
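The --resolve-ip patch above boils down to reverse-resolving each endpoint address before printing it. A minimal sketch of that resolution step, assuming plain JDK networking (the `ipOrDns` helper here is a hypothetical stand-in for the patch's `HostStat.ipOrDns()`, not the actual NodeCmd code):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolveIpSketch
{
    // Return the reverse-DNS hostname when resolution is requested,
    // otherwise (or when resolution fails) fall back to the raw IP string.
    static String ipOrDns(String ip, boolean resolveIp)
    {
        if (!resolveIp)
            return ip;
        try
        {
            return InetAddress.getByName(ip).getHostName();
        }
        catch (UnknownHostException e)
        {
            return ip; // unresolvable address: print it as-is
        }
    }

    public static void main(String[] args)
    {
        // With resolveIp=false the address passes through untouched.
        System.out.println(ipOrDns("127.0.0.1", false));
        // With resolveIp=true the output depends on the local resolver config.
        System.out.println(ipOrDns("127.0.0.1", true));
    }
}
```

Note that `getHostName()` can block on a slow resolver, which is one reason resolution is opt-in behind a flag rather than the default.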
[jira] [Updated] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-4718: --- Attachment: stress_2014May15.txt Adding more fuel to the testing fire, I took a first pass at having a large amount of data on disk (~2x the memory size of each box) and running the read tests - see attached file: stress_2014May15.txt. I cleared the page cache before switching to each branch for the reads, and then performed 3 rounds of stress. The goal here was to see how the sep branch compared with cassandra-2.1 when doing most of the reads from disk (with a cold page cache, or where the cache is constantly churning due to new blocks being pulled in). The short story is the sep branch performs slightly worse than the current cassandra-2.1 (which includes CASSANDRA-5663) on both ops/s and latencies. I'm going to do one more test where I preload a good chunk of the data into the page cache, then run the stress - hopefully to emulate the case where most reads come from the page cache and some go to disk. Will try to use a less naive key distribution alg, to ensure that we hit the hot keys, which is provided by stress. More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Benedict Priority: Minor Labels: performance Fix For: 2.1.0 Attachments: 4718-v1.patch, PerThreadQueue.java, aws.svg, aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, jason_write.svg, op costs of various queues.ods, stress op rate with various queues.ods, stress_2014May15.txt, v1-stress.out Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). 
One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.2#6252)
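The bulk-dequeue idea from the description above — block for the first task, then drain the rest of the queue in a single call — can be sketched with plain JDK classes. This is illustrative only, not the patch attached to the ticket, and the batch cap of 32 is an arbitrary choice:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchDrainSketch
{
    public static void main(String[] args) throws InterruptedException
    {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 10; i++)
        {
            final int n = i;
            queue.add(() -> System.out.println("task " + n));
        }

        List<Runnable> batch = new ArrayList<>();
        while (!queue.isEmpty())
        {
            // take() blocks for the first task; drainTo(collection, max)
            // then grabs up to 31 more in one call, so the consumer pays
            // the queue-contention cost once per batch instead of once per task.
            batch.add(queue.take());
            queue.drainTo(batch, 31);
            for (Runnable r : batch)
                r.run();
            batch.clear();
        }
    }
}
```

As the description notes, the missing piece is an ExecutorService whose worker loop does this, since the stock JDK pools call `take()`/`poll()` one task at a time.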
[jira] [Comment Edited] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998622#comment-13998622 ] Benedict edited comment on CASSANDRA-4718 at 5/15/14 9:30 AM: -- bq. (latest bdplab tests) Which latest bdplab tests? The longer bdplab test from before (not the latest tests) had some issues (unrelated to this ticket) so we didn't get any read results, but showed increased write throughput. The latest tests have all been short runs. I am actually very pleased we are _at all_ faster on bdplab for any workload, as the first versions of these patches did not seem to benefit older hardware/kernels (we don't have enough hardware configurations to say which was the deciding factor), and actually incurred a slight penalty. The fact that the gap is very narrow for bdplab is not really important, nor are the thrift numbers. In both of those instances the interesting thing is only that we _do not perform any worse_; performing slightly better even here is just a bonus. was (Author: benedict): bq. (latest bdplab tests) Which latest bdplab tests? The longer bdplab test from before (not the latest tests) had some issues (unrelated to this ticket) so we didn't get any read results, but showed increased write throughput. The latest tests have all been short runs. I am actually very pleased we are _at all_ faster for bdplab on any workload, as the first versions of these patches did not seem to benefit older hardware/kernels (we don't have enough hardware configurations to say which was the deciding factor), and actually incurred a slight penalty. The fact that the gap is very narrow for bdplab is not really important, nor are the thrift numbers. In both of those instances I am interested only in that we _do not perform any worse_; performing slightly better even here is just a bonus. 
More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Benedict Priority: Minor Labels: performance Fix For: 2.1.0 Attachments: 4718-v1.patch, PerThreadQueue.java, aws.svg, aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, jason_write.svg, op costs of various queues.ods, stress op rate with various queues.ods, v1-stress.out Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. (Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.2#6252)
[3/4] git commit: make sure manifest's parent dirs exist before trying to write the file.
make sure manifest's parent dirs exist before trying to write the file.

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d267cf88
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d267cf88
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d267cf88

Branch: refs/heads/trunk
Commit: d267cf88c870a05efc9109a53b51b8628b4dfe48
Parents: e241319
Author: Jason Brown jasobr...@apple.com
Authored: Wed May 7 16:34:29 2014 -0700
Committer: Jason Brown jasobr...@apple.com
Committed: Wed May 7 16:34:29 2014 -0700
--
 src/java/org/apache/cassandra/db/ColumnFamilyStore.java | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/d267cf88/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
--
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 33b7303..417a5b4 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -2171,9 +2171,10 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
         final JSONObject manifestJSON = new JSONObject();
         manifestJSON.put("files", filesJSONArr);
-
         try
         {
+            if (!manifestFile.getParentFile().exists())
+                manifestFile.getParentFile().mkdirs();
             PrintStream out = new PrintStream(manifestFile);
             out.println(manifestJSON.toJSONString());
             out.close();
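The fix above is the standard "create parent directories before opening the file" pattern: `new PrintStream(file)` fails with `FileNotFoundException` when the parent directory is missing. A standalone sketch under that assumption (the `writeManifest` helper below is hypothetical, not the actual ColumnFamilyStore method):

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;

public class ManifestDirSketch
{
    // Create any missing parent directories, then write the manifest contents.
    static void writeManifest(File manifestFile, String json) throws IOException
    {
        File parent = manifestFile.getParentFile();
        if (parent != null && !parent.exists())
            parent.mkdirs(); // mkdirs() creates the whole missing chain
        try (PrintStream out = new PrintStream(manifestFile))
        {
            out.println(json);
        }
    }
}
```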
[jira] [Updated] (CASSANDRA-6975) Allow usage of QueryOptions in CQLStatement.executeInternal
[ https://issues.apache.org/jira/browse/CASSANDRA-6975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-6975: - Assignee: Sylvain Lebresne (was: Mikhail Stepura) Allow usage of QueryOptions in CQLStatement.executeInternal --- Key: CASSANDRA-6975 URL: https://issues.apache.org/jira/browse/CASSANDRA-6975 Project: Cassandra Issue Type: Improvement Reporter: Mikhail Stepura Assignee: Sylvain Lebresne Priority: Minor Fix For: 2.1 rc1 Attachments: cassandra-2.1-executeInternal.patch The current implementations of {{CQLStatement.executeInternal}} accept only {{QueryState}} as a parameter. That means it's impossible to use prepared statements with variables for internal calls (you can only pass the variables via {{QueryOptions}}). We also can't use the internal paging in internal SELECT statements for the very same reason. I'm attaching the patch which implements that. [~slebresne] [~iamaleksey] what do you think guys? -- This message was sent by Atlassian JIRA (v6.2#6252)
git commit: minor nit: use the keyspace parameter
Repository: cassandra
Updated Branches:
  refs/heads/trunk 57f3b802b -> 7879e7f9b

minor nit: use the keyspace parameter

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7879e7f9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7879e7f9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7879e7f9

Branch: refs/heads/trunk
Commit: 7879e7f9b6f0d95b85f24923bb93229ab129fca2
Parents: 57f3b80
Author: Dave Brosius dbros...@mebigfatguy.com
Authored: Fri May 16 06:34:38 2014 -0400
Committer: Dave Brosius dbros...@mebigfatguy.com
Committed: Fri May 16 06:35:07 2014 -0400
--
 test/unit/org/apache/cassandra/db/CommitLogTest.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--
http://git-wip-us.apache.org/repos/asf/cassandra/blob/7879e7f9/test/unit/org/apache/cassandra/db/CommitLogTest.java
--
diff --git a/test/unit/org/apache/cassandra/db/CommitLogTest.java b/test/unit/org/apache/cassandra/db/CommitLogTest.java
index 660e91e..c4a1fe1 100644
--- a/test/unit/org/apache/cassandra/db/CommitLogTest.java
+++ b/test/unit/org/apache/cassandra/db/CommitLogTest.java
@@ -171,7 +171,7 @@ public class CommitLogTest extends SchemaLoader
     private static int getMaxRecordDataSize(String keyspace, ByteBuffer key, String table, CellName column)
     {
-        Mutation rm = new Mutation("Keyspace1", bytes("k"));
+        Mutation rm = new Mutation(keyspace, bytes("k"));
         rm.add("Standard1", Util.cellname("c1"), ByteBuffer.allocate(0), 0);
         int max = (DatabaseDescriptor.getCommitLogSegmentSize() / 2);
[jira] [Commented] (CASSANDRA-7216) Restricted superuser account request
[ https://issues.apache.org/jira/browse/CASSANDRA-7216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13998718#comment-13998718 ] Aleksey Yeschenko commented on CASSANDRA-7216: -- [~odpeer] Are user names and keyspace names related in your multitenant setup? Specifically, can the latter be derived from the former? If so, you can get what you want with a combination of something like the attached patch and a custom IAuthenticator, that would pre-setup necessary permissions upon user-creation. Restricted superuser account request Key: CASSANDRA-7216 URL: https://issues.apache.org/jira/browse/CASSANDRA-7216 Project: Cassandra Issue Type: Improvement Reporter: Oded Peer Assignee: Dave Brosius Priority: Minor Fix For: 3.0 Attachments: 7216.txt I am developing a multi-tenant service. Every tenant has its own user, keyspace and can access only his keyspace. As new tenants are provisioned there is a need to create new users and keyspaces. Only a superuser can issue CREATE USER requests, so we must have a super user account in the system. On the other hand super users have access to all the keyspaces, which poses a security risk. For tenant provisioning I would like to have a restricted account which can only create new users, without read access to keyspaces. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (CASSANDRA-7199) [dtest] snapshot_test hung on 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-7199: --- Assignee: Benedict (was: Michael Shuler) [dtest] snapshot_test hung on 2.1 - Key: CASSANDRA-7199 URL: https://issues.apache.org/jira/browse/CASSANDRA-7199 Project: Cassandra Issue Type: Test Components: Tests Reporter: Michael Shuler Assignee: Benedict Priority: Minor Labels: qa-resolved Fix For: 2.1 rc1 Attachments: 7199.txt, jenkins-scratch-2.1_dtest-failed-snapshot_test-dtestdir.tar.gz Test hung twice on 2.1 in the same manner while trying a new ccm branch as a scratch jenkins job {noformat} 11:57:44 dont_test_archive_commitlog (snapshot_test.TestArchiveCommitlog) ... Requested creating snapshot(s) for [ks] with snapshot name [basic] 11:58:03 Snapshot directory: basic 11:58:41 Established connection to initial hosts 11:58:41 Opening sstables and calculating sections to stream 11:58:41 Streaming relevant part of /tmp/tmpgTsloD/ks/cf/ks-cf-ka-1-Data.db to [/127.0.0.1] 11:58:41 progress: [/127.0.0.1]0:1/1 100% total: 100% 0 MB/s(avg: 0 MB/s) progress: [/127.0.0.1]0:1/1 100% total: 100% 0 MB/s(avg: 0 MB/s) 11:58:42 Summary statistics: 11:58:42Connections per host: : 1 11:58:42Total files transferred: : 1 11:58:42Total bytes transferred: : 527659 11:58:42Total duration (ms): : 2384 11:58:42Average transfer rate (MB/s): : 0 11:58:42Peak transfer rate (MB/s):: 0 11:58:42 11:58:59 ok 11:58:59 test_archive_commitlog (snapshot_test.TestArchiveCommitlog) ... rm: cannot remove `/tmp/tmp6c0qNr/*': No such file or directory 12:14:15 Build timed out (after 15 minutes). Marking the build as aborted. {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-7247) Provide top ten most frequent keys per column family
[ https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999621#comment-13999621 ] Chris Lohfink edited comment on CASSANDRA-7247 at 5/16/14 5:55 AM: --- Problem is StreamSummary is not thread safe. There is a ConcurrentStreamSummary, which I found in this implementation to be ~4x slower than a synchronized block around the offer of the non-thread-safe one. ConcurrentStreamSummary did perform similarly when also wrapped in a synchronized block, which I will show below, but because it would lose any benefit of being a concurrent implementation when access is serialized, I think the faster impl is best. Done on a 2013 retina MBP with 500gb ssd against trunk:
{code:title=No Changes}
id, ops, op/s, key/s, mean, med, .95, .99, .999, max, time, stderr
 4 threadCount,  634450, 21692, 21692, 0.2, 0.2, 0.2, 0.2,  0.4,  740.1, 29.2, 0.01188
 8 threadCount,  886600, 29762, 29762, 0.3, 0.2, 0.3, 0.4,  1.3, 1007.3, 29.8, 0.01220
16 threadCount,  912050, 29035, 29035, 0.5, 0.3, 0.9, 2.5, 11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250, 32681, 32681, 0.7, 0.5, 1.0, 2.9, 13.5, 1126.5, 31.3, 0.00923
36 threadCount,  946550, 30900, 30900, 1.2, 0.8, 1.4, 3.0, 22.5, 1369.2, 30.6, 0.01089
{code}
{code:title=With Patch}
id, ops, op/s, key/s, mean, med, .95, .99, .999, max, time, stderr
 4 threadCount,  643900, 21700, 21700, 0.2, 0.2, 0.2, 0.2,  0.9,  941.1, 29.7, 0.01079
 8 threadCount,  942100, 32300, 32300, 0.2, 0.2, 0.3, 0.3,  1.2,  849.5, 29.2, 0.01519
16 threadCount,  907400, 30650, 30650, 0.5, 0.3, 0.8, 1.9, 10.7, 1124.0, 29.6, 0.01112
24 threadCount, 1026150, 31753, 31753, 0.7, 0.5, 0.9, 3.3, 20.6, 1299.0, 32.3, 0.01295
36 threadCount,  980600, 30077, 30077, 1.2, 0.8, 1.3, 2.7, 24.9, 1394.3, 32.6, 0.01747
{code}
{code:title=ConcurrentStreamSummary with sync}
 4 threadCount, 494350, 16643, 16643, 0.2, 0.2, 0.3, 0.3,  1.0,  943.6, 29.7, 0.01286
 8 threadCount, 812950, 26358, 26358, 0.3, 0.2, 0.3, 0.5,  1.4, 1488.9, 30.8, 0.01909
16 threadCount, 877500, 27396, 27396, 0.6, 0.3, 1.0, 2.2, 12.1, 1299.2, 32.0, 0.01824
24 threadCount, 837550, 25345, 25345, 0.9, 0.4, 1.2, 3.7, 84.2, 2123.6, 33.0, 0.02437
36 threadCount, 910200, 28008, 28008, 1.3, 0.6, 2.8, 9.2, 32.2, 1212.8, 32.5, 0.01654
{code}
{code:title=ConcurrentStreamSummary no blocking}
id, ops, op/s, key/s, mean, med, .95, .99, .999, max, time, stderr
 4 threadCount, 183600, 6145, 6145, 0.6, 0.6, 0.8, 1.0,   2.6, 354.5, 29.9, 0.01063
 8 threadCount, 197200, 6593, 6593, 1.2, 1.1, 1.4, 1.8,   3.3, 413.5, 29.9, 0.00716
16 threadCount, 203200, 6794, 6794, 2.3, 2.2, 2.6, 3.5,  12.1, 649.1, 29.9, 0.01096
24 threadCount, 198000, 6615, 6615, 3.6, 3.3, 4.2, 4.9,  44.2, 570.4, 29.9, 0.00894
36 threadCount, 199800, 6627, 6627, 5.4, 4.9, 6.5, 8.0, 110.8, 272.3, 30.1, 0.01452
{code}

was (Author: cnlwsu):
Problem is StreamSummary is not thread safe. There is a ConcurrentStreamSummary, which I found in this implementation to be ~5x slower than a synchronized block around the offer of the non-thread-safe one. ConcurrentStreamSummary did perform similarly when also wrapped in a synchronized block, which I will show below, but because it would lose any benefit of being a concurrent implementation when access is serialized, I think the faster impl is best. Done on a 2013 retina MBP with 500gb ssd against trunk:
{code:title=No Changes}
id, ops, op/s, key/s, mean, med, .95, .99, .999, max, time, stderr
 4 threadCount,  634450, 21692, 21692, 0.2, 0.2, 0.2, 0.2,  0.4,  740.1, 29.2, 0.01188
 8 threadCount,  886600, 29762, 29762, 0.3, 0.2, 0.3, 0.4,  1.3, 1007.3, 29.8, 0.01220
16 threadCount,  912050, 29035, 29035, 0.5, 0.3, 0.9, 2.5, 11.2, 1393.8, 31.4, 0.01162
24 threadCount, 1022250, 32681, 32681, 0.7, 0.5, 1.0, 2.9, 13.5, 1126.5, 31.3, 0.00923
36 threadCount,  946550, 30900, 30900, 1.2, 0.8, 1.4, 3.0, 22.5, 1369.2, 30.6, 0.01089
{code}
{code:title=With Patch}
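The synchronized-offer strategy benchmarked above can be sketched as a single lock serializing access to a non-thread-safe structure. In this sketch a plain `HashMap` frequency counter stands in for stream-lib's `StreamSummary` (the class name `SynchronizedTopK` is hypothetical); only the locking pattern is the point, not the counting structure:

```java
import java.util.HashMap;
import java.util.Map;

public class SynchronizedTopK
{
    // Not thread-safe on its own; all access goes through the lock below.
    private final Map<String, Long> counts = new HashMap<>();

    public void offer(String key)
    {
        // A short, usually-uncontended synchronized block around offer()
        // is the pattern the benchmarks above compare against
        // ConcurrentStreamSummary.
        synchronized (counts)
        {
            counts.merge(key, 1L, Long::sum);
        }
    }

    public long count(String key)
    {
        synchronized (counts)
        {
            return counts.getOrDefault(key, 0L);
        }
    }
}
```

The benchmark's conclusion is about exactly this trade-off: a cheap lock around a fast structure beat the lock-free variant once access was effectively serialized anyway.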
[jira] [Commented] (CASSANDRA-7036) counter_tests.py:TestCounters.upgrade_test dtest hangs in 1.2, 2.0, 2.1
[ https://issues.apache.org/jira/browse/CASSANDRA-7036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13992761#comment-13992761 ] Michael Shuler commented on CASSANDRA-7036: --- 1.2 is also affected - http://cassci.datastax.com/job/cassandra-1.2_novnode_dtest/150/console I grabbed the log artifacts to dig around. counter_tests.py:TestCounters.upgrade_test dtest hangs in 1.2, 2.0, 2.1 --- Key: CASSANDRA-7036 URL: https://issues.apache.org/jira/browse/CASSANDRA-7036 Project: Cassandra Issue Type: Test Components: Tests Reporter: Michael Shuler Assignee: Aleksey Yeschenko Fix For: 2.1 rc1 -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (CASSANDRA-7210) Add --resolve-ip option on 'nodetool ring'
[ https://issues.apache.org/jira/browse/CASSANDRA-7210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Stepura resolved CASSANDRA-7210. Resolution: Fixed Committed, thanks! Add --resolve-ip option on 'nodetool ring' -- Key: CASSANDRA-7210 URL: https://issues.apache.org/jira/browse/CASSANDRA-7210 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Yasuharu Goto Assignee: Yasuharu Goto Priority: Trivial Fix For: 2.0.9, 2.1 rc1 Attachments: 2.0-7210-2.txt, 2.0-7210.txt, trunk-7210-2.txt, trunk-7210.txt Give nodetool ring the option of either displaying IPs or hostnames for the nodes in a ring. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7245) Out-of-Order keys with stress + CQL3
[ https://issues.apache.org/jira/browse/CASSANDRA-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999588#comment-13999588 ] Pavel Yaskevich commented on CASSANDRA-7245: [~brandon.williams] It's using native, so no thrift involvement whatsoever. [~tjake] We will try to do that tomorrow. Out-of-Order keys with stress + CQL3 Key: CASSANDRA-7245 URL: https://issues.apache.org/jira/browse/CASSANDRA-7245 Project: Cassandra Issue Type: Bug Components: Core Reporter: Pavel Yaskevich We have been generating data (stress with CQL3 prepared) for CASSANDRA-4718 and found the following problem in almost every SSTable generated (~200 GB of data and 821 SSTables). We set up the keys to be 10 bytes in size (default) and population between 1 and 6. Once I ran 'sstablekeys' on the generated SSTable files I got the following exceptions:
_There is a problem with sorting of normal looking keys:_
30303039443538353645 30303039443745364242
java.io.IOException: Key out of order! DecoratedKey(-217680888487824985, *30303039443745364242*) DecoratedKey(-1767746583617597213, *30303039443437454333*)
0a30303033343933 3734441388343933
java.io.IOException: Key out of order! DecoratedKey(5440473860101999581, *3734441388343933*) DecoratedKey(-7565486415339257200, *30303033344639443137*)
30303033354244363031 30303033354133423742
java.io.IOException: Key out of order! DecoratedKey(2687072396429900180, *30303033354133423742*) DecoratedKey(-7838239767410066684, *30303033354145344534*)
30303034313442354137 3034313635363334
java.io.IOException: Key out of order! DecoratedKey(1516003874415400462, *3034313635363334*) DecoratedKey(-9106177395653818217, *3030303431444238*)
30303035373044373435 30303035373044334631
java.io.IOException: Key out of order! DecoratedKey(-3645715702154616540, *30303035373044334631*) DecoratedKey(-4296696226469000945, *30303035373132364138*)
_And completely different ones:_
30303041333745373543 7cd045c59a90d7587d8d
java.io.IOException: Key out of order! DecoratedKey(-3595402345023230196, *7cd045c59a90d7587d8d*) DecoratedKey(-5146766422778260690, *30303041333943303232*)
3030303332314144 30303033323346343932
java.io.IOException: Key out of order! DecoratedKey(7071845511166615635, *30303033323346343932*) DecoratedKey(5233296131921119414, *53d83e0012287e03*)
30303034314531374431 3806734b256c27e41ec2
java.io.IOException: Key out of order! DecoratedKey(-7720474642702543193, *3806734b256c27e41ec2*) DecoratedKey(-8072288379146044663, *30303034314136413343*)
_And sometimes there is no problem at all:_
30303033353144463637 002a31b3b31a1c2f 5d616dd38211ebb5d6ec 444236451388 1388138844463744 30303033353143394343
It's worth mentioning that we have got 22 timeout exceptions but the number of out-of-order keys is much larger than that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-6643) Limit user types to the keyspace they are defined in
[ https://issues.apache.org/jira/browse/CASSANDRA-6643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993130#comment-13993130 ] Jonathan Ellis commented on CASSANDRA-6643: --- bq. contrarily to what I said just above, if we do so, I don't think we'll be able to change our mind later and allow referencing other keyspace type Why is that? bq. I wonder if saying non qualified type and table names always refer to the logged keyspace wouldn't be simpler I think "type names always refer to the statement keyspace" makes the most sense; if we don't allow referencing types in a different keyspace, then referring to the logged keyspace is an error. If we want to help people out, we could raise an error if there is ambiguity and require explicitly specifying ks1.mytype if ks1 and ks2 both have a mytype defined. Limit user types to the keyspace they are defined in Key: CASSANDRA-6643 URL: https://issues.apache.org/jira/browse/CASSANDRA-6643 Project: Cassandra Issue Type: Bug Environment: java version 1.7.0_51 cassandra from trunk, 4b54b8... Reporter: Russ Hatch Assignee: Sylvain Lebresne Priority: Minor Fix For: 2.1 rc1 Attachments: 6643.txt I'm not 100% certain this is a bug. The current syntax for alter type rename requires the keyspace on the old and new type name (if a keyspace is not active). So, to rename the type 'foo' to 'bar', you have to issue this statement: ALTER TYPE ks.foo rename to ks.bar . As a result, this syntax will also allow renaming the type into another existing keyspace, which updates the metadata in system.schema_usertypes. I'm wondering if perhaps we can omit the second keyspace prefix and implicitly rename into the same keyspace. 
To reproduce:
{noformat}
cqlsh> create keyspace user_types with replication = {'class':'SimpleStrategy', 'replication_factor':3} ;
cqlsh> create keyspace user_types2 with replication = {'class':'SimpleStrategy', 'replication_factor':3} ;
cqlsh> CREATE TYPE user_types.simple_type (user_number int);
cqlsh> alter type user_types.simple_type rename to user_types2.simple_type;
{noformat}
Renaming to another keyspace is also possible when a keyspace is active, like so:
{noformat}
cqlsh:user_types> alter type simple_type rename to user_types2.simple_type;
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (CASSANDRA-4718) More-efficient ExecutorService for improved throughput
[ https://issues.apache.org/jira/browse/CASSANDRA-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999045#comment-13999045 ] Jason Brown edited comment on CASSANDRA-4718 at 5/15/14 10:40 PM: -- Adding more fuel to the testing fire, I took a first pass at having a large amount of data on disk (~2x the memory size of each box) and running the read tests - see attached file: stress_2014May15.txt. I cleared the page cache before switching to each branch for the reads, and then performed 3 rounds of stress. The goal here was to see how the sep branch compared with cassandra-2.1 when doing most of the reads from disk (with a cold page cache, or where the cache is constantly churning due to new blocks being pulled in). The short story is the sep branch performs slightly worse than the current cassandra-2.1 (which includes CASSANDRA-5663) on both ops/s and latencies. I'm going to do one more test where I preload a good chunk of the data into the page cache, then run the stress - hopefully to emulate the case where most reads come from the page cache and some go to disk. Will try to use a less naive key distribution alg, to ensure that we hit the hot keys, which is provided by stress. was (Author: jasobrown): Adding more fuel to the testing fire, i took a first pass at having a large of amount of data on disk (~2x the memory size of each box), and running the read tests - see attached file: stress_2014May15.txt. I cleared the page cache before switching to each branch from for the reads, and then performed 3 rounds of stress. The goal here was to see how the sep branch compared with cassandra-2.1 when doing most of the reads from disk (with a cold page cache, or where the cache is constantly churning due to new blocks being pulled in). The short story is the sep branch performs slightly worse the current cassandra-2.1 (which includes CASSANDRA-5663) on both ops/s and latencies. 
I'm going to do one more test where I preload a good chunk of the data into the page cache, then run the stress - hopefully to emulate the case where most reads come from the page cache and some go to disk. Will me to use a less naive key distribution alg, to ensure that we hit the hot keys, which is provided by stress. More-efficient ExecutorService for improved throughput -- Key: CASSANDRA-4718 URL: https://issues.apache.org/jira/browse/CASSANDRA-4718 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Benedict Priority: Minor Labels: performance Fix For: 2.1.0 Attachments: 4718-v1.patch, PerThreadQueue.java, aws.svg, aws_read.svg, backpressure-stress.out.txt, baq vs trunk.png, belliotsmith_branches-stress.out.txt, jason_read.svg, jason_read_latency.svg, jason_write.svg, op costs of various queues.ods, stress op rate with various queues.ods, stress_2014May15.txt, v1-stress.out Currently all our execution stages dequeue tasks one at a time. This can result in contention between producers and consumers (although we do our best to minimize this by using LinkedBlockingQueue). One approach to mitigating this would be to make consumer threads do more work in bulk instead of just one task per dequeue. (Producer threads tend to be single-task oriented by nature, so I don't see an equivalent opportunity there.) BlockingQueue has a drainTo(collection, int) method that would be perfect for this. However, no ExecutorService in the jdk supports using drainTo, nor could I google one. What I would like to do here is create just such a beast and wire it into (at least) the write and read stages. (Other possible candidates for such an optimization, such as the CommitLog and OutboundTCPConnection, are not ExecutorService-based and will need to be one-offs.) AbstractExecutorService may be useful. The implementations of ICommitLogExecutorService may also be useful. 
(Despite the name these are not actual ExecutorServices, although they share the most important properties of one.) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (CASSANDRA-7242) More compaction visibility into thread pool and per CF
[ https://issues.apache.org/jira/browse/CASSANDRA-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13999865#comment-13999865 ] Chris Lohfink commented on CASSANDRA-7242: -- sorry, was the cassandra-2.0 branch. Is there a branch that patches should generally be made against? More compaction visibility into thread pool and per CF -- Key: CASSANDRA-7242 URL: https://issues.apache.org/jira/browse/CASSANDRA-7242 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Chris Lohfink Assignee: Chris Lohfink Priority: Minor Attachments: 7242_jmxify_compactionpool.txt, 7242_per_cf_compactionstats.txt Two parts to this to help diagnose compaction issues/bottlenecks. Could be two different issues but they're pretty closely related. First is adding per-column-family pending compactions. When there's a lot of backed-up compactions but multiple ones currently being compacted, it's hard to identify which CF is causing the backlog. In the patch provided this doesn't cover the compactions in the thread pool's queue like compactionstats does, but I'm not sure how big that ever gets or if it needs to be covered... which brings me to the second idea. Second is to change compactionExecutor to extend JMXEnabledThreadPoolExecutor. The big difference there would be the blocking rejection handler. With a 2^31 pending queue, the blocking becoming an issue is a pretty extreme case in itself that would most likely OOM the server. So the different rejection policy shouldn't cause much of an issue, but if it does we can always override it to use the default behavior. Would help identify scenarios where corrupted sstables or unhandled exceptions etc. killing the compactions lead to a large backlog with nothing actively working. Also just for added visibility into this from tpstats. -- This message was sent by Atlassian JIRA (v6.2#6252)
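The "blocking rejection handler" mentioned above can be sketched as a `RejectedExecutionHandler` that parks the submitting thread on the full queue instead of throwing `RejectedExecutionException`. This is a common JDK pattern and an assumption about JMXEnabledThreadPoolExecutor's behavior, not its actual source:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BlockingRejectionSketch
{
    // When the pool's queue is full, block the producer with put()
    // until a worker frees a slot, rather than rejecting the task.
    static final RejectedExecutionHandler BLOCK = (task, executor) -> {
        try
        {
            executor.getQueue().put(task);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException("interrupted while blocked", e);
        }
    };

    public static void main(String[] args) throws InterruptedException
    {
        // Tiny pool and queue so the handler is actually exercised.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2), BLOCK);
        for (int i = 0; i < 10; i++)
            pool.execute(() -> { /* compaction work would go here */ });
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("all tasks accepted without rejection");
    }
}
```

The trade-off the comment alludes to: with an effectively unbounded (2^31) queue this handler almost never fires, so switching policies should be low-risk in practice.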