[jira] [Updated] (CASSANDRA-7396) Allow selecting Map key, List index
[ https://issues.apache.org/jira/browse/CASSANDRA-7396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anonymous updated CASSANDRA-7396: - Status: Ready to Commit (was: Patch Available) > Allow selecting Map key, List index > --- > > Key: CASSANDRA-7396 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7396 > Project: Cassandra > Issue Type: New Feature > Components: CQL >Reporter: Jonathan Ellis >Assignee: Robert Stupp > Labels: cql, docs-impacting > Fix For: 3.x > > Attachments: 7396_unit_tests.txt > > > Allow "SELECT map['key']" and "SELECT list[index]". (Selecting a UDT subfield > is already supported.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11031) MultiTenant : support “ALLOW FILTERING" for Partition Key
[ https://issues.apache.org/jira/browse/CASSANDRA-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhaoYang updated CASSANDRA-11031: - Summary: MultiTenant : support "ALLOW FILTERING" for Partition Key (was: MultiTenant : support "ALLOW FILTERING" for First Partition Key) > MultiTenant : support "ALLOW FILTERING" for Partition Key > - > > Key: CASSANDRA-11031 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11031 > Project: Cassandra > Issue Type: New Feature > Components: CQL >Reporter: ZhaoYang >Assignee: ZhaoYang >Priority: Minor > Fix For: 3.x > > Attachments: CASSANDRA-11031-3.7.patch > > > Currently, ALLOW FILTERING only works for secondary index columns or > clustering columns. And it's slow, because Cassandra will read all data from > SSTables on disk into memory to filter. > But we can support ALLOW FILTERING on the partition key: as far as I know, > partition keys are in memory, so we can easily filter them, and then read the > required data from SSTables. > This will be similar to "SELECT * FROM table", which scans the entire cluster. > CREATE TABLE multi_tenant_table ( > tenant_id text, > pk2 text, > c1 text, > c2 text, > v1 text, > v2 text, > PRIMARY KEY ((tenant_id,pk2),c1,c2) > ) ; > SELECT * FROM multi_tenant_table WHERE tenant_id = 'datastax' ALLOW FILTERING; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11537) Give clear error when certain nodetool commands are issued before server is ready
[ https://issues.apache.org/jira/browse/CASSANDRA-11537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335269#comment-15335269 ] Edward Capriolo commented on CASSANDRA-11537: - Fixed java test issues https://github.com/apache/cassandra/compare/trunk...edwardcapriolo:CASSANDRA-11537-2?expand=1 > Give clear error when certain nodetool commands are issued before server is > ready > - > > Key: CASSANDRA-11537 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11537 > Project: Cassandra > Issue Type: Improvement >Reporter: Edward Capriolo >Assignee: Edward Capriolo >Priority: Minor > Labels: lhf > > As an ops person upgrading and servicing Cassandra servers, I require a > clearer message, when I issue a nodetool command that the server is not ready > for, so that I am not confused. > Technical description: > If you deploy a new binary, restart, and issue nodetool > scrub/compact/upgradesstables etc., you get an unfriendly assertion. An exception would > be easier to understand. Also, if a user has turned assertions off, it is > unclear what might happen. > {noformat} > EC1: Throw exception to make it clear server is still in start up process. > :~# nodetool upgradesstables > error: null > -- StackTrace -- > java.lang.AssertionError > at org.apache.cassandra.db.Keyspace.open(Keyspace.java:97) > at > org.apache.cassandra.service.StorageService.getValidKeyspace(StorageService.java:2573) > at > org.apache.cassandra.service.StorageService.getValidColumnFamilies(StorageService.java:2661) > at > org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2421) > {noformat} > EC1: > Patch against 2.1 (branch) > https://github.com/apache/cassandra/compare/trunk...edwardcapriolo:exception-on-startup?expand=1 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
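The idea behind the patch can be sketched in a few lines: replace a bare `assert` with a descriptive exception so operators see a clear message instead of `error: null`. The class and method names below are illustrative stand-ins, not Cassandra's actual code.

```java
// Sketch: fail startup-sensitive operations with a descriptive exception
// instead of a bare assertion (which prints "error: null" via nodetool,
// and does nothing at all if assertions are disabled with -da).
public class StartupGuard {
    private static volatile boolean serverReady = false;

    public static void markReady() { serverReady = true; }

    public static void checkReady(String operation) {
        // Before: assert serverReady;  // opaque AssertionError, or silence with -da
        if (!serverReady) {
            throw new IllegalStateException(
                "Cannot run '" + operation + "': server is still starting up");
        }
    }

    public static void main(String[] args) {
        try {
            checkReady("upgradesstables");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Unlike an assertion, the check above cannot be compiled away at runtime, and the message names the rejected operation.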
cassandra git commit: use long math, for long results
Repository: cassandra Updated Branches: refs/heads/trunk 27395e78b -> 057c32997 use long math, for long results Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/057c3299 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/057c3299 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/057c3299 Branch: refs/heads/trunk Commit: 057c32997442b5df8842fe46aa2ebe9b178d8647 Parents: 27395e7 Author: Dave Brosius Authored: Thu Jun 16 22:32:00 2016 -0400 Committer: Dave Brosius Committed: Thu Jun 16 22:32:00 2016 -0400 -- .../cassandra/db/compaction/TimeWindowCompactionStrategy.java | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/057c3299/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java index df688c5..70f29e9 100644 --- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java +++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java @@ -189,16 +189,16 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy switch(windowTimeUnit) { case MINUTES: -lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (60 * windowTimeSize)); +lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (60L * windowTimeSize)); upperTimestamp = (lowerTimestamp + (60L * (windowTimeSize - 1L))) + 59L; break; case HOURS: -lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (3600 * windowTimeSize)); +lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (3600L * windowTimeSize)); upperTimestamp = (lowerTimestamp + (3600L * (windowTimeSize - 1L))) + 3599L; break; case DAYS: default: -lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % 
(86400 * windowTimeSize)); +lowerTimestamp = timestampInSeconds - ((timestampInSeconds) % (86400L * windowTimeSize)); upperTimestamp = (lowerTimestamp + (86400L * (windowTimeSize - 1L))) + 86399L; break; }
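The fix is subtler than it looks: in Java, `86400 * windowTimeSize` is evaluated in 32-bit int arithmetic even when the result is assigned to a long, so a large window size can silently wrap before widening happens. Adding the `L` suffix forces the multiply into long arithmetic. A small standalone demo (the window size here is chosen only to trigger the overflow):

```java
// Demonstrates why 86400 * windowTimeSize needed to become 86400L * windowTimeSize:
// the int multiply overflows first, and only the wrapped result is widened to long.
public class LongMathDemo {
    public static void main(String[] args) {
        int windowTimeSize = 30000;            // large enough that 86400 * size > Integer.MAX_VALUE
        long wrong = 86400 * windowTimeSize;   // int multiply wraps, then widens
        long right = 86400L * windowTimeSize;  // long multiply, no overflow
        System.out.println(wrong);             // -1702967296
        System.out.println(right);             // 2592000000
    }
}
```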
cassandra git commit: remove dead params
Repository: cassandra Updated Branches: refs/heads/trunk ff9673920 -> 27395e78b remove dead params Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/27395e78 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/27395e78 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/27395e78 Branch: refs/heads/trunk Commit: 27395e78befd3694535736c9756c0380c6a00516 Parents: ff96739 Author: Dave Brosius Authored: Thu Jun 16 22:28:55 2016 -0400 Committer: Dave Brosius Committed: Thu Jun 16 22:28:55 2016 -0400 -- .../db/compaction/TimeWindowCompactionStrategy.java | 5 + .../db/compaction/TimeWindowCompactionStrategyTest.java | 9 +++-- 2 files changed, 4 insertions(+), 10 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/27395e78/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java index da3ef70..df688c5 100644 --- a/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java +++ b/src/java/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategy.java @@ -19,7 +19,6 @@ package org.apache.cassandra.db.compaction; import java.util.ArrayList; -import java.util.Arrays; import java.util.Collection; import java.util.Collections; import java.util.Iterator; @@ -158,8 +157,6 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy List mostInteresting = newestBucket(buckets.left, cfs.getMinimumCompactionThreshold(), cfs.getMaximumCompactionThreshold(), - options.sstableWindowUnit, - options.sstableWindowSize, options.stcsOptions, this.highestWindowSeen); if (!mostInteresting.isEmpty()) @@ -267,7 +264,7 @@ public class TimeWindowCompactionStrategy extends AbstractCompactionStrategy * @return a bucket (list) of sstables to compact. 
*/ @VisibleForTesting -static List newestBucket(HashMultimap buckets, int minThreshold, int maxThreshold, TimeUnit sstableWindowUnit, int sstableWindowSize, SizeTieredCompactionStrategyOptions stcsOptions, long now) +static List newestBucket(HashMultimap buckets, int minThreshold, int maxThreshold, SizeTieredCompactionStrategyOptions stcsOptions, long now) { // If the current bucket has at least minThreshold SSTables, choose that one. // For any other bucket, at least 2 SSTables is enough. http://git-wip-us.apache.org/repos/asf/cassandra/blob/27395e78/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java -- diff --git a/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java b/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java index 3238170..5041b31 100644 --- a/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java +++ b/test/unit/org/apache/cassandra/db/compaction/TimeWindowCompactionStrategyTest.java @@ -26,9 +26,6 @@ import java.util.concurrent.TimeUnit; import com.google.common.collect.HashMultimap; import com.google.common.collect.Iterables; -import com.google.common.collect.Iterables; -import com.google.common.collect.Lists; - import org.junit.BeforeClass; import org.junit.Test; @@ -179,10 +176,10 @@ public class TimeWindowCompactionStrategyTest extends SchemaLoader Pair bounds = getWindowBoundsInMillis(TimeUnit.HOURS, 1, tstamp ); buckets.put(bounds.left, sstrs.get(i)); } -List newBucket = newestBucket(buckets, 4, 32, TimeUnit.HOURS, 1, new SizeTieredCompactionStrategyOptions(), getWindowBoundsInMillis(TimeUnit.HOURS, 1, System.currentTimeMillis()).left ); +List newBucket = newestBucket(buckets, 4, 32, new SizeTieredCompactionStrategyOptions(), getWindowBoundsInMillis(TimeUnit.HOURS, 1, System.currentTimeMillis()).left ); assertTrue("incoming bucket should not be accepted when it has below the min threshold SSTables", newBucket.isEmpty()); 
-newBucket = newestBucket(buckets, 2, 32, TimeUnit.HOURS, 1, new SizeTieredCompactionStrategyOptions(), getWindowBoundsInMillis(TimeUnit.HOURS, 1, System.currentTimeMi
cassandra git commit: fix Exception message generation by adding String.format markers needed
Repository: cassandra Updated Branches: refs/heads/trunk 04afa2bf5 -> ff9673920 fix Exception message generation by adding String.format markers needed Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ff967392 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ff967392 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ff967392 Branch: refs/heads/trunk Commit: ff96739207d94a3e18339566c6abf8108196f95f Parents: 04afa2b Author: Dave Brosius Authored: Thu Jun 16 22:17:14 2016 -0400 Committer: Dave Brosius Committed: Thu Jun 16 22:17:14 2016 -0400 -- src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/ff967392/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java -- diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java index 73d70f3..6c4bb60 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java @@ -315,7 +315,7 @@ public class CommitLogReader catch (EOFException eof) { if (handler.shouldSkipSegmentOnError(new CommitLogReadException( -String.format("Unexpected end of segment", mutationStart, statusTracker.errorContext), +String.format("Unexpected end of segment at %d in %s", mutationStart, statusTracker.errorContext), CommitLogReadErrorReason.EOF, statusTracker.tolerateErrorsInSection))) {
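The bug this commit fixes is easy to reproduce: `String.format` silently ignores arguments that have no matching format specifier, so the original message compiled and ran but dropped the diagnostic context. A standalone demo:

```java
// String.format drops extra arguments without error when the format string
// has no %d / %s markers for them -- the original log line lost its context.
public class FormatDemo {
    public static void main(String[] args) {
        long mutationStart = 12345L;
        String errorContext = "segment 42";

        // Before the fix: extra args silently ignored, context lost
        String before = String.format("Unexpected end of segment", mutationStart, errorContext);
        // After the fix: position and context included
        String after = String.format("Unexpected end of segment at %d in %s", mutationStart, errorContext);

        System.out.println(before); // Unexpected end of segment
        System.out.println(after);  // Unexpected end of segment at 12345 in segment 42
    }
}
```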
[jira] [Commented] (CASSANDRA-11868) unused imports and generic types
[ https://issues.apache.org/jira/browse/CASSANDRA-11868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335213#comment-15335213 ] Edward Capriolo commented on CASSANDRA-11868: - I do not think 8385 is a blocker. I did not clean up abstract types in this ticket; I only cleaned imports and a few unused constants. > unused imports and generic types > > > Key: CASSANDRA-11868 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11868 > Project: Cassandra > Issue Type: Improvement >Reporter: Edward Capriolo >Assignee: Edward Capriolo > Fix For: 3.8 > > > I was going through the Cassandra source and for busy work I started looking at > all the .java files Eclipse flags as warnings. They break down roughly into a > few cases: > 1) unused imports > 2) raw types missing <> > 3) case statements without defaults > 4) @Resource annotation > My IDE claims item 4 is not needed (it looks like we have done this to > signify methods that return objects that need to be closed). I can guess 4 was > done intentionally, and short of making our own annotation I will ignore these > for now. > I would like to tackle this busy work before I get started. I have some > questions: > 1) Do this only on trunk, or on multiple branches? > 2) Should I tackle 1, 2, 3 in separate branches/patches? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
[ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334937#comment-15334937 ] Paulo Motta commented on CASSANDRA-12015: - bq. However, beware of different RF in different DCs. You may have RF=3 in source DC and RF=5 in target DC, what will be the paired replica of the 4th replica of target DC ? Maybe use some modulo function. Same kind of issue if target DC RF > source DC RF. Hmm, good point. It seems this might be a bit harder than initially thought... I suggest we restrict this ticket to avoiding the use of dynamic snitch proximity to pick replicas to stream from, which would already prevent hotspots and help in the reported case, and tackle the more general problem of load balancing replica selection in another ticket. > Rebuilding from another DC should use different sources > --- > > Key: CASSANDRA-12015 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 > Project: Cassandra > Issue Type: Improvement >Reporter: Fabien Rousseau > > Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing > DC (ex: DC1), only the closest replica is used as a "source of data". > It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, > only one node in DC1 is streaming the data to DC2. > To build the new DC in a reasonable time, it would be better, in that case, > to stream from multiple sources, thus distributing more evenly the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11519) Add support for IBM POWER
[ https://issues.apache.org/jira/browse/CASSANDRA-11519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334750#comment-15334750 ] Rei Odaira commented on CASSANDRA-11519: Thanks for the suggestion. Let me investigate how we can do that. > Add support for IBM POWER > - > > Key: CASSANDRA-11519 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11519 > Project: Cassandra > Issue Type: Improvement > Components: Core > Environment: POWER architecture >Reporter: Rei Odaira >Assignee: Rei Odaira >Priority: Minor > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: 11519-2.1.txt, 11519-3.0.txt > > > Add support for the IBM POWER architecture (ppc, ppc64, and ppc64le) in > org.apache.cassandra.utils.FastByteOperations, > org.apache.cassandra.utils.memory.MemoryUtil, and > org.apache.cassandra.io.util.Memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11576) Add support for JNA mlockall(2) on POWER
[ https://issues.apache.org/jira/browse/CASSANDRA-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rei Odaira updated CASSANDRA-11576: --- Attachment: 11576-2.1.txt > Add support for JNA mlockall(2) on POWER > > > Key: CASSANDRA-11576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11576 > Project: Cassandra > Issue Type: Improvement > Components: Core > Environment: POWER architecture >Reporter: Rei Odaira >Assignee: Rei Odaira >Priority: Minor > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: 11576-2.1.txt > > > org.apache.cassandra.utils.CLibrary contains hard-coded C-macro values to be > passed to system calls through JNA. These values are system-dependent, and as > far as I investigated, Linux and AIX on the IBM POWER architecture define > {{MCL_CURRENT}} and {{MCL_FUTURE}} (for mlockall(2)) as different values than > the current hard-coded values. As a result, mlockall(2) fails on these > platforms. > {code} > WARN 18:51:51 Unknown mlockall error 22 > {code} > I am going to provide a patch to support JNA mlockall(2) on POWER. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11576) Add support for JNA mlockall(2) on POWER
[ https://issues.apache.org/jira/browse/CASSANDRA-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334746#comment-15334746 ] Rei Odaira commented on CASSANDRA-11576: I have updated the patch. > Add support for JNA mlockall(2) on POWER > > > Key: CASSANDRA-11576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11576 > Project: Cassandra > Issue Type: Improvement > Components: Core > Environment: POWER architecture >Reporter: Rei Odaira >Assignee: Rei Odaira >Priority: Minor > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: 11576-2.1.txt > > > org.apache.cassandra.utils.CLibrary contains hard-coded C-macro values to be > passed to system calls through JNA. These values are system-dependent, and as > far as I investigated, Linux and AIX on the IBM POWER architecture define > {{MCL_CURRENT}} and {{MCL_FUTURE}} (for mlockall(2)) as different values than > the current hard-coded values. As a result, mlockall(2) fails on these > platforms. > {code} > WARN 18:51:51 Unknown mlockall error 22 > {code} > I am going to provide a patch to support JNA mlockall(2) on POWER. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11576) Add support for JNA mlockall(2) on POWER
[ https://issues.apache.org/jira/browse/CASSANDRA-11576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rei Odaira updated CASSANDRA-11576: --- Attachment: (was: 11576-2.1.txt) > Add support for JNA mlockall(2) on POWER > > > Key: CASSANDRA-11576 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11576 > Project: Cassandra > Issue Type: Improvement > Components: Core > Environment: POWER architecture >Reporter: Rei Odaira >Assignee: Rei Odaira >Priority: Minor > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x > > Attachments: 11576-2.1.txt > > > org.apache.cassandra.utils.CLibrary contains hard-coded C-macro values to be > passed to system calls through JNA. These values are system-dependent, and as > far as I investigated, Linux and AIX on the IBM POWER architecture define > {{MCL_CURRENT}} and {{MCL_FUTURE}} (for mlockall(2)) as different values than > the current hard-coded values. As a result, mlockall(2) fails on these > platforms. > {code} > WARN 18:51:51 Unknown mlockall error 22 > {code} > I am going to provide a patch to support JNA mlockall(2) on POWER. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11988) NullPointerException when reading/compacting table
[ https://issues.apache.org/jira/browse/CASSANDRA-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334641#comment-15334641 ] Bartlomiej commented on CASSANDRA-11988: - [~carlyeks] regarding "I'm worried that other places might have the same issue": can deletion of those static columns protect us from such data corruption in the future? > NullPointerException when reading/compacting table > --- > > Key: CASSANDRA-11988 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11988 > Project: Cassandra > Issue Type: Bug >Reporter: Nimi Wariboko Jr. >Assignee: Carl Yeksigian > Fix For: 3.6 > > > I have a table that suddenly refuses to be read or compacted. Issuing a read > on the table causes an NPE. > On compaction, it returns the error > {code} > ERROR [CompactionExecutor:6] 2016-06-09 17:10:15,724 CassandraDaemon.java:213 > - Exception in thread Thread[CompactionExecutor:6,1,main] > java.lang.NullPointerException: null > at > org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:82) > ~[apache-cassandra-3.6.jar:3.6] > at > 
org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264) > ~[apache-cassandra-3.6.jar:3.6] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_45] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45] > {code} > Schema: > {code} > CREATE TABLE cmpayments.report_payments ( > reportid timeuuid, > userid timeuuid, > adjustedearnings decimal, > deleted set static, > earnings map, > gross map, > organizationid text, > payall timestamp static, > status text, > PRIMARY KEY (reportid, userid) > ) WITH CLUSTERING ORDER BY (userid ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11988) NullPointerException when reading/compacting table
[ https://issues.apache.org/jira/browse/CASSANDRA-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334626#comment-15334626 ] Carl Yeksigian commented on CASSANDRA-11988: This issue has been happening all the way back to 3.0; it was caused by CASSANDRA-9975. We hit this condition when we have tombstoned a static column and are ready to delete it. I was able to reproduce by setting {{gc_grace_seconds}} to 0 and then doing some tombstones of static columns. Looking at the usages of {{BaseRows.staticRow()}}, I'm worried that other places might have the same issue; they take the static row, expecting it to be non-null, and call {{isEmpty()}} on it. > NullPointerException when reading/compacting table > --- > > Key: CASSANDRA-11988 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11988 > Project: Cassandra > Issue Type: Bug >Reporter: Nimi Wariboko Jr. >Assignee: Carl Yeksigian > Fix For: 3.6 > > > I have a table that suddenly refuses to be read or compacted. Issuing a read > on the table causes an NPE. 
> On compaction, it returns the error > {code} > ERROR [CompactionExecutor:6] 2016-06-09 17:10:15,724 CassandraDaemon.java:213 > - Exception in thread Thread[CompactionExecutor:6,1,main] > java.lang.NullPointerException: null > at > org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:82) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60) > ~[apache-cassandra-3.6.jar:3.6] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264) > ~[apache-cassandra-3.6.jar:3.6] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_45] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45] > {code} > Schema: > {code} > 
CREATE TABLE cmpayments.report_payments ( > reportid timeuuid, > userid timeuuid, > adjustedearnings decimal, > deleted set static, > earnings map, > gross map, > organizationid text, > payall timestamp static, > status text, > PRIMARY KEY (reportid, userid) > ) WITH CLUSTERING ORDER BY (userid ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11516) Make max number of streams configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334620#comment-15334620 ] Paulo Motta commented on CASSANDRA-11516: - [~giampaolo] My bad, I actually just noticed that the executor is only used to establish stream connections. I'm not sure if the idea here is to actually bound the number of active streams, in which case it will be a bit more involved, since right now they're pretty much unbounded, or only bound the number of [post-processing streaming threads|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L54], which finalize received sstables and add them to the data tracker, and which actually caused the reported problem, so maybe [~sebastian.este...@datastax.com] will be able to clarify best. > Make max number of streams configurable > --- > > Key: CASSANDRA-11516 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11516 > Project: Cassandra > Issue Type: New Feature >Reporter: Sebastian Estevez > Labels: lhf > > Today we default to num cores. In large boxes (many cores), this is > suboptimal as it can generate huge amounts of garbage that GC can't keep up > with. > Usually we tackle issues like this with the streaming throughput levers, but > in this case the problem is CPU consumption by StreamReceiverTasks, > specifically in the IntervalTree build -- > https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257 > We need a max number of parallel streams lever to handle this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
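The ticket asks for a configurable bound on how many streams are processed in parallel, rather than defaulting to the core count. A minimal sketch of the mechanism with a fixed-size executor; names here are hypothetical illustrations, not Cassandra's actual StreamReceiveTask code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: cap concurrent post-processing tasks with a
// configurable fixed-size pool, and measure the peak concurrency reached.
public class BoundedStreamPool {
    static int peakConcurrency(int maxParallel, int tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                int now = active.incrementAndGet();       // tasks running right now
                peak.accumulateAndGet(now, Math::max);    // record the high-water mark
                try { Thread.sleep(5); } catch (InterruptedException ignored) { }
                active.decrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int maxParallelStreams = 4; // in the real feature this would come from configuration
        System.out.println("peak concurrency = " + peakConcurrency(maxParallelStreams, 32));
    }
}
```

The fixed pool guarantees the peak never exceeds the configured bound, which is the lever the ticket is asking for.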
[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
[ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334609#comment-15334609 ] DOAN DuyHai commented on CASSANDRA-12015: - +1 on the paired replica approach used for MVs. However, beware of different RF in different DCs. You may have RF=3 in source DC and RF=5 in target DC, what will be the paired replica of the 4th replica of target DC ? Maybe use some modulo function. Same kind of issue if target DC RF > source DC RF. > Rebuilding from another DC should use different sources > --- > > Key: CASSANDRA-12015 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 > Project: Cassandra > Issue Type: Improvement >Reporter: Fabien Rousseau > > Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing > DC (ex: DC1), only the closest replica is used as a "source of data". > It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, > only one node in DC1 is streaming the data to DC2. > To build the new DC in a reasonable time, it would be better, in that case, > to stream from multiple sources, thus distributing more evenly the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
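The "modulo function" suggested in the comment for the RF-mismatch case can be sketched simply: when the target DC has more replicas than the source DC, wrap around the source replica list so every target replica is still assigned some source. This is purely an illustration of the idea under discussion, not Cassandra's actual rebuild logic, and the node names are made up:

```java
import java.util.List;

// Hypothetical sketch of modulo-based replica pairing: target replica i
// streams from source replica (i mod sourceRF), wrapping when target RF > source RF.
public class PairedReplicaDemo {
    static String pairedSource(List<String> sourceReplicas, int targetReplicaIndex) {
        return sourceReplicas.get(targetReplicaIndex % sourceReplicas.size());
    }

    public static void main(String[] args) {
        List<String> sourceDc = List.of("dc1-node1", "dc1-node2", "dc1-node3"); // RF=3 in source DC
        for (int i = 0; i < 5; i++) {                                           // RF=5 in target DC
            System.out.println("target replica " + i + " streams from " + pairedSource(sourceDc, i));
        }
    }
}
```

With RF=3 source and RF=5 target, the 4th and 5th target replicas wrap back to the first two source replicas, so load stays spread rather than piling onto one node.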
[jira] [Commented] (CASSANDRA-11363) Blocked NTR When Connecting Causing Excessive Load
[ https://issues.apache.org/jira/browse/CASSANDRA-11363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334604#comment-15334604 ] T Jake Luciani commented on CASSANDRA-11363: The Native Transport Request pool is the only thread pool that has a bounded limit (128). The NTR pool uses the SEPExecutor, which effectively blocks until the queue has room. However, if I'm reading it correctly, the SEPWorker goes into a spin loop in some scenarios when there is no work, so perhaps we are hitting some edge case when the tasks are blocked. [~pauloricardomg] perhaps try setting this queue to something small like 4 to force blocking? > Blocked NTR When Connecting Causing Excessive Load > -- > > Key: CASSANDRA-11363 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11363 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: Russell Bradberry >Assignee: Paulo Motta > Attachments: cassandra-102-cms.stack, cassandra-102-g1gc.stack > > > When upgrading from 2.1.9 to 2.1.13, we are witnessing an issue where the > machine load increases to very high levels (> 120 on an 8 core machine) and > native transport requests get blocked in tpstats. > I was able to reproduce this with both CMS and G1GC as well as on JVM 7 and 8. > The issue does not seem to affect the nodes running 2.1.9. > The issue seems to coincide with the number of connections OR the number of > total requests being processed at a given time (as the latter increases with > the former in our system) > Currently there are between 600 and 800 client connections on each machine and > each machine is handling roughly 2000-3000 client requests per second. > Disabling the binary protocol fixes the issue for this node but isn't a > viable option cluster-wide. 
> Here is the output from tpstats: > {code} > Pool NameActive Pending Completed Blocked All > time blocked > MutationStage 0 88387821 0 > 0 > ReadStage 0 0 355860 0 > 0 > RequestResponseStage 0 72532457 0 > 0 > ReadRepairStage 0 0150 0 > 0 > CounterMutationStage 32 104 897560 0 > 0 > MiscStage 0 0 0 0 > 0 > HintedHandoff 0 0 65 0 > 0 > GossipStage 0 0 2338 0 > 0 > CacheCleanupExecutor 0 0 0 0 > 0 > InternalResponseStage 0 0 0 0 > 0 > CommitLogArchiver 0 0 0 0 > 0 > CompactionExecutor2 190474 0 > 0 > ValidationExecutor0 0 0 0 > 0 > MigrationStage0 0 10 0 > 0 > AntiEntropyStage 0 0 0 0 > 0 > PendingRangeCalculator0 0310 0 > 0 > Sampler 0 0 0 0 > 0 > MemtableFlushWriter 110 94 0 > 0 > MemtablePostFlush 134257 0 > 0 > MemtableReclaimMemory 0 0 94 0 > 0 > Native-Transport-Requests 128 156 38795716 > 278451 > Message type Dropped > READ 0 > RANGE_SLICE 0 > _TRACE 0 > MUTATION 0 > COUNTER_MUTATION 0 > BINARY 0 > REQUEST_RESPONSE 0 > PAGED_RANGE 0 > READ_REPAIR 0 > {code} > Attached is the jstack output for both CMS and G1GC. > Flight recordings are here: > https://s3.amazonaws.com/simple-logs/cassandra-102-cms.jfr > https://s3.amazonaws.com/simple-logs/cassandra-102-g1gc.jfr > It is interesting to note that while the flight recording was taking place, > the load on the machine went back to healthy, and when the flight recording > finished the load went back to > 100. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
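The "set this queue to something small like 4" experiment can be mimicked outside Cassandra with a plain JDK executor. This is a generic sketch, not the SEPExecutor itself: a tiny bounded queue plus a caller-runs rejection policy produces the same back-pressure behavior, where submitters block or help rather than piling up unbounded pending work.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Generic JDK sketch (not Cassandra's SEPExecutor): a small bounded queue with
// CallerRunsPolicy makes submitters run overflow tasks themselves instead of
// letting pending work grow without bound.
public class BoundedPool {
    static int runAll(int tasks) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(4),                 // the "queue of 4" experiment
                new ThreadPoolExecutor.CallerRunsPolicy());  // rejected => caller runs it
        for (int i = 0; i < tasks; i++)
            pool.execute(done::incrementAndGet);
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return done.get();
    }
}
```

Every task still completes; the cap only changes who does the work and when, which is why a small queue is a cheap way to force the blocking path under load.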
[jira] [Updated] (CASSANDRA-12010) UserTypesTest# is failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-12010: -- Fix Version/s: 3.x > UserTypesTest# is failing on trunk > -- > > Key: CASSANDRA-12010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12010 > Project: Cassandra > Issue Type: Test > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov > Fix For: 3.x > > > Test failure: > http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/ > This was caused by the merge after > [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably > coincided with some other change, as this failure did not happen during the > [test run on the > branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8700) replace the wiki with docs in the git repo
[ https://issues.apache.org/jira/browse/CASSANDRA-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334573#comment-15334573 ] Tyler Hobbs commented on CASSANDRA-8700: bq. On the cqlsh doc though, I wonder if it's a good idea to include the description of the command line options, and even of the special commands? Feels like we'll easily forget to update it, and it doesn't seem to add a lot of value over getting the help from cqlsh directly. I think the main advantage is that this documentation will show up in search results. This is particularly useful if you don't know what you're looking for, or you don't know if it's a command-line thing or a special command. I also considered including docs for {{cqlshrc}} here for the same reason, but those are technically already online and searchable (although not nicely formatted). However, it would be nice to generate the docs directly from the code, as you mention. Sphinx works well with python, so this is probably reasonable to do. I'll look into it. > replace the wiki with docs in the git repo > -- > > Key: CASSANDRA-8700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8700 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Jon Haddad >Assignee: Sylvain Lebresne >Priority: Blocker > Fix For: 3.8 > > Attachments: TombstonesAndGcGrace.md, bloom_filters.md, > compression.md, contributing.zip, getting_started.zip, hardware.md > > > The wiki as it stands is pretty terrible. It takes several minutes to apply > a single update, and as a result, it's almost never updated. The information > there has very little context as to what version it applies to. Most people > I've talked to that try to use the information they find there find it is > more confusing than helpful. 
> I'd like to propose that instead of using the wiki, the doc directory in the > cassandra repo be used for docs (already used for CQL3 spec) in a format that > can be built to a variety of output formats like HTML / epub / etc. I won't > start the bikeshedding on which markup format is preferable - but there are > several options that can work perfectly fine. I've personally used sphinx w/ > restructured text, and markdown. Both can build easily and as an added bonus > be pushed to readthedocs (or something similar) automatically. For an > example, see cqlengine's documentation, which I think is already > significantly better than the wiki: > http://cqlengine.readthedocs.org/en/latest/ > In addition to being overall easier to maintain, putting the documentation in > the git repo adds context, since it evolves with the versions of Cassandra. > If the wiki were kept even remotely up to date, I wouldn't bother with this, > but not having at least some basic documentation in the repo, or anywhere > associated with the project, is frustrating. > For reference, the last 3 updates were: > 1/15/15 - updating committers list > 1/08/15 - updating contributors and how to contribute > 12/16/14 - added a link to CQL docs from wiki frontpage (by me) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12010) UserTypesTest# is failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334563#comment-15334563 ] Joel Knighton commented on CASSANDRA-12010: --- +1, lgtm. > UserTypesTest# is failing on trunk > -- > > Key: CASSANDRA-12010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12010 > Project: Cassandra > Issue Type: Test > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov > Fix For: 3.x > > > Test failure: > http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/ > This was caused by the merge after > [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably > coincided with some other change, as this failure did not happen during the > [test run on the > branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12010) UserTypesTest# is failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-12010: -- Status: Ready to Commit (was: Patch Available) > UserTypesTest# is failing on trunk > -- > > Key: CASSANDRA-12010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12010 > Project: Cassandra > Issue Type: Test > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov > Fix For: 3.x > > > Test failure: > http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/ > This was caused by the merge after > [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably > coincided with some other change, as this failure did not happen during the > [test run on the > branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12010) UserTypesTest# is failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Knighton updated CASSANDRA-12010: -- Component/s: Testing > UserTypesTest# is failing on trunk > -- > > Key: CASSANDRA-12010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12010 > Project: Cassandra > Issue Type: Test > Components: Testing >Reporter: Alex Petrov >Assignee: Alex Petrov > Fix For: 3.x > > > Test failure: > http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/ > This was caused by the merge after > [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably > coincided with some other change, as this failure did not happen during the > [test run on the > branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
[ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334555#comment-15334555 ] Paulo Motta commented on CASSANDRA-12015: - bq. However, it does not help the original concern of this JIRA, which is to have some sort of randomization/round-robin selection for the source replica to stream data from. I think there are two concerns here: a) Improve source diversity for single node rebuilds b) For simultaneous rebuilds, divide the load more evenly across replicas. From my understanding a) is easily solvable by using token order instead of proximity to pick replicas to stream from, but this does not solve b) because primary replicas from simultaneous rebuilds might become overloaded. Maybe b) can be solved without keeping state by using a paired replica approach similar to MVs? > Rebuilding from another DC should use different sources > --- > > Key: CASSANDRA-12015 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 > Project: Cassandra > Issue Type: Improvement >Reporter: Fabien Rousseau > > Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing > DC (ex: DC1), only the closest replica is used as a "source of data". > It works but is not optimal, because with RF=3 and a 3-node cluster, > only one node in DC1 is streaming the data to DC2. > To build the new DC in a reasonable time, it would be better, in that case, > to stream from multiple sources, thus distributing the load more evenly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
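Concern (a), replacing snitch proximity with token order, amounts to something like this sketch. The types are illustrative (a plain `Long` stands in for a ring token); real code would sort by the replica's position on the ring.

```java
import java.util.Collections;
import java.util.Map;

// Illustrative only: pick the streaming source by ring/token order instead of
// snitch proximity, so source selection is deterministic and independent of
// DynamicSnitch scores. A Long stands in for a token here.
public class TokenOrderSource {
    static <T> T firstInTokenOrder(Map<Long, T> replicasByToken) {
        // smallest token wins; no dependence on latency measurements
        Long min = Collections.min(replicasByToken.keySet());
        return replicasByToken.get(min);
    }
}
```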
[jira] [Updated] (CASSANDRA-12018) CDC follow-ups
[ https://issues.apache.org/jira/browse/CASSANDRA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-12018: Description: h6. Platform independent implementation of DirectorySizeCalculator On linux, simplify to {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}} h6. Refactor DirectorySizeCalculator bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, the listFiles step? Either list the files and just loop through them, or do the walkFileTree operation – you are now doing the same work twice. Use a plain long instead of the atomic as the class is still thread-unsafe. h6. TolerateErrorsInSection should not depend on previous SyncSegment status in CommitLogReader bq. tolerateErrorsInSection &=: I don't think it was intended for the value to depend on previous iterations. h6. Refactor interface of SimpleCachedBufferPool bq. SimpleCachedBufferPool should provide getThreadLocalReusableBuffer(int size) which should automatically reallocate if the available size is less, and not expose a setter at all. h6. Change CDC exception to WriteFailureException instead of WriteTimeoutException h6. Remove unused CommitLogTest.testRecovery(byte[] logData) > CDC follow-ups > -- > > Key: CASSANDRA-12018 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12018 > Project: Cassandra > Issue Type: Improvement >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie >Priority: Minor > > h6. Platform independent implementation of DirectorySizeCalculator > On linux, simplify to > {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}} > h6. Refactor DirectorySizeCalculator > bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, > the listFiles step? Either list the files and just loop through them, or do > the walkFileTree operation – you are now doing the same work twice. Use a > plain long instead of the atomic as the class is still thread-unsafe. > h6. 
TolerateErrorsInSection should not depend on previous SyncSegment status > in CommitLogReader > bq. tolerateErrorsInSection &=: I don't think it was intended for the value > to depend on previous iterations. > h6. Refactor interface of SimpleCachedBufferPool > bq. SimpleCachedBufferPool should provide getThreadLocalReusableBuffer(int > size) which should automatically reallocate if the available size is less, > and not expose a setter at all. > h6. Change CDC exception to WriteFailureException instead of > WriteTimeoutException > h6. Remove unused CommitLogTest.testRecovery(byte[] logData) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
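The flat-listing one-liner quoted in the ticket behaves like this minimal sketch. Note the caveats that motivate the ticket: it ignores subdirectories, and the real calculator must also tolerate files deleted mid-scan and the much slower `listFiles` performance on Windows.

```java
import java.io.File;
import java.util.Arrays;

// Sketch of the proposed flat-listing simplification from the ticket text.
// Not the real DirectorySizeCalculator: no recursion, no alive/visited sets.
public class DirSize {
    static long size(File dir) {
        File[] files = dir.listFiles();
        if (files == null) return 0L;            // not a directory / IO error
        return Arrays.stream(files)
                     .filter(File::isFile)       // skip subdirectories
                     .mapToLong(File::length)
                     .sum();
    }
}
```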
[jira] [Updated] (CASSANDRA-12018) CDC follow-ups
[ https://issues.apache.org/jira/browse/CASSANDRA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-12018: Description: (was: Parent ticket to hold subtasks for things that came up during CDC discussion) > CDC follow-ups > -- > > Key: CASSANDRA-12018 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12018 > Project: Cassandra > Issue Type: Improvement >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Deleted] (CASSANDRA-12019) Platform independent implementation of DirectorySizeCalculator and refactor
[ https://issues.apache.org/jira/browse/CASSANDRA-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie deleted CASSANDRA-12019: > Platform independent implementation of DirectorySizeCalculator and refactor > --- > > Key: CASSANDRA-12019 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12019 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie >Priority: Minor > > On linux, simplify to > {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}} > It's simpler and performs better, however is much slower on Windows. > See discussion on CASSANDRA-8844. > Also: > bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, > the listFiles step? Either list the files and just loop through them, or do > the walkFileTree operation – you are now doing the same work twice. Use a > plain long instead of the atomic as the class is still thread-unsafe. > So the existing class could use a refactor as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12019) Platform independent implementation of DirectorySizeCalculator and refactor
[ https://issues.apache.org/jira/browse/CASSANDRA-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-12019: Summary: Platform independent implementation of DirectorySizeCalculator and refactor (was: Platform independent implementation of DirectorySizeCalculator) > Platform independent implementation of DirectorySizeCalculator and refactor > --- > > Key: CASSANDRA-12019 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12019 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie >Priority: Minor > > On linux, simplify to > {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}} > It's simpler and performs better, however is much slower on Windows. > See discussion on CASSANDRA-8844. > Also: > bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, > the listFiles step? Either list the files and just loop through them, or do > the walkFileTree operation – you are now doing the same work twice. Use a > plain long instead of the atomic as the class is still thread-unsafe. > So the existing class could use a refactor as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12019) Platform independent implementation of DirectorySizeCalculator
[ https://issues.apache.org/jira/browse/CASSANDRA-12019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-12019: Description: On linux, simplify to {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}} It's simpler and performs better, however is much slower on Windows. See discussion on CASSANDRA-8844. Also: bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, the listFiles step? Either list the files and just loop through them, or do the walkFileTree operation – you are now doing the same work twice. Use a plain long instead of the atomic as the class is still thread-unsafe. So the existing class could use a refactor as well. was: On linux, simplify to {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}} It's simpler and performs better, however is much slower on Windows. See discussion on CASSANDRA-8844. > Platform independent implementation of DirectorySizeCalculator > -- > > Key: CASSANDRA-12019 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12019 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joshua McKenzie >Assignee: Joshua McKenzie >Priority: Minor > > On linux, simplify to > {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}} > It's simpler and performs better, however is much slower on Windows. > See discussion on CASSANDRA-8844. > Also: > bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, > the listFiles step? Either list the files and just loop through them, or do > the walkFileTree operation – you are now doing the same work twice. Use a > plain long instead of the atomic as the class is still thread-unsafe. > So the existing class could use a refactor as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12019) Platform independent implementation of DirectorySizeCalculator
Joshua McKenzie created CASSANDRA-12019: --- Summary: Platform independent implementation of DirectorySizeCalculator Key: CASSANDRA-12019 URL: https://issues.apache.org/jira/browse/CASSANDRA-12019 Project: Cassandra Issue Type: Sub-task Reporter: Joshua McKenzie Assignee: Joshua McKenzie Priority: Minor On linux, simplify to {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}} It's simpler and performs better, however is much slower on Windows. See discussion on CASSANDRA-8844. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (CASSANDRA-12018) CDC follow-ups
Joshua McKenzie created CASSANDRA-12018: --- Summary: CDC follow-ups Key: CASSANDRA-12018 URL: https://issues.apache.org/jira/browse/CASSANDRA-12018 Project: Cassandra Issue Type: Improvement Reporter: Joshua McKenzie Assignee: Joshua McKenzie Priority: Minor Parent ticket to hold subtasks for things that came up during CDC discussion -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
[ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334412#comment-15334412 ] DOAN DuyHai commented on CASSANDRA-12015: - I have checked the Git history and it seems that the {{snitch.getSortedListByProximity(address, rangeAddresses.get(range))}} line has always been there since 2012. Looks like nobody has noticed that DynamicSnitch can create this kind of hotspot since then, so yes we may update this. However, it does not help the original concern of this JIRA, which is to have some sort of randomization/round-robin selection for the source replica to stream data from. If we replace the dynamic snitch by {{AbstractEndpointSnitch.sortByProximity}}, it will also always pick the *same* replica for a given token range, whereas the idea is to pick randomly or to round-robin the replica. But it also means that we need to keep *state* to know which replica has been selected previously for a given token range so that we can move to the next one. And I'm not sure whether having *state* is desirable or technically feasible. > Rebuilding from another DC should use different sources > --- > > Key: CASSANDRA-12015 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 > Project: Cassandra > Issue Type: Improvement >Reporter: Fabien Rousseau > > Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing > DC (ex: DC1), only the closest replica is used as a "source of data". > It works but is not optimal, because with RF=3 and a 3-node cluster, > only one node in DC1 is streaming the data to DC2. > To build the new DC in a reasonable time, it would be better, in that case, > to stream from multiple sources, thus distributing the load more evenly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
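One way to get a round-robin-like spread without persisting any state is to derive the starting replica from the token range itself. This is a hypothetical sketch, not a proposed patch: distinct ranges fan out across the sorted replica list deterministically, and nothing needs to remember which replica was picked before.

```java
import java.util.List;

// Hypothetical, stateless alternative to remembering the last-used source:
// hash the token range to an offset into the sorted replica list, so different
// ranges land on different replicas with no per-range state kept anywhere.
public class StatelessRoundRobin {
    static <T> T pickSource(List<T> sortedReplicas, Object tokenRange) {
        // floorMod keeps the index non-negative even for negative hash codes
        int idx = Math.floorMod(tokenRange.hashCode(), sortedReplicas.size());
        return sortedReplicas.get(idx);
    }
}
```

The trade-off: a given range always maps to the same source, so this spreads load across ranges rather than rotating over time, but it avoids the state-keeping concern raised above.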
[jira] [Commented] (CASSANDRA-12010) UserTypesTest# is failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334373#comment-15334373 ] Alex Petrov commented on CASSANDRA-12010: - Great catch! I've completely overlooked {{beforeAndAfterFlush}}. I've compared and ported all the changes from [here|https://github.com/ifesdjeen/cassandra/commit/0034d8b60acd52ff517cf8c7ab1ac86277c3dbc3]. Updated tree: |[trunk|https://github.com/ifesdjeen/cassandra/tree/12010-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-12010-trunk-testall/] > UserTypesTest# is failing on trunk > -- > > Key: CASSANDRA-12010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12010 > Project: Cassandra > Issue Type: Test >Reporter: Alex Petrov >Assignee: Alex Petrov > > Test failure: > http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/ > This was caused by the merge after > [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably > coincided with some other change, as this failure did not happen during the > [test run on the > branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10202) simplify CommitLogSegmentManager
[ https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-10202: Status: Patch Available (was: Open) > simplify CommitLogSegmentManager > > > Key: CASSANDRA-10202 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10202 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths >Reporter: Jonathan Ellis >Assignee: Branimir Lambov >Priority: Minor > > Now that we only keep one active segment around we can simplify this from the > old recycling design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10202) simplify CommitLogSegmentManager
[ https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334230#comment-15334230 ] Branimir Lambov commented on CASSANDRA-10202: - Branch is updated to remove the custom concurrent list implementation. > simplify CommitLogSegmentManager > > > Key: CASSANDRA-10202 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10202 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths >Reporter: Jonathan Ellis >Assignee: Branimir Lambov >Priority: Minor > > Now that we only keep one active segment around we can simplify this from the > old recycling design. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11866) nodetool repair does not obey the column family parameter when -st and -et are provided (subrange repair)
[ https://issues.apache.org/jira/browse/CASSANDRA-11866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334184#comment-15334184 ] Paulo Motta commented on CASSANDRA-11866: - [~mahdix] interested to take this? should be quite easy. > nodetool repair does not obey the column family parameter when -st and -et > are provided (subrange repair) > - > > Key: CASSANDRA-11866 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11866 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: Red Hat Enterprise Linux Server release 6.7 (Santiago) > x86_64 >Reporter: Shiva Venkateswaran > Labels: newbie > Fix For: 2.1.x > > > Command 1: Repairs all the CFs in ADL_GLOBAL keyspace and ignores the > parameter AssetModifyTimes_data used to restrict the CFs > Executing: /aladdin/local/apps/apache-cassandra-2.1.8a/bin/nodetool -h > localhost -p 7199 -u user-pw ** repair ADL_GLOBAL AssetModifyTimes_data > -st 205279477618143669 -et 230991685737746901 -par > [2016-05-20 17:31:39,116] Starting repair command #9, repairing 1 ranges for > keyspace ADL_GLOBAL (parallelism=PARALLEL, full=true) > [2016-05-20 17:32:21,568] Repair session 3cae2530-1ed2-11e6-b490-d9df6932c7cf > for range (205279477618143669,230991685737746901] finished > Command 2: Repairs all the CFs in ADL_GLOBAL keyspace and ignores the > parameter AssetModifyTimes_data used to restrict the CFs > Executing: /aladdin/local/apps/apache-cassandra-2.1.8a/bin/nodetool -h > localhost -p 7199 -u controlRole -pw ** repair -st 205279477618143669 -et > 230991685737746901 -par -- ADL_GLOBAL AssetModifyTimes_data > [2016-05-20 17:36:34,473] Starting repair command #10, repairing 1 ranges for > keyspace ADL_GLOBAL (parallelism=PARALLEL, full=true) > [2016-05-20 17:37:15,365] Repair session ecb996d0-1ed2-11e6-b490-d9df6932c7cf > for range (205279477618143669,230991685737746901] finished > [2016-05-20 17:37:15,365] Repair command #10 finished > Command 3: Repairs only the CF 
ADL3Test1_data in keyspace ADL_GLOBAL > Executing: /aladdin/local/apps/apache-cassandra-2.1.8a/bin/nodetool -h > localhost -p 7199 -u controlRole -pw ** repair -- ADL_GLOBAL > ADL3Test1_data > [2016-05-20 17:38:35,781] Starting repair command #11, repairing 1043 ranges > for keyspace ADL_GLOBAL (parallelism=SEQUENTIAL, full=true) > [2016-05-20 17:42:32,682] Repair session 3c8af050-1ed3-11e6-b490-d9df6932c7cf > for range (6241639152751626129,6241693909092643958] finished > [2016-05-20 17:42:32,683] Repair session 3caf1a20-1ed3-11e6-b490-d9df6932c7cf > for range (-7096993048358106082,-7095000706885780850] finished > [2016-05-20 17:42:32,683] Repair session 3ccfc180-1ed3-11e6-b490-d9df6932c7cf > for range (-7218939248114487080,-7218289345961492809] finished > [2016-05-20 17:42:32,683] Repair session 3cf21690-1ed3-11e6-b490-d9df6932c7cf > for range (-5244794756638190874,-5190307341355030282] finished > [2016-05-20 17:42:32,683] Repair session 3d126fd0-1ed3-11e6-b490-d9df6932c7cf > for range (3551629701277971766,321736534916502] finished > [2016-05-20 17:42:32,683] Repair session 3d32f020-1ed3-11e6-b490-d9df6932c7cf > for range (-8139355591560661944,-8127928369093576603] finished > [2016-05-20 17:42:32,683] Repair session 3d537070-1ed3-11e6-b490-d9df6932c7cf > for range (7098010153980465751,7100863011896759020] finished > [2016-05-20 17:42:32,683] Repair session 3d73f0c0-1ed3-11e6-b490-d9df6932c7cf > for range (1004538726866173536,1008586133746764703] finished > [2016-05-20 17:42:32,683] Repair session 3d947110-1ed3-11e6-b490-d9df6932c7cf > for range (5770817093573726645,5771418910784831587] finished > . > . > . > [2016-05-20 17:42:32,732] Repair command #11 finished -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-11986) Repair using subranges (-st / -et) ignore Keyspace / Table name arguments
[ https://issues.apache.org/jira/browse/CASSANDRA-11986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta resolved CASSANDRA-11986. - Resolution: Duplicate Closing as duplicate of CASSANDRA-11866. The fix is quite trivial, we only need to pass the CF argument to [probe.forceRepairRangeAsync|https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/tools/NodeTool.java#L1950]. > Repair using subranges (-st / -et) ignore Keyspace / Table name arguments > - > > Key: CASSANDRA-11986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11986 > Project: Cassandra > Issue Type: Bug > Environment: Reproduced using ccm and Cassandra 2.1.12 >Reporter: Alain RODRIGUEZ > > When repairing, it is impossible to repair using subranges and a specific > table at the same time. > When running this: > {noformat} > date && echo "Repairing standard1 on 127.0.0.1" && time nodetool -h localhost > -p 7100 repair -dc datacenter1 -local -par -- keyspace1 standard1 > {noformat} > *Without -st / -et* options, I have the following output: > {noformat} > MacBook-Pro:~ alain$ tail -100f ~/.ccm/test-2.1.12/node1/logs/system.log > INFO [Thread-33] 2016-06-09 14:18:52,193 StorageService.java:2939 - Starting > repair command #8, repairing 3 ranges for keyspace keyspace1 > (parallelism=PARALLEL, full=true) > INFO [AntiEntropySessions:12] 2016-06-09 14:18:52,194 RepairSession.java:260 > - [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] new session: will sync > /127.0.0.1, /127.0.0.2, /127.0.0.3 on range > (3074457345618258602,-9223372036854775808] for keyspace1.[standard1] > INFO [AntiEntropySessions:12] 2016-06-09 14:18:52,195 RepairJob.java:163 - > [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] requesting merkle trees for > standard1 (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > INFO [AntiEntropyStage:1] 2016-06-09 14:18:57,433 RepairSession.java:171 - > [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for > standard1 from 
/127.0.0.2 > INFO [AntiEntropyStage:1] 2016-06-09 14:18:57,436 RepairSession.java:171 - > [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for > standard1 from /127.0.0.3 > INFO [AntiEntropyStage:1] 2016-06-09 14:18:57,439 RepairSession.java:171 - > [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for > standard1 from /127.0.0.1 > INFO [AntiEntropySessions:13] 2016-06-09 14:18:57,439 RepairSession.java:260 > - [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] new session: will sync > /127.0.0.1, /127.0.0.2, /127.0.0.3 on range > (-9223372036854775808,-3074457345618258603] for keyspace1.[standard1] > INFO [RepairJobTask:1] 2016-06-09 14:18:57,440 Differencer.java:67 - [repair > #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Endpoints /127.0.0.2 and /127.0.0.3 > are consistent for standard1 > INFO [RepairJobTask:3] 2016-06-09 14:18:57,440 Differencer.java:67 - [repair > #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Endpoints /127.0.0.3 and /127.0.0.1 > are consistent for standard1 > INFO [RepairJobTask:2] 2016-06-09 14:18:57,440 Differencer.java:67 - [repair > #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] Endpoints /127.0.0.2 and /127.0.0.1 > are consistent for standard1 > INFO [AntiEntropySessions:13] 2016-06-09 14:18:57,440 RepairJob.java:163 - > [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] requesting merkle trees for > standard1 (to [/127.0.0.2, /127.0.0.3, /127.0.0.1]) > INFO [AntiEntropyStage:1] 2016-06-09 14:18:57,440 RepairSession.java:237 - > [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] standard1 is fully synced > INFO [AntiEntropySessions:12] 2016-06-09 14:18:57,440 RepairSession.java:299 > - [repair #53e6f820-2e3c-11e6-95ae-d1beb0ba4c9e] session completed > successfully > INFO [AntiEntropyStage:1] 2016-06-09 14:19:03,676 RepairSession.java:171 - > [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for > standard1 from /127.0.0.2 > INFO [AntiEntropyStage:1] 2016-06-09 14:19:03,684 RepairSession.java:171 - > [repair 
#57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for > standard1 from /127.0.0.3 > INFO [AntiEntropyStage:1] 2016-06-09 14:19:03,758 RepairSession.java:171 - > [repair #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] Received merkle tree for > standard1 from /127.0.0.1 > INFO [AntiEntropySessions:14] 2016-06-09 14:19:03,759 RepairSession.java:260 > - [repair #5acba5f0-2e3c-11e6-95ae-d1beb0ba4c9e] new session: will sync > /127.0.0.1, /127.0.0.2, /127.0.0.3 on range > (-3074457345618258603,3074457345618258602] for keyspace1.[standard1] > INFO [RepairJobTask:1] 2016-06-09 14:19:03,759 Differencer.java:67 - [repair > #57074af0-2e3c-11e6-95ae-d1beb0ba4c9e] Endpoints /127.0.0.2 and /127.0.0.3 > are consistent for standard1 > INFO [AntiEntropySessions:1
[jira] [Commented] (CASSANDRA-11516) Make max number of streams configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334171#comment-15334171 ] Giampaolo commented on CASSANDRA-11516: --- Thanks, [~pauloricardomg], for the pointer. That was my first choice at the beginning, but the issue refers to {{StreamReceiveTask}}. I will go with the one you gave me. > Make max number of streams configurable > --- > > Key: CASSANDRA-11516 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11516 > Project: Cassandra > Issue Type: New Feature >Reporter: Sebastian Estevez > Labels: lhf > > Today we default to num cores. In large boxes (many cores), this is > suboptimal as it can generate huge amounts of garbage that GC can't keep up > with. > Usually we tackle issues like this with the streaming throughput levers but > in this case the problem is CPU consumption by StreamReceiverTasks > specifically in the IntervalTree build -- > https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257 > We need a max number of parallel streams lever to handle this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12008) Make decommission operations resumable
[ https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-12008: Priority: Minor (was: Major) Component/s: (was: Lifecycle) Streaming and Messaging Issue Type: Improvement (was: Bug) Summary: Make decommission operations resumable (was: Allow retrying failed streams (or stop them from failing)) > Make decommission operations resumable > -- > > Key: CASSANDRA-12008 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12008 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Tom van der Woerdt >Priority: Minor > > We're dealing with large data sets (multiple terabytes per node) and > sometimes we need to add or remove nodes. These operations are very dependent > on the entire cluster being up, so while we're joining a new node (which > sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases > something does. > It would be great if the ability to retry streams was implemented. 
> Example to illustrate the problem : > {code} > 03:18 PM ~ $ nodetool decommission > error: Stream failed > -- StackTrace -- > org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274) > at java.lang.Thread.run(Thread.java:745) > 08:04 PM ~ $ nodetool decommission > nodetool: Unsupported operation: Node in LEAVING state; wait for status to > become normal or restart > See 'nodetool help' or 'nodetool help '. 
> {code} > Streaming failed, probably due to load : > {code} > ERROR [STREAM-IN-/] 2016-06-14 18:05:47,275 StreamSession.java:520 - > [Stream #] Streaming error occurred > java.net.SocketTimeoutException: null > at > sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) > ~[na:1.8.0_77] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > ~[na:1.8.0_77] > at > java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) > ~[na:1.8.0_77] > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] > {code} > If implementing retries is not possible, can we have a 'nodetool decommission > resume'? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
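The retry ability being requested here boils down to wrapping the stream operation in a bounded retry loop instead of aborting on the first failure. A generic sketch of that pattern (not Cassandra code; `withRetries` and the failure simulation are illustrative):

```java
import java.util.function.Supplier;

public class BoundedRetry {
    // Retry a failing operation up to maxAttempts times before giving up.
    // A real implementation would also log each failure and back off between tries.
    static <T> T withRetries(Supplier<T> op, int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // all attempts exhausted
    }

    public static void main(String[] args) {
        int[] failures = {2}; // simulate a stream that fails twice, then succeeds
        String result = withRetries(() -> {
            if (failures[0]-- > 0) throw new RuntimeException("stream failed");
            return "streamed";
        }, 3);
        System.out.println(result); // prints "streamed"
    }
}
```

The same structure applies whether the retry happens per stream session or, as the retitled ticket suggests, as a `resume` of the whole decommission.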
[jira] [Commented] (CASSANDRA-12008) Allow retrying failed streams (or stop them from failing)
[ https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334153#comment-15334153 ] Paulo Motta commented on CASSANDRA-12008: - bq. As for "nodetool decommission resume", can we have that? It's definitely possible, so I will update the ticket to reflect that. > Allow retrying failed streams (or stop them from failing) > - > > Key: CASSANDRA-12008 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12008 > Project: Cassandra > Issue Type: Bug > Components: Lifecycle >Reporter: Tom van der Woerdt > > We're dealing with large data sets (multiple terabytes per node) and > sometimes we need to add or remove nodes. These operations are very dependent > on the entire cluster being up, so while we're joining a new node (which > sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases > something does. > It would be great if the ability to retry streams was implemented. > Example to illustrate the problem : > {code} > 03:18 PM ~ $ nodetool decommission > error: Stream failed > -- StackTrace -- > org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274) > at java.lang.Thread.run(Thread.java:745) > 08:04 PM ~ $ nodetool decommission > nodetool: Unsupported operation: Node in LEAVING state; wait for status to > become normal or restart > See 'nodetool help' or 'nodetool help '. > {code} > Streaming failed, probably due to load : > {code} > ERROR [STREAM-IN-/] 2016-06-14 18:05:47,275 StreamSession.java:520 - > [Stream #] Streaming error occurred > java.net.SocketTimeoutException: null > at > sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) > ~[na:1.8.0_77] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > ~[na:1.8.0_77] > at > java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) > ~[na:1.8.0_77] > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] > {code} > If implementing retries is not possible, can we have a 'nodetool decommission > resume'? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (CASSANDRA-12008) Allow retrying failed streams (or stop them from failing)
[ https://issues.apache.org/jira/browse/CASSANDRA-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-12008: Comment: was deleted (was: In this specific case it seems the streaming failed due to low {{streaming_socket_timeout}} value. We just found out our previous default of 1 hour was too low, and raised that to 24 hours on CASSANDRA-11840, on 3.0.7. Could you try increasing that and see if it helps with failed decommissions?) > Allow retrying failed streams (or stop them from failing) > - > > Key: CASSANDRA-12008 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12008 > Project: Cassandra > Issue Type: Bug > Components: Lifecycle >Reporter: Tom van der Woerdt > > We're dealing with large data sets (multiple terabytes per node) and > sometimes we need to add or remove nodes. These operations are very dependent > on the entire cluster being up, so while we're joining a new node (which > sometimes takes 6 hours or longer) a lot can go wrong and in a lot of cases > something does. > It would be great if the ability to retry streams was implemented. 
> Example to illustrate the problem : > {code} > 03:18 PM ~ $ nodetool decommission > error: Stream failed > -- StackTrace -- > org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186) > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:622) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:486) > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:274) > at java.lang.Thread.run(Thread.java:745) > 08:04 PM ~ $ nodetool decommission > nodetool: Unsupported operation: Node in LEAVING state; wait for status to > become normal or restart > See 'nodetool help' or 'nodetool help '. 
> {code} > Streaming failed, probably due to load : > {code} > ERROR [STREAM-IN-/] 2016-06-14 18:05:47,275 StreamSession.java:520 - > [Stream #] Streaming error occurred > java.net.SocketTimeoutException: null > at > sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:211) > ~[na:1.8.0_77] > at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103) > ~[na:1.8.0_77] > at > java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385) > ~[na:1.8.0_77] > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:54) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:268) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] > {code} > If implementing retries is not possible, can we have a 'nodetool decommission > resume'? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11516) Make max number of streams configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334142#comment-15334142 ] Paulo Motta commented on CASSANDRA-11516: - [~giampaolo] You should probably take a look at making [this|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/streaming/StreamCoordinator.java#L42] configurable. > Make max number of streams configurable > --- > > Key: CASSANDRA-11516 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11516 > Project: Cassandra > Issue Type: New Feature >Reporter: Sebastian Estevez > Labels: lhf > > Today we default to num cores. In large boxes (many cores), this is > suboptimal as it can generate huge amounts of garbage that GC can't keep up > with. > Usually we tackle issues like this with the streaming throughput levers but > in this case the problem is CPU consumption by StreamReceiverTasks > specifically in the IntervalTree build -- > https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257 > We need a max number of parallel streams lever to handle this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
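A common shape for such a cap is a system property that defaults to the core count and is clamped to a sane minimum. The sketch below is illustrative only; the property name is hypothetical, not an actual Cassandra flag:

```java
public class StreamThreadCap {
    // Hypothetical property name; the real patch would choose its own flag.
    static int maxParallelStreams() {
        int cores = Runtime.getRuntime().availableProcessors();
        // Integer.getInteger reads the system property, falling back to the core count.
        int configured = Integer.getInteger("cassandra.stream_session_threads", cores);
        return Math.max(1, configured); // never allow zero stream threads
    }

    public static void main(String[] args) {
        System.setProperty("cassandra.stream_session_threads", "4");
        System.out.println(maxParallelStreams()); // prints 4
    }
}
```

Operators on large boxes could then lower the cap below the core count to keep GC ahead of the IntervalTree builds described in the ticket.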
[jira] [Commented] (CASSANDRA-10862) LCS repair: compact tables before making available in L0
[ https://issues.apache.org/jira/browse/CASSANDRA-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334128#comment-15334128 ] Paulo Motta commented on CASSANDRA-10862: - Thanks for the patch, and sorry for the delay [~scv...@gmail.com]. Overall I like your approach, because it mitigates the impact without many changes. See some suggestions for improvement below: * I'm a bit uncomfortable with the unbounded busy wait, so we should probably add a time bound to the loop in order to avoid hanging indefinitely if there is a problem with compactions catching up * While the synchronization block on {{CFS}} would guarantee that only one producer adds new sstables at a time, I think this might be a premature optimization that could be a source of problems later, so I'd prefer to take a best-effort approach initially, since we're trying to protect from an abysmal number of sstables, and we don't allow concurrent repairs on the same tables anyway, so there shouldn't be many concurrent {{OnCompletionRunnable}} instances running for the same {{CFS}} * With that said, I think we could have something like {{CompactionManager.waitForL0Leveling(ColumnFamilyStore cfs, int maxSStableCount, long maxWaitTime)}}, similar to the {{waitForCessation}} method but waiting for L0 leveling instead and without taking a {{Callable}} as argument * I think waiting for leveling on validation will probably cause overstreaming during repair, since different replicas will flush at different times, causing digest mismatches, so we should probably avoid that * {{compaction_max_l0_sstable_count}} is quite an advanced lever, so I don't think it should be exposed as a {{cassandra.yaml}} attribute, but instead as a system property (similar to {{cassandra.disable_stcs_in_l0}}). We could also maybe add this as a dynamic JMX attribute to facilitate tuning. 
* We could also probably add another property for the {{max_wait_time}} for the L0 leveling, and maybe even provide a conservative default to both properties on trunk that could already bring some benefits to the average user and still allow more advanced users to tune it according to usage, something like: {{streaming.max_L0_count=1000}} and {{streaming.max_L0_wait_time=1min}}. I'm not really sure about these values, so it would be nice if you have any suggestions based on your tests so far. Anything else to add here or any caveat I might be missing [~krummas]? > LCS repair: compact tables before making available in L0 > > > Key: CASSANDRA-10862 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10862 > Project: Cassandra > Issue Type: Improvement > Components: Compaction, Streaming and Messaging >Reporter: Jeff Ferland >Assignee: Chen Shen > > When doing repair on a system with lots of mismatched ranges, the number of > tables in L0 goes up dramatically, as correspondingly goes the number of > tables referenced for a query. Latency increases dramatically in tandem. > Eventually all the copied tables are compacted down in L0, then copied into > L1 (which may be a very large copy), finally reducing the number of SSTables > per query into the manageable range. > It seems to me that the cleanest answer is to compact after streaming, then > mark tables available rather than marking available when the file itself is > complete. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
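The time-bounded wait suggested above could look roughly like the sketch below. The method and parameter names are illustrative, and `l0Count` stands in for reading the real L0 sstable count from the {{CFS}}:

```java
import java.util.function.IntSupplier;

public class L0Wait {
    // Wait until L0 drops to maxSstableCount, but never longer than maxWaitMillis,
    // so a stalled compactor cannot hang the post-streaming step forever.
    static boolean waitForL0Leveling(IntSupplier l0Count, int maxSstableCount,
                                     long maxWaitMillis) {
        long deadline = System.currentTimeMillis() + maxWaitMillis;
        while (l0Count.getAsInt() > maxSstableCount) {
            if (System.currentTimeMillis() >= deadline)
                return false; // timed out; the caller proceeds anyway
            try {
                Thread.sleep(10); // poll interval; tune to taste
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true; // L0 caught up within the time bound
    }

    public static void main(String[] args) {
        // Pretend compaction drains 500 sstables per poll from an initial 1500.
        int[] l0 = {1500};
        System.out.println(waitForL0Leveling(() -> l0[0] -= 500, 1000, 5_000));
    }
}
```

Returning a boolean (rather than throwing) lets the caller log the timeout and continue, matching the best-effort approach argued for above.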
[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
[ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334109#comment-15334109 ] Paulo Motta commented on CASSANDRA-12015: - {{getAllRangesWithSourcesFor}} is only used for bootstrap when {{cassandra.consistent.rangemovement=false}}; otherwise {{getAllRangesWithStrictSourcesFor}} is used (which tries to stream from sources which will lose ranges to the bootstrapping node). When {{cassandra.consistent.rangemovement=false}} it doesn't really matter which replica you pick, so I guess we're safe to move away from latency-based proximity. This is also used for replace, so I think it can also distribute replace/non-consistent-bootstrap load more evenly in that case, because right now we are prioritizing replicas which have a better dynamic snitch score, which will probably overload them with streaming originating from rebuild/replace/non-consistent-bootstrap. > Rebuilding from another DC should use different sources > --- > > Key: CASSANDRA-12015 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 > Project: Cassandra > Issue Type: Improvement >Reporter: Fabien Rousseau > > Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing > DC (ex: DC1), only the closest replica is used as a "source of data". > It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, > only one node in DC1 is streaming the data to DC2. > To build the new DC in a reasonable time, it would be better, in that case, > to stream from multiple sources, thus distributing the load more evenly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
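Moving away from latency-based proximity could be as simple as picking each range's source from a shuffled copy of its replicas instead of the head of the snitch-sorted list. A self-contained sketch of that idea (not the actual {{RangeStreamer}} code; range and node names are placeholders):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class SourceSpread {
    // Choose one source per range from a shuffled copy of its replicas,
    // spreading streaming load instead of always hitting the "closest" node.
    static Map<String, String> pickSources(Map<String, List<String>> replicasByRange,
                                           Random rng) {
        Map<String, String> chosen = new HashMap<>();
        for (Map.Entry<String, List<String>> e : replicasByRange.entrySet()) {
            List<String> candidates = new ArrayList<>(e.getValue());
            Collections.shuffle(candidates, rng); // replace snitch sort with a shuffle
            chosen.put(e.getKey(), candidates.get(0));
        }
        return chosen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> ranges = new LinkedHashMap<>();
        ranges.put("(0,100]", Arrays.asList("dc1-a", "dc1-b", "dc1-c"));
        ranges.put("(100,200]", Arrays.asList("dc1-a", "dc1-b", "dc1-c"));
        System.out.println(pickSources(ranges, new Random()));
    }
}
```

With many ranges, the shuffle distributes rebuild streaming across all replicas in the source DC rather than funneling it through the node with the best snitch score.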
[jira] [Created] (CASSANDRA-12017) Allow configuration of inter DC compression
Thom Valley created CASSANDRA-12017: --- Summary: Allow configuration of inter DC compression Key: CASSANDRA-12017 URL: https://issues.apache.org/jira/browse/CASSANDRA-12017 Project: Cassandra Issue Type: Improvement Reporter: Thom Valley With larger and more extensively geographically distributed clusters, users are beginning to need the ability to reduce bandwidth consumption as much as possible. With larger workloads, the limits of even large intercontinental data links (55MBps is pretty typical) are beginning to be stretched. InterDC SSL is currently hard coded to use the fastest (not highest) compression settings. LZ4 is a great option, but being able to raise the compression at the cost of some additional CPU may save as much as 10% (perhaps slightly more depending on the data). 10% of a 55MBps link, if running at or near capacity is substantial. This also has a large impact on the overhead and rate possible for instantiating new DCs as well as rebuilding a DC after a failure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
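The ratio-versus-CPU tradeoff described here can be demonstrated with any codec that exposes compression levels. The sketch below uses the JDK's {{Deflater}} purely as an illustration; Cassandra's inter-DC path uses LZ4, whose high-compression mode makes the same trade of extra CPU for a better ratio:

```java
import java.util.zip.Deflater;

public class CompressionTradeoff {
    // Compress data at the given level and return the compressed size in bytes.
    static int compressedSize(byte[] data, int level) {
        Deflater d = new Deflater(level);
        d.setInput(data);
        d.finish();
        byte[] buf = new byte[data.length + 64];
        int total = 0;
        while (!d.finished())
            total += d.deflate(buf); // count output bytes; contents are discarded
        d.end();
        return total;
    }

    public static void main(String[] args) {
        // Repetitive payload standing in for replicated mutation traffic.
        byte[] payload = new byte[64 * 1024];
        for (int i = 0; i < payload.length; i++)
            payload[i] = (byte) "cassandra-replication".charAt(i % 21);
        int fast = compressedSize(payload, Deflater.BEST_SPEED);
        int best = compressedSize(payload, Deflater.BEST_COMPRESSION);
        // The higher level burns more CPU but should not produce larger output here.
        System.out.println(fast + " bytes (fast) vs " + best + " bytes (best)");
    }
}
```

On a near-saturated 55 MBps link, even a few percent of extra ratio from a slower level translates directly into reclaimed bandwidth, which is the lever this ticket asks to expose.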
[jira] [Commented] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334073#comment-15334073 ] Tyler Hobbs commented on CASSANDRA-11873: - This is a good point to take a step back and plan out the semantics of our date and time types more thoroughly. I don't think we need to implement everything up front, but we should think about how we want the various date, time, and interval types to work together. We do not currently support a datetime type with timezones. However, it's certainly possible that this may be added in the future, especially if we focus on timeseries (where you may want rollups by conceptual day instead of 24-hour periods). So, I think we should consider how the types might interact with a timezone-aware datetime. The current {{duration}} type is similar to Java's {{Duration}} and Python's {{timedelta}}. It adds a number of nanoseconds to a datetime, ignoring effects like daylight savings time. On the other hand, we may also want something like Java's {{Period}} class, which works in terms of "conceptual" days, months and years. For example, if you add a conceptual day to a datetime and it happens to cross the daylight savings time boundary, it would end up at the same time of day on the next day (instead of being off by one hour, like the equivalent {{duration}} addition would be). Or, we might combine these into an interval type like Postgres's {{interval}} that stores conceptual months and days, but also stores seconds and nanoseconds. This could work in a pretty straightforward way with our current timestamps (effectively UTC datetimes), but also work well with timezone-aware datetimes when those are added. This type is certainly more complex than the current {{duration}} type, but I think we'll eventually need something like this anyway, and it's good to ask whether we also want to have a naive {{duration}} alongside that type. 
If we introduce special syntax for {{duration}}, that may force future {{interval}} literals to have a more cumbersome syntax. At the very least, the differences between the two may confuse users. To summarize, if we want to plan for the future, it may be best to go ahead and implement a full {{interval}} type now that handles conceptual time units as well as raw seconds/nanoseconds. bq. By consequence, it can be difficult for the driver to handle such a type. I don't think that this should weigh heavily on how we design Cassandra's types. We are already forced to implement custom types in several of the drivers. For example, the python driver has custom classes for {{OrderedMap}}, {{SortedSet}}, {{Time}}, and {{Date}} to handle things like nested collections and nanosecond resolution. These are slightly less friendly for users than types in the standard library, but it's fairly normal for a database driver to need to do this. > Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
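The distinction drawn above between elapsed-time and conceptual-day arithmetic maps directly onto {{java.time}}, where {{Duration}} and {{Period}} diverge exactly at a DST boundary:

```java
import java.time.Duration;
import java.time.Period;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class DurationVsPeriod {
    public static void main(String[] args) {
        ZoneId ny = ZoneId.of("America/New_York");
        // US clocks sprang forward at 02:00 on 2016-03-13, so the following
        // local day is only 23 hours long.
        ZonedDateTime before = ZonedDateTime.of(2016, 3, 12, 9, 0, 0, 0, ny);

        ZonedDateTime plus24h = before.plus(Duration.ofHours(24)); // elapsed time
        ZonedDateTime plus1d  = before.plus(Period.ofDays(1));     // conceptual day

        System.out.println(plus24h.getHour()); // 10 -- shifted by the lost hour
        System.out.println(plus1d.getHour());  // 9  -- same wall-clock time next day
    }
}
```

An {{interval}}-style type as proposed would effectively carry both kinds of components at once, which is why its semantics deserve deciding before the literal syntax is locked in.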
[jira] [Commented] (CASSANDRA-11845) Hanging repair in cassandra 2.2.4
[ https://issues.apache.org/jira/browse/CASSANDRA-11845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334083#comment-15334083 ] vin01 commented on CASSANDRA-11845: --- It never succeeded... I just keep going with "nodetool repair -full -local" to minimize the inconsistency issues. > Hanging repair in cassandra 2.2.4 > - > > Key: CASSANDRA-11845 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11845 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging > Environment: Centos 6 >Reporter: vin01 >Priority: Minor > Attachments: cassandra-2.2.4.error.log > > > So after increasing the streaming_timeout_in_ms value to 3 hours, i was able > to avoid the socketTimeout errors i was getting earlier > (https://issues.apache.org/jira/browse/CASSANDRA-11826), but now the issue > is repair just stays stuck. > current status :- > [2016-05-19 05:52:50,835] Repair session a0e590e1-1d99-11e6-9d63-b717b380ffdd > for range (-3309358208555432808,-3279958773585646585] finished (progress: 54%) > [2016-05-19 05:53:09,446] Repair session a0e590e3-1d99-11e6-9d63-b717b380ffdd > for range (8149151263857514385,8181801084802729407] finished (progress: 55%) > [2016-05-19 05:53:13,808] Repair session a0e5b7f1-1d99-11e6-9d63-b717b380ffdd > for range (3372779397996730299,3381236471688156773] finished (progress: 55%) > [2016-05-19 05:53:27,543] Repair session a0e5b7f3-1d99-11e6-9d63-b717b380ffdd > for range (-4182952858113330342,-4157904914928848809] finished (progress: 55%) > [2016-05-19 05:53:41,128] Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd > for range (6499366179019889198,6523760493740195344] finished (progress: 55%) > And its 10:46:25 now, almost 5 hours since it has been stuck right there. > Earlier i could see repair session going on in system.log but there are no > logs coming in right now, all i get in logs is regular index summary > redistribution logs. 
> Last logs for repair i saw in logs :- > INFO [RepairJobTask:5] 2016-05-19 05:53:41,125 RepairJob.java:152 - [repair > #a0e5df00-1d99-11e6-9d63-b717b380ffdd] TABLE_NAME is fully synced > INFO [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairSession.java:279 - > [repair #a0e5df00-1d99-11e6-9d63-b717b380ffdd] Session completed successfully > INFO [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairRunnable.java:232 - > Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd for range > (6499366179019889198,6523760493740195344] finished > Its an incremental repair, and in "nodetool netstats" output i can see logs > like :- > Repair e3055fb0-1d9d-11e6-9d63-b717b380ffdd > /Node-2 > Receiving 8 files, 1093461 bytes total. Already received 8 files, > 1093461 bytes total > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80872-big-Data.db > 399475/399475 bytes(100%) received from idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80879-big-Data.db > 53809/53809 bytes(100%) received from idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80878-big-Data.db > 89955/89955 bytes(100%) received from idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80881-big-Data.db > 168790/168790 bytes(100%) received from idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80886-big-Data.db > 107785/107785 bytes(100%) received from idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80880-big-Data.db > 52889/52889 bytes(100%) received from idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80884-big-Data.db > 148882/148882 bytes(100%) received from idx:0/Node-2 > > 
/data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80883-big-Data.db > 71876/71876 bytes(100%) received from idx:0/Node-2 > Sending 5 files, 863321 bytes total. Already sent 5 files, 863321 > bytes total > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73168-big-Data.db > 161895/161895 bytes(100%) sent to idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72604-big-Data.db > 399865/399865 bytes(100%) sent to idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73147-big-Data.db > 149066/149066 bytes(100%) sent to idx:0/Node-2 > > /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe
[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
[ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334081#comment-15334081 ] DOAN DuyHai commented on CASSANDRA-12015: - [~pauloricardomg] The problem is that this call to {{snitch.getSortedListByProximity(address, rangeAddresses.get(range))}} is inside the {{RangeStreamer}} class, which is also used for bootstrap (and maybe for other operations too; I did not do a comprehensive check) > Rebuilding from another DC should use different sources > --- > > Key: CASSANDRA-12015 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 > Project: Cassandra > Issue Type: Improvement >Reporter: Fabien Rousseau > > Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing > DC (ex: DC1), only the closest replica is used as a "source of data". > It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, > only one node in DC1 is streaming the data to DC2. > To build the new DC in a reasonable time, it would be better, in that case, > to stream from multiple sources, thus distributing more evenly the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11919) Failure in nodetool decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-11919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334064#comment-15334064 ] vin01 commented on CASSANDRA-11919: --- Replication Factor : 2 for DC1 and 1 for DC2. CREATE KEYSPACE KEYSPACE_NAME WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '2', 'DC2': '1'} AND durable_writes = true; I was able to remove the node with 'removenode' but it left the cluster inconsistent and I had to perform a full repair on all keyspaces to fix that. > Failure in nodetool decommission > > > Key: CASSANDRA-11919 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11919 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging > Environment: Centos 6.6 x86_64, Cassandra 2.2.4 >Reporter: vin01 >Priority: Minor > Fix For: 2.2.x > > > I keep getting an exception while attempting "nodetool decommission". > {code} > ERROR [STREAM-IN-/[NODE_ON_WHICH_DECOMMISSION_RUNNING]] 2016-05-29 > 13:08:39,040 StreamSession.java:524 - [Stream > #b2039080-25c2-11e6-bd92-d71331aaf180] Streaming error occurred > java.lang.IllegalArgumentException: Unknown type 0 > at > org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:96) > ~[apache-cassandra-2.2.4.jar:2.2.4] > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:57) > ~[apache-cassandra-2.2.4.jar:2.2.4] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261) > ~[apache-cassandra-2.2.4.jar:2.2.4] > {code} > Because of these, the decommission process is not succeeding. > Is interrupting the decommission process safe? Seems like I will have to > retry to make it work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12010) UserTypesTest# is failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-12010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334059#comment-15334059 ] Joel Knighton commented on CASSANDRA-12010: --- It looks like the problem here is that on [CASSANDRA-11604] commit, the 3.0 test was simply merged into trunk instead of the trunk test being merged to trunk. Your fix works - on your original [CASSANDRA-11604] trunk branch, you also used the {{beforeAndAfterFlush}} helper in the style of the rest of the tests. Let's update the test in this ticket to use that helper. I don't think we need to rerun CI; this run looked good and an updated test in that style should be identical to the one on [CASSANDRA-11604] for which we have CI results. > UserTypesTest# is failing on trunk > -- > > Key: CASSANDRA-12010 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12010 > Project: Cassandra > Issue Type: Test >Reporter: Alex Petrov >Assignee: Alex Petrov > > Test failure: > http://cassci.datastax.com/job/trunk_utest/1445/testReport/org.apache.cassandra.cql3.validation.entities/UserTypesTest/testAlteringUserTypeNestedWithinNonFrozenMap/ > This was caused by the merge after > [11604|https://issues.apache.org/jira/browse/CASSANDRA-11604] which probably > coincided with some other change, as this failure did not happen during the > [test run on the > branch|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11604-trunk-testall/lastCompletedBuild/testReport/]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11991) On clock skew, paxos may "corrupt" the node clock
[ https://issues.apache.org/jira/browse/CASSANDRA-11991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-11991: - Status: Patch Available (was: Open) For context, the problem is basically the one I described in [my comment|https://issues.apache.org/jira/browse/CASSANDRA-9649?focusedCommentId=14601016&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14601016] on CASSANDRA-9649 and for which I suggested reverting CASSANDRA-7801. Now, I was kind of wrong about reverting CASSANDRA-7801, since, as of CASSANDRA-9649, we were relying on {{ClientState.getTimestamp()}} to give us timestamps that are unique for the running VM, which means we can't blindly revert CASSANDRA-7801. What I think is the simplest solution, however, is to stop relying on that property (of {{ClientState.getTimestamp()}}) for the uniqueness of our ballots, and instead randomize the non-timestamp parts of the ballot for every new ballot. With that, we don't have to revert CASSANDRA-7801; we just have to ensure that if we use the last known proposal timestamp (i.e. if whichever clock generated that timestamp is "in the future"), we don't persist it in the local clock (this in turn means the timestamp might not be unique in the VM for 2 concurrent paxos operations, hence the need to randomize the rest of the UUID). I've pushed a patch for this for 2.1. I'll attach branches for 2.2+ with tests tomorrow (I was waiting on the 2.1 results before doing that), but I don't think the modified code has changed since 2.1, so I'm marking this ready for review in the meantime. 
| [2.1|https://github.com/pcmanus/cassandra/commits/11991-2.1] | [utests|http://cassci.datastax.com/job/pcmanus-11991-2.1-testall/] | [dtests|http://cassci.datastax.com/job/pcmanus-11991-2.1-dtest/] | > On clock skew, paxos may "corrupt" the node clock > - > > Key: CASSANDRA-11991 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11991 > Project: Cassandra > Issue Type: Bug >Reporter: Sylvain Lebresne >Assignee: Sylvain Lebresne > Fix For: 2.1.x, 2.2.x, 3.0.x > > > We made a mistake in CASSANDRA-9649 so that a temporary clock skew on one node > can "corrupt" other nodes' clocks through Paxos. That wasn't intended and we > should fix that. I'll attach a patch later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
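The fix Sylvain outlines — keep the timestamp part of the ballot, but randomize the non-timestamp bits so two concurrent Paxos operations sharing a timestamp still get distinct ballots — can be sketched as below. This is an illustrative reconstruction of the idea, not the committed patch; the class name and method are hypothetical, and the bit layout follows the RFC 4122 version-1 UUID format.

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

public class BallotSketch
{
    // Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100ns units.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    /**
     * Build a version-1 (time-based) UUID whose clock-sequence and node
     * fields are random, so two ballots with the same timestamp still differ.
     */
    public static UUID randomBallot(long unixMicros)
    {
        long timestamp = unixMicros * 10 + UUID_EPOCH_OFFSET; // 100ns units since UUID epoch

        long msb = (timestamp & 0xFFFFFFFFL) << 32        // time_low
                 | ((timestamp >>> 32) & 0xFFFFL) << 16   // time_mid
                 | ((timestamp >>> 48) & 0x0FFFL)         // time_hi
                 | 0x1000L;                               // version 1

        long lsb = ThreadLocalRandom.current().nextLong();
        lsb &= 0x3FFFFFFFFFFFFFFFL; // clear the two variant bits...
        lsb |= 0x8000000000000000L; // ...and set the IETF variant (10xx)

        return new UUID(msb, lsb);
    }

    public static void main(String[] args)
    {
        long micros = System.currentTimeMillis() * 1000;
        UUID a = randomBallot(micros);
        UUID b = randomBallot(micros); // same timestamp, different ballot
        System.out.println(a.version() + " " + (a.timestamp() == b.timestamp()) + " " + !a.equals(b));
    }
}
```

Because the timestamp half stays intact, ballot ordering by time is preserved; only uniqueness moves from the clock to the random bits.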
[jira] [Commented] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334052#comment-15334052 ] Brian Hess commented on CASSANDRA-11873: - I will save the discussion/debate on the relationship between CQL and SQL for another venue. The reason to bring it up is in the context of user/developer experience and usability. If SQL has an approach then we should consider it, but if we can do better then by all means we should do that instead (which I think nobody is debating). A few comments:
1. We should certainly consider month and year durations. These are common uses and we should at least sketch out how we would support them (if not also implement them in this ticket - which I think we should do).
2. How would we abbreviate the example that Sylvain proposes, "1 year 2 months 3 days 4 hours 5 minutes 6 seconds"? Specifically, what are the abbreviations for months and minutes? ISO 8601 has M for both, but the P/T format allows for disambiguation.
3. With respect to ISO 8601, which Postgres also supports: if someone reads the CQL documentation on date formats for the timestamp type, he will find that it states "A timestamp type can be entered as an integer for CQL input, or as a string literal in any of the following ISO 8601 formats" (https://docs.datastax.com/en/cql/3.3/cql/cql_reference/timestamp_type_r.html). So, C* already chose ISO 8601 for date formats. For consistency with CQL itself we should seriously consider making the same choice for durations.
4. According to the C* documentation, the TIMESTAMP data type, which is what is returned from the now() call, is the "number of milliseconds since the standard base time known as the epoch". How are we going to support microseconds and nanoseconds? Even version 1 UUIDs (the UUID/TimeUUID format for C*) don't support nanosecond resolution.
5. If we choose to stick with the current bespoke syntax, I suggest moving at least to the Influx format. That leaves 2 items: a) change microseconds from "us" to "u", which is what Influx uses, and b) support weeks with the "w" abbreviation.
> Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
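The month/minute ambiguity Brian raises — ISO 8601 uses "M" for both, disambiguated only by whether it appears before or after the "T" separator — can be seen directly with the JDK's standard parsers (shown here with java.time, which is not the proposed Cassandra type, just an existing ISO 8601 implementation for comparison):

```java
import java.time.Duration;
import java.time.Period;

public class Iso8601DurationDemo
{
    public static void main(String[] args)
    {
        // Date-based part: 'M' before the 'T' separator means months.
        Period p = Period.parse("P1Y2M3D");

        // Time-based part: 'M' after 'T' means minutes.
        Duration d = Duration.parse("PT4H5M6S");

        System.out.println(p.getYears() + "y " + p.getMonths() + "mo " + p.getDays() + "d");
        System.out.println(d.getSeconds()); // 4*3600 + 5*60 + 6 = 14706 seconds
    }
}
```

Note that the JDK itself splits the two halves into separate types (Period for calendar units, Duration for exact time), which is one answer to how month/year durations can coexist with microsecond-precision ones.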
[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
[ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334041#comment-15334041 ] Paulo Motta commented on CASSANDRA-12015: - while picking replicas from the same DC/rack is definitely useful, I'm not sure sorting replicas by dynamic snitch within the same rack/dc will buy us many benefits here for bulk operation like streaming. A simple fix here would be to use the current AbstractEndpointSnitch.sortByProximity instead, that will only sort replicas by rack/dc, which should pick primary replicas for each range and that should already yield a reasonable load distribution. > Rebuilding from another DC should use different sources > --- > > Key: CASSANDRA-12015 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 > Project: Cassandra > Issue Type: Improvement >Reporter: Fabien Rousseau > > Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing > DC (ex: DC1), only the closest replica is used as a "source of data". > It works but is not optimal, because in case of an RF=3 and 3 nodes cluster, > only one node in DC1 is streaming the data to DC2. > To build the new DC in a reasonable time, it would be better, in that case, > to stream from multiple sources, thus distributing more evenly the load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
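The rack/DC-only ordering Paulo suggests (as opposed to dynamic-snitch latency ordering) can be sketched as a plain comparator over endpoint locations. This is an illustrative stand-in for what {{AbstractEndpointSnitch.sortByProximity}} does, using a hypothetical {{Location}} record rather than the real snitch API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ProximitySketch
{
    // Hypothetical stand-in for what a snitch knows about an endpoint.
    record Location(String name, String dc, String rack) {}

    /** Rank candidates: same DC and rack first, then same DC, then everything else. */
    static List<Location> sortByProximity(Location self, List<Location> candidates)
    {
        Comparator<Location> byProximity = Comparator.comparingInt(l -> {
            if (l.dc().equals(self.dc()))
                return l.rack().equals(self.rack()) ? 0 : 1;
            return 2;
        });
        List<Location> sorted = new ArrayList<>(candidates);
        sorted.sort(byProximity);
        return sorted;
    }

    public static void main(String[] args)
    {
        Location self = new Location("new-node", "DC2", "r1");
        List<Location> sorted = sortByProximity(self, List.of(
                new Location("a", "DC1", "r1"),
                new Location("b", "DC2", "r2"),
                new Location("c", "DC2", "r1")));
        System.out.println(sorted.get(0).name()); // nearest candidate by rack/DC
    }
}
```

Because this ordering is deterministic per coordinating node (no latency feedback), different rebuilding nodes naturally pick different primary-replica sources, which is the load-distribution property the comment relies on.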
[jira] [Commented] (CASSANDRA-12002) SSTable tools mishandling LocalPartitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334010#comment-15334010 ] Chris Lohfink commented on CASSANDRA-12002: --- Was reported on the user list when someone tried to look at sstables of system.batches. +1 to the changes. I can add some unit tests. > SSTable tools mishandling LocalPartitioner > -- > > Key: CASSANDRA-12002 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12002 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Minor > Attachments: CASSADNRA-12002.txt > > > The sstabledump and sstablemetadata tools use FBUtilities.newPartitioner > with the name of the partitioner from the validation component. This fails on > sstables that are created with things that use the LocalPartitioner > (secondary indexes, and the system.batches table). The sstabledump tool had a > check for secondary indexes but still failed for the system table; the metadata > tool was failing for all of them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-11960) Hints are not seekable
[ https://issues.apache.org/jira/browse/CASSANDRA-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-11960: --- Status: Awaiting Feedback (was: In Progress) > Hints are not seekable > -- > > Key: CASSANDRA-11960 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11960 > Project: Cassandra > Issue Type: Bug >Reporter: Robert Stupp >Assignee: Stefan Podkowinski > > Got the following error message on trunk. No idea how to reproduce. But the > only thing the (not overridden) seek method does is throwing this exception. > {code} > ERROR [HintsDispatcher:2] 2016-06-05 18:51:09,397 CassandraDaemon.java:222 - > Exception in thread Thread[HintsDispatcher:2,1,main] > java.lang.UnsupportedOperationException: Hints are not seekable. > at org.apache.cassandra.hints.HintsReader.seek(HintsReader.java:114) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.seek(HintsDispatcher.java:79) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:257) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_91] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_91] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_91] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_91] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11960) Hints are not seekable
[ https://issues.apache.org/jira/browse/CASSANDRA-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333995#comment-15333995 ] Stefan Podkowinski commented on CASSANDRA-11960: I've now created a patch that moves away from file-offset-based retries and instead replays the whole page. As described above, the {{RebufferingInputStream}} data input doesn't provide a way to seek to an offset. Although this should be possible to implement, I think these changes should be considered more carefully, as they have to be done in the common io.utils code. Maybe we should open a different ticket for that? Although replaying a complete page isn't optimal, as we'll deliver duplicate hints, we don't guarantee at-most-once semantics for hints anyway. This is not so great for non-idempotent operations, such as list appends (counters are not hinted), but the current implementation is clearly broken so we have to do something about it. But I'm open to ideas on how to further optimize this. > Hints are not seekable > -- > > Key: CASSANDRA-11960 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11960 > Project: Cassandra > Issue Type: Bug >Reporter: Robert Stupp >Assignee: Stefan Podkowinski > > Got the following error message on trunk. No idea how to reproduce. But the > only thing the (not overridden) seek method does is throwing this exception. > {code} > ERROR [HintsDispatcher:2] 2016-06-05 18:51:09,397 CassandraDaemon.java:222 - > Exception in thread Thread[HintsDispatcher:2,1,main] > java.lang.UnsupportedOperationException: Hints are not seekable. 
> at org.apache.cassandra.hints.HintsReader.seek(HintsReader.java:114) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatcher.seek(HintsDispatcher.java:79) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:257) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) > ~[main/:na] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_91] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_91] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_91] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_91] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91] > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-12016) Create MessagingService mocking classes
[ https://issues.apache.org/jira/browse/CASSANDRA-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski updated CASSANDRA-12016: --- Status: Awaiting Feedback (was: In Progress) > Create MessagingService mocking classes > --- > > Key: CASSANDRA-12016 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12016 > Project: Cassandra > Issue Type: New Feature > Components: Testing >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski > > Interactions between clients and nodes in the cluster are taking place by > exchanging messages through the {{MessagingService}}. Black box testing for > message based systems is usually pretty easy, as we're just dealing with > messages in/out. My suggestion would be to add tests that make use of this > fact by mocking message exchanges via MessagingService. Given the right use > case, this would turn out to be a much simpler and more efficient alternative > for dtests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-12016) Create MessagingService mocking classes
[ https://issues.apache.org/jira/browse/CASSANDRA-12016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333973#comment-15333973 ] Stefan Podkowinski commented on CASSANDRA-12016: Please find the suggested implementation in the linked WIP branch. An example of how a unit test using those classes looks can be found [here|https://github.com/spodkowinski/cassandra/blob/3cd4ef203cd147713a6f8c4b1466703436124e0b/test/unit/org/apache/cassandra/hints/HintsServiceTest.java]. I'm looking forward to any feedback. > Create MessagingService mocking classes > --- > > Key: CASSANDRA-12016 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12016 > Project: Cassandra > Issue Type: New Feature > Components: Testing >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski > > Interactions between clients and nodes in the cluster are taking place by > exchanging messages through the {{MessagingService}}. Black box testing for > message based systems is usually pretty easy, as we're just dealing with > messages in/out. My suggestion would be to add tests that make use of this > fact by mocking message exchanges via MessagingService. Given the right use > case, this would turn out to be a much simpler and more efficient alternative > for dtests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
cassandra git commit: Add cross-DC latency metrics
Repository: cassandra Updated Branches: refs/heads/trunk e31e21623 -> 04afa2bf5 Add cross-DC latency metrics Patch by Chris Lohfink, reviewed by Carl Yeksigian for CASSANDRA-11596 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/04afa2bf Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/04afa2bf Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/04afa2bf Branch: refs/heads/trunk Commit: 04afa2bf52ce6f5a534323678defd625dca67336 Parents: e31e216 Author: Chris Lohfink Authored: Wed May 18 16:00:04 2016 -0500 Committer: Carl Yeksigian Committed: Thu Jun 16 10:49:24 2016 -0400 -- CHANGES.txt | 1 + .../cassandra/metrics/MessagingMetrics.java | 59 +++ .../cassandra/net/IncomingTcpConnection.java| 2 +- .../org/apache/cassandra/net/MessageIn.java | 9 ++- .../apache/cassandra/net/MessagingService.java | 3 + .../cassandra/net/MessagingServiceTest.java | 62 +++- 6 files changed, 131 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/04afa2bf/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9c44a63..08b5e4a 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.8 + * Add cross-DC latency metrics (CASSANDRA-11596) * Allow terms in selection clause (CASSANDRA-10783) * Add bind variables to trace (CASSANDRA-11719) * Switch counter shards' clock to timestamps (CASSANDRA-9811) http://git-wip-us.apache.org/repos/asf/cassandra/blob/04afa2bf/src/java/org/apache/cassandra/metrics/MessagingMetrics.java -- diff --git a/src/java/org/apache/cassandra/metrics/MessagingMetrics.java b/src/java/org/apache/cassandra/metrics/MessagingMetrics.java new file mode 100644 index 000..e126c93 --- /dev/null +++ b/src/java/org/apache/cassandra/metrics/MessagingMetrics.java @@ -0,0 +1,59 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.metrics;
+
+import java.net.InetAddress;
+import java.util.concurrent.ConcurrentHashMap;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.codahale.metrics.Timer;
+
+import static org.apache.cassandra.metrics.CassandraMetricsRegistry.Metrics;
+
+/**
+ * Metrics for messages
+ */
+public class MessagingMetrics
+{
+    private static Logger logger = LoggerFactory.getLogger(MessagingMetrics.class);
+    private static final MetricNameFactory factory = new DefaultNameFactory("Messaging");
+    public final Timer crossNodeLatency;
+    public final ConcurrentHashMap<String, Timer> dcLatency;
+
+    public MessagingMetrics()
+    {
+        crossNodeLatency = Metrics.timer(factory.createMetricName("CrossNodeLatency"));
+        dcLatency = new ConcurrentHashMap<>();
+    }
+
+    public void addTimeTaken(InetAddress from, long timeTaken)
+    {
+        String dc = DatabaseDescriptor.getEndpointSnitch().getDatacenter(from);
+        Timer timer = dcLatency.get(dc);
+        if (timer == null)
+        {
+            timer = dcLatency.computeIfAbsent(dc, k -> Metrics.timer(factory.createMetricName(dc + "-Latency")));
+        }
+        timer.update(timeTaken, TimeUnit.MILLISECONDS);
+        crossNodeLatency.update(timeTaken, TimeUnit.MILLISECONDS);
+    }
+}
http://git-wip-us.apache.org/repos/asf/cassandra/blob/04afa2bf/src/java/org/apache/cassandra/net/IncomingTcpConnection.java -- diff --git a/src/java/org/apache/cassandra/net/IncomingTcpConnection.java b/src/java/org/apache/cassandra/net/IncomingTcpConnection.java index 2a09bf4..9e8e2e1 100644 --- a/src/java/org/apache/cassandra/net/IncomingTcpConnection.java +++ b/src/java/org/apache/cassandra/net/IncomingTcpConnection.java @@ -187,7 +187,7 @@ public class IncomingTc
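The per-DC timer lookup in the committed {{MessagingMetrics.addTimeTaken}} above does a {{get()}} followed by {{computeIfAbsent()}}; functionally the {{computeIfAbsent()}} call alone suffices, since it atomically returns the existing value when one is present (the extra {{get()}} just skips lambda allocation on the hot path). A minimal standalone sketch of that single-call pattern, using a plain {{LongAdder}} in place of the codahale {{Timer}}:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class DcLatencySketch
{
    private final ConcurrentHashMap<String, LongAdder> dcLatency = new ConcurrentHashMap<>();

    /** Record a latency sample for the given datacenter. */
    void record(String dc, long micros)
    {
        // computeIfAbsent is atomic: it returns the existing adder or creates one.
        dcLatency.computeIfAbsent(dc, k -> new LongAdder()).add(micros);
    }

    /** Total recorded latency for a datacenter (0 if none recorded). */
    long total(String dc)
    {
        LongAdder adder = dcLatency.get(dc);
        return adder == null ? 0 : adder.sum();
    }

    public static void main(String[] args)
    {
        DcLatencySketch m = new DcLatencySketch();
        m.record("DC1", 100);
        m.record("DC1", 50);
        m.record("DC2", 7);
        System.out.println(m.total("DC1")); // 150
    }
}
```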
[jira] [Updated] (CASSANDRA-11569) Track message latency across DCs
[ https://issues.apache.org/jira/browse/CASSANDRA-11569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Yeksigian updated CASSANDRA-11569: --- Resolution: Fixed Fix Version/s: 3.8 Status: Resolved (was: Patch Available) +1. Thanks, [~cnlwsu]! Committed as [04afa2b|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=04afa2bf52ce6f5a534323678defd625dca67336]. > Track message latency across DCs > > > Key: CASSANDRA-11569 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11569 > Project: Cassandra > Issue Type: Improvement > Components: Observability >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Minor > Fix For: 3.8 > > Attachments: CASSANDRA-11569.patch, CASSANDRA-11569v2.txt, > nodeLatency.PNG > > > Since we have the timestamp a message is created and when it arrives, we can get > an approximate time it took relatively easily, and this would remove the necessity for > more complex hacks to determine latency between DCs. > Although it is not going to be very meaningful when NTP is not set up, it is > pretty common to have NTP set up, and even with clock drift nothing is really > hurt except the metric becoming whacky. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8119) More Expressive Consistency Levels
[ https://issues.apache.org/jira/browse/CASSANDRA-8119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333886#comment-15333886 ] Randy Fradin commented on CASSANDRA-8119: - I think taking a map of DC -> level pushes too much complexity to the client, and also lacks some of the flexibility we're looking for. For example, one use case is to do a QUORUM operation across nodes in a subset of data centers (where the data centers involved depend on where the coordinator is located). Another use case is to do "uneven" quorums, e.g. hit somewhere between (n/2)+1 and n-1 replicas on write and (n+1)/2 and 2 on read, or vice-versa (which is useful when the distance between replicas is not uniform and the number of replicas may not be an odd number). The interface Tyler describes allows for that level of flexibility. Putting it in CQL makes it simple for operators to define, deploy, and view the custom CLs. A custom "strategy" class approach provides a similar level of flexibility but would be more cumbersome for operators. > More Expressive Consistency Levels > -- > > Key: CASSANDRA-8119 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8119 > Project: Cassandra > Issue Type: New Feature > Components: CQL >Reporter: Tyler Hobbs > Fix For: 3.x > > > For some multi-datacenter environments, the current set of consistency levels > are too restrictive. For example, the following consistency requirements > cannot be expressed: > * LOCAL_QUORUM in two specific DCs > * LOCAL_QUORUM in the local DC plus LOCAL_QUORUM in at least one other DC > * LOCAL_QUORUM in the local DC plus N remote replicas in any DC > I propose that we add a new consistency level: CUSTOM. In the v4 (or v5) > protocol, this would be accompanied by an additional map argument. A map of > {DC: CL} or a map of {DC: int} is sufficient to cover the first example. If > we accept a special key to represent "any datacenter", the second case can > be handled. 
A similar technique could be used for "any other nodes". > I'm not in love with the special keys, so if anybody has ideas for something > more elegant, feel free to propose them. The main idea is that we want to be > flexible enough to cover any reasonable consistency or durability > requirements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
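The {DC: int} map idea from the proposal reduces, on the coordinator side, to a simple satisfaction check over per-DC acknowledgment counts. The sketch below is purely illustrative of that check — the class, method, and map shapes are hypothetical, not part of any Cassandra API:

```java
import java.util.HashMap;
import java.util.Map;

public class CustomConsistencySketch
{
    /**
     * Hypothetical check for a {DC -> required acks} consistency map:
     * satisfied once every listed DC has at least its required ack count.
     * DCs not listed in the map impose no requirement.
     */
    static boolean isSatisfied(Map<String, Integer> required, Map<String, Integer> acksByDc)
    {
        for (Map.Entry<String, Integer> e : required.entrySet())
            if (acksByDc.getOrDefault(e.getKey(), 0) < e.getValue())
                return false;
        return true;
    }

    public static void main(String[] args)
    {
        // e.g. "quorum (2 of RF=3) in two specific DCs"
        Map<String, Integer> required = Map.of("DC1", 2, "DC2", 2);

        Map<String, Integer> acks = new HashMap<>();
        acks.put("DC1", 2);
        acks.put("DC2", 1);
        System.out.println(isSatisfied(required, acks)); // DC2 still needs one ack

        acks.put("DC2", 2);
        System.out.println(isSatisfied(required, acks)); // now satisfied
    }
}
```

Cases like "LOCAL_QUORUM plus quorum in at least one other DC" need more than a per-DC conjunction — which is why the comment thread discusses special keys ("any datacenter") or a pluggable strategy class instead of a plain map.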
[jira] [Created] (CASSANDRA-12016) Create MessagingService mocking classes
Stefan Podkowinski created CASSANDRA-12016: -- Summary: Create MessagingService mocking classes Key: CASSANDRA-12016 URL: https://issues.apache.org/jira/browse/CASSANDRA-12016 Project: Cassandra Issue Type: New Feature Components: Testing Reporter: Stefan Podkowinski Assignee: Stefan Podkowinski Interactions between clients and nodes in the cluster are taking place by exchanging messages through the {{MessagingService}}. Black box testing for message based systems is usually pretty easy, as we're just dealing with messages in/out. My suggestion would be to add tests that make use of this fact by mocking message exchanges via MessagingService. Given the right use case, this would turn out to be a much simpler and more efficient alternative for dtests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8844) Change Data Capture (CDC)
[ https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333832#comment-15333832 ] Joshua McKenzie edited comment on CASSANDRA-8844 at 6/16/16 2:03 PM: - Switching between C# and Java every day has its costs. Fixed that, tidied up NEWS.txt (spacing and ordering on Upgrading and Deprecation), and [committed|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=e31e216234c6b57a531cae607e0355666007deb2]. Thanks for the assist [~carlyeks] and [~blambov]! I'll be creating a follow-up meta ticket w/subtasks from all the stuff that came up here that we deferred and link that to this ticket, as well as moving the link to CASSANDRA-11957 over there. was (Author: joshuamckenzie): Switching between C# and Java every day has its costs. Fixed that, tidied up NEWS.txt (spacing and ordering on Upgrading and Deprecation), and [committed|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=5dcab286ca0fcd9a71e28dad805f028362572e21]. Thanks for the assist [~carlyeks] and [~blambov]! I'll be creating a follow-up meta ticket w/subtasks from all the stuff that came up here that we deferred and link that to this ticket, as well as moving the link to CASSANDRA-11957 over there. > Change Data Capture (CDC) > - > > Key: CASSANDRA-8844 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8844 > Project: Cassandra > Issue Type: New Feature > Components: Coordination, Local Write-Read Paths >Reporter: Tupshin Harper >Assignee: Joshua McKenzie >Priority: Critical > Fix For: 3.8 > > > "In databases, change data capture (CDC) is a set of software design patterns > used to determine (and track) the data that has changed so that action can be > taken using the changed data. Also, Change data capture (CDC) is an approach > to data integration that is based on the identification, capture and delivery > of the changes made to enterprise data sources." 
> -Wikipedia > As Cassandra is increasingly being used as the Source of Record (SoR) for > mission critical data in large enterprises, it is increasingly being called > upon to act as the central hub of traffic and data flow to other systems. In > order to try to address the general need, we (cc [~brianmhess]), propose > implementing a simple data logging mechanism to enable per-table CDC patterns. > h2. The goals: > # Use CQL as the primary ingestion mechanism, in order to leverage its > Consistency Level semantics, and in order to treat it as the single > reliable/durable SoR for the data. > # To provide a mechanism for implementing good and reliable > (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) > continuous semi-realtime feeds of mutations going into a Cassandra cluster. > # To eliminate the developmental and operational burden of users so that they > don't have to do dual writes to other systems. > # For users that are currently doing batch export from a Cassandra system, > give them the opportunity to make that realtime with a minimum of coding. > h2. The mechanism: > We propose a durable logging mechanism that functions similar to a commitlog, > with the following nuances: > - Takes place on every node, not just the coordinator, so RF number of copies > are logged. > - Separate log per table. > - Per-table configuration. Only tables that are specified as CDC_LOG would do > any logging. > - Per DC. We are trying to keep the complexity to a minimum to make this an > easy enhancement, but most likely use cases would prefer to only implement > CDC logging in one (or a subset) of the DCs that are being replicated to > - In the critical path of ConsistencyLevel acknowledgment. Just as with the > commitlog, failure to write to the CDC log should fail that node's write. If > that means the requested consistency level was not met, then clients *should* > experience UnavailableExceptions. 
> - Be written in a Row-centric manner such that it is easy for consumers to > reconstitute rows atomically. > - Written in a simple format designed to be consumed *directly* by daemons > written in non-JVM languages > h2. Nice-to-haves > I strongly suspect that the following features will be asked for, but I also > believe that they can be deferred for a subsequent release, and to gauge > actual interest. > - Multiple logs per table. This would make it easy to have multiple > "subscribers" to a single table's changes. A workaround would be to create a > forking daemon listener, but that's not a great answer. > - Log filtering. Being able to apply filters, including UDF-based filters > would make Cassandra a much more versatile feeder into other systems, and > again, reduce complexity that would otherwise need to be built into the daemons.
[5/5] cassandra git commit: Add Change Data Capture
Add Change Data Capture

Patch by jmckenzie; reviewed by cyeksigian and blambov for CASSANDRA-8844

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e31e2162
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e31e2162
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e31e2162

Branch: refs/heads/trunk
Commit: e31e216234c6b57a531cae607e0355666007deb2
Parents: ed538f9
Author: Josh McKenzie
Authored: Sun Mar 27 09:20:47 2016 -0400
Committer: Josh McKenzie
Committed: Thu Jun 16 10:01:39 2016 -0400
----------------------------------------------------------------------
 CHANGES.txt                                     | 1 +
 NEWS.txt                                        | 32 +-
 build.xml                                       | 51 +-
 conf/cassandra.yaml                             | 26 +
 pylib/cqlshlib/cql3handling.py                  | 5 +-
 src/antlr/Parser.g                              | 3 +-
 .../org/apache/cassandra/config/Config.java     | 6 +
 .../cassandra/config/DatabaseDescriptor.java    | 86 ++-
 .../statements/CreateKeyspaceStatement.java     | 1 +
 .../cql3/statements/DropKeyspaceStatement.java  | 2 +-
 .../cql3/statements/TableAttributes.java        | 3 +
 .../apache/cassandra/db/ColumnFamilyStore.java  | 67 +--
 .../org/apache/cassandra/db/Directories.java    | 51 +-
 src/java/org/apache/cassandra/db/Keyspace.java  | 17 +-
 src/java/org/apache/cassandra/db/Memtable.java  | 47 +-
 src/java/org/apache/cassandra/db/Mutation.java  | 18 +
 .../org/apache/cassandra/db/SystemKeyspace.java | 26 +-
 src/java/org/apache/cassandra/db/WriteType.java | 3 +-
 .../AbstractCommitLogSegmentManager.java        | 584 +++
 .../db/commitlog/AbstractCommitLogService.java  | 3 +-
 .../cassandra/db/commitlog/CommitLog.java       | 157 +++--
 .../db/commitlog/CommitLogPosition.java         | 121
 .../db/commitlog/CommitLogReadHandler.java      | 76 +++
 .../cassandra/db/commitlog/CommitLogReader.java | 501
 .../db/commitlog/CommitLogReplayer.java         | 582 +++---
 .../db/commitlog/CommitLogSegment.java          | 110 ++--
 .../db/commitlog/CommitLogSegmentManager.java   | 567 --
 .../commitlog/CommitLogSegmentManagerCDC.java   | 302 ++
 .../CommitLogSegmentManagerStandard.java        | 89 +++
 .../db/commitlog/CommitLogSegmentReader.java    | 366
 .../db/commitlog/CompressedSegment.java         | 12 +-
 .../db/commitlog/EncryptedSegment.java          | 18 +-
 .../db/commitlog/FileDirectSegment.java         | 73 +--
 .../db/commitlog/MemoryMappedSegment.java       | 6 +-
 .../cassandra/db/commitlog/ReplayPosition.java  | 178 --
 .../cassandra/db/commitlog/SegmentReader.java   | 355 ---
 .../db/commitlog/SimpleCachedBufferPool.java    | 118
 .../apache/cassandra/db/lifecycle/Tracker.java  | 8 +-
 .../apache/cassandra/db/view/TableViews.java    | 4 +-
 .../apache/cassandra/db/view/ViewManager.java   | 2 -
 .../io/sstable/format/SSTableReader.java        | 1 -
 .../metadata/LegacyMetadataSerializer.java      | 12 +-
 .../io/sstable/metadata/MetadataCollector.java  | 16 +-
 .../io/sstable/metadata/StatsMetadata.java      | 24 +-
 .../cassandra/metrics/CommitLogMetrics.java     | 9 +-
 .../apache/cassandra/schema/SchemaKeyspace.java | 6 +-
 .../apache/cassandra/schema/TableParams.java    | 23 +-
 .../cassandra/service/CassandraDaemon.java      | 4 +-
 .../cassandra/streaming/StreamReceiveTask.java  | 36 +-
 .../utils/DirectorySizeCalculator.java          | 98
 .../cassandra/utils/JVMStabilityInspector.java  | 3 +-
 .../cassandra/utils/memory/BufferPool.java      | 2 +-
 test/conf/cassandra-murmur.yaml                 | 2 +
 test/conf/cassandra.yaml                        | 2 +
 test/conf/cdc.yaml                              | 1 +
 test/data/bloom-filter/ka/foo.cql               | 2 +-
 .../db/commitlog/CommitLogStressTest.java       | 123 ++--
 .../test/microbench/DirectorySizerBench.java    | 105
 .../OffsetAwareConfigurationLoader.java         | 13 +-
 .../cassandra/batchlog/BatchlogManagerTest.java | 4 +-
 .../apache/cassandra/cql3/CDCStatementTest.java | 50 ++
 .../org/apache/cassandra/cql3/CQLTester.java    | 4 +
 .../apache/cassandra/cql3/OutOfSpaceTest.java   | 2 +-
 .../cql3/validation/operations/CreateTest.java  | 5 +-
 .../apache/cassandra/db/ReadMessageTest.java    | 10 +-
 .../db/commitlog/CommitLogReaderTest.java       | 267 +
 .../CommitLogSegmentManagerCDCTest.java         | 220 +++
 .../commitlog/CommitLogSegmentManagerTest.java  | 23 +-
 .../cassandra/db/commitlog/CommitLogTest.java   | 130 +++--
 .../db/commitlog/CommitLogTestReplayer.java     | 59 +-
 .../db/commitlog/CommitLogUpgradeTest.java      | 18 +-
 .../db/commitlog/CommitLogUpgradeTestMaker.java | 4 +-
 .../db/commitlog/SegmentReaderTest
[2/5] cassandra git commit: Add Change Data Capture
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e31e2162/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java b/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
deleted file mode 100644
index 17980de..000
--- a/src/java/org/apache/cassandra/db/commitlog/SegmentReader.java
+++ /dev/null
@@ -1,355 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements. See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership. The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License. You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.cassandra.db.commitlog;
-
-import java.io.IOException;
-import java.nio.ByteBuffer;
-import java.util.Iterator;
-import java.util.zip.CRC32;
-import javax.crypto.Cipher;
-
-import com.google.common.annotations.VisibleForTesting;
-import com.google.common.collect.AbstractIterator;
-
-import org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream.ChunkProvider;
-import org.apache.cassandra.io.FSReadError;
-import org.apache.cassandra.io.compress.ICompressor;
-import org.apache.cassandra.io.util.FileDataInput;
-import org.apache.cassandra.io.util.FileSegmentInputStream;
-import org.apache.cassandra.io.util.RandomAccessReader;
-import org.apache.cassandra.schema.CompressionParams;
-import org.apache.cassandra.security.EncryptionUtils;
-import org.apache.cassandra.security.EncryptionContext;
-import org.apache.cassandra.utils.ByteBufferUtil;
-
-import static org.apache.cassandra.db.commitlog.CommitLogSegment.SYNC_MARKER_SIZE;
-import static org.apache.cassandra.utils.FBUtilities.updateChecksumInt;
-
-/**
- * Read each sync section of a commit log, iteratively.
- */
-public class SegmentReader implements Iterable<SyncSegment>
-{
-    private final CommitLogDescriptor descriptor;
-    private final RandomAccessReader reader;
-    private final Segmenter segmenter;
-    private final boolean tolerateTruncation;
-
-    /**
-     * ending position of the current sync section.
-     */
-    protected int end;
-
-    protected SegmentReader(CommitLogDescriptor descriptor, RandomAccessReader reader, boolean tolerateTruncation)
-    {
-        this.descriptor = descriptor;
-        this.reader = reader;
-        this.tolerateTruncation = tolerateTruncation;
-
-        end = (int) reader.getFilePointer();
-        if (descriptor.getEncryptionContext().isEnabled())
-            segmenter = new EncryptedSegmenter(reader, descriptor);
-        else if (descriptor.compression != null)
-            segmenter = new CompressedSegmenter(descriptor, reader);
-        else
-            segmenter = new NoOpSegmenter(reader);
-    }
-
-    public Iterator<SyncSegment> iterator()
-    {
-        return new SegmentIterator();
-    }
-
-    protected class SegmentIterator extends AbstractIterator<SyncSegment>
-    {
-        protected SyncSegment computeNext()
-        {
-            while (true)
-            {
-                try
-                {
-                    final int currentStart = end;
-                    end = readSyncMarker(descriptor, currentStart, reader);
-                    if (end == -1)
-                    {
-                        return endOfData();
-                    }
-                    if (end > reader.length())
-                    {
-                        // the CRC was good (meaning it was good when it was written and still looks legit), but the file is truncated now.
-                        // try to grab and use as much of the file as possible, which might be nothing if the end of the file truly is corrupt
-                        end = (int) reader.length();
-                    }
-
-                    return segmenter.nextSegment(currentStart + SYNC_MARKER_SIZE, end);
-                }
-                catch(SegmentReader.SegmentReadException e)
-                {
-                    try
-                    {
-                        CommitLogReplayer.handleReplayError(!e.invalidCrc && tolerateTruncation, e.getMessage());
-                    }
-                    catch (IOException ioe)
-                    {
-                        throw new RuntimeException(ioe);
-                    }
-                }
-                catch (IOException e)
-                {
-                    try
-
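The SegmentReader deleted above (its logic moves to CommitLogSegmentReader in this commit) walks a segment one CRC-checked sync section at a time, stopping at the first bad marker and salvaging what it can from a truncated tail. A self-contained sketch of that read pattern follows; the `[length][crc][payload]` framing and all names here are invented for illustration and are not Cassandra's actual on-disk layout:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Toy model of the sync-section read loop: each section is
// [payload length int][CRC32 of payload int][payload bytes].
public class SyncSectionSketch
{
    public static ByteBuffer writeSection(byte[] payload)
    {
        CRC32 crc = new CRC32();
        crc.update(payload);
        ByteBuffer buf = ByteBuffer.allocate(8 + payload.length);
        buf.putInt(payload.length);
        buf.putInt((int) crc.getValue());
        buf.put(payload);
        buf.flip();
        return buf;
    }

    // Iterate sections, stopping at a bad CRC or a truncated tail
    // (analogous to SegmentIterator.computeNext returning endOfData()).
    public static List<byte[]> readAll(ByteBuffer buf)
    {
        List<byte[]> sections = new ArrayList<>();
        while (buf.remaining() >= 8)
        {
            int len = buf.getInt();
            int expected = buf.getInt();
            if (buf.remaining() < len)
                break;                      // truncated tail: keep what was already read
            byte[] payload = new byte[len];
            buf.get(payload);
            CRC32 crc = new CRC32();
            crc.update(payload);
            if ((int) crc.getValue() != expected)
                break;                      // invalid CRC: stop replay here
            sections.add(payload);
        }
        return sections;
    }

    public static void main(String[] args)
    {
        ByteBuffer a = writeSection("mutation-1".getBytes());
        ByteBuffer b = writeSection("mutation-2".getBytes());
        ByteBuffer all = ByteBuffer.allocate(a.remaining() + b.remaining()).put(a).put(b);
        all.flip();
        System.out.println(readAll(all).size()); // prints 2
    }
}
```

The real reader additionally routes each section through a compression- or encryption-aware Segmenter; the CRC-then-salvage control flow is the part sketched here.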
[4/5] cassandra git commit: Add Change Data Capture
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e31e2162/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
index 4a660ca..b1f48b2 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
@@ -22,34 +22,35 @@ import java.lang.management.ManagementFactory;
 import java.nio.ByteBuffer;
 import java.util.*;
 import java.util.zip.CRC32;
-
 import javax.management.MBeanServer;
 import javax.management.ObjectName;
 
 import com.google.common.annotations.VisibleForTesting;
-
+import org.apache.commons.lang3.StringUtils;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.commons.lang3.StringUtils;
-
 import org.apache.cassandra.config.Config;
 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.config.ParameterizedClass;
 import org.apache.cassandra.db.*;
+import org.apache.cassandra.exceptions.WriteTimeoutException;
 import org.apache.cassandra.io.FSWriteError;
-import org.apache.cassandra.schema.CompressionParams;
 import org.apache.cassandra.io.compress.ICompressor;
 import org.apache.cassandra.io.util.BufferedDataOutputStreamPlus;
 import org.apache.cassandra.io.util.DataOutputBufferFixed;
+import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.metrics.CommitLogMetrics;
 import org.apache.cassandra.net.MessagingService;
+import org.apache.cassandra.schema.CompressionParams;
 import org.apache.cassandra.security.EncryptionContext;
 import org.apache.cassandra.service.StorageService;
 import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.JVMStabilityInspector;
 
-import static org.apache.cassandra.db.commitlog.CommitLogSegment.*;
+import static org.apache.cassandra.db.commitlog.CommitLogSegment.Allocation;
+import static org.apache.cassandra.db.commitlog.CommitLogSegment.CommitLogSegmentFileComparator;
+import static org.apache.cassandra.db.commitlog.CommitLogSegment.ENTRY_OVERHEAD_SIZE;
 import static org.apache.cassandra.utils.FBUtilities.updateChecksum;
 import static org.apache.cassandra.utils.FBUtilities.updateChecksumInt;
 
@@ -65,19 +66,19 @@ public class CommitLog implements CommitLogMBean
     // we only permit records HALF the size of a commit log, to ensure we don't spin allocating many mostly
     // empty segments when writing large records
-    private final long MAX_MUTATION_SIZE = DatabaseDescriptor.getMaxMutationSize();
+    final long MAX_MUTATION_SIZE = DatabaseDescriptor.getMaxMutationSize();
+
+    final public AbstractCommitLogSegmentManager segmentManager;
 
-    public final CommitLogSegmentManager allocator;
     public final CommitLogArchiver archiver;
     final CommitLogMetrics metrics;
     final AbstractCommitLogService executor;
 
     volatile Configuration configuration;
-    final public String location;
 
     private static CommitLog construct()
     {
-        CommitLog log = new CommitLog(DatabaseDescriptor.getCommitLogLocation(), CommitLogArchiver.construct());
+        CommitLog log = new CommitLog(CommitLogArchiver.construct());
 
         MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
         try
@@ -92,9 +93,8 @@ public class CommitLog implements CommitLogMBean
     }
 
     @VisibleForTesting
-    CommitLog(String location, CommitLogArchiver archiver)
+    CommitLog(CommitLogArchiver archiver)
     {
-        this.location = location;
         this.configuration = new Configuration(DatabaseDescriptor.getCommitLogCompression(),
                                                DatabaseDescriptor.getEncryptionContext());
         DatabaseDescriptor.createAllDirectories();
@@ -106,16 +106,17 @@ public class CommitLog implements CommitLogMBean
                  ? new BatchCommitLogService(this)
                  : new PeriodicCommitLogService(this);
 
-        allocator = new CommitLogSegmentManager(this);
-
+        segmentManager = DatabaseDescriptor.isCDCEnabled()
+                         ? new CommitLogSegmentManagerCDC(this, DatabaseDescriptor.getCommitLogLocation())
+                         : new CommitLogSegmentManagerStandard(this, DatabaseDescriptor.getCommitLogLocation());
 
         // register metrics
-        metrics.attach(executor, allocator);
+        metrics.attach(executor, segmentManager);
     }
 
     CommitLog start()
     {
         executor.start();
-        allocator.start();
+        segmentManager.start();
         return this;
     }
 
@@ -123,11 +124,12 @@ public class CommitLog implements CommitLogMBean
      * Perform recovery on commit logs located in the directory specified by the config file.
      *
      * @return the number of mutations replayed
+     * @throws IOException
      */
-    public int recover() throws IO
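The core structural change in this hunk is that the old `allocator` field becomes a `segmentManager` chosen once at startup: a CDC-aware strategy when `DatabaseDescriptor.isCDCEnabled()`, otherwise the standard one. Below is a minimal, hedged sketch of that config-driven strategy selection; the class names, the 1024-byte limit, and the allocate/refuse behavior are invented stand-ins (the real CDC manager's capacity handling lives in CommitLogSegmentManagerCDC and surfaces as a WriteTimeoutException, per the new import above):

```java
// Sketch of the construction pattern in the new CommitLog constructor:
// one of two segment-manager strategies is picked once, based on a config flag.
public class SegmentManagerSelection
{
    interface SegmentManager { String allocate(String mutation); }

    static class StandardManager implements SegmentManager
    {
        public String allocate(String mutation) { return "commitlog:" + mutation; }
    }

    static class CdcManager implements SegmentManager
    {
        private final long cdcSpaceLimit;
        private long used = 0;
        CdcManager(long limit) { this.cdcSpaceLimit = limit; }

        // Illustrative: a CDC-aware manager can refuse an allocation once its
        // CDC space budget is exhausted, failing that node's write.
        public String allocate(String mutation)
        {
            used += mutation.length();
            if (used > cdcSpaceLimit)
                throw new IllegalStateException("CDC space exhausted");
            return "commitlog+cdc:" + mutation;
        }
    }

    // Mirrors: segmentManager = isCDCEnabled() ? new ...CDC(...) : new ...Standard(...)
    static SegmentManager create(boolean cdcEnabled)
    {
        return cdcEnabled ? new CdcManager(1024) : new StandardManager();
    }

    public static void main(String[] args)
    {
        System.out.println(create(false).allocate("m1")); // prints commitlog:m1
        System.out.println(create(true).allocate("m1"));  // prints commitlog+cdc:m1
    }
}
```

Choosing the strategy once in the constructor keeps the write path free of per-mutation `if (cdcEnabled)` branching.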
[1/5] cassandra git commit: Add Change Data Capture [Forced Update!]
Repository: cassandra
Updated Branches:
  refs/heads/trunk 5dcab286c -> e31e21623 (forced update)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e31e2162/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
----------------------------------------------------------------------
diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
new file mode 100644
index 000..edff3b7
--- /dev/null
+++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java
@@ -0,0 +1,267 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.db.commitlog;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.cassandra.config.CFMetaData;
+import org.apache.cassandra.config.ColumnDefinition;
+import org.apache.cassandra.config.Config;
+import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.cql3.CQLTester;
+import org.apache.cassandra.cql3.ColumnIdentifier;
+import org.apache.cassandra.db.Keyspace;
+import org.apache.cassandra.db.Mutation;
+import org.apache.cassandra.db.partitions.PartitionUpdate;
+import org.apache.cassandra.db.rows.Row;
+import org.apache.cassandra.utils.JVMStabilityInspector;
+import org.apache.cassandra.utils.KillerForTests;
+
+public class CommitLogReaderTest extends CQLTester
+{
+    @BeforeClass
+    public static void beforeClass()
+    {
+        DatabaseDescriptor.setCommitFailurePolicy(Config.CommitFailurePolicy.ignore);
+        JVMStabilityInspector.replaceKiller(new KillerForTests(false));
+    }
+
+    @Before
+    public void before() throws IOException
+    {
+        CommitLog.instance.resetUnsafe(true);
+    }
+
+    @Test
+    public void testReadAll() throws Throwable
+    {
+        int samples = 1000;
+        populateData(samples);
+        ArrayList<File> toCheck = getCommitLogs();
+
+        CommitLogReader reader = new CommitLogReader();
+
+        TestCLRHandler testHandler = new TestCLRHandler(currentTableMetadata());
+        for (File f : toCheck)
+            reader.readCommitLogSegment(testHandler, f, CommitLogReader.ALL_MUTATIONS, false);
+
+        Assert.assertEquals("Expected 1000 seen mutations, got: " + testHandler.seenMutationCount(),
+                            1000, testHandler.seenMutationCount());
+
+        confirmReadOrder(testHandler, 0);
+    }
+
+    @Test
+    public void testReadCount() throws Throwable
+    {
+        int samples = 50;
+        int readCount = 10;
+        populateData(samples);
+        ArrayList<File> toCheck = getCommitLogs();
+
+        CommitLogReader reader = new CommitLogReader();
+        TestCLRHandler testHandler = new TestCLRHandler();
+
+        for (File f : toCheck)
+            reader.readCommitLogSegment(testHandler, f, readCount - testHandler.seenMutationCount(), false);
+
+        Assert.assertEquals("Expected " + readCount + " seen mutations, got: " + testHandler.seenMutations.size(),
+                            readCount, testHandler.seenMutationCount());
+    }
+
+    @Test
+    public void testReadFromMidpoint() throws Throwable
+    {
+        int samples = 1000;
+        int readCount = 500;
+        CommitLogPosition midpoint = populateData(samples);
+        ArrayList<File> toCheck = getCommitLogs();
+
+        CommitLogReader reader = new CommitLogReader();
+        TestCLRHandler testHandler = new TestCLRHandler();
+
+        // Will skip on incorrect segments due to id mismatch on midpoint
+        for (File f : toCheck)
+            reader.readCommitLogSegment(testHandler, f, midpoint, readCount, false);
+
+        // Confirm correct count on replay
+        Assert.assertEquals("Expected " + readCount + " seen mutations, got: " + testHandler.seenMutations.size(),
+                            readCount, testHandler.seenMutationCount());
+
+        confirmReadOrder(testHandler, samples / 2);
+    }
+
+    @Test
+    public void testReadFromMidpointTooMany() throws Throwable
+    {
+        int samples
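The tests above exercise the new CommitLogReader/CommitLogReadHandler split: the reader walks a segment and hands each record to a handler callback, optionally stopping after a mutation limit (`CommitLogReader.ALL_MUTATIONS` means no limit). A toy sketch of that callback-plus-cap shape, with all names illustrative rather than the real API:

```java
import java.util.ArrayList;
import java.util.List;

// Reader/handler split in miniature: the reader owns iteration and the
// stop-after-N policy; the handler only sees mutations.
public class ReaderHandlerSketch
{
    static final int ALL_MUTATIONS = -1;   // sentinel: no cap, read everything

    interface ReadHandler { void handleMutation(String mutation); }

    static class CountingHandler implements ReadHandler
    {
        final List<String> seen = new ArrayList<>();
        public void handleMutation(String m) { seen.add(m); }
        int seenMutationCount() { return seen.size(); }
    }

    // Analogous to readCommitLogSegment(handler, file, mutationLimit, ...):
    // stop once mutationLimit records have been delivered.
    static void readSegment(List<String> segment, ReadHandler handler, int mutationLimit)
    {
        int read = 0;
        for (String m : segment)
        {
            if (mutationLimit != ALL_MUTATIONS && read >= mutationLimit)
                return;
            handler.handleMutation(m);
            read++;
        }
    }

    public static void main(String[] args)
    {
        List<String> segment = new ArrayList<>();
        for (int i = 0; i < 50; i++)
            segment.add("mutation-" + i);
        CountingHandler h = new CountingHandler();
        readSegment(segment, h, 10);
        System.out.println(h.seenMutationCount()); // prints 10
    }
}
```

Note how `testReadCount` above threads the remaining budget across files by passing `readCount - testHandler.seenMutationCount()` per segment; the same subtraction works against this sketch.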
[3/5] cassandra git commit: Add Change Data Capture
http://git-wip-us.apache.org/repos/asf/cassandra/blob/e31e2162/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
index 2045c35..2e97fd5 100644
--- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
+++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
@@ -22,34 +22,22 @@ import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.nio.channels.FileChannel;
 import java.nio.file.StandardOpenOption;
-import java.util.ArrayList;
-import java.util.Collection;
-import java.util.Collections;
-import java.util.Comparator;
-import java.util.Iterator;
-import java.util.List;
-import java.util.Map;
-import java.util.UUID;
+import java.util.*;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.ConcurrentMap;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.zip.CRC32;
 
-import com.codahale.metrics.Timer;
-
 import org.cliffc.high_scale_lib.NonBlockingHashMap;
-
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
-import org.apache.cassandra.config.CFMetaData;
-import org.apache.cassandra.config.DatabaseDescriptor;
-import org.apache.cassandra.config.Schema;
+import com.codahale.metrics.Timer;
+import org.apache.cassandra.config.*;
 import org.apache.cassandra.db.Mutation;
 import org.apache.cassandra.db.commitlog.CommitLog.Configuration;
 import org.apache.cassandra.db.partitions.PartitionUpdate;
 import org.apache.cassandra.io.FSWriteError;
-import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.utils.CLibrary;
 import org.apache.cassandra.utils.concurrent.OpOrder;
 import org.apache.cassandra.utils.concurrent.WaitQueue;
@@ -66,6 +54,14 @@ public abstract class CommitLogSegment
     private static final Logger logger = LoggerFactory.getLogger(CommitLogSegment.class);
 
     private final static long idBase;
+
+    private CDCState cdcState = CDCState.PERMITTED;
+    public enum CDCState {
+        PERMITTED,
+        FORBIDDEN,
+        CONTAINS
+    }
+
     private final static AtomicInteger nextId = new AtomicInteger(1);
     private static long replayLimitId;
     static
@@ -115,18 +111,20 @@ public abstract class CommitLogSegment
     final FileChannel channel;
     final int fd;
 
+    protected final AbstractCommitLogSegmentManager manager;
+
     ByteBuffer buffer;
     private volatile boolean headerWritten;
 
     final CommitLog commitLog;
     public final CommitLogDescriptor descriptor;
 
-    static CommitLogSegment createSegment(CommitLog commitLog, Runnable onClose)
+    static CommitLogSegment createSegment(CommitLog commitLog, AbstractCommitLogSegmentManager manager, Runnable onClose)
     {
         Configuration config = commitLog.configuration;
-        CommitLogSegment segment = config.useEncryption() ? new EncryptedSegment(commitLog, onClose)
-                                 : config.useCompression() ? new CompressedSegment(commitLog, onClose)
-                                 : new MemoryMappedSegment(commitLog);
+        CommitLogSegment segment = config.useEncryption() ? new EncryptedSegment(commitLog, manager, onClose)
+                                 : config.useCompression() ? new CompressedSegment(commitLog, manager, onClose)
+                                 : new MemoryMappedSegment(commitLog, manager);
         segment.writeLogHeader();
         return segment;
     }
@@ -151,14 +149,16 @@ public abstract class CommitLogSegment
     /**
      * Constructs a new segment file.
      */
-    CommitLogSegment(CommitLog commitLog)
+    CommitLogSegment(CommitLog commitLog, AbstractCommitLogSegmentManager manager)
    {
         this.commitLog = commitLog;
+        this.manager = manager;
+
         id = getNextId();
         descriptor = new CommitLogDescriptor(id,
                                              commitLog.configuration.getCompressorClass(),
                                              commitLog.configuration.getEncryptionContext());
-        logFile = new File(commitLog.location, descriptor.fileName());
+        logFile = new File(manager.storageDirectory, descriptor.fileName());
 
         try
         {
@@ -369,22 +369,11 @@ public abstract class CommitLogSegment
     }
 
     /**
-     * Completely discards a segment file by deleting it. (Potentially blocking operation)
-     */
-    void discard(boolean deleteFile)
-    {
-        close();
-        if (deleteFile)
-            FileUtils.deleteWithConfirm(logFile);
-        commitLog.allocator.addSize(-onDiskSize());
-    }
-
-    /**
-     * @return the current ReplayPosition for this log segment
+     * @retur
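The hunk above introduces a per-segment `CDCState` with three values: PERMITTED, FORBIDDEN, CONTAINS. The actual transition rules live in CommitLogSegmentManagerCDC, which this diff does not show; the sketch below encodes one plausible reading as an assumption, not the committed behavior: a segment starts PERMITTED, becomes CONTAINS once a CDC-enabled mutation lands in it, and is marked FORBIDDEN when CDC disk space runs out before any CDC data arrived.

```java
// Hedged sketch of per-segment CDC state transitions. The enum values match
// the diff; the transition functions are illustrative assumptions.
public class CdcStateSketch
{
    enum CDCState { PERMITTED, FORBIDDEN, CONTAINS }

    // A CDC write landing in a segment marks it as containing CDC data;
    // a FORBIDDEN segment must not accept CDC writes at all.
    static CDCState onCdcWrite(CDCState current)
    {
        if (current == CDCState.FORBIDDEN)
            throw new IllegalStateException("segment not accepting CDC writes");
        return CDCState.CONTAINS;
    }

    // When CDC space is exhausted: a segment already holding CDC data stays
    // CONTAINS (it must still be handed to consumers); an empty one is closed off.
    static CDCState onCdcSpaceExhausted(CDCState current)
    {
        return current == CDCState.CONTAINS ? CDCState.CONTAINS : CDCState.FORBIDDEN;
    }

    public static void main(String[] args)
    {
        CDCState s = CDCState.PERMITTED;
        s = onCdcWrite(s);
        System.out.println(s); // prints CONTAINS
        System.out.println(onCdcSpaceExhausted(CDCState.PERMITTED)); // prints FORBIDDEN
    }
}
```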
[jira] [Updated] (CASSANDRA-8844) Change Data Capture (CDC)
[ https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-8844: --- Resolution: Fixed Fix Version/s: (was: 3.x) 3.8 Status: Resolved (was: Ready to Commit) Switching between C# and Java everyday has its costs. Fixed that, tidied up NEWS.txt (spacing and ordering on Upgrading and Deprecation), and [committed|https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=5dcab286ca0fcd9a71e28dad805f028362572e21]. Thanks for the assist [~carlyeks] and [~blambov]! I'll be creating a follow-up meta ticket w/subtasks from all the stuff that came up here that we deferred and link that to this ticket, as well as moving the link to CASSANDRA-11957 over there. > Change Data Capture (CDC) > - > > Key: CASSANDRA-8844 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8844 > Project: Cassandra > Issue Type: New Feature > Components: Coordination, Local Write-Read Paths >Reporter: Tupshin Harper >Assignee: Joshua McKenzie >Priority: Critical > Fix For: 3.8 > > > "In databases, change data capture (CDC) is a set of software design patterns > used to determine (and track) the data that has changed so that action can be > taken using the changed data. Also, Change data capture (CDC) is an approach > to data integration that is based on the identification, capture and delivery > of the changes made to enterprise data sources." > -Wikipedia > As Cassandra is increasingly being used as the Source of Record (SoR) for > mission critical data in large enterprises, it is increasingly being called > upon to act as the central hub of traffic and data flow to other systems. In > order to try to address the general need, we (cc [~brianmhess]), propose > implementing a simple data logging mechanism to enable per-table CDC patterns. > h2. 
The goals: > # Use CQL as the primary ingestion mechanism, in order to leverage its > Consistency Level semantics, and in order to treat it as the single > reliable/durable SoR for the data. > # To provide a mechanism for implementing good and reliable > (deliver-at-least-once with possible mechanisms for deliver-exactly-once) > continuous semi-realtime feeds of mutations going into a Cassandra cluster. > # To eliminate the developmental and operational burden of users so that they > don't have to do dual writes to other systems. > # For users that are currently doing batch export from a Cassandra system, > give them the opportunity to make that realtime with a minimum of coding. > h2. The mechanism: > We propose a durable logging mechanism that functions similarly to a commitlog, > with the following nuances: > - Takes place on every node, not just the coordinator, so RF number of copies > are logged. > - Separate log per table. > - Per-table configuration. Only tables that are specified as CDC_LOG would do > any logging. > - Per DC. We are trying to keep the complexity to a minimum to make this an > easy enhancement, but most likely use cases would prefer to only implement > CDC logging in one (or a subset) of the DCs that are being replicated to > - In the critical path of ConsistencyLevel acknowledgment. Just as with the > commitlog, failure to write to the CDC log should fail that node's write. If > that means the requested consistency level was not met, then clients *should* > experience UnavailableExceptions. > - Be written in a Row-centric manner such that it is easy for consumers to > reconstitute rows atomically. > - Written in a simple format designed to be consumed *directly* by daemons > written in non-JVM languages > h2. Nice-to-haves > I strongly suspect that the following features will be asked for, but I also > believe that they can be deferred for a subsequent release, and to gauge > actual interest. > - Multiple logs per table. 
This would make it easy to have multiple > "subscribers" to a single table's changes. A workaround would be to create a > forking daemon listener, but that's not a great answer. > - Log filtering. Being able to apply filters, including UDF-based filters > would make Cassandra a much more versatile feeder into other systems, and > again, reduce complexity that would otherwise need to be built into the > daemons. > h2. Format and Consumption > - Cassandra would only write to the CDC log, and never delete from it. > - Cleaning up consumed logfiles would be the client daemon's responsibility > - Logfile size should probably be configurable. > - Logfiles should be named with a predictable naming schema, making it > trivial to process them in order. > - Daemons should be able to checkpoint their work, and resume from where they > left off. This means they would have to leave some file artifact in the CDC log's directory.
[jira] [Updated] (CASSANDRA-8844) Change Data Capture (CDC)
[ https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-8844: --- Status: Ready to Commit (was: Patch Available) > Change Data Capture (CDC) > - > > Key: CASSANDRA-8844 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8844 > Project: Cassandra > Issue Type: New Feature > Components: Coordination, Local Write-Read Paths >Reporter: Tupshin Harper >Assignee: Joshua McKenzie >Priority: Critical > Fix For: 3.x > > > "In databases, change data capture (CDC) is a set of software design patterns > used to determine (and track) the data that has changed so that action can be > taken using the changed data. Also, Change data capture (CDC) is an approach > to data integration that is based on the identification, capture and delivery > of the changes made to enterprise data sources." > -Wikipedia > As Cassandra is increasingly being used as the Source of Record (SoR) for > mission critical data in large enterprises, it is increasingly being called > upon to act as the central hub of traffic and data flow to other systems. In > order to try to address the general need, we (cc [~brianmhess]), propose > implementing a simple data logging mechanism to enable per-table CDC patterns. > h2. The goals: > # Use CQL as the primary ingestion mechanism, in order to leverage its > Consistency Level semantics, and in order to treat it as the single > reliable/durable SoR for the data. > # To provide a mechanism for implementing good and reliable > (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) > continuous semi-realtime feeds of mutations going into a Cassandra cluster. > # To eliminate the developmental and operational burden of users so that they > don't have to do dual writes to other systems. > # For users that are currently doing batch export from a Cassandra system, > give them the opportunity to make that realtime with a minimum of coding. > h2. 
The mechanism: > We propose a durable logging mechanism that functions similarly to a commitlog, > with the following nuances: > - Takes place on every node, not just the coordinator, so RF number of copies > are logged. > - Separate log per table. > - Per-table configuration. Only tables that are specified as CDC_LOG would do > any logging. > - Per DC. We are trying to keep the complexity to a minimum to make this an > easy enhancement, but most likely use cases would prefer to only implement > CDC logging in one (or a subset) of the DCs that are being replicated to > - In the critical path of ConsistencyLevel acknowledgment. Just as with the > commitlog, failure to write to the CDC log should fail that node's write. If > that means the requested consistency level was not met, then clients *should* > experience UnavailableExceptions. > - Be written in a Row-centric manner such that it is easy for consumers to > reconstitute rows atomically. > - Written in a simple format designed to be consumed *directly* by daemons > written in non-JVM languages > h2. Nice-to-haves > I strongly suspect that the following features will be asked for, but I also > believe that they can be deferred for a subsequent release, and to gauge > actual interest. > - Multiple logs per table. This would make it easy to have multiple > "subscribers" to a single table's changes. A workaround would be to create a > forking daemon listener, but that's not a great answer. > - Log filtering. Being able to apply filters, including UDF-based filters > would make Cassandra a much more versatile feeder into other systems, and > again, reduce complexity that would otherwise need to be built into the > daemons. > h2. Format and Consumption > - Cassandra would only write to the CDC log, and never delete from it. > - Cleaning up consumed logfiles would be the client daemon's responsibility > - Logfile size should probably be configurable. 
> - Logfiles should be named with a predictable naming schema, making it > trivial to process them in order. > - Daemons should be able to checkpoint their work, and resume from where they > left off. This means they would have to leave some file artifact in the CDC > log's directory. > - A sophisticated daemon should be able to be written that could > -- Catch up, in written-order, even when it is multiple logfiles behind in > processing > -- Be able to continuously "tail" the most recent logfile and get > low-latency(ms?) access to the data as it is written. > h2. Alternate approach > In order to make consuming a change log easy and efficient to do with low > latency, the following could supplement the approach outlined above > - Instead of writing to a logfile, by default, Cassandra could expose a > socket for a daemon to connect to, and
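The "predictable naming schema" and checkpoint requirements described above can be sketched in a few lines. This is an illustrative sketch of a consumer daemon's bookkeeping, not Cassandra's actual CDC implementation: the segment-name pattern ({{CommitLog-<id>.log}}) and the idea of a single checkpointed segment id are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: order CDC segment files by the monotonically increasing id embedded
// in their names, and skip everything up to and including the last id the
// daemon recorded in its checkpoint artifact.
public class CdcConsumerSketch
{
    // Hypothetical naming schema: CommitLog-<id>.log, where id grows over time.
    private static final Pattern SEGMENT_NAME = Pattern.compile("CommitLog-(\\d+)\\.log");

    static long segmentId(String fileName)
    {
        Matcher m = SEGMENT_NAME.matcher(fileName);
        if (!m.matches())
            throw new IllegalArgumentException("Not a segment file: " + fileName);
        return Long.parseLong(m.group(1));
    }

    // Returns the segments still to be processed, in written order.
    static List<String> pendingInOrder(List<String> fileNames, long checkpointedId)
    {
        List<String> pending = new ArrayList<>();
        for (String name : fileNames)
            if (segmentId(name) > checkpointedId)
                pending.add(name);
        pending.sort(Comparator.comparingLong(CdcConsumerSketch::segmentId));
        return pending;
    }
}
```

Because ordering is derived purely from file names, a daemon that crashed can re-list the directory, reread its checkpoint, and resume deterministically.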
[1/5] cassandra git commit: Add Change Data Capture
Repository: cassandra Updated Branches: refs/heads/trunk ed538f90e -> 5dcab286c http://git-wip-us.apache.org/repos/asf/cassandra/blob/5dcab286/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java -- diff --git a/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java b/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java new file mode 100644 index 000..edff3b7 --- /dev/null +++ b/test/unit/org/apache/cassandra/db/commitlog/CommitLogReaderTest.java @@ -0,0 +1,267 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.cassandra.db.commitlog; + +import java.io.File; +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import org.junit.Assert; +import org.junit.Before; +import org.junit.BeforeClass; +import org.junit.Test; + +import org.apache.cassandra.config.CFMetaData; +import org.apache.cassandra.config.ColumnDefinition; +import org.apache.cassandra.config.Config; +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.cql3.CQLTester; +import org.apache.cassandra.cql3.ColumnIdentifier; +import org.apache.cassandra.db.Keyspace; +import org.apache.cassandra.db.Mutation; +import org.apache.cassandra.db.partitions.PartitionUpdate; +import org.apache.cassandra.db.rows.Row; +import org.apache.cassandra.utils.JVMStabilityInspector; +import org.apache.cassandra.utils.KillerForTests; + +public class CommitLogReaderTest extends CQLTester +{ +@BeforeClass +public static void beforeClass() +{ + DatabaseDescriptor.setCommitFailurePolicy(Config.CommitFailurePolicy.ignore); +JVMStabilityInspector.replaceKiller(new KillerForTests(false)); +} + +@Before +public void before() throws IOException +{ +CommitLog.instance.resetUnsafe(true); +} + +@Test +public void testReadAll() throws Throwable +{ +int samples = 1000; +populateData(samples); +ArrayList toCheck = getCommitLogs(); + +CommitLogReader reader = new CommitLogReader(); + +TestCLRHandler testHandler = new TestCLRHandler(currentTableMetadata()); +for (File f : toCheck) +reader.readCommitLogSegment(testHandler, f, CommitLogReader.ALL_MUTATIONS, false); + +Assert.assertEquals("Expected 1000 seen mutations, got: " + testHandler.seenMutationCount(), +1000, testHandler.seenMutationCount()); + +confirmReadOrder(testHandler, 0); +} + +@Test +public void testReadCount() throws Throwable +{ +int samples = 50; +int readCount = 10; +populateData(samples); +ArrayList toCheck = getCommitLogs(); + +CommitLogReader reader = new CommitLogReader(); +TestCLRHandler 
testHandler = new TestCLRHandler(); + +for (File f : toCheck) +reader.readCommitLogSegment(testHandler, f, readCount - testHandler.seenMutationCount(), false); + +Assert.assertEquals("Expected " + readCount + " seen mutations, got: " + testHandler.seenMutations.size(), +readCount, testHandler.seenMutationCount()); +} + +@Test +public void testReadFromMidpoint() throws Throwable +{ +int samples = 1000; +int readCount = 500; +CommitLogPosition midpoint = populateData(samples); +ArrayList toCheck = getCommitLogs(); + +CommitLogReader reader = new CommitLogReader(); +TestCLRHandler testHandler = new TestCLRHandler(); + +// Will skip on incorrect segments due to id mismatch on midpoint +for (File f : toCheck) +reader.readCommitLogSegment(testHandler, f, midpoint, readCount, false); + +// Confirm correct count on replay +Assert.assertEquals("Expected " + readCount + " seen mutations, got: " + testHandler.seenMutations.size(), +readCount, testHandler.seenMutationCount()); + +confirmReadOrder(testHandler, samples / 2); +} + +@Test +public void testReadFromMidpointTooMany() throws Throwable +{ +int samples = 1000; +
[5/5] cassandra git commit: Add Change Data Capture
Add Change Data Capture Patch by jmckenzie; reviewed by cyeksigian and blambov for CASSANDRA-8844 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5dcab286 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5dcab286 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5dcab286 Branch: refs/heads/trunk Commit: 5dcab286ca0fcd9a71e28dad805f028362572e21 Parents: ed538f9 Author: Josh McKenzie Authored: Sun Mar 27 09:20:47 2016 -0400 Committer: Josh McKenzie Committed: Thu Jun 16 09:53:49 2016 -0400 -- CHANGES.txt | 1 + NEWS.txt| 32 +- build.xml | 51 +- conf/cassandra.yaml | 26 + pylib/cqlshlib/cql3handling.py | 5 +- src/antlr/Parser.g | 3 +- .../org/apache/cassandra/config/Config.java | 6 + .../cassandra/config/DatabaseDescriptor.java| 86 ++- .../statements/CreateKeyspaceStatement.java | 1 + .../cql3/statements/DropKeyspaceStatement.java | 2 +- .../cql3/statements/TableAttributes.java| 3 + .../apache/cassandra/db/ColumnFamilyStore.java | 67 +-- .../org/apache/cassandra/db/Directories.java| 51 +- src/java/org/apache/cassandra/db/Keyspace.java | 17 +- src/java/org/apache/cassandra/db/Memtable.java | 47 +- src/java/org/apache/cassandra/db/Mutation.java | 18 + .../org/apache/cassandra/db/SystemKeyspace.java | 26 +- src/java/org/apache/cassandra/db/WriteType.java | 3 +- .../AbstractCommitLogSegmentManager.java| 584 +++ .../db/commitlog/AbstractCommitLogService.java | 3 +- .../cassandra/db/commitlog/CommitLog.java | 157 +++-- .../db/commitlog/CommitLogPosition.java | 121 .../db/commitlog/CommitLogReadHandler.java | 76 +++ .../cassandra/db/commitlog/CommitLogReader.java | 501 .../db/commitlog/CommitLogReplayer.java | 582 +++--- .../db/commitlog/CommitLogSegment.java | 110 ++-- .../db/commitlog/CommitLogSegmentManager.java | 567 -- .../commitlog/CommitLogSegmentManagerCDC.java | 302 ++ .../CommitLogSegmentManagerStandard.java| 89 +++ .../db/commitlog/CommitLogSegmentReader.java| 366 
.../db/commitlog/CompressedSegment.java | 12 +- .../db/commitlog/EncryptedSegment.java | 18 +- .../db/commitlog/FileDirectSegment.java | 73 +-- .../db/commitlog/MemoryMappedSegment.java | 6 +- .../cassandra/db/commitlog/ReplayPosition.java | 178 -- .../cassandra/db/commitlog/SegmentReader.java | 355 --- .../db/commitlog/SimpleCachedBufferPool.java| 118 .../apache/cassandra/db/lifecycle/Tracker.java | 8 +- .../apache/cassandra/db/view/TableViews.java| 4 +- .../apache/cassandra/db/view/ViewManager.java | 2 - .../io/sstable/format/SSTableReader.java| 1 - .../metadata/LegacyMetadataSerializer.java | 12 +- .../io/sstable/metadata/MetadataCollector.java | 16 +- .../io/sstable/metadata/StatsMetadata.java | 24 +- .../cassandra/metrics/CommitLogMetrics.java | 9 +- .../apache/cassandra/schema/SchemaKeyspace.java | 6 +- .../apache/cassandra/schema/TableParams.java| 23 +- .../cassandra/service/CassandraDaemon.java | 4 +- .../cassandra/streaming/StreamReceiveTask.java | 36 +- .../utils/DirectorySizeCalculator.java | 98 .../cassandra/utils/JVMStabilityInspector.java | 3 +- .../cassandra/utils/memory/BufferPool.java | 2 +- test/conf/cassandra-murmur.yaml | 2 + test/conf/cassandra.yaml| 2 + test/conf/cdc.yaml | 1 + test/data/bloom-filter/ka/foo.cql | 2 +- .../db/commitlog/CommitLogStressTest.java | 123 ++-- .../test/microbench/DirectorySizerBench.java| 105 .../OffsetAwareConfigurationLoader.java | 13 +- .../cassandra/batchlog/BatchlogManagerTest.java | 4 +- .../apache/cassandra/cql3/CDCStatementTest.java | 50 ++ .../org/apache/cassandra/cql3/CQLTester.java| 4 + .../apache/cassandra/cql3/OutOfSpaceTest.java | 2 +- .../cql3/validation/operations/CreateTest.java | 5 +- .../apache/cassandra/db/ReadMessageTest.java| 10 +- .../db/commitlog/CommitLogReaderTest.java | 267 + .../CommitLogSegmentManagerCDCTest.java | 220 +++ .../commitlog/CommitLogSegmentManagerTest.java | 23 +- .../cassandra/db/commitlog/CommitLogTest.java | 130 +++-- .../db/commitlog/CommitLogTestReplayer.java | 59 +- 
.../db/commitlog/CommitLogUpgradeTest.java | 18 +- .../db/commitlog/CommitLogUpgradeTestMaker.java | 4 +- .../db/commitlog/SegmentReaderTest
[3/5] cassandra git commit: Add Change Data Capture
http://git-wip-us.apache.org/repos/asf/cassandra/blob/5dcab286/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java -- diff --git a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java index 2045c35..2e97fd5 100644 --- a/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java +++ b/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java @@ -22,34 +22,22 @@ import java.io.IOException; import java.nio.ByteBuffer; import java.nio.channels.FileChannel; import java.nio.file.StandardOpenOption; -import java.util.ArrayList; -import java.util.Collection; -import java.util.Collections; -import java.util.Comparator; -import java.util.Iterator; -import java.util.List; -import java.util.Map; -import java.util.UUID; +import java.util.*; import java.util.concurrent.ConcurrentHashMap; import java.util.concurrent.ConcurrentMap; import java.util.concurrent.atomic.AtomicInteger; import java.util.zip.CRC32; -import com.codahale.metrics.Timer; - import org.cliffc.high_scale_lib.NonBlockingHashMap; - import org.slf4j.Logger; import org.slf4j.LoggerFactory; -import org.apache.cassandra.config.CFMetaData; -import org.apache.cassandra.config.DatabaseDescriptor; -import org.apache.cassandra.config.Schema; +import com.codahale.metrics.Timer; +import org.apache.cassandra.config.*; import org.apache.cassandra.db.Mutation; import org.apache.cassandra.db.commitlog.CommitLog.Configuration; import org.apache.cassandra.db.partitions.PartitionUpdate; import org.apache.cassandra.io.FSWriteError; -import org.apache.cassandra.io.util.FileUtils; import org.apache.cassandra.utils.CLibrary; import org.apache.cassandra.utils.concurrent.OpOrder; import org.apache.cassandra.utils.concurrent.WaitQueue; @@ -66,6 +54,14 @@ public abstract class CommitLogSegment private static final Logger logger = LoggerFactory.getLogger(CommitLogSegment.class); private final static long idBase; + +private CDCState 
cdcState = CDCState.PERMITTED; +public enum CDCState { +PERMITTED, +FORBIDDEN, +CONTAINS +} + private final static AtomicInteger nextId = new AtomicInteger(1); private static long replayLimitId; static @@ -115,18 +111,20 @@ public abstract class CommitLogSegment final FileChannel channel; final int fd; +protected final AbstractCommitLogSegmentManager manager; + ByteBuffer buffer; private volatile boolean headerWritten; final CommitLog commitLog; public final CommitLogDescriptor descriptor; -static CommitLogSegment createSegment(CommitLog commitLog, Runnable onClose) +static CommitLogSegment createSegment(CommitLog commitLog, AbstractCommitLogSegmentManager manager, Runnable onClose) { Configuration config = commitLog.configuration; -CommitLogSegment segment = config.useEncryption() ? new EncryptedSegment(commitLog, onClose) - : config.useCompression() ? new CompressedSegment(commitLog, onClose) - : new MemoryMappedSegment(commitLog); +CommitLogSegment segment = config.useEncryption() ? new EncryptedSegment(commitLog, manager, onClose) + : config.useCompression() ? new CompressedSegment(commitLog, manager, onClose) + : new MemoryMappedSegment(commitLog, manager); segment.writeLogHeader(); return segment; } @@ -151,14 +149,16 @@ public abstract class CommitLogSegment /** * Constructs a new segment file. */ -CommitLogSegment(CommitLog commitLog) +CommitLogSegment(CommitLog commitLog, AbstractCommitLogSegmentManager manager) { this.commitLog = commitLog; +this.manager = manager; + id = getNextId(); descriptor = new CommitLogDescriptor(id, commitLog.configuration.getCompressorClass(), commitLog.configuration.getEncryptionContext()); -logFile = new File(commitLog.location, descriptor.fileName()); +logFile = new File(manager.storageDirectory, descriptor.fileName()); try { @@ -369,22 +369,11 @@ public abstract class CommitLogSegment } /** - * Completely discards a segment file by deleting it. 
(Potentially blocking operation) - */ -void discard(boolean deleteFile) -{ -close(); -if (deleteFile) -FileUtils.deleteWithConfirm(logFile); -commitLog.allocator.addSize(-onDiskSize()); -} - -/** - * @return the current ReplayPosition for this log segment + * @retur
[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)
[ https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333774#comment-15333774 ] Branimir Lambov commented on CASSANDRA-8844: +1, with a final rename nit: [{{Allocation.GetCommitLogPosition}}|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:8844_review#diff-7720d4b5123a354876e0b3139222f34eR669] is in PascalCase. > Change Data Capture (CDC) > - > > Key: CASSANDRA-8844 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8844 > Project: Cassandra > Issue Type: New Feature > Components: Coordination, Local Write-Read Paths >Reporter: Tupshin Harper >Assignee: Joshua McKenzie >Priority: Critical > Fix For: 3.x > > > "In databases, change data capture (CDC) is a set of software design patterns > used to determine (and track) the data that has changed so that action can be > taken using the changed data. Also, Change data capture (CDC) is an approach > to data integration that is based on the identification, capture and delivery > of the changes made to enterprise data sources." > -Wikipedia > As Cassandra is increasingly being used as the Source of Record (SoR) for > mission critical data in large enterprises, it is increasingly being called > upon to act as the central hub of traffic and data flow to other systems. In > order to try to address the general need, we (cc [~brianmhess]), propose > implementing a simple data logging mechanism to enable per-table CDC patterns. > h2. The goals: > # Use CQL as the primary ingestion mechanism, in order to leverage its > Consistency Level semantics, and in order to treat it as the single > reliable/durable SoR for the data. > # To provide a mechanism for implementing good and reliable > (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) > continuous semi-realtime feeds of mutations going into a Cassandra cluster. 
> # To eliminate the developmental and operational burden of users so that they > don't have to do dual writes to other systems. > # For users that are currently doing batch export from a Cassandra system, > give them the opportunity to make that realtime with a minimum of coding. > h2. The mechanism: > We propose a durable logging mechanism that functions similarly to a commitlog, > with the following nuances: > - Takes place on every node, not just the coordinator, so RF number of copies > are logged. > - Separate log per table. > - Per-table configuration. Only tables that are specified as CDC_LOG would do > any logging. > - Per DC. We are trying to keep the complexity to a minimum to make this an > easy enhancement, but most likely use cases would prefer to only implement > CDC logging in one (or a subset) of the DCs that are being replicated to > - In the critical path of ConsistencyLevel acknowledgment. Just as with the > commitlog, failure to write to the CDC log should fail that node's write. If > that means the requested consistency level was not met, then clients *should* > experience UnavailableExceptions. > - Be written in a Row-centric manner such that it is easy for consumers to > reconstitute rows atomically. > - Written in a simple format designed to be consumed *directly* by daemons > written in non-JVM languages > h2. Nice-to-haves > I strongly suspect that the following features will be asked for, but I also > believe that they can be deferred for a subsequent release, and to gauge > actual interest. > - Multiple logs per table. This would make it easy to have multiple > "subscribers" to a single table's changes. A workaround would be to create a > forking daemon listener, but that's not a great answer. > - Log filtering. Being able to apply filters, including UDF-based filters > would make Cassandra a much more versatile feeder into other systems, and > again, reduce complexity that would otherwise need to be built into the > daemons. > h2. 
Format and Consumption > - Cassandra would only write to the CDC log, and never delete from it. > - Cleaning up consumed logfiles would be the client daemon's responsibility > - Logfile size should probably be configurable. > - Logfiles should be named with a predictable naming schema, making it > trivial to process them in order. > - Daemons should be able to checkpoint their work, and resume from where they > left off. This means they would have to leave some file artifact in the CDC > log's directory. > - A sophisticated daemon should be able to be written that could > -- Catch up, in written-order, even when it is multiple logfiles behind in > processing > -- Be able to continuously "tail" the most recent logfile and get > low-latency(ms?) access to the data as it is written. > h2. Alternate approach > In order to make consuming a change log easy an
[jira] [Commented] (CASSANDRA-11516) Make max number of streams configurable
[ https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333742#comment-15333742 ] Giampaolo commented on CASSANDRA-11516: --- I'm studying how to solve this issue. A quick question: do you mean to put a configuration for [this line|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L62] using {{newFixedThreadPool}} and defaulting to {{FBUtilities#getAvailableProcessors}}? > Make max number of streams configurable > --- > > Key: CASSANDRA-11516 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11516 > Project: Cassandra > Issue Type: New Feature >Reporter: Sebastian Estevez > Labels: lhf > > Today we default to num cores. In large boxes (many cores), this is > suboptimal as it can generate huge amounts of garbage that GC can't keep up > with. > Usually we tackle issues like this with the streaming throughput levers, but > in this case the problem is CPU consumption by StreamReceiveTasks, > specifically in the IntervalTree build -- > https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257 > We need a lever for the max number of parallel streams to handle this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
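The lever Giampaolo describes could look roughly like the following. This is a sketch only: the property name {{cassandra.max_stream_receive_threads}} is invented for the example, and {{Runtime.getRuntime().availableProcessors()}} stands in for {{FBUtilities#getAvailableProcessors}}.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: cap the number of parallel stream-receive threads via configuration,
// falling back to today's behaviour (one thread per core) when unset.
public class StreamThreadsSketch
{
    // Pure sizing logic, kept separate from pool construction so it is testable.
    static int maxStreamThreads(Integer configured, int availableProcessors)
    {
        if (configured == null || configured <= 0)
            return availableProcessors;   // current default: num cores
        return configured;                // operator override for large boxes
    }

    static ExecutorService buildPool()
    {
        // Hypothetical system property; a cassandra.yaml setting would work too.
        String prop = System.getProperty("cassandra.max_stream_receive_threads");
        Integer configured = (prop == null) ? null : Integer.valueOf(prop);
        int threads = maxStreamThreads(configured, Runtime.getRuntime().availableProcessors());
        return Executors.newFixedThreadPool(threads);
    }
}
```

On a 64-core box an operator could then set the property to, say, 4 to keep IntervalTree-build garbage within what GC can absorb.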
[jira] [Commented] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333706#comment-15333706 ] Benjamin Lerer commented on CASSANDRA-11873: bq. What about leap year? I think that point is worth some discussion. The current patch stores the duration as a number of nanoseconds, which means that some information will be lost. If a user provides {{3y}} or {{3 year}}, it will be converted to nanoseconds, and {{now() - 3y}} will not result in the correct date. We can try to guess what the user intended, but that is a risky business. If we want to handle things like that properly, we have to use a more complex serialization format. Basically, we need to store at least {{year}} and {{month}} separately from the remaining time in nanoseconds (which is, I guess, the main reason why InfluxDB does not support the month and year units). Even if it allows better handling of some use cases, I think this solution will bring some problems. {{Java}}, for example, does not have a type that can be directly mapped to that (if I am not mistaken); it has 2 different classes: {{Period}} for the date part and {{Duration}} for the time part. As a consequence, it can be difficult for a driver to handle such a type. I also believe (even if I do not have concrete proof right now) that it will make some computations, like the ones needed for CASSANDRA-11871, more expensive. Overall, I am in favor of keeping things as simple as possible, which for me means: storing the duration as nanoseconds, supporting as literals only a number followed by a symbol (in this first version at least), and not supporting {{month}} or {{year}} units (the current patch does not support {{week}}, but it can easily be added). Having said that, I am fully open to discussion. 
> Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
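The leap-year concern raised in the comment above can be made concrete with {{java.time}}, which splits calendar-aware and fixed-length amounts into exactly the two classes mentioned, {{Period}} and {{Duration}}. The sketch below illustrates the information loss, not any Cassandra code: approximating "3 years" as 3 × 365 days is our stand-in for what a pure-nanosecond encoding effectively does.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.Period;

// Why storing "3y" as a fixed number of nanoseconds loses information:
// calendar-aware "3 years ago" and "3 * 365 days ago" disagree whenever a
// leap day falls inside the interval.
public class DurationLossSketch
{
    static LocalDate threeYearsAgoCalendar(LocalDate from)
    {
        return from.minus(Period.ofYears(3));  // calendar arithmetic
    }

    static LocalDate threeYearsAgoFixed(LocalDate from)
    {
        // A fixed-length approximation of "3 years", as a nanosecond-only
        // representation would have to make.
        return from.atStartOfDay().minus(Duration.ofDays(3L * 365)).toLocalDate();
    }
}
```

Starting from 2016-06-16, the calendar result is 2013-06-16, while the fixed-length result is 2013-06-17 because Feb 29, 2016 falls inside the interval, which is the "leap year" problem in one line.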
[jira] [Commented] (CASSANDRA-12015) Rebuilding from another DC should use different sources
[ https://issues.apache.org/jira/browse/CASSANDRA-12015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333698#comment-15333698 ] DOAN DuyHai commented on CASSANDRA-12015: - Here is the important code path: 1) org.apache.cassandra.tools.nodetool.Rebuild::execute() 2) StorageService::rebuild(String sourceDc, String keyspace, String tokens) 3) RangeStreamer::getAllRangesWithSourcesFor(String keyspaceName, Collection<Range<Token>> desiredRanges) Inside the last method, we call the snitch to sort replicas: List<InetAddress> preferred = snitch.getSortedListByProximity(address, rangeAddresses.get(range)); If you're rebuilding nodes in a new DC with the "nodetool rebuild" command very quickly, it may happen that one replica has better latency than the others, so it will be picked up by DynamicSnitch > Rebuilding from another DC should use different sources > --- > > Key: CASSANDRA-12015 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 > Project: Cassandra > Issue Type: Improvement >Reporter: Fabien Rousseau > > Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing > DC (ex: DC1), only the closest replica is used as a "source of data". > It works but is not optimal, because in the case of an RF=3, 3-node cluster, > only one node in DC1 is streaming the data to DC2. > To build the new DC in a reasonable time, it would be better, in that case, > to stream from multiple sources, thus distributing the load more evenly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
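The improvement being discussed — spreading ranges across replicas instead of always streaming from the single closest one — can be sketched as a source-selection function. This is not the RangeStreamer code; types are simplified to String stand-ins, and the least-loaded tie-breaking rule is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: instead of always taking the first (closest) replica the snitch
// returns for every range, pick the least-loaded replica from each range's
// proximity-sorted list, so streaming load spreads across the source DC.
public class RebuildSourcesSketch
{
    // rangesToReplicas maps a token-range name to its replicas sorted by proximity.
    static Map<String, String> pickSources(Map<String, List<String>> rangesToReplicas)
    {
        Map<String, String> sources = new HashMap<>();
        Map<String, Integer> load = new HashMap<>();   // streams assigned per source
        for (Map.Entry<String, List<String>> e : rangesToReplicas.entrySet())
        {
            // choose the least-loaded replica; ties go to the closer one
            String best = null;
            for (String replica : e.getValue())
                if (best == null || load.getOrDefault(replica, 0) < load.getOrDefault(best, 0))
                    best = replica;
            sources.put(e.getKey(), best);
            load.merge(best, 1, Integer::sum);
        }
        return sources;
    }
}
```

For three ranges all replicated on the same three nodes, this assigns each range a different source node rather than funneling everything through the closest replica.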
[jira] [Commented] (CASSANDRA-8700) replace the wiki with docs in the git repo
[ https://issues.apache.org/jira/browse/CASSANDRA-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333615#comment-15333615 ] Sylvain Lebresne commented on CASSANDRA-8700: - I included the pull requests made so far on [my branch|https://github.com/pcmanus/cassandra/commits/doc_in_tree] (cherry-picked because I like my branches like my complexities: linear). Note that I'm in the middle of fixing the CQL doc, so that's why it looks bad currently (I'm taking the time to add missing parts and reorganize things a bit, so it's taking a bit of time). On the cqlsh doc though, I wonder if it's a good idea to include the description of the command line options, and even of the special commands? Feels like we'll easily forget to update it, and it doesn't seem to add a lot of value over getting the help from cqlsh directly. Maybe we could just point to how to get said help (just mentioning that you should use {{cqlsh -h}} for command line options and that there is a HELP command within cqlsh)? Or maybe we can have all that generated from cqlsh code directly so things stay in sync automatically? > replace the wiki with docs in the git repo > -- > > Key: CASSANDRA-8700 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8700 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Jon Haddad >Assignee: Sylvain Lebresne >Priority: Blocker > Fix For: 3.8 > > Attachments: TombstonesAndGcGrace.md, bloom_filters.md, > compression.md, contributing.zip, getting_started.zip, hardware.md > > > The wiki as it stands is pretty terrible. It takes several minutes to apply > a single update, and as a result, it's almost never updated. The information > there has very little context as to what version it applies to. Most people > I've talked to that try to use the information they find there find it > more confusing than helpful. 
> I'd like to propose that instead of using the wiki, the doc directory in the > cassandra repo be used for docs (already used for the CQL3 spec) in a format that > can be built to a variety of output formats like HTML / epub / etc. I won't > start the bikeshedding on which markup format is preferable - but there are > several options that can work perfectly fine. I've personally used Sphinx with > reStructuredText, and Markdown. Both can build easily and, as an added bonus, > be pushed to readthedocs (or something similar) automatically. For an > example, see cqlengine's documentation, which I think is already > significantly better than the wiki: > http://cqlengine.readthedocs.org/en/latest/ > In addition to being overall easier to maintain, putting the documentation in > the git repo adds context, since it evolves with the versions of Cassandra. > If the wiki were kept even remotely up to date, I wouldn't bother with this, > but not having at least some basic documentation in the repo, or anywhere > associated with the project, is frustrating. > For reference, the last 3 updates were: > 1/15/15 - updating committers list > 1/08/15 - updating contributors and how to contribute > 12/16/14 - added a link to CQL docs from wiki frontpage (by me) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
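Sylvain's last suggestion — generating the cqlsh reference from the tool itself so the two can never drift apart — could be sketched roughly like this. The function name, output format, and the idea of capturing `--help` at doc-build time are all illustrative assumptions, not part of the actual Cassandra build:

```python
# Hypothetical sketch: capture a tool's own --help output at doc-build time
# and wrap it in reStructuredText, so the command-line reference stays in
# sync with the code automatically. Not part of the Cassandra build.

def help_to_rst(title: str, help_text: str) -> str:
    """Wrap captured --help text in an RST section with a literal block."""
    underline = "-" * len(title)
    # Indent every line so it renders as an RST literal block.
    body = "\n".join("   " + line for line in help_text.splitlines())
    return f"{title}\n{underline}\n\n::\n\n{body}\n"

# At build time one would capture the real output, e.g. (untested sketch):
#   help_text = subprocess.run(["cqlsh", "--help"],
#                              capture_output=True, text=True).stdout
demo = help_to_rst("cqlsh command line options",
                   "Usage: cqlsh [options] [host [port]]")
```

The same approach would work for the `HELP` topics inside cqlsh, since they are plain strings in the cqlsh source.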
[jira] [Commented] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333585#comment-15333585 ] Sylvain Lebresne commented on CASSANDRA-11873: -- For the record, CQL is not SQL, and it's not even close. Artificially forcing ourselves to reuse something existing in SQL *every single time* we need new syntax is largely pointless. Anyone trying to use CQL as if it were SQL is going to have a bad surprise, and small syntax differences are going to be the least of their problems. Don't get me wrong, CQL has the same _general_ structure as SQL, so informing our choices with what SQL (and popular SQL databases) do and borrowing good ideas is certainly desirable. But that's only the beginning of the conversation, not the end (even more so when said existing SQL databases don't even agree among themselves). If we think an existing syntax is not particularly good and we can do better, for instance, why would we pick the lesser solution? And in this particular case, I'm _convinced_ that the syntax currently implemented is better than what Postgres or Oracle do (I reckon that such a statement is partly subjective, but I still stand by it). Certainly not a lot better, granted, but better because it is as intuitive as any of the options while being more concise. For that reason, count me as a PMC-binding -1 on *not* supporting it. That said, I'm not against compromises, so please read below before answering. bq. Of the formats I've seen here, Postgres native format is the most user friendly And by "Postgres native format", you mean {{1 year 2 months 3 days 4 hours 5 minutes 6 seconds}}, right? If so (and as mentioned previously), I don't really mind supporting that (I guess for the sake of making the lives of Postgres developers easier, or pleasing those that want to show off their touch-typing skills). 
I don't mind it as long as we also support the shorter version (because, if I don't care about Postgres, why wouldn't I be allowed to abbreviate the units? It surely is pretty natural). > Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
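The shorthand Sylvain defends (e.g. {{2h}} instead of {{2 hours}}) together with the ticket's proposal to represent durations internally as microseconds could be sketched as below. The exact unit spellings the patch accepts are not quoted in this thread, so the unit table here — notably "mo" for month versus "m" for minute, and the nominal 30-day month — is an assumption made purely for illustration:

```python
import re

# Illustrative parser for an abbreviated duration syntax such as "2h" or
# "1m30s". Unit spellings and the nominal month/year lengths are assumptions,
# not the grammar actually implemented in CASSANDRA-11873.
UNITS_US = {
    "y":  12 * 30 * 24 * 3600 * 1_000_000,  # nominal 360-day year (illustrative)
    "mo": 30 * 24 * 3600 * 1_000_000,       # nominal 30-day month (illustrative)
    "d":  24 * 3600 * 1_000_000,
    "h":  3600 * 1_000_000,
    "m":  60 * 1_000_000,
    "s":  1_000_000,
    "ms": 1_000,
    "us": 1,
}

def parse_duration_us(text: str) -> int:
    """Return the duration in microseconds, the internal unit the ticket proposes."""
    total, pos = 0, 0
    # Two-letter units are tried before single-letter ones so "mo"/"ms"/"us"
    # are not misread as "m" followed by garbage.
    for match in re.finditer(r"(\d+)(mo|ms|us|[ymdhs])", text):
        if match.start() != pos:
            raise ValueError(f"unparsable duration: {text!r}")
        total += int(match.group(1)) * UNITS_US[match.group(2)]
        pos = match.end()
    if pos != len(text) or pos == 0:
        raise ValueError(f"unparsable duration: {text!r}")
    return total
```

A longhand Postgres-style front end ({{1 year 2 months ...}}) could map onto the same unit table, which is the compromise being discussed.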
[jira] [Comment Edited] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333529#comment-15333529 ] Brian Hess edited comment on CASSANDRA-11873 at 6/16/16 10:27 AM: --- Being subtly different on syntax is in some cases worse than being very different. So, if we are thinking we will go with the ISO 8601 format (an option that could make sense - it is a widely recognized format and present in more than a few systems (not just databases, I mean)) then we should make sure we include the "P" and the "T". While Postgres does support ISO 8601 formats (of course I bothered to read it), in that format the highest resolution is seconds. There is a good reason to want milliseconds and microseconds (and maybe nanoseconds). The standard Postgres format supports all of these (with the exception of nanoseconds, though that addition to their format would be straightforward to understand). If you want to shorten the Postgres format to save typing, what abbreviation do you propose for "minute" and "month"? I will certainly agree that the Oracle syntax is not user-friendly. I think arguing it is desirable is a stretch. I have a reservation on the Influx syntax here, though. Influx does not support month or year. They only have up to week (https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#relative-time). So, it is not possible to say "now() - 2 months" or "now() - 1 year". To do 1 year, what would you do? "now() - 365d"? What about leap years? What about going back one month? In fact, if one bothers to read https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#time-syntax-in-queries, they would find that this patch only has a subset of Influx's supported format. I don't see a week unit. Moreover, Influx doesn't use "us", it uses just "u". So, our proposed syntax isn't even consistent (in subtle ways) with Influx's format. 
Let alone that Influx's format is incomplete (specifically, no support for months and years). Of the formats I've seen here, Postgres native format is the most user friendly, and accomplishes the goals of durations for us. I'm (non-PMC, non-binding) -1 on the currently proposed format from a usability/product/developer POV. was (Author: brianmhess): Being subtlety different on syntax is in some cases worse than being very different. So, if we are thinking we will go with ISO 8601 format (an option that could make sense - it is a widely recognized format and present in more than a few systems (not just databases, I mean)) then we should make sure we include the "P" and the "T". While Postgres does support ISO 8601 formats (of course I bothered to read it), in that format the highest resolution is seconds. There is a good reason to want milliseconds and microseconds (and maybe nanoseconds). The standard Postgres format support all of these (with the exception of nanoseconds, though that addition to their format would be straightforward to understand). If you want to shorten the Postgres format to save typing, what abbreviation do you propose for "minute" and "month"? I will certainly agree that the Oracle syntax is not user-friendly. I think arguing it is desirable is a stretch. I have a reservation on the Influx syntax here, though. Influx does not support month or year. They only have up to week (https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#relative-time). So, it is not possible to say "now() - 2 months" or "now() - 1 year". To do 1 year, what would you do? "now() - 365d"? What about leap year? What about going back one month? In fact, this patch only had a subset of Influx's supported format. I don't see a week unit. Moreover, Influx doesn't use "us", it uses just "u". So, our proposed syntax isn't even consistent (in subtle ways) with Influx's format. 
Let alone that Influx's format is incomplete (specifically, no support for months and years). Of the formats I've seen here, Postgres native format is the most user friendly, and accomplishes the goals of durations for us. I'm (non-PMC, non-binding) -1 on the currently proposed format from a usability/product/developer POV. > Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This
[jira] [Commented] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333529#comment-15333529 ] Brian Hess commented on CASSANDRA-11873: - Being subtly different on syntax is in some cases worse than being very different. So, if we are thinking we will go with the ISO 8601 format (an option that could make sense - it is a widely recognized format and present in more than a few systems (not just databases, I mean)) then we should make sure we include the "P" and the "T". While Postgres does support ISO 8601 formats (of course I bothered to read it), in that format the highest resolution is seconds. There is a good reason to want milliseconds and microseconds (and maybe nanoseconds). The standard Postgres format supports all of these (with the exception of nanoseconds, though that addition to their format would be straightforward to understand). If you want to shorten the Postgres format to save typing, what abbreviation do you propose for "minute" and "month"? I will certainly agree that the Oracle syntax is not user-friendly. I think arguing it is desirable is a stretch. I have a reservation on the Influx syntax here, though. Influx does not support month or year. They only have up to week (https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#relative-time). So, it is not possible to say "now() - 2 months" or "now() - 1 year". To do 1 year, what would you do? "now() - 365d"? What about leap years? What about going back one month? In fact, this patch only has a subset of Influx's supported format. I don't see a week unit. Moreover, Influx doesn't use "us", it uses just "u". So, our proposed syntax isn't even consistent (in subtle ways) with Influx's format. Never mind that Influx's format is incomplete (specifically, no support for months and years). 
Of the formats I've seen here, Postgres native format is the most user friendly, and accomplishes the goals of durations for us. I'm (non-PMC, non-binding) -1 on the currently proposed format from a usability/product/developer POV. > Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
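Brian's insistence on keeping the "P" and "T" markers is exactly what makes ISO 8601 durations unambiguous: "M" means months before the "T" and minutes after it, which sidesteps his "minute" vs "month" abbreviation question. A minimal illustrative parser (ignoring the "PnW" week form, and not the format the patch implements):

```python
import re

# In ISO 8601 durations the "T" separator disambiguates "M": months appear
# before the T, minutes after it, e.g. P1Y2M3DT4H5M6S. Illustrative only.
ISO8601_DURATION = re.compile(
    r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?"
    r"(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?)?$"
)

def parse_iso8601(text):
    """Return (years, months, days, hours, minutes, seconds) or raise ValueError."""
    m = ISO8601_DURATION.match(text)
    if not m or text in ("P", "PT"):
        raise ValueError(f"not an ISO 8601 duration: {text!r}")
    y, mo, d, h, mi, s = m.groups()
    return (int(y or 0), int(mo or 0), int(d or 0),
            int(h or 0), int(mi or 0), float(s or 0))
```

Note that standard ISO 8601 bottoms out at (fractional) seconds; sub-second units would have to be expressed as a decimal seconds field, which is part of Brian's resolution concern.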
[jira] [Commented] (CASSANDRA-11349) MerkleTree mismatch when multiple range tombstones exists for the same partition and interval
[ https://issues.apache.org/jira/browse/CASSANDRA-11349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333457#comment-15333457 ] Fabien Rousseau commented on CASSANDRA-11349: - Just to let you know that we packaged the patch done by Branimir (as it is the one that has the best chance of being included mainstream). We restored one cluster (3 nodes, 100GB of data per node, affected table is 25GB) from a snapshot onto new hardware and did a full repair. So far, so good: not many differences were found for the affected table, but this was expected because repairs had not been run for a few months (around a hundred vs. a few hundred thousand before). We will continue testing by recreating all of our clusters, and then deploy it to production (and I'll let you know once this is done). > MerkleTree mismatch when multiple range tombstones exists for the same > partition and interval > - > > Key: CASSANDRA-11349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11349 > Project: Cassandra > Issue Type: Bug >Reporter: Fabien Rousseau >Assignee: Stefan Podkowinski > Labels: repair > Fix For: 2.1.x, 2.2.x > > Attachments: 11349-2.1-v2.patch, 11349-2.1-v3.patch, > 11349-2.1-v4.patch, 11349-2.1.patch, 11349-2.2-v4.patch > > > We observed that repair, for some of our clusters, streamed a lot of data and > many partitions were "out of sync". > Moreover, the read repair mismatch ratio is around 3% on those clusters, > which is really high. > After investigation, it appears that, if two range tombstones exist for a > partition for the same range/interval, they're both included in the merkle > tree computation. > But, if for some reason, on another node, the two range tombstones were > already compacted into a single range tombstone, this will result in a merkle > tree difference. 
> Currently, this is clearly bad because MerkleTree differences are dependent > on compactions (and if a partition is deleted and created multiple times, the > only way to ensure that repair "works correctly"/"doesn't overstream data" is > to major compact before each repair... which is not really feasible). > Below is a list of steps that allow one to easily reproduce this case: > {noformat} > ccm create test -v 2.1.13 -n 2 -s > ccm node1 cqlsh > CREATE KEYSPACE test_rt WITH replication = {'class': 'SimpleStrategy', > 'replication_factor': 2}; > USE test_rt; > CREATE TABLE IF NOT EXISTS table1 ( > c1 text, > c2 text, > c3 float, > c4 float, > PRIMARY KEY ((c1), c2) > ); > INSERT INTO table1 (c1, c2, c3, c4) VALUES ( 'a', 'b', 1, 2); > DELETE FROM table1 WHERE c1 = 'a' AND c2 = 'b'; > ctrl ^d > # now flush only one of the two nodes > ccm node1 flush > ccm node1 cqlsh > USE test_rt; > INSERT INTO table1 (c1, c2, c3, c4) VALUES ( 'a', 'b', 1, 3); > DELETE FROM table1 WHERE c1 = 'a' AND c2 = 'b'; > ctrl ^d > ccm node1 repair > # now grep the log and observe that some inconsistencies were detected > between nodes (while it shouldn't have detected any) > ccm node1 showlog | grep "out of sync" > {noformat} > Consequences of this are a costly repair, accumulating many small SSTables > (up to thousands for a rather short period of time when using VNodes, the > time for compaction to absorb those small files), but also an increased size > on disk. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
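The mechanism described in the ticket can be shown with a toy model: the validation digest is computed over the range tombstones as they are stored, so two overlapping tombstones on one node hash differently from their single compacted equivalent on another node, even though both represent the same logical deletion. This is a simplified sketch, not Cassandra's actual MerkleTree or validation-compaction code:

```python
import hashlib

def digest(tombstones):
    """Toy stand-in for a validation hash over a partition's tombstone stream."""
    h = hashlib.md5()
    for start, end, timestamp in tombstones:
        h.update(f"{start}:{end}:{timestamp}".encode())
    return h.hexdigest()

# Node 1 flushed between the two deletions, so it still holds both
# tombstones covering the same interval:
node1 = [("b", "b", 100), ("b", "b", 200)]
# Node 2 already compacted them into the single newest tombstone:
node2 = [("b", "b", 200)]

# Same logical state, different digests -> repair reports "out of sync":
assert digest(node1) != digest(node2)
```

The fix direction discussed in the patches is, in effect, to merge such tombstones before hashing so that both nodes digest the same normalized stream.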
[jira] [Updated] (CASSANDRA-11726) IndexOutOfBoundsException when selecting (distinct) row ids from counter table.
[ https://issues.apache.org/jira/browse/CASSANDRA-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaroslav Kamenik updated CASSANDRA-11726: - Reproduced In: 3.7, 3.5 (was: 3.5) > IndexOutOfBoundsException when selecting (distinct) row ids from counter > table. > --- > > Key: CASSANDRA-11726 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11726 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: C* 3.5, cluster of 4 nodes. >Reporter: Jaroslav Kamenik > > I have simple table containing counters: > {code} > CREATE TABLE tablename ( > object_id ascii, > counter_id ascii, > count counter, > PRIMARY KEY (object_id, counter_id) > ) WITH CLUSTERING ORDER BY (counter_id ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'enabled': 'false'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > {code} > Counters are often inc/decreased, whole rows are queried, deleted sometimes. > After some time I tried to query all object_ids, but it failed with: > {code} > cqlsh:woc> consistency quorum; > cqlsh:woc> select object_id from tablename; > ServerError: message="java.lang.IndexOutOfBoundsException"> > {code} > select * from ..., select where .., updates works well.. > With consistency one it works sometimes, so it seems something is broken at > one server, but I tried to repair table there and it did not help. 
> Whole exception from server log: > {code} > java.lang.IndexOutOfBoundsException: null > at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_73] > at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) > ~[na:1.8.0_73] > at > org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.context.CounterContext$ContextState.(CounterContext.java:758) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.context.CounterContext$ContextState.wrap(CounterContext.java:765) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.context.CounterContext.merge(CounterContext.java:271) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.Conflicts.mergeCounterValues(Conflicts.java:76) > ~[apache-cassandra-3.5.jar:3.5] > at org.apache.cassandra.db.rows.Cells.reconcile(Cells.java:143) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:591) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473) > ~[apache-cassandra-3.5.jar:3.5] > at > 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:419
[jira] [Commented] (CASSANDRA-11726) IndexOutOfBoundsException when selecting (distinct) row ids from counter table.
[ https://issues.apache.org/jira/browse/CASSANDRA-11726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333430#comment-15333430 ] Jaroslav Kamenik commented on CASSANDRA-11726: -- Hi, we have experienced the same problem, now on C* 3.7. It seems there are others with the same problem; see https://issues.apache.org/jira/browse/CASSANDRA-11812 . > IndexOutOfBoundsException when selecting (distinct) row ids from counter > table. > --- > > Key: CASSANDRA-11726 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11726 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: C* 3.5, cluster of 4 nodes. >Reporter: Jaroslav Kamenik > > I have a simple table containing counters: > {code} > CREATE TABLE tablename ( > object_id ascii, > counter_id ascii, > count counter, > PRIMARY KEY (object_id, counter_id) > ) WITH CLUSTERING ORDER BY (counter_id ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'enabled': 'false'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > {code} > Counters are often inc/decreased, whole rows are queried, deleted sometimes. > After some time I tried to query all object_ids, but it failed with: > {code} > cqlsh:woc> consistency quorum; > cqlsh:woc> select object_id from tablename; > ServerError: message="java.lang.IndexOutOfBoundsException"> > {code} > select * from ..., select where .., updates work well.. 
> With consistency one it works sometimes, so it seems something is broken at > one server, but I tried to repair table there and it did not help. > Whole exception from server log: > {code} > java.lang.IndexOutOfBoundsException: null > at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_73] > at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) > ~[na:1.8.0_73] > at > org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.context.CounterContext$ContextState.(CounterContext.java:758) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.context.CounterContext$ContextState.wrap(CounterContext.java:765) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.context.CounterContext.merge(CounterContext.java:271) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.Conflicts.mergeCounterValues(Conflicts.java:76) > ~[apache-cassandra-3.5.jar:3.5] > at org.apache.cassandra.db.rows.Cells.reconcile(Cells.java:143) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:591) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.Row$Merger$ColumnDataReducer.getReduced(Row.java:549) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.5.jar:3.5] > at org.apache.cassandra.db.rows.Row$Merger.merge(Row.java:526) > ~[apache-cassandra-3.5.jar:3.5] > at > 
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:473) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator$MergeReducer.getReduced(UnfilteredRowIterators.java:437) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:217) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:156) > ~[apache-cassandra-3.5.jar:3.5] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterato
[jira] [Commented] (CASSANDRA-11812) IndexOutOfBoundsException in CounterContext.headerLength
[ https://issues.apache.org/jira/browse/CASSANDRA-11812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333427#comment-15333427 ] Jaroslav Kamenik commented on CASSANDRA-11812: -- Hi, it seems we have the same problem; I have reported it here: https://issues.apache.org/jira/browse/CASSANDRA-11726 > IndexOutOfBoundsException in CounterContext.headerLength > > > Key: CASSANDRA-11812 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11812 > Project: Cassandra > Issue Type: Bug > Environment: Ubuntu 14.04 LTS > Cassandra 3.4 >Reporter: Jeff Evans > Fix For: 3.x > > > My team is using > https://github.com/Contrast-Security-OSS/cassandra-migration for schema > migrations, and it creates a table with a counter to store the schema > version. We're able to create the table fine in 3.4, but when we run the > tool on an existing keyspace, the client reports an error from the server. > The server logs show an IndexOutOfBounds exception related to the counter > column. 
> the library creates a table with name and count: > {code} > CREATE TABLE silver.cassandra_migration_version_counts ( > name text PRIMARY KEY, > count counter > ) WITH bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > {code} > then when the library performs a migration, it counts the rows in this table: > {{SELECT count(\*) FROM silver.cassandra_migration_version_counts;}} > this query throws the error: > {code} > cqlsh:silver> SELECT count(*) FROM silver.cassandra_migration_version_counts; > ServerError: message="java.lang.IndexOutOfBoundsException"> > {code} > the client driver debug logs show the query is running with CONSISTENCY ALL > {code} > 2016-05-16 18:03:05 [cluster2-nio-worker-3] WARN > c.d.driver.core.RequestHandler - /172.24.131.52:9042 replied with server > error (java.lang.IndexOutOfBoundsException), defuncting connection. 
> 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG > com.datastax.driver.core.Host.STATES - Defuncting > Connection[/172.24.131.52:9042-1, inFlight=0, closed=false] because: An > unexpected error occurred server side on /172.24.131.52:9042: > java.lang.IndexOutOfBoundsException > 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG > com.datastax.driver.core.Connection - Connection[/172.24.131.52:9042-1, > inFlight=0, closed=true] closing connection > 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG > com.datastax.driver.core.Host.STATES - [/172.24.131.52:9042] preventing new > connections for the next 1000 ms > 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG > com.datastax.driver.core.Host.STATES - [/172.24.131.52:9042] > Connection[/172.24.131.52:9042-1, inFlight=0, closed=true] failed, remaining > = 0 > 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG > c.d.driver.core.RequestHandler - [937744315-1] Doing retry 1 for query SELECT > count(*) FROM silver.cassandra_migration_version_counts; at consistency ALL > 2016-05-16 18:03:05 [cluster2-nio-worker-3] DEBUG > c.d.driver.core.RequestHandler - [937744315-1] Error querying > /172.24.131.52:9042 : com.datastax.driver.core.exceptions.ServerError: An > unexpected error occurred server side on /172.24.131.52:9042: > java.lang.IndexOutOfBoundsException > {code} > I can repro the error with this table, but I haven't found a simple repro yet. 
> here's the call stack from the server log: > {code} > ERROR [SharedPool-Worker-2] 2016-05-16 17:00:39,313 ErrorMessage.java:338 - > Unexpected exception during request > java.lang.IndexOutOfBoundsException: null > at java.nio.Buffer.checkIndex(Buffer.java:546) ~[na:1.8.0_72] > at java.nio.HeapByteBuffer.getShort(HeapByteBuffer.java:314) > ~[na:1.8.0_72] > at > org.apache.cassandra.db.context.CounterContext.headerLength(CounterContext.java:141) > ~[apache-cassandra-3.4.jar:3.4] > at > org.apache.cassandra.db.context.CounterContext.access$100(CounterContext.java:76) > ~[apache-cassandra-3.4.jar:3.4] > at > org.apache.cassandra.db.context.CounterContext$ContextState.(CounterContext.java:758) > ~[apache-cassandra-3.4.jar:3.4] > at > org.apache.c
[jira] [Created] (CASSANDRA-12015) Rebuilding from another DC should use different sources
Fabien Rousseau created CASSANDRA-12015: --- Summary: Rebuilding from another DC should use different sources Key: CASSANDRA-12015 URL: https://issues.apache.org/jira/browse/CASSANDRA-12015 Project: Cassandra Issue Type: Improvement Reporter: Fabien Rousseau Currently, when adding a new DC (ex: DC2) and rebuilding it from an existing DC (ex: DC1), only the closest replica is used as the "source of data". This works but is not optimal, because in the case of RF=3 and a 3-node cluster, only one node in DC1 streams the data to DC2. To build the new DC in a reasonable time, it would be better, in that case, to stream from multiple sources, thus distributing the load more evenly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
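The improvement proposed in CASSANDRA-12015 amounts to spreading token ranges over all replicas that own them instead of streaming every range from the single closest replica. A rough sketch of such an assignment; the names and data structures are illustrative, not the actual RangeStreamer API:

```python
from itertools import cycle

def assign_sources(ranges_to_replicas):
    """Map each token range to one streaming source, round-robining over the
    replicas that own it so no single DC1 node carries the whole rebuild.
    ranges_to_replicas: {token_range: [replica, ...]} -> {token_range: replica}
    """
    pickers = {}
    assignment = {}
    for token_range, replicas in sorted(ranges_to_replicas.items()):
        key = tuple(replicas)
        pickers.setdefault(key, cycle(replicas))  # one round-robin per replica set
        assignment[token_range] = next(pickers[key])
    return assignment

# With RF=3 and a 3-node DC1, every node owns every range, so the current
# "closest replica" policy would pick one node for everything; round-robin
# spreads the streaming across all three:
ranges = {
    (0, 100): ["dc1-a", "dc1-b", "dc1-c"],
    (100, 200): ["dc1-a", "dc1-b", "dc1-c"],
    (200, 300): ["dc1-a", "dc1-b", "dc1-c"],
}
plan = assign_sources(ranges)
assert set(plan.values()) == {"dc1-a", "dc1-b", "dc1-c"}
```

A production version would additionally weight sources by load or proximity rather than pure round-robin, but the sketch captures the distribution idea in the ticket.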
[jira] [Comment Edited] (CASSANDRA-10389) Repair session exception Validation failed
[ https://issues.apache.org/jira/browse/CASSANDRA-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333412#comment-15333412 ] Heiko Sommer edited comment on CASSANDRA-10389 at 6/16/16 9:09 AM: --- I'm getting the same problem with Cassandra 2.2.5, cluster of 6 nodes, RF=2. As a workaround I must restart all nodes before running a repair. For sure I do not start multiple repairs simultaneously. Here is what happened the last time I tried it out: The previous incremental repair ("{{nodetool repair --partitioner-range -- mykeyspace}}") started on a single node after rolling cluster restart finished nicely, with the expected number of "Session completed successfully" logs. There were no more repair tasks or anticompaction tasks running, the cluster was stable. I restarted C* on 4 nodes, but left it running on 2 nodes. On one of the restarted nodes I ran an incremental repair again, this time also with the "{{--sequential}}" option. On the repairing node I get failure logs such as {noformat} java.lang.RuntimeException: Could not create snapshot at /10.195.62.171 at org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:79) ~[apache-cassandra-2.2.5.jar:2.2.5] ERROR [Repair#1:16] 2016-06-16 07:10:29,239 CassandraDaemon.java:185 - Exception in thread Thread[Repair#1:16,5,RMI Runtime] com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: Could not create snapshot at /10.195.62.171 at com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1387) ~[guava-16.0.jar:na] {noformat} while on the failing target nodes (those that were not restarted before the repair) I get logs such as {noformat} ERROR [AntiEntropyStage:1] 2016-06-16 07:10:29,237 RepairMessageVerbHandler.java:108 - Cannot start multiple repair sessions over the same sstables {noformat} Before that, I also tried with full repair, and got the impression that it is the same problem for 
full or incremental repairs. As I can reproduce the issue, I would be glad to provide you with more logs or some experimenting if that would help resolve the issue. > Repair session exception Validation failed > -- > > Key: CASSANDRA-10389 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10389 > Project: Cassandra > Issue Type: Bug > Environment: Debian 8, Java 1.8.0_60, Cassandra 2.2.1 (datastax > compilation) >Reporter: Jędrzej Sieracki > Fix For: 2.2.x > > > I'm running a repair on a ring of nodes, that was recently extented from 3 to > 13 nodes. The extension was done two days ago, the repair was attempted > yesterday. > {quote} > [2015-09-22 11:55:55,266] Starting repair command #9, repairing keyspace > perspectiv with repair options (parallelism: parallel
[jira] [Commented] (CASSANDRA-10389) Repair session exception Validation failed
[ https://issues.apache.org/jira/browse/CASSANDRA-10389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333412#comment-15333412 ] Heiko Sommer commented on CASSANDRA-10389: -- I'm getting the same problem with Cassandra 2.2.5, cluster of 6 nodes, RF=2. As a workaround I must restart all nodes before running a repair. For sure I do not start multiple repairs simultaneously. Here is what happened the last time I tried it out: The previous incremental repair ("nodetool repair --partitioner-range -- mykeyspace") started on a single node after rolling cluster restart finished nicely, with the expected number of "Session completed successfully" logs. There were no more repair tasks or anticompaction tasks running, the cluster was stable. I restarted C* on 4 nodes, but left it running on 2 nodes. On one of the restarted nodes I ran an incremental repair again, this time also with the "--sequential" option. On the repairing node I get failure logs such as {noformat} java.lang.RuntimeException: Could not create snapshot at /10.195.62.171 at org.apache.cassandra.repair.SnapshotTask$SnapshotCallback.onFailure(SnapshotTask.java:79) ~[apache-cassandra-2.2.5.jar:2.2.5] ERROR [Repair#1:16] 2016-06-16 07:10:29,239 CassandraDaemon.java:185 - Exception in thread Thread[Repair#1:16,5,RMI Runtime] com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: Could not create snapshot at /10.195.62.171 at com.google.common.util.concurrent.Futures.wrapAndThrowUnchecked(Futures.java:1387) ~[guava-16.0.jar:na] {noformat} while on the failing target nodes (those that were not restarted before the repair) I get logs such as {noformat} ERROR [AntiEntropyStage:1] 2016-06-16 07:10:29,237 RepairMessageVerbHandler.java:108 - Cannot start multiple repair sessions over the same sstables {noformat} Before that, I also tried with full repair, and got the impression that it is the same problem for full or incremental repairs. 
As I can reproduce the issue, I would be glad to provide you with more logs or some experimenting if that would help resolve the issue. > Repair session exception Validation failed > -- > > Key: CASSANDRA-10389 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10389 > Project: Cassandra > Issue Type: Bug > Environment: Debian 8, Java 1.8.0_60, Cassandra 2.2.1 (datastax > compilation) >Reporter: Jędrzej Sieracki > Fix For: 2.2.x > > > I'm running a repair on a ring of nodes, that was recently extented from 3 to > 13 nodes. The extension was done two days ago, the repair was attempted > yesterday. > {quote} > [2015-09-22 11:55:55,266] Starting repair command #9, repairing keyspace > perspectiv with repair options (parallelism: parallel, primary range: false, > incremental: true, job threads: 1, ColumnFamilies: [], dataCenters: [], > hosts: [], # of ranges: 517) > [2015-09-22 11:55:58,043] Repair session 1f7c50c0-6110-11e5-b992-9f13fa8664c8 > for range (-5927186132136652665,-5917344746039874798] failed with error > [repair #1f7c50c0-6110-11e5-b992-9f13fa8664c8 on > perspectiv/stock_increment_agg, (-5927186132136652665,-5917344746039874798]] > Validation failed in cblade1.XXX/XXX (progress: 0%) > {quote} > BTW, I am ignoring the LEAK errors for now, that's outside of the scope of > the main issue: > {quote} > ERROR [Reference-Reaper:1] 2015-09-22 11:58:27,843 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@4d25ad8f) to class > org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@896826067:/var/lib/cassandra/data/perspectiv/stock_increment_agg-840cad405de711e5b9929f13fa8664c8/la-73-big > was not released before the reference was garbage collected > {quote} > I scrubbed the sstable with failed validation on cblade1 with nodetool scrub > perspectiv stock_increment_agg: > {quote} > INFO [CompactionExecutor:1704] 2015-09-22 12:05:31,615 OutputHandler.java:42 > - Scrubbing > 
BigTableReader(path='/var/lib/cassandra/data/perspectiv/stock_increment_agg-840cad405de711e5b9929f13fa8664c8/la-83-big-Data.db') > (345466609 bytes) > INFO [CompactionExecutor:1703] 2015-09-22 12:05:31,615 OutputHandler.java:42 > - Scrubbing > BigTableReader(path='/var/lib/cassandra/data/perspectiv/stock_increment_agg-840cad405de711e5b9929f13fa8664c8/la-82-big-Data.db') > (60496378 bytes) > ERROR [Reference-Reaper:1] 2015-09-22 12:05:31,676 Ref.java:187 - LEAK > DETECTED: a reference > (org.apache.cassandra.utils.concurrent.Ref$State@4ca8951e) to class > org.apache.cassandra.io.sstable.format.SSTableReader$InstanceTidier@114161559:/var/lib/cassandra/data/perspectiv/receipt_agg_total-76abb0625de711e59f6e0b7d98a25b6e/la-48-big > was not released before the reference was garbage collecte
[jira] [Commented] (CASSANDRA-12002) SSTable tools mishandling LocalPartitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-12002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1575#comment-1575 ] Stefania commented on CASSANDRA-12002: -- Thanks for the trunk patch. I agree on committing only to trunk: this is a minor bug, sstablemetadata has only used the partitioner since CASSANDRA-7159 (committed in 3.6), and sstabledump already handles secondary indexes, as you mention. One thing worried me a little: {{if (validationMetadata.partitioner.endsWith("LocalPartitioner"))}}. Although it is _extremely_ unlikely to be a problem, I prefer to use the fully qualified class name; see this commit [here|https://github.com/stef1927/cassandra/commit/5e5e3c818adea83b702e52ee6653c2a54c7dbef4]. Could you cross-review it? I've started CI tests for trunk here:
|[patch|https://github.com/stef1927/cassandra/commits/12002]|
|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12002-testall/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12002-dtest/]|
It would be nice to have a dtest or unit test to exercise this code. How did you bump into this issue? > SSTable tools mishandling LocalPartitioner > -- > > Key: CASSANDRA-12002 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12002 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Minor > Attachments: CASSADNRA-12002.txt > > > The sstabledump and sstablemetadata tools use FBUtilities.newPartitioner > with the name of the partitioner stored in the validation component. This fails on > sstables created by anything that uses the LocalPartitioner > (secondary indexes, and the system.batches table). The sstabledump tool had a > check for secondary indexes but still failed for the system table; the > metadata tool failed for all of them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
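Stefania's concern above — matching the partitioner by class-name suffix rather than by fully qualified name — can be shown with a small standalone sketch. The impostor class name below is invented for illustration; only the string-comparison logic is the point:

```java
// Why matching a partitioner class by suffix is fragile: any class whose
// name happens to end in "LocalPartitioner" is accepted.
public class PartitionerCheck {
    static final String LOCAL = "org.apache.cassandra.dht.LocalPartitioner";

    // Suffix match, as in the original patch.
    static boolean bySuffix(String partitioner) {
        return partitioner.endsWith("LocalPartitioner");
    }

    // Fully qualified match, as in the cross-review commit: exactly one class matches.
    static boolean byFqcn(String partitioner) {
        return LOCAL.equals(partitioner);
    }

    public static void main(String[] args) {
        String real = "org.apache.cassandra.dht.LocalPartitioner";
        String impostor = "com.example.NotReallyLocalPartitioner"; // hypothetical class
        System.out.println(bySuffix(real) + " " + byFqcn(real));         // true true
        System.out.println(bySuffix(impostor) + " " + byFqcn(impostor)); // true false
    }
}
```

The fully qualified comparison costs nothing and removes the (admittedly unlikely) false positive entirely, which is why it is the safer default for this kind of metadata dispatch.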
[jira] [Commented] (CASSANDRA-12012) CQLSSTableWriter and composite clustering keys trigger NPE
[ https://issues.apache.org/jira/browse/CASSANDRA-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1565#comment-1565 ] Pierre N. commented on CASSANDRA-12012: --- hasSupportingIndex() of org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet calls Keyspace.openAndGetStore(cfm), which triggers an error because the keyspace is uninitialized in client mode. I hotfixed it by adding this check: {code}
+import org.apache.cassandra.config.Config;
 import org.apache.cassandra.cql3.QueryOptions;
 import org.apache.cassandra.cql3.functions.Function;
 import org.apache.cassandra.cql3.statements.Bound;
@@ -115,7 +116,7 @@ final class PrimaryKeyRestrictionSet extends AbstractPrimaryKeyRestrictions
         this.isPartitionKey = primaryKeyRestrictions.isPartitionKey;
         this.cfm = primaryKeyRestrictions.cfm;
-        if (!primaryKeyRestrictions.isEmpty() && !hasSupportingIndex(restriction))
+        if (!Config.isClientMode() && !primaryKeyRestrictions.isEmpty() && !hasSupportingIndex(restriction))
         {
             ColumnDefinition lastRestrictionStart = primaryKeyRestrictions.restrictions.lastRestriction().getFirstColumn();
             ColumnDefinition newRestrictionStart = restriction.getFirstColumn();
{code} It works and generates a valid sstable; however, I am not sure this is the best way to fix it. > CQLSSTableWriter and composite clustering keys trigger NPE > -- > > Key: CASSANDRA-12012 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12012 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Pierre N.
>Assignee: Mahdi Mohammadi > > It triggers when using multiple clustering keys in the primary keys > {code} > package tests; > import java.nio.file.Files; > import org.apache.cassandra.io.sstable.CQLSSTableWriter; > import org.apache.cassandra.config.Config; > public class DefaultWriter { > > public static void main(String[] args) throws Exception { > Config.setClientMode(true); > > String createTableQuery = "CREATE TABLE ks_test.table_test (" > + "pk1 int," > + "ck1 int," > + "ck2 int," > + "PRIMARY KEY ((pk1), ck1, ck2)" > + ");"; > String insertQuery = "INSERT INTO ks_test.table_test(pk1, ck1, ck2) > VALUES(?,?,?)"; > > CQLSSTableWriter writer = CQLSSTableWriter.builder() > .inDirectory(Files.createTempDirectory("sst").toFile()) > .forTable(createTableQuery) > .using(insertQuery) > .build(); > writer.close(); > } > } > {code} > Exception : > {code} > Exception in thread "main" java.lang.ExceptionInInitializerError > at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:368) > at org.apache.cassandra.db.Keyspace.(Keyspace.java:305) > at org.apache.cassandra.db.Keyspace.open(Keyspace.java:129) > at org.apache.cassandra.db.Keyspace.open(Keyspace.java:106) > at org.apache.cassandra.db.Keyspace.openAndGetStore(Keyspace.java:159) > at > org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.hasSupportingIndex(PrimaryKeyRestrictionSet.java:156) > at > org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.(PrimaryKeyRestrictionSet.java:118) > at > org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.mergeWith(PrimaryKeyRestrictionSet.java:213) > at > org.apache.cassandra.cql3.restrictions.StatementRestrictions.addSingleColumnRestriction(StatementRestrictions.java:266) > at > org.apache.cassandra.cql3.restrictions.StatementRestrictions.addRestriction(StatementRestrictions.java:250) > at > org.apache.cassandra.cql3.restrictions.StatementRestrictions.(StatementRestrictions.java:159) > at > 
org.apache.cassandra.cql3.statements.UpdateStatement$ParsedInsert.prepareInternal(UpdateStatement.java:183) > at > org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:782) > at > org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:768) > at > org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:505) > at > org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.getStatement(CQLSSTableWriter.java:508) > at > org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.using(CQLSSTableWriter.java:439) > at tests.DefaultWriter.main(DefaultWriter.java:29) > Caused by: java.lang.NullPointerException > at > org.apache.cassandra.config.DatabaseDescriptor.getFlushWriters(DatabaseDescriptor.java:1188) > at > org.apache.cassandra.db.ColumnFamilyStore.(ColumnFamilyStore.java:127) > ... 18 more
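The hotfix above boils down to short-circuiting a server-only lookup behind a client-mode flag. Here is a minimal, self-contained sketch of that guard pattern; the class and flag below are stand-ins for illustration, not Cassandra's actual internals:

```java
// Guard pattern: skip server-side initialization when running as an
// offline/client tool (e.g. CQLSSTableWriter). All names here are
// illustrative stand-ins for the real Cassandra classes.
public class ClientModeGuard {
    // Stand-in for org.apache.cassandra.config.Config.isClientMode().
    static boolean clientMode = false;

    static boolean hasSupportingIndex() {
        // In the real code this opens a ColumnFamilyStore, which requires an
        // initialized server-side keyspace and blows up in client mode.
        if (clientMode)
            throw new IllegalStateException("keyspace not initialized in client mode");
        return false;
    }

    // The hotfix: evaluate the client-mode flag first, so the && chain
    // short-circuits before the server-only call can run.
    static boolean needsIndexCheck(boolean restrictionsEmpty) {
        return !clientMode && !restrictionsEmpty && !hasSupportingIndex();
    }

    public static void main(String[] args) {
        clientMode = true;
        // With the guard in place, no server-side lookup happens in client mode.
        System.out.println(needsIndexCheck(false)); // false, and no exception
    }
}
```

The ordering of the conditions is what makes the fix work: `&&` evaluates left to right, so placing the cheap flag check first keeps the expensive (and crashing) call from ever executing in client mode.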
[jira] [Updated] (CASSANDRA-12012) CQLSSTableWriter and composite clustering keys trigger NPE
[ https://issues.apache.org/jira/browse/CASSANDRA-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre N. updated CASSANDRA-12012: -- Description: It triggers when using multiple clustering keys in the primary keys {code} package tests; import java.io.File; import org.apache.cassandra.io.sstable.CQLSSTableWriter; import org.apache.cassandra.config.Config; public class DefaultWriter { public static void main(String[] args) throws Exception { Config.setClientMode(true); String createTableQuery = "CREATE TABLE ks_test.table_test (" + "pk1 int," + "ck1 int," + "ck2 int," + "PRIMARY KEY ((pk1), ck1, ck2)" + ");"; String insertQuery = "INSERT INTO ks_test.table_test(pk1, ck1, ck2) VALUES(?,?,?)"; CQLSSTableWriter writer = CQLSSTableWriter.builder() .inDirectory(Files.createTempDirectory("sst").toFile()) .forTable(createTableQuery) .using(insertQuery) .build(); writer.close(); } } {code} Exception : {code} Exception in thread "main" java.lang.ExceptionInInitializerError at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:368) at org.apache.cassandra.db.Keyspace.(Keyspace.java:305) at org.apache.cassandra.db.Keyspace.open(Keyspace.java:129) at org.apache.cassandra.db.Keyspace.open(Keyspace.java:106) at org.apache.cassandra.db.Keyspace.openAndGetStore(Keyspace.java:159) at org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.hasSupportingIndex(PrimaryKeyRestrictionSet.java:156) at org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.(PrimaryKeyRestrictionSet.java:118) at org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.mergeWith(PrimaryKeyRestrictionSet.java:213) at org.apache.cassandra.cql3.restrictions.StatementRestrictions.addSingleColumnRestriction(StatementRestrictions.java:266) at org.apache.cassandra.cql3.restrictions.StatementRestrictions.addRestriction(StatementRestrictions.java:250) at org.apache.cassandra.cql3.restrictions.StatementRestrictions.(StatementRestrictions.java:159) at 
org.apache.cassandra.cql3.statements.UpdateStatement$ParsedInsert.prepareInternal(UpdateStatement.java:183) at org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:782) at org.apache.cassandra.cql3.statements.ModificationStatement$Parsed.prepare(ModificationStatement.java:768) at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:505) at org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.getStatement(CQLSSTableWriter.java:508) at org.apache.cassandra.io.sstable.CQLSSTableWriter$Builder.using(CQLSSTableWriter.java:439) at tests.DefaultWriter.main(DefaultWriter.java:29) Caused by: java.lang.NullPointerException at org.apache.cassandra.config.DatabaseDescriptor.getFlushWriters(DatabaseDescriptor.java:1188) at org.apache.cassandra.db.ColumnFamilyStore.(ColumnFamilyStore.java:127) ... 18 more {code} was: It triggers when using multiple clustering keys in the primary keys {code} package tests; import java.io.File; import org.apache.cassandra.io.sstable.CQLSSTableWriter; import org.apache.cassandra.config.Config; public class DefaultWriter { public static void main(String[] args) throws Exception { Config.setClientMode(true); String createTableQuery = "CREATE TABLE ks_test.table_test (" + "pk1 int," + "ck1 int," + "ck2 int," + "PRIMARY KEY ((pk1), ck1, ck2)" + ");"; String insertQuery = "INSERT INTO ks_test.table_test(pk1, ck1, ck2) VALUES(?,?,?)"; CQLSSTableWriter writer = CQLSSTableWriter.builder() .inDirectory(File.createTempFile("sstdir", "-tmp")) .forTable(createTableQuery) .using(insertQuery) .build(); writer.close(); } } {code} Exception : {code} Exception in thread "main" java.lang.ExceptionInInitializerError at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:368) at org.apache.cassandra.db.Keyspace.(Keyspace.java:305) at org.apache.cassandra.db.Keyspace.open(Keyspace.java:129) at org.apache.cassandra.db.Keyspace.open(Keyspace.java:106) at 
org.apache.cassandra.db.Keyspace.openAndGetStore(Keyspace.java:159) at org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.hasSupportingIndex(PrimaryKeyRestrictionSet.java:156) at org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.(PrimaryKeyRestrictionSet.java:118) at org.apache.cassandra.cql3.restrictions.PrimaryKeyRestrictionSet.mergeWith(PrimaryKeyRestrictionSet.ja
[jira] [Issue Comment Deleted] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-11873: --- Comment: was deleted (was: I am -1 on this. I think those syntaxes are more complex than needed. In my opinion {{now() - 3d}} will be easily understood by everybody and I do not think that there is a need to have to write {{now() -3 day}}. In the case of CASSANDRA-11871 I found that the {{INTERVAL}} syntax is making the query much more verbose and less readable: {{GROUP BY floor(time, INTERVAL '1' HOUR)}}. ) > Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1528#comment-1528 ] Benjamin Lerer commented on CASSANDRA-11873: I am -1 on this. I think those syntaxes are more complex than needed. In my opinion {{now() - 3d}} will be easily understood by everybody and I do not think that there is a need to have to write {{now() -3 day}}. In the case of CASSANDRA-11871 I found that the {{INTERVAL}} syntax is making the query much more verbose and less readable: {{GROUP BY floor(time, INTERVAL '1' HOUR)}}. > Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11873) Add duration type
[ https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1527#comment-1527 ] Sylvain Lebresne commented on CASSANDRA-11873: -- Well,
# We're not reinventing the wheel, we're reusing [influxdb syntax|https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#time-syntax-in-queries]. Even besides that, calling a syntax like {{2h3m}} "reinventing the wheel" feels to me a bit of a stretch.
# If one bothers reading the [linked Postgres page|https://www.postgresql.org/docs/current/static/datatype-datetime.html#DATATYPE-INTERVAL-INPUT-EXAMPLES], he'll note that Postgres supports {{P2h3m}}, which is pretty damn close (it also supports {{2 hours 3 minutes}}, which I don't think is necessary but wouldn't mind supporting as an alternative to the shorter version). Surely, Postgres veterans are smart enough not to be thrown off by us dropping the {{P}} at the beginning.
# Regarding the Oracle syntax, I think it's terrible. The goal of this ticket is to add a user-friendly syntax for inputting durations, but imo {{now() - (INTERVAL '4 5:12' DAY TO MINUTE)}} (to mean {{now() - 4d5h12m}}) is verbose, unintuitive and plain ugly. And as far as I can tell, it's nowhere near standard (Postgres doesn't seem to support it, for instance). So I'm basically a strong PMC binding -1 on it.
Overall, we're not "making up completely new syntax". {{3h2m5s}} is pretty standard (as in, in life in general) and concise, and it's even supported by some other databases (influxdb and, up to a minor detail, Postgres). And I don't see any other syntax being a de-facto standard in other databases. 
> Add duration type > - > > Key: CASSANDRA-11873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11873 > Project: Cassandra > Issue Type: Improvement > Components: CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer > Labels: client-impacting, doc-impacting > Fix For: 3.x > > > For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like: > {{... WHERE reading_time < now() - 2h}}, we need to support some duration > type. > In my opinion, it should be represented internally as a number of > microseconds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
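The compact duration syntax debated above ({{2h3m}}, {{4d5h12m}}) is straightforward to turn into the microsecond count the ticket proposes as the internal representation. A hedged sketch follows; the unit set and validation rules are assumptions for illustration, not the eventual CQL grammar:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: parse an influxdb-style compact duration (e.g. "4d5h12m") into
// microseconds. Units d/h/m/s are an assumed subset for illustration.
public class DurationParse {
    private static final Pattern UNIT = Pattern.compile("(\\d+)([dhms])");

    static long toMicros(String duration) {
        Matcher m = UNIT.matcher(duration);
        long micros = 0;
        int consumed = 0;
        while (m.find()) {
            long n = Long.parseLong(m.group(1));
            switch (m.group(2)) {
                case "d": micros += n * 86_400_000_000L; break;
                case "h": micros += n * 3_600_000_000L;  break;
                case "m": micros += n * 60_000_000L;     break;
                case "s": micros += n * 1_000_000L;      break;
            }
            consumed += m.group().length();
        }
        // Reject strings with leftover characters ("2x3m", "3ms", ...).
        if (consumed != duration.length())
            throw new IllegalArgumentException("bad duration: " + duration);
        return micros;
    }

    public static void main(String[] args) {
        System.out.println(toMicros("2h3m")); // 7380000000
    }
}
```

A fixed microsecond count works for hours/minutes/seconds but note that calendar units (months, years) have no fixed length, which is one reason duration types in other databases keep them in separate fields.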