[jira] [Updated] (CASSANDRA-11443) Prevent (or warn) changing clustering order with ALTER TABLE when data already exists
[ https://issues.apache.org/jira/browse/CASSANDRA-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Ramirez updated CASSANDRA-11443:
--------------------------------------
    Description:
Inexperienced DBAs get caught out on certain schema changes thinking that Cassandra will automatically retrofit/convert the existing data on disk.

We should prevent users from changing the clustering order on existing tables or they will run into compaction/read issues such as (example from Cassandra 2.0.14):

{noformat}
ERROR [CompactionExecutor:6488] 2015-07-14 19:33:14,247 CassandraDaemon.java (line 258) Exception in thread Thread[CompactionExecutor:6488,1,main]
java.lang.AssertionError: Added column does not sort as the last column
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:116)
    at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:121)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:155)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:164)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

At the very least, we should report a warning advising users about possible problems when changing the clustering order if the table is not empty.

  was:
Inexperienced DBAs get caught out on certain schema changes thinking that Cassandra will automatically retrofit/convert the existing data on disk.

We should prevent users from changing the clustering order on existing tables or they will run into compaction/read issues such as:

{noformat}
ERROR [CompactionExecutor:6488] 2015-07-14 19:33:14,247 CassandraDaemon.java (line 258) Exception in thread Thread[CompactionExecutor:6488,1,main]
java.lang.AssertionError: Added column does not sort as the last column
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:116)
    at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:121)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:155)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:164)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
[jira] [Created] (CASSANDRA-11443) Prevent (or warn) changing clustering order with ALTER TABLE when data already exists
Erick Ramirez created CASSANDRA-11443:
-----------------------------------------

             Summary: Prevent (or warn) changing clustering order with ALTER TABLE when data already exists
                 Key: CASSANDRA-11443
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11443
             Project: Cassandra
          Issue Type: Improvement
          Components: Compaction, CQL
            Reporter: Erick Ramirez

Inexperienced DBAs get caught out on certain schema changes thinking that Cassandra will automatically retrofit/convert the existing data on disk.

We should prevent users from changing the clustering order on existing tables or they will run into compaction/read issues such as:

{noformat}
ERROR [CompactionExecutor:6488] 2015-07-14 19:33:14,247 CassandraDaemon.java (line 258) Exception in thread Thread[CompactionExecutor:6488,1,main]
java.lang.AssertionError: Added column does not sort as the last column
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:116)
    at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:121)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:155)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:186)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:85)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:196)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:74)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:55)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:164)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}

At the very least, we should report a warning advising users about possible problems when changing the clustering order if the table is not empty.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
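
The failure mode reported in CASSANDRA-11443 can be illustrated with a small, self-contained Java sketch. This is not Cassandra's code: `AppendOnlySortedColumns` is a hypothetical stand-in for a container that, like the one named in the stack trace, expects every appended entry to sort after the previous one under the table's current comparator. Data written under ascending order and replayed under a reversed comparator violates that expectation on the second append.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ClusteringOrderDemo {
    /** Hypothetical append-only container; an illustration, not Cassandra's class. */
    static final class AppendOnlySortedColumns {
        private final Comparator<Integer> comparator;
        private final List<Integer> columns = new ArrayList<>();

        AppendOnlySortedColumns(Comparator<Integer> comparator) {
            this.comparator = comparator;
        }

        /** Accepts a value only if it sorts strictly after the last one appended. */
        void addColumn(int name) {
            if (!columns.isEmpty()
                && comparator.compare(columns.get(columns.size() - 1), name) >= 0) {
                throw new IllegalStateException("Added column does not sort as the last column");
            }
            columns.add(name);
        }
    }

    /** Replays values in on-disk order; returns true if every append is accepted. */
    static boolean canReplay(List<Integer> onDisk, Comparator<Integer> comparator) {
        AppendOnlySortedColumns target = new AppendOnlySortedColumns(comparator);
        try {
            for (int value : onDisk) {
                target.addColumn(value);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Values as they would sit on disk when written with ascending order.
        List<Integer> writtenAscending = Arrays.asList(1, 2, 3);
        // Same comparator as at write time: replay succeeds.
        System.out.println(canReplay(writtenAscending, Comparator.<Integer>naturalOrder())); // true
        // Comparator flipped after the fact (as if the order were altered): replay fails.
        System.out.println(canReplay(writtenAscending, Comparator.<Integer>reverseOrder())); // false
    }
}
```

The sketch only models the ordering check; it says nothing about how a guard or warning in ALTER TABLE would be implemented.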
[jira] [Commented] (CASSANDRA-11420) Add the JMX metrics to track number of data flushed from memtable to disk
[ https://issues.apache.org/jira/browse/CASSANDRA-11420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213744#comment-15213744 ]

Dikang Gu commented on CASSANDRA-11420:
---------------------------------------

This is an interesting metric, as we can use it to calculate the write amplification of a C* node, using the formula:

wa = (compaction.BytesCompacted.Count) / (BytesFlushed.Count)

[~krummas] do you want to take a quick look?

> Add the JMX metrics to track number of data flushed from memtable to disk
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11420
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11420
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Dikang Gu
>            Assignee: Dikang Gu
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: 0001-Add-the-metrics-of-how-many-bytes-we-flushed-from-me.patch
>
>
> 2016-03-24_02:30:38.39936 INFO 02:30:38 Completed flushing /data/cassandra/data/keyspace/column-family/column-family-tmp-ka-295782-Data.db (73.266MiB) for commitlog position ReplayPosition(segmentId=1458717183630, position=3690)
> It would be useful to expose the number of flushed bytes to JMX, so that we can monitor how many bytes are written by application and flushed to disk.
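
As a rough illustration of the formula in the comment above, the following Java sketch computes the ratio from two counter values. It assumes the two counts have already been read from the node by some means (for example a JMX client); the class and method names are hypothetical, and the byte counts in `main` are made-up sample values.

```java
public class WriteAmplification {
    /**
     * wa = bytesCompacted / bytesFlushed, following the formula in the comment.
     * Returns NaN when nothing has been flushed yet, to avoid dividing by zero.
     */
    static double writeAmplification(long bytesCompacted, long bytesFlushed) {
        if (bytesFlushed <= 0) {
            return Double.NaN;
        }
        return (double) bytesCompacted / bytesFlushed;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 100 MiB flushed, 350 MiB rewritten by compaction.
        long bytesFlushed = 100L * 1024 * 1024;
        long bytesCompacted = 350L * 1024 * 1024;
        System.out.println("wa = " + writeAmplification(bytesCompacted, bytesFlushed));
    }
}
```

Because both counters are cumulative, a monitoring setup would more likely compute the ratio over deltas between two samples; that choice is outside what the comment specifies.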
[jira] [Updated] (CASSANDRA-11420) Add the JMX metrics to track number of data flushed from memtable to disk
[ https://issues.apache.org/jira/browse/CASSANDRA-11420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dikang Gu updated CASSANDRA-11420:
----------------------------------
    Reviewer: Marcus Eriksson

> Add the JMX metrics to track number of data flushed from memtable to disk
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11420
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11420
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Dikang Gu
>            Assignee: Dikang Gu
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: 0001-Add-the-metrics-of-how-many-bytes-we-flushed-from-me.patch
>
>
> 2016-03-24_02:30:38.39936 INFO 02:30:38 Completed flushing /data/cassandra/data/keyspace/column-family/column-family-tmp-ka-295782-Data.db (73.266MiB) for commitlog position ReplayPosition(segmentId=1458717183630, position=3690)
> It would be useful to expose the number of flushed bytes to JMX, so that we can monitor how many bytes are written by application and flushed to disk.
[jira] [Updated] (CASSANDRA-11420) Add the JMX metrics to track number of data flushed from memtable to disk
[ https://issues.apache.org/jira/browse/CASSANDRA-11420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dikang Gu updated CASSANDRA-11420:
----------------------------------
    Reviewer:   (was: Aleksey Yeschenko)

> Add the JMX metrics to track number of data flushed from memtable to disk
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11420
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11420
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Dikang Gu
>            Assignee: Dikang Gu
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: 0001-Add-the-metrics-of-how-many-bytes-we-flushed-from-me.patch
>
>
> 2016-03-24_02:30:38.39936 INFO 02:30:38 Completed flushing /data/cassandra/data/keyspace/column-family/column-family-tmp-ka-295782-Data.db (73.266MiB) for commitlog position ReplayPosition(segmentId=1458717183630, position=3690)
> It would be useful to expose the number of flushed bytes to JMX, so that we can monitor how many bytes are written by application and flushed to disk.
[jira] [Commented] (CASSANDRA-11106) Experiment with strategies for picking compaction candidates in LCS
[ https://issues.apache.org/jira/browse/CASSANDRA-11106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213739#comment-15213739 ]

Dikang Gu commented on CASSANDRA-11106:
---------------------------------------

[~krummas], great, yes, this is something I'm going to work on!

> Experiment with strategies for picking compaction candidates in LCS
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-11106
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11106
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Dikang Gu
>              Labels: lcs
>             Fix For: 3.x
>
>
> Ideas taken here: http://rocksdb.org/blog/2921/compaction_pri/
> Current strategy in LCS is that we keep track of the token that was last compacted and then we start a compaction with the sstable containing the next token (kOldestSmallestSeqFirst in the blog post above)
> The rocksdb blog post above introduces a few ideas how this could be improved:
> * pick the 'coldest' sstable (sstable with the oldest max timestamp) - we want to keep the hot data (recently updated) in the lower levels to avoid write amplification
> * pick the sstable with the highest tombstone ratio, we want to get tombstones to the top level as quickly as possible.
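
The two selection ideas quoted from the ticket can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `SSTableInfo` is a hypothetical value class holding the two properties the ticket mentions (max timestamp and tombstone ratio), not Cassandra's sstable reader, and the sample values are invented.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CompactionCandidatePicker {
    /** Hypothetical per-sstable summary; field names are assumptions for this sketch. */
    static final class SSTableInfo {
        final String name;
        final long maxTimestamp;     // newest write timestamp contained in the sstable
        final double tombstoneRatio; // fraction of entries that are tombstones

        SSTableInfo(String name, long maxTimestamp, double tombstoneRatio) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
            this.tombstoneRatio = tombstoneRatio;
        }
    }

    /** "Coldest" sstable: the one whose max timestamp is oldest. Null if the level is empty. */
    static SSTableInfo pickColdest(List<SSTableInfo> level) {
        return level.stream()
                    .min(Comparator.comparingLong((SSTableInfo s) -> s.maxTimestamp))
                    .orElse(null);
    }

    /** Sstable with the highest tombstone ratio. Null if the level is empty. */
    static SSTableInfo pickMostTombstones(List<SSTableInfo> level) {
        return level.stream()
                    .max(Comparator.comparingDouble((SSTableInfo s) -> s.tombstoneRatio))
                    .orElse(null);
    }

    public static void main(String[] args) {
        List<SSTableInfo> level = Arrays.asList(
            new SSTableInfo("a", 3000L, 0.05),
            new SSTableInfo("b", 1000L, 0.10),
            new SSTableInfo("c", 2000L, 0.40));
        System.out.println(pickColdest(level).name);        // b
        System.out.println(pickMostTombstones(level).name); // c
    }
}
```

How the two criteria would be weighed against each other, or against the current next-token strategy, is left open by the ticket and is not modeled here.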
[jira] [Assigned] (CASSANDRA-11106) Experiment with strategies for picking compaction candidates in LCS
[ https://issues.apache.org/jira/browse/CASSANDRA-11106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dikang Gu reassigned CASSANDRA-11106:
-------------------------------------

    Assignee: Dikang Gu

> Experiment with strategies for picking compaction candidates in LCS
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-11106
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11106
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Dikang Gu
>              Labels: lcs
>             Fix For: 3.x
>
>
> Ideas taken here: http://rocksdb.org/blog/2921/compaction_pri/
> Current strategy in LCS is that we keep track of the token that was last compacted and then we start a compaction with the sstable containing the next token (kOldestSmallestSeqFirst in the blog post above)
> The rocksdb blog post above introduces a few ideas how this could be improved:
> * pick the 'coldest' sstable (sstable with the oldest max timestamp) - we want to keep the hot data (recently updated) in the lower levels to avoid write amplification
> * pick the sstable with the highest tombstone ratio, we want to get tombstones to the top level as quickly as possible.
[jira] [Updated] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-11383:
----------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Merged.

> Avoid index segment stitching in RAM which lead to OOM on big SSTable files
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11383
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11383
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL
>         Environment: C* 3.4
>            Reporter: DOAN DuyHai
>            Assignee: Jordan West
>              Labels: sasi
>             Fix For: 3.5
>
>         Attachments: CASSANDRA-11383.patch, SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom
>
>
> 13 bare metal machines
> - 6 cores CPU (12 HT)
> - 64Gb RAM
> - 4 SSD in RAID0
> JVM settings:
> - G1 GC
> - Xms32G, Xmx32G
> Data set:
> - ≈ 100Gb/per node
> - 1.3 Tb cluster-wide
> - ≈ 20Gb for all SASI indices
> C* settings:
> - concurrent_compactors: 1
> - compaction_throughput_mb_per_sec: 256
> - memtable_heap_space_in_mb: 2048
> - memtable_offheap_space_in_mb: 2048
> I created 9 SASI indices
> - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, case-insensitive
> - 1 index with numeric field, SPARSE mode
> After a while, the nodes just went OOM.
> I attach log files. You can see a lot of GC happening while index segments are flushed to disk. At some point the node OOMs ...
> /cc [~xedin]
[1/3] cassandra git commit: Avoid index segment stitching in RAM which lead to OOM on big SSTable files
Repository: cassandra
Updated Branches:
  refs/heads/trunk 494386de8 -> b6ff7f6c0

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5c4d5c73/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
----------------------------------------------------------------------
diff --git a/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java b/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
index 4663692..39a0fbc 100644
--- a/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
+++ b/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
@@ -20,14 +20,18 @@ package org.apache.cassandra.index.sasi.disk;
 import java.io.File;
 import java.nio.ByteBuffer;
 import java.util.*;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ThreadLocalRandom;
 
 import org.apache.cassandra.SchemaLoader;
+import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.config.ColumnDefinition;
 import org.apache.cassandra.db.Clustering;
 import org.apache.cassandra.db.ColumnFamilyStore;
 import org.apache.cassandra.db.DecoratedKey;
 import org.apache.cassandra.db.Keyspace;
 import org.apache.cassandra.db.compaction.OperationType;
+import org.apache.cassandra.db.marshal.LongType;
 import org.apache.cassandra.db.rows.BTreeRow;
 import org.apache.cassandra.db.rows.BufferCell;
 import org.apache.cassandra.db.rows.Row;
@@ -36,6 +40,7 @@ import org.apache.cassandra.index.sasi.utils.RangeIterator;
 import org.apache.cassandra.db.marshal.Int32Type;
 import org.apache.cassandra.db.marshal.UTF8Type;
 import org.apache.cassandra.exceptions.ConfigurationException;
+import org.apache.cassandra.io.FSError;
 import org.apache.cassandra.io.sstable.Descriptor;
 import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.schema.KeyspaceMetadata;
@@ -158,4 +163,89 @@ public class PerSSTableIndexWriterTest extends SchemaLoader
         FileUtils.closeQuietly(index);
     }
+
+    @Test
+    public void testSparse() throws Exception
+    {
+        final String columnName = "timestamp";
+
+        ColumnFamilyStore cfs = Keyspace.open(KS_NAME).getColumnFamilyStore(CF_NAME);
+        ColumnDefinition column = cfs.metadata.getColumnDefinition(UTF8Type.instance.decompose(columnName));
+
+        SASIIndex sasi = (SASIIndex) cfs.indexManager.getIndexByName(columnName);
+
+        File directory = cfs.getDirectories().getDirectoryForNewSSTables();
+        Descriptor descriptor = Descriptor.fromFilename(cfs.getSSTablePath(directory));
+        PerSSTableIndexWriter indexWriter = (PerSSTableIndexWriter) sasi.getFlushObserver(descriptor, OperationType.FLUSH);
+
+        final long now = System.currentTimeMillis();
+
+        indexWriter.begin();
+        indexWriter.indexes.put(column, indexWriter.newIndex(sasi.getIndex()));
+
+        populateSegment(cfs.metadata, indexWriter.getIndex(column), new HashMap<Long, Set<Integer>>()
+        {{
+            put(now, new HashSet<>(Arrays.asList(0, 1)));
+            put(now + 1, new HashSet<>(Arrays.asList(2, 3)));
+            put(now + 2, new HashSet<>(Arrays.asList(4, 5, 6, 7, 8, 9)));
+        }});
+
+        Callable<OnDiskIndex> segmentBuilder = indexWriter.getIndex(column).scheduleSegmentFlush(false);
+
+        Assert.assertNull(segmentBuilder.call());
+
+        PerSSTableIndexWriter.Index index = indexWriter.getIndex(column);
+        Random random = ThreadLocalRandom.current();
+
+        Set<String> segments = new HashSet<>();
+        // now let's test multiple correct segments with yield incorrect final segment
+        for (int i = 0; i < 3; i++)
+        {
+            populateSegment(cfs.metadata, index, new HashMap<Long, Set<Integer>>()
+            {{
+                put(now, new HashSet<>(Arrays.asList(random.nextInt(), random.nextInt(), random.nextInt())));
+                put(now + 1, new HashSet<>(Arrays.asList(random.nextInt(), random.nextInt(), random.nextInt())));
+                put(now + 2, new HashSet<>(Arrays.asList(random.nextInt(), random.nextInt(), random.nextInt())));
+            }});
+
+            try
+            {
+                // flush each of the new segments, they should all succeed
+                OnDiskIndex segment = index.scheduleSegmentFlush(false).call();
+                index.segments.add(Futures.immediateFuture(segment));
+                segments.add(segment.getIndexPath());
+            }
+            catch (Exception | FSError e)
+            {
+                e.printStackTrace();
+                Assert.fail();
+            }
+        }
+
+        // make sure that all of the segments are present of the filesystem
+        for (String segment : segments)
+            Assert.assertTrue(new File(segment).exists());
+
+        indexWriter.complete();
+
+        // make sure that individual segments have been cleaned up
+        for (String segment : segments)
+            Assert.assertFalse(new
[3/3] cassandra git commit: Merge branch 'cassandra-3.5' into trunk
Merge branch 'cassandra-3.5' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b6ff7f6c
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b6ff7f6c
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b6ff7f6c

Branch: refs/heads/trunk
Commit: b6ff7f6c0d2849676efe05422e7c0d5bf0949a3e
Parents: 494386d 5c4d5c7
Author: Pavel Yaskevich
Authored: Sun Mar 27 15:24:29 2016 -0700
Committer: Pavel Yaskevich
Committed: Sun Mar 27 15:24:29 2016 -0700

----------------------------------------------------------------------
 CHANGES.txt | 1 +
 .../sasi/disk/AbstractTokenTreeBuilder.java | 672
 .../sasi/disk/DynamicTokenTreeBuilder.java | 189 +
 .../index/sasi/disk/OnDiskIndexBuilder.java | 52 +-
 .../index/sasi/disk/PerSSTableIndexWriter.java | 38 +-
 .../index/sasi/disk/StaticTokenTreeBuilder.java | 266 ++
 .../apache/cassandra/index/sasi/disk/Token.java | 5 +
 .../cassandra/index/sasi/disk/TokenTree.java | 6 +-
 .../index/sasi/disk/TokenTreeBuilder.java | 805 +--
 .../index/sasi/memory/KeyRangeIterator.java | 11 +
 .../cassandra/index/sasi/sa/SuffixSA.java | 7 +-
 .../index/sasi/utils/CombinedTerm.java | 47 +-
 .../index/sasi/disk/OnDiskIndexTest.java | 20 +-
 .../sasi/disk/PerSSTableIndexWriterTest.java | 90 +++
 .../index/sasi/disk/TokenTreeTest.java | 217 +++--
 .../index/sasi/utils/LongIterator.java | 8 +
 16 files changed, 1482 insertions(+), 952 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b6ff7f6c/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index b1b9034,2907df9..1a548d7
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,26 -1,5 +1,27 @@@
 +3.6
 + * Add auto import java.util for UDF code block (CASSANDRA-11392)
 + * Add --hex-format option to nodetool getsstables (CASSANDRA-11337)
 + * sstablemetadata should print sstable min/max token (CASSANDRA-7159)
 + * Do not wrap CassandraException in TriggerExecutor (CASSANDRA-9421)
 + * COPY TO should have higher double precision (CASSANDRA-11255)
 + * Stress should exit with non-zero status after failure (CASSANDRA-10340)
 + * Add client to cqlsh SHOW_SESSION (CASSANDRA-8958)
 + * Fix nodetool tablestats keyspace level metrics (CASSANDRA-11226)
 + * Store repair options in parent_repair_history (CASSANDRA-11244)
 + * Print current leveling in sstableofflinerelevel (CASSANDRA-9588)
 + * Change repair message for keyspaces with RF 1 (CASSANDRA-11203)
 + * Remove hard-coded SSL cipher suites and protocols (CASSANDRA-10508)
 + * Improve concurrency in CompactionStrategyManager (CASSANDRA-10099)
 + * (cqlsh) interpret CQL type for formatting blobs (CASSANDRA-11274)
 + * Refuse to start and print txn log information in case of disk
 +   corruption (CASSANDRA-10112)
 + * Resolve some eclipse-warnings (CASSANDRA-11086)
 + * (cqlsh) Show static columns in a different color (CASSANDRA-11059)
 + * Allow to remove TTLs on table with default_time_to_live (CASSANDRA-11207)
 +
 +
  3.5
+  * Avoid index segment stitching in RAM which lead to OOM on big SSTable files (CASSANDRA-11383)
   * Fix clustering and row filters for LIKE queries on clustering columns (CASSANDRA-11397)
  Merged from 3.0:
   * Enable SO_REUSEADDR for JMX RMI server sockets (CASSANDRA-11093)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b6ff7f6c/src/java/org/apache/cassandra/index/sasi/disk/OnDiskIndexBuilder.java
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b6ff7f6c/src/java/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriter.java
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b6ff7f6c/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
----------------------------------------------------------------------
[2/3] cassandra git commit: Avoid index segment stitching in RAM which lead to OOM on big SSTable files
Avoid index segment stitching in RAM which lead to OOM on big SSTable files

patch by jrwest and xedin; reviewed by xedin for CASSANDRA-11383

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5c4d5c73
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5c4d5c73
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5c4d5c73

Branch: refs/heads/trunk
Commit: 5c4d5c731f1299ba310c81603914a1a8956e644c
Parents: f6c5d72
Author: Jordan West
Authored: Mon Mar 21 12:00:31 2016 -0700
Committer: Pavel Yaskevich
Committed: Sun Mar 27 15:21:16 2016 -0700

----------------------------------------------------------------------
 CHANGES.txt | 1 +
 .../sasi/disk/AbstractTokenTreeBuilder.java | 672
 .../sasi/disk/DynamicTokenTreeBuilder.java | 189 +
 .../index/sasi/disk/OnDiskIndexBuilder.java | 52 +-
 .../index/sasi/disk/PerSSTableIndexWriter.java | 37 +-
 .../index/sasi/disk/StaticTokenTreeBuilder.java | 266 ++
 .../apache/cassandra/index/sasi/disk/Token.java | 5 +
 .../cassandra/index/sasi/disk/TokenTree.java | 6 +-
 .../index/sasi/disk/TokenTreeBuilder.java | 805 +--
 .../index/sasi/memory/KeyRangeIterator.java | 11 +
 .../cassandra/index/sasi/sa/SuffixSA.java | 7 +-
 .../index/sasi/utils/CombinedTerm.java | 46 +-
 .../index/sasi/disk/OnDiskIndexTest.java | 20 +-
 .../sasi/disk/PerSSTableIndexWriterTest.java | 90 +++
 .../index/sasi/disk/TokenTreeTest.java | 217 +++--
 .../index/sasi/utils/LongIterator.java | 8 +
 16 files changed, 1482 insertions(+), 950 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5c4d5c73/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index f86c91f..2907df9 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.5
+ * Avoid index segment stitching in RAM which lead to OOM on big SSTable files (CASSANDRA-11383)
  * Fix clustering and row filters for LIKE queries on clustering columns (CASSANDRA-11397)
 Merged from 3.0:
  * Enable SO_REUSEADDR for JMX RMI server sockets (CASSANDRA-11093)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5c4d5c73/src/java/org/apache/cassandra/index/sasi/disk/AbstractTokenTreeBuilder.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/index/sasi/disk/AbstractTokenTreeBuilder.java b/src/java/org/apache/cassandra/index/sasi/disk/AbstractTokenTreeBuilder.java
new file mode 100644
index 000..4e93b2b
--- /dev/null
+++ b/src/java/org/apache/cassandra/index/sasi/disk/AbstractTokenTreeBuilder.java
@@ -0,0 +1,672 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.index.sasi.disk;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+
+import org.apache.cassandra.io.util.DataOutputPlus;
+import org.apache.cassandra.utils.AbstractIterator;
+import org.apache.cassandra.utils.FBUtilities;
+import org.apache.cassandra.utils.Pair;
+
+import com.carrotsearch.hppc.LongArrayList;
+import com.carrotsearch.hppc.LongSet;
+import com.carrotsearch.hppc.cursors.LongCursor;
+
+public abstract class AbstractTokenTreeBuilder implements TokenTreeBuilder
+{
+    protected int numBlocks;
+    protected Node root;
+    protected InteriorNode rightmostParent;
+    protected Leaf leftmostLeaf;
+    protected Leaf rightmostLeaf;
+    protected long tokenCount = 0;
+    protected long treeMinToken;
+    protected long treeMaxToken;
+
+    public void add(TokenTreeBuilder other)
+    {
+        add(other.iterator());
+    }
+
+    public TokenTreeBuilder finish()
+    {
+        if (root == null)
+            constructTree();
+
+        return this;
+    }
+
+    public long getTokenCount()
+    {
+        return tokenCount;
+    }
+
+    public int serializedSize()
+    {
+        if
[1/2] cassandra git commit: Avoid index segment stitching in RAM which lead to OOM on big SSTable files
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.5 f6c5d7298 -> 5c4d5c731

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5c4d5c73/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
----------------------------------------------------------------------
diff --git a/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java b/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
index 4663692..39a0fbc 100644
--- a/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
+++ b/test/unit/org/apache/cassandra/index/sasi/disk/PerSSTableIndexWriterTest.java
@@ -20,14 +20,18 @@ package org.apache.cassandra.index.sasi.disk;
 import java.io.File;
 import java.nio.ByteBuffer;
 import java.util.*;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ThreadLocalRandom;
 
 import org.apache.cassandra.SchemaLoader;
+import org.apache.cassandra.config.CFMetaData;
 import org.apache.cassandra.config.ColumnDefinition;
 import org.apache.cassandra.db.Clustering;
 import org.apache.cassandra.db.ColumnFamilyStore;
 import org.apache.cassandra.db.DecoratedKey;
 import org.apache.cassandra.db.Keyspace;
 import org.apache.cassandra.db.compaction.OperationType;
+import org.apache.cassandra.db.marshal.LongType;
 import org.apache.cassandra.db.rows.BTreeRow;
 import org.apache.cassandra.db.rows.BufferCell;
 import org.apache.cassandra.db.rows.Row;
@@ -36,6 +40,7 @@ import org.apache.cassandra.index.sasi.utils.RangeIterator;
 import org.apache.cassandra.db.marshal.Int32Type;
 import org.apache.cassandra.db.marshal.UTF8Type;
 import org.apache.cassandra.exceptions.ConfigurationException;
+import org.apache.cassandra.io.FSError;
 import org.apache.cassandra.io.sstable.Descriptor;
 import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.schema.KeyspaceMetadata;
@@ -158,4 +163,89 @@ public class PerSSTableIndexWriterTest extends SchemaLoader
         FileUtils.closeQuietly(index);
     }
+
+    @Test
+    public void testSparse() throws Exception
+    {
+        final String columnName = "timestamp";
+
+        ColumnFamilyStore cfs = Keyspace.open(KS_NAME).getColumnFamilyStore(CF_NAME);
+        ColumnDefinition column = cfs.metadata.getColumnDefinition(UTF8Type.instance.decompose(columnName));
+
+        SASIIndex sasi = (SASIIndex) cfs.indexManager.getIndexByName(columnName);
+
+        File directory = cfs.getDirectories().getDirectoryForNewSSTables();
+        Descriptor descriptor = Descriptor.fromFilename(cfs.getSSTablePath(directory));
+        PerSSTableIndexWriter indexWriter = (PerSSTableIndexWriter) sasi.getFlushObserver(descriptor, OperationType.FLUSH);
+
+        final long now = System.currentTimeMillis();
+
+        indexWriter.begin();
+        indexWriter.indexes.put(column, indexWriter.newIndex(sasi.getIndex()));
+
+        populateSegment(cfs.metadata, indexWriter.getIndex(column), new HashMap<Long, Set<Integer>>()
+        {{
+            put(now, new HashSet<>(Arrays.asList(0, 1)));
+            put(now + 1, new HashSet<>(Arrays.asList(2, 3)));
+            put(now + 2, new HashSet<>(Arrays.asList(4, 5, 6, 7, 8, 9)));
+        }});
+
+        Callable<OnDiskIndex> segmentBuilder = indexWriter.getIndex(column).scheduleSegmentFlush(false);
+
+        Assert.assertNull(segmentBuilder.call());
+
+        PerSSTableIndexWriter.Index index = indexWriter.getIndex(column);
+        Random random = ThreadLocalRandom.current();
+
+        Set<String> segments = new HashSet<>();
+        // now let's test multiple correct segments with yield incorrect final segment
+        for (int i = 0; i < 3; i++)
+        {
+            populateSegment(cfs.metadata, index, new HashMap<Long, Set<Integer>>()
+            {{
+                put(now, new HashSet<>(Arrays.asList(random.nextInt(), random.nextInt(), random.nextInt())));
+                put(now + 1, new HashSet<>(Arrays.asList(random.nextInt(), random.nextInt(), random.nextInt())));
+                put(now + 2, new HashSet<>(Arrays.asList(random.nextInt(), random.nextInt(), random.nextInt())));
+            }});
+
+            try
+            {
+                // flush each of the new segments, they should all succeed
+                OnDiskIndex segment = index.scheduleSegmentFlush(false).call();
+                index.segments.add(Futures.immediateFuture(segment));
+                segments.add(segment.getIndexPath());
+            }
+            catch (Exception | FSError e)
+            {
+                e.printStackTrace();
+                Assert.fail();
+            }
+        }
+
+        // make sure that all of the segments are present of the filesystem
+        for (String segment : segments)
+            Assert.assertTrue(new File(segment).exists());
+
+        indexWriter.complete();
+
+        // make sure that individual segments have been cleaned up
+        for (String segment : segments)
[2/2] cassandra git commit: Avoid index segment stitching in RAM which lead to OOM on big SSTable files
Avoid index segment stitching in RAM which lead to OOM on big SSTable files

patch by jrwest and xedin; reviewed by xedin for CASSANDRA-11383

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5c4d5c73
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5c4d5c73
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5c4d5c73
Branch: refs/heads/cassandra-3.5
Commit: 5c4d5c731f1299ba310c81603914a1a8956e644c
Parents: f6c5d72
Author: Jordan West
Authored: Mon Mar 21 12:00:31 2016 -0700
Committer: Pavel Yaskevich
Committed: Sun Mar 27 15:21:16 2016 -0700

--
 CHANGES.txt | 1 +
 .../sasi/disk/AbstractTokenTreeBuilder.java | 672
 .../sasi/disk/DynamicTokenTreeBuilder.java | 189 +
 .../index/sasi/disk/OnDiskIndexBuilder.java | 52 +-
 .../index/sasi/disk/PerSSTableIndexWriter.java | 37 +-
 .../index/sasi/disk/StaticTokenTreeBuilder.java | 266 ++
 .../apache/cassandra/index/sasi/disk/Token.java | 5 +
 .../cassandra/index/sasi/disk/TokenTree.java | 6 +-
 .../index/sasi/disk/TokenTreeBuilder.java | 805 +--
 .../index/sasi/memory/KeyRangeIterator.java | 11 +
 .../cassandra/index/sasi/sa/SuffixSA.java | 7 +-
 .../index/sasi/utils/CombinedTerm.java | 46 +-
 .../index/sasi/disk/OnDiskIndexTest.java | 20 +-
 .../sasi/disk/PerSSTableIndexWriterTest.java | 90 +++
 .../index/sasi/disk/TokenTreeTest.java | 217 +++--
 .../index/sasi/utils/LongIterator.java | 8 +
 16 files changed, 1482 insertions(+), 950 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5c4d5c73/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index f86c91f..2907df9 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.5
+ * Avoid index segment stitching in RAM which lead to OOM on big SSTable files (CASSANDRA-11383)
 * Fix clustering and row filters for LIKE queries on clustering columns (CASSANDRA-11397)
 Merged from 3.0:
 * Enable SO_REUSEADDR for JMX RMI server sockets (CASSANDRA-11093)
http://git-wip-us.apache.org/repos/asf/cassandra/blob/5c4d5c73/src/java/org/apache/cassandra/index/sasi/disk/AbstractTokenTreeBuilder.java -- diff --git a/src/java/org/apache/cassandra/index/sasi/disk/AbstractTokenTreeBuilder.java b/src/java/org/apache/cassandra/index/sasi/disk/AbstractTokenTreeBuilder.java new file mode 100644 index 000..4e93b2b --- /dev/null +++ b/src/java/org/apache/cassandra/index/sasi/disk/AbstractTokenTreeBuilder.java @@ -0,0 +1,672 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.cassandra.index.sasi.disk; + +import java.io.IOException; +import java.nio.ByteBuffer; +import java.util.ArrayList; +import java.util.Iterator; +import java.util.List; + +import org.apache.cassandra.io.util.DataOutputPlus; +import org.apache.cassandra.utils.AbstractIterator; +import org.apache.cassandra.utils.FBUtilities; +import org.apache.cassandra.utils.Pair; + +import com.carrotsearch.hppc.LongArrayList; +import com.carrotsearch.hppc.LongSet; +import com.carrotsearch.hppc.cursors.LongCursor; + +public abstract class AbstractTokenTreeBuilder implements TokenTreeBuilder +{ +protected int numBlocks; +protected Node root; +protected InteriorNode rightmostParent; +protected Leaf leftmostLeaf; +protected Leaf rightmostLeaf; +protected long tokenCount = 0; +protected long treeMinToken; +protected long treeMaxToken; + +public void add(TokenTreeBuilder other) +{ +add(other.iterator()); +} + +public TokenTreeBuilder finish() +{ +if (root == null) +constructTree(); + +return this; +} + +public long getTokenCount() +{ +return tokenCount; +} + +public int serializedSize() +{ +
[jira] [Updated] (CASSANDRA-11389) Case sensitive in LIKE query althogh index created with false
[ https://issues.apache.org/jira/browse/CASSANDRA-11389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-11389: - Fix Version/s: (was: 3.4) 3.x > Case sensitive in LIKE query althogh index created with false > - > > Key: CASSANDRA-11389 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11389 > Project: Cassandra > Issue Type: Bug > Components: sasi >Reporter: Alon Levi >Priority: Minor > Labels: sasi > Fix For: 3.x > > > I created an index on user's first name as following: > CREATE CUSTOM INDEX ON users (first_name) USING > 'org.apache.cassandra.index.sasi.SASIIndex' > with options = { > 'mode' : 'CONTAINS', > 'case_sensitive' : 'false' > }; > This is the data I have in my table > user_id | first_name > | last_name > ---+---+--- > daa312ae-ecdf-4eb4-b6e9-206e33e5ca24 | Shlomo | Cohen > ab38ce9d-2823-4e6a-994f-7783953baef1 | Elad | Karakuli > 5e8371a7-3ed9-479f-9e4b-e4a07c750b12 | Alon | Levi > ae85cdc0-5eb7-4f08-8e42-2abd89e327ed | Gil | Elias > Although i mentioned the option 'case_sensitive' : 'false' > when I run this query : > select user_id, first_name from users where first_name LIKE '%shl%'; > The query returns no results. > However, when I run this query : > select user_id, first_name from users where first_name LIKE '%Shl%'; > The query returns the right results, > and the strangest thing is when I run this query: > select user_id, first_name from users where first_name LIKE 'shl%'; > suddenly the query is no more case sensitive and the results are fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
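The behaviour the reporter expects from {{'case_sensitive' : 'false'}} can be sketched in plain Java. This is only an illustration of the expected semantics, assuming that both the indexed value and the LIKE fragment are normalised to the same case before comparison; the class and method names are hypothetical and are not part of SASI.

```java
import java.util.Locale;

// Hypothetical helper, not SASI code: shows the matching semantics the reporter expects.
final class CaseInsensitiveLike
{
    // LIKE '%fragment%' with case_sensitive = false: compare after lower-casing both sides.
    static boolean containsIgnoreCase(String value, String fragment)
    {
        return value.toLowerCase(Locale.ROOT).contains(fragment.toLowerCase(Locale.ROOT));
    }

    // LIKE 'fragment%' with case_sensitive = false.
    static boolean startsWithIgnoreCase(String value, String fragment)
    {
        return value.toLowerCase(Locale.ROOT).startsWith(fragment.toLowerCase(Locale.ROOT));
    }
}
```

Under these assumptions, {{containsIgnoreCase("Shlomo", "shl")}} and {{startsWithIgnoreCase("Shlomo", "shl")}} would both match, which is the consistency between '%shl%' and 'shl%' that the report asks for.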
[jira] [Resolved] (CASSANDRA-11271) UDFs don't compile in embedded C*
[ https://issues.apache.org/jira/browse/CASSANDRA-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp resolved CASSANDRA-11271. -- Resolution: Won't Fix Fix Version/s: (was: 3.0.x) Embedding app should handle this -> resolving as won't fix. > UDFs don't compile in embedded C* > - > > Key: CASSANDRA-11271 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11271 > Project: Cassandra > Issue Type: Bug >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Minor > > In an embedded C* instance UDFs may not compile due to class loader clashes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213613#comment-15213613 ] Pavel Yaskevich commented on CASSANDRA-11383: - [~doanduyhai] Thanks! I will be merging today, testall/dtest were run without problems as well. > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213516#comment-15213516 ] DOAN DuyHai commented on CASSANDRA-11383: - *Test case 2*: after installing the latest fix, building {{SPARSE}} numerical index is now working without OOM and without Gossip issue > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11383) Avoid index segment stitching in RAM which lead to OOM on big SSTable files
[ https://issues.apache.org/jira/browse/CASSANDRA-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213433#comment-15213433 ] DOAN DuyHai commented on CASSANDRA-11383: - *Test case 3*: Insert 3.2 billion rows in an empty table with a {{DENSE}} numerical index created before the insert. The insert was successful without OOM. The insert took 9h30 > Avoid index segment stitching in RAM which lead to OOM on big SSTable files > > > Key: CASSANDRA-11383 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11383 > Project: Cassandra > Issue Type: Bug > Components: CQL > Environment: C* 3.4 >Reporter: DOAN DuyHai >Assignee: Jordan West > Labels: sasi > Fix For: 3.5 > > Attachments: CASSANDRA-11383.patch, > SASI_Index_build_LCS_1G_Max_SSTable_Size_logs.tar.gz, > new_system_log_CMS_8GB_OOM.log, system.log_sasi_build_oom > > > 13 bare metal machines > - 6 cores CPU (12 HT) > - 64Gb RAM > - 4 SSD in RAID0 > JVM settings: > - G1 GC > - Xms32G, Xmx32G > Data set: > - ≈ 100Gb/per node > - 1.3 Tb cluster-wide > - ≈ 20Gb for all SASI indices > C* settings: > - concurrent_compactors: 1 > - compaction_throughput_mb_per_sec: 256 > - memtable_heap_space_in_mb: 2048 > - memtable_offheap_space_in_mb: 2048 > I created 9 SASI indices > - 8 indices with text field, NonTokenizingAnalyser, PREFIX mode, > case-insensitive > - 1 index with numeric field, SPARSE mode > After a while, the nodes just gone OOM. > I attach log files. You can see a lot of GC happening while index segments > are flush to disk. At some point the node OOM ... > /cc [~xedin] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11369) Re-use IndexedEntry / IndexInfo after 11206
[ https://issues.apache.org/jira/browse/CASSANDRA-11369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213416#comment-15213416 ] Robert Stupp commented on CASSANDRA-11369: -- Unlike the initial approach, CASSANDRA-11206 now distinguishes between RIEs that keep {{IndexInfo}} objects on-heap and those that are too big and are never fully materialized on heap. In other words: RIEs with index samples and a serialized size up to {{column_index_cache_size_in_kb}} behave as before 11206. > Re-use IndexedEntry / IndexInfo after 11206 > --- > > Key: CASSANDRA-11369 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11369 > Project: Cassandra > Issue Type: Improvement >Reporter: Robert Stupp >Assignee: Stefania >Priority: Minor > Fix For: 3.x > > > CASSANDRA-8180 introduced > {{UnfilteredRowIteratorWithLowerBound#getPartitionIndexLowerBound}} which > relies on {{IndexInfo}} being present in the key-cache. But CASSANDRA-11206 > will remove the presence of {{IndexInfo}} objects in-memory. > This ticket is about re-adding that functionality if feasible. > Currently tagged with fix-version 3.x, if it is possible to do that without > changing summary or index disk structures - but maybe it's necessary to > change the on-disk structure, which means we can do this in 4.x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
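The distinction drawn in the comment above can be sketched as a simple size threshold. This is a minimal illustration, assuming the decision is a comparison of the entry's serialized index size against {{column_index_cache_size_in_kb}} converted to bytes; the class and method names are hypothetical and do not come from the CASSANDRA-11206 patch.

```java
// Hypothetical sketch, not code from the patch: decides whether the IndexInfo
// objects of a row index entry (RIE) stay on heap or are read from disk on demand.
final class IndexCachePolicy
{
    private final long thresholdBytes;

    // columnIndexCacheSizeInKb mirrors the cassandra.yaml option of the same name.
    IndexCachePolicy(int columnIndexCacheSizeInKb)
    {
        this.thresholdBytes = columnIndexCacheSizeInKb * 1024L;
    }

    // Entries whose serialized index is "up to" the threshold behave as before 11206.
    boolean keepOnHeap(long serializedIndexSizeBytes)
    {
        return serializedIndexSizeBytes <= thresholdBytes;
    }
}
```

With a threshold of 0, any entry that actually carries index samples has a non-zero serialized size and would therefore never be kept on heap in this sketch.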
[jira] [Updated] (CASSANDRA-11206) Support large partitions on the 3.0 sstable format
[ https://issues.apache.org/jira/browse/CASSANDRA-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-11206: - Status: Patch Available (was: In Progress) Pushed the latest version to the [git branch|https://github.com/apache/cassandra/compare/trunk...snazy:11206-large-part-trunk?expand=1]. CI results ([testall|http://cassci.datastax.com/job/snazy-11206-large-part-trunk-testall/lastCompletedBuild/testReport/], [dtest|http://cassci.datastax.com/job/snazy-11206-large-part-trunk-dtest/lastCompletedBuild/testReport/]) and cstar results (see below) look good. The initial approach was to “ban” all {{IndexInfo}} instances from the key cache. Although this is a great option for big partitions, “moderately” sized partitions suffer from that approach (see “0kB” cstar run below). So, as a compromise, a new {{cassandra.yaml}} option, {{column_index_cache_size_in_kb}}, which defines when {{IndexInfo}} objects should not be kept on heap, has been introduced. The new option defaults to 2 kB. It is possible to tune it to lower values (down to 0) and higher values. Some thoughts about both directions: * Setting the value to 0 or some other very low value will obviously reduce GC pressure at the cost of high I/O * The cost of accessing index samples on disk is twofold: first, there’s the obvious I/O cost via a {{RandomAccessReader}}; second, each {{RandomAccessReader}} instance has its own buffer (which can be off- or on-heap, but seems to default to off-heap) - so there seems to be some (quite small) overhead to borrow/release that buffer. * The higher the value of {{column_index_cache_size_in_kb}}, the more objects will be on the heap, and therefore the higher the GC pressure * Note that the parameter refers to the _serialized_ size and not the _number_ of {{IndexInfo}} objects. 
This was chosen to get a more obvious relation between the size of {{IndexInfo}} objects and the amount of consumed heap - the size of {{IndexInfo}} objects is mostly related to the size of the clustering keys. * Also note that some internal system/schema tables - especially those for LWTs - use clustering keys and therefore index samples. * For workloads with a huge amount of random reads against a large data set, small values for {{column_index_cache_size_in_kb}} (like the default value) are beneficial if the key cache is always full (i.e. it is evicting a lot). Some local tests with the new {{LargePartitionTest}} on my MacBook (time machine + indexing turned off) indicate that caching seems to work for shallow indexed entries. I’ve scheduled some cstar runs against the _trades_ workload. Only the result with {{column_index_cache_size_in_kb: 0}} (which means that no {{IndexInfo}} will be kept on heap and in the key cache) shows a performance regression. The default value of 2 kB for {{column_index_cache_size_in_kb}} was chosen as a result of this experiment. 
* {{column_index_cache_size_in_kb: 0}} - [cstar result|http://cstar.datastax.com/graph?command=one_job=e96c871e-f275-11e5-83a4-0256e416528f=op_rate=1_user=1_aggregates=true=0=2794.77=0=141912.1] * {{column_index_cache_size_in_kb: 2}} - [cstar result|http://cstar.datastax.com/graph?command=one_job=410592e2-f288-11e5-95fb-0256e416528f=op_rate=1_user=1_aggregates=true=0=2044.46=0=142732.7] * {{column_index_cache_size_in_kb: 4}} - [cstar result|http://cstar.datastax.com/graph?command=one_job=f8e36ec4-f275-11e5-a3d3-0256e416528f=op_rate=1_user=1_aggregates=true=0=2101.44=0=141696.5] * {{column_index_cache_size_in_kb: 8}} - [cstar result|http://cstar.datastax.com/graph?command=one_job=be3516a6-f275-11e5-95fb-0256e416528f=op_rate=1_user=1_aggregates=true=0=2057.88=0=142156.3] Other cstar runs ([here|http://cstar.datastax.com/graph?command=one_job=ce9de45a-f275-11e5-83a4-0256e416528f=op_rate=write=1_aggregates=true], [here|http://cstar.datastax.com/graph?command=one_job=c97118bc-f275-11e5-95fb-0256e416528f=op_rate=1_user=1_aggregates=true=0=259.82=0=73909] and [here|http://cstar.datastax.com/graph?command=one_job=c4ece8d4-f275-11e5-83a4-0256e416528f=op_rate=1_user=1_aggregates=true=0=496.32=0=89063.7]) have shown that there’s no change for some plain workloads. Daily regression tests show a similar performance: [compaction|http://cstar.datastax.com/graph?command=one_job=86d7cda8-f346-11e5-8ef0-0256e416528f=op_rate=1_write=1_aggregates=true=0=54.56=0=275053.9], [repair|http://cstar.datastax.com/graph?command=one_job=9c78fd1c-f346-11e5-82b8-0256e416528f=op_rate=1_write=1_aggregates=true=0=55.88=0=279059], [STCS|http://cstar.datastax.com/graph?command=one_job=ac43b886-f346-11e5-8ef0-0256e416528f=op_rate=1_write=1_aggregates=true=0=170.39=0=98341.1], [DTCS|http://cstar.datastax.com/graph?command=one_job=b8e0a11c-f346-11e5-82b8-0256e416528f=op_rate=1_write=1_aggregates=true=0=172.15=0=96739.5],
[jira] [Updated] (CASSANDRA-11381) Node running with join_ring=false and authentication can not serve requests
[ https://issues.apache.org/jira/browse/CASSANDRA-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-11381: Attachment: dtest-11381-trunk.txt > Node running with join_ring=false and authentication can not serve requests > --- > > Key: CASSANDRA-11381 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11381 > Project: Cassandra > Issue Type: Bug >Reporter: mck >Assignee: mck > Attachments: 11381-2.0.txt, 11381-2.1.txt, 11381-2.2.txt, > 11381-3.0.txt, 11381-trunk.txt, dtest-11381-trunk.txt > > > Starting up a node with {{-Dcassandra.join_ring=false}} in a cluster that has > authentication configured, eg PasswordAuthenticator, won't be able to serve > requests. This is because {{Auth.setup()}} never gets called during the > startup. > Without {{Auth.setup()}} having been called in {{StorageService}} clients > connecting to the node fail with the node throwing > {noformat} > java.lang.NullPointerException > at > org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:119) > at > org.apache.cassandra.thrift.CassandraServer.login(CassandraServer.java:1471) > at > org.apache.cassandra.thrift.Cassandra$Processor$login.getResult(Cassandra.java:3505) > at > org.apache.cassandra.thrift.Cassandra$Processor$login.getResult(Cassandra.java:3489) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at com.thinkaurelius.thrift.Message.invoke(Message.java:314) > at > com.thinkaurelius.thrift.Message$Invocation.execute(Message.java:90) > at > com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:695) > at > com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:689) > at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:112) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {noformat} > The exception thrown from the > [code|https://github.com/apache/cassandra/blob/cassandra-2.0.16/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java#L119] > {code} > ResultMessage.Rows rows = > authenticateStatement.execute(QueryState.forInternalCalls(), new > QueryOptions(consistencyForUser(username), > >Lists.newArrayList(ByteBufferUtil.bytes(username)))); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
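The failure mode described in this report can be reproduced in miniature. The sketch below is an assumption-laden illustration, not Cassandra code: it presumes that {{authenticateStatement}} is a field that is only assigned during setup, so skipping setup (as reported for {{join_ring=false}}) leaves it null. The class {{LazyAuthenticator}} and its members are hypothetical stand-ins.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for an authenticator whose statement is prepared in setup().
final class LazyAuthenticator
{
    // Plays the role of the prepared authenticateStatement; stays null until setup() runs.
    private Predicate<String> authenticateStatement;

    // Plays the role of Auth.setup(): prepares the statement used by authenticate().
    void setup()
    {
        authenticateStatement = username -> !username.isEmpty();
    }

    // Dereferences the field unconditionally, so a skipped setup() surfaces as a
    // NullPointerException on the first login attempt.
    boolean authenticate(String username)
    {
        return authenticateStatement.test(username);
    }
}
```

Calling {{authenticate}} before {{setup}} throws a {{NullPointerException}}, while the same call after {{setup}} succeeds, which is consistent with the report that the missing {{Auth.setup()}} call is the root cause.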
[jira] [Updated] (CASSANDRA-11381) Node running with join_ring=false and authentication can not serve requests
[ https://issues.apache.org/jira/browse/CASSANDRA-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-11381: Attachment: (was: dtest-11381-trunk.txt) > Node running with join_ring=false and authentication can not serve requests > --- > > Key: CASSANDRA-11381 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11381 > Project: Cassandra > Issue Type: Bug >Reporter: mck >Assignee: mck > Attachments: 11381-2.0.txt, 11381-2.1.txt, 11381-2.2.txt, > 11381-3.0.txt, 11381-trunk.txt > > > Starting up a node with {{-Dcassandra.join_ring=false}} in a cluster that has > authentication configured, eg PasswordAuthenticator, won't be able to serve > requests. This is because {{Auth.setup()}} never gets called during the > startup. > Without {{Auth.setup()}} having been called in {{StorageService}} clients > connecting to the node fail with the node throwing > {noformat} > java.lang.NullPointerException > at > org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:119) > at > org.apache.cassandra.thrift.CassandraServer.login(CassandraServer.java:1471) > at > org.apache.cassandra.thrift.Cassandra$Processor$login.getResult(Cassandra.java:3505) > at > org.apache.cassandra.thrift.Cassandra$Processor$login.getResult(Cassandra.java:3489) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at com.thinkaurelius.thrift.Message.invoke(Message.java:314) > at > com.thinkaurelius.thrift.Message$Invocation.execute(Message.java:90) > at > com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:695) > at > com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:689) > at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:112) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {noformat} > The exception thrown from the > [code|https://github.com/apache/cassandra/blob/cassandra-2.0.16/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java#L119] > {code} > ResultMessage.Rows rows = > authenticateStatement.execute(QueryState.forInternalCalls(), new > QueryOptions(consistencyForUser(username), > >Lists.newArrayList(ByteBufferUtil.bytes(username)))); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11381) Node running with join_ring=false and authentication can not serve requests
[ https://issues.apache.org/jira/browse/CASSANDRA-11381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213410#comment-15213410 ] mck commented on CASSANDRA-11381: - Small update to the dtest. I had been writing dtests in the trunk branch instead of master. More info [here|https://github.com/pcmanus/ccm/pull/479] and [here|https://github.com/riptano/cassandra-dtest/pull/892]. > Node running with join_ring=false and authentication can not serve requests > --- > > Key: CASSANDRA-11381 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11381 > Project: Cassandra > Issue Type: Bug >Reporter: mck >Assignee: mck > Attachments: 11381-2.0.txt, 11381-2.1.txt, 11381-2.2.txt, > 11381-3.0.txt, 11381-trunk.txt > > > Starting up a node with {{-Dcassandra.join_ring=false}} in a cluster that has > authentication configured, eg PasswordAuthenticator, won't be able to serve > requests. This is because {{Auth.setup()}} never gets called during the > startup. > Without {{Auth.setup()}} having been called in {{StorageService}} clients > connecting to the node fail with the node throwing > {noformat} > java.lang.NullPointerException > at > org.apache.cassandra.auth.PasswordAuthenticator.authenticate(PasswordAuthenticator.java:119) > at > org.apache.cassandra.thrift.CassandraServer.login(CassandraServer.java:1471) > at > org.apache.cassandra.thrift.Cassandra$Processor$login.getResult(Cassandra.java:3505) > at > org.apache.cassandra.thrift.Cassandra$Processor$login.getResult(Cassandra.java:3489) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at com.thinkaurelius.thrift.Message.invoke(Message.java:314) > at > com.thinkaurelius.thrift.Message$Invocation.execute(Message.java:90) > at > com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:695) > at > 
com.thinkaurelius.thrift.TDisruptorServer$InvocationHandler.onEvent(TDisruptorServer.java:689) > at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:112) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {noformat} > The exception thrown from the > [code|https://github.com/apache/cassandra/blob/cassandra-2.0.16/src/java/org/apache/cassandra/auth/PasswordAuthenticator.java#L119] > {code} > ResultMessage.Rows rows = > authenticateStatement.execute(QueryState.forInternalCalls(), new > QueryOptions(consistencyForUser(username), > >Lists.newArrayList(ByteBufferUtil.bytes(username)))); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-11442) `IF NOT EXISTS` doesn't work for `CREATE INDEX` queries when index name is not specified
[ https://issues.apache.org/jira/browse/CASSANDRA-11442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213378#comment-15213378 ] Alex Petrov commented on CASSANDRA-11442: - True, didn't see that one. Thanks for catching it! > `IF NOT EXISTS` doesn't work for `CREATE INDEX` queries when index name is > not specified > > > Key: CASSANDRA-11442 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11442 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Priority: Trivial > Fix For: 3.x > > Attachments: > 0001-Make-IF-NOT-EXISTS-clause-work-on-index-creation-whe.patch > > > `IF NOT EXISTS` doesn't work for `CREATE INDEX` queries when index name is > not specified. So executing {{CREATE INDEX IF NOT EXISTS ON %s(c)}} twice > would cause > {code} > Caused by: org.apache.cassandra.exceptions.InvalidRequestException: Index > table_0_c_idx_1 is a duplicate of existing index table_0_c_idx > at > org.apache.cassandra.cql3.statements.RequestValidations.invalidRequest(RequestValidations.java:199) > at > org.apache.cassandra.cql3.statements.RequestValidations.checkTrue(RequestValidations.java:63) > at > org.apache.cassandra.cql3.statements.RequestValidations.checkFalse(RequestValidations.java:111) > at > org.apache.cassandra.cql3.statements.CreateIndexStatement.announceMigration(CreateIndexStatement.java:225) > at > org.apache.cassandra.cql3.statements.SchemaAlteringStatement.executeInternal(SchemaAlteringStatement.java:120) > at org.apache.cassandra.cql3.CQLTester.schemaChange(CQLTester.java:637) > ... 29 more > {code} > Patch is attached. > If it helps, I've also created a branch: > https://github.com/ifesdjeen/cassandra/tree/11442-trunk -- This message was sent by Atlassian JIRA (v6.3.4#6332)
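Why the unnamed form defeats {{IF NOT EXISTS}} can be illustrated with a small model. This is a sketch under assumptions, not the attached patch: it assumes the existence check is done by index name, and that an unnamed index gets a generated name with a numeric suffix when the base name is taken (as the {{table_0_c_idx_1}} in the error suggests). {{IndexRegistry}} and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of one table's indexes: index name -> indexed column.
final class IndexRegistry
{
    private final Map<String, String> nameToColumn = new LinkedHashMap<>();

    // Generates "<table>_<column>_idx", appending _1, _2, ... until the name is unused.
    private String generateName(String table, String column)
    {
        String base = table + "_" + column + "_idx";
        String name = base;
        for (int i = 1; nameToColumn.containsKey(name); i++)
            name = base + "_" + i;
        return name;
    }

    // Returns the name of an existing index on the column, or null if there is none.
    private String existingIndexOn(String column)
    {
        for (Map.Entry<String, String> e : nameToColumn.entrySet())
            if (e.getValue().equals(column))
                return e.getKey();
        return null;
    }

    // Models CREATE INDEX IF NOT EXISTS ON table(column) without an index name.
    // Returns true if an index was created, false if the statement was a no-op.
    boolean createIfNotExists(String table, String column, boolean matchByTarget)
    {
        String name = generateName(table, column);

        // Name-based IF NOT EXISTS check: never fires, the generated name is always fresh.
        if (nameToColumn.containsKey(name))
            return false;

        String existing = existingIndexOn(column);
        if (existing != null)
        {
            if (matchByTarget)
                return false; // treat "same target already indexed" as "exists"
            throw new IllegalStateException("Index " + name + " is a duplicate of existing index " + existing);
        }

        nameToColumn.put(name, column);
        return true;
    }
}
```

With {{matchByTarget}} set to false the second call fails with the duplicate-index message, whereas matching on the indexed target turns it into a no-op; whether the attached patch takes this route is not stated in the ticket.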
[jira] [Updated] (CASSANDRA-11442) `IF NOT EXISTS` doesn't work for `CREATE INDEX` queries when index name is not specified
[ https://issues.apache.org/jira/browse/CASSANDRA-11442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-11442: Resolution: Duplicate Status: Resolved (was: Patch Available) > `IF NOT EXISTS` doesn't work for `CREATE INDEX` queries when index name is > not specified > > > Key: CASSANDRA-11442 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11442 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Priority: Trivial > Fix For: 3.x > > Attachments: > 0001-Make-IF-NOT-EXISTS-clause-work-on-index-creation-whe.patch > > > `IF NOT EXISTS` doesn't work for `CREATE INDEX` queries when index name is > not specified. So executing {{CREATE INDEX IF NOT EXISTS ON %s(c)}} twice > would cause > {code} > Caused by: org.apache.cassandra.exceptions.InvalidRequestException: Index > table_0_c_idx_1 is a duplicate of existing index table_0_c_idx > at > org.apache.cassandra.cql3.statements.RequestValidations.invalidRequest(RequestValidations.java:199) > at > org.apache.cassandra.cql3.statements.RequestValidations.checkTrue(RequestValidations.java:63) > at > org.apache.cassandra.cql3.statements.RequestValidations.checkFalse(RequestValidations.java:111) > at > org.apache.cassandra.cql3.statements.CreateIndexStatement.announceMigration(CreateIndexStatement.java:225) > at > org.apache.cassandra.cql3.statements.SchemaAlteringStatement.executeInternal(SchemaAlteringStatement.java:120) > at org.apache.cassandra.cql3.CQLTester.schemaChange(CQLTester.java:637) > ... 29 more > {code} > Patch is attached. > If it helps, I've also created a branch: > https://github.com/ifesdjeen/cassandra/tree/11442-trunk -- This message was sent by Atlassian JIRA (v6.3.4#6332)