[jira] [Commented] (CASSANDRA-13922) nodetool verify should also verify sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-13922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190877#comment-16190877 ]

Marcus Eriksson commented on CASSANDRA-13922:
---------------------------------------------

Committed, thanks. Opted not to fix the mutateRepairedAt change as it felt a tiny bit more non-nit:y; created CASSANDRA-13933.

> nodetool verify should also verify sstable metadata
> ---------------------------------------------------
>
>                 Key: CASSANDRA-13922
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13922
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 4.0, 3.0.16, 3.11.2
>
>
> nodetool verify should also try to deserialize the sstable metadata (and once
> CASSANDRA-13321 makes it in, verify the checksums)

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
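The check the ticket asks for — deserialize the sstable's validation metadata and confirm the partitioner recorded in it matches the one the node is configured with — reduces to a string comparison on the partitioner's canonical class name. A minimal sketch of that logic, independent of the Cassandra codebase (the class and method names here are illustrative, not the real API):

```java
import java.io.IOException;

// Illustrative stand-in for Cassandra's ValidationMetadata: the sstable's
// Statistics.db component records the canonical class name of the
// partitioner that wrote the file.
class ValidationSketch {
    final String partitioner;

    ValidationSketch(String partitioner) {
        this.partitioner = partitioner;
    }

    // Mirrors the check added for CASSANDRA-13922: if the recorded
    // partitioner differs from the configured one, the sstable cannot be
    // read correctly and verification must fail.
    void checkPartitioner(String configuredPartitioner) throws IOException {
        if (!partitioner.equals(configuredPartitioner))
            throw new IOException("Partitioner does not match validation metadata");
    }

    public static void main(String[] args) throws IOException {
        ValidationSketch ok = new ValidationSketch("org.apache.cassandra.dht.Murmur3Partitioner");
        ok.checkPartitioner("org.apache.cassandra.dht.Murmur3Partitioner"); // passes silently

        ValidationSketch bad = new ValidationSketch("org.apache.cassandra.dht.ByteOrderedPartitioner");
        try {
            bad.checkPartitioner("org.apache.cassandra.dht.Murmur3Partitioner");
            throw new AssertionError("expected mismatch to be rejected");
        } catch (IOException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```

The real implementation (visible in the commit diffs below in this digest) performs the same comparison against `sstable.getPartitioner().getClass().getCanonicalName()`.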
[jira] [Created] (CASSANDRA-13933) Handle mutateRepaired failure in nodetool verify
Marcus Eriksson created CASSANDRA-13933:
-------------------------------------------

             Summary: Handle mutateRepaired failure in nodetool verify
                 Key: CASSANDRA-13933
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13933
             Project: Cassandra
          Issue Type: Bug
            Reporter: Marcus Eriksson

See comment here: https://issues.apache.org/jira/browse/CASSANDRA-13922?focusedCommentId=16189875&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16189875
[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0cb27a78 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0cb27a78 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0cb27a78 Branch: refs/heads/trunk Commit: 0cb27a781ef74f66788e98624313e7750b183c59 Parents: 8ebef8e 12a09ec Author: Marcus Eriksson Authored: Wed Oct 4 08:27:19 2017 +0200 Committer: Marcus Eriksson Committed: Wed Oct 4 08:27:19 2017 +0200 -- CHANGES.txt | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0cb27a78/CHANGES.txt -- diff --cc CHANGES.txt index 134ec2e,e231bf2..b146778 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,150 -1,8 +1,155 @@@ +4.0 + * Properly close StreamCompressionInputStream to release any ByteBuf (CASSANDRA-13906) + * Add SERIAL and LOCAL_SERIAL support for cassandra-stress (CASSANDRA-13925) + * LCS needlessly checks for L0 STCS candidates multiple times (CASSANDRA-12961) + * Correctly close netty channels when a stream session ends (CASSANDRA-13905) + * Update lz4 to 1.4.0 (CASSANDRA-13741) + * Optimize Paxos prepare and propose stage for local requests (CASSANDRA-13862) + * Throttle base partitions during MV repair streaming to prevent OOM (CASSANDRA-13299) + * Use compaction threshold for STCS in L0 (CASSANDRA-13861) + * Fix problem with min_compress_ratio: 1 and disallow ratio < 1 (CASSANDRA-13703) + * Add extra information to SASI timeout exception (CASSANDRA-13677) + * Add incremental repair support for --hosts, --force, and subrange repair (CASSANDRA-13818) + * Rework CompactionStrategyManager.getScanners synchronization (CASSANDRA-13786) + * Add additional unit tests for batch behavior, TTLs, Timestamps (CASSANDRA-13846) + * Add keyspace and table name in schema validation exception (CASSANDRA-13845) + * Emit metrics whenever we hit tombstone failures and warn thresholds 
(CASSANDRA-13771) + * Make netty EventLoopGroups daemon threads (CASSANDRA-13837) + * Race condition when closing stream sessions (CASSANDRA-13852) + * NettyFactoryTest is failing in trunk on macOS (CASSANDRA-13831) + * Allow changing log levels via nodetool for related classes (CASSANDRA-12696) + * Add stress profile yaml with LWT (CASSANDRA-7960) + * Reduce memory copies and object creations when acting on ByteBufs (CASSANDRA-13789) + * Simplify mx4j configuration (Cassandra-13578) + * Fix trigger example on 4.0 (CASSANDRA-13796) + * Force minumum timeout value (CASSANDRA-9375) + * Use netty for streaming (CASSANDRA-12229) + * Use netty for internode messaging (CASSANDRA-8457) + * Add bytes repaired/unrepaired to nodetool tablestats (CASSANDRA-13774) + * Don't delete incremental repair sessions if they still have sstables (CASSANDRA-13758) + * Fix pending repair manager index out of bounds check (CASSANDRA-13769) + * Don't use RangeFetchMapCalculator when RF=1 (CASSANDRA-13576) + * Don't optimise trivial ranges in RangeFetchMapCalculator (CASSANDRA-13664) + * Use an ExecutorService for repair commands instead of new Thread(..).start() (CASSANDRA-13594) + * Fix race / ref leak in anticompaction (CASSANDRA-13688) + * Expose tasks queue length via JMX (CASSANDRA-12758) + * Fix race / ref leak in PendingRepairManager (CASSANDRA-13751) + * Enable ppc64le runtime as unsupported architecture (CASSANDRA-13615) + * Improve sstablemetadata output (CASSANDRA-11483) + * Support for migrating legacy users to roles has been dropped (CASSANDRA-13371) + * Introduce error metrics for repair (CASSANDRA-13387) + * Refactoring to primitive functional interfaces in AuthCache (CASSANDRA-13732) + * Update metrics to 3.1.5 (CASSANDRA-13648) + * batch_size_warn_threshold_in_kb can now be set at runtime (CASSANDRA-13699) + * Avoid always rebuilding secondary indexes at startup (CASSANDRA-13725) + * Upgrade JMH from 1.13 to 1.19 (CASSANDRA-13727) + * Upgrade SLF4J from 1.7.7 to 1.7.25 
(CASSANDRA-12996) + * Default for start_native_transport now true if not set in config (CASSANDRA-13656) + * Don't add localhost to the graph when calculating where to stream from (CASSANDRA-13583) + * Make CDC availability more deterministic via hard-linking (CASSANDRA-12148) + * Allow skipping equality-restricted clustering columns in ORDER BY clause (CASSANDRA-10271) + * Use common nowInSec for validation compactions (CASSANDRA-13671) + * Improve handling of IR prepare failures (CASSANDRA-13672) + * Send IR coordinator messages synchronously (CASSANDRA-13673) + * Flush system.repair table before IR finalize promise (CASSANDRA-13660) + * Fix column filter creation for wildcard queries (CASSANDRA-13650) + * A
[2/3] cassandra git commit: fixup CHANGES.txt
fixup CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/12a09ec3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/12a09ec3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/12a09ec3
Branch: refs/heads/trunk
Commit: 12a09ec3f2028fa8e9182e027afb8bf96613721a
Parents: f5bc429
Author: Marcus Eriksson
Authored: Wed Oct 4 08:24:47 2017 +0200
Committer: Marcus Eriksson
Committed: Wed Oct 4 08:24:47 2017 +0200

----------------------------------------------------------------------
 CHANGES.txt | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/12a09ec3/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 4a22af4..e231bf2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,3 +1,8 @@
+3.11.2
+Merged from 3.0:
+ * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922)
+
+
 3.11.1
  * Fix the computation of cdc_total_space_in_mb for exabyte filesystems (CASSANDRA-13808)
  * AbstractTokenTreeBuilder#serializedSize returns wrong value when there is a single leaf and overflow collisions (CASSANDRA-13869)
@@ -12,7 +17,6 @@
  * Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512)
  * Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
 Merged from 3.0:
- * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922)
  * Improve TRUNCATE performance (CASSANDRA-13909)
  * Implement short read protection on partition boundaries (CASSANDRA-13595)
  * Fix ISE thrown by UPI.Serializer.hasNext() for some SELECT queries (CASSANDRA-13911)
[1/3] cassandra git commit: fixup CHANGES.txt
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 f5bc42996 -> 12a09ec3f
  refs/heads/trunk 8ebef8e71 -> 0cb27a781

fixup CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/12a09ec3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/12a09ec3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/12a09ec3
Branch: refs/heads/cassandra-3.11
Commit: 12a09ec3f2028fa8e9182e027afb8bf96613721a
Parents: f5bc429
Author: Marcus Eriksson
Authored: Wed Oct 4 08:24:47 2017 +0200
Committer: Marcus Eriksson
Committed: Wed Oct 4 08:24:47 2017 +0200

----------------------------------------------------------------------
 CHANGES.txt | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/12a09ec3/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 4a22af4..e231bf2 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,3 +1,8 @@
+3.11.2
+Merged from 3.0:
+ * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922)
+
+
 3.11.1
  * Fix the computation of cdc_total_space_in_mb for exabyte filesystems (CASSANDRA-13808)
  * AbstractTokenTreeBuilder#serializedSize returns wrong value when there is a single leaf and overflow collisions (CASSANDRA-13869)
@@ -12,7 +17,6 @@
  * Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512)
  * Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
 Merged from 3.0:
- * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922)
  * Improve TRUNCATE performance (CASSANDRA-13909)
  * Implement short read protection on partition boundaries (CASSANDRA-13595)
  * Fix ISE thrown by UPI.Serializer.hasNext() for some SELECT queries (CASSANDRA-13911)
[jira] [Updated] (CASSANDRA-13922) nodetool verify should also verify sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-13922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-13922:
----------------------------------------
    Fix Version/s: 4.0

> nodetool verify should also verify sstable metadata
> ---------------------------------------------------
>
>                 Key: CASSANDRA-13922
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13922
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 4.0, 3.0.16, 3.11.2
>
>
> nodetool verify should also try to deserialize the sstable metadata (and once
> CASSANDRA-13321 makes it in, verify the checksums)
[jira] [Updated] (CASSANDRA-13922) nodetool verify should also verify sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-13922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcus Eriksson updated CASSANDRA-13922:
----------------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: 3.11.x)
                       (was: 4.x)
                       (was: 3.0.x)
                   3.11.2
                   3.0.16
           Status: Resolved  (was: Ready to Commit)

> nodetool verify should also verify sstable metadata
> ---------------------------------------------------
>
>                 Key: CASSANDRA-13922
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13922
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 3.0.16, 3.11.2
>
>
> nodetool verify should also try to deserialize the sstable metadata (and once
> CASSANDRA-13321 makes it in, verify the checksums)
[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/8ebef8e7 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/8ebef8e7 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/8ebef8e7 Branch: refs/heads/trunk Commit: 8ebef8e71a5ded0d5964e1ae315c9875db2e005f Parents: 982ab93 f5bc429 Author: Marcus Eriksson Authored: Wed Oct 4 08:20:54 2017 +0200 Committer: Marcus Eriksson Committed: Wed Oct 4 08:20:54 2017 +0200 -- CHANGES.txt | 1 + .../cassandra/db/compaction/Verifier.java | 31 .../org/apache/cassandra/db/VerifyTest.java | 25 3 files changed, 52 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8ebef8e7/CHANGES.txt -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/8ebef8e7/src/java/org/apache/cassandra/db/compaction/Verifier.java -- diff --cc src/java/org/apache/cassandra/db/compaction/Verifier.java index bca6e79,4c07bb6..22cf813 --- a/src/java/org/apache/cassandra/db/compaction/Verifier.java +++ b/src/java/org/apache/cassandra/db/compaction/Verifier.java @@@ -234,8 -250,14 +249,14 @@@ public class Verifier implements Closea private void markAndThrow() throws IOException { - sstable.descriptor.getMetadataSerializer().mutateRepaired(sstable.descriptor, ActiveRepairService.UNREPAIRED_SSTABLE, sstable.getSSTableMetadata().pendingRepair); - throw new CorruptSSTableException(new Exception(String.format("Invalid SSTable %s, please force repair", sstable.getFilename())), sstable.getFilename()); + markAndThrow(true); + } + + private void markAndThrow(boolean mutateRepaired) throws IOException + { + if (mutateRepaired) // if we are able to mutate repaired flag, an incremental repair should be enough - sstable.descriptor.getMetadataSerializer().mutateRepairedAt(sstable.descriptor, ActiveRepairService.UNREPAIRED_SSTABLE); ++ sstable.descriptor.getMetadataSerializer().mutateRepaired(sstable.descriptor, 
ActiveRepairService.UNREPAIRED_SSTABLE, sstable.getSSTableMetadata().pendingRepair);
+        throw new CorruptSSTableException(new Exception(String.format("Invalid SSTable %s, please force %srepair", sstable.getFilename(), mutateRepaired ? "" : "a full ")), sstable.getFilename());
      }

      public CompactionInfo.Holder getVerifyInfo()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/8ebef8e7/test/unit/org/apache/cassandra/db/VerifyTest.java
----------------------------------------------------------------------
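The behavioral change in the hunk above is small but easy to miss: when the metadata itself fails to deserialize, clearing the repaired flag may be impossible or pointless, so `markAndThrow(false)` skips the mutation and tells the operator to run a full repair instead of an incremental one. A reduced sketch of that control flow with the Cassandra types stubbed out (the names `MetadataMutator` and `failVerification` are illustrative, not the real API):

```java
// Stand-in for the metadata serializer's mutateRepaired call; only the
// branching logic from Verifier.markAndThrow(boolean) is the point here.
interface MetadataMutator {
    void markUnrepaired();
}

class MarkAndThrowSketch {
    static String failVerification(MetadataMutator mutator, boolean mutateRepaired, String filename) {
        // If we can still rewrite the sstable's metadata, clearing the
        // repaired flag makes an incremental repair sufficient; if the
        // metadata is corrupt, a full repair is required.
        if (mutateRepaired)
            mutator.markUnrepaired();
        return String.format("Invalid SSTable %s, please force %srepair",
                             filename, mutateRepaired ? "" : "a full ");
    }

    public static void main(String[] args) {
        String msg = failVerification(() -> {}, false, "na-1-big-Data.db");
        System.out.println(msg); // Invalid SSTable na-1-big-Data.db, please force a full repair
    }
}
```

This mirrors why the error message template above uses `%srepair`: the same format string produces "force repair" or "force a full repair" depending on whether the repaired flag could be cleared.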
[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f5bc4299 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f5bc4299 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f5bc4299 Branch: refs/heads/trunk Commit: f5bc42996b671ae26793691af3fd06fbcabbe4cc Parents: 983c72a e400b97 Author: Marcus Eriksson Authored: Wed Oct 4 08:18:36 2017 +0200 Committer: Marcus Eriksson Committed: Wed Oct 4 08:19:15 2017 +0200 -- CHANGES.txt | 1 + .../cassandra/db/compaction/Verifier.java | 31 .../org/apache/cassandra/db/VerifyTest.java | 26 3 files changed, 53 insertions(+), 5 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5bc4299/CHANGES.txt -- diff --cc CHANGES.txt index aca219e,df05f7f..4a22af4 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,17 -1,8 +1,18 @@@ -3.0.16 - * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922) - - -3.0.15 +3.11.1 + * Fix the computation of cdc_total_space_in_mb for exabyte filesystems (CASSANDRA-13808) + * AbstractTokenTreeBuilder#serializedSize returns wrong value when there is a single leaf and overflow collisions (CASSANDRA-13869) + * Add a compaction option to TWCS to ignore sstables overlapping checks (CASSANDRA-13418) + * BTree.Builder memory leak (CASSANDRA-13754) + * Revert CASSANDRA-10368 of supporting non-pk column filtering due to correctness (CASSANDRA-13798) + * Add a skip read validation flag to cassandra-stress (CASSANDRA-13772) + * Fix cassandra-stress hang issues when an error during cluster connection happens (CASSANDRA-12938) + * Better bootstrap failure message when blocked by (potential) range movement (CASSANDRA-13744) + * "ignore" option is ignored in sstableloader (CASSANDRA-13721) + * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652) + * Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512) + * Properly evict 
pstmts from prepared statements cache (CASSANDRA-13641) +Merged from 3.0: ++ * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922) * Improve TRUNCATE performance (CASSANDRA-13909) * Implement short read protection on partition boundaries (CASSANDRA-13595) * Fix ISE thrown by UPI.Serializer.hasNext() for some SELECT queries (CASSANDRA-13911) http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5bc4299/src/java/org/apache/cassandra/db/compaction/Verifier.java -- diff --cc src/java/org/apache/cassandra/db/compaction/Verifier.java index df659e4,68088b3..4c07bb6 --- a/src/java/org/apache/cassandra/db/compaction/Verifier.java +++ b/src/java/org/apache/cassandra/db/compaction/Verifier.java @@@ -88,7 -89,21 +90,21 @@@ public class Verifier implements Closea { long rowStart = 0; -outputHandler.output(String.format("Verifying %s (%s bytes)", sstable, dataFile.length())); +outputHandler.output(String.format("Verifying %s (%s)", sstable, FBUtilities.prettyPrintMemory(dataFile.length(; + outputHandler.output(String.format("Deserializing sstable metadata for %s ", sstable)); + try + { + EnumSet types = EnumSet.of(MetadataType.VALIDATION, MetadataType.STATS, MetadataType.HEADER); + Map sstableMetadata = sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, types); + if (sstableMetadata.containsKey(MetadataType.VALIDATION) && + !((ValidationMetadata)sstableMetadata.get(MetadataType.VALIDATION)).partitioner.equals(sstable.getPartitioner().getClass().getCanonicalName())) + throw new IOException("Partitioner does not match validation metadata"); + } + catch (Throwable t) + { + outputHandler.debug(t.getMessage()); + markAndThrow(false); + } outputHandler.output(String.format("Checking computed hash of %s ", sstable)); @@@ -187,8 -202,8 +203,8 @@@ if (key == null || dataSize > dataFile.length()) markAndThrow(); - //mimic the scrub read path + //mimic the scrub read path, intentionally unused -try (UnfilteredRowIterator iterator = new 
SSTableIdentityIterator(sstable, dataFile, key)) +try (UnfilteredRowIterator iterator = SSTableIdentityIterator.create(sstable, dataFile, key)) { } http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5bc4299/te
[3/6] cassandra git commit: Deserialize sstable metadata in nodetool verify
Deserialize sstable metadata in nodetool verify Patch by marcuse; reviewed by Jason Brown for CASSANDRA-13922 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e400b976 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e400b976 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e400b976 Branch: refs/heads/trunk Commit: e400b976751110d41405bac614189152bf88f7ef Parents: b32a9e6 Author: Marcus Eriksson Authored: Mon Oct 2 10:11:17 2017 +0200 Committer: Marcus Eriksson Committed: Wed Oct 4 08:15:23 2017 +0200 -- CHANGES.txt | 4 +++ .../cassandra/db/compaction/Verifier.java | 32 .../org/apache/cassandra/db/VerifyTest.java | 25 +++ 3 files changed, 55 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e400b976/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index d6423b4..df05f7f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,3 +1,7 @@ +3.0.16 + * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922) + + 3.0.15 * Improve TRUNCATE performance (CASSANDRA-13909) * Implement short read protection on partition boundaries (CASSANDRA-13595) http://git-wip-us.apache.org/repos/asf/cassandra/blob/e400b976/src/java/org/apache/cassandra/db/compaction/Verifier.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/Verifier.java b/src/java/org/apache/cassandra/db/compaction/Verifier.java index 88bc3a7..68088b3 100644 --- a/src/java/org/apache/cassandra/db/compaction/Verifier.java +++ b/src/java/org/apache/cassandra/db/compaction/Verifier.java @@ -26,13 +26,15 @@ import org.apache.cassandra.io.sstable.Component; import org.apache.cassandra.io.sstable.CorruptSSTableException; import org.apache.cassandra.io.sstable.SSTableIdentityIterator; import org.apache.cassandra.io.sstable.format.SSTableReader; +import org.apache.cassandra.io.sstable.metadata.MetadataComponent; +import org.apache.cassandra.io.sstable.metadata.MetadataType; 
+import org.apache.cassandra.io.sstable.metadata.ValidationMetadata; import org.apache.cassandra.io.util.DataIntegrityMetadata; import org.apache.cassandra.io.util.DataIntegrityMetadata.FileDigestValidator; import org.apache.cassandra.io.util.FileUtils; import org.apache.cassandra.io.util.RandomAccessReader; import org.apache.cassandra.service.ActiveRepairService; import org.apache.cassandra.utils.ByteBufferUtil; -import org.apache.cassandra.utils.FBUtilities; import org.apache.cassandra.utils.OutputHandler; import org.apache.cassandra.utils.UUIDGen; @@ -58,7 +60,6 @@ public class Verifier implements Closeable private final RowIndexEntry.IndexSerializer rowIndexEntrySerializer; private int goodRows; -private int badRows; private final OutputHandler outputHandler; private FileDigestValidator validator; @@ -89,6 +90,20 @@ public class Verifier implements Closeable long rowStart = 0; outputHandler.output(String.format("Verifying %s (%s bytes)", sstable, dataFile.length())); +outputHandler.output(String.format("Deserializing sstable metadata for %s ", sstable)); +try +{ +EnumSet types = EnumSet.of(MetadataType.VALIDATION, MetadataType.STATS, MetadataType.HEADER); +Map sstableMetadata = sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, types); +if (sstableMetadata.containsKey(MetadataType.VALIDATION) && + !((ValidationMetadata)sstableMetadata.get(MetadataType.VALIDATION)).partitioner.equals(sstable.getPartitioner().getClass().getCanonicalName())) +throw new IOException("Partitioner does not match validation metadata"); +} +catch (Throwable t) +{ +outputHandler.debug(t.getMessage()); +markAndThrow(false); +} outputHandler.output(String.format("Checking computed hash of %s ", sstable)); @@ -187,7 +202,7 @@ public class Verifier implements Closeable if (key == null || dataSize > dataFile.length()) markAndThrow(); -//mimic the scrub read path +//mimic the scrub read path, intentionally unused try (UnfilteredRowIterator iterator = new 
SSTableIdentityIterator(sstable, dataFile, key)) { } @@ -204,7 +219,6 @@ public class Verifier implements Closeable } catch (Throwable th) { -badRows++; markAndThrow();
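One detail of the patch above worth noting: the verifier deserializes only the VALIDATION, STATS, and HEADER metadata components, selected via an `EnumSet`, rather than the whole metadata file. The selection pattern looks roughly like this (the enum mirrors Cassandra's `MetadataType`, but the loader below is a stand-in for the real `IMetadataSerializer.deserialize`, which reads the Statistics.db component):

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Illustrative reduction of the EnumSet-driven selective deserialization.
enum MetadataType { VALIDATION, COMPACTION, STATS, HEADER }

class SelectiveLoadSketch {
    static Map<MetadataType, String> deserialize(EnumSet<MetadataType> wanted) {
        Map<MetadataType, String> out = new EnumMap<>(MetadataType.class);
        for (MetadataType t : wanted)
            out.put(t, "component:" + t); // real code would read bytes from Statistics.db here
        return out;
    }

    public static void main(String[] args) {
        Map<MetadataType, String> m =
            deserialize(EnumSet.of(MetadataType.VALIDATION, MetadataType.STATS, MetadataType.HEADER));
        System.out.println(m.containsKey(MetadataType.VALIDATION)); // true
        System.out.println(m.containsKey(MetadataType.COMPACTION)); // false
    }
}
```

This is also why the patch guards with `containsKey(MetadataType.VALIDATION)` before casting: a requested component may legitimately be absent from the returned map.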
[1/6] cassandra git commit: Deserialize sstable metadata in nodetool verify
Repository: cassandra Updated Branches: refs/heads/cassandra-3.0 b32a9e645 -> e400b9767 refs/heads/cassandra-3.11 983c72a84 -> f5bc42996 refs/heads/trunk 982ab93a2 -> 8ebef8e71 Deserialize sstable metadata in nodetool verify Patch by marcuse; reviewed by Jason Brown for CASSANDRA-13922 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e400b976 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e400b976 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e400b976 Branch: refs/heads/cassandra-3.0 Commit: e400b976751110d41405bac614189152bf88f7ef Parents: b32a9e6 Author: Marcus Eriksson Authored: Mon Oct 2 10:11:17 2017 +0200 Committer: Marcus Eriksson Committed: Wed Oct 4 08:15:23 2017 +0200 -- CHANGES.txt | 4 +++ .../cassandra/db/compaction/Verifier.java | 32 .../org/apache/cassandra/db/VerifyTest.java | 25 +++ 3 files changed, 55 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e400b976/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index d6423b4..df05f7f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,3 +1,7 @@ +3.0.16 + * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922) + + 3.0.15 * Improve TRUNCATE performance (CASSANDRA-13909) * Implement short read protection on partition boundaries (CASSANDRA-13595) http://git-wip-us.apache.org/repos/asf/cassandra/blob/e400b976/src/java/org/apache/cassandra/db/compaction/Verifier.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/Verifier.java b/src/java/org/apache/cassandra/db/compaction/Verifier.java index 88bc3a7..68088b3 100644 --- a/src/java/org/apache/cassandra/db/compaction/Verifier.java +++ b/src/java/org/apache/cassandra/db/compaction/Verifier.java @@ -26,13 +26,15 @@ import org.apache.cassandra.io.sstable.Component; import org.apache.cassandra.io.sstable.CorruptSSTableException; import org.apache.cassandra.io.sstable.SSTableIdentityIterator; import 
org.apache.cassandra.io.sstable.format.SSTableReader; +import org.apache.cassandra.io.sstable.metadata.MetadataComponent; +import org.apache.cassandra.io.sstable.metadata.MetadataType; +import org.apache.cassandra.io.sstable.metadata.ValidationMetadata; import org.apache.cassandra.io.util.DataIntegrityMetadata; import org.apache.cassandra.io.util.DataIntegrityMetadata.FileDigestValidator; import org.apache.cassandra.io.util.FileUtils; import org.apache.cassandra.io.util.RandomAccessReader; import org.apache.cassandra.service.ActiveRepairService; import org.apache.cassandra.utils.ByteBufferUtil; -import org.apache.cassandra.utils.FBUtilities; import org.apache.cassandra.utils.OutputHandler; import org.apache.cassandra.utils.UUIDGen; @@ -58,7 +60,6 @@ public class Verifier implements Closeable private final RowIndexEntry.IndexSerializer rowIndexEntrySerializer; private int goodRows; -private int badRows; private final OutputHandler outputHandler; private FileDigestValidator validator; @@ -89,6 +90,20 @@ public class Verifier implements Closeable long rowStart = 0; outputHandler.output(String.format("Verifying %s (%s bytes)", sstable, dataFile.length())); +outputHandler.output(String.format("Deserializing sstable metadata for %s ", sstable)); +try +{ +EnumSet types = EnumSet.of(MetadataType.VALIDATION, MetadataType.STATS, MetadataType.HEADER); +Map sstableMetadata = sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, types); +if (sstableMetadata.containsKey(MetadataType.VALIDATION) && + !((ValidationMetadata)sstableMetadata.get(MetadataType.VALIDATION)).partitioner.equals(sstable.getPartitioner().getClass().getCanonicalName())) +throw new IOException("Partitioner does not match validation metadata"); +} +catch (Throwable t) +{ +outputHandler.debug(t.getMessage()); +markAndThrow(false); +} outputHandler.output(String.format("Checking computed hash of %s ", sstable)); @@ -187,7 +202,7 @@ public class Verifier implements Closeable if (key == null 
|| dataSize > dataFile.length()) markAndThrow(); -//mimic the scrub read path +//mimic the scrub read path, intentionally unused try (UnfilteredRowIterator iterator = new SSTableIdentityIterator(sstable, dataFile, key)) { } @@ -204,7 +219,6
[2/6] cassandra git commit: Deserialize sstable metadata in nodetool verify
Deserialize sstable metadata in nodetool verify Patch by marcuse; reviewed by Jason Brown for CASSANDRA-13922 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e400b976 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e400b976 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e400b976 Branch: refs/heads/cassandra-3.11 Commit: e400b976751110d41405bac614189152bf88f7ef Parents: b32a9e6 Author: Marcus Eriksson Authored: Mon Oct 2 10:11:17 2017 +0200 Committer: Marcus Eriksson Committed: Wed Oct 4 08:15:23 2017 +0200 -- CHANGES.txt | 4 +++ .../cassandra/db/compaction/Verifier.java | 32 .../org/apache/cassandra/db/VerifyTest.java | 25 +++ 3 files changed, 55 insertions(+), 6 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e400b976/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index d6423b4..df05f7f 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,3 +1,7 @@ +3.0.16 + * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922) + + 3.0.15 * Improve TRUNCATE performance (CASSANDRA-13909) * Implement short read protection on partition boundaries (CASSANDRA-13595) http://git-wip-us.apache.org/repos/asf/cassandra/blob/e400b976/src/java/org/apache/cassandra/db/compaction/Verifier.java -- diff --git a/src/java/org/apache/cassandra/db/compaction/Verifier.java b/src/java/org/apache/cassandra/db/compaction/Verifier.java index 88bc3a7..68088b3 100644 --- a/src/java/org/apache/cassandra/db/compaction/Verifier.java +++ b/src/java/org/apache/cassandra/db/compaction/Verifier.java @@ -26,13 +26,15 @@ import org.apache.cassandra.io.sstable.Component; import org.apache.cassandra.io.sstable.CorruptSSTableException; import org.apache.cassandra.io.sstable.SSTableIdentityIterator; import org.apache.cassandra.io.sstable.format.SSTableReader; +import org.apache.cassandra.io.sstable.metadata.MetadataComponent; +import 
org.apache.cassandra.io.sstable.metadata.MetadataType; +import org.apache.cassandra.io.sstable.metadata.ValidationMetadata; import org.apache.cassandra.io.util.DataIntegrityMetadata; import org.apache.cassandra.io.util.DataIntegrityMetadata.FileDigestValidator; import org.apache.cassandra.io.util.FileUtils; import org.apache.cassandra.io.util.RandomAccessReader; import org.apache.cassandra.service.ActiveRepairService; import org.apache.cassandra.utils.ByteBufferUtil; -import org.apache.cassandra.utils.FBUtilities; import org.apache.cassandra.utils.OutputHandler; import org.apache.cassandra.utils.UUIDGen; @@ -58,7 +60,6 @@ public class Verifier implements Closeable private final RowIndexEntry.IndexSerializer rowIndexEntrySerializer; private int goodRows; -private int badRows; private final OutputHandler outputHandler; private FileDigestValidator validator; @@ -89,6 +90,20 @@ public class Verifier implements Closeable long rowStart = 0; outputHandler.output(String.format("Verifying %s (%s bytes)", sstable, dataFile.length())); +outputHandler.output(String.format("Deserializing sstable metadata for %s ", sstable)); +try +{ +EnumSet types = EnumSet.of(MetadataType.VALIDATION, MetadataType.STATS, MetadataType.HEADER); +Map sstableMetadata = sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, types); +if (sstableMetadata.containsKey(MetadataType.VALIDATION) && + !((ValidationMetadata)sstableMetadata.get(MetadataType.VALIDATION)).partitioner.equals(sstable.getPartitioner().getClass().getCanonicalName())) +throw new IOException("Partitioner does not match validation metadata"); +} +catch (Throwable t) +{ +outputHandler.debug(t.getMessage()); +markAndThrow(false); +} outputHandler.output(String.format("Checking computed hash of %s ", sstable)); @@ -187,7 +202,7 @@ public class Verifier implements Closeable if (key == null || dataSize > dataFile.length()) markAndThrow(); -//mimic the scrub read path +//mimic the scrub read path, intentionally unused 
try (UnfilteredRowIterator iterator = new SSTableIdentityIterator(sstable, dataFile, key)) { } @@ -204,7 +219,6 @@ public class Verifier implements Closeable } catch (Throwable th) { -badRows++; markAndT
[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f5bc4299
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f5bc4299
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f5bc4299

Branch: refs/heads/cassandra-3.11
Commit: f5bc42996b671ae26793691af3fd06fbcabbe4cc
Parents: 983c72a e400b97
Author: Marcus Eriksson
Authored: Wed Oct 4 08:18:36 2017 +0200
Committer: Marcus Eriksson
Committed: Wed Oct 4 08:19:15 2017 +0200

--
 CHANGES.txt                                 |  1 +
 .../cassandra/db/compaction/Verifier.java   | 31
 .../org/apache/cassandra/db/VerifyTest.java | 26
 3 files changed, 53 insertions(+), 5 deletions(-)
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5bc4299/CHANGES.txt
--
diff --cc CHANGES.txt
index aca219e,df05f7f..4a22af4
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,17 -1,8 +1,18 @@@
-3.0.16
- * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922)
-
-
-3.0.15
+3.11.1
+ * Fix the computation of cdc_total_space_in_mb for exabyte filesystems (CASSANDRA-13808)
+ * AbstractTokenTreeBuilder#serializedSize returns wrong value when there is a single leaf and overflow collisions (CASSANDRA-13869)
+ * Add a compaction option to TWCS to ignore sstables overlapping checks (CASSANDRA-13418)
+ * BTree.Builder memory leak (CASSANDRA-13754)
+ * Revert CASSANDRA-10368 of supporting non-pk column filtering due to correctness (CASSANDRA-13798)
+ * Add a skip read validation flag to cassandra-stress (CASSANDRA-13772)
+ * Fix cassandra-stress hang issues when an error during cluster connection happens (CASSANDRA-12938)
+ * Better bootstrap failure message when blocked by (potential) range movement (CASSANDRA-13744)
+ * "ignore" option is ignored in sstableloader (CASSANDRA-13721)
+ * Deadlock in AbstractCommitLogSegmentManager (CASSANDRA-13652)
+ * Duplicate the buffer before passing it to analyser in SASI operation (CASSANDRA-13512)
+ * Properly evict pstmts from prepared statements cache (CASSANDRA-13641)
+Merged from 3.0:
++ * Deserialise sstable metadata in nodetool verify (CASSANDRA-13922)
  * Improve TRUNCATE performance (CASSANDRA-13909)
  * Implement short read protection on partition boundaries (CASSANDRA-13595)
  * Fix ISE thrown by UPI.Serializer.hasNext() for some SELECT queries (CASSANDRA-13911)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5bc4299/src/java/org/apache/cassandra/db/compaction/Verifier.java
--
diff --cc src/java/org/apache/cassandra/db/compaction/Verifier.java
index df659e4,68088b3..4c07bb6
--- a/src/java/org/apache/cassandra/db/compaction/Verifier.java
+++ b/src/java/org/apache/cassandra/db/compaction/Verifier.java
@@@ -88,7 -89,21 +90,21 @@@ public class Verifier implements Closeable
  {
      long rowStart = 0;
 -    outputHandler.output(String.format("Verifying %s (%s bytes)", sstable, dataFile.length()));
 +    outputHandler.output(String.format("Verifying %s (%s)", sstable, FBUtilities.prettyPrintMemory(dataFile.length())));
 +    outputHandler.output(String.format("Deserializing sstable metadata for %s ", sstable));
 +    try
 +    {
 +        EnumSet<MetadataType> types = EnumSet.of(MetadataType.VALIDATION, MetadataType.STATS, MetadataType.HEADER);
 +        Map<MetadataType, MetadataComponent> sstableMetadata = sstable.descriptor.getMetadataSerializer().deserialize(sstable.descriptor, types);
 +        if (sstableMetadata.containsKey(MetadataType.VALIDATION) &&
 +            !((ValidationMetadata) sstableMetadata.get(MetadataType.VALIDATION)).partitioner.equals(sstable.getPartitioner().getClass().getCanonicalName()))
 +            throw new IOException("Partitioner does not match validation metadata");
 +    }
 +    catch (Throwable t)
 +    {
 +        outputHandler.debug(t.getMessage());
 +        markAndThrow(false);
 +    }
      outputHandler.output(String.format("Checking computed hash of %s ", sstable));

@@@ -187,8 -202,8 +203,8 @@@
      if (key == null || dataSize > dataFile.length())
          markAndThrow();
 -    //mimic the scrub read path
 +    //mimic the scrub read path, intentionally unused
 -    try (UnfilteredRowIterator iterator = new SSTableIdentityIterator(sstable, dataFile, key))
 +    try (UnfilteredRowIterator iterator = SSTableIdentityIterator.create(sstable, dataFile, key))
      {
      }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5
[jira] [Comment Edited] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190865#comment-16190865 ] Thomas Steinmaurer edited comment on CASSANDRA-13929 at 10/4/17 6:19 AM: - Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable (with a somewhat useful small default value) via a system property, cassandra.yaml or whatever place is preferred here, cause ~2.4G+ heap usage for this sort of cache at -Xmx8G does not make sense, or perhaps the majority of Cassandra users don't scale out with smaller sized machines anymore. As you have mentioned *per thread*. I guess we are talking about the number of threads serving client requests, aka e.g. {{native_transport_max_threads}}? If so, there should be somewhere a clear pointer in a comment, documentation etc., that heap usage is directly related to number of threads * configurable max size was (Author: tsteinmaurer): Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable (with a somewhat useful small default value) via a system property, cassandra.yaml or whatever place is preferred here, cause ~2.4G+ heap usage for this sort of cache at -Xmx8G does not make sense. 
As you have mentioned *per thread*. I guess we are talking about the number of threads serving client requests, aka e.g. {{native_transport_max_threads}}? If so, there should be somewhere a clear pointer in a comment, documentation etc., that heap usage is directly related to number of threads * configurable max size > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). 
In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can't say if this has any other side effects etc., but I doubt it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
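The guard-and-null pattern in the patch quoted above can be illustrated with a minimal, self-contained sketch. This is not Netty's actual Recycler API; class and field names are illustrative only. The point it demonstrates: without clearing {{recycleHandle}}, a second {{recycle()}} call would push the same builder onto the per-thread stack again, and every duplicate entry keeps the builder (and whatever it references) reachable from the stack.

```java
// Minimal sketch of the double-recycle hazard the patch guards against.
// Names loosely mirror the ticket; this is NOT Netty's real Recycler API.
import java.util.ArrayDeque;
import java.util.Deque;

public class RecycleGuardSketch {
    // Stands in for the per-thread io.netty.util.Recycler$Stack
    static final Deque<Builder> pool = new ArrayDeque<>();

    static final class Builder {
        private Object recycleHandle = new Object(); // non-final, as in the patch

        void recycle() {
            if (recycleHandle != null) {
                pool.push(this);      // return the builder to the pool
                recycleHandle = null; // ADDED in the patch: second recycle() becomes a no-op
            }
        }
    }

    public static void main(String[] args) {
        Builder b = new Builder();
        b.recycle();
        b.recycle(); // without the null-out, this would pool a duplicate entry
        System.out.println("pooled entries: " + pool.size()); // prints 1
        assert pool.size() == 1 : "duplicate entry pooled";
    }
}
```

Run with {{java -ea RecycleGuardSketch}} so the assertion is enabled.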
[jira] [Comment Edited] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190865#comment-16190865 ] Thomas Steinmaurer edited comment on CASSANDRA-13929 at 10/4/17 6:18 AM: - Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable (with a somewhat useful small default value) via a system property, cassandra.yaml or whatever place is preferred here, cause ~ 2,4G+ heap usage for this sort of cache at -XmX8G does not make sense. As you have mentioned *per thread*. I guess we are talking about the number of threads serving client requests, aka e.g. {{native_transport_max_threads}}? If so, there should be somewhere a clear pointer in a comment, documentation etc., that heap usage is directly related to number of threads * configurable max size was (Author: tsteinmaurer): Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable via a system property, cassandra.yaml or whatever place is preferred here, cause ~ 2,4G+ heap usage for this sort of cache at -XmX8G does not make sense. As you have mentioned *per thread*. I guess we are talking about the number of threads serving client requests, aka e.g. 
{{native_transport_max_threads}}? If so, there should be somewhere a clear pointer in a comment, documentation etc., that heap usage is directly related to number of threads * configurable max size > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can' say if this has any other side effects etc., but I doubt. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
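The relationship flagged in the comment above (heap usage ~ number of threads * configurable max size) can be sanity-checked with back-of-the-envelope arithmetic. Both inputs below are illustrative assumptions, not values taken from the ticket or from cassandra.yaml:

```java
// Worst-case heap retained by per-thread caches: threads * per-thread cap.
// Both numbers are hypothetical, chosen only to show the order of magnitude.
public class CacheHeapEstimate {
    public static void main(String[] args) {
        int threads = 128;                  // e.g. a native transport thread count
        long perThreadCapBytes = 16L << 20; // hypothetical 16 MiB cap per thread
        long worstCaseBytes = threads * perThreadCapBytes;
        System.out.println("worst-case retained MiB: " + (worstCaseBytes >> 20)); // prints 2048
        assert worstCaseBytes == 2_147_483_648L;
    }
}
```

With these made-up numbers the caches alone could pin ~2 GiB, which is the same order as the ~2.4G observed against an 8G heap, so an uncapped or large per-thread limit multiplies quickly.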
[jira] [Comment Edited] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190865#comment-16190865 ] Thomas Steinmaurer edited comment on CASSANDRA-13929 at 10/4/17 6:17 AM: - Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable via a system property, cassandra.yaml or whatever place is preferred here, cause ~ 2,4G+ heap usage for this sort of cache at -XmX8G does not make sense. As you have mentioned *per thread*. I guess we are talking about the number of threads serving client requests, aka e.g. {{native_transport_max_threads}}? If so, there should be somewhere a clear pointer in a comment, documentation etc., that heap usage is directly related to number of threads * configurable max size was (Author: tsteinmaurer): Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable via a system property, cassandra.yaml or whatever place is preferred here, cause ~ 2,4G+ heap usage for this sort of cache at -XmX8G does not make sense. As you have mentioned *per thread*. I guess we are talking about the number of threads serving client requests, aka e.g. {{native_transport_max_threads}}. 
If so, there should be somewhere a clear pointer in a comment, documentation etc., that heap usage is directly related to number of threads * configurable max size > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can' say if this has any other side effects etc., but I doubt. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190865#comment-16190865 ] Thomas Steinmaurer edited comment on CASSANDRA-13929 at 10/4/17 6:17 AM: - Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable via a system property, cassandra.yaml or whatever place is preferred here, cause ~ 2,4G+ heap usage for this sort of cache at -XmX8G does not make sense. As you have mentioned *per thread*. I guess we are talking about the number of threads serving client requests, aka e.g. {{native_transport_max_threads}}. If so, there should be somewhere a clear pointer in a comment, documentation etc., that heap usage is directly related to number of threads * configurable max size was (Author: tsteinmaurer): Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable via a system property, cassandra.yaml or whatever place is preferred here, cause ~ 1,8G heap usage for this sort of cache at -XmX8G does not make sense. 
> BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can' say if this has any other side effects etc., but I doubt. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190865#comment-16190865 ] Thomas Steinmaurer commented on CASSANDRA-13929: Ok, if this sort of caching is entirely only useful during streaming, this somehow explains why I do not see any difference between the nodes here, cause they process regular business and no repair, bootstrapping etc, thus I can't comment on the performance gain with the cache (possibly this has been tested in CASSANDRA-9766 anyway). If the cache is useful for streaming, it still would definitely make sense IMHO, to make the max capacity configurable via a system property, cassandra.yaml or whatever place is preferred here, cause ~ 1,8G heap usage for this sort of cache at -XmX8G does not make sense. > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. 
> * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can' say if this has any other side effects etc., but I doubt. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13321) Add a checksum component for the sstable metadata (-Statistics.db) file
[ https://issues.apache.org/jira/browse/CASSANDRA-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190846#comment-16190846 ] Marcus Eriksson edited comment on CASSANDRA-13321 at 10/4/17 5:43 AM: -- thanks for the review, fixed the comments [here|https://github.com/krummas/cassandra/commits/marcuse/simplerchecksum] and rerunning tests [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/359/] before committing was (Author: krummas): thanks for the review - rerunning tests [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/359/] before committing > Add a checksum component for the sstable metadata (-Statistics.db) file > --- > > Key: CASSANDRA-13321 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13321 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > Since we keep important information in the sstable metadata file now, we > should add a checksum component for it. One danger being if a bit gets > flipped in repairedAt we could consider the sstable repaired when it is not. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13321) Add a checksum component for the sstable metadata (-Statistics.db) file
[ https://issues.apache.org/jira/browse/CASSANDRA-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190846#comment-16190846 ] Marcus Eriksson commented on CASSANDRA-13321: - thanks for the review - rerunning tests [here|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/359/] before committing > Add a checksum component for the sstable metadata (-Statistics.db) file > --- > > Key: CASSANDRA-13321 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13321 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > Since we keep important information in the sstable metadata file now, we > should add a checksum component for it. One danger being if a bit gets > flipped in repairedAt we could consider the sstable repaired when it is not. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
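One way such a check can work for the repairedAt scenario described above: store a checksum of the serialized metadata bytes alongside the file and recompute it on read. The sketch below assumes CRC32 for illustration; the actual on-disk component format chosen in CASSANDRA-13321 may differ.

```java
// Sketch: detect a single flipped bit in serialized sstable metadata by
// comparing a stored CRC32 against a recomputed one. Format is illustrative.
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class MetadataChecksumSketch {
    static long crc(byte[] bytes) {
        CRC32 c = new CRC32();
        c.update(bytes, 0, bytes.length);
        return c.getValue();
    }

    public static void main(String[] args) {
        // Hypothetical serialized form of one metadata field
        byte[] metadata = "repairedAt=1507096800000".getBytes(StandardCharsets.UTF_8);
        long stored = crc(metadata); // checksum written next to -Statistics.db

        metadata[11] ^= 0x01;        // flip a single bit in the repairedAt value
        boolean corrupt = crc(metadata) != stored;
        System.out.println("corruption detected: " + corrupt); // prints true
        assert corrupt;
    }
}
```

Because CRC32 is linear, any single-bit flip is guaranteed to change the checksum, so this class of corruption cannot slip through silently.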
[jira] [Updated] (CASSANDRA-13899) Streaming of compressed partition fails
[ https://issues.apache.org/jira/browse/CASSANDRA-13899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-13899: Status: Ready to Commit (was: Patch Available) > Streaming of compressed partition fails > > > Key: CASSANDRA-13899 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13899 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Jason Brown > Fix For: 4.0 > > Attachments: largepartition.yaml > > > Streaming a single partition with ~100K rows fails with the following > exception: > {noformat} > ERROR [Stream-Deserializer-/127.0.0.1:35149-a92e5e12] 2017-09-21 04:03:41,237 > StreamSession.java:617 - [Stream #c2e5b640-9eab-11e7-99c0-e9864ca8da8e] > Streaming error occurred on session with peer 127.0.0.1 > org.apache.cassandra.streaming.StreamReceiveException: > java.lang.RuntimeException: Last written key > DecoratedKey(-1000328290821038380) >= current key > DecoratedKey(-1055007227842125139) writing into > /home/paulo/.ccm/test/node2/data0/stresscql/typestest-482ac7b09e8d11e787cf85d073c > 8e037/na-1-big-Data.db > at > org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:63) > ~[main/:na] > at > org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:41) > ~[main/:na] > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55) > ~[main/:na] > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:178) > ~[main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] > {noformat} > Reproduction steps: > * Create CCM cluster with 2 nodes > * Start only first node, disable hinted handoff > * Run stress with the attached yaml: {{tools/bin/cassandra-stress "user > profile=largepartition.yaml n=10K ops(insert=1) no-warmup -node whitelist > 127.0.0.1 -mode native cql3 compression=lz4 -rate threads=4 
-insert > visits=FIXED(100K) revisit=FIXED(100K)"}} > * Start second node, run repair on {{stresscql}} table - the exception above > will be thrown. > I investigated briefly and haven't found anything suspicious. This seems to > be related to CASSANDRA-12229 as I tested the steps above in a branch without > that and the repair completed successfully. I haven't tested with a smaller > number of rows per partition to see at which point it starts to be a problem. > We should probably add a regression dtest to stream large partitions to catch > similar problems in the future. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13899) Streaming of compressed partition fails
[ https://issues.apache.org/jira/browse/CASSANDRA-13899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190719#comment-16190719 ] Paulo Motta commented on CASSANDRA-13899: - I am away this week but this is simple enough so I managed to have a quick look during vacation. ;) bq. I can trigger the error on trunk with it and a much lower insert count in under 80 seconds total. I've turned it into a dtest. Awesome, dtest looks good to me - verified that it fails without the patch and passes with it - thanks! Marking as ready to commit. > Streaming of compressed partition fails > > > Key: CASSANDRA-13899 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13899 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Jason Brown > Fix For: 4.0 > > Attachments: largepartition.yaml > > > Streaming a single partition with ~100K rows fails with the following > exception: > {noformat} > ERROR [Stream-Deserializer-/127.0.0.1:35149-a92e5e12] 2017-09-21 04:03:41,237 > StreamSession.java:617 - [Stream #c2e5b640-9eab-11e7-99c0-e9864ca8da8e] > Streaming error occurred on session with peer 127.0.0.1 > org.apache.cassandra.streaming.StreamReceiveException: > java.lang.RuntimeException: Last written key > DecoratedKey(-1000328290821038380) >= current key > DecoratedKey(-1055007227842125139) writing into > /home/paulo/.ccm/test/node2/data0/stresscql/typestest-482ac7b09e8d11e787cf85d073c > 8e037/na-1-big-Data.db > at > org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:63) > ~[main/:na] > at > org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:41) > ~[main/:na] > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55) > ~[main/:na] > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:178) > ~[main/:na] > at 
java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] > {noformat} > Reproduction steps: > * Create CCM cluster with 2 nodes > * Start only first node, disable hinted handoff > * Run stress with the attached yaml: {{tools/bin/cassandra-stress "user > profile=largepartition.yaml n=10K ops(insert=1) no-warmup -node whitelist > 127.0.0.1 -mode native cql3 compression=lz4 -rate threads=4 -insert > visits=FIXED(100K) revisit=FIXED(100K)"}} > * Start second node, run repair on {{stresscql}} table - the exception above > will be thrown. > I investigated briefly and haven't found anything suspicious. This seems to > be related to CASSANDRA-12229 as I tested the steps above in a branch without > that and the repair completed successfully. I haven't tested with a smaller > number of rows per partition to see at which point it starts to be a problem. > We should probably add a regression dtest to stream large partitions to catch > similar problems in the future. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13915) Create a Docker container to build the docs
[ https://issues.apache.org/jira/browse/CASSANDRA-13915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190692#comment-16190692 ] Jason Brown edited comment on CASSANDRA-13915 at 10/4/17 2:07 AM: -- [~rustyrazorblade] I have way too many scars of *not* running {{ant realclean}} that I instinctively always run it. However, for the purposes of working on the docs, I agree that quicker turn around time is probably more important than cleaning everything, every time. [~j.casares] you add the docker file as {{/doc/Dockerfile}}. Is that possibly confusing for anyone who thinks that this will allow them to build a docker container to actually run cassandra? At a minimum, add a quick comment at the top of the file in case someone skips everything else in our distro, looking for a docker file. Maybe rename the file if you think it's appropriate. Same goes for the {{/doc/docker-compose.yml}}, as well. was (Author: jasobrown): [~rustyrazorblade] I have way too many scars of *not* running {{ant realclean}} that I instinctively always run it. However, for the purposes of working on the docs, I agree that quicker turn around time is probably more important than cleaning everything, every time. [~j.casares] you add the docker file as {{/doc/Dockerfile}}. Is that possibly confusing for anyone who thinks that this will allow them to build a docker container to actually run cassandra? At a minimum, add a quick comment at the top of the file in case someone skips everything else in our distro, looking for a docker file. Same goes for the {{/doc/docker-compose.yml}}, as well. > Create a Docker container to build the docs > --- > > Key: CASSANDRA-13915 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13915 > Project: Cassandra > Issue Type: Improvement >Reporter: Joaquin Casares >Assignee: Joaquin Casares > > As requested by [~rustyrazorblade], I will be adding a Docker container to > build the docs without any prereqs (other than Docker). 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13915) Create a Docker container to build the docs
[ https://issues.apache.org/jira/browse/CASSANDRA-13915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190665#comment-16190665 ] Jon Haddad commented on CASSANDRA-13915: I'm not wild about always running {{ant realclean}} between builds of docs. Most of the time it's unnecessary and makes the process take quite a bit longer, which will be kind of a pain if you want to make a lot of edits, as I like to. If someone wants to {{realclean}}, they can do it explicitly, but I think it should be removed from this process. Some numbers on my mac laptop: Building the docker image: {code} docker-compose build build-docs 0.89s user 0.37s system 1% cpu 1:46.89 total {code} Building the docs w/ realclean: {code} docker-compose run build-docs 0.69s user 0.16s system 4% cpu 18.090 total {code} Building without realclean: {code} docker-compose run build-docs 0.67s user 0.15s system 12% cpu 6.625 total {code} [~jasobrown] what do you think? > Create a Docker container to build the docs > --- > > Key: CASSANDRA-13915 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13915 > Project: Cassandra > Issue Type: Improvement >Reporter: Joaquin Casares >Assignee: Joaquin Casares > > As requested by [~rustyrazorblade], I will be adding a Docker container to > build the docs without any prereqs (other than Docker). 
[jira] [Commented] (CASSANDRA-3200) Repair: compare all trees together (for a given range/cf) instead of by pair in isolation
[ https://issues.apache.org/jira/browse/CASSANDRA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190560#comment-16190560 ] Blake Eggleston commented on CASSANDRA-3200: First round of review First, I think if there’s going to be a widely used data structure that has more than 1-2 levels of nested containers, it’s time to make some (simple) dedicated classes. For instance, IncomingRepairStreamTracker consumes and operates on {{Map>>>}}. What each part of this structure represents, and the intended effect of each collection method call is not clear. Same sort of thing with {{Map, Set>>}}. Rolling these structures into classes, as well as putting the raw container manipulation behind more meaningfully named methods will make this patch much easier to understand. It will also allow you to test your container manipulation logic and actual algorithm logic separately. Some more specific stuff: User facing: * symmetric/asymmetric nodetool naming option is ambiguous, not sure what a better name would be, maybe something about reducing or optimizing streams? * should be off by default AsymmetricSyncRequest/SyncTasks: * Could we just add a one-way flag to the existing requests / tasks? The new asymmetric classes duplicate most of the symmetric tasks code (I think). In the case of local sync task, the pullRepair flag is basically doing this already. IncomingRepairStreamTracker * fixing the container thing as mentioned above may fix this, but it’s difficult to figure out how this works. A top level java doc explaining how the duplicate streams are identified and reduced would be nice. * The class name doesn’t seem appropriate. Not all the streams are incoming, and it’s not tracking any continuous processes. Maybe RepairStreamReducer or RepairStreamOptimizer? * Should be in the repair package. 
IncomingRepairStreamTrackerTest * Should throw exception instead of printing stack trace in static block * Fix indentation of matrices in test comments * The content of the `differences` map, as set up in testSimpleReducing, doesn’t make sense to me: why would node C be in node A’s map, but not vice versa? * I think it would be clearer to alias the contents of addresses 0-4 to static variables like A, B, C, etc. Parsing out the array indices when reading through the tests is difficult to follow. > Repair: compare all trees together (for a given range/cf) instead of by pair > in isolation > - > > Key: CASSANDRA-3200 > URL: https://issues.apache.org/jira/browse/CASSANDRA-3200 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Assignee: Marcus Eriksson >Priority: Minor > Labels: repair > Fix For: 4.x > > > Currently, repair compares merkle trees by pair, in isolation of any other > tree. What that means concretely is that if I have three nodes A, B and C > (RF=3) with A and B in sync, but C having some range r inconsistent with both > A and B (since those are consistent), we will do the following transfers of r: > A -> C, C -> A, B -> C, C -> B. > The fact that we do both A -> C and C -> A is fine, because we cannot know > which one is more up to date, A or C. However, the transfer B -> C is > useless provided we do A -> C, if A and B are in sync. Not doing that transfer > will be a 25% improvement in that case. With RF=5 and only one node > inconsistent with all the others, that's almost a 40% improvement, etc... > Given that this situation of one node not in sync while the others are is > probably fairly common (one node died so it is behind), this could be a fair > improvement over what is transferred. In the case where we use repair to > rebuild a node completely, this will be a dramatic improvement, because it > will avoid the rebuilt node getting RF times the data it should get. 
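For illustration, the stream reduction the ticket describes can be sketched in a few lines of Python. This is a toy model, not the patch's actual classes (all names here are mine): nodes with no differing ranges between them form an in-sync group, and each node then fetches the data it is missing from a single representative of each other group instead of from every member, which drops the redundant B -> C transfer from the example above.

```python
from itertools import combinations

def optimized_streams(nodes, differing_pairs):
    """Toy reduction: skip transfers that would re-send identical data."""
    # Union-find over nodes that have no differing ranges between them.
    parent = {n: n for n in nodes}
    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n
    for a, b in combinations(nodes, 2):
        if (a, b) not in differing_pairs and (b, a) not in differing_pairs:
            parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Each node receives its missing data from ONE member of every other
    # in-sync group, instead of from every member of that group.
    streams = set()
    for n in nodes:
        for root, members in groups.items():
            if find(n) != root:
                streams.add((members[0], n))  # group representative -> n
    return streams

# RF=3 example from the ticket: A and B in sync, C differs from both.
# Pairwise repair would do A->C, C->A, B->C, C->B; B->C is redundant.
streams = optimized_streams(['A', 'B', 'C'], {('A', 'C'), ('B', 'C')})
# streams == {('A', 'C'), ('C', 'A'), ('C', 'B')} -- 3 transfers, not 4
```

Skipping one of four transfers matches the 25% saving the ticket computes for RF=3.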
[jira] [Comment Edited] (CASSANDRA-11596) Add native transport port to system.peers
[ https://issues.apache.org/jira/browse/CASSANDRA-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190343#comment-16190343 ] Olivier Michallat edited comment on CASSANDRA-11596 at 10/3/17 10:54 PM: - -I'm afraid it does, you'll need the port in {{STATUS_CHANGE}} events.- My bad, the {{inet}} type used by those events does in fact contain the port (the name threw me off), so indeed no protocol change needed. That port is currently hard-coded to the one of the current connection, but I see you've addressed that in the patch for 7544. was (Author: omichallat): I'm afraid it does, you'll need the port in {{STATUS_CHANGE}} events. > Add native transport port to system.peers > - > > Key: CASSANDRA-11596 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11596 > Project: Cassandra > Issue Type: Improvement > Components: Distributed Metadata >Reporter: Nico Haller >Assignee: Ariel Weisberg >Priority: Minor > Labels: lhf > > Is there any reason why the native transport port is not being stored in > system.peers along the rpc broadcast address and transmitted to the connected > drivers? > I would love to have that feature, that would allow me to "hide" my cluster > behind a reverse NAT or LB and only consume one external IP address and > forward packets based on the port the client is connecting to. > I guess it makes sense to provide the complete socket information instead of > just the address and using a default port setting on the client to complete > the connection information. 
[jira] [Commented] (CASSANDRA-11596) Add native transport port to system.peers
[ https://issues.apache.org/jira/browse/CASSANDRA-11596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190500#comment-16190500 ] Sandeep Tamhankar commented on CASSANDRA-11596: --- And {{TOPOLOGY_CHANGE}} events. > Add native transport port to system.peers > - > > Key: CASSANDRA-11596 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11596 > Project: Cassandra > Issue Type: Improvement > Components: Distributed Metadata >Reporter: Nico Haller >Assignee: Ariel Weisberg >Priority: Minor > Labels: lhf > > Is there any reason why the native transport port is not being stored in > system.peers along the rpc broadcast address and transmitted to the connected > drivers? > I would love to have that feature, that would allow me to "hide" my cluster > behind a reverse NAT or LB and only consume one external IP address and > forward packets based on the port the client is connecting to. > I guess it makes sense to provide the complete socket information instead of > just the address and using a default port setting on the client to complete > the connection information. 
cassandra-dtest git commit: Add more tests for SRP and #13911
Repository: cassandra-dtest Updated Branches: refs/heads/master 2f8bc9da4 -> b0f34e3a6 Add more tests for SRP and #13911 patch by Aleksey Yeschenko; reviewed by Sam Tunnicliffe Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/b0f34e3a Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/b0f34e3a Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/b0f34e3a Branch: refs/heads/master Commit: b0f34e3a6b41c3be5f86dcb8db4f32de9436b071 Parents: 2f8bc9d Author: Aleksey Yeschenko Authored: Mon Oct 2 14:08:56 2017 +0100 Committer: Aleksey Yeschenko Committed: Mon Oct 2 15:23:55 2017 +0100 -- consistency_test.py | 127 +++ 1 file changed, 127 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/b0f34e3a/consistency_test.py -- diff --git a/consistency_test.py b/consistency_test.py index 1a624c3..ccc7ee6 100644 --- a/consistency_test.py +++ b/consistency_test.py @@ -823,6 +823,133 @@ class TestConsistency(Tester): [[0]], cl=ConsistencyLevel.ALL) +@since('3.11') +def test_13911_rows_srp(self): +""" +@jira_ticket CASSANDRA-13911 + +A regression test to prove that we can no longer rely on +!singleResultCounter.isDoneForPartition() to abort single +partition SRP early if a per partition limit is set. 
+""" +cluster = self.cluster + +# disable hinted handoff and set batch commit log so this doesn't interfere with the test +cluster.set_configuration_options(values={'hinted_handoff_enabled': False}) +cluster.set_batch_commitlog(enabled=True) + +cluster.populate(2).start(wait_other_notice=True) +node1, node2 = cluster.nodelist() + +session = self.patient_cql_connection(node1) + +query = "CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 2};" +session.execute(query) + +query = 'CREATE TABLE test.test (pk int, ck int, PRIMARY KEY (pk, ck));' +session.execute(query) + +# with node2 down +# +# node1, partition 0 | 0 1 - - +# node1, partition 2 | 0 x - - + +node2.stop(wait_other_notice=True) +session.execute('INSERT INTO test.test (pk, ck) VALUES (0, 0) USING TIMESTAMP 42;') +session.execute('INSERT INTO test.test (pk, ck) VALUES (0, 1) USING TIMESTAMP 42;') +session.execute('INSERT INTO test.test (pk, ck) VALUES (2, 0) USING TIMESTAMP 42;') +session.execute('DELETE FROM test.test USING TIMESTAMP 42 WHERE pk = 2 AND ck = 1;') +node2.start(wait_other_notice=True, wait_for_binary_proto=True) + +# with node1 down +# +# node2, partition 0 | - - 2 3 +# node2, partition 2 | x 1 2 - + +session = self.patient_cql_connection(node2) + +node1.stop(wait_other_notice=True) +session.execute('INSERT INTO test.test (pk, ck) VALUES (0, 2) USING TIMESTAMP 42;') +session.execute('INSERT INTO test.test (pk, ck) VALUES (0, 3) USING TIMESTAMP 42;') +session.execute('DELETE FROM test.test USING TIMESTAMP 42 WHERE pk = 2 AND ck = 0;') +session.execute('INSERT INTO test.test (pk, ck) VALUES (2, 1) USING TIMESTAMP 42;') +session.execute('INSERT INTO test.test (pk, ck) VALUES (2, 2) USING TIMESTAMP 42;') +node1.start(wait_other_notice=True, wait_for_binary_proto=True) + +# with both nodes up, do a CL.ALL query with per partition limit of 2 and limit of 3; +# without the change to if (!singleResultCounter.isDoneForPartition()) branch, +# the query would 
skip SRP on node2, partition 2, and incorrectly return just +# [[0, 0], [0, 1]] +assert_all(session, + 'SELECT pk, ck FROM test.test PER PARTITION LIMIT 2 LIMIT 3;', + [[0, 0], [0, 1], +[2, 2]], + cl=ConsistencyLevel.ALL) + +@since('3.11') +def test_13911_partitions_srp(self): +""" +@jira_ticket CASSANDRA-13911 + +A regression test to prove that we can't rely on +!singleResultCounter.isDone() to abort ranged +partition SRP early if a per partition limit is set. +""" +cluster = self.cluster + +# disable hinted handoff and set batch commit log so this doesn't interfere with the test +cluster.set_configuration_options(values={'hinted_handoff_enabled': False}) +cluster.set_batch_commitlog(enabled=True) + +cluster.populate(2).start(wait_other_notice=True) +node1, node2 = cluster.nodelist() +
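The expected rows in the final {{assert_all}} can be reproduced with a small Python model of the merge. This is a sketch of the test's replica states and of timestamp-tie resolution, not Cassandra's actual reconciliation code (all names here are mine); every write in the test uses TIMESTAMP 42, and on a timestamp tie a deletion shadows a live cell, which is why (2, 0) and (2, 1) drop out.

```python
# Replica states from test_13911_rows_srp, keyed by (pk, ck).
node1 = {(0, 0): 'live', (0, 1): 'live', (2, 0): 'live', (2, 1): 'tombstone'}
node2 = {(0, 2): 'live', (0, 3): 'live', (2, 0): 'tombstone',
         (2, 1): 'live', (2, 2): 'live'}

def merge(*replicas):
    """Reconcile replicas; a tombstone, once seen, wins the timestamp tie."""
    merged = {}
    for replica in replicas:
        for key, state in replica.items():
            if merged.get(key) != 'tombstone':
                merged[key] = state
    return merged

def query(merged, per_partition_limit, limit):
    """Apply PER PARTITION LIMIT and LIMIT to the merged live rows."""
    rows, per_pk = [], {}
    for pk, ck in sorted(merged):
        if merged[(pk, ck)] != 'live' or len(rows) >= limit:
            continue
        if per_pk.get(pk, 0) < per_partition_limit:
            rows.append([pk, ck])
            per_pk[pk] = per_pk.get(pk, 0) + 1
    return rows

result = query(merge(node1, node2), per_partition_limit=2, limit=3)
# result == [[0, 0], [0, 1], [2, 2]] -- the assert_all expectation above
```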
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190324#comment-16190324 ] Chris Lohfink commented on CASSANDRA-13931: --- with {{-Xms6G, -Xmx6G, -Xmn2048M}} you're going to have issues running C* with default settings. I would strongly recommend a minimum of 8 GB. The JVM defaults MaxDirectMemorySize to the same as the heap size (6G), although this usually doesn't fill up unless hitting a netty or JDK leak. If you're running on a limited system where you're getting hit by the OOM killer, you might want to consider an even smaller heap (i.e. 4 GB), but then you will need to limit other settings, since such a system is not going to be able to handle the defaults. For example, {{concurrent_reads}} and {{concurrent_writes}} should perhaps be 1/2 or 1/4 of the 64/128 you have at the moment. Also look to decrease just about anything that takes up resources off-heap. > Cassandra JVM stop itself randomly > -- > > Key: CASSANDRA-13931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13931 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: RHEL 7.3 > JDK HotSpot 1.8.0_121-b13 > cassandra-3.11 cluster with 43 nodes in 9 datacenters > 8vCPU, 32 GB RAM >Reporter: Andrey Lataev > Attachments: cassandra-env.sh, cassandra.yaml, > system.log.2017-10-01.zip > > > Before I set -XX:MaxDirectMemorySize I receive OOM on the OS level, like: > # # grep "Out of" /var/log/messages-20170918 > Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 > (java) score 287 or sacrifice child > Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 > (java) score 289 or sacrifice child > If I set the -XX:MaxDirectMemorySize=5G limitation, then I periodically begin to receive: > HeapUtils.java:136 - Dumping heap to > /egov/dumps/cassandra-1506868110-pid11155.hprof > It seems like the JVM kills itself when off-heap memory leaks occur. 
> Typical errors in system.log before JVM begin dumping: > ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 > CassandraDaemon.java:228 - Exception in thread > Thread[MessagingService-Incoming-/172.20.4.143,5,main] > ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 > Message.java:625 - Unexpected exception during request; channel = [id: > 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874] > Full stack traces: > ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 > Message.java:625 - Unexpected exception during request; channel = [id: > 0x3c0c1c26, L:/172.20.4.142:9042 - > R:/172.20.4.139:44874] > java.lang.AssertionError: null > at > org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) > [apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) > [apache-cassandra-3.11.0.jar:3.11.0] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_121] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > [apache-cassandra-3.11.0.jar:3.1 > 1.0] > at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > [apache-cassandra-3.11.0.jar:3.11.0] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] > INFO [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - > Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ... > Heap dump file created > ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 > CassandraDaemon.java:228 - Exception in thread > Thread[MessagingService-Incoming-/172.20.4.143,5,main] > java.io.IOError: java.io.EOFException: Stream ended prematurely > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.
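A back-of-the-envelope check of the sizing discussed above, as a sketch: the 1 GB allowance for metaspace, thread stacks, and GC structures is my rough assumption, and off-heap memtables, bloom filters, and compression metadata come on top of this figure.

```python
GB = 1024 ** 3

heap = 6 * GB            # -Xms6G / -Xmx6G from the reporter's cassandra-env.sh
direct = heap            # MaxDirectMemorySize defaults to the heap size
other_native = 1 * GB    # metaspace, thread stacks, GC structures (rough guess)

jvm_footprint = heap + direct + other_native
print(jvm_footprint // GB)  # 13 (GB), before any other off-heap consumers
```

On a 32 GB box shared with other services, this is how resident memory creeps toward the kernel OOM-killer threshold even though the heap itself is only 6 GB.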
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190306#comment-16190306 ] Chris Lohfink commented on CASSANDRA-13931: --- There is a JDK memory leak on direct memory; {{-Djdk.nio.maxCachedBufferSize=262144}} may help if you're running JDK >= 1.8u102. You're giving the JVM more heap + off-heap than your system has, so either the OS out-of-memory killer kills Java or the JVM fails a malloc and shuts down. 
[jira] [Updated] (CASSANDRA-13932) Stress write order and seed order should be different
[ https://issues.apache.org/jira/browse/CASSANDRA-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Cranford updated CASSANDRA-13932: Summary: Stress write order and seed order should be different (was: Write order and seed order should be different) > Stress write order and seed order should be different > - > > Key: CASSANDRA-13932 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13932 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Daniel Cranford > Labels: stress > Attachments: 0001-Initial-implementation-cassandra-3.11.patch, > vmtouch-after.txt, vmtouch-before.txt > > > Read tests get an unrealistic boost in performance because they read data > from a set of partitions that was written sequentially. > I ran into this while running a timed read test against a large data set (250 > million partition keys) {noformat}cassandra-stress read > duration=30m{noformat} While the test was running, I noticed one node was > performing zero IO after an initial period. > I discovered each node in the cluster only had blocks from a single SSTable > loaded in the FS cache. {noformat}vmtouch -v /path/to/sstables{noformat} > For the node that was performing zero IO, the SSTable in question was small > enough to fit into the FS cache. > I realized that when a read test is run for a duration or until rate > convergence, the default population for the seeds is a GAUSSIAN distribution > over the first million seeds. Because of the way compaction works, partitions > that are written sequentially will (with high probability) always live in the > same SSTable. That means that while the first million seeds will generate > partition keys that will be randomly distributed in the token space, they > will most likely all live in the same SSTable. When this SSTable is small > enough to fit into the FS cache, you get unbelievably good results for a read > test. 
Consider that a dataset 4x the size of the FS cache will have almost > 1/2 the data in SSTables small enough to fit into the FS cache. > Adjusting the population of seeds used during the read test to be the entire > 250 million seeds used to load the cluster does not fix the > problem.{noformat}cassandra-stress read duration=30m -pop > dist=gaussian(1..250M){noformat} > or (same population, larger sample) {noformat}cassandra-stress read > n=250M{noformat} > Any distribution other than the uniform distribution has one or more modes, > and the mode(s) of such a distribution will cluster reads around a certain > seed range, which corresponds to a certain set of sequential writes, which > corresponds to (with high probability) a single SSTable. > My patch against cassandra-3.11 fixes this by shuffling the sequence of > generated seeds. Each seed value will still be generated once and only once. > The old behavior of sequential seed generation (i.e. seed(n+1) = seed(n) + 1) > may be selected by using the no-shuffle flag, e.g. {noformat}cassandra-stress > read duration=30m -pop no-shuffle{noformat} > Results: In [^vmtouch-before.txt] only pages from a single SSTable are > present in the FS cache, while in [^vmtouch-after.txt] an equal proportion of > all SSTables are present in the FS cache. 
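The "once and only once" property the patch relies on can be illustrated with a simple affine bijection. This is a sketch of the idea, not the shuffle the attached patch actually implements: the map i -> (a*i + c) mod n visits every value in [0, n) exactly once whenever gcd(a, n) == 1, so no permutation table needs to be held in memory even for 250 million seeds.

```python
from math import gcd

def shuffled_seed(i, n, a=2654435761, c=1):
    """Map position i in the run order to a unique seed in [0, n).

    An affine map is a bijection mod n when gcd(a, n) == 1, so each
    seed is generated once and only once, but out of write order.
    """
    assert gcd(a, n) == 1, "a and n must be coprime for a bijection"
    return (a * i + c) % n

# Small n for demonstration; every seed in [0, 1000) appears exactly once,
# and consecutive positions no longer map to consecutive seeds.
seeds = [shuffled_seed(i, 1000) for i in range(1000)]
```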
[jira] [Commented] (CASSANDRA-13899) Streaming of compressed partition fails
[ https://issues.apache.org/jira/browse/CASSANDRA-13899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190187#comment-16190187 ] Jason Brown commented on CASSANDRA-13899: - [~pauloricardomg] do you think you can finish up the review on this soonish? > Streaming of compressed partition fails > > > Key: CASSANDRA-13899 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13899 > Project: Cassandra > Issue Type: Bug >Reporter: Paulo Motta >Assignee: Jason Brown > Fix For: 4.0 > > Attachments: largepartition.yaml > > > Streaming a single partition with ~100K rows fails with the following > exception: > {noformat} > ERROR [Stream-Deserializer-/127.0.0.1:35149-a92e5e12] 2017-09-21 04:03:41,237 > StreamSession.java:617 - [Stream #c2e5b640-9eab-11e7-99c0-e9864ca8da8e] > Streaming error occurred on session with peer 127.0.0.1 > org.apache.cassandra.streaming.StreamReceiveException: > java.lang.RuntimeException: Last written key > DecoratedKey(-1000328290821038380) >= current key > DecoratedKey(-1055007227842125139) writing into > /home/paulo/.ccm/test/node2/data0/stresscql/typestest-482ac7b09e8d11e787cf85d073c > 8e037/na-1-big-Data.db > at > org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:63) > ~[main/:na] > at > org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:41) > ~[main/:na] > at > org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55) > ~[main/:na] > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:178) > ~[main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] > {noformat} > Reproduction steps: > * Create CCM cluster with 2 nodes > * Start only first node, disable hinted handoff > * Run stress with the attached yaml: {{tools/bin/cassandra-stress "user > profile=largepartition.yaml n=10K ops(insert=1) 
no-warmup -node whitelist > 127.0.0.1 -mode native cql3 compression=lz4 -rate threads=4 -insert > visits=FIXED(100K) revisit=FIXED(100K)"}} > * Start second node, run repair on {{stresscql}} table - the exception above > will be thrown. > I investigated briefly and haven't found anything suspicious. This seems to > be related to CASSANDRA-12229 as I tested the steps above in a branch without > that and the repair completed successfully. I haven't tested with a smaller > number of rows per partition to see at which point it starts to be a problem. > We should probably add a regression dtest to stream large partitions to catch > similar problems in the future. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13906) Properly close StreamCompressionInputStream to release any ByteBuf
[ https://issues.apache.org/jira/browse/CASSANDRA-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-13906: Resolution: Fixed Fix Version/s: 4.0 Status: Resolved (was: Patch Available) > Properly close StreamCompressionInputStream to release any ByteBuf > -- > > Key: CASSANDRA-13906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13906 > Project: Cassandra > Issue Type: Bug >Reporter: Jason Brown >Assignee: Jason Brown > Fix For: 4.0 > > > When running dtests for trunk (4.x) that perform some streaming, sometimes a > {{ByteBuf}} is not released properly, and we get this error in the logs > (causing the dtest to fail): > {code} > ERROR [MessagingService-NettyOutbound-Thread-4-2] 2017-09-26 13:42:37,940 > Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's > garbage-collected. Enable advanced leak reporting to find out where the leak > occurred. To enable advanced leak reporting, specify the JVM option > '-Dio.netty.leakDetection.level=advanced' or call > ResourceLeakDetector.setLevel() See > http://netty.io/wiki/reference-counted-objects.html for more information. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13906) Properly close StreamCompressionInputStream to release any ByteBuf
[ https://issues.apache.org/jira/browse/CASSANDRA-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190184#comment-16190184 ] Jason Brown commented on CASSANDRA-13906: - committed as sha {{982ab93a2f8a0f5c56af9378f65d3e9e43b9}} > Properly close StreamCompressionInputStream to release any ByteBuf > -- > > Key: CASSANDRA-13906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13906 > Project: Cassandra > Issue Type: Bug >Reporter: Jason Brown >Assignee: Jason Brown > > When running dtests for trunk (4.x) that perform some streaming, sometimes a > {{ByteBuf}} is not released properly, and we get this error in the logs > (causing the dtest to fail): > {code} > ERROR [MessagingService-NettyOutbound-Thread-4-2] 2017-09-26 13:42:37,940 > Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's > garbage-collected. Enable advanced leak reporting to find out where the leak > occurred. To enable advanced leak reporting, specify the JVM option > '-Dio.netty.leakDetection.level=advanced' or call > ResourceLeakDetector.setLevel() See > http://netty.io/wiki/reference-counted-objects.html for more information. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra git commit: Properly close StreamCompressionInputStream to release any ByteBuffer
Repository: cassandra Updated Branches: refs/heads/trunk d97d95ff1 -> 982ab93a2 Properly close StreamCompressionInputStream to release any ByteBuffer patch by jasobrown; reviewed by Ariel Weisberg for CASSANDRA-13906 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/982ab93a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/982ab93a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/982ab93a Branch: refs/heads/trunk Commit: 982ab93a2f8a0f5c56af9378f65d3e9e43b9 Parents: d97d95f Author: Jason Brown Authored: Tue Sep 26 15:52:54 2017 -0700 Committer: Jason Brown Committed: Wed Oct 4 04:21:20 2017 +0900 -- CHANGES.txt| 1 + .../apache/cassandra/streaming/StreamReader.java | 11 ++- .../streaming/compress/CompressedInputStream.java | 4 +++- .../streaming/compress/CompressedStreamReader.java | 17 ++--- .../compress/StreamCompressionInputStream.java | 7 ++- 5 files changed, 22 insertions(+), 18 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/982ab93a/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 2498270..5a8ab47 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Properly close StreamCompressionInputStream to release any ByteBuf (CASSANDRA-13906) * Add SERIAL and LOCAL_SERIAL support for cassandra-stress (CASSANDRA-13925) * LCS needlessly checks for L0 STCS candidates multiple times (CASSANDRA-12961) * Correctly close netty channels when a stream session ends (CASSANDRA-13905) http://git-wip-us.apache.org/repos/asf/cassandra/blob/982ab93a/src/java/org/apache/cassandra/streaming/StreamReader.java -- diff --git a/src/java/org/apache/cassandra/streaming/StreamReader.java b/src/java/org/apache/cassandra/streaming/StreamReader.java index 590ba5f..f4eb9c4 100644 --- a/src/java/org/apache/cassandra/streaming/StreamReader.java +++ b/src/java/org/apache/cassandra/streaming/StreamReader.java @@ -106,12 +106,12 @@ public class StreamReader 
session.planId(), fileSeqNum, session.peer, repairedAt, totalSize, cfs.keyspace.getName(), cfs.getTableName(), pendingRepair); - -TrackedDataInputPlus in = new TrackedDataInputPlus(new StreamCompressionInputStream(inputPlus, StreamMessage.CURRENT_VERSION)); -StreamDeserializer deserializer = new StreamDeserializer(cfs.metadata(), in, inputVersion, getHeader(cfs.metadata())); +StreamDeserializer deserializer = null; SSTableMultiWriter writer = null; -try +try (StreamCompressionInputStream streamCompressionInputStream = new StreamCompressionInputStream(inputPlus, StreamMessage.CURRENT_VERSION)) { +TrackedDataInputPlus in = new TrackedDataInputPlus(streamCompressionInputStream); +deserializer = new StreamDeserializer(cfs.metadata(), in, inputVersion, getHeader(cfs.metadata())); writer = createWriter(cfs, totalSize, repairedAt, pendingRepair, format); while (in.getBytesRead() < totalSize) { @@ -125,8 +125,9 @@ public class StreamReader } catch (Throwable e) { +Object partitionKey = deserializer != null ? 
deserializer.partitionKey() : ""; logger.warn("[Stream {}] Error while reading partition {} from stream on ks='{}' and table='{}'.", -session.planId(), deserializer.partitionKey(), cfs.keyspace.getName(), cfs.getTableName(), e); +session.planId(), partitionKey, cfs.keyspace.getName(), cfs.getTableName(), e); if (writer != null) { writer.abort(e); http://git-wip-us.apache.org/repos/asf/cassandra/blob/982ab93a/src/java/org/apache/cassandra/streaming/compress/CompressedInputStream.java -- diff --git a/src/java/org/apache/cassandra/streaming/compress/CompressedInputStream.java b/src/java/org/apache/cassandra/streaming/compress/CompressedInputStream.java index 76f76ea..4b9fc61 100644 --- a/src/java/org/apache/cassandra/streaming/compress/CompressedInputStream.java +++ b/src/java/org/apache/cassandra/streaming/compress/CompressedInputStream.java @@ -44,7 +44,7 @@ import org.apache.cassandra.utils.WrappedRunnable; * InputStream which reads data from underlining source with given {@link CompressionInfo}. Uses {@link #buffer} as a buffer * for uncompressed data (which is read by stream consumers - {@link StreamDeserializer} in this case). */ -public class CompressedInputStream extends Rebuffe
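[Editor's note] The diff above moves the stream into a try-with-resources block so the compression input stream (and any ByteBuf it holds) is closed on every exit path, including exceptions thrown mid-deserialization. A minimal, self-contained sketch of that pattern, using a stand-in reference-counted type (CountedBuffer is illustrative, not Netty's ByteBuf or the actual Cassandra class):

```java
// Sketch of the fix's pattern: wrap a reference-counted resource in
// AutoCloseable so it is released even when reading throws.
public class RefCountedDemo
{
    static class CountedBuffer implements AutoCloseable
    {
        private int refCnt = 1; // mimics Netty-style reference counting

        int refCnt()
        {
            return refCnt;
        }

        void read()
        {
            if (refCnt <= 0)
                throw new IllegalStateException("buffer already released");
        }

        @Override
        public void close()
        {
            refCnt--; // release() in Netty terms
        }
    }

    public static void main(String[] args)
    {
        CountedBuffer buffer = new CountedBuffer();
        // try-with-resources guarantees close() runs on every exit path;
        // the leak warning in CASSANDRA-13906 came from a stream that was
        // not closed on some paths, so its buffer was garbage-collected
        // without release() ever being called.
        try (CountedBuffer b = buffer)
        {
            b.read();
        }
        System.out.println(buffer.refCnt()); // released deterministically
    }
}
```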
[jira] [Commented] (CASSANDRA-13910) Consider deprecating (then removing) read_repair_chance/dclocal_read_repair_chance
[ https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190165#comment-16190165 ] Jason Brown commented on CASSANDRA-13910: - bq. I did send an email to the user list a few days ago here. No feedback on that thread just yet but happy to leave it at least 1-2 more weeks before making any move. Excellent, thank you. I think you can wait up to a week, and if no one is screaming, please proceed full speed ahead. > Consider deprecating (then removing) > read_repair_chance/dclocal_read_repair_chance > -- > > Key: CASSANDRA-13910 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13910 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Priority: Minor > Labels: CommunityFeedbackRequested > > First, let me clarify so this is not misunderstood that I'm not *at all* > suggesting to remove the read-repair mechanism of detecting and repairing > inconsistencies between read responses: that mechanism is imo fine and > useful. But the {{read_repair_chance}} and {{dclocal_read_repair_chance}} > have never been about _enabling_ that mechanism, they are about querying all > replicas (even when this is not required by the consistency level) for the > sole purpose of maybe read-repairing some of the replicas that wouldn't have > been queried otherwise. Which, btw, brings me to reason 1 for considering their > removal: their naming/behavior is super confusing. Over the years, I've seen > countless users (and not only newbies) misunderstanding what those options > do, and as a consequence misunderstanding when read-repair itself was happening. > But my 2nd reason for suggesting this is that I suspect > {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially > nowadays, more harmful than anything else when enabled. 
When those options > kick in, what you trade off is additional resource consumption (all nodes > have to execute the read) for a _fairly remote chance_ of having some > inconsistencies repaired on _some_ replica _a bit faster_ than they would > otherwise be. To justify that last part, let's recall that: > # most inconsistencies are actually fixed by hints in practice; and in the > case where a node stays dead for a long time so that hints end up timing out, > you really should repair the node when it comes back (if not simply > re-bootstrapping it). Read-repair probably doesn't fix _that_ much stuff in > the first place. > # again, read-repair does happen without those options kicking in. If you do > reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all > the same. Just a tiny bit less quickly. > # I suspect almost everyone uses a low "chance" for those options at best > (because the extra resource consumption is real), so at the end of the day, > it's up to chance how much faster this fixes inconsistencies. > Overall, I'm having a hard time imagining real cases where that trade-off > really makes sense. Don't get me wrong, those options had their place a long > time ago when hints weren't working all that well, but I think they bring > more confusion than benefits now. > And I think it's sane to reconsider stuff every once in a while, and to > clean up anything that may not make all that much sense anymore, which I > think is the case here. > Tl;dr, I feel the benefits brought by those options are very slim at best and > well overshadowed by the confusion they bring, and not worth maintaining the > code that supports them (which, to be fair, isn't huge, but getting rid of > {{ReadCallback.AsyncRepairRunner}} wouldn't hurt for instance). 
> Lastly, if the consensus here ends up being that they can have their use in > weird cases and that we feel supporting those cases is worth confusing > everyone else and maintaining that code, I would still suggest disabling them > totally by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
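[Editor's note] The "chance" mechanism being debated above is simply a per-read coin flip that decides whether to contact every replica instead of just the ones the consistency level requires. The sketch below illustrates that trade-off; the class shape and names are illustrative, not the actual Cassandra code path:

```java
import java.util.Random;

// Illustrative model of chance-based read repair: with probability
// readRepairChance, a read is sent to all replicas solely so their
// responses can be compared and stale replicas repaired; otherwise only
// the replicas required by the consistency level are contacted.
public class ReadRepairDecision
{
    static boolean queryAllReplicas(double readRepairChance, Random random)
    {
        return random.nextDouble() < readRepairChance;
    }

    public static void main(String[] args)
    {
        Random random = new Random(42L); // fixed seed for a repeatable run
        int extraReads = 0;
        int totalReads = 100_000;
        for (int i = 0; i < totalReads; i++)
        {
            if (queryAllReplicas(0.1, random))
                extraReads++;
        }
        // Roughly 10% of reads pay the full-replica cost; whether any given
        // inconsistency gets repaired sooner is left entirely to chance.
        System.out.println(extraReads > 9_000 && extraReads < 11_000);
    }
}
```

This is the crux of the ticket: the cluster pays a deterministic extra-read cost on a fraction of requests, while the benefit (faster repair of some inconsistency on some replica) is probabilistic and usually redundant with hints and QUORUM-read repair.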
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190166#comment-16190166 ] Andrey Lataev commented on CASSANDRA-13931: --- Also, I can attach a JVM heap dump if it helps. > Cassandra JVM stop itself randomly > -- > > Key: CASSANDRA-13931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13931 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: RHEL 7.3 > JDK HotSpot 1.8.0_121-b13 > cassandra-3.11 cluster with 43 nodes in 9 datacenters > 8vCPU, 32 GB RAM >Reporter: Andrey Lataev > Attachments: cassandra-env.sh, cassandra.yaml, > system.log.2017-10-01.zip > > > Before I set -XX:MaxDirectMemorySize, I received OOM kills at the OS level, like: > # # grep "Out of" /var/log/messages-20170918 > Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 > (java) score 287 or sacrifice child > Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 > (java) score 289 or sacrifice child > If I set the -XX:MaxDirectMemorySize=5G limitation, then I periodically begin to receive: > HeapUtils.java:136 - Dumping heap to > /egov/dumps/cassandra-1506868110-pid11155.hprof > It seems like the JVM kills itself when off-heap memory leaks occur. 
> Typical errors in system.log before JVM begin dumping: > ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 > CassandraDaemon.java:228 - Exception in thread > Thread[MessagingService-Incoming-/172.20.4.143,5,main] > ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 > Message.java:625 - Unexpected exception during request; channel = [id: > 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874] > Full stack traces: > ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 > Message.java:625 - Unexpected exception during request; channel = [id: > 0x3c0c1c26, L:/172.20.4.142:9042 - > R:/172.20.4.139:44874] > java.lang.AssertionError: null > at > org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) > [apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) > [apache-cassandra-3.11.0.jar:3.11.0] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_121] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > [apache-cassandra-3.11.0.jar:3.1 > 1.0] > at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > [apache-cassandra-3.11.0.jar:3.11.0] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] > INFO [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - > Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ... > Heap dump file created > ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 > CassandraDaemon.java:228 - Exception in thread > Thread[MessagingService-Incoming-/172.20.4.143,5,main] > java.io.IOError: java.io.EOFException: Stream ended prematurely > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(M
[jira] [Updated] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Lataev updated CASSANDRA-13931: -- Attachment: system.log.2017-10-01.zip > Cassandra JVM stop itself randomly > -- > > Key: CASSANDRA-13931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13931 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: RHEL 7.3 > JDK HotSpot 1.8.0_121-b13 > cassandra-3.11 cluster with 43 nodes in 9 datacenters > 8vCPU, 32 GB RAM >Reporter: Andrey Lataev > Attachments: cassandra-env.sh, cassandra.yaml, > system.log.2017-10-01.zip > > > Before I set -XX:MaxDirectMemorySize, I received OOM kills at the OS level, like: > # # grep "Out of" /var/log/messages-20170918 > Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 > (java) score 287 or sacrifice child > Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 > (java) score 289 or sacrifice child > If I set the -XX:MaxDirectMemorySize=5G limitation, then I periodically begin to receive: > HeapUtils.java:136 - Dumping heap to > /egov/dumps/cassandra-1506868110-pid11155.hprof > It seems like the JVM kills itself when off-heap memory leaks occur. 
> Typical errors in system.log before JVM begin dumping: > ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 > CassandraDaemon.java:228 - Exception in thread > Thread[MessagingService-Incoming-/172.20.4.143,5,main] > ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 > Message.java:625 - Unexpected exception during request; channel = [id: > 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874] > Full stack traces: > ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 > Message.java:625 - Unexpected exception during request; channel = [id: > 0x3c0c1c26, L:/172.20.4.142:9042 - > R:/172.20.4.139:44874] > java.lang.AssertionError: null > at > org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) > [apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) > [apache-cassandra-3.11.0.jar:3.11.0] > at > io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) > [netty-all-4.0.44.Final.jar:4.0.44.Final] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_121] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > [apache-cassandra-3.11.0.jar:3.1 > 1.0] > at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > [apache-cassandra-3.11.0.jar:3.11.0] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] > INFO [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - > Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ... > Heap dump file created > ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 > CassandraDaemon.java:228 - Exception in thread > Thread[MessagingService-Incoming-/172.20.4.143,5,main] > java.io.IOError: java.io.EOFException: Stream ended prematurely > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800) > ~[apache-cassandra-3.11.0.jar:3.11.0] > at > org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415) > ~[apache-cassandra-3.11.0.jar:3.11.0] >
[jira] [Created] (CASSANDRA-13931) Cassandra JVM stop itself randomly
Andrey Lataev created CASSANDRA-13931: - Summary: Cassandra JVM stop itself randomly Key: CASSANDRA-13931 URL: https://issues.apache.org/jira/browse/CASSANDRA-13931 Project: Cassandra Issue Type: Bug Components: Core Environment: RHEL 7.3 JDK HotSpot 1.8.0_121-b13 cassandra-3.11 cluster with 43 nodes in 9 datacenters 8vCPU, 32 GB RAM Reporter: Andrey Lataev Attachments: cassandra-env.sh, cassandra.yaml Before I set -XX:MaxDirectMemorySize, I received OOM kills at the OS level, like: # # grep "Out of" /var/log/messages-20170918 Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 (java) score 287 or sacrifice child Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 (java) score 289 or sacrifice child If I set the -XX:MaxDirectMemorySize=5G limitation, then I periodically begin to receive: HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof It seems like the JVM kills itself when off-heap memory leaks occur. Typical errors in system.log before the JVM begins dumping: ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main] ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874] Full stack traces: ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874] java.lang.AssertionError: null at org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) [apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) 
[apache-cassandra-3.11.0.jar:3.11.0] at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final] at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final] at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121] at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.0.jar:3.1 1.0] at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.0.jar:3.11.0] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] INFO [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ... 
Heap dump file created ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main] java.io.IOError: java.io.EOFException: Stream ended prematurely at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371) ~[apache-cassandra-3.11.0.jar:3.11.0] at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.0.jar:3.11.0]
[jira] [Commented] (CASSANDRA-13906) Properly close StreamCompressionInputStream to release any ByteBuf
[ https://issues.apache.org/jira/browse/CASSANDRA-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190153#comment-16190153 ] Jason Brown commented on CASSANDRA-13906: - [~aweisberg] and I discussed offline, and I'll revert the overly cautious (and perhaps incorrect) change around that refCnt. > Properly close StreamCompressionInputStream to release any ByteBuf > -- > > Key: CASSANDRA-13906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13906 > Project: Cassandra > Issue Type: Bug >Reporter: Jason Brown >Assignee: Jason Brown > > When running dtests for trunk (4.x) that perform some streaming, sometimes a > {{ByteBuf}} is not released properly, and we get this error in the logs > (causing the dtest to fail): > {code} > ERROR [MessagingService-NettyOutbound-Thread-4-2] 2017-09-26 13:42:37,940 > Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's > garbage-collected. Enable advanced leak reporting to find out where the leak > occurred. To enable advanced leak reporting, specify the JVM option > '-Dio.netty.leakDetection.level=advanced' or call > ResourceLeakDetector.setLevel() See > http://netty.io/wiki/reference-counted-objects.html for more information. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13910) Consider deprecating (then removing) read_repair_chance/dclocal_read_repair_chance
[ https://issues.apache.org/jira/browse/CASSANDRA-13910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190122#comment-16190122 ] Aleksey Yeschenko commented on CASSANDRA-13910: --- bq. C* is successful enough that anything that has been in the product for some amount of time is probably relied upon by someone, somewhere, for some definition of "rely". Guaranteeing that it's not the case as a bar for removing anything would amount to removing nothing, and that would, imo, be dangerous for the project. Well said. bq. Short of that though, I suggest we move ahead with this rather than keep something we agree is more harmful than helpful most of the time (and we seem to more or less agree on that) on the off chance this may piss off somebody somewhere. +1 > Consider deprecating (then removing) > read_repair_chance/dclocal_read_repair_chance > -- > > Key: CASSANDRA-13910 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13910 > Project: Cassandra > Issue Type: Improvement >Reporter: Sylvain Lebresne >Priority: Minor > Labels: CommunityFeedbackRequested > > First, let me clarify so this is not misunderstood that I'm not *at all* > suggesting to remove the read-repair mechanism of detecting and repairing > inconsistencies between read responses: that mechanism is imo fine and > useful. But the {{read_repair_chance}} and {{dclocal_read_repair_chance}} > have never been about _enabling_ that mechanism, they are about querying all > replicas (even when this is not required by the consistency level) for the > sole purpose of maybe read-repairing some of the replicas that wouldn't have > been queried otherwise. Which, btw, brings me to reason 1 for considering their > removal: their naming/behavior is super confusing. Over the years, I've seen > countless users (and not only newbies) misunderstanding what those options > do, and as a consequence misunderstanding when read-repair itself was happening. 
> But my 2nd reason for suggesting this is that I suspect > {{read_repair_chance}}/{{dclocal_read_repair_chance}} are, especially > nowadays, more harmful than anything else when enabled. When those options > kick in, what you trade off is additional resource consumption (all nodes > have to execute the read) for a _fairly remote chance_ of having some > inconsistencies repaired on _some_ replica _a bit faster_ than they would > otherwise be. To justify that last part, let's recall that: > # most inconsistencies are actually fixed by hints in practice; and in the > case where a node stays dead for a long time so that hints end up timing out, > you really should repair the node when it comes back (if not simply > re-bootstrapping it). Read-repair probably doesn't fix _that_ much stuff in > the first place. > # again, read-repair does happen without those options kicking in. If you do > reads at {{QUORUM}}, inconsistencies will eventually get read-repaired all > the same. Just a tiny bit less quickly. > # I suspect almost everyone uses a low "chance" for those options at best > (because the extra resource consumption is real), so at the end of the day, > it's up to chance how much faster this fixes inconsistencies. > Overall, I'm having a hard time imagining real cases where that trade-off > really makes sense. Don't get me wrong, those options had their place a long > time ago when hints weren't working all that well, but I think they bring > more confusion than benefits now. > And I think it's sane to reconsider stuff every once in a while, and to > clean up anything that may not make all that much sense anymore, which I > think is the case here. > Tl;dr, I feel the benefits brought by those options are very slim at best and > well overshadowed by the confusion they bring, and not worth maintaining the > code that supports them (which, to be fair, isn't huge, but getting rid of > {{ReadCallback.AsyncRepairRunner}} wouldn't hurt for instance). 
> Lastly, if the consensus here ends up being that they can have their use in > weird cases and that we feel supporting those cases is worth confusing > everyone else and maintaining that code, I would still suggest disabling them > totally by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
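The trade-off Sylvain describes, querying every replica with some probability purely to maybe repair one of them a bit sooner, can be sketched as a per-read chance roll. This is illustrative Java; the names and structure are assumptions, not Cassandra's internal API.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of a chance-based read-repair decision (hypothetical
// names, not Cassandra's actual code): roll once per read, and with
// probability globalChance query all replicas everywhere, with probability
// dcLocalChance query all replicas in the local DC, otherwise touch only
// the replicas the consistency level requires.
public class ReadRepairDecision {
    enum Scope { NONE, DC_LOCAL, GLOBAL }

    static Scope decide(double globalChance, double dcLocalChance) {
        double d = ThreadLocalRandom.current().nextDouble();
        if (d < globalChance)
            return Scope.GLOBAL;                 // extra reads on every replica
        if (d < globalChance + dcLocalChance)
            return Scope.DC_LOCAL;               // extra reads in this DC only
        return Scope.NONE;                       // no extra work
    }

    public static void main(String[] args) {
        int extra = 0;
        for (int i = 0; i < 100_000; i++)
            if (decide(0.0, 0.1) != Scope.NONE)
                extra++;                         // roughly 10% of reads pay the cost
        System.out.println(extra);
    }
}
```

The point of the sketch is that with a low "chance" the extra load is paid on a steady fraction of reads, while whether any given inconsistency is repaired early remains up to chance.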
[jira] [Commented] (CASSANDRA-13906) Properly close StreamCompressionInputStream to release any ByteBuf
[ https://issues.apache.org/jira/browse/CASSANDRA-13906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190107#comment-16190107 ] Ariel Weisberg commented on CASSANDRA-13906: +1 > Properly close StreamCompressionInputStream to release any ByteBuf > -- > > Key: CASSANDRA-13906 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13906 > Project: Cassandra > Issue Type: Bug >Reporter: Jason Brown >Assignee: Jason Brown > > When running dtests for trunk (4.x) that perform some streaming, sometimes a > {{ByteBuf}} is not released properly, and we get this error in the logs > (causing the dtest to fail): > {code} > ERROR [MessagingService-NettyOutbound-Thread-4-2] 2017-09-26 13:42:37,940 > Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's > garbage-collected. Enable advanced leak reporting to find out where the leak > occurred. To enable advanced leak reporting, specify the JVM option > '-Dio.netty.leakDetection.level=advanced' or call > ResourceLeakDetector.setLevel() See > http://netty.io/wiki/reference-counted-objects.html for more information. > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
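The LEAK warning above is Netty's reference-count leak detector firing because a buffer was garbage-collected with its count still above zero. The fix pattern is the usual one: the stream that owns the buffer releases it in close(), and callers close the stream on every path. A minimal plain-Java analogue (not Netty's actual API):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hand-rolled stand-in for a reference-counted buffer and the stream that
// owns it (illustrative only; Netty's ByteBuf/release work analogously).
public class RefCountedBuf {
    final AtomicInteger refCnt = new AtomicInteger(1);

    boolean release() { return refCnt.decrementAndGet() == 0; }

    static class CompressedInput implements AutoCloseable {
        private final RefCountedBuf buf;
        CompressedInput(RefCountedBuf buf) { this.buf = buf; }
        @Override public void close() { buf.release(); } // release on close
    }

    public static void main(String[] args) {
        RefCountedBuf buf = new RefCountedBuf();
        try (CompressedInput in = new CompressedInput(buf)) {
            // ... read from the stream ...
        }
        System.out.println("refCnt=" + buf.refCnt.get()); // 0: no leak to detect
    }
}
```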
[jira] [Commented] (CASSANDRA-12744) Randomness of stress distributions is not good
[ https://issues.apache.org/jira/browse/CASSANDRA-12744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190085#comment-16190085 ] Daniel Cranford commented on CASSANDRA-12744: - As I've thought about how to fix the seed multiplier, I've come to the conclusion that it is impossible to use an adaptive multiplier without breaking existing functionality or changing the command line interface. One of the key reasons you can specify how the seeds get generated is so that you can partition the seed space and run multiple cassandra-stress processes on different machines in parallel so the cassandra-stress client doesn't become the bottleneck. E.g., to write 2 million partitions from two client machines, you'd run {noformat}cassandra-stress write n=1000000 -pop seq=1..1000000{noformat} on one client machine and {noformat}cassandra-stress write n=1000000 -pop seq=1000001..2000000{noformat} on the other client machine. An adaptive multiplier that attempts to scale the seed sequence so that its range is 10^22 (or better, Long.MAX_VALUE since seeds are 64-bit longs) would generate the same multiplier for both client processes resulting in seed sequence overlaps. To correctly generate an adaptive multiplier, you need global knowledge of the entire range of seeds being generated by all cassandra-stress processes. This information cannot be supplied via the current command line interface. The command line interface would have to be updated in a breaking fashion to support an adaptive multiplier. Using a hardcoded static multiplier is safe, but would reduce the allowable range of seed values (and thus reduce the maximum number of distinct partition keys). This probably isn't a big deal since nobody wants to write 2^64 partitions. But it would need to be chosen with care so that the number of distinct seeds (and thus the number of distinct partitions) doesn't become too small. 
> Randomness of stress distributions is not good > -- > > Key: CASSANDRA-12744 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12744 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: T Jake Luciani >Assignee: Ben Slater >Priority: Minor > Labels: stress > Fix For: 4.0 > > Attachments: CASSANDRA_12744_SeedManager_changes-trunk.patch > > > The randomness of our distributions is pretty bad. We are using the > JDKRandomGenerator() but in testing of uniform(1..3) we see for 100 > iterations it's only outputting 3. If you bump it to 10k it hits all 3 > values. > I made a change to just use the default commons math random generator and now > see all 3 values for n=10 -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
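The collision Daniel describes can be made concrete with a small sketch (illustrative Java, not cassandra-stress's actual code): if each client derives a multiplier only from its own seq range so that its seeds span a fixed global range, clients given disjoint seq ranges produce seed sequences that land on the same values.

```java
// Hypothetical adaptive-multiplier scheme: scale so the largest local seq
// value maps to the top of a fixed global span. Two clients that were given
// disjoint seq ranges then derive different multipliers whose seed sequences
// collide exactly.
public class SeedOverlap {
    static long adaptiveMultiplier(long maxSeq, long targetSpan) {
        return targetSpan / maxSeq;
    }

    static long seed(long seq, long multiplier) { return seq * multiplier; }

    public static void main(String[] args) {
        long span = 1_000_000_000_000L;                 // stand-in for the 10^22 / Long.MAX_VALUE span
        long m1 = adaptiveMultiplier(1_000_000, span);  // client A: seq=1..1000000
        long m2 = adaptiveMultiplier(2_000_000, span);  // client B: seq=1000001..2000000
        // Client B's seed for seq=1000002 equals client A's seed for seq=500001,
        // so the two "disjoint" clients write the same partition key:
        System.out.println(seed(1_000_002, m2) == seed(500_001, m1)); // true
    }
}
```

Avoiding this requires knowing the union of all seq ranges in play, which, as the comment notes, the current command line cannot express.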
[jira] [Updated] (CASSANDRA-13924) Continuous/Infectious Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Jirsa updated CASSANDRA-13924: --- Labels: CommunityFeedbackRequested (was: ) > Continuous/Infectious Repair > > > Key: CASSANDRA-13924 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13924 > Project: Cassandra > Issue Type: Improvement > Components: Repair >Reporter: Joseph Lynch >Priority: Minor > Labels: CommunityFeedbackRequested > > I've been working on a way to keep data consistent without > scheduled/external/manual repair, because for large datasets repair is > extremely expensive. The basic gist is to introduce a new kind of hint that > keeps just the primary key of the mutation (indicating that PK needs repair) > and is recorded on replicas instead of coordinators during write time. Then a > periodic background task can issue read repairs to just the PKs that were > mutated. The initial performance degradation of this approach is non trivial, > but I believe that I can optimize it so that we are doing very little > additional work (see below in the design doc for some proposed optimizations). > My extremely rough proof of concept (uses a local table instead of > HintStorage, etc) so far is [in a > branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:continuous_repair] > and has a rough [design > document|https://github.com/jolynch/cassandra/blob/continuous_repair/doc/source/architecture/continuous_repair.rst]. > I'm working on getting benchmarks of the various optimizations, but I > figured I should start this ticket before I got too deep into it. > I believe this approach is particularly good for high read rate clusters > requiring consistent low latency, and for clusters that mutate a relatively > small proportion of their data (since you never have to read the whole > dataset, just what's being mutated). 
I view this as something that works > _with_ incremental repair to reduce work required because with this technique > we could potentially flush repaired + unrepaired sstables directly from the > memtable. I also see this as something that would be enabled or disabled per > table since it is so use case specific (e.g. some tables don't need repair at > all). I think this is somewhat of a hybrid approach based on incremental > repair, ticklers (read all partitions @ ALL), mutation based repair > (CASSANDRA-8911), and hinted handoff. There are lots of tradeoffs, but I > think it's worth talking about. > If anyone has feedback on the idea, I'd love to chat about it. > [~bdeggleston], [~aweisberg] I chatted with you guys a bit about this at > NGCC; if you have time I'd love to continue that conversation here. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
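The PK-only hint Joseph describes can be sketched roughly as follows. All names here are hypothetical; per the description, the actual proof of concept uses a local table rather than an in-memory set, and the real read repair goes through Cassandra's read path.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Rough sketch of the "hint that keeps just the primary key" idea: the
// replica records only the PK of each incoming mutation, and a periodic
// background task drains the set, issuing a read repair per dirty key.
public class PkHintLog {
    private final Set<String> dirtyKeys = ConcurrentHashMap.newKeySet();

    // Write path on the replica: O(1), stores no mutation data, only the key.
    void recordMutation(String primaryKey) { dirtyKeys.add(primaryKey); }

    // Background task: read each dirty key in a way that triggers the normal
    // read-repair machinery (e.g. a read at CL.ALL), then forget the key.
    int drainAndRepair(Consumer<String> readRepair) {
        int repaired = 0;
        for (String pk : dirtyKeys) {
            readRepair.accept(pk);
            dirtyKeys.remove(pk);
            repaired++;
        }
        return repaired;
    }

    public static void main(String[] args) {
        PkHintLog log = new PkHintLog();
        log.recordMutation("user:42");
        log.recordMutation("user:42");   // repeated mutations collapse to one entry
        log.recordMutation("user:7");
        System.out.println(log.drainAndRepair(pk -> System.out.println("repair " + pk)));
    }
}
```

This also shows why the approach suits workloads that mutate a small fraction of their data: the repair work is proportional to the set of mutated keys, not to the full dataset.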
[jira] [Updated] (CASSANDRA-13924) Continuous/Infectious Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-13924: - Description: I've been working on a way to keep data consistent without scheduled/external/manual repair, because for large datasets repair is extremely expensive. The basic gist is to introduce a new kind of hint that keeps just the primary key of the mutation (indicating that PK needs repair) and is recorded on replicas instead of coordinators during write time. Then a periodic background task can issue read repairs to just the PKs that were mutated. The initial performance degradation of this approach is non trivial, but I believe that I can optimize it so that we are doing very little additional work (see below in the design doc for some proposed optimizations). My extremely rough proof of concept (uses a local table instead of HintStorage, etc) so far is [in a branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:continuous_repair] and has a rough [design document|https://github.com/jolynch/cassandra/blob/continuous_repair/doc/source/architecture/continuous_repair.rst. I'm working on getting benchmarks of the various optimizations, but I figured I should start this ticket before I got too deep into it. I believe this approach is particularly good for high read rate clusters requiring consistent low latency, and for clusters that mutate a relatively small proportion of their data (since you never have to read the whole dataset, just what's being mutated). I view this as something that works _with_ incremental repair to reduce work required because with this technique we could potentially flush repaired + unrepaired sstables directly from the memtable. I also see this as something that would be enabled or disabled per table since it is so use case specific (e.g. some tables don't need repair at all). 
I think this is somewhat of a hybrid approach based on incremental repair, ticklers (read all partitions @ ALL), mutation based repair (CASSANDRA-8911), and hinted handoff. There are lots of tradeoffs, but I think it's worth talking about. If anyone has feedback on the idea, I'd love to chat about it. [~bdeggleston], [~aweisberg] I chatted with you guys a bit about this at NGCC; if you have time I'd love to continue that conversation here. was: I've been working on a way to keep data consistent without scheduled/external/manual repair, because for large datasets repair is extremely expensive. The basic gist is to introduce a new kind of hint that keeps just the primary key of the mutation (indicating that PK needs repair) and is recorded on replicas instead of coordinators during write time. Then a periodic background task can issue read repairs to just the PKs that were mutated. The initial performance degradation of this approach is non trivial, but I believe that I can optimize it so that we are doing very little additional work (see below in the design doc for some proposed optimizations). My extremely rough proof of concept (uses a local table instead of HintStorage, etc) so far is [in a branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:continuous_repair] and has a rough [design document|https://github.com/jolynch/cassandra/blob/c597c0fc6415e00fa8db180be5034214d148822d/doc/source/architecture/continuous_repair.rst]. I'm working on getting benchmarks of the various optimizations, but I figured I should start this ticket before I got too deep into it. I believe this approach is particularly good for high read rate clusters requiring consistent low latency, and for clusters that mutate a relatively small proportion of their data (since you never have to read the whole dataset, just what's being mutated). 
I view this as something that works _with_ incremental repair to reduce work required because with this technique we could potentially flush repaired + unrepaired sstables directly from the memtable. I also see this as something that would be enabled or disabled per table since it is so use case specific (e.g. some tables don't need repair at all). I think this is somewhat of a hybrid approach based on incremental repair, ticklers (read all partitions @ ALL), mutation based repair (CASSANDRA-8911), and hinted handoff. There are lots of tradeoffs, but I think it's worth talking about. If anyone has feedback on the idea, I'd love to chat about it. [~bdeggleston], [~aweisberg] I chatted with you guys a bit about this at NGCC; if you have time I'd love to continue that conversation here. > Continuous/Infectious Repair > > > Key: CASSANDRA-13924 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13924 > Project: Cassandra > Issue Type: Improvement > Components: Repair >Reporter: Joseph
[jira] [Updated] (CASSANDRA-13924) Continuous/Infectious Repair
[ https://issues.apache.org/jira/browse/CASSANDRA-13924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-13924: - Description: I've been working on a way to keep data consistent without scheduled/external/manual repair, because for large datasets repair is extremely expensive. The basic gist is to introduce a new kind of hint that keeps just the primary key of the mutation (indicating that PK needs repair) and is recorded on replicas instead of coordinators during write time. Then a periodic background task can issue read repairs to just the PKs that were mutated. The initial performance degradation of this approach is non trivial, but I believe that I can optimize it so that we are doing very little additional work (see below in the design doc for some proposed optimizations). My extremely rough proof of concept (uses a local table instead of HintStorage, etc) so far is [in a branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:continuous_repair] and has a rough [design document|https://github.com/jolynch/cassandra/blob/continuous_repair/doc/source/architecture/continuous_repair.rst]. I'm working on getting benchmarks of the various optimizations, but I figured I should start this ticket before I got too deep into it. I believe this approach is particularly good for high read rate clusters requiring consistent low latency, and for clusters that mutate a relatively small proportion of their data (since you never have to read the whole dataset, just what's being mutated). I view this as something that works _with_ incremental repair to reduce work required because with this technique we could potentially flush repaired + unrepaired sstables directly from the memtable. I also see this as something that would be enabled or disabled per table since it is so use case specific (e.g. some tables don't need repair at all). 
I think this is somewhat of a hybrid approach based on incremental repair, ticklers (read all partitions @ ALL), mutation based repair (CASSANDRA-8911), and hinted handoff. There are lots of tradeoffs, but I think it's worth talking about. If anyone has feedback on the idea, I'd love to chat about it. [~bdeggleston], [~aweisberg] I chatted with you guys a bit about this at NGCC; if you have time I'd love to continue that conversation here. was: I've been working on a way to keep data consistent without scheduled/external/manual repair, because for large datasets repair is extremely expensive. The basic gist is to introduce a new kind of hint that keeps just the primary key of the mutation (indicating that PK needs repair) and is recorded on replicas instead of coordinators during write time. Then a periodic background task can issue read repairs to just the PKs that were mutated. The initial performance degradation of this approach is non trivial, but I believe that I can optimize it so that we are doing very little additional work (see below in the design doc for some proposed optimizations). My extremely rough proof of concept (uses a local table instead of HintStorage, etc) so far is [in a branch|https://github.com/apache/cassandra/compare/cassandra-3.11...jolynch:continuous_repair] and has a rough [design document|https://github.com/jolynch/cassandra/blob/continuous_repair/doc/source/architecture/continuous_repair.rst. I'm working on getting benchmarks of the various optimizations, but I figured I should start this ticket before I got too deep into it. I believe this approach is particularly good for high read rate clusters requiring consistent low latency, and for clusters that mutate a relatively small proportion of their data (since you never have to read the whole dataset, just what's being mutated). 
I view this as something that works _with_ incremental repair to reduce work required because with this technique we could potentially flush repaired + unrepaired sstables directly from the memtable. I also see this as something that would be enabled or disabled per table since it is so use case specific (e.g. some tables don't need repair at all). I think this is somewhat of a hybrid approach based on incremental repair, ticklers (read all partitions @ ALL), mutation based repair (CASSANDRA-8911), and hinted handoff. There are lots of tradeoffs, but I think it's worth talking about. If anyone has feedback on the idea, I'd love to chat about it. [~bdeggleston], [~aweisberg] I chatted with you guys a bit about this at NGCC; if you have time I'd love to continue that conversation here. > Continuous/Infectious Repair > > > Key: CASSANDRA-13924 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13924 > Project: Cassandra > Issue Type: Improvement > Components: Repair >Reporter: Joseph Lynch >Prio
[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190006#comment-16190006 ] T Jake Luciani commented on CASSANDRA-13929: This was part of CASSANDRA-9766 and addresses one of the main allocation culprits during streaming. > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). 
In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can't say if this has any other side effects etc., but I doubt it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
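The patch above is an instance of a general pooling rule: an object handed back to a pool should drop references it no longer needs, so the pooled instance cannot pin a larger object graph on the heap. A hand-rolled analogue (not Netty's Recycler API; names chosen to mirror the patch):

```java
import java.util.ArrayDeque;

// Simplified stand-in for a recyclable builder: on recycle, bulky working
// state and the pool handle are both nulled out, mirroring the ADDED line
// in the patch, so neither can keep unrelated memory alive while pooled.
public class PoolSketch {
    static final ArrayDeque<Builder> POOL = new ArrayDeque<>();

    static class Builder {
        Object recycleHandle = new Object(); // stand-in for the recycler's handle
        byte[] scratch = new byte[1 << 20];  // large working buffer

        void recycle() {
            scratch = null;                  // cleanup(): drop bulky state
            if (recycleHandle != null) {     // guard against double recycle
                POOL.push(this);
                recycleHandle = null;        // the line the patch adds
            }
        }
    }

    public static void main(String[] args) {
        Builder b = new Builder();
        b.recycle();
        System.out.println(POOL.size() + " pooled, handle=" + POOL.peek().recycleHandle);
    }
}
```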
[jira] [Commented] (CASSANDRA-10520) Compressed writer and reader should support non-compressed data.
[ https://issues.apache.org/jira/browse/CASSANDRA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189996#comment-16189996 ] Adam Holmberg commented on CASSANDRA-10520: --- I confirmed that it displays as expected when set in compression options. {code:none} cassandra@cqlsh:test> desc t; CREATE TABLE test.t ( k int PRIMARY KEY, value int ) WITH bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'} AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1.0 AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 130 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99PERCENTILE'; cassandra@cqlsh:test> alter table test.t with compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor', 'min_compress_ratio': '1.2'}; cassandra@cqlsh:test> desc test.t; CREATE TABLE test.t ( k int PRIMARY KEY, value int ) WITH bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'} AND comment = '' AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'} AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor', 'min_compress_ratio': '1.2'} AND crc_check_chance = 1.0 AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 130 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99PERCENTILE'; cassandra@cqlsh:test> {code} I 
did note that it's not in the metadata if it's not explicitly set. > Compressed writer and reader should support non-compressed data. > > > Key: CASSANDRA-10520 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10520 > Project: Cassandra > Issue Type: Improvement > Components: Local Write-Read Paths >Reporter: Branimir Lambov >Assignee: Branimir Lambov > Labels: client-impacting, messaging-service-bump-required > Fix For: 4.0 > > Attachments: ReadWriteTestCompression.java > > > Compressing uncompressible data, as done, for instance, to write SSTables > during stress-tests, results in chunks larger than 64k which are a problem > for the buffer pooling mechanisms employed by the > {{CompressedRandomAccessReader}}. This results in non-negligible performance > issues due to excessive memory allocation. > To solve this problem and avoid decompression delays in the cases where it > does not provide benefits, I think we should allow compressed files to store > uncompressed chunks as alternative to compressed data. Such a chunk could be > written after compression returns a buffer larger than, for example, 90% of > the input, and would not result in additional delays in writing. On reads it > could be recognized by size (using a single global threshold constant in the > compression metadata) and data could be directly transferred into the > decompressed buffer, skipping the decompression step and ensuring a 64k > buffer for compressed data always suffices. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
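The write-side rule the ticket proposes, storing a chunk uncompressed when compression saves less than {{min_compress_ratio}}, can be sketched as follows (illustrative, not Cassandra's actual code):

```java
// Sketch of the decision CASSANDRA-10520 describes: compute the achieved
// ratio (uncompressed size / compressed size); below the configured
// min_compress_ratio, keep the raw bytes. On read, an uncompressed chunk
// can be recognized by its size and copied straight through, skipping
// decompression.
public class ChunkWriter {
    static byte[] maybeCompress(byte[] input, byte[] compressed, double minCompressRatio) {
        double ratio = (double) input.length / compressed.length;
        return ratio >= minCompressRatio ? compressed : input;
    }

    public static void main(String[] args) {
        byte[] raw = new byte[1000];
        byte[] good = new byte[500];   // compresses 2:1, well above the threshold
        byte[] bad = new byte[950];    // compresses ~1.05:1, not worth it
        System.out.println(maybeCompress(raw, good, 1.2).length); // 500: keep compressed
        System.out.println(maybeCompress(raw, bad, 1.2).length);  // 1000: store raw
    }
}
```

This matches the cqlsh session above: with {{'min_compress_ratio': '1.2'}}, chunks that compress worse than 1.2:1 would be written uncompressed.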
[jira] [Commented] (CASSANDRA-13923) Flushers blocked due to many SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189997#comment-16189997 ] Dan Kinder commented on CASSANDRA-13923: Brought up one node with [the patch|https://issues.apache.org/jira/secure/attachment/12852352/simple-cache.patch] -- wow, it starts up WAY faster. It also seemed to properly complete flushes on that node so I'm pushing out to other nodes now to see how it does. > Flushers blocked due to many SSTables > - > > Key: CASSANDRA-13923 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13923 > Project: Cassandra > Issue Type: Bug > Components: Compaction, Local Write-Read Paths > Environment: Cassandra 3.11.0 > Centos 6 (downgraded JNA) > 64GB RAM > 12-disk JBOD >Reporter: Dan Kinder >Assignee: Marcus Eriksson > Attachments: cassandra-jstack-readstage.txt, cassandra-jstack.txt > > > This started on the mailing list and I'm not 100% sure of the root cause, > feel free to re-title if needed. > I just upgraded Cassandra from 2.2.6 to 3.11.0. Within a few hours of serving > traffic, thread pools begin to back up and grow pending tasks indefinitely. > This happens to multiple different stages (Read, Mutation) and consistently > builds pending tasks for MemtablePostFlush and MemtableFlushWriter. > Using jstack shows that there is blocking going on when trying to call > getCompactionCandidates, which seems to happen on flush. We have fairly large > nodes that have ~15,000 SSTables per node, all LCS. > It seems like this can cause reads to get blocked because they try to acquire > a read lock when calling shouldDefragment. > And writes, of course, block once we can't allocate any more memtables, > because flushes are backed up. > We did not have this problem in 2.2.6, so it seems like there is some > regression causing it to be incredibly slow trying to do calls like > getCompactionCandidates that list out the SSTables. 
> In our case this causes nodes to build up pending tasks and simply stop > responding to requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
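The "simple-cache" patch's idea, as far as can be inferred from this thread, is to stop recomputing an expensive per-flush scan (listing candidate SSTables) while holding the strategy lock, and instead cache the result until the SSTable set changes. A generic sketch with hypothetical names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Generic invalidate-on-change cache: get() recomputes only when the cached
// list has been invalidated, so a flush with an unchanged SSTable set pays
// nothing for the candidate scan.
public class CandidateCache<T> {
    private List<T> cached;
    private final Supplier<List<T>> compute; // the expensive scan over every SSTable
    int computations = 0;                    // exposed for the demo below

    CandidateCache(Supplier<List<T>> compute) { this.compute = compute; }

    synchronized List<T> get() {
        if (cached == null) {
            cached = compute.get();
            computations++;
        }
        return cached;
    }

    // Call when an SSTable is added or removed; the next get() recomputes.
    synchronized void invalidate() { cached = null; }

    public static void main(String[] args) {
        CandidateCache<String> cache =
            new CandidateCache<>(() -> Arrays.asList("sstable-1", "sstable-2"));
        cache.get();
        cache.get();                 // served from cache, no second scan
        cache.invalidate();
        cache.get();                 // recomputed once after invalidation
        System.out.println(cache.computations); // 2
    }
}
```

With ~15,000 SSTables per node, turning an O(n) scan per flush into a cached lookup is consistent with the much faster startup reported above.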
[jira] [Commented] (CASSANDRA-13928) Remove initialDirectories from CFS
[ https://issues.apache.org/jira/browse/CASSANDRA-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189974#comment-16189974 ] Eduard Tudenhoefner commented on CASSANDRA-13928: - [~krummas] yes I think it's safe to remove the *initialDirectories* stuff. > Remove initialDirectories from CFS > -- > > Key: CASSANDRA-13928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13928 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > The initialDirectories added in CASSANDRA-8671 is quite confusing and I don't > think it is needed anymore, it should be removed -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-9608) Support Java 9
[ https://issues.apache.org/jira/browse/CASSANDRA-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9608: Assignee: Robert Stupp Status: Patch Available (was: Open) Alright, after some quiet period, here's [a patch|https://github.com/apache/cassandra/compare/trunk...snazy:9608-trunk] to make current trunk run on Java9. We ([~luy] and I) just presented the code changes at JavaOne. Basically all concerns above are addressed except the JMX changes proposed by [~alanb] and [~mandy.ch...@oracle.com], which would be better addressed in a separate ticket. We still have to support Java 8 and that means it must be built against Java 8 and not Java 9. The generated code however runs on 8 and 9. CI must run against 8 + 9. CI is looking good: * testall against Java 8 * testall against Java 9 * dtests against Java 8 * dtests against Java 9 not run yet Included in this patch: * library updated for jamm and ohc to address new Java 9 version strings * distinction of java8 and java9 in jvm.options and split out jvm8.options + jvm9.options * JVM log parameter changes * Slight changes for Java UDFs * Not that slight changes for JavaScript UDFs * Change in {{AtomicBTreePartition}} to reflect the removal of some {{Unsafe}} methods * Abstraction of {{Cleaner}} in {{FileUtils}} * Adoptions in {{JMXServerUtils}} * Addressed some noise caused by {{ant eclipse-warnings}} > Support Java 9 > -- > > Key: CASSANDRA-9608 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9608 > Project: Cassandra > Issue Type: Task >Reporter: Robert Stupp >Assignee: Robert Stupp >Priority: Minor > > This ticket is intended to group all issues found to support Java 9 in the > future. > From what I've found out so far: > * Maven dependency {{com.sun:tools:jar:0}} via cobertura cannot be resolved. 
> It can be easily solved using this patch: > {code} > - artifactId="cobertura"/> > + artifactId="cobertura"> > + > + > {code} > * Another issue is that {{sun.misc.Unsafe}} no longer contains the methods > {{monitorEnter}} + {{monitorExit}}. These methods are used by > {{o.a.c.utils.concurrent.Locks}} which is only used by > {{o.a.c.db.AtomicBTreeColumns}}. > I don't mind to start working on this yet since Java 9 is in a too early > development phase. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
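The removal of {{Unsafe.monitorEnter}}/{{monitorExit}} mentioned above forces any code that locked on an object through {{Unsafe}} to switch to a plain {{synchronized}} block (or an explicit {{Lock}}), which works on both Java 8 and 9. A minimal illustration, not the actual {{AtomicBTreePartition}} change:

```java
// Migration pattern for the Java 9 Unsafe change:
//   before (Java 8 only, conceptually): Unsafe.monitorEnter(lock); ...; Unsafe.monitorExit(lock);
//   after  (Java 8 and 9):              a plain synchronized block on the same object.
public class LockMigration {
    private final Object lock = new Object();
    private long counter;

    long increment() {
        synchronized (lock) {   // portable replacement for the Unsafe monitor calls
            return ++counter;
        }
    }

    public static void main(String[] args) {
        LockMigration m = new LockMigration();
        m.increment();
        System.out.println(m.increment()); // 2
    }
}
```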
[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189956#comment-16189956 ] Thomas Steinmaurer commented on CASSANDRA-13929: I can try a different max value, but what is supposed to be cached here, and what area should suffer without the cache? The reason I'm asking is that in our 9-node cluster a single node is patched with the discussed change, and I don't see any difference in CPU usage, GC, request latency etc., so I may be looking at the wrong metrics. > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * Heap utilization increased after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely more visible now, after fixing > CASSANDRA-13754 > * MAT shows io.netty.util.Recycler$Stack as the top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final), in > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_:
> {code}
> public void recycle()
> {
>     if (recycleHandle != null)
>     {
>         this.cleanup();
>         builderRecycler.recycle(this, recycleHandle);
>         recycleHandle = null; // ADDED
>     }
> }
> {code}
> Patched a single node in our loadtest cluster with this change, and after ~10 > hours of uptime there is no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can't say if this has any other side effects, but I doubt it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
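The pattern in the patch above can be shown with a minimal, self-contained sketch (plain Java; this is not netty's actual {{Recycler}} API, and all names here are invented for illustration). While a pooled object sits on the pool's stack, everything it still references stays reachable, so recycle() has to clear its payload and, once returned, forget its handle, which also makes a second recycle() call a harmless no-op:

```java
import java.util.ArrayDeque;

// Toy object pool illustrating the fix: a pooled builder must drop its
// payload and its pool reference when recycled, otherwise both stay
// reachable for as long as the builder sits in the pool.
final class Builder
{
    private Object[] values = new Object[0]; // grows while building
    private Pool owner;                      // analogous to recycleHandle

    static final class Pool
    {
        private final ArrayDeque<Builder> stack = new ArrayDeque<>();

        Builder get()
        {
            Builder b = stack.poll();
            if (b == null)
                b = new Builder();
            b.owner = this;
            return b;
        }

        private void put(Builder b) { stack.push(b); }
    }

    void add(Object v)
    {
        Object[] next = new Object[values.length + 1];
        System.arraycopy(values, 0, next, 0, values.length);
        next[values.length] = v;
        values = next;
    }

    int size() { return values.length; }

    void recycle()
    {
        if (owner != null)
        {
            values = new Object[0]; // cleanup(): drop the large payload
            Pool p = owner;
            owner = null;           // the "ADDED" line: forget the handle
            p.put(this);            // a second recycle() is now a no-op
        }
    }
}
```

With the guard on `owner`, calling recycle() twice cannot push the same instance onto the stack twice, which mirrors why the patched `recycleHandle != null` check plus the null assignment is safe.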
[jira] [Commented] (CASSANDRA-13923) Flushers blocked due to many SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189886#comment-16189886 ] Dan Kinder commented on CASSANDRA-13923: I did notice that startup time was significantly slower now on 3.11, similar to the symptom reported in https://issues.apache.org/jira/browse/CASSANDRA-13215. I'll try to get you the other info shortly and apply that patch. Unfortunately jstack sometimes NPEs, so once I get lucky you'll get one with -l. > Flushers blocked due to many SSTables > - > > Key: CASSANDRA-13923 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13923 > Project: Cassandra > Issue Type: Bug > Components: Compaction, Local Write-Read Paths > Environment: Cassandra 3.11.0 > Centos 6 (downgraded JNA) > 64GB RAM > 12-disk JBOD >Reporter: Dan Kinder >Assignee: Marcus Eriksson > Attachments: cassandra-jstack-readstage.txt, cassandra-jstack.txt > > > This started on the mailing list and I'm not 100% sure of the root cause, > feel free to re-title if needed. > I just upgraded Cassandra from 2.2.6 to 3.11.0. Within a few hours of serving > traffic, thread pools begin to back up and grow pending tasks indefinitely. > This happens to multiple different stages (Read, Mutation) and consistently > builds pending tasks for MemtablePostFlush and MemtableFlushWriter. > Using jstack shows that there is blocking going on when trying to call > getCompactionCandidates, which seems to happen on flush. We have fairly large > nodes that have ~15,000 SSTables per node, all LCS. > It seems like this can cause reads to get blocked, because they try to acquire > a read lock when calling shouldDefragment. > And writes, of course, block once we can't allocate any more memtables, > because flushes are backed up. > We did not have this problem in 2.2.6, so it seems like there is some > regression causing it to be incredibly slow trying to do calls like > getCompactionCandidates that list out the SSTables. > In our case this causes nodes to build up pending tasks and simply stop > responding to requests. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13321) Add a checksum component for the sstable metadata (-Statistics.db) file
[ https://issues.apache.org/jira/browse/CASSANDRA-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-13321: Status: Ready to Commit (was: Patch Available) > Add a checksum component for the sstable metadata (-Statistics.db) file > --- > > Key: CASSANDRA-13321 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13321 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > Since we keep important information in the sstable metadata file now, we > should add a checksum component for it. One danger is that if a bit gets > flipped in repairedAt, we could consider the sstable repaired when it is not. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13321) Add a checksum component for the sstable metadata (-Statistics.db) file
[ https://issues.apache.org/jira/browse/CASSANDRA-13321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189881#comment-16189881 ] Jason Brown commented on CASSANDRA-13321: - Coming back to this after a long time. I agree with the decision to simplify the solution for adding checksums, and the most recent branch satisfies that. A few small comments: - on the serialize path, call {{DataOutputBuffer#getData()}} instead of {{DataOutputBuffer#toByteArray}}, as the latter allocates a new buffer and copies, whereas the former just hands over its backing byte array from the {{ByteBuffer}}. - {{Hashing.md5()}} - we *could* choose to swap to some other, lighter-weight algorithm from guava's {{Hasher}}, but as this code path is called very infrequently, it's probably not worth bikeshedding. - on the deserialize path, you build up the {{lengths}} map in the first {{for}} loop. Then in the second {{for}} loop, you determine the {{size}} to read from the {{in}} stream. Admittedly, it took me some staring at that {{if}} to figure out what exactly it was doing. While correct, it might be friendlier for code reading if we add the length for the {{lastType}} to the map after the first {{for}} loop completes - then you won't need the {{if}} branching in the second loop. Beyond these nits, I'm +1. Nice work simplifying this patch to the minimal work required. > Add a checksum component for the sstable metadata (-Statistics.db) file > --- > > Key: CASSANDRA-13321 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13321 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > Since we keep important information in the sstable metadata file now, we > should add a checksum component for it. One danger is that if a bit gets > flipped in repairedAt, we could consider the sstable repaired when it is not. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
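The idea of the ticket above can be sketched independently of Cassandra's code (hypothetical layout and names; a minimal sketch only, assuming an MD5 digest appended after the serialized payload as discussed in the review):

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Illustrative checksummed-blob layout (NOT Cassandra's actual on-disk format):
// [int payloadLength][payload][16-byte MD5 of payload]
// A flipped bit anywhere in the payload is detected on read instead of being
// silently trusted (e.g. a corrupted repairedAt field).
final class ChecksummedBlob
{
    private static byte[] md5(byte[] payload)
    {
        try
        {
            return MessageDigest.getInstance("MD5").digest(payload);
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new AssertionError(e); // MD5 is guaranteed by the JDK
        }
    }

    static byte[] writeWithDigest(byte[] payload)
    {
        byte[] digest = md5(payload);
        return ByteBuffer.allocate(4 + payload.length + digest.length)
                         .putInt(payload.length)
                         .put(payload)
                         .put(digest)
                         .array();
    }

    static byte[] readVerified(byte[] blob)
    {
        ByteBuffer in = ByteBuffer.wrap(blob);
        byte[] payload = new byte[in.getInt()];
        in.get(payload);
        byte[] stored = new byte[in.remaining()];
        in.get(stored);
        if (!Arrays.equals(stored, md5(payload)))
            throw new IllegalStateException("metadata checksum mismatch");
        return payload;
    }
}
```

Note that hashing the buffer's backing array directly (the review's getData() vs. toByteArray() point) avoids one full copy of the serialized metadata on the write path.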
[jira] [Updated] (CASSANDRA-13922) nodetool verify should also verify sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-13922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-13922: Reviewer: Jason Brown > nodetool verify should also verify sstable metadata > --- > > Key: CASSANDRA-13922 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13922 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.0.x, 3.11.x, 4.x > > > nodetool verify should also try to deserialize the sstable metadata (and once > CASSANDRA-13321 makes it in, verify the checksums) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13922) nodetool verify should also verify sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-13922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Brown updated CASSANDRA-13922: Status: Ready to Commit (was: Patch Available) > nodetool verify should also verify sstable metadata > --- > > Key: CASSANDRA-13922 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13922 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.0.x, 3.11.x, 4.x > > > nodetool verify should also try to deserialize the sstable metadata (and once > CASSANDRA-13321 makes it in, verify the checksums) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13922) nodetool verify should also verify sstable metadata
[ https://issues.apache.org/jira/browse/CASSANDRA-13922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189875#comment-16189875 ] Jason Brown commented on CASSANDRA-13922: - On the whole this patch looks fine. I have some petty nits; they are beyond what you changed in this patch (just general code cleanup), so feel free to ignore or address on commit: - remove dead imports like {{FBUtilities}} (in the 3.0 branch) - remove the dead field {{badRows}} - not sure what its original use was. - add a comment at line 208, {{UnfilteredRowIterator iterator}}, noting that the variable is intentionally unused. Also, and this was there from before: if {{markAndThrow}} fails in {{mutateRepairedAt}}, we'll percolate that error rather than the {{CorruptSSTableException}} it is coded to throw. If you feel it's warranted, maybe add a try-catch block around the {{mutateRepairedAt}}, log that error if one occurs, and still throw the original {{CorruptSSTableException}}. Either way, I'm +1. Thanks for adding a test in {{VerifyTest}} - saves me from concocting a one-off test to verify this patch ;) > nodetool verify should also verify sstable metadata > --- > > Key: CASSANDRA-13922 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13922 > Project: Cassandra > Issue Type: Improvement >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.0.x, 3.11.x, 4.x > > > nodetool verify should also try to deserialize the sstable metadata (and once > CASSANDRA-13321 makes it in, verify the checksums) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
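The reviewer's suggestion above, keeping the original corruption error even when the follow-up mutateRepairedAt call also fails, can be sketched like this (hypothetical names and shapes; Java's suppressed-exception mechanism stands in here for explicitly logging the secondary failure):

```java
// Sketch of the error-handling pattern: when we are already reporting
// corruption, a failure in the secondary cleanup step should be recorded
// (logged / suppressed) but must not replace the original exception.
final class MarkAndThrow
{
    static RuntimeException markAndThrow(Runnable mutateRepairedAt, String sstable)
    {
        RuntimeException original = new RuntimeException("corrupt sstable: " + sstable);
        try
        {
            mutateRepairedAt.run(); // may itself fail on a badly corrupted file
        }
        catch (Exception e)
        {
            original.addSuppressed(e); // keep it, but don't let it win
        }
        return original; // the caller throws this, as the method name implies
    }
}
```

The key property is that the caller always sees the corruption error first; the secondary failure travels along as context instead of masking it.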
[jira] [Commented] (CASSANDRA-13926) Starting and stopping quickly on Windows results in "port already in use" error
[ https://issues.apache.org/jira/browse/CASSANDRA-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189853#comment-16189853 ] Jason Rust commented on CASSANDRA-13926: I think that could work for our use case. I'm happy to close the issue if others agree, as I have concerns about how solid the fix I posted is. > Starting and stopping quickly on Windows results in "port already in use" > error > --- > > Key: CASSANDRA-13926 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13926 > Project: Cassandra > Issue Type: Bug > Components: Packaging > Environment: Windows >Reporter: Jason Rust >Priority: Minor > Labels: windows > > If I stop/start Cassandra within a minute on Windows using the included > Powershell script, it can fail to start with the error message "Found a port > already in use. Aborting startup." > This is because the Powershell script uses netstat to find which ports are in use, > and even after Cassandra is stopped its ports are still listed for a short time > (reported as TIME_WAIT). See > https://superuser.com/questions/173535/what-are-close-wait-and-time-wait-states > A change to the Powershell script to ensure that only ESTABLISHED ports are > searched solves the problem for me, and involves changing from: > {code} if ($line -match "TCP" -and $line -match $portRegex){code} > to > {code} if ($line -match "TCP" -and $line -match $portRegex -and $line -match > "ESTABLISHED"){code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
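An alternative to parsing netstat text at all is to ask the OS directly by attempting the bind: on most platforms a port left only in TIME_WAIT can be re-bound with SO_REUSEADDR, while a genuinely busy (listening) port still fails. A minimal sketch of the idea (not the PowerShell script's approach, and Windows' SO_REUSEADDR semantics differ somewhat, so this is illustrative only):

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Probe port availability by actually binding, instead of grepping netstat.
// If the bind succeeds the port is startable; TIME_WAIT leftovers are
// tolerated via SO_REUSEADDR on most platforms.
final class PortCheck
{
    static boolean canBind(int port)
    {
        try (ServerSocket s = new ServerSocket())
        {
            s.setReuseAddress(true); // must be set before bind()
            s.bind(new InetSocketAddress("127.0.0.1", port));
            return true;
        }
        catch (Exception e)
        {
            return false; // genuinely in use (or not permitted)
        }
    }
}
```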
[jira] [Commented] (CASSANDRA-13928) Remove initialDirectories from CFS
[ https://issues.apache.org/jira/browse/CASSANDRA-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189814#comment-16189814 ] Jeremiah Jordan commented on CASSANDRA-13928: - (y) > Remove initialDirectories from CFS > -- > > Key: CASSANDRA-13928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13928 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > The initialDirectories added in CASSANDRA-8671 is quite confusing and I don't > think it is needed anymore, it should be removed -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13928) Remove initialDirectories from CFS
[ https://issues.apache.org/jira/browse/CASSANDRA-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189777#comment-16189777 ] Marcus Eriksson commented on CASSANDRA-13928: - bq. How does a compaction strategy control where things are created without it? [~jjordan] they probably shouldn't right now, and I don't think this code is used anymore. > Remove initialDirectories from CFS > -- > > Key: CASSANDRA-13928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13928 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > The initialDirectories added in CASSANDRA-8671 is quite confusing and I don't > think it is needed anymore, it should be removed -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13928) Remove initialDirectories from CFS
[ https://issues.apache.org/jira/browse/CASSANDRA-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189764#comment-16189764 ] Jeremiah Jordan edited comment on CASSANDRA-13928 at 10/3/17 2:29 PM: -- [~krummas] why don't you think it is not needed anymore? How does a compaction strategy control where things are created without it? was (Author: jjordan): [~krummas] why don't you think it is needed anymore? How does a compaction strategy control where things are created without it? > Remove initialDirectories from CFS > -- > > Key: CASSANDRA-13928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13928 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > The initialDirectories added in CASSANDRA-8671 is quite confusing and I don't > think it is needed anymore, it should be removed -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13928) Remove initialDirectories from CFS
[ https://issues.apache.org/jira/browse/CASSANDRA-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189764#comment-16189764 ] Jeremiah Jordan edited comment on CASSANDRA-13928 at 10/3/17 2:30 PM: -- [~krummas] why do you think it is not needed anymore? How does a compaction strategy control where things are created without it? was (Author: jjordan): [~krummas] why don't you think it is not needed anymore? How does a compaction strategy control where things are created without it? > Remove initialDirectories from CFS > -- > > Key: CASSANDRA-13928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13928 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > The initialDirectories added in CASSANDRA-8671 is quite confusing and I don't > think it is needed anymore, it should be removed -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13928) Remove initialDirectories from CFS
[ https://issues.apache.org/jira/browse/CASSANDRA-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189764#comment-16189764 ] Jeremiah Jordan commented on CASSANDRA-13928: - [~krummas] why don't you think it is needed anymore? How does a compaction strategy control where things are created without it? > Remove initialDirectories from CFS > -- > > Key: CASSANDRA-13928 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13928 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 4.x > > > The initialDirectories added in CASSANDRA-8671 is quite confusing and I don't > think it is needed anymore, it should be removed -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13851) Allow existing nodes to use all peers in shadow round
[ https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-13851: Component/s: Lifecycle > Allow existing nodes to use all peers in shadow round > - > > Key: CASSANDRA-13851 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13851 > Project: Cassandra > Issue Type: Bug > Components: Lifecycle >Reporter: Kurt Greaves > Fix For: 3.11.x, 4.x > > > In CASSANDRA-10134 we made collision checks necessary on every startup. A > side-effect was introduced that then requires a node's seeds to be contacted > on every startup. Prior to this change an existing node could start up > regardless of whether it could contact a seed node or not (because > checkForEndpointCollision() was only called for bootstrapping nodes). > Now if a node's seeds are removed/deleted/fail, it will no longer be able to > start up until live seeds are configured (or it is itself made a seed), even > though it already knows about the rest of the ring. This is inconvenient for > operators and has the potential to cause some nasty surprises and increase > downtime. > One solution would be to use all of a node's existing peers as seeds in the > shadow round. Not a Gossip guru though, so not sure of the implications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
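The proposal above can be sketched as a contact-point selection step (hypothetical method; Cassandra's Gossiper is not structured like this): configured seeds first, then locally known peers, so an existing node with a stale seed list still has someone to contact in the shadow round.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch: build the shadow-round contact list from the configured seeds plus
// the peers this node already knows about, preserving seed priority and
// never including the node itself.
final class ShadowRoundContacts
{
    static List<String> contactPoints(List<String> configuredSeeds, Set<String> knownPeers, String self)
    {
        LinkedHashSet<String> contacts = new LinkedHashSet<>(configuredSeeds); // seeds first
        contacts.addAll(knownPeers);                                           // then persisted peers
        contacts.remove(self);                                                 // never gossip with ourselves
        return new ArrayList<>(contacts);
    }
}
```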
[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative
[ https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-13853: Component/s: Observability > nodetool describecluster should be more informative > --- > > Key: CASSANDRA-13853 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13853 > Project: Cassandra > Issue Type: Improvement > Components: Observability >Reporter: Jon Haddad >Assignee: Preetika Tyagi > Labels: lhf > > Additional information we should be displaying: > * Total node count > * List of datacenters, RF, with number of nodes per dc, how many are down, > * Version(s) -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13830) Simplify MerkleTree.difference/differenceHelper
[ https://issues.apache.org/jira/browse/CASSANDRA-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-13830: Reviewer: Marcus Eriksson Fix Version/s: 4.x > Simplify MerkleTree.difference/differenceHelper > --- > > Key: CASSANDRA-13830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13830 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Minor > Fix For: 4.x > > > As brought up in CASSANDRA-13603, {{MerkleTree.differenceHelper}} is overly > complex and difficult to follow for what it's doing. It also shares some of > its responsibilities with {{difference}}, and assumes that the trees it's > given have differences, which makes it a potential source of future bugs. > Since we're just trying to recursively compare these trees and record the > largest contiguous out-of-sync ranges, I think this could be simplified a > bit. I propose that we refactor {{difference}} / {{differenceHelper}} so that > {{difference}} is only concerned with supplying the range and dealing with > the {{FULLY_INCONSISTENT}} case, and move everything else into a recursive > helper method. > I put together an alternate implementation > [here|https://github.com/bdeggleston/cassandra/tree/differencer-cleanup]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
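The simplification being proposed, recursively comparing two trees and recording only the mismatching ranges, can be sketched generically (an illustrative toy, not {{MerkleTree}}'s actual structure or API; hashes and ranges here are hand-assigned for clarity):

```java
import java.util.ArrayList;
import java.util.List;

// Toy hash-tree diff: descend only where an inner hash mismatches, and record
// a leaf's range once no finer comparison is possible.
final class TreeDiff
{
    static final class Node
    {
        final long hash;
        final String range;
        final Node left, right;

        Node(long hash, String range, Node left, Node right)
        {
            this.hash = hash; this.range = range; this.left = left; this.right = right;
        }

        boolean leaf() { return left == null && right == null; }
    }

    static List<String> difference(Node a, Node b)
    {
        List<String> out = new ArrayList<>();
        diff(a, b, out);
        return out;
    }

    private static void diff(Node a, Node b, List<String> out)
    {
        if (a.hash == b.hash)
            return;                      // identical subtree: nothing out of sync
        if (a.leaf() || b.leaf())
        {
            out.add(a.range);            // can't split further: whole range differs
            return;
        }
        diff(a.left, b.left, out);       // recurse only into mismatching halves
        diff(a.right, b.right, out);
    }
}
```

Keeping the recursion this small is the point of the refactor: the entry method handles the degenerate whole-tree-differs case, while the helper only ever compares, descends, or records.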
[jira] [Updated] (CASSANDRA-13851) Allow existing nodes to use all peers in shadow round
[ https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-13851: Reproduced In: 3.6 Fix Version/s: 4.x 3.11.x > Allow existing nodes to use all peers in shadow round > - > > Key: CASSANDRA-13851 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13851 > Project: Cassandra > Issue Type: Bug >Reporter: Kurt Greaves > Fix For: 3.11.x, 4.x > > > In CASSANDRA-10134 we made collision checks necessary on every startup. A > side-effect was introduced that then requires a node's seeds to be contacted > on every startup. Prior to this change an existing node could start up > regardless of whether it could contact a seed node or not (because > checkForEndpointCollision() was only called for bootstrapping nodes). > Now if a node's seeds are removed/deleted/fail, it will no longer be able to > start up until live seeds are configured (or it is itself made a seed), even > though it already knows about the rest of the ring. This is inconvenient for > operators and has the potential to cause some nasty surprises and increase > downtime. > One solution would be to use all of a node's existing peers as seeds in the > shadow round. Not a Gossip guru though, so not sure of the implications. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13835) Thrift get_slice responds slower on Cassandra 3
[ https://issues.apache.org/jira/browse/CASSANDRA-13835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-13835: Environment: Windows > Thrift get_slice responds slower on Cassandra 3 > --- > > Key: CASSANDRA-13835 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13835 > Project: Cassandra > Issue Type: Bug > Environment: Windows >Reporter: Pawel Szlendak > Attachments: attack.py, cassandra120_get_slice_reply_time.png, > cassandra310_get_slice_reply_time.png > > > I have recently upgraded from Cassandra 1.2.18 to Cassandra 3.10 and was > surprised to notice performance degradation of my server application. > I dug down through my application stack only to find out that the cause of > the performance issue was slower response time of Cassandra 3.10 get_slice as > compared to Cassandra 1.2.18 (almost 3x slower on average). > I am attaching a python script (attack.py) here that can be used to reproduce > this issue on a Windows platform. The script uses the pycassa python library, > which can easily be installed using pip. > REPRODUCTION STEPS: > 1. Install Cassandra 1.2.18 from > https://archive.apache.org/dist/cassandra/1.2.18/apache-cassandra-1.2.18-bin.tar.gz > 2. Run Cassandra 1.2.18 from a cmd console using cassandra.bat > 3. Create a test keyspace and an empty CF using the attack.py script > > {noformat} > python attack.py create > {noformat} > 4. Run some get_slice queries against the empty CF and note down the average > response time (in seconds) > > {noformat} > python attack.py > {noformat} >get_slice count: 788 >get_slice total response time: 0.3126376 >*get_slice average response time: 0.000397208075838* > 5. Stop Cassandra 1.2.18 and install Cassandra 3.10 from > https://archive.apache.org/dist/cassandra/3.10/apache-cassandra-3.10-bin.tar.gz > 6. Tweak cassandra.yaml to run the thrift service (start_rpc=true) and run > Cassandra from an elevated cmd console using cassandra.bat > 7. Create a test keyspace and an empty CF using the attack.py script > > {noformat} > python attack.py create > {noformat} > 8. Run some get_slice queries against the empty CF using attack.py and note down > the average response time (in seconds) > {noformat} > python attack.py > {noformat} >get_slice count: 788 >get_slice total response time: 1.1646185 >*get_slice average response time: 0.00147842634753* > 9. Compare the average response times > EXPECTED: >get_slice response time of Cassandra 3.10 is not worse than on Cassandra > 1.2.18 > ACTUAL: >get_slice response time of Cassandra 3.10 is 3x worse than that of > Cassandra 1.2.18 > REMARKS: > - this seems to happen only on the Windows platform (tested on Windows 10 and > Windows Server 2008 R2) > - running the very same procedure on Linux (Ubuntu) renders roughly the same > response times > - I sniffed the traffic to/from Cassandra 1.2.18 and Cassandra 3.10, and it > can be seen that Cassandra 3.10 responds slower (Wireshark dumps attached) > - when attacking the server with concurrent get_slice queries I can see lower > CPU usage for Cassandra 3.10 than for Cassandra 1.2.18 > - get_slice in attack.py queries the column family for a non-existing key (the > column family is empty) > I am willing to work on this on my own if you guys give me some tips on where > to look. I am also aware that this might be more Windows/Java related; > nevertheless, any help from your side would be much appreciated. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13848) Allow sstabledump to do a json object per partition to better handle large sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-13848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13848: Component/s: Tools > Allow sstabledump to do a json object per partition to better handle large > sstables > --- > > Key: CASSANDRA-13848 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13848 > Project: Cassandra > Issue Type: New Feature > Components: Tools >Reporter: Jeff Jirsa >Assignee: Kevin Wern >Priority: Trivial > Labels: lhf > > sstable2json / sstabledump make a huge json document of the whole file. For > very large sstables this makes it impossible to load in memory to do anything > with it. Allowing users to Break it into small json objects per partition > would be useful. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
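The request above can be sketched as emitting newline-delimited JSON, one self-contained object per partition, instead of one giant document (hypothetical helper with simplified data; sstabledump's real row format is far richer than this):

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Iterator;
import java.util.Map;

// Sketch: stream one JSON object per partition (NDJSON style) so a consumer
// can process the output line by line without loading the whole file.
// Partition rows are assumed to arrive already JSON-encoded for brevity.
final class PartitionDump
{
    static String dump(Iterator<Map.Entry<String, String>> partitions)
    {
        StringWriter sw = new StringWriter();
        PrintWriter out = new PrintWriter(sw);
        while (partitions.hasNext())
        {
            Map.Entry<String, String> p = partitions.next();
            // one complete JSON object per line; no enclosing array
            out.println("{\"key\": \"" + p.getKey() + "\", \"rows\": " + p.getValue() + "}");
        }
        out.flush();
        return sw.toString();
    }
}
```

Because each line is independent, tools like jq or a streaming parser can handle arbitrarily large sstables in constant memory.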
[jira] [Updated] (CASSANDRA-13830) Simplify MerkleTree.difference/differenceHelper
[ https://issues.apache.org/jira/browse/CASSANDRA-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13830: Priority: Minor (was: Major) > Simplify MerkleTree.difference/differenceHelper > --- > > Key: CASSANDRA-13830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13830 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Minor > > As brought up in CASSANDRA-13603, {{MerkleTree.differenceHelper}} is overly > complex and difficult to follow for what it's doing. It also shares some of > its responsibilities with {{difference}}, and assumes that the trees it's > given have differences, which makes it a potential source of future bugs. > Since we're just trying to recursively compare these trees and record the > largest contiguous out-of-sync ranges, I think this could be simplified a > bit. I propose that we refactor {{difference}} / {{differenceHelper}} so that > {{difference}} is only concerned with supplying the range and dealing with > the {{FULLY_INCONSISTENT}} case, and move everything else into a recursive > helper method. > I put together an alternate implementation > [here|https://github.com/bdeggleston/cassandra/tree/differencer-cleanup]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13830) Simplify MerkleTree.difference/differenceHelper
[ https://issues.apache.org/jira/browse/CASSANDRA-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13830: Issue Type: Improvement (was: Bug) > Simplify MerkleTree.difference/differenceHelper > --- > > Key: CASSANDRA-13830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13830 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Blake Eggleston >Assignee: Blake Eggleston > > As brought up in CASSANDRA-13603, {{MerkleTree.differenceHelper}} is overly > complex and difficult to follow for what it's doing. It also shares some of > its responsibilities with {{difference}}, and assumes that the trees it's > given have differences, which makes it a potential source of future bugs. > Since we're just trying to recursively compare these trees and record the > largest contiguous out-of-sync ranges, I think this could be simplified a > bit. I propose that we refactor {{difference}} / {{differenceHelper}} so that > {{difference}} is only concerned with supplying the range and dealing with > the {{FULLY_INCONSISTENT}} case, and move everything else into a recursive > helper method. > I put together an alternate implementation > [here|https://github.com/bdeggleston/cassandra/tree/differencer-cleanup]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13830) Simplify MerkleTree.difference/differenceHelper
[ https://issues.apache.org/jira/browse/CASSANDRA-13830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13830: Component/s: Core > Simplify MerkleTree.difference/differenceHelper > --- > > Key: CASSANDRA-13830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13830 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Minor > > As brought up in CASSANDRA-13603, {{MerkleTree.differenceHelper}} is overly > complex and difficult to follow for what it's doing. It also shares some of > its responsibilities with {{difference}}, and assumes that the trees it's > given have differences, which makes it a potential source of future bugs. > Since we're just trying to recursively compare these trees and record the > largest contiguous out-of-sync ranges, I think this could be simplified a > bit. I propose that we refactor {{difference}} / {{differenceHelper}} so that > {{difference}} is only concerned with supplying the range and dealing with > the {{FULLY_INCONSISTENT}} case, and move everything else into a recursive > helper method. > I put together an alternate implementation > [here|https://github.com/bdeggleston/cassandra/tree/differencer-cleanup]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13823) The Getting Started page should have instructions on setting up a cluster
[ https://issues.apache.org/jira/browse/CASSANDRA-13823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-13823: Priority: Minor (was: Major) > The Getting Started page should have instructions on setting up a cluster > - > > Key: CASSANDRA-13823 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13823 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Jon Haddad >Priority: Minor > Labels: lhf > > Currently the docs don't have an easy to follow guide on setting up a > cluster. I think it would benefit from a nice easy to follow walkthrough. > https://cassandra.apache.org/doc/latest/getting_started/index.html -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13820) List languages on driver list alphabetically
[ https://issues.apache.org/jira/browse/CASSANDRA-13820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13820: Priority: Trivial (was: Major) > List languages on driver list alphabetically > - > > Key: CASSANDRA-13820 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13820 > Project: Cassandra > Issue Type: Improvement > Components: Documentation and Website >Reporter: Jon Haddad >Assignee: Jon Haddad >Priority: Trivial > > This is pretty minor, but I think the list of drivers on > https://cassandra.apache.org/doc/latest/getting_started/index.html should be > listed in alphabetical order to make it more readable. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13819) Surprising under-documented behavior with DELETE...USING TIMESTAMP
[ https://issues.apache.org/jira/browse/CASSANDRA-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13819: Component/s: Core > Surprising under-documented behavior with DELETE...USING TIMESTAMP > -- > > Key: CASSANDRA-13819 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13819 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Eric Wolak >Priority: Minor > > While investigating differences between various Bigtable derivatives, I‘ve > run into an odd behavior of Cassandra. I’m guessing this is intended > behavior, but it's surprising enough to me that I think it should be > explicitly documented. > Let‘s say I have a sensor device reporting data with timestamps. It has a > great clock, so I use its timestamps in a USING TIMESTAMP clause in my INSERT > statements. One day Jeff realizes that we had a hardware bug with the sensor, > and data before timestamp T is incorrect. He issues a DELETE...USING > TIMESTAMP T to remove the old data. In the meantime, Sam figures out a way to > backfill the data, and she writes a job to insert corrected data into the > same table. In keeping with the schema, her job issues INSERT...USING > TIMESTAMP statememts, with timestamps before T (because that’s the time the > data points correspond to). When testing her job, Sam discovers that the > backfilled data isn‘t appearing in the database! In fact, there’s no way for > her to insert data with a TIMESTAMP <= T, because the tombstone written by > Jeff several days ago is masking them. How can Sam backfill the corrected > data? > This behavior seems to match the HBase “Current Limitation” that Deletes Mask > Puts, documented at http://hbase.apache.org/book.html#_deletes_mask_puts. > Should the Cassandra docs also explicitly call-out this behavior? 
> Related: > http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html > https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlAboutDeletes.html -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
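The masking Jeff's tombstone performs follows directly from Cassandra's last-write-wins cell reconciliation: the write with the higher timestamp wins regardless of wall-clock arrival order, and a tombstone beats a live cell on a timestamp tie. A minimal sketch of that rule (hypothetical types, not Cassandra's internal cell classes):

```java
import java.util.Optional;

// Sketch of last-write-wins reconciliation: a tombstone written at timestamp T
// masks any live cell whose write timestamp is <= T, which is exactly why
// Sam's backfill never reappears.
public class TombstoneMasking {
    record Cell(String value, long timestamp, boolean isTombstone) {}

    // The cell with the higher timestamp survives; on a tie, the tombstone wins.
    static Cell reconcile(Cell a, Cell b) {
        if (a.timestamp() != b.timestamp())
            return a.timestamp() > b.timestamp() ? a : b;
        return a.isTombstone() ? a : b;
    }

    // What a read returns: empty if the surviving cell is a tombstone.
    static Optional<String> read(Cell winner) {
        return winner.isTombstone() ? Optional.empty() : Optional.of(winner.value());
    }

    public static void main(String[] args) {
        Cell delete = new Cell(null, 100, true);       // DELETE ... USING TIMESTAMP 100
        Cell backfill = new Cell("fixed", 90, false);  // later INSERT ... USING TIMESTAMP 90
        System.out.println(read(reconcile(delete, backfill)).isPresent()); // false: masked
    }
}
```

This is why the usual advice for Sam's situation is to backfill with timestamps strictly greater than T (or into a fresh table), rather than trying to write "under" the tombstone.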
[jira] [Commented] (CASSANDRA-13811) Unable to find table . at maybeLoadschemainfo (StressProfile.java)
[ https://issues.apache.org/jira/browse/CASSANDRA-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189729#comment-16189729 ] T Jake Luciani commented on CASSANDRA-13811: You need to specify the keyspace in the table definition > Unable to find table . at maybeLoadschemainfo > (StressProfile.java) > > > Key: CASSANDRA-13811 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13811 > Project: Cassandra > Issue Type: Bug > Components: Stress > Environment: 3 node cluster > Node 1 --> 172.27.21.16(Seed Node) > Node 2 --> 172.27.21.18 > Node 3 --> 172.27.21.19 > *cassandra.yaml paramters for all the nodes:-* > 1) seeds: "172.27.21.16" > 2) write_request_timeout_in_ms: 5000 > 3) listen_address: 172.27.21.1(6,8,9 > 4) rpc_address: 172.27.21.1(6,8,9) >Reporter: Akshay Jindal >Priority: Minor > Fix For: 3.10 > > Attachments: code.yaml, stress-script.sh > > > * Please find attached my .yaml and .sh file. > * Now the problem is if I run stress-script.sh the first time, just after > firing up cassandra, it is working fine on the cluster, but when I again run > stress-script.sh, it is giving the following error:- > *Unable to find prutorStress3node.code* > at > org.apache.cassandra.stress.StressProfile.maybeLoadSchemaInfo(StressProfile.java:306) > at > org.apache.cassandra.stress.StressProfile.maybeCreateSchema(StressProfile.java:273) > at > org.apache.cassandra.stress.StressProfile.newGenerator(StressProfile.java:676) > at > org.apache.cassandra.stress.StressProfile.printSettings(StressProfile.java:129) > at > org.apache.cassandra.stress.settings.StressSettings.printSettings(StressSettings.java:383) > at org.apache.cassandra.stress.Stress.run(Stress.java:95) > at org.apache.cassandra.stress.Stress.main(Stress.java:62) > In the file > [https://insight.io/github.com/apache/cassandra/blob/trunk/tools/stress/src/org/apache/cassandra/stress/StressProfile.java?line=289] > ,I saw that table metadata is being populated to NULL. 
I tried to make sense > of the stack trace, but was not able to make anything of it. Please give me > some directions as to what might have gone wrong? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
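A hedged sketch of what the suggested fix could look like in a cassandra-stress profile. The attached code.yaml is not reproduced in this thread, so the keyspace and table names below are taken from the error message and the column layout is purely illustrative:

```yaml
# Hypothetical fragment of code.yaml: maybeLoadSchemaInfo() resolves the table
# as <keyspace>.<table>, so the keyspace must be declared in the profile and
# qualified in the table definition.
keyspace: prutorStress3node
table: code
table_definition: |
  CREATE TABLE prutorStress3node.code (
      id uuid PRIMARY KEY,
      body text
  )
```

With the keyspace qualified this way, the second run of stress-script.sh should find the existing schema instead of failing the lookup.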
[jira] [Updated] (CASSANDRA-13815) RPM package for client tools - cqlsh + nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13815: Priority: Minor (was: Major) > RPM package for client tools - cqlsh + nodetool > --- > > Key: CASSANDRA-13815 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13815 > Project: Cassandra > Issue Type: Wish > Components: Packaging >Reporter: Dennis >Priority: Minor > Fix For: 4.x > > > Feature request. > I see you guys are picking up on the RPM packages. > Thanks for that. That could even be improved if you could package the client > tools as a separate or client-only package as well. That package could hold > cqlsh and nodetool for example. > That would support centralized, automated backup or other maintenance > processes. > Now the admin is forced to login to the box in order to use these tools, > which is not really best practice, security wise. The admin would need to > know an ssh account as well as the cassandra admin account. > So, benefits or usage of a client package (cqlsh+nodetool): > # Supports automated maintenance scripts (simply yum the client tools to a > temporary vm) > # Better security, as the admin doesn't need to ssh into the instance host. > Without having to pull the full Cassandra packages on the clients. > Datastax does have such client packages, but they don't support the community > edition anymore, so I am hoping that you can do this going forward. > Thanks! -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13813) Don't let user drop (or generally break) tables in system_distributed
[ https://issues.apache.org/jira/browse/CASSANDRA-13813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13813: Component/s: Distributed Metadata > Don't let user drop (or generally break) tables in system_distributed > - > > Key: CASSANDRA-13813 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13813 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Sylvain Lebresne >Assignee: Aleksey Yeschenko > Fix For: 3.0.x, 3.11.x > > > There is not currently no particular restrictions on schema modifications to > tables of the {{system_distributed}} keyspace. This does mean you can drop > those tables, or even alter them in wrong ways like dropping or renaming > columns. All of which is guaranteed to break stuffs (that is, repair if you > mess up with on of it's table, or MVs if you mess up with > {{view_build_status}}). > I'm pretty sure this was never intended and is an oversight of the condition > on {{ALTERABLE_SYSTEM_KEYSPACES}} in > [ClientState|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/ClientState.java#L397]. > That condition is such that any keyspace not listed in > {{ALTERABLE_SYSTEM_KEYSPACES}} (which happens to be the case for > {{system_distributed}}) has no specific restrictions whatsoever, while given > the naming it's fair to assume the intention that exactly the opposite. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13811) Unable to find table . at maybeLoadschemainfo (StressProfile.java)
[ https://issues.apache.org/jira/browse/CASSANDRA-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13811: Priority: Minor (was: Major) > Unable to find table . at maybeLoadschemainfo > (StressProfile.java) > > > Key: CASSANDRA-13811 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13811 > Project: Cassandra > Issue Type: Bug > Components: Stress > Environment: 3 node cluster > Node 1 --> 172.27.21.16(Seed Node) > Node 2 --> 172.27.21.18 > Node 3 --> 172.27.21.19 > *cassandra.yaml paramters for all the nodes:-* > 1) seeds: "172.27.21.16" > 2) write_request_timeout_in_ms: 5000 > 3) listen_address: 172.27.21.1(6,8,9 > 4) rpc_address: 172.27.21.1(6,8,9) >Reporter: Akshay Jindal >Priority: Minor > Fix For: 3.10 > > Attachments: code.yaml, stress-script.sh > > > * Please find attached my .yaml and .sh file. > * Now the problem is if I run stress-script.sh the first time, just after > firing up cassandra, it is working fine on the cluster, but when I again run > stress-script.sh, it is giving the following error:- > *Unable to find prutorStress3node.code* > at > org.apache.cassandra.stress.StressProfile.maybeLoadSchemaInfo(StressProfile.java:306) > at > org.apache.cassandra.stress.StressProfile.maybeCreateSchema(StressProfile.java:273) > at > org.apache.cassandra.stress.StressProfile.newGenerator(StressProfile.java:676) > at > org.apache.cassandra.stress.StressProfile.printSettings(StressProfile.java:129) > at > org.apache.cassandra.stress.settings.StressSettings.printSettings(StressSettings.java:383) > at org.apache.cassandra.stress.Stress.run(Stress.java:95) > at org.apache.cassandra.stress.Stress.main(Stress.java:62) > In the file > [https://insight.io/github.com/apache/cassandra/blob/trunk/tools/stress/src/org/apache/cassandra/stress/StressProfile.java?line=289] > ,I saw that table metadata is being populated to NULL. I tried to make sense > of the stack trace, but was not able to make anything of it. 
Please give me > some directions as to what might have gone wrong? -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13801) CompactionManager sometimes wrongly determines that a background compaction is running for a particular table
[ https://issues.apache.org/jira/browse/CASSANDRA-13801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13801: Priority: Minor (was: Major) > CompactionManager sometimes wrongly determines that a background compaction > is running for a particular table > - > > Key: CASSANDRA-13801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13801 > Project: Cassandra > Issue Type: Bug > Components: Compaction >Reporter: Dimitar Dimitrov >Assignee: Dimitar Dimitrov >Priority: Minor > > Sometimes after writing different rows to a table, then doing a blocking > flush, if you alter the compaction strategy, then run background compaction > and wait for it to finish, {{CompactionManager}} may decide that there's an > ongoing compaction for that same table. > This may happen even though logs don't indicate that to be the case > (compaction may still be running for system_schema tables). -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13795) DDL statements running slow on huge data
[ https://issues.apache.org/jira/browse/CASSANDRA-13795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13795: Priority: Minor (was: Major) > DDL statements running slow on huge data > > > Key: CASSANDRA-13795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13795 > Project: Cassandra > Issue Type: Bug > Components: Core, CQL >Reporter: Vladimir >Priority: Minor > Fix For: 3.0.x > > > We are facing the issues with Cassandra DDL statements with a huge amount of > data inside. I run statement for create type and it works for several hours > without success in cqlsh for remote server. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13795) DDL statements running slow on huge data
[ https://issues.apache.org/jira/browse/CASSANDRA-13795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13795: Component/s: CQL Core > DDL statements running slow on huge data > > > Key: CASSANDRA-13795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13795 > Project: Cassandra > Issue Type: Bug > Components: Core, CQL >Reporter: Vladimir >Priority: Minor > Fix For: 3.0.x > > > We are facing the issues with Cassandra DDL statements with a huge amount of > data inside. I run statement for create type and it works for several hours > without success in cqlsh for remote server. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13788) Seemingly valid Java UDF fails compilation with error "type cannot be resolved. It is indirectly referenced from required .class files"
[ https://issues.apache.org/jira/browse/CASSANDRA-13788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joshua McKenzie updated CASSANDRA-13788: Priority: Minor (was: Major) > Seemingly valid Java UDF fails compilation with error "type cannot be > resolved. It is indirectly referenced from required .class files" > --- > > Key: CASSANDRA-13788 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13788 > Project: Cassandra > Issue Type: Improvement > Components: CQL > Environment: Cassandra 3.11.0 > java version "1.8.0_131" > Java(TM) SE Runtime Environment (build 1.8.0_131-b11) > Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode) >Reporter: jaikiran pai >Priority: Minor > > We are moving to Cassandra 3.11.0 from Cassandra 2.x. We have a Java UDF > which is straightforward and looks something like: > {code} > CREATE FUNCTION utf8_text_size (val TEXT) > CALLED ON NULL INPUT > RETURNS INT > LANGUAGE java > AS 'if (val == null) {return 0;}try { >return val.getBytes("UTF-8").length;} catch > (Exception e) {throw new RuntimeException("Failed to compute > size of UTF-8 text", e);}'; > {code} > This works fine in Cassandra 2.x. In Cassandra 3.11.0 when this UDF is being > created, we keep running into this exception when the UDF is being > (internally) compiled: > {code} > InvalidRequest: Error from server: code=2200 [Invalid query] message="Java > source compilation failed: > Line 1: The type java.io.UnsupportedEncodingException cannot be resolved. It > is indirectly referenced from required .class files > Line 1: The type java.nio.charset.Charset cannot be resolved. It is > indirectly referenced from required .class files > Line 1: The method getBytes(String) from the type String refers to the > missing type UnsupportedEncodingException > {code} > I realize there have been changes to the UDF support in Cassandra 3.x and I > also have read this[1] article related to it. However, I don't see anything > wrong with the above UDF. 
In fact, I enabled TRACE logging of > {{org.apache.cassandra.cql3.functions}} which is where the > {{JavaBasedUDFunction}} resides to see what the generated source looks like. > Here's what it looks like (I have modified the classname etc, but nothing > else): > {code} > package org.myapp; > import java.nio.ByteBuffer; > import java.util.*; > import org.apache.cassandra.cql3.functions.JavaUDF; > import org.apache.cassandra.cql3.functions.UDFContext; > import org.apache.cassandra.transport.ProtocolVersion; > import com.datastax.driver.core.TypeCodec; > import com.datastax.driver.core.TupleValue; > import com.datastax.driver.core.UDTValue; > public final class CassandraUDFTest extends JavaUDF > { > public CassandraUDFTest(TypeCodec returnCodec, > TypeCodec[] argCodecs, UDFContext udfContext) > { > super(returnCodec, argCodecs, udfContext); > } > protected ByteBuffer executeImpl(ProtocolVersion protocolVersion, > List params) > { > Integer result = xsome_keyspace_2eutf8_text_size_3232115_9( > /* parameter 'val' */ > (String) super.compose(protocolVersion, 0, params.get(0)) > ); > return super.decompose(protocolVersion, result); > } > protected Object executeAggregateImpl(ProtocolVersion protocolVersion, > Object firstParam, List params) > { > Integer result = xsome_keyspace_2eutf8_text_size_3232115_9( > /* parameter 'val' */ > (String) firstParam > ); > return result; > } > private Integer xsome_keyspace_2eutf8_text_size_3232115_9(String val) > { > if (val == null) {return 0;}try { >return val.getBytes("UTF-8").length;} catch (Exception > e) {throw new RuntimeException("Failed to compute size of > UTF-8 text", e);} > } > } > {code} > I then went ahead and compiled this generated class from command line using > the (Oracle) Java compiler as follows: > {code} > javac -cp "/opt/cassandra/apache-cassandra-3.11.0/lib/*" > org/myapp/CassandraUDFTest.java > {code} > and it compiled fine without any errors. 
> Looking at the {{JavaBasedUDFunction}} which compiles this UDF at runtime, > it's using Eclipse JDT compiler. I haven't looked into why it would be > running into these compilation errors. > [1] http://batey.info/cassandra-udfs.html -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apa
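One possible workaround — an assumption, not verified against the 3.11.0 Eclipse JDT setup: both reported errors are about resolving UnsupportedEncodingException and Charset, which the String.getBytes(String) overload drags in. Counting UTF-8 bytes directly from code points avoids referencing those types at all, so the UDF body also no longer needs a try/catch:

```java
// Sketch of a getBytes()-free UDF body: compute the UTF-8 encoded length
// from Unicode code points (1, 2, 3 or 4 bytes each), with no checked
// exception and no java.nio.charset types involved.
public class Utf8TextSize {
    static int utf8TextSize(String val) {
        if (val == null)
            return 0;
        return (int) val.codePoints()
                        .mapToLong(cp -> cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4)
                        .sum();
    }

    public static void main(String[] args) {
        System.out.println(utf8TextSize("héllo")); // 6: 'é' takes two bytes
    }
}
```

The same expression can be dropped into the CREATE FUNCTION body in place of the getBytes("UTF-8") call; whether the original overload should compile cleanly under the JDT classpath remains the open question of this ticket.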
[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189717#comment-16189717 ] T Jake Luciani commented on CASSANDRA-13929: The default for recycler is 32k instances per thread. So perhaps change this to 8192 per thread and see if that makes a difference. > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). 
In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can't say if this has any other side effects etc., but I doubt it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
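The trade-off in the comment above — nulling the handle empties the cache on every recycle, whereas shrinking the recycler's per-thread capacity (32k by default, 8192 suggested) keeps reuse but caps retained heap — can be sketched with a toy bounded pool. This is a hypothetical illustration, not Netty's Recycler:

```java
import java.util.ArrayDeque;

// Toy per-thread object pool: instances beyond MAX_CAPACITY are simply dropped
// and become ordinary garbage, bounding the heap the pool can ever retain.
public class BoundedRecycler<T> {
    private static final int MAX_CAPACITY = 8192; // the value suggested in the comment
    private final ThreadLocal<ArrayDeque<T>> pool = ThreadLocal.withInitial(ArrayDeque::new);

    // Returns a pooled instance, or null if the caller must allocate a fresh one.
    public T get() {
        return pool.get().pollFirst();
    }

    // Keeps the instance for reuse only while the pool is below its bound.
    public void recycle(T obj) {
        ArrayDeque<T> stack = pool.get();
        if (stack.size() < MAX_CAPACITY)
            stack.addFirst(obj);
    }

    public static void main(String[] args) {
        BoundedRecycler<StringBuilder> r = new BoundedRecycler<>();
        StringBuilder sb = new StringBuilder();
        r.recycle(sb);
        System.out.println(r.get() == sb); // true: the instance was reused
    }
}
```

With ~770K pooled BTree$Builder instances observed on the heap, a bound like this limits the worst case without paying an allocation on every recycle as the null-the-handle patch does.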
[jira] [Commented] (CASSANDRA-13553) Map C* table schema to RocksDB key value data model
[ https://issues.apache.org/jira/browse/CASSANDRA-13553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189682#comment-16189682 ] DOAN DuyHai commented on CASSANDRA-13553: - Thanks [~dikanggu], can't wait to play with this new storage engine once we have a beta version > Map C* table schema to RocksDB key value data model > --- > > Key: CASSANDRA-13553 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13553 > Project: Cassandra > Issue Type: Sub-task > Components: Core >Reporter: Dikang Gu >Assignee: Dikang Gu > > The goal for this ticket is to find a way to map Cassandra's table data model > to RocksDB's key value data model. > To support most common C* queries on top of RocksDB, we plan to use this > strategy, for each row in Cassandra: > 1. Encode Cassandra partition key + clustering keys into RocksDB key. > 2. Encode rest of Cassandra columns into RocksDB value. > With this approach, there are two major problems we need to solve: > 1. After we encode C* keys into RocksDB key, we need to preserve the same > sorting order in RocksDB byte comparator, as in original data type. > 2. Support timestamp, ttl, and tombestone on the values. > To solve problem 1, we need to carefully design the encoding algorithm for > each data type. Fortunately, there are some existing libraries we can play > with, such as orderly (https://github.com/ndimiduk/orderly), which is used by > HBase. Or flatbuffer (https://github.com/google/flatbuffers) > To solve problem 2, our plan is to encode C* timestamp, ttl, and tombestone > together with the values, and then use RocksDB's merge operator/compaction > filter to merge different version of data, and handle ttl/tombestones. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
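Problem 1 above — making the encoded key sort the same way under a byte comparator as the original typed value — can be seen in miniature with a signed integer. A hypothetical helper, not part of the actual patch: big-endian layout plus a flipped sign bit makes signed ints compare correctly under the unsigned lexicographic comparison RocksDB's default comparator performs.

```java
// Order-preserving encoding sketch: flip the sign bit so negative values sort
// below positive ones, then emit big-endian so earlier bytes dominate.
public class OrderPreservingEncoding {
    static byte[] encodeInt(int v) {
        int flipped = v ^ 0x80000000; // moves the negative range below the positive range
        return new byte[] {
            (byte) (flipped >>> 24), (byte) (flipped >>> 16),
            (byte) (flipped >>> 8),  (byte) flipped
        };
    }

    // Unsigned lexicographic comparison, as a byte-wise key comparator does.
    static int compare(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0)
                return c;
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // -5 < 3 must survive the round trip into byte order
        System.out.println(compare(encodeInt(-5), encodeInt(3)) < 0); // true
    }
}
```

Libraries like orderly generalize this trick to strings, decimals, and composite keys, which is why the ticket considers reusing them rather than hand-rolling every type.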
[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189626#comment-16189626 ] Thomas Steinmaurer commented on CASSANDRA-13929: [~tjake]: Thanks for the feedback about invalidating the cache. Not sure what actually is cached here, but without nulling the reference, I do see e.g. ~ 770K BTree$Builder instances on the heap. Infinite caching still sounds like a memory leak to me, but that's nitpicking now. :-) Thanks again. > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). 
In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can't say if this has any other side effects etc., but I doubt it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189587#comment-16189587 ] T Jake Luciani commented on CASSANDRA-13929: The recycler is meant to cache objects for reuse. By nulling the handler you are effectively invalidating the cache every time. I don't think this is a leak but perhaps we should limit this cache to hold less items. (see the recycler constructor) > BTree$Builder / io.netty.util.Recycler$Stack leaking memory > --- > > Key: CASSANDRA-13929 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13929 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Thomas Steinmaurer > Attachments: cassandra_3.11.0_min_memory_utilization.jpg, > cassandra_3.11.1_mat_dominator_classes_FIXED.png, > cassandra_3.11.1_mat_dominator_classes.png, > cassandra_3.11.1_snapshot_heaputilization.png > > > Different to CASSANDRA-13754, there seems to be another memory leak in > 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack. > * heap utilization increase after upgrading to 3.11.0 => > cassandra_3.11.0_min_memory_utilization.jpg > * No difference after upgrading to 3.11.1 (snapshot build) => > cassandra_3.11.1_snapshot_heaputilization.png; thus most likely after fixing > CASSANDRA-13754, more visible now > * MAT shows io.netty.util.Recycler$Stack as top contributing class => > cassandra_3.11.1_mat_dominator_classes.png > * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart > after ~ 72 hours > Verified the following fix, namely explicitly unreferencing the > _recycleHandle_ member (making it non-final). 
In > _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_ > {code} > public void recycle() > { > if (recycleHandle != null) > { > this.cleanup(); > builderRecycler.recycle(this, recycleHandle); > recycleHandle = null; // ADDED > } > } > {code} > Patched a single node in our loadtest cluster with this change and after ~ 10 > hours uptime, no sign of the previously offending class in MAT anymore => > cassandra_3.11.1_mat_dominator_classes_FIXED.png > Can't say if this has any other side effects etc., but I doubt it. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13923) Flushers blocked due to many SSTables
[ https://issues.apache.org/jira/browse/CASSANDRA-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189576#comment-16189576 ] Marcus Eriksson commented on CASSANDRA-13923: - created https://issues.apache.org/jira/browse/CASSANDRA-13930 for the defrag issue > Flushers blocked due to many SSTables > - > > Key: CASSANDRA-13923 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13923 > Project: Cassandra > Issue Type: Bug > Components: Compaction, Local Write-Read Paths > Environment: Cassandra 3.11.0 > Centos 6 (downgraded JNA) > 64GB RAM > 12-disk JBOD >Reporter: Dan Kinder >Assignee: Marcus Eriksson > Attachments: cassandra-jstack-readstage.txt, cassandra-jstack.txt > > > This started on the mailing list and I'm not 100% sure of the root cause, > feel free to re-title if needed. > I just upgraded Cassandra from 2.2.6 to 3.11.0. Within a few hours of serving > traffic, thread pools begin to back up and grow pending tasks indefinitely. > This happens to multiple different stages (Read, Mutation) and consistently > builds pending tasks for MemtablePostFlush and MemtableFlushWriter. > Using jstack shows that there is blocking going on when trying to call > getCompactionCandidates, which seems to happen on flush. We have fairly large > nodes that have ~15,000 SSTables per node, all LCS. > I seems like this can cause reads to get blocked because they try to acquire > a read lock when calling shouldDefragment. > And writes, of course, block once we can't allocate anymore memtables, > because flushes are backed up. > We did not have this problem in 2.2.6, so it seems like there is some > regression causing it to be incredibly slow trying to do calls like > getCompactionCandidates that list out the SSTables. > In our case this causes nodes to build up pending tasks and simply stop > responding to requests. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13930) Avoid grabbing the read lock when checking if compaction strategy should do defragmentation
[ https://issues.apache.org/jira/browse/CASSANDRA-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-13930: Status: Patch Available (was: Open) https://github.com/krummas/cassandra/commits/marcuse/defrag https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/356/ https://circleci.com/gh/krummas/cassandra/137 > Avoid grabbing the read lock when checking if compaction strategy should do > defragmentation > --- > > Key: CASSANDRA-13930 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13930 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.11.x, 4.x > > > We grab the read lock when checking whether the compaction strategy benefits > from defragmentation, avoid that. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-13930) Avoid grabbing the read lock when checking if compaction strategy should do defragmentation
Marcus Eriksson created CASSANDRA-13930: --- Summary: Avoid grabbing the read lock when checking if compaction strategy should do defragmentation Key: CASSANDRA-13930 URL: https://issues.apache.org/jira/browse/CASSANDRA-13930 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Marcus Eriksson Fix For: 3.11.x, 4.x We grab the read lock when checking whether the compaction strategy benefits from defragmentation, avoid that. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
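The pattern behind the fix can be sketched in isolation. This is an illustrative sketch only, not the actual CASSANDRA-13930 patch (class and method names here are invented for the example): precompute an immutable answer when the compaction strategies are (re)loaded, so the hot read path consults a volatile field instead of taking the read lock and piling up behind slow strategy operations.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StrategyManagerSketch
{
    public interface Strategy { boolean benefitsFromDefragmentation(); }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile boolean shouldDefragment; // cheap to read on the read path

    // Called rarely, under the write lock, when strategies change.
    public void reload(List<Strategy> strategies)
    {
        lock.writeLock().lock();
        try
        {
            boolean defrag = false;
            for (Strategy s : strategies)
                defrag |= s.benefitsFromDefragmentation();
            shouldDefragment = defrag; // publish once, after recomputing
        }
        finally
        {
            lock.writeLock().unlock();
        }
    }

    // Called on every read -- no read lock needed, so reads cannot block
    // behind a compaction-strategy reload holding the write lock.
    public boolean shouldDefragment()
    {
        return shouldDefragment;
    }
}
```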
[jira] [Commented] (CASSANDRA-12961) LCS needlessly checks for L0 STCS candidates multiple times
[ https://issues.apache.org/jira/browse/CASSANDRA-12961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189558#comment-16189558 ]

Vusal Ahmadoglu commented on CASSANDRA-12961:
---------------------------------------------

Thanks [~jjirsa]. That's good news!

> LCS needlessly checks for L0 STCS candidates multiple times
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-12961
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12961
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Jeff Jirsa
>            Assignee: Vusal Ahmadoglu
>            Priority: Trivial
>              Labels: lhf
>             Fix For: 4.0
>
>         Attachments: 0001-CASSANDRA-12961-Moving-getSTCSInL0CompactionCandidat.patch
>
> It's very likely that the check for L0 STCS candidates (if L0 is falling behind) can be moved outside of the loop, or at the very least made so that it's not called on each loop iteration:
> {code}
> for (int i = generations.length - 1; i > 0; i--)
> {
>     List<SSTableReader> sstables = getLevel(i);
>     if (sstables.isEmpty())
>         continue; // mostly this just avoids polluting the debug log with zero scores
>
>     // we want to calculate score excluding compacting ones
>     Set<SSTableReader> sstablesInLevel = Sets.newHashSet(sstables);
>     Set<SSTableReader> remaining = Sets.difference(sstablesInLevel, cfs.getTracker().getCompacting());
>     double score = (double) SSTableReader.getTotalBytes(remaining) / (double) maxBytesForLevel(i, maxSSTableSizeInBytes);
>     logger.trace("Compaction score for level {} is {}", i, score);
>     if (score > 1.001)
>     {
>         // before proceeding with a higher level, let's see if L0 is far enough behind to warrant STCS
>         CompactionCandidate l0Compaction = getSTCSInL0CompactionCandidate();
>         if (l0Compaction != null)
>             return l0Compaction;
>         ..
> {code}
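The refactor the ticket asks for can be shown with a self-contained toy (an assumption-laden sketch, not the committed patch; `pickCompaction` and the integer stand-ins for candidates are invented for illustration): evaluate the potentially expensive L0 check once, before walking the levels, instead of re-running it inside every iteration whose score exceeds the threshold.

```java
public class L0CheckSketch
{
    // Counts how often the (potentially expensive) L0 check runs.
    static int l0Checks = 0;

    // Stand-in for getSTCSInL0CompactionCandidate(); 42 represents a real candidate.
    static Integer stcsInL0Candidate(boolean l0FallingBehind)
    {
        l0Checks++;
        return l0FallingBehind ? 42 : null;
    }

    static Integer pickCompaction(double[] scorePerLevel, boolean l0FallingBehind)
    {
        // Hoisted: one check, no matter how many levels score above 1.001.
        Integer l0 = stcsInL0Candidate(l0FallingBehind);
        for (int i = scorePerLevel.length - 1; i > 0; i--)
        {
            if (scorePerLevel[i] > 1.001)
            {
                if (l0 != null)
                    return l0; // L0 is far enough behind: do STCS there first
                return i;      // otherwise compact this level
            }
        }
        return l0; // no higher level is behind; L0 STCS may still apply
    }
}
```

With many levels over the threshold, the original shape could call the L0 check once per iteration; here it runs exactly once per candidate selection.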
[jira] [Commented] (CASSANDRA-13926) Starting and stopping quickly on Windows results in "port already in use" error
[ https://issues.apache.org/jira/browse/CASSANDRA-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189528#comment-16189528 ]

Joshua McKenzie commented on CASSANDRA-13926:
---------------------------------------------

I ran into this when doing dev on Windows, which is why I added the -a flag:
{code}
-a  Aggressive startup. Skip VerifyPorts check. For use in dev environments.
{code}
I suppose we could pursue formalizing it if you're running into this in production, but the -a in dev was sufficient to get past it for me.

> Starting and stopping quickly on Windows results in "port already in use" error
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13926
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13926
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Packaging
>         Environment: Windows
>            Reporter: Jason Rust
>            Priority: Minor
>              Labels: windows
>
> If I stop/start Cassandra within a minute on Windows using the included Powershell script, it can fail to start with the error message "Found a port already in use. Aborting startup."
> This is because the Powershell script uses netstat to find ports that are in use, and even if Cassandra is stopped it is still listed for a short time (reported as TIME_WAIT). See https://superuser.com/questions/173535/what-are-close-wait-and-time-wait-states
> A change to the Powershell script to ensure that only ESTABLISHED ports are searched solves the problem for me and involves changing from:
> {code}
> if ($line -match "TCP" -and $line -match $portRegex)
> {code}
> to
> {code}
> if ($line -match "TCP" -and $line -match $portRegex -and $line -match "ESTABLISHED")
> {code}
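The predicate the suggested Powershell change implements can be rendered as a small sketch. Assumption: the real fix lives in the startup `.ps1` script; the Java class and sample netstat lines below are invented purely to make the TIME_WAIT vs ESTABLISHED distinction testable.

```java
public class PortLineFilter
{
    // Mirrors: $line -match "TCP" -and $line -match $portRegex -and $line -match "ESTABLISHED"
    // A TIME_WAIT leftover from a just-stopped node matches the port regex but
    // not ESTABLISHED, so it no longer aborts startup.
    static boolean portInUse(String netstatLine, String portRegex)
    {
        return netstatLine.contains("TCP")
            && netstatLine.matches(".*" + portRegex + ".*")
            && netstatLine.contains("ESTABLISHED");
    }
}
```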
[jira] [Commented] (CASSANDRA-13123) Draining a node might fail to delete all inactive commitlogs
[ https://issues.apache.org/jira/browse/CASSANDRA-13123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189523#comment-16189523 ] Joshua McKenzie commented on CASSANDRA-13123: - bq. suspect it may be a test ordering issue (if the two tests are run in one order they pass, in the other they fail, so probably setup/teardown conditions). The brittleness of CL startup/teardown in unit testing was a pretty significant pain in the ass when I was working on CDC. Stupp and I have both bumped up against that in the memorable recent past and tidied things up a bit, but I suspect it will require a more invasive re-arch of the segment allocation and CL startup/shutdown to get it really ironed out. > Draining a node might fail to delete all inactive commitlogs > > > Key: CASSANDRA-13123 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13123 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Jan Urbański >Assignee: Jan Urbański > Fix For: 3.0.15, 3.11.1, 4.0 > > Attachments: 13123-2.2.8.txt, 13123-3.0.10.txt, 13123-3.9.txt, > 13123-trunk.txt > > > After issuing a drain command, it's possible that not all of the inactive > commitlogs are removed. > The drain command shuts down the CommitLog instance, which in turn shuts down > the CommitLogSegmentManager. This has the effect of discarding any pending > management tasks it might have, like the removal of inactive commitlogs. > This in turn leads to an excessive amount of commitlogs being left behind > after a drain and a lengthy recovery after a restart. With a fleet of dozens > of nodes, each of them leaving several GB of commitlogs after a drain and > taking up to two minutes to recover them on restart, the additional time > required to restart the entire fleet becomes noticeable. > This problem is not present in 3.x or trunk because of the CLSM rewrite done > in CASSANDRA-8844. 
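The failure mode described in CASSANDRA-13123 -- shutting the segment manager down while deletion tasks are still queued, which discards them -- can be modeled with a plain executor. This is a simplified stand-in, not the actual CommitLog code: draining the queue first (`shutdown()` plus `awaitTermination()`) lets the pending deletions run, whereas a `shutdownNow()`-style teardown would drop them.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class DrainSketch
{
    public static int drainThenShutdown(int pendingDeletions) throws InterruptedException
    {
        ExecutorService manager = Executors.newSingleThreadExecutor();
        AtomicInteger deleted = new AtomicInteger();
        for (int i = 0; i < pendingDeletions; i++)
            manager.submit(() -> { deleted.incrementAndGet(); }); // stand-in for "delete inactive segment"

        manager.shutdown();                              // stop accepting work, keep the queue
        manager.awaitTermination(10, TimeUnit.SECONDS);  // let queued deletions complete
        return deleted.get();                            // every pending segment was removed
    }
}
```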
[jira] [Comment Edited] (CASSANDRA-13850) Modifying "cassandra-env.sh"
[ https://issues.apache.org/jira/browse/CASSANDRA-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189512#comment-16189512 ] Amitkumar Ghatwal edited comment on CASSANDRA-13850 at 10/3/17 10:21 AM: - [~jjirsa] [~mshuler] - Could you please check the PR here and see this fits the bill. This is an upgrade to this ticket - https://issues.apache.org/jira/browse/CASSANDRA-13601 was (Author: amitkumar_ghatwal): [~jjirsa] [~mshuler] - Could you please check the PR here and see this fit fits the bill. This is an upgrade to this ticket - https://issues.apache.org/jira/browse/CASSANDRA-13601 > Modifying "cassandra-env.sh" > > > Key: CASSANDRA-13850 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13850 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Amitkumar Ghatwal > Fix For: 4.x > > > Hi All, > Added support for arch in "cassandra-env.sh " with PR : > https://github.com/apache/cassandra/pull/149 > Regards, > Amit -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13850) Modifying "cassandra-env.sh"
[ https://issues.apache.org/jira/browse/CASSANDRA-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189512#comment-16189512 ] Amitkumar Ghatwal commented on CASSANDRA-13850: --- [~jjirsa] [~mshuler] - Could you please check the PR here and see this fit fits the bill. This is an upgrade to this ticket - https://issues.apache.org/jira/browse/CASSANDRA-13601 > Modifying "cassandra-env.sh" > > > Key: CASSANDRA-13850 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13850 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Amitkumar Ghatwal > Fix For: 4.x > > > Hi All, > Added support for arch in "cassandra-env.sh " with PR : > https://github.com/apache/cassandra/pull/149 > Regards, > Amit -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13929) BTree$Builder / io.netty.util.Recycler$Stack leaking memory
[ https://issues.apache.org/jira/browse/CASSANDRA-13929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Steinmaurer updated CASSANDRA-13929:
-------------------------------------------
    Description:

Different to CASSANDRA-13754, there seems to be another memory leak in 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
* Heap utilization increased after upgrading to 3.11.0 => cassandra_3.11.0_min_memory_utilization.jpg
* No difference after upgrading to 3.11.1 (snapshot build) => cassandra_3.11.1_snapshot_heaputilization.png; thus most likely just more visible now, after fixing CASSANDRA-13754
* MAT shows io.netty.util.Recycler$Stack as top contributing class => cassandra_3.11.1_mat_dominator_classes.png
* With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart after ~72 hours

Verified the following fix, namely explicitly unreferencing the _recycleHandle_ member (making it non-final), in _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_:
{code}
public void recycle()
{
    if (recycleHandle != null)
    {
        this.cleanup();
        builderRecycler.recycle(this, recycleHandle);
        recycleHandle = null; // ADDED
    }
}
{code}
Patched a single node in our loadtest cluster with this change and after ~10 hours uptime, no sign of the previously offending class in MAT anymore => cassandra_3.11.1_mat_dominator_classes_FIXED.png
Can't say if this has any other side effects etc., but I doubt it.

  was:

Different to CASSANDRA-13754, there seems to be another memory leak in 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
* Heap utilization increased after upgrading to 3.11.0 => cassandra_3.11.0_min_memory_utilization.jpg
* No difference after upgrading to 3.11.1 (snapshot build) => cassandra_3.11.1_snapshot_heaputilization.png; thus most likely just more visible now, after fixing CASSANDRA-13754
* MAT shows io.netty.util.Recycler$Stack as top contributing class => cassandra_3.11.1_mat_dominator_classes.png
* With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart after ~72 hours

Verified the following fix, namely explicitly unreferencing the _recycleHandle_ member (making it non-final), in _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_:
{code}
public void recycle()
{
    if (recycleHandle != null)
    {
        this.cleanup();
        builderRecycler.recycle(this, recycleHandle);
        recycleHandle = null; // ADDED
    }
}
{code}
Patched a single node in our loadtest cluster with this change and after ~10 hours uptime, no sign of the previously offending class in MAT anymore => cassandra_3.11.1_mat_dominator_classes_FIXED.png

> BTree$Builder / io.netty.util.Recycler$Stack leaking memory
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-13929
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13929
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Thomas Steinmaurer
>         Attachments: cassandra_3.11.0_min_memory_utilization.jpg, cassandra_3.11.1_mat_dominator_classes_FIXED.png, cassandra_3.11.1_mat_dominator_classes.png, cassandra_3.11.1_snapshot_heaputilization.png
>
> Different to CASSANDRA-13754, there seems to be another memory leak in 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
> * Heap utilization increased after upgrading to 3.11.0 => cassandra_3.11.0_min_memory_utilization.jpg
> * No difference after upgrading to 3.11.1 (snapshot build) => cassandra_3.11.1_snapshot_heaputilization.png; thus most likely just more visible now, after fixing CASSANDRA-13754
> * MAT shows io.netty.util.Recycler$Stack as top contributing class => cassandra_3.11.1_mat_dominator_classes.png
> * With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart after ~72 hours
> Verified the following fix, namely explicitly unreferencing the _recycleHandle_ member (making it non-final), in _org.apache.cassandra.utils.btree.BTree.Builder.recycle()_:
> {code}
> public void recycle()
> {
>     if (recycleHandle != null)
>     {
>         this.cleanup();
>         builderRecycler.recycle(this, recycleHandle);
>         recycleHandle = null; // ADDED
>     }
> }
> {code}
> Patched a single node in our loadtest cluster with this change and after ~10 hours uptime, no sign of the previously offending class in MAT anymore => cassandra_3.11.1_mat_dominator_classes_FIXED.png
> Can't say if this has any other side effects etc., but I doubt it.
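The effect of the reporter's patch can be demonstrated with a JDK-only toy. Heavy caveat: this is a hand-rolled stand-in for io.netty.util.Recycler, not its real API or the real leak mechanics; only the `recycleHandle` field name mirrors the snippet above, everything else is assumed for illustration. The point is the retention chain: while a pooled builder keeps a live reference to its handle, whatever the handle reaches stays reachable too, and nulling the field after recycling breaks that chain.

```java
import java.util.ArrayDeque;

public class RecyclerSketch
{
    static final ArrayDeque<Builder> POOL = new ArrayDeque<>();

    public static class Builder
    {
        public Object recycleHandle; // stand-in for a Recycler.Handle

        public void recycle()
        {
            if (recycleHandle != null)
            {
                POOL.push(this);
                recycleHandle = null; // the fix: drop the back-reference so the
                                      // handle (and what it reaches) can be GC'd
            }
        }
    }

    public static Builder get()
    {
        Builder b = POOL.isEmpty() ? new Builder() : POOL.pop();
        b.recycleHandle = new Object(); // handed out with a fresh handle
        return b;
    }
}
```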