[jira] [Comment Edited] (CASSANDRA-14261) Compaction Profiling Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16435468#comment-16435468 ]

Marcus Eriksson edited comment on CASSANDRA-14261 at 4/12/18 12:35 PM:
---

this lgtm, but I don't think the {{isCounterColumn}} change is correct - it seems super columns with counters are represented as a map with counter values, like this (in 3.0, upgraded from 2.1):

{code}
cqlsh> describe sc.counters;

CREATE TABLE sc.counters (
    key text,
    column1 text,
    column2 blob,
    "" map,
    value counter,
    PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = 'NONE';
{code}

the "for thrift" comment should be removed though

was (Author: krummas):

this lgtm, but I don't think the {{isCounterColumn}} change is correct - it seems super columns with counters are represented as a map with counter values, like this (in 3.0, upgraded from 2.1):

{code}
cqlsh> describe sc.counters;

CREATE TABLE sc.counters (
    key text,
    column1 text,
    column2 blob,
    "" map,
    value counter,
    PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = 'NONE';
{code}

> Compaction Profiling Improvements
> ---------------------------------
>
>                 Key: CASSANDRA-14261
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14261
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Jeff Jirsa
>            Assignee: Jeff Jirsa
>            Priority: Minor
>             Fix For: 4.x
>
>         Attachments: patched-hot-threads.png, patched-tlab.png,
> unpatched-hot-threads-top.png, unpatched-hot-threads.png, unpatched-tlab.png
>
>
> There's some low hanging fruit in some laptop compaction runs, such as
> creating a ton of the same object unnecessarily and hashing cell names
> repeatedly to see if a column is dropped even when we should know that the
> table has no dropped columns.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
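The concern in the comment above (a super column with counters shows up as a map whose values are counters) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ColumnType`, `SimpleType`, `MapType` and `CounterCheckSketch` are hypothetical names invented for illustration and are not Cassandra's classes, and the `blob` key type of the map is assumed, since the schema output does not show it. It demonstrates why a check that only looks at the column's own type might miss such a column.

```java
// Toy model only: every type here is hypothetical and NOT part of Cassandra.
interface ColumnType {
    /** True only when the type itself is the counter type. */
    boolean isCounter();
}

enum SimpleType implements ColumnType {
    TEXT, BLOB, COUNTER;

    @Override
    public boolean isCounter() {
        return this == COUNTER;
    }
}

final class MapType implements ColumnType {
    final ColumnType keys;
    final ColumnType values;

    MapType(ColumnType keys, ColumnType values) {
        this.keys = keys;
        this.values = values;
    }

    @Override
    public boolean isCounter() {
        // A map is not itself a counter, even if its values are.
        return false;
    }
}

final class CounterCheckSketch {
    /** Naive check: inspects only the column's own type. */
    static boolean naiveIsCounterColumn(ColumnType type) {
        return type.isCounter();
    }

    /** Check that also looks through a map to its value type. */
    static boolean isCounterColumn(ColumnType type) {
        if (type instanceof MapType)
            return ((MapType) type).values.isCounter();
        return type.isCounter();
    }

    public static void main(String[] args) {
        // Assumed shape of the "" column in sc.counters: a map with counter values.
        ColumnType superColumn = new MapType(SimpleType.BLOB, SimpleType.COUNTER);
        System.out.println("naive:         " + naiveIsCounterColumn(superColumn));
        System.out.println("looks through: " + isCounterColumn(superColumn));
    }
}
```

Whether the real {{isCounterColumn}} needs this kind of look-through is exactly what the comment is questioning; the sketch only shows the shape of the problem.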
[jira] [Comment Edited] (CASSANDRA-14261) Compaction Profiling Improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383189#comment-16383189 ]

Jeff Jirsa edited comment on CASSANDRA-14261 at 3/2/18 5:26 AM:

So I've re-run my trivial tests on a laptop (because I can't run JMC/FR in real environments) to demonstrate where the improvement lies, and the good thing is that it repros easily and consistently. It's still not a real environment, but I don't see any reason to believe this is a weird synthetic improvement.

While I was there, I remembered that I had also cleaned up one additional check (isCounter() was checking instanceof for thrift, and with thrift removed, I believe this can be removed), which helps a bit more.

Squashed patch here [https://github.com/jeffjirsa/cassandra/commit/d627dfbb3a7653d8c5a8c7a170a9621edbc1f1cd]

Here are the top threads with trunk:

!unpatched-hot-threads-top.png|height=400,width=800!

And expanded down to see the isDroppedColumn time:

!unpatched-hot-threads.png|height=400,width=800!

And the trunk tlab allocation during this compaction test:

!unpatched-tlab.png|height=400,width=800!

After applying these patches (except the isCounter() patch, which I did after this run), note that BTreeBuilder.resolve() has fallen significantly (as isDroppedColumn becomes trivially cheap):

!patched-hot-threads.png|height=400,width=800!

And tlab allocation (note we've done the same compaction, but allocated 500MB fewer empty iterators):

!patched-tlab.png|height=400,width=800!

was (Author: jjirsa):

So I've re-run my trivial tests on a laptop (because I can't run JMC/FR in real environments) to demonstrate where the improvement lies, and the good thing is that it repros easily and consistently, which reinforces that it's a real gain.

While I was there, I remembered that I had also cleaned up one additional check (isCounter() was checking instanceof for thrift, and with thrift removed, I believe this can be removed), which helps a bit more.

Squashed patch here [https://github.com/jeffjirsa/cassandra/commit/d627dfbb3a7653d8c5a8c7a170a9621edbc1f1cd]

Here are the top threads with trunk:

!unpatched-hot-threads-top.png|height=400,width=800!

And expanded down to see the isDroppedColumn time:

!unpatched-hot-threads.png|height=400,width=800!

And the trunk tlab allocation during this compaction test:

!unpatched-tlab.png|height=400,width=800!

After applying these patches (except the isCounter() patch, which I did after this run), note that BTreeBuilder.resolve() has fallen significantly (as isDroppedColumn becomes trivially cheap):

!patched-hot-threads.png|height=400,width=800!

And tlab allocation (note we've done the same compaction, but allocated 500MB fewer empty iterators):

!patched-tlab.png|height=400,width=800!
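The two hot spots named in this thread (hashing cell names to check for dropped columns even when the table has none, and allocating large numbers of identical empty iterators) lend themselves to a small illustration. What follows is a minimal sketch, not the linked patch: `DroppedColumnSketch`, `isDropped` and `emptyIterator` are hypothetical names, the dropped-column metadata is modelled as a plain map from column name to drop timestamp, and the rule that a cell counts as dropped when its timestamp is at or before the drop time is a simplifying assumption.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical sketch; not Cassandra code and not the patch from the ticket.
final class DroppedColumnSketch {
    // Assumed model of dropped-column metadata: column name -> drop timestamp.
    private final Map<String, Long> droppedColumns;

    DroppedColumnSketch(Map<String, Long> droppedColumns) {
        this.droppedColumns = droppedColumns;
    }

    /** Returns true if a cell written at {@code timestamp} belongs to a dropped column. */
    boolean isDropped(String columnName, long timestamp) {
        // Fast path: with no dropped columns there is nothing to hash or look up.
        if (droppedColumns.isEmpty())
            return false;
        Long droppedAt = droppedColumns.get(columnName);
        return droppedAt != null && timestamp <= droppedAt;
    }

    // One shared, stateless empty iterator instead of a fresh allocation per call.
    private static final Iterator<Object> EMPTY = new Iterator<Object>() {
        @Override
        public boolean hasNext() {
            return false;
        }

        @Override
        public Object next() {
            throw new NoSuchElementException();
        }
    };

    @SuppressWarnings("unchecked")
    static <T> Iterator<T> emptyIterator() {
        // Safe because the iterator never yields an element.
        return (Iterator<T>) EMPTY;
    }

    public static void main(String[] args) {
        DroppedColumnSketch noDrops = new DroppedColumnSketch(new HashMap<>());
        System.out.println(noDrops.isDropped("value", 10L));

        Map<String, Long> drops = new HashMap<>();
        drops.put("old_col", 100L);
        DroppedColumnSketch withDrops = new DroppedColumnSketch(drops);
        System.out.println(withDrops.isDropped("old_col", 50L));
        System.out.println(withDrops.isDropped("old_col", 150L));
    }
}
```

The empty-map check keeps the common case (no dropped columns) free of any per-cell hashing, and the shared iterator avoids per-call allocation because an empty iterator carries no state.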