[jira] [Comment Edited] (CASSANDRA-14261) Compaction Profiling Improvements

2018-04-12 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16435468#comment-16435468
 ] 

Marcus Eriksson edited comment on CASSANDRA-14261 at 4/12/18 12:35 PM:
---

this lgtm, but I don't think the {{isCounterColumn}} change is correct - it 
seems super columns with counters are represented as a map with counter values, 
like this (in 3.0, upgraded from 2.1):
{code}
cqlsh> describe sc.counters;

   
CREATE TABLE sc.counters (
key text,
column1 text,
column2 blob,
"" map,
value counter,
PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';
{code}

the "for thrift" comment should be removed though
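
To illustrate the concern above, here is a minimal sketch, not the patch itself. {{ColumnType}}, {{SimpleType}}, {{MapType}} and both check methods are hypothetical stand-ins invented for this example, not Cassandra's real classes. It only assumes what the schema above suggests: a super column of counters appears as a map whose values are counters, so a check that looks only at the top-level type would miss it.

```java
// Hypothetical stand-ins for a column type hierarchy; not Cassandra's real classes.
public class CounterColumnCheck {
    public interface ColumnType {
        boolean isCounter();
    }

    public static final class SimpleType implements ColumnType {
        private final boolean counter;
        public SimpleType(boolean counter) { this.counter = counter; }
        @Override public boolean isCounter() { return counter; }
    }

    public static final class MapType implements ColumnType {
        public final ColumnType keys;
        public final ColumnType values;
        public MapType(ColumnType keys, ColumnType values) {
            this.keys = keys;
            this.values = values;
        }
        // A map is not itself a counter, even when its values are.
        @Override public boolean isCounter() { return false; }
    }

    // Looks only at the top-level type, so it misses map-of-counter columns.
    public static boolean isCounterColumnNaive(ColumnType type) {
        return type.isCounter();
    }

    // Also treats a map whose values are counters as a counter column.
    public static boolean isCounterColumn(ColumnType type) {
        if (type instanceof MapType)
            return ((MapType) type).values.isCounter();
        return type.isCounter();
    }

    public static void main(String[] args) {
        ColumnType blob = new SimpleType(false);
        ColumnType counter = new SimpleType(true);
        ColumnType superColumn = new MapType(blob, counter);
        System.out.println(isCounterColumnNaive(superColumn)); // false
        System.out.println(isCounterColumn(superColumn));      // true
    }
}
```

Whether the real check needs the map branch depends on how upgraded super column tables are actually read, which this sketch does not model.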



> Compaction Profiling Improvements
> -
>
> Key: CASSANDRA-14261
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14261
> Project: Cassandra
> Issue Type: Improvement
> Components: Compaction
> Reporter: Jeff Jirsa
> Assignee: Jeff Jirsa
> Priority: Minor
> Fix For: 4.x
>
> Attachments: patched-hot-threads.png, patched-tlab.png, 
> unpatched-hot-threads-top.png, unpatched-hot-threads.png, unpatched-tlab.png
>
>
> There's some low hanging fruit in some laptop compaction runs, such as 
> creating a ton of the same object unnecessarily and hashing cell names 
> repeatedly to see if a column is dropped even when we should know that the 
> table has no dropped columns. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-14261) Compaction Profiling Improvements

2018-04-12 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16435468#comment-16435468
 ] 

Marcus Eriksson edited comment on CASSANDRA-14261 at 4/12/18 12:30 PM:
---

this lgtm, but I don't think the {{isCounterColumn}} change is correct - it 
seems super columns with counters are represented as a map with counter values, 
like this (in 3.0, upgraded from 2.1):
{code}
cqlsh> describe sc.counters;

   
CREATE TABLE sc.counters (
key text,
column1 text,
column2 blob,
"" map,
value counter,
PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';
{code}






[jira] [Comment Edited] (CASSANDRA-14261) Compaction Profiling Improvements

2018-04-12 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16435468#comment-16435468
 ] 

Marcus Eriksson edited comment on CASSANDRA-14261 at 4/12/18 12:28 PM:
---

this lgtm, but I don't think the {{isCounterColumn}} change is correct - it 
seems super columns with counters are represented as a map with counter values, 
like this (in 3.0, upgraded from 2.1):
{{cqlsh> describe sc.counters;

CREATE TABLE sc.counters (
key text,
column1 text,
column2 blob,
"" map,
value counter,
PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';
}}
that "for thrift" comment should be removed though






[jira] [Comment Edited] (CASSANDRA-14261) Compaction Profiling Improvements

2018-04-12 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16435468#comment-16435468
 ] 

Marcus Eriksson edited comment on CASSANDRA-14261 at 4/12/18 12:27 PM:
---

this lgtm, but I don't think the {{isCounterColumn}} change is correct - it 
seems super columns with counters are represented as a map with counter values, 
like this (in 3.0, upgraded from 2.1):
{{cqlsh> describe sc.counters;  

 

CREATE TABLE sc.counters (
key text,
column1 text,
column2 blob,
"" map,
value counter,
PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';
}}
that "for thrift" comment should be removed though






[jira] [Comment Edited] (CASSANDRA-14261) Compaction Profiling Improvements

2018-03-01 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383189#comment-16383189
 ] 

Jeff Jirsa edited comment on CASSANDRA-14261 at 3/2/18 5:26 AM:


So I've re-run my trivial tests on a laptop (because I can't run JMC/FR in real 
environments) to demonstrate where the improvement lies, and the good thing is 
that it reproduces easily and consistently. Still not a real environment, but I 
don't see any reason to believe this is a weird synthetic improvement. 

While I was there, I remembered that I had also cleaned up one additional check 
(isCounter() was checking instanceof for thrift, and with thrift removed, I 
believe this can be removed), which helps a bit more. 

Squashed patch here 
[https://github.com/jeffjirsa/cassandra/commit/d627dfbb3a7653d8c5a8c7a170a9621edbc1f1cd]

 
Here are the top threads with trunk:

!unpatched-hot-threads-top.png|height=400,width=800!

And expanded down to see the isDroppedColumn time: 

!unpatched-hot-threads.png|height=400,width=800!

And the trunk tlab allocation during this compaction test:

!unpatched-tlab.png|height=400,width=800!

After applying these patches (except the isCounter() patch, which I did after 
this run), note that BTreeBuilder.resolve() has fallen significantly (as 
isDroppedColumn became trivially cheap):

!patched-hot-threads.png|height=400,width=800!
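
A minimal sketch of the kind of short-circuit that can make a dropped-column check cheap, under stated assumptions: {{DroppedColumnCheck}}, its fields and {{isDropped}} are hypothetical names for illustration and are not taken from the patch. The sketch also assumes a cell counts as dropped when its timestamp is at or before the column's drop time. The point is only that a flag computed once lets the hot path skip the name lookup (and any hashing it implies) when the table has no dropped columns.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical table-level helper; not Cassandra's real metadata class.
public class DroppedColumnCheck {
    // Dropped column name -> drop timestamp (assumed semantics, see lead-in).
    private final Map<ByteBuffer, Long> droppedColumns;
    // Computed once so the per-cell path can return without touching the map.
    private final boolean hasDroppedColumns;

    public DroppedColumnCheck(Map<ByteBuffer, Long> droppedColumns) {
        // Defensive copy so the cached flag cannot go stale.
        this.droppedColumns = new HashMap<>(droppedColumns);
        this.hasDroppedColumns = !this.droppedColumns.isEmpty();
    }

    public boolean isDropped(ByteBuffer columnName, long cellTimestamp) {
        // Fast path: no dropped columns means no lookup at all.
        if (!hasDroppedColumns)
            return false;
        Long droppedAt = droppedColumns.get(columnName);
        return droppedAt != null && cellTimestamp <= droppedAt;
    }

    public static void main(String[] args) {
        ByteBuffer name = ByteBuffer.wrap("value".getBytes(StandardCharsets.UTF_8));
        Map<ByteBuffer, Long> dropped = new HashMap<>();
        DroppedColumnCheck none = new DroppedColumnCheck(dropped);
        dropped.put(name.duplicate(), 10L);
        DroppedColumnCheck some = new DroppedColumnCheck(dropped);
        System.out.println(none.isDropped(name, 5L));  // false
        System.out.println(some.isDropped(name, 5L));  // true
        System.out.println(some.isDropped(name, 11L)); // false
    }
}
```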

And tlab allocation (note we've done the same compaction, but allocated 500MB 
fewer empty iterators): 

!patched-tlab.png|height=400,width=800!
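
The allocation saving above comes from not creating a fresh empty iterator each time. A minimal sketch of that general technique follows; {{SharedEmptyIterator}}, {{empty()}} and {{iteratorFor()}} are hypothetical names for illustration and do not come from the patch. Because an empty iterator carries no state, one shared instance can be handed to every caller.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper showing the shared-instance idea; not Cassandra code.
public class SharedEmptyIterator {
    // One stateless instance serves every caller that needs "no elements".
    private static final Iterator<Object> EMPTY = new Iterator<Object>() {
        @Override public boolean hasNext() { return false; }
        @Override public Object next() { throw new NoSuchElementException(); }
    };

    // The cast is safe because the iterator never returns an element.
    @SuppressWarnings("unchecked")
    public static <T> Iterator<T> empty() {
        return (Iterator<T>) EMPTY;
    }

    // Allocates an iterator only when there is something to iterate over.
    public static <T> Iterator<T> iteratorFor(List<T> cells) {
        return cells.isEmpty() ? SharedEmptyIterator.<T>empty() : cells.iterator();
    }

    public static void main(String[] args) {
        Object a = iteratorFor(Collections.<String>emptyList());
        Object b = iteratorFor(Collections.<Integer>emptyList());
        System.out.println(a == b); // true: same shared instance
        System.out.println(iteratorFor(Arrays.asList(1, 2)).next()); // 1
    }
}
```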












[jira] [Comment Edited] (CASSANDRA-14261) Compaction Profiling Improvements

2018-03-01 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383189#comment-16383189
 ] 

Jeff Jirsa edited comment on CASSANDRA-14261 at 3/2/18 5:22 AM:


So I've re-run my trivial tests on a laptop (because I can't run JMC/FR in real 
environments) to demonstrate where the improvement lies, and the good thing is 
it repro's easily and consistently, which reinforces that it's a real gain.

While I was there, I remembered that I had also cleaned up one additional check 
(isCounter() was checking instanceof for thrift, and with thrift removed, I 
believe this can be removed), which helps a bit more. 

Squashed patch here 
[https://github.com/jeffjirsa/cassandra/commit/d627dfbb3a7653d8c5a8c7a170a9621edbc1f1cd]

 
Here are the top threads with trunk:

!unpatched-hot-threads-top.png|height=400,width=800!

And expanded down to see the isDroppedColumn time: 

!unpatched-hot-threads.png|height=400,width=800!

And the trunk tlab allocation during this compaction test:

!unpatched-tlab.png|height=400,width=800!

After applying these patches (except the isCounter() patch, which I did after 
this run), note that BTreeBuilder.resolve() has fallen significantly (as 
isDroppedColumn became trivially cheap):

!patched-hot-threads.png|height=400,width=800!

And tlab allocation (note we've done the same compaction, but allocated 500MB 
fewer empty iterators): 

!patched-tlab.png|height=400,width=800!





was (Author: jjirsa):
So I've re-run my trivial tests on a laptop just to demonstrate where the 
improvement lies, and the good thing is it repro's easily and consistently, 
which reinforces that it's a real gain.

While I was there, I remembered that I had also cleaned up one additional check 
(isCounter() was checking instanceof for thrift, and with thrift removed, I 
believe this can be removed), which helps a bit more. 

Squashed patch here 
[https://github.com/jeffjirsa/cassandra/commit/d627dfbb3a7653d8c5a8c7a170a9621edbc1f1cd]

 
Here are the top threads with trunk:

!unpatched-hot-threads-top.png|height=400,width=800!

And expanded down to see the isDroppedColumn time: 

!unpatched-hot-threads.png|height=400,width=800!

And the trunk tlab allocation during this compaction test:

!unpatched-tlab.png|height=400,width=800!

After applying these patches (except the isCounter() patch, which I did after 
this run), note that BTreeBuilder.resolve() has fallen significantly (as 
isDroppedColumn became trivially cheap):

!patched-hot-threads.png|height=400,width=800!

And tlab allocation (note we've done the same compaction, but allocated 500MB 
fewer empty iterators): 

!patched-tlab.png|height=400,width=800!



