[jira] [Commented] (CASSANDRA-14261) Compaction Profiling Improvements

2018-04-12 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16435468#comment-16435468
 ] 

Marcus Eriksson commented on CASSANDRA-14261:
-

this lgtm, but I don't think the {{isCounterColumn}} change is correct - it 
seems super columns with counters are represented as a map with counter values, 
like this (in 3.0, upgraded from 2.1):
{{cqlsh> describe sc.counters;

CREATE TABLE sc.counters (
key text,
column1 text,
column2 blob,
"" map,
value counter,
PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = 'NONE';
}}
That "for thrift" comment should be removed, though.

> Compaction Profiling Improvements
> -
>
> Key: CASSANDRA-14261
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14261
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
> Fix For: 4.x
>
> Attachments: patched-hot-threads.png, patched-tlab.png, 
> unpatched-hot-threads-top.png, unpatched-hot-threads.png, unpatched-tlab.png
>
>
> There's some low hanging fruit in some laptop compaction runs, such as 
> creating a ton of the same object unnecessarily and hashing cell names 
> repeatedly to see if a column is dropped even when we should know that the 
> table has no dropped columns. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14261) Compaction Profiling Improvements

2018-03-01 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383189#comment-16383189
 ] 

Jeff Jirsa commented on CASSANDRA-14261:


So I've re-run my trivial tests on a laptop just to demonstrate where the 
improvement lies, and the good thing is that it repros easily and consistently, 
which reinforces that it's a real gain.

While I was there, I remembered that I had also cleaned up one additional check 
(isCounter() was checking instanceof for thrift, and with thrift removed, I 
believe this can be removed), which helps a bit more. 

Squashed patch here 
[https://github.com/jeffjirsa/cassandra/commit/d627dfbb3a7653d8c5a8c7a170a9621edbc1f1cd]

 
Here's the top threads with trunk:

!unpatched-hot-threads-top.png|height=400,width=800!

And expanded down to see the isDroppedColumn time: 

!unpatched-hot-threads.png|height=400,width=800!
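
For readers without the screenshots: the issue description attributes this cost to hashing cell names repeatedly to see if a column is dropped, even when the table has no dropped columns. Below is a minimal sketch of the kind of short-circuit that avoids that work. It is not the actual patch; the class {{DroppedColumnCheck}}, its methods, and the "timestamp at or before the drop time" rule are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for table metadata; NOT Cassandra's actual class.
final class DroppedColumnCheck
{
    // Maps a dropped column's name to the timestamp at which it was dropped.
    private final Map<ByteBuffer, Long> droppedColumns;

    // Computed once, so the common case needs no hashing at all.
    private final boolean hasDroppedColumns;

    DroppedColumnCheck(Map<ByteBuffer, Long> droppedColumns)
    {
        this.droppedColumns = new HashMap<>(droppedColumns);
        this.hasDroppedColumns = !this.droppedColumns.isEmpty();
    }

    // Assumed rule: a cell is dropped if it was written at or before the drop time.
    boolean isDropped(ByteBuffer columnName, long cellTimestamp)
    {
        if (!hasDroppedColumns)
            return false; // fast path: the column name is never hashed

        Long droppedAt = droppedColumns.get(columnName); // hashes the name
        return droppedAt != null && cellTimestamp <= droppedAt;
    }

    // Helper for building column names in examples.
    static ByteBuffer name(String s)
    {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }
}
```

With an empty map, every call returns through the fast path; the map lookup (and the hash of the column name) only happens for tables that really have dropped columns.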

And the trunk tlab allocation during this compaction test:

!unpatched-tlab.png|height=400,width=800!

After applying these patches (except the isCounter() patch, which I did after 
this run), note that BTreeBuilder.resolve() has fallen significantly (as 
isDroppedColumn becomes trivially cheap):

!patched-hot-threads.png|height=400,width=800!

And tlab allocation (note we've done the same compaction, but allocated 500MB 
fewer empty iterators): 

!patched-tlab.png|height=400,width=800!
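
To illustrate the "fewer empty iterators" point, here is a minimal sketch of handing out one shared empty iterator instead of allocating a new one per call. It is an illustration only, not the code from the patch; {{CellHolder}} and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical container of cells; NOT a Cassandra class.
final class CellHolder
{
    // One immutable empty iterator shared by every empty holder.
    private static final Iterator<?> EMPTY = Collections.emptyIterator();

    private final List<String> cells;

    CellHolder(List<String> cells)
    {
        this.cells = cells;
    }

    // Safe cast: an empty iterator never produces an element of any type.
    @SuppressWarnings("unchecked")
    private static <T> Iterator<T> sharedEmptyIterator()
    {
        return (Iterator<T>) EMPTY;
    }

    // Allocates a fresh iterator even when there is nothing to iterate over.
    Iterator<String> iteratorAllocating()
    {
        return cells.iterator();
    }

    // Returns the shared instance for the empty case, so that case allocates nothing.
    Iterator<String> iterator()
    {
        return cells.isEmpty() ? CellHolder.<String>sharedEmptyIterator() : cells.iterator();
    }
}
```

Sharing is safe here because an empty iterator carries no state: hasNext() is always false. In a compaction loop that asks many empty containers for an iterator, the per-call allocation otherwise adds up.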







[jira] [Commented] (CASSANDRA-14261) Compaction Profiling Improvements

2018-02-26 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377386#comment-16377386
 ] 

Joshua McKenzie commented on CASSANDRA-14261:
-

bq.  but these looked simple and sufficiently isolated not to warrant a full 
run in perf.
Right. I was thinking of something more along the lines of JMH microbenchmarks, 
or something else that might be usable in the future as well.

What you have here is pretty non-controversial, but this project has a pretty 
long history of bad hygiene around performance-optimization code changes made 
w/out perf testing.




[jira] [Commented] (CASSANDRA-14261) Compaction Profiling Improvements

2018-02-26 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16377368#comment-16377368
 ] 

Jeff Jirsa commented on CASSANDRA-14261:


I don't, because I did them on a laptop on a plane over the holidays, and 
they're decidedly synthetic non-prod tests, but these looked simple and 
sufficiently isolated not to warrant a full run in perf.

Here's an estimate from memory though:

[this|https://github.com/jeffjirsa/cassandra/commit/b2b2d765c089c5be609d65f04611b2800ffa70b8]
 was based on seeing that function in the stack about 5% of the time in 
compaction, and it goes from 5% to ~0.001% with that trivial patch.

[this|https://github.com/jeffjirsa/cassandra/commit/dc8070eaa5ec52e8be46358777fe42d9944f5f30]
 was based on seeing ~130MB of allocations (about 3.12% of the TLAB allocation 
for a 9 second span).

And 
[this|https://github.com/jeffjirsa/cassandra/commit/391846e4d0cfd8c8076c3e6050fb0b13496e24ed]
 I expect to never show up in profiles except under very high contention, which 
I have little desire to manually test, but it should be fairly obvious to most 
people that it's both safe and necessary.





[jira] [Commented] (CASSANDRA-14261) Compaction Profiling Improvements

2018-02-26 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376897#comment-16376897
 ] 

Joshua McKenzie commented on CASSANDRA-14261:
-

You have some perf runs to go with those correctness runs to indicate the 
impact of the change?




[jira] [Commented] (CASSANDRA-14261) Compaction Profiling Improvements

2018-02-25 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376493#comment-16376493
 ] 

Jeff Jirsa commented on CASSANDRA-14261:


Branch [here|https://github.com/jeffjirsa/cassandra/commits/trunk-perf-fixes] 
with some fixes. Tests will run 
[here|https://circleci.com/gh/jeffjirsa/cassandra/tree/trunk-perf-fixes]. 
Ignore the commit about refactoring Pair, that's for CASSANDRA-14260.

