[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-27 Thread Alexander Dejanovski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415923#comment-16415923
 ] 

Alexander Dejanovski commented on CASSANDRA-14318:
--

For the record, the same tests on 3.11.2 didn't show any notable performance 
difference between debug on and off : 

Cassandra 3.11.2 debug on : 
{noformat}
Results:
Op rate : 18 777 op/s [read_event_1: 3 165 op/s, read_event_2: 3 109 op/s, 
read_event_3: 12 562 op/s]
Partition rate : 6 215 pk/s [read_event_1: 3 165 pk/s, read_event_2: 3 109 
pk/s, read_event_3: 0 pk/s]
Row rate : 6 215 row/s [read_event_1: 3 165 row/s, read_event_2: 3 109 row/s, 
read_event_3: 0 row/s]
Latency mean : 6,7 ms [read_event_1: 6,7 ms, read_event_2: 6,7 ms, 
read_event_3: 6,6 ms]
Latency median : 5,0 ms [read_event_1: 5,0 ms, read_event_2: 5,0 ms, 
read_event_3: 4,9 ms]
Latency 95th percentile : 15,6 ms [read_event_1: 15,5 ms, read_event_2: 15,9 
ms, read_event_3: 15,5 ms]
Latency 99th percentile : 43,3 ms [read_event_1: 42,7 ms, read_event_2: 44,2 
ms, read_event_3: 43,2 ms]
Latency 99.9th percentile : 82,0 ms [read_event_1: 80,3 ms, read_event_2: 82,4 
ms, read_event_3: 82,1 ms]
Latency max : 272,4 ms [read_event_1: 272,4 ms, read_event_2: 268,7 ms, 
read_event_3: 245,1 ms]
Total partitions : 330 970 [read_event_1: 165 386, read_event_2: 165 584, 
read_event_3: 0]
Total errors : 0 [read_event_1: 0, read_event_2: 0, read_event_3: 0]
Total GC count : 42
Total GC memory : 13,102 GiB
Total GC time : 1,8 seconds
Avg GC time : 42,4 ms
StdDev GC time : 1,3 ms
Total operation time : 00:00:53{noformat}
 


Cassandra 3.11.2 debug off : 
{noformat}
Results:
Op rate : 18 853 op/s [read_event_1: 3 138 op/s, read_event_2: 3 137 op/s, 
read_event_3: 12 578 op/s]
Partition rate : 6 275 pk/s [read_event_1: 3 138 pk/s, read_event_2: 3 137 
pk/s, read_event_3: 0 pk/s]
Row rate : 6 275 row/s [read_event_1: 3 138 row/s, read_event_2: 3 137 row/s, 
read_event_3: 0 row/s]
Latency mean : 6,7 ms [read_event_1: 6,7 ms, read_event_2: 6,7 ms, 
read_event_3: 6,7 ms]
Latency median : 5,0 ms [read_event_1: 5,1 ms, read_event_2: 5,1 ms, 
read_event_3: 5,0 ms]
Latency 95th percentile : 15,5 ms [read_event_1: 15,5 ms, read_event_2: 15,6 
ms, read_event_3: 15,4 ms]
Latency 99th percentile : 39,9 ms [read_event_1: 41,0 ms, read_event_2: 39,6 
ms, read_event_3: 39,6 ms]
Latency 99.9th percentile : 73,3 ms [read_event_1: 73,4 ms, read_event_2: 71,6 
ms, read_event_3: 73,6 ms]
Latency max : 367,0 ms [read_event_1: 240,5 ms, read_event_2: 250,3 ms, 
read_event_3: 367,0 ms]
Total partitions : 332 852 [read_event_1: 166 447, read_event_2: 166 405, 
read_event_3: 0]
Total errors : 0 [read_event_1: 0, read_event_2: 0, read_event_3: 0]
Total GC count : 46
Total GC memory : 14,024 GiB
Total GC time : 2,0 seconds
Avg GC time : 42,7 ms
StdDev GC time : 3,9 ms
Total operation time : 00:00:53{noformat}
The improvement over 2.2 is nice though :)

 

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: lhf, performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: cassandra-2.2-debug.yaml, debuglogging.png, flame22 
> nodebug sjk svg.png, flame22-nodebug-sjk.svg, flame22-sjk.svg, 
> flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-27 Thread Alexander Dejanovski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415846#comment-16415846
 ] 

Alexander Dejanovski commented on CASSANDRA-14318:
--

[~jjirsa]: apparently the ReadCallback class already logs at TRACE and not 
DEBUG on the latest 2.2.

I've created the fix that downgrades debug logging to trace logging in the 
query pager classes, and here are the results : 

debug on - no fix : 
{noformat}
Results:
op rate : 6681 [read_event_1:1109, read_event_2:1119, read_event_3:4452]
partition rate : 6681 [read_event_1:1109, read_event_2:1119, read_event_3:4452]
row rate : 6681 [read_event_1:1109, read_event_2:1119, read_event_3:4452]
latency mean : 19,1 [read_event_1:15,4, read_event_2:15,4, read_event_3:21,0]
latency median : 15,6 [read_event_1:14,2, read_event_2:14,0, read_event_3:16,3]
latency 95th percentile : 39,1 [read_event_1:28,4, read_event_2:28,6, 
read_event_3:44,2]
latency 99th percentile : 75,6 [read_event_1:52,9, read_event_2:53,6, 
read_event_3:87,7]
latency 99.9th percentile : 315,7 [read_event_1:101,0, read_event_2:110,1, 
read_event_3:361,1]
latency max : 609,1 [read_event_1:319,6, read_event_2:315,9, read_event_3:609,1]
Total partitions : 993050 [read_event_1:164882, read_event_2:166381, 
read_event_3:661787]
Total errors : 0 [read_event_1:0, read_event_2:0, read_event_3:0]
total gc count : 189
total gc mb : 56464
total gc time (s) : 7
avg gc time(ms) : 37
stdev gc time(ms) : 8
Total operation time : 00:02:28{noformat}
 

 

debug off - no fix : 
{noformat}
Results:
op rate : 12655 [read_event_1:2141, read_event_2:2093, read_event_3:8422]
partition rate : 12655 [read_event_1:2141, read_event_2:2093, read_event_3:8422]
row rate : 12655 [read_event_1:2141, read_event_2:2093, read_event_3:8422]
latency mean : 10,1 [read_event_1:10,1, read_event_2:10,1, read_event_3:10,1]
latency median : 9,2 [read_event_1:9,2, read_event_2:9,2, read_event_3:9,3]
latency 95th percentile : 15,2 [read_event_1:15,8, read_event_2:15,9, 
read_event_3:15,7]
latency 99th percentile : 29,3 [read_event_1:44,5, read_event_2:45,1, 
read_event_3:41,3]
latency 99.9th percentile : 52,7 [read_event_1:67,9, read_event_2:66,9, 
read_event_3:67,1]
latency max : 268,0 [read_event_1:257,1, read_event_2:263,3, read_event_3:268,0]
Total partitions : 983056 [read_event_1:166311, read_event_2:162570, 
read_event_3:654175]
Total errors : 0 [read_event_1:0, read_event_2:0, read_event_3:0]
total gc count : 100
total gc mb : 31529
total gc time (s) : 4
avg gc time(ms) : 37
stdev gc time(ms) : 5
Total operation time : 00:01:17{noformat}
 

 

debug on - with fix : 
{noformat}
Results:
op rate : 12289 [read_event_1:2058, read_event_2:2051, read_event_3:8181]
partition rate : 12289 [read_event_1:2058, read_event_2:2051, read_event_3:8181]
row rate : 12289 [read_event_1:2058, read_event_2:2051, read_event_3:8181]
latency mean : 10,4 [read_event_1:10,4, read_event_2:10,4, read_event_3:10,4]
latency median : 9,4 [read_event_1:9,4, read_event_2:9,4, read_event_3:9,4]
latency 95th percentile : 16,3 [read_event_1:16,8, read_event_2:17,3, 
read_event_3:16,2]
latency 99th percentile : 36,6 [read_event_1:44,3, read_event_2:46,6, 
read_event_3:37,2]
latency 99.9th percentile : 62,2 [read_event_1:78,0, read_event_2:77,1, 
read_event_3:80,8]
latency max : 251,2 [read_event_1:246,9, read_event_2:249,9, read_event_3:251,2]
Total partitions : 100 [read_event_1:167422, read_event_2:166861, 
read_event_3:665717]
Total errors : 0 [read_event_1:0, read_event_2:0, read_event_3:0]
total gc count : 102
total gc mb : 31843
total gc time (s) : 4
avg gc time(ms) : 38
stdev gc time(ms) : 6
Total operation time : 00:01:21{noformat}
 

 

So we have similar performance with debug logging off and with the fix and 
debug on.
The difference in throughput is pretty massive as we roughly get *twice the 
read throughput* with the fix.

Latencies without the fix and with the fix : 

p95 : 35ms -> 16ms
p99 : 75ms -> 36ms

I've ran all tests several times, alternating with and without the fix to make 
sure caches were not making a difference, and results were consistent with 
what's pasted above.
It's been running on a single node using an i3.xlarge instance for Cassandra 
and another i3.large for running cassandra-stress.

 

*One pretty interesting thing to note* is that when I tested with the 
predefined mode of cassandra-stress, no paging occurred and the performance 
difference was not noticeable. This is due to the fact that the predefined mode 
generates COMPACT STORAGE tables, which involve a different read path 
(apparently). I think anyone performing benchmarks for Cassandra changes should 
be aware that the predefined mode isn't relevant and that a user defined test 
should be used (maybe we should create one that would be used as standard 
benchmark). 
Here's the one I used : [^cassandra-2.2-debug.yaml]

With the following commands for 

[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-19 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405580#comment-16405580
 ] 

Jeff Jirsa commented on CASSANDRA-14318:


If you havent yet started the second run of benchmarks, you may also want to 
move the debug log statements from ReadCallback.java (one deals with digest 
mismatch exceptions, forgot what the other is) to trace as well. 

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: lhf, performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-19 Thread Alexander Dejanovski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405051#comment-16405051
 ] 

Alexander Dejanovski commented on CASSANDRA-14318:
--

Fair point. I'll write the patch and run the benchmarks.

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: lhf, performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-19 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16404814#comment-16404814
 ] 

Jeremiah Jordan commented on CASSANDRA-14318:
-

No matter what is decided we should move those log messages to TRACE.  So I 
think we can proceed here no matter what.

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: lhf, performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-19 Thread Alexander Dejanovski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16404797#comment-16404797
 ] 

Alexander Dejanovski commented on CASSANDRA-14318:
--

[~pauloricardomg], thanks for assigning the ticket to me.

Whatever the implementation, I'll run new performance tests and generate flame 
graphs to verify the impact of the changes, no worries.

I'll wait a bit for a consensus to come out of the discussion on the dev ML 
before moving on, as there seem to be some conflicting views on what should be 
done.

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: lhf, performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-18 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16404134#comment-16404134
 ] 

Paulo Motta commented on CASSANDRA-14318:
-

Hey Alexander, would you care to re-run the experiments with the query pager 
DEBUG logs changed to TRACE level to check if debug.log logging overhead will 
still persist? Thanks!

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Major
>  Labels: lhf, performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-17 Thread Alexander Dejanovski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403694#comment-16403694
 ] 

Alexander Dejanovski commented on CASSANDRA-14318:
--

Fair enough, I'll go down the read and write path to see if there's debug 
logging in v3 and above, and switch debug to trace in the query pager in 2.2.

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Priority: Major
>  Labels: lhf, performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-17 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403671#comment-16403671
 ] 

Jeremiah Jordan commented on CASSANDRA-14318:
-

CASSANDRA-10241 creates that debug.log to be a “production” debug log. If there 
are things being logged at DEBUG which cause performance issues we should 
disable those or move them to TRACE, not turn off the debug.log.

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Priority: Major
>  Labels: lhf, performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-16 Thread Vinay Chella (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402526#comment-16402526
 ] 

Vinay Chella commented on CASSANDRA-14318:
--

+1 on your perf test results. Patch LGTM.

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Priority: Major
>  Labels: performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-16 Thread Alexander Dejanovski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401971#comment-16401971
 ] 

Alexander Dejanovski commented on CASSANDRA-14318:
--

Here's the [patch for 
2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...thelastpickle:disable-debug-logging-by-default]

It should be mergeable in 3.0/3.11/4.0 without a problem.

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Priority: Major
>  Labels: performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> Debug logging can involve in many cases (especially very low latency ones) a 
> very important overhead on the read path in 2.2 as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable on the client side metrics, 
> where p99 could go up to 10 times higher, while ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> Below shows latencies recorded on the client side with debug logging on 
> first, and then without it :
> !debuglogging.png!  
> We generated a flame graph before turning off debug logging that shows the 
> read call stack is dominated by debug logging : 
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> Once disabled, the new flame graph shows that the read call stack gets 
> extremely thin, which is further confirmed by client recorded metrics : 
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like 
> log.debug() calls are gone there, but for 2.2 users and to prevent such 
> issues to appear with default settings, I really think debug logging should 
> be disabled by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org