Re: Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread Kant Kodali
+1 That was a nice talk! I don't know why I haven't come across that video
before!

On Tue, Feb 27, 2018 at 9:12 AM, Jonathan Haddad  wrote:

> There isn't a ton from that talk I'd consider "wrong" at this point, but
> some of it is a little stale.  I always start off looking at system
> metrics.  For a very thorough discussion on the matter check out Brendan
> Gregg's USE [1] method.  I did a blog post on my own about the talk [2]
> that has screenshots and might be helpful.  Generally speaking, know your OS
> and the tools to examine each component.  Learn how to interpret the
> numbers you see.  There's more information than a human can process in a
> lifetime, but understanding some fundamentals of throughput vs. latency and
> error rates, and how to find each of those metrics for CPU / memory /
> network / disk, is a good start.
>
> More recently I did a talk at Data Day Texas and posted the slides on
> Slideshare [3].  The focus there was more on perf tuning and less on
> performance troubleshooting, but I guess it's a matter of perspective which
> point you're at.  The tools have changed a little (Prometheus instead of
> Graphite), and there are some new perf tuning tips: examining your read
> ahead and compression settings, generating flame graphs, using tools
> like YourKit and Java Flight Recorder, and the easiest win of all time,
> disabling dynamic snitch if your hardware is fast and you want sub-ms
> p99s.  Turn up counter cache if you use counters (it still gets hit on the
> write path), and row cache is way more effective than people give it credit
> for under the right workloads.
>
> I've got a blog post in the works on JVM tuning, but for now I reference
> CASSANDRA-8150 [4] and Blake Eggleston's blog post [5] from back in our
> days at a small startup.
>
> Lastly, I'm doing a performance tuning series on our blog at The Last
> Pickle, with the first being on Flame Graphs [6].  I've got about 6 posts
> in the pipeline, just need to find time to get to them.
>
> Hope this helps,
> Jon
>
> [1] http://www.brendangregg.com/usemethod.html
> [2] http://rustyrazorblade.com/post/2014/2014-09-18-diagnosing-production/
> [3] https://www.slideshare.net/JonHaddad/performance-tuning-86995333
> [4] https://issues.apache.org/jira/browse/CASSANDRA-8150
> [5] http://blakeeggleston.com/cassandra-tuning-the-jvm-for-read-heavy-workloads.html
> [6] http://thelastpickle.com/blog/2018/01/16/cassandra-flame-graphs.html
>
>
>
> On Tue, Feb 27, 2018 at 8:56 AM Michael Shuler 
> wrote:
>
>> On 02/27/2018 10:20 AM, Nicolas Guyomar wrote:
>> > Was Jon's blog post
>> > https://academy.datastax.com/planet-cassandra/blog/cassandra-summit-recap-diagnosing-problems-in-production
>> > relocated somewhere?
>>
>> https://web.archive.org/web/20160322011022/planetcassandra.org/blog/cassandra-summit-recap-diagnosing-problems-in-production
>>
>> --
>> Kind regards,
>> Michael
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>


Re: Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread Jonathan Haddad
There isn't a ton from that talk I'd consider "wrong" at this point, but
some of it is a little stale.  I always start off looking at system
metrics.  For a very thorough discussion on the matter check out Brendan
Gregg's USE [1] method.  I did a blog post on my own about the talk [2]
that has screenshots and might be helpful.  Generally speaking, know your OS
and the tools to examine each component.  Learn how to interpret the
numbers you see.  There's more information than a human can process in a
lifetime, but understanding some fundamentals of throughput vs. latency and
error rates, and how to find each of those metrics for CPU / memory /
network / disk, is a good start.
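
As a rough illustration of the USE approach mentioned above (Utilization, Saturation, Errors, checked for each resource), the checklist can be written down as a small table. This is only a sketch: the Linux commands are common choices assumed for the example, not something taken from the talk, and the names `USE_CHECKLIST` and `render_checklist` are made up here.

```python
# Illustrative USE checklist (Utilization, Saturation, Errors) per resource.
# The commands are common Linux tools picked as assumptions for this sketch;
# they are not taken from the talk or the slides.
USE_CHECKLIST = {
    "cpu": {
        "utilization": "mpstat -P ALL 1",
        "saturation": "vmstat 1 (run queue, column 'r')",
        "errors": "dmesg (machine check messages)",
    },
    "memory": {
        "utilization": "free -m",
        "saturation": "vmstat 1 (swap activity, columns 'si'/'so')",
        "errors": "dmesg (OOM killer messages)",
    },
    "network": {
        "utilization": "sar -n DEV 1",
        "saturation": "netstat -s (retransmits, listen overflows)",
        "errors": "ip -s link (RX/TX errors, drops)",
    },
    "disk": {
        "utilization": "iostat -xz 1 (%util)",
        "saturation": "iostat -xz 1 (queue size, await)",
        "errors": "smartctl -a <device>",
    },
}


def render_checklist(checklist):
    """Return one 'resource metric where-to-look' line per checklist cell."""
    lines = []
    for resource, metrics in checklist.items():
        for metric in ("utilization", "saturation", "errors"):
            lines.append(f"{resource:<8} {metric:<12} {metrics[metric]}")
    return lines


if __name__ == "__main__":
    print("\n".join(render_checklist(USE_CHECKLIST)))
```

Working through every cell for every resource, rather than jumping to the metric you suspect, is the point of the method; the specific tools matter less than covering the whole grid.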

More recently I did a talk at Data Day Texas and posted the slides on
Slideshare [3].  The focus there was more on perf tuning and less on
performance troubleshooting, but I guess it's a matter of perspective which
point you're at.  The tools have changed a little (Prometheus instead of
Graphite), and there are some new perf tuning tips: examining your read
ahead and compression settings, generating flame graphs, using tools
like YourKit and Java Flight Recorder, and the easiest win of all time,
disabling dynamic snitch if your hardware is fast and you want sub-ms
p99s.  Turn up counter cache if you use counters (it still gets hit on the
write path), and row cache is way more effective than people give it credit
for under the right workloads.
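
Judging whether a change such as disabling dynamic snitch moves the p99 requires measuring tail latency before and after. Below is a minimal sketch of a nearest-rank percentile over raw latency samples. The function name and the sample data are invented for illustration; in practice the numbers would come from whatever metrics system is in use.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up data: 98 fast requests and 2 slow ones, in milliseconds.
    samples = [0.4] * 98 + [5.0] * 2
    print("p50:", percentile(samples, 50))
    print("p99:", percentile(samples, 99))
```

With this made-up data the median stays at 0.4 ms while the p99 lands on one of the two slow requests, which is why averages and medians alone can hide the kind of tail latency the tuning tips above are aimed at.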

I've got a blog post in the works on JVM tuning, but for now I reference
CASSANDRA-8150 [4] and Blake Eggleston's blog post [5] from back in our
days at a small startup.

Lastly, I'm doing a performance tuning series on our blog at The Last
Pickle, with the first being on Flame Graphs [6].  I've got about 6 posts
in the pipeline, just need to find time to get to them.

Hope this helps,
Jon

[1] http://www.brendangregg.com/usemethod.html
[2] http://rustyrazorblade.com/post/2014/2014-09-18-diagnosing-production/
[3] https://www.slideshare.net/JonHaddad/performance-tuning-86995333
[4] https://issues.apache.org/jira/browse/CASSANDRA-8150
[5] http://blakeeggleston.com/cassandra-tuning-the-jvm-for-read-heavy-workloads.html
[6] http://thelastpickle.com/blog/2018/01/16/cassandra-flame-graphs.html



On Tue, Feb 27, 2018 at 8:56 AM Michael Shuler 
wrote:

> On 02/27/2018 10:20 AM, Nicolas Guyomar wrote:
> > Was Jon's blog post
> > https://academy.datastax.com/planet-cassandra/blog/cassandra-summit-recap-diagnosing-problems-in-production
> > relocated somewhere?
>
>
> https://web.archive.org/web/20160322011022/planetcassandra.org/blog/cassandra-summit-recap-diagnosing-problems-in-production
>
> --
> Kind regards,
> Michael
>
>
>


RE: Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread ZAIDI, ASAD A
Perhaps Mr. Haddad himself will share it again somewhere; he was kind enough to
share it once at DataStax!


From: Kenneth Brotman [mailto:kenbrot...@yahoo.com.INVALID]
Sent: Tuesday, February 27, 2018 10:39 AM
To: user@cassandra.apache.org
Subject: RE: Jon Haddad on Diagnosing Performance Problems in Production

Nicolas,

I think you had the link to the other version I was thinking of.  I couldn’t 
find it.  I think it might have gotten taken down; a lot of other stuff seems 
to be gone too.  Maybe it will be back.  Maybe they are just redoing stuff.  
Either way, it’s another sign of Mom and Dad drifting apart – I’m not sure 
who’s Mom and who’s Dad: DataStax or ASF.  Hopefully, for the sake of everyone 
in the family they will reconcile.

It’s gems like that presentation that will keep us vital.

Kenneth Brotman

From: Nicolas Guyomar [mailto:nicolas.guyo...@gmail.com]
Sent: Tuesday, February 27, 2018 8:21 AM
To: user@cassandra.apache.org
Subject: Re: Jon Haddad on Diagnosing Performance Problems in Production

Was Jon's blog post
https://academy.datastax.com/planet-cassandra/blog/cassandra-summit-recap-diagnosing-problems-in-production
relocated somewhere?

On 27 February 2018 at 16:34, Kenneth Brotman
<kenbrot...@yahoo.com.invalid> wrote:
One presentation that I hope can get updated is Jon Haddad’s very thorough 
presentation on Diagnosing Performance Problems in Production.  I’ve seen 
another version somewhere where I believe he says something like “This should 
help you fix 99% of the problems you see.”  Seems right.

I’m sure it will be well attended and well viewed for some time.  Here’s the 
version I found: https://www.youtube.com/watch?v=2JlUpgsEdN8

If Jon did a new version I’d probably stop and watch it three times right now.

If we started with that video inline on the Apache Cassandra web site in the 
troubleshooting section, that would help a lot of people because of the quality 
of the content and the density of the content.

Kenneth Brotman



Re: Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread Michael Shuler
On 02/27/2018 10:20 AM, Nicolas Guyomar wrote:
> Was Jon's blog post
> https://academy.datastax.com/planet-cassandra/blog/cassandra-summit-recap-diagnosing-problems-in-production
> relocated somewhere?

https://web.archive.org/web/20160322011022/planetcassandra.org/blog/cassandra-summit-recap-diagnosing-problems-in-production

-- 
Kind regards,
Michael




RE: Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread Kenneth Brotman
Nicolas,

I think you had the link to the other version I was thinking of.  I couldn’t 
find it.  I think it might have gotten taken down; a lot of other stuff seems 
to be gone too.  Maybe it will be back.  Maybe they are just redoing stuff.  
Either way, it’s another sign of Mom and Dad drifting apart – I’m not sure 
who’s Mom and who’s Dad: DataStax or ASF.  Hopefully, for the sake of everyone 
in the family they will reconcile.

It’s gems like that presentation that will keep us vital.

Kenneth Brotman

From: Nicolas Guyomar [mailto:nicolas.guyo...@gmail.com]
Sent: Tuesday, February 27, 2018 8:21 AM
To: user@cassandra.apache.org
Subject: Re: Jon Haddad on Diagnosing Performance Problems in Production

Was Jon's blog post
https://academy.datastax.com/planet-cassandra/blog/cassandra-summit-recap-diagnosing-problems-in-production
relocated somewhere?

On 27 February 2018 at 16:34, Kenneth Brotman <kenbrot...@yahoo.com.invalid> 
wrote:

One presentation that I hope can get updated is Jon Haddad’s very thorough 
presentation on Diagnosing Performance Problems in Production.  I’ve seen 
another version somewhere where I believe he says something like “This should 
help you fix 99% of the problems you see.”  Seems right.

I’m sure it will be well attended and well viewed for some time.  Here’s the 
version I found: https://www.youtube.com/watch?v=2JlUpgsEdN8

If Jon did a new version I’d probably stop and watch it three times right now.

If we started with that video inline on the Apache Cassandra web site in the 
troubleshooting section, that would help a lot of people because of the quality 
of the content and the density of the content.

Kenneth Brotman


Re: Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread Nicolas Guyomar
Was Jon's blog post
https://academy.datastax.com/planet-cassandra/blog/cassandra-summit-recap-diagnosing-problems-in-production
relocated somewhere?

On 27 February 2018 at 16:34, Kenneth Brotman 
wrote:

> One presentation that I hope can get updated is Jon Haddad’s very thorough
> presentation on Diagnosing Performance Problems in Production.  I’ve seen
> another version somewhere where I believe he says something like “This
> should help you fix 99% of the problems you see.”  Seems right.
>
>
>
> I’m sure it will be well attended and well viewed for some time.  Here’s
> the version I found: https://www.youtube.com/watch?v=2JlUpgsEdN8
>
>
>
> If Jon did a new version I’d probably stop and watch it three times right
> now.
>
>
>
> If we started with that video inline on the Apache Cassandra web site in
> the troubleshooting section, that would help a lot of people because of the
> quality of the content and the density of the content.
>
>
>
> Kenneth Brotman
>