[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-22 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9263 )

Change subject: Make tcmalloc heap sampling more useful
..

Make tcmalloc heap sampling more useful

Previously our /pprof/heap path required that the process be started
with lifetime heap profiling enabled. This mode of operation is very
slow so we would never be able to use it in production. Even in tests
it's of limited utility.

The alternative is tcmalloc heap _sampling_ which only records periodic
allocations. According to recent mailing list threads, Google runs most
of their binaries with this enabled in production, so it's designed to
be very low overhead and potentially useful for us in the field.

This doesn't go so far as to enable it by default, but it does adjust
the /pprof/heap endpoint to dump the sampled allocations instead of
starting a profile. Sampling can be enabled using either the tcmalloc
env variable (TCMALLOC_SAMPLE_PARAMETER) or a new command line flag
--heap-sample-every-n-bytes.

I tested this manually by running a kudu tserver with loadgen and
sampling set to 500kb. I then used pprof to grab the heap sample
information which looked like I expected it to (most of the memory from
MRS)

Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Reviewed-on: http://gerrit.cloudera.org:8080/9263
Tested-by: Kudu Jenkins
Reviewed-by: Will Berkeley 
---
M docs/troubleshooting.adoc
M src/kudu/server/pprof_path_handlers.cc
M src/kudu/util/flags.cc
3 files changed, 95 insertions(+), 26 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Will Berkeley: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-22 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9263 )

Change subject: Make tcmalloc heap sampling more useful
..


Patch Set 3:

Errant ??- what I mean is I checked it builds on macOS


--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 22 Feb 2018 21:00:33 +
Gerrit-HasComments: No


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-22 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9263 )

Change subject: Make tcmalloc heap sampling more useful
..


Patch Set 3: Code-Review+2

Builds on macOS ✔️


--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Thu, 22 Feb 2018 20:59:46 +
Gerrit-HasComments: No


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-13 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9263

to look at the new patch set (#3).

Change subject: Make tcmalloc heap sampling more useful
..

Make tcmalloc heap sampling more useful

Previously our /pprof/heap path required that the process be started
with lifetime heap profiling enabled. This mode of operation is very
slow so we would never be able to use it in production. Even in tests
it's of limited utility.

The alternative is tcmalloc heap _sampling_ which only records periodic
allocations. According to recent mailing list threads, Google runs most
of their binaries with this enabled in production, so it's designed to
be very low overhead and potentially useful for us in the field.

This doesn't go so far as to enable it by default, but it does adjust
the /pprof/heap endpoint to dump the sampled allocations instead of
starting a profile. Sampling can be enabled using either the tcmalloc
env variable (TCMALLOC_SAMPLE_PARAMETER) or a new command line flag
--heap-sample-every-n-bytes.

I tested this manually by running a kudu tserver with loadgen and
sampling set to 500kb. I then used pprof to grab the heap sample
information which looked like I expected it to (most of the memory from
MRS)

Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
---
M docs/troubleshooting.adoc
M src/kudu/server/pprof_path_handlers.cc
M src/kudu/util/flags.cc
3 files changed, 95 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/9263/3
--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9263 )

Change subject: Make tcmalloc heap sampling more useful
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/server/pprof_path_handlers.cc
File src/kudu/server/pprof_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/server/pprof_path_handlers.cc@104
PS2, Line 104: message with instructions how to enable it
> The message (tcmalloc.cc L592) just mentions the TCMALLOC_SAMPLE_PARAMETER
Done


http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/util/flags.cc
File src/kudu/util/flags.cc:

http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/util/flags.cc@88
PS2, Line 88: heap_sample_parameter
> I know this name matches the one that tcmalloc uses, but I'd like a more de
Done


http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/util/flags.cc@89
PS2, Line 89:  N
> What's N? I think this would also be fixed with a better name and adjusted
Done



--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 13 Feb 2018 20:02:46 +
Gerrit-HasComments: Yes


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-13 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9263 )

Change subject: Make tcmalloc heap sampling more useful
..


Patch Set 2:

> You're telling me you can't symbolize addresses in your head?!

:(

 > In that case you can use the pprof tool as follows:
 >
 > ./thirdparty/installed/uninstrumented/bin/pprof  build/latest/bin/kudu-master
 > http://localhost:8051/pprof/heap

Ah, nice.

 > I believe there is also a newer version of pprof implemented in go,

That's right. Go has a nice package https://golang.org/pkg/net/http/pprof/ that 
basically gives a lot of this profiling kind of stuff exposed over an http 
interface for free just by importing it, plus a pprof tool built into the go 
toolchain for processing the result.


--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 13 Feb 2018 18:40:50 +
Gerrit-HasComments: No


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-13 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9263 )

Change subject: Make tcmalloc heap sampling more useful
..


Patch Set 2:

> Is that because it's macOS? If not, how does one read that?

You're telling me you can't symbolize addresses in your head?!

In that case you can use the pprof tool as follows:

./thirdparty/installed/uninstrumented/bin/pprof  build/latest/bin/kudu-master 
http://localhost:8051/pprof/heap

(or point it at a path to an already-downloaded heap profile instead of using 
HTTP if you collect one from some production server without the tool installed).

I believe there is also a newer version of pprof implemented in go, but I 
haven't tried that one. I think they moved towards protobuf-based dumps instead 
of text in recent years.


--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 13 Feb 2018 18:24:14 +
Gerrit-HasComments: No


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-13 Thread Will Berkeley (Code Review)
Will Berkeley has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9263 )

Change subject: Make tcmalloc heap sampling more useful
..


Patch Set 2:

(3 comments)

I tried this out on my macOS system and the heap samples looked like this:

heap profile: 93: 57580776 [93: 57580776] @ heap_v2/512000
14: 13993984 [14: 13993984] @ 0x10b9dce13 0x7fff57cc9201 0x7fff57cc850b 
0x10d9386f9 0x10d938603 0x10d938c0f 0x10d939579 0x10d932137 0x10d92fbcb 
0x10d93096c 0x10a19f012 0x10a1a09d4 0x10a1a08e4 0x10a1a0810 0x10a18b7c8 
0x10a18a737 0x10a0498ee 0x10a05078b 0x10a04f755 0x10a110ff7 0x10a102d1f 
0x10a1050a9 0x10a104ffc 0x10a104faa 0x10a5f08de 0x10da06ff9 0x10da0206f 
0x10da1da69 0x10da1d9d0 0x10da1d97a 0x10da1d75d
 5:  2621440 [ 5:  2621440] @ 0x10b9dce13 0x7fff57cc9201 0x7fff57cc850b 
0x10d9386f9 0x10d938603 0x10d932137 0x10d932aeb 0x10d93377c 0x1099537d2 
0x10a113dfd 0x10a113920 0x10a046164 0x10a10ff46 0x10a100115 0x10a0fef97 
0x10a1050a9 0x10a104ffc 0x10a104faa 0x10a5f08de 0x10da06ff9 0x10da0206f 
0x10da1da69 0x10da1d9d0 0x10da1d97a 0x10da1d75d 0x10b33e198 0x10d9ecb88 
0x7fff57dac6c1 0x7fff57dac56d 0x7fff57dabc5d
 2:   509952 [ 2:   509952] @ 0x10b9dccf1 0x7fff57cc9201 0x7fff57cc850b 
0x10d9386f9 0x10d938603 0x10d938c0f 0x10d939579 0x10d932137 0x10d92fbcb 
0x10d93096c 0x10a19f012 0x10a19edc4 0x10a1a28a9 0x10a1a14a1 0x10a1a0e13 
0x10a18b8bf 0x10a18a80a 0x10a0498ee 0x10a05078b 0x10a04f755 0x10a110ff7 
0x10a102d1f 0x10a1050a9 0x10a104ffc 0x10a104faa 0x10a5f08de 0x10da06ff9 
0x10da0206f 0x10da1da69 0x10da1d9d0 0x10da1d97a
 3:  192 [ 3:  192] @ 0x10b9dccf1 0x7fff57cc9201 0x7fff57cc850b 
0x7fff55ba8628 0x10a1bd6ce 0x10a049964 0x10a05078b 0x10a04f755 0x10a110ff7 
0x10a102d1f 0x10a1050a9 0x10a104ffc 0x10a104faa 0x10a5f08de 0x10da06ff9 
0x10da0206f 0x10da1da69 0x10da1d9d0 0x10da1d97a 0x10da1d75d 0x10b33e198 
0x10d9ecb88 0x7fff57dac6c1 0x7fff57dac56d 0x7fff57dabc5d
12: 12464128 [12: 12464128] @ 0x10b9dce13 0x7fff57cc9201 0x7fff57cc850b 
0x10d9386f9 0x10d938603 0x10d938c0f 0x10d939579 0x10d932137 0x10d92fbcb 
0x10d93096c 0x10a19f012 0x10a19edc4 0x10a1a28a9 0x10a1a14a1 0x10a1a0e13 
0x10a18b8bf 0x10a18a80a 0x10a0498ee 0x10a05078b 0x10a04f755 0x10a110ff7 
0x10a102d1f 0x10a1050a9 0x10a104ffc 0x10a104faa 0x10a5f08de 0x10da06ff9 
0x10da0206f 0x10da1da69 0x10da1d9d0 0x10da1d97a

Is that because it's macOS? If not, how does one read that?

http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/server/pprof_path_handlers.cc
File src/kudu/server/pprof_path_handlers.cc:

http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/server/pprof_path_handlers.cc@104
PS2, Line 104: message with instructions how to enable it
The message (tcmalloc.cc L592) just mentions the TCMALLOC_SAMPLE_PARAMETER 
environment variable and not the gflag.


http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/util/flags.cc
File src/kudu/util/flags.cc:

http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/util/flags.cc@88
PS2, Line 88: heap_sample_parameter
I know this name matches the one that tcmalloc uses, but I'd like a more 
descriptive name. Maybe something like "heap_sample_every_n_bytes"? That sort 
of name has precedents in our logging macros.


http://gerrit.cloudera.org:8080/#/c/9263/2/src/kudu/util/flags.cc@89
PS2, Line 89:  N
What's N? I think this would also be fixed with a better name and adjusted 
description.



--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 
Gerrit-Comment-Date: Tue, 13 Feb 2018 18:16:12 +
Gerrit-HasComments: Yes


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-09 Thread Todd Lipcon (Code Review)
Hello Will Berkeley, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9263

to look at the new patch set (#2).

Change subject: Make tcmalloc heap sampling more useful
..

Make tcmalloc heap sampling more useful

Previously our /pprof/heap path required that the process be started
with lifetime heap profiling enabled. This mode of operation is very
slow so we would never be able to use it in production. Even in tests
it's of limited utility.

The alternative is tcmalloc heap _sampling_ which only records periodic
allocations. According to recent mailing list threads, Google runs most
of their binaries with this enabled in production, so it's designed to
be very low overhead and potentially useful for us in the field.

This doesn't go so far as to enable it by default, but it does adjust
the /pprof/heap endpoint to dump the sampled allocations instead of
starting a profile. Sampling can be enabled using either the tcmalloc
env variable (TCMALLOC_SAMPLE_PARAMETER) or a new command line flag
--heap_sample_parameter.

I tested this manually by running a kudu tserver with loadgen and
sampling set to 500kb. I then used pprof to grab the heap sample
information which looked like I expected it to (most of the memory from
MRS)

Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
---
M src/kudu/server/pprof_path_handlers.cc
M src/kudu/util/flags.cc
2 files changed, 42 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/9263/2
--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley 


[kudu-CR] Make tcmalloc heap sampling more useful

2018-02-08 Thread Todd Lipcon (Code Review)
Hello Will Berkeley,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/9263

to review the following change.


Change subject: Make tcmalloc heap sampling more useful
..

Make tcmalloc heap sampling more useful

Previously our /pprof/heap path required that the process be started
with lifetime heap profiling enabled. This mode of operation is very
slow so we would never be able to use it in production. Even in tests
it's of limited utility.

The alternative is tcmalloc heap _sampling_ which only records periodic
allocations. According to recent mailing list threads, Google runs most
of their binaries with this enabled in production, so it's designed to
be very low overhead and potentially useful for us in the field.

This doesn't go so far as to enable it by default, but it does adjust
the /pprof/heap endpoint to dump the sampled allocations instead of
starting a profile. Sampling can be enabled using either the tcmalloc
env variable (TCMALLOC_SAMPLE_PARAMETER) or a new command line flag
--heap_sample_parameter.

I tested this manually by running a kudu tserver with loadgen and
sampling set to 500kb. I then used pprof to grab the heap sample
information which looked like I expected it to (most of the memory from
MRS)

Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
---
M src/kudu/server/pprof_path_handlers.cc
M src/kudu/util/flags.cc
2 files changed, 41 insertions(+), 23 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/63/9263/1
--
To view, visit http://gerrit.cloudera.org:8080/9263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I939c2f01f17ceb4b9520bb566f66463952b2a255
Gerrit-Change-Number: 9263
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Will Berkeley