Re: Low-Overhead Heap Profiling

JC Beyler Wed, 01 Nov 2017 10:47:25 -0700

Dear all,

Here is the next webrev:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.14a/


Incremental since the rebase:
http://cr.openjdk.java.net/~rasbold/8171119/webrev.14_14a/

(I'm still not too familiar with hg so I had to do a fresh rebase so v14 is
once the rebase was done and v14a integrates the changes requested from
Thomas).

I have also inlined answers to Thomas' email:

>
> A few minor issues:
>
>   - weak reference handling has been factored out in JDK-8189359, now
> you only need to add the additions required for this change to one
> place. :) Please update the webrev :)
>
>
That is awesome and very happily done :)



>   - the one issue Robin noticed.
>
>   - in the declaration of CollectedHeap::sample_allocation, it would be
> nice if the fix_sample_rate parameter would be described - it takes a
> time to figure out what it's used for. I.e. in case an allocation goes
> beyond the sampling watermark, this value which represents the amount
> of overallocation is used to adjust the next sampling watermark to
> sample at the correct rate.
> Something like this - and if what I wrote is incorrect, there is even
> more reason to document it.
> Or maybe just renaming "fix_sample_rate" to something more descriptive
> - but I have no good idea about that.
> With lack of units in the type, it would also be nice to have the unit
> in the identifier name, as done elsewhere.
>

Done for Robbin's issue and changed it to

>
>   - some (or most actually) of the new setters and getters in the
> ThreadLocalAllocBuffer class could be private I think. Also, we
> typically do not use "simple" getters that just return a member in the
> class where they are defined.
>

I removed all that were not used that I could see (not used outside the
class) moved the ones that are not simple to private if they could be. I
think it's a bit weird because now some of the setting of the tlab internal
data is using methods, others are directly setting. Let me know what you
think.


>
>   - ThreadLocalAllocBuffer::set_sample_end(): please use
> pointer_delta() for pointer subtractions.
>

Done!


>
>   - ThreadLocalAllocBuffer::pick_next_sample() - I recommend making the
> first check an assert - it seems that it is only useful to call this
> with heap monitoring enabled, as is done right now.
>

Longer conversation below about this, It cannot be an assert (I could
remove the test altogether though).


>
>   - ThreadLocalAllocBuffer::pick_next_sample() - please use
> "PTR_FORMAT" (or INTPTR_FORMAT - they are the same) as format string
> for printing pointer values as is customary within Hotspot. %p output
> is OS dependent. I.e. I heard that e.g. on Ubuntu it prints "null"
> instead of 0x0...0 .... which is kind of annoying.
>

Done :)


>
>   - personal preference: do not allocate
> HeapMonitoring::AlwaysTrueClosure globally, but only locally when it's
> used. Setting it up seems to be very cheap.
>

Done!


>
>   - HeapMonitoring::next_random() - the different names for the
> constants use different formatting. Preferable (to me) is
> UpperCamelCase, but at least make them uniform.
>
>
I think done the way you wanted!


>   - in HeapMonitoring::next_random(), you might want to use
> right_n_bits() to create your mask.


Done!


>
>
  - not really convinced that it is a good idea to not somehow guard
> StartHeapSampling() and StopHeapSampling() against being called by
> multiple threads.
>
>
I added another mutex for the start/stop so that way it will protect from
that.



> Otherwise looks okay from what I can see.
>

Awesome, what do you think I still need for this before going to the next
step (which is what by the way? :)).


---------------

Ok now the longer conversation about this remark:


>   - ThreadLocalAllocBuffer::pick_next_sample() - I recommend making the
> first check an assert - it seems that it is only useful to call this
> with heap monitoring enabled, as is done right now.
>


You can't do this right now because this is how the mutexes work right now
to ensure we allow things to work fast but safely:

I) Background before I go into details

1) You can turn on/off whenever you like; which flips the switch of the
HeapMonitoring::enabled() method.
2) When you hit the end of a tlab, you go to the slow path, check
HeapMonitoring::enabled()
    3) Later in the handling of the sample if enabled() returned true, you
get to the point of choosing a new sampling rate

Now imagine we have two threads A & B and imagine that enabled is currently
returning true. Here is a series of events:

i) Thread A hits the end of tlab, checks enabled at entrance, gets a stack
ii) Meanwhile, Thread B turned off HeapMonitoring
iii) Thread A now goes to pick a new sample rate and is going to check
HeapMonitoring::enabled

If put as an assert, clearly (iii) will fail, we don't want this.

II) So it this really an issue? Is there something really broken?

Not really, I can go into a bigger explanation as to why but really it
boils down to:

- Tlab asks the HeapMonitoring system if it should try to sample and thus
move its end pointer a bit for the next sample
  - If it "missed" a disable event, it will try to sample, find out
monitoring is disabled and just not pick a sample.
  - If it did the sample and sends it to HeapMonitoring, HeapMonitoring
just ignores it and TLab goes back to do its thing

Because of this producer/consumer relationship between Tlab and
HeapMonitoring, there is no real need to ensure that we are not in a
specific part of the code. The thread sensitive data that could get
corrupted is in HeapMonitoring and guarded by a specific lock, the rest of
it will just result in a bit of extra useless work but won't provoke an
race issue. I put a few extra checks for the HeapMonitoring::enabled
because they allow a quick exit of sampling but to be honest, we could
check at the entrance of the code path and if ever it got disabled then we
are really only sampling one extra object that will get discarded.

Thanks for looking as always, welcome back and let me know if there are any
other concerns/questions/criticisms,
Jc

Re: Low-Overhead Heap Profiling

Reply via email to