Hi JC,

Sorry for a latency in reply.
I'm going to a 4-week vacation.
But I'll try to help you occasionally when there will be access to the Internet.



On 6/23/17 09:58, JC Beyler wrote:
Hi Serguei and all,

Thanks Serguei for your review. I believe I have not answered many questions but raised more but I hope this helps understand what I am trying to do.

I have inlined my answers to make it easier to answer everything (and to be honest, ask questions).

@Thomas: Thank you for your thorough answer as well, I'll be working on it and will answer when I get the fixes and comments in a handled state, most likely beginning of next week :)

On Fri, Jun 23, 2017 at 3:01 AM, serguei.spit...@oracle.com <serguei.spit...@oracle.com> wrote:
Hi JC,

> Some initial JVMTI-specific questions/comments.

> I was not able to apply your patch as the are many merge conflicts.
> Could you, please, re-base it on the latest JDK 10 sources?

This is weird, I did a pull right now and it worked. I hesitated I was using hg correctly so I just did:
bash ./get_source.sh
patch -p1 < hotspot.patch

Thanks.
The following seems to work:
 % cd hotspot; patch -p1 < ../hotspot.patch

Hopefully, I'll be able to build and play with it in order to understand better.

(Maybe the patch is not the right way of doing things, there might be a hg equivalent?).

Yes.
I normally do this:

cd <repo>/hotspot
hg import ../hotspot.patch

It returns the following errors:

applying ../hotspot.patch
patching file src/cpu/x86/vm/macroAssembler_x86.cpp
Hunk #1 succeeded at 5662 with fuzz 2 (offset 60 lines).
patching file src/share/vm/gc/g1/g1CollectedHeap.cpp
Hunk #1 FAILED at 74
Hunk #2 succeeded at 4316 with fuzz 1 (offset 33 lines).
Hunk #3 FAILED at 4506
2 out of 3 hunks FAILED -- saving rejects to file src/share/vm/gc/g1/g1CollectedHeap.cpp.rej
patching file src/share/vm/gc/g1/g1CollectedHeap.hpp
Hunk #1 FAILED at 302
1 out of 1 hunks FAILED -- saving rejects to file src/share/vm/gc/g1/g1CollectedHeap.hpp.rej
patching file src/share/vm/gc/g1/g1MarkSweep.cpp
Hunk #1 FAILED at 47
Hunk #2 FAILED at 249
2 out of 2 hunks FAILED -- saving rejects to file src/share/vm/gc/g1/g1MarkSweep.cpp.rej
patching file src/share/vm/gc/parallel/psMarkSweep.cpp
Hunk #1 FAILED at 50
Hunk #2 FAILED at 609
2 out of 2 hunks FAILED -- saving rejects to file src/share/vm/gc/parallel/psMarkSweep.cpp.rej
patching file src/share/vm/gc/parallel/psParallelCompact.cpp
Hunk #1 FAILED at 59
Hunk #2 FAILED at 2167
2 out of 2 hunks FAILED -- saving rejects to file src/share/vm/gc/parallel/psParallelCompact.cpp.rej
patching file src/share/vm/gc/shared/collectedHeap.cpp
Hunk #1 FAILED at 37
Hunk #2 FAILED at 294
Hunk #3 FAILED at 314
Hunk #4 FAILED at 335
4 out of 4 hunks FAILED -- saving rejects to file src/share/vm/gc/shared/collectedHeap.cpp.rej
patching file src/share/vm/gc/shared/collectedHeap.hpp
Hunk #1 succeeded at 146 with fuzz 2 (offset 3 lines).
patching file src/share/vm/gc/shared/collectedHeap.inline.hpp
Hunk #1 FAILED at 156
1 out of 1 hunks FAILED -- saving rejects to file src/share/vm/gc/shared/collectedHeap.inline.hpp.rej
patching file src/share/vm/gc/shared/genCollectedHeap.cpp
Hunk #1 FAILED at 48
Hunk #2 FAILED at 721
2 out of 2 hunks FAILED -- saving rejects to file src/share/vm/gc/shared/genCollectedHeap.cpp.rej
patching file src/share/vm/gc/shared/referenceProcessor.cpp
Hunk #1 FAILED at 34
Hunk #2 FAILED at 256
2 out of 2 hunks FAILED -- saving rejects to file src/share/vm/gc/shared/referenceProcessor.cpp.rej
patching file src/share/vm/gc/shared/threadLocalAllocBuffer.cpp
Hunk #1 FAILED at 28
Hunk #2 FAILED at 120
Hunk #3 FAILED at 182
Hunk #4 FAILED at 305
4 out of 4 hunks FAILED -- saving rejects to file src/share/vm/gc/shared/threadLocalAllocBuffer.cpp.rej
patching file src/share/vm/gc/shared/threadLocalAllocBuffer.hpp
Hunk #1 FAILED at 43
Hunk #2 FAILED at 51
Hunk #3 FAILED at 65
Hunk #4 FAILED at 114
Hunk #5 FAILED at 158
5 out of 5 hunks FAILED -- saving rejects to file src/share/vm/gc/shared/threadLocalAllocBuffer.hpp.rej
patching file src/share/vm/prims/jvmti.xml
Hunk #1 succeeded at 11719 with fuzz 1 (offset 190 lines).
patching file src/share/vm/prims/jvmtiEnv.cpp
Hunk #1 FAILED at 45
Hunk #2 FAILED at 54
Hunk #3 succeeded at 1932 with fuzz 2 (offset -17 lines).
2 out of 3 hunks FAILED -- saving rejects to file src/share/vm/prims/jvmtiEnv.cpp.rej
patching file src/share/vm/runtime/thread.hpp
Hunk #1 FAILED at 614
1 out of 1 hunks FAILED -- saving rejects to file src/share/vm/runtime/thread.hpp.rej
file src/share/vm/prims/jvmtiHeapTransition.hpp already exists
1 out of 1 hunks FAILED -- saving rejects to file src/share/vm/prims/jvmtiHeapTransition.hpp.rej
file src/share/vm/runtime/heapMonitoring.cpp already exists
1 out of 1 hunks FAILED -- saving rejects to file src/share/vm/runtime/heapMonitoring.cpp.rej
file src/share/vm/runtime/heapMonitoring.hpp already exists
1 out of 1 hunks FAILED -- saving rejects to file src/share/vm/runtime/heapMonitoring.hpp.rej
file test/serviceability/jvmti/HeapMonitor/MyPackage/Frame.java already exists
1 out of 1 hunks FAILED -- saving rejects to file test/serviceability/jvmti/HeapMonitor/MyPackage/Frame.java.rej
file test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorFrequentTest.java already exists
1 out of 1 hunks FAILED -- saving rejects to file test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorFrequentTest.java.rej
file test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorOnOffTest.java already exists
1 out of 1 hunks FAILED -- saving rejects to file test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorOnOffTest.java.rej
file test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorRecentTest.java already exists
1 out of 1 hunks FAILED -- saving rejects to file test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorRecentTest.java.rej
file test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorTest.java already exists
1 out of 1 hunks FAILED -- saving rejects to file test/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorTest.java.rej
file test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c already exists
1 out of 1 hunks FAILED -- saving rejects to file test/serviceability/jvmti/HeapMonitor/libHeapMonitor.c.rej
abort: patch failed to apply



And it seemed to work:
bash ./configure --with-boot-jdk=/path/to/jdk1.8.0-latest --with-debug-level=slowdebug
make all images JOBS=48

Any idea why it is working for me and not you? I am assuming I am doing something wrong but I don't know what!

It seems, we've found the reason (please, see above).


I think, it would make sense to introduce new optional capability, something like:
   can_sample_heap_allocations

I will work on adding that for the next webrev, Robbin had mentioned I should and I dropped the ball on that one.

I see a couple of more emails from you.
Will try to help when you have some problems or questions.

> Do you consider this API to be used by a single agent or it is a multi-agent feature?
> What if one agent calls the StartHeapSampling() but another calls one of the others.
> The API specs needs to clarify this.

> I'm not sure you had a chance to carefully think on what JVMTI phases are allowed for new functions.

- I am pretty sure I have NOT really thought about this as carefully as you have because this is the first time I am doing this and these questions already make me go "hmmm" :)

Expectable. :)


I have a tendency to work through issues and try to keep the design simple first. Your question seems to imply I can make this work for a single agent world and have a means to explicitly disable it for multi-agent for now.

I looked quickly in the prims folder for anything mentioning single agent vs multi agent but did not find anything. Does this get handled s this more a documentation thing and tough luck for the user.


seems to say the latter. That the documentation should say that this is not a multiple agent support (yet?) and there is nothing we can do. Is that right?

 

> http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/src/share/vm/runtime/heapMonitoring.cpp.html

> I do not see the variable HeapMonitoring::_enabled is checked in the functions:
>     HeapMonitoring::get_live_traces(),
>     HeapMonitoring::get_garbage_traces(),
>     HeapMonitoring::get_frequent_garbage_traces(),
>     HeapMonitoring::release_traces()

So it is not checked there because the API allows you to:
  - Start sampling (which flips the enabled variable)
  - Stop sampling (which flips back the enabled variable)
  - Get the sampled traces at that point

I think the API documentation is not clear for this right now, so I'll change it for the next webrev. Let me also know if you think this is a bad idea.

I agree, it is better at least to document this behavior.


I thought it would be useful because it would allow you to start/stop the sampling and then later on go back and see what was sampled at a later time.

I need to think more on this.
The whole style is different than that used in the JVMTI.
Normally, each agent does some steps like the following:

  - add the capabilities required by the agent (normally in the ONLOAD phase)
  - set the agent event callbacks
  - enable/disable event notification mode
  - relinquish the capabilities that are not needed anymore

In your approach there is no tight association with any agent yet.
Also, it seems, the heap monitoring initialization/finalization
are not separated from sampling starting/stopping.
The heap allocation events are processed by the VM itself (in GC or JVMTI).

It is not clear yet if it could be somehow adjusted to JVMTI style
and if it is really necessary.


A question about a multi-agent feature from above applies here as well.

I'm open to either supporting multi-agent if you think it is important or doing whatever can be done to, as a first step, make it a single agent support. Let me know how to do that.

I do not think it is really important to support multi-agents here.
However, it is important to make a choice sooner rather than later as it impacts the design.


Thanks,
Serguei

  > Thanks,
  > Serguei

Thank you!
Jc
 




On 6/21/17 13:45, JC Beyler wrote:
Hi all,

First off: Thanks again to Robbin and Thomas for their reviews :)

Next, I've uploaded a new webrev:

Here is an update:

- @Robbin, I forgot to say that yes I need to look at implementing this for the other architectures and testing it before it is all ready to go. Is it common to have it working on all possible combinations or is there a subset that I should be doing first and we can do the others later?
- I've tested slowdebug, built and ran the JTreg tests I wrote with slowdebug and fixed a few more issues
- I've refactored a bit of the code following Thomas' comments
   - I think I've handled all the comments from Thomas (I put comments inline below for the specifics)

The biggest addition to this webrev is that there is more testing, I've added two tests for looking specifically at the two garbage sampling data Recent vs Frequent:
   and

   - Side question: I was looking at trying to make this a multi-file library so you would not have all the code in one spot. It seems this is not really done?
      - Is there a solution to this? What I really wanted was:
        - The core library that will help testing be easier
        - The JNI code for each Java class separate and calling into the core library

- Following Thomas' comments on statistics, I want to add some quality assurance tests and find that the easiest way would be to have a few counters of what is happening in the sampler and expose that to the user.
   - I'll be adding that in the next version if no one sees any objections to that.
   - This will allow me to add a sanity test in JTreg about number of samples and average of sampling rate

@Thomas: I had a few questions that I inlined below but I will summarize the "bigger ones" here:
   - You mentioned constants are not using the right conventions, I looked around and didn't see any convention except normal naming then for static constants. Is that right?
   - You mentioned putting the weak_oops_do in a separate method and logging, I inlined my answer below and would appreciate your feedback there too.

Thanks again for your reviews and patience!
Jc

PS: I've also inlined my answers to Thomas below:

On Tue, Jun 13, 2017 at 8:03 AM, Thomas Schatzl <thomas.scha...@oracle.com> wrote:
Hi all,

On Mon, 2017-06-12 at 11:11 -0700, JC Beyler wrote:
> Dear all,
>
> I've continued working on this and have done the following webrev:
> http://cr.openjdk.java.net/~rasbold/8171119/webrev.05/

[...]
> Things I still need to do:
>    - Have to fix that TLAB case for the FastTLABRefill
>    - Have to start looking at the data to see that it is consistent
> and does gather the right samples, right frequency, etc.
>    - Have to check the GC elements and what that produces
>    - Run a slowdebug run and ensure I fixed all those issues you saw
> Robbin
>
> Thanks for looking at the webrev and have a great week!

  scratching a bit on the surface of this change, so apologies for
rather shallow comments:

- macroAssembler_x86.cpp:5604: while this is compiler code, and I am
not sure this is final, please avoid littering the code with TODO
remarks :) They tend to be candidates for later wtf moments only.

Just file a CR for that.


Newcomer question: what is a CR and not sure I have the rights to do that yet ? :)
 
- calling HeapMonitoring::do_weak_oops() (which should probably be
called weak_oops_do() like other similar methods) only if string
deduplication is enabled (in g1CollectedHeap.cpp:4511) seems wrong.

The call should be at least around 6 lines up outside the if.

Preferentially in a method like process_weak_jni_handles(), including
additional logging. (No new (G1) gc phase without minimal logging :)).

Done but really not sure because:

I put for logging:
  log_develop_trace(gc, freelist)("G1ConcRegionFreeing [other] : heap monitoring");

Since weak_jni_handles didn't have logging for me to be inspired from, I did that but unconvinced this is what should be done.
 

Seeing the changes in referenceProcess.cpp, you need to add the call to
HeapMonitoring::do_weak_oops() exactly similar to
process_weak_jni_handles() in case there is no reference processing
done.

- psParallelCompact.cpp:2172 similar to other places where the change
adds the call to HeapMonitoring::do_weak_oops(), remove the empty line.

Done from what I can tell in the whole webrev. (The only empty lines I still see are when I maintain an empty line, exception being http://cr.openjdk.java.net/~rasbold/8171119/webrev.06/src/share/vm/gc/shared/collectedHeap.cpp.patch where I think it was "nicer")
 

- the change doubles the size of
CollectedHeap::allocate_from_tlab_slow() above the "small and nice"
threshold. Maybe it could be refactored a bit.

Done I think, it looks better to me :).
 

- referenceProcessor.cpp:40: please make sure that the includes are
sorted alphabetically (I did not check the other files yet).

Done and checked all other files normally.
 

- referenceProcessor.cpp:261: the change should add logging about the
number of references encountered, maybe after the corresponding "JNI
weak reference count" log message.

Just to double check, are you saying that you'd like to have the heap sampler to keep in store how many sampled objects were encountered in the HeapMonitoring::weak_oops_do?
   - Would a return of the method with the number of handled references and logging that work?
   - Additionally, would you prefer it in a separate block with its GCTraceTime?


- threadLocalAllocBuffer.cpp:331: one more "TODO"

Removed it and added it to my personal todos to look at.
 

- threadLocalAllocBuffer.hpp: ThreadLocalAllocBuffer class
documentation should be updated about the sampling additions. I would
have no clue what the difference between "actual_end" and "end" would
be from the given information.

If you are talking about the comments in this file, I made them more clear I hope in the new webrev. If it was somewhere else, let me know where to change.
 

- threadLocalAllocBuffer.hpp:130: extra whitespace ;)

Fixed :)
 

- some copyrights :)


I think I fixed all the issues, if you see specific issues, let me know.
 
- in heapMonitoring.hpp: there are some random comments about some code
that has been grabbed from "util/math/fastmath.[h|cc]". I can't tell
whether this is code that can be used but I assume that Noam Shazeer is
okay with that (i.e. that's all Google code).

Jeremy and I double checked and we can release that as I thought. I removed the comment from that piece of code entirely.
 

- heapMonitoring.hpp/cpp static constant naming does not correspond to
Hotspot's. Additionally, in Hotspot static methods are cased like other
methods.

I think I fixed the methods to be cased the same way as all other methods. For static constants, I was not sure. I fixed a few other variables but I could not seem to really see a consistent trend for constants. I made them as variables but I'm not sure now.

 

- in heapMonitoring.cpp there are a few cryptic comments at the top
that seem to refer to internal stuff that should probably be removed.

Sorry about that! My personal todos not cleared out.
 

I did not think through the impact of the TLAB changes on collector
behavior yet (if there are). Also I did not check for problems with
concurrent mark and SATB/G1 (if there are).

I would love to know your thoughts on this, I think this is fine. I see issues with multiple threads right now hitting the stack storage instance. Previous webrevs had a mutex lock here but we took it out for simplificity (and only for now).
 

Thanks,
  Thomas





Reply via email to