Hi Jc,
Thank you for the update!
It looks good.
It is great that testing on your side is Okay.
I'll submit a mach5 job soon (today or tomorrow).
Thanks,
Serguei
On 11/6/18 20:03, JC Beyler wrote:
Hi Serguei,
You are right, I should have reverted the
memAllocator.cpp file, sorry about that.
Here is the new webrev:
I think we are good by testing standards, like I
said HeapMonitorThreadTest.java tests multiple threads. I
did test an example with a thousand threads and I get the
samples from 1000 threads so it seems to work there too.
Per thread is tested via the new
HeapMonitorThreadDisabledTest.java so I think we are good
there too.
I would recommend a mach-5 testing just in case for
this one if you can, it will be better to have that
reinsurance.
Thanks for your help,
Jc
Hi Jc,
Not sure, I understand a motivation for this change:
- if (JvmtiExport::should_post_sampled_object_alloc()) {
+ {
Also, I'm not sure this is still needed:
+#include "prims/jvmtiEventController.inline.hpp"
+#include "prims/jvmtiThreadState.inline.hpp"
I expected you'd just revert all the changes in the
memAllocator.cpp.
Also, it is up to you to make a decision if these
performance-related fix is needed or not.
But it needs to be well tested so that both global+thread
event management works correctly.
Thanks,
Serguei
On 11/6/18
9:42 AM, JC Beyler wrote:
Hi Serguei,
Yes exactly it was an optimization. When
using a 512k sampling rate, I don't see a no
real difference (the overhead is anyway low for
that sampling rate), I imagine there would be a
difference if trying to sample every allocation
or with a low sampling interval. But because you
are right and it is an optimization of the
system and not a functional need, I've reverted
it and now the webrev is updated here:
Let me know what you think,
Jc
Hi
Jc,
Okay, I see your point: the change in
memAllocator.cpp is for performance.
Do you have any measurements showing a performance
difference?
Also, do you need me to submit a mach5 test run?
Thanks,
Serguei
On 11/5/18 15:14, JC Beyler wrote:
Hi Serguei,
First off, thanks as always for looking at
this :-) Let me inline my answers:
I actually "struggled" with this part of
the change. My change is correct
semantically and if you care about
performance for when sampling a given
thread.
Your change will work semantically but
the performance is the same as the global
sampling.
What I mean by working semantically is
that that the tests and the code will work.
However, this means that all threads will be
doing the sampling work but when the code
will post the event here:
(which is why your suggestion works, the
change in jvmtiExport basically ensures only
the threads requested are posting events)
The code will check that we actually only
post for threads we care about. The code
above ensures that only threads that were
requested to be sampling are the ones that
are sampling internally.
Note: I REALLY prefer your suggestion for
two reasons:
- We do not change the runtime/GC code
at all, it remains "simple"
- The overhead in the general case goes
away and this is a NOP for my actual
use-case from a performance point of view
(sampling every thread)
But:
- Then sampling per thread really is
just telling the system don't pollute the
callbacks, though internally you are doing
all the work anyway.
Let me know which you prefer :)
Also, do you see that enabling the sampling events globally still works?
Yes, otherwise HeapMonitorThreadTest.java
would fail since it checks that.
Done and done, I added a fprintf on
stderr saying the GetThreadInfo failed and
the test is ignoring the add count.
Thanks again for looking and let me know
what you think,
Jc
Hi
Jc,
It looks good in general but I have some
comments below.
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/src/hotspot/share/gc/shared/memAllocator.cpp.udiff.html
+static bool thread_enabled_for_one_jvmti_env() {
+ JavaThread *thread = JavaThread::current();
+ JvmtiThreadState *state = thread->jvmti_thread_state();
+ if (state == NULL) {
+ return false;
+ }
+
+ JvmtiEnvThreadStateIterator it(state);
+ for (JvmtiEnvThreadState* ets = it.first(); ets != NULL; ets = it.next(ets)) {
+ if (ets->is_enabled(JVMTI_EVENT_SAMPLED_OBJECT_ALLOC)) {
+ return true;
+ }
+ }
+
+ return false;
+}
+
void MemAllocator::Allocation::notify_allocation_jvmti_sampler() {
// support for JVMTI VMObjectAlloc event (no-op if not enabled)
JvmtiExport::vm_object_alloc_event_collector(obj());
if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
+ // Sampling is enabled for at least one thread, is it this one?
+ if (!thread_enabled_for_one_jvmti_env()) {
+ return;
+ }
+
I don't think you need this change as this condition already does it:
if (!JvmtiExport::should_post_sampled_object_alloc()) {
// Sampling disabled
return;
}
Please, look at the following line in the jvmtiEventController.cpp:
JvmtiExport::set_should_post_sampled_object_alloc((any_env_thread_enabled & SAMPLED_OBJECT_ALLOC_BIT) != 0);
I hope, testing will prove my suggestion is correct.
Also, do you see that enabling the sampling events globally still works?
http://cr.openjdk.java.net/%7Ejcbeyler/8201655/webrev.02/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/libHeapMonitorTest.c.frames.html
A couple of places where err is declared as int instead of jvmtiError:
714 int err;
742 int err;
Should not be silent in a case of JVMTI error:
744 err = (*jvmti)->GetThreadInfo(jvmti, thread, &info);
745 if (err != JVMTI_ERROR_NONE) {
746 return;
Thanks,
Serguei
On 10/26/18 10:48, JC Beyler wrote:
Hi all,
When working on the heap
sampling, I had promised to do the
per thread event so here it is!
Could I get a review
for this:
I was thinking of adding GC-dev
for the memAllocator change once I
get favorable reviews for the rest
of the change.
I've done a bit of performance
testing and on the Dacapo
benchmark I see no change in
performance when turned off
(logical, any code change is
behind a flag check already in
place) and when turned on it is
comparable to the current
performance.
(More information is: I see a
very slight degradation if we are
doing 512k sampling but no
degradation at 2MB).
--
--
--
|