Hi Thomas,
I feel that this is outside of the scope of this event and change. It is
probably a good thing to track to understand if an application wastes
more memory than wanted.
Perhaps open an RFE for a separate event and documentation?
Similarly the TLAB/PLAB can waste memory when we get a new TLAB/PLAB for
an object that didn't fit in the previous LAB. This information might
make sense to include as an update to PLAB and TLAB event later.
Cheers,
Staffan
On 09/15/2014 09:13 AM, Thomas Schatzl wrote:
Hi,
one more allocation that is more serious that this change misses I
think:
When a PLAB does not fit the current allocation region, G1 releases that
one (filling it with a dummy allocation).
At the moment this might be up to half a region minus one word per
allocated region. We know of applications where this is a problem,
wasting a few regions per GC.
What I merely suggest is to document somewhere that this can happen, is
okay to happen, and possibly most importantly for end-users, that the
sum of the PLAB allocation events (whether actual PLAB allocations or
direct allocations) is not a reliable measure for getting total
allocation information during GC in G1 (if this is actually one use of
this data, which I am not sure).
Maybe some additional event for PLABs and TLABs measuring fragmentation
could be introduced.
I expect this problem to increase if there are multiple allocation
regions during GC.
Thanks,
Thomas
On Mon, 2014-09-15 at 14:57 +0200, Thomas Schatzl wrote:
Hi,
On Fri, 2014-09-05 at 15:20 -0700, Staffan Friberg wrote:
Hi,
I have uploaded a new webrev here,
cr.openjdk.java.net/~sfriberg/8055845/webrev.03
It contains several changes
- Split event into two events (PromoteObjectInNewPLAB,
PromoteObjectOutsidePLAB)
- Moved events to "vm/gc/detailed/PromoteObject..."
- Supporting ParNew+CMS and ParNew+SerialOld tenuring
- Not sure if the way I do it with passing the ParNewTracer
is the best solution, please let me know if you have an idea how to
improve it
- Simplified the G1 code to avoid sending age and having a single
call site
- Fixed so that the generated event has size information in bytes
rather than words
Thanks for offline comments and suggestions from Dmitry and Thomas.
- in G1CollectedHeap::par_allocate_during_gc() I still think it is
required to do the !old->is_forwarded() check before retrieving the
old->age(), and so that there cannot be a reload of the mark header
between those.
Also, if you look at oopDesc::age(), the first assert checks whether it
is not forwarded. Between the atomic claim by installing the forward
pointer and this reading of the age this might happen, so the assert may
trigger.
So the change should either read the mark oop first (using a volatile
read), then do the is_forwarded() check and the retrieval of the age
value on that mark oop, do the even processing after the claiming of the
object (forward pointer installing) as suggested once, or (least
favourable to me) pass the markOop down.
The latter messes up the method signatures, and in any case (when using
option one or three) this code is slightly racy as we might report too
many events as another thread might have claimed the object. (Parallel
Scavenge has the same issue, in addition to the possibility of sending
two events as Bengt describes).
Please document the possibility of the race, and the workaround in these
locations.
- another source of inaccuracy is that at the end of GC, G1 will make
the very last PLAB available for allocation in the next GC. And it may
do additional allocations to fill up the region (if there is not enough
useful space at the end of the allocation region), or fill up the old
gen allocation to the next card to avoid races in the next GC (see
G1CollectedHeap::release_gc_alloc_regions() and the release() methods of
the Survivor/OldGCAllocRegion classes.
This, that JFR might get slightly too many events (or too few), should
be documented somewhere, probably in JFR/event documentation. At least
the sum of these allocations should not be used as the number of copied
bytes.
- also maybe add a comment about the purpose of the "old" parameters
passed around. It is not obvious that a method that allocates a range of
bytes needs the value of the old memory block. Except for CMS, where it
is already used, the other non-G1 methods lack this documentation too.
- please align the parameters in all calls to
gc_tracer->report_promotion_event().
- there is a superfluous space at the end of the line in
generation.hpp:326
Thanks,
Thomas