The HPROF binary format came from JVMPI, so it has a long history. JVMPI was
experimental, but it became so critical to people that the "experimental" label
was a joke; ditto for HPROF, which was closely tied to JVMPI. When I converted
HPROF to JVM TI in jdk5, it was pretty much a complete re-write of the agent
code, and the HPROF format was the only thing it had in common with the old
pre-jdk5 HPROF.

This file format represents a very generic view of the heap and has never been 
hotspot specific.

Remember that we have at least two separate places, with different code, that
create the HPROF binary format: the VM itself (Alan would be the best contact
on that) and also the HPROF VM agent in the jdk repository. You really can't
change one without at least making sure they both cooperate.
(I assume NetBeans also has code that reads this format.)

Any change to the HPROF format is risky because we don't own all the code that
consumes the format, and the same is true of the HPROF text format (HP has a
tool that reads the HPROF text output). Too many 3rd party tools out there use
this format and have made some very bad assumptions about the format when
reading it. Consequently, it is very easy to break these 3rd party tools.
For the longest time there wasn't even a spec on the format; some of the code
was written without one. So I don't know how many tools are left out there
that use this format.

After spending many years working on HPROF, I came to the conclusion that the
format should be locked down and a new format should be developed.
After Alan developed the HPROF snapshot binary dump, I became fairly convinced
that the HPROF VM agent had a very limited use case. Then the NetBeans
Profiler came into existence, and it was obvious to me that HPROF should be
deprecated.

But a new file format still had a future. Unfortunately, it never was
important enough to get any attention, and then the 2009 Serviceability
resource purge happened and the effort just died. :^(

If libraries could be provided to read and write the new format, you could get
away from these 3rd party tools having their own code to read the bytes out of
a file or parse the output. You could even have a "real" generated parser and
a detailed formal spec on the format. Those same libraries could be used to
create a transformation tool that could convert the new format to/from the old
HPROF binary format, as a bridging mechanism for some people.
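To give a sense of why per-tool parsers are fragile, here is a minimal sketch
of skimming just the documented HPROF 1.0.2 outer envelope (NUL-terminated
version string, u4 identifier size, u8 timestamp, then records of tag u1 /
u4 time delta / u4 body length). The class name and the synthetic in-memory
dump in main() are my own illustration, not code from either dumper; a real
tool would dispatch on each tag and decode the record bodies:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical minimal reader for the HPROF binary header and the
// per-record envelope. Record bodies are skipped, not decoded.
public class HprofSkim {

    public static void skim(DataInputStream in) throws IOException {
        // Version string is NUL-terminated, e.g. "JAVA PROFILE 1.0.2"
        StringBuilder version = new StringBuilder();
        int b;
        while ((b = in.read()) > 0) {
            version.append((char) b);
        }
        int idSize = in.readInt();    // size of object IDs (4 or 8 bytes)
        long millis = in.readLong();  // dump timestamp, ms since epoch
        System.out.println(version + ", idSize=" + idSize);

        while (in.available() > 0) {
            int tag = in.readUnsignedByte();
            int timeDelta = in.readInt();  // microseconds since header time
            int bodyLen = in.readInt();
            in.skipBytes(bodyLen);         // skip the record body
            System.out.println("tag=0x" + Integer.toHexString(tag)
                               + " len=" + bodyLen);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny synthetic dump in memory for demonstration.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeBytes("JAVA PROFILE 1.0.2");
        out.writeByte(0);                     // NUL terminator
        out.writeInt(8);                      // 8-byte object IDs
        out.writeLong(System.currentTimeMillis());
        out.writeByte(0x01);                  // a record tag
        out.writeInt(0);                      // time delta
        out.writeInt(4);                      // body length
        out.writeBytes("test");               // opaque body bytes
        skim(new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())));
    }
}
```

Even this tiny envelope shows the trap: a tool that hard-codes idSize, or
assumes it knows every tag instead of skipping by length, breaks the moment
the format revs.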

IF you decide to modify the existing HPROF file format, be very careful and
don't underestimate the impact this might have. Having some contingency plans
would be a good idea.


-kto

On Dec 29, 2012, at 3:05 AM, Aleksey Shipilev wrote:

> Thanks Alan!
> 
> On 12/29/2012 02:53 PM, Alan Bateman wrote:
>> On 29/12/2012 10:10, Aleksey Shipilev wrote:
>>> :
>>> 
>>> Seems like that instance_size() method is used to populate both GC_CLASS
>>> and GC_INSTANCE. Would it be OK to push the VM-reported size to GC_CLASS
>>> only, and leave GC_INSTANCE intact?
>>> 
>> I don't think it can be in either as the sizes in the HPROF dump are VM
>> and padding independent.
> 
> Sorry, I fail to follow this reasoning. I thought HPROF was about
> dumping the actual info from the VM? What's the downside of publishing
> the actual instance size as part of the class metainfo? Is the only
> consideration the format "purity" (which, btw, clashes with the
> "WARNING: This format is still considered highly experimental" clause in
> the HPROF format description), or is it a preference to add new
> attributes rather than change existing ones?
> 
>> The only thing I can suggest for the short term would be to rev the
>> HPROF format to define a new record that provides better details on
>> the sizes and layout.
> 
> Ok, we can go there, and that adds some lag for the tools. Would the
> prototype implementation of new format be enough to get the ball rolling?
> 
>> It's a slippery slope of course. In the mean-time, at
>> least for the built-in heap dumper and the SA-based heap dumper, then
>> the object ID is the oop so it's possible to deduce something.
> 
> Is that post-mortem? The alignment and padding info might be lost and/or
> inconsistent at that point already.
> 
> -Aleksey.
> 
