Hi Simon,

If you just need a count, would a class histogram suffice?

jcmd <pid> GC.class_histogram
and/or, for older JDKs:

jmap -histo[:live] <pid>

Thanks,
Alex

https://docs.oracle.com/javase/6/docs/technotes/tools/share/jmap.html

*-histo[:live]*
Prints a histogram of the heap. For each Java class, number of objects, memory size in bytes, and fully qualified class names are printed. VM internal class names are printed with '*' prefix. If the *live* suboption is specified, only live objects are counted.

On Thu, Nov 1, 2018 at 10:09 PM Simon Roberts <si...@dancingcloudservices.com> wrote:

> Thanks for the input Kirk, JC.
>
> Kirk, do I assume that your code decodes the entire object graph? I admit that mine is intended to be hyper-lean. It collects the strings, the classes, and then simply counts object instances. But the upside of that is that I can get a map of object count by class from a 20 GB heap in under two minutes, on a machine that only has 16 GB of RAM. I tried to read it using jhat once on a 32 GB machine and it just ground into the dirt with all 8 cores running flat out (clearly in the "I'm dying" GC mode, since I'm pretty sure that jhat itself is single-threaded!). But it's nice to know I'm not insane!
>
> JC, I'm on the road right now, but will try to dig out the relevant segment, fwiw, in the next few days. I will say that it was pretty clear that it was somehow bogus.
>
> Oh, while on the topic of what I learned from this exercise, I also believe I discovered a bug in BufferedInputStream (don't laugh!). It fails catastrophically after reading 2 GB (I think I can guess why without even looking at the code :). Everything was reading properly from an unadorned FileInputStream, but then I needed to look at the byte sequence to work out what was happening with these broken parts, and everything that was previously working went crazy when I stuck the BIS on the front. I created a replacement so that I could actually look at the failing bytes in a debugger.
> Hmm, the point of that observation was to ask the off-topic question of how one should report a bug in a core API.
>
> Thanks again,
> Cheers,
> Simon
>
> On Thu, Nov 1, 2018 at 11:20 AM JC Beyler <jcbey...@google.com> wrote:
>
>> Hi Simon,
>>
>> I briefly looked at the code that does the dump (or seems to), and it is written in the form of:
>>
>>     out.writeByte((byte) HPROF_GC_ROOT_JNI_LOCAL);
>>     writeObjectID(oop);
>>     out.writeInt(threadIndex);
>>     out.writeInt(EMPTY_FRAME_DEPTH);
>>
>> (for the Java implementation). I quickly went through the various paths, but I don't see a case where it could just stop after having written the object ID; it seems that it would either throw a nice exception or write those two integers afterward. The C++ implementation does the same:
>>
>>     void JNILocalsDumper::do_oop(oop* obj_p) {
>>       // ignore null handles
>>       oop o = *obj_p;
>>       if (o != NULL) {
>>         writer()->write_u1(HPROF_GC_ROOT_JNI_LOCAL);
>>         writer()->write_objectID(o);
>>         writer()->write_u4(_thread_serial_num);
>>         writer()->write_u4((u4)_frame_num);
>>       }
>>     }
>>
>> I'm making a lot of assumptions: that the surrounding code is the same, and that the writer does not get corrupted or messed up. But it does seem sane. What exactly are the few bytes at that 16th element that make you believe the next two 4-byte values could not be the thread serial number and frame number?
>>
>> Jc
>>
>> On Wed, Oct 31, 2018 at 11:34 AM Kirk Pepperdine <kirk.pepperd...@gmail.com> wrote:
>>
>>> Hi Simon,
>>>
>>> I’ve also started a small project to try and solve the "we need to look at a very large heap" problem. My solution is to load the data into Neo4j. You can find the project on my GitHub account.
>>>
>>> So, I believe I’ve taken the same tactic in just abandoning the segment for the moment.
>>> It would be useful to sort that out, but I listed it as a future…
>>>
>>> Kind regards,
>>> Kirk
>>>
>>> On Oct 31, 2018, at 4:07 AM, Simon Roberts <si...@dancingcloudservices.com> wrote:
>>>
>>> Hi all,
>>>
>>> I'm hoping this is the correct list for a question on the hprof file format (1.0.2)?
>>>
>>> I found this information:
>>> http://hg.openjdk.java.net/jdk6/jdk6/jdk/raw-file/tip/src/share/demo/jvmti/hprof/manual.html
>>>
>>> and have been working on a small project to read these files. (Yes, I know that NetBeans/VisualVM and Eclipse both have such libraries, and a number of other tools have been derived from those, but so far as I can tell, they are all fundamentally built on the notion of fully decoding everything and creating memory representations of the entire heap. I want to pull out only certain pieces of information--specifically object counts by class--from a large, ~20 GB, dump file, and those tools just give up the ghost on my systems.)
>>>
>>> Anyway, my code reads the file pretty well so far, except that the file I want to analyze seems to contradict the specifications of the document mentioned above. Specifically, after processing about five HEAP_DUMP_SEGMENTS with around 1.5 million sub-records in each, I come across some ROOT_JNI_LOCAL records. The first 15 follow the format specified in the above document (one 8-byte "ID" and two 4-byte values). But the 16th omits the two 4-byte values (well, it might simply have more, but visual analysis shows that after the 8-byte ID, I have a new block tag and a believable structure). I've actually noticed that several of the record types defined in this "group" seem to diverge from the paper I mentioned.
>>>
>>> My solution is that if my parser trips, it abandons that HEAP_DUMP_SEGMENT from that point forward.
>>> It doesn't seem to matter much, since I was looking for object data, and it appears that all of that has already been handled. However, clearly this is not ideal!
>>>
>>> Is there any more detailed, newer, better information? Or anything else I should know to pursue this tool (or indeed a simple object frequency by classname result) from an hprof 1.0.2 format file?
>>>
>>> (And yes, I'm pursuing a putative memory leak :)
>>>
>>> Thanks for any input (including "dude, this is the wrong list!")
>>> Cheers,
>>> Simon
>>>
>>> --
>>> Simon Roberts
>>> (303) 249 3613
>>
>> --
>> Thanks,
>> Jc
>
> --
> Simon Roberts
> (303) 249 3613
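
For reference, the ROOT_JNI_LOCAL layout discussed in this thread (a one-byte tag, an object ID, then two 4-byte values for the thread serial number and the frame number) can be sketched in Java roughly as below. This is only a minimal illustration, not the JDK's code and not anyone's actual parser from the thread: the class and method names (HprofJniLocal, readId, read) are hypothetical, the tag value 0x02 and the field order are taken from the hprof manual linked in the original question, and the ID size is passed in as a parameter on the assumption that a real reader would take it from the hprof file header.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: decodes one ROOT_JNI_LOCAL sub-record of a heap dump segment.
final class HprofJniLocal {
    // Sub-record tag for ROOT_JNI_LOCAL (assumed from the hprof manual).
    static final int ROOT_JNI_LOCAL = 0x02;

    final long objectId;     // object ID, 4 or 8 bytes wide in the file
    final int threadSerial;  // thread serial number (u4)
    final int frameNumber;   // frame number in the stack trace (u4)

    HprofJniLocal(long objectId, int threadSerial, int frameNumber) {
        this.objectId = objectId;
        this.threadSerial = threadSerial;
        this.frameNumber = frameNumber;
    }

    // Reads an ID of the given width; DataInputStream is big-endian, as is hprof.
    static long readId(DataInputStream in, int idSize) throws IOException {
        if (idSize == 4) {
            return in.readInt() & 0xFFFFFFFFL; // widen without sign extension
        }
        if (idSize == 8) {
            return in.readLong();
        }
        throw new IOException("unsupported ID size: " + idSize);
    }

    // Reads tag + ID + two u4 values; fails loudly if the tag is not ROOT_JNI_LOCAL.
    static HprofJniLocal read(DataInputStream in, int idSize) throws IOException {
        int tag = in.readUnsignedByte();
        if (tag != ROOT_JNI_LOCAL) {
            throw new IOException("expected ROOT_JNI_LOCAL, got tag 0x"
                    + Integer.toHexString(tag));
        }
        long id = readId(in, idSize);
        int threadSerial = in.readInt();
        int frameNumber = in.readInt();
        return new HprofJniLocal(id, threadSerial, frameNumber);
    }
}
```

With 8-byte IDs this consumes 17 bytes per record. If the byte that follows is not a known sub-record tag, that may be a hint that the reader has lost sync somewhere earlier in the segment, rather than that this particular record is short.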