Hi Gunter, the change looks good to me now.
I would think it is most pragmatic if parsing in the thread_dump function of attachListener.cpp could be done as you proposed. Let's wait if Chris is fine with it. Small nit: In src/jdk.jcmd/share/classes/sun/tools/jstack/JStack.java, around line 70, I'd prefer if you could write the if statements like this: if (arg.equals("-l")) { locks = true; } else if (arg.equals("-e")) { extended = true; } else { usage(1); } That'll keep the indentation a bit more flat... Best regards Christoph > -----Original Message----- > From: Haug, Gunter > Sent: Freitag, 8. Juni 2018 16:05 > To: David Holmes <david.hol...@oracle.com>; Lindenmaier, Goetz > <goetz.lindenma...@sap.com>; Chris Plummer > <chris.plum...@oracle.com>; serviceability-dev <serviceability- > d...@openjdk.java.net>; Langer, Christoph <christoph.lan...@sap.com>; > hotspot-runtime-...@openjdk.java.net; Kirk Pepperdine > <k...@kodewerk.com>; Thomas Stüfe <thomas.stu...@gmail.com> > Subject: Re: RFR(S): 8200720: Print additional information in thread dump > (times, allocated bytes etc.) > > Hi all, > > thanks a lot for all the input! I have prepared a new version of the webrev > incorporating the suggestions you made (at least I tried): > > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v3/ > > This version outputs the thread times unconditionally while the other > information is guarded by a flags. I think, most participants found the thread > times the most valuable information and had no (strong) objections to > adding them unconditionally. > > @David > Implementation is much simpler with conditional output only for > JavaThreads. I could get rid of the terrible hack (changing the flag) without > having to change too many source files. > > The information on allocated bytes is present in the Thread class already > before this proposed change, I just print it. It might be sensible to move > _allocated_bytes and the respective methods to the ThreadStatisticInfo class > as Götz suggested. I haven’t done that in the current version, though. > > @David and Thomas > I've removed the pthread-id output. I'm unsure myself what it could be good > for. Anyway, we could add it with a separate change (with a better > implementation) if there is a need to. > > @Chris > As you have written, the user has no direct contact to the attach listener of > the VM and jstack won't misinterpret e.g. -help. I agree that the parsing in > attachListener.cpp could be more robust. However, it hasn't been so far > either. All that is done in the current implementation is a strcmp to -l > everything else will be silently ignored. I can try to make this more robust > or > we could say that the new output is only available via jcmd. OTH nothing bad > can happen with the current version of the proposed change, so we could > also leave it as it is. What would you prefer? > > @Kirk and Thomas > Implementing a new diagnostic command is certainly well worth a thought. > However, it appears to me that it's not necessary in this case. There are > already flags to jstack and jcmd Thread.print, so it's not that uncommon. The > amount of information that would be available by a new diagnostic command > and that is now added to the thread dump is quite small and I think that is > tolerable. Moreover, and most importantly, our support team is use to find > the information in the thread dump and it would make things easier to us if I > could leave it there. > > Thanks again and have a nice weekend, > Gunter > > > On 06.06.18, 04:56, "David Holmes" <david.hol...@oracle.com> wrote: > > Hi Goetz, > > On 5/06/2018 11:07 PM, Lindenmaier, Goetz wrote: > > Hi > > > > this discussion touched a lot of points so far, which seem > > to lead to different conclusions. > > > > I think we should look at the values printed: > > > > 1. cpu=6300.65ms elapsed=123.28s > > Overhead > > cpu time: > > * system calls at thread dump time > > elapsed time: > > * 1 system call at thread creation time > > * 1 64-bit field per thread for the thread start time > > * 1 system call at thread dump time > > > > As I understand, JDK-8143176 would have had to get the > > time at each locking, which is much more time critical > > than doing this during thread creation. While > > Correct. > > > the time a lock was held would be much more useful, > > the ratio here, combined with knowledge about the application, > > could lead to the conclusion that the thread is wrongly > > blocked at much lower cost. > > Agreed. I see no issue with unconditionally adding this information as > it should address some of the concerns of 8143176 as well. > > > > > 2. pthread-id=0x109708000 > > Overhead: > > * a field access at thread dump time. Negligible I would say. > > I'd overlooked this addition, but it is of course the chunk of ifdef > code in osThread.cpp that I objected to. The problem I have here is the > convoluted mess of "thread identifiers" that we already have. We > currently print: > > st->print("tid=" INTPTR_FORMAT " ", p2i(this)); > > which is just the address of the Thread/JavaThread object. Then: > > st->print("nid=0x%x ", thread_id()); > > where 'n' is supposed to represent "native thread id", which is: > - linux: kernel thread id from syscall gettid > - solaris: thread library identity from thr_create() or thr_self() > - windows: thread id from _beginthreadex > - OS X: kernel thread id from pthread_mach_thread_np(pthread_self()); > - Other BSD: kernel (?) thread id from syscall thr_self or getthrid > - AIX: thread library identity from pthread_create() or pthread_self() > > It's telling that the code Gunter added gets the thread library id on > linux, but a "kernel thread id" on other platforms! Do we want to see a > thread library id and a kernel thread id? IIRC nid is supposed be the id > you need to see the thread in a debugger. In which case I'm unclear what > role the thread id Gunter wants to add is playing? I don't think we need > to see kernel ids in general, and pthread-id isn't useful for debugging > is it? > > > > > 3. allocated=242236760B defined_classes=1725 > > Overhead: > > * 1 64-bit field per thread. > > * counter increment on class creation > > > > Especially defined_classes seems to be controversial. > > As this only makes sense for Java threads, this could > > be printed in a line of it's own in Threads::print_on_error() > > (which already gets a flag to customize for jstack: > > print_concurrent_locks) by calling a dedicated function in Thread. > > No need for flag PrintExtendedThreadInfo thus. > > > > But I would propose to drop this information as it is > > too controversial. > > Agreed. > > > This leaves the times and the pthread-id to be decided whether > > they are worth the cost. If so, I think they should be printed > > unconditional. > > Agreed. With some refinement of the "thread id" issue. > > > If a flag is required to avoid bloating the info in the default case, > > I would include more infos, as os_prio and nid under that flag. > > Of course those things exist for good reason. The native id is supposed > to be what aids in you matching what you see in our thread dumps with > what you see in a debugger. Priority is less of an issue these days, but > there was a time ... ;-) > > > As jstack is deprecated, changes to jstack could be skipped always > > disabling the new printouts. > > > > Finally, I would propose to move _allocated_bytes into > > threadStatisticInfo.hpp. > > Didn't that part get dropped? (You don't need it for the time info or > thread id). That's a GC logging issue to me. > > Thanks, > David > > > Best regards, > > Goetz. > > > > > > > > > >> -----Original Message----- > >> From: serviceability-dev [mailto:serviceability-dev- > >> boun...@openjdk.java.net] On Behalf Of David Holmes > >> Sent: Dienstag, 5. Juni 2018 04:53 > >> To: Haug, Gunter <gunter.h...@sap.com>; Chris Plummer > >> <chris.plum...@oracle.com>; serviceability-dev <serviceability- > >> d...@openjdk.java.net>; Langer, Christoph > <christoph.lan...@sap.com>; > >> hotspot-runtime-...@openjdk.java.net > >> Subject: Re: RFR(S): 8200720: Print additional information in thread > dump > >> (times, allocated bytes etc.) > >> > >> Hi Gunter, > >> > >> On 5/06/2018 1:27 AM, Haug, Gunter wrote: > >>> Hi David, Chris, > >>> > >>> would it be an option to unconditionally print the additional > information? > >> Regardless which way this is implemented it will be rather complicated > and it > >> only switches on/off a few field in the thread dump. > >> > >> I'm not convinced this is all suitable information for a thread stack > >> dump. I can see the time information being useful if using the dump to > >> try and diagnoze a hang or responsiveness issue. But the allocated- > bytes > >> and classes-defined just doesn't seem useful in the context of a thread > >> dump to me. Anyone interested in allocation stats is going to be > looking > >> at GC logs etc which is where this belongs. Ditto the class-stats > belong > >> in some kind of classloading logging IMHO. > >> > >> I'm very reluctant to burden the implementation with capturing these > >> kinds of stats, just to use for diagnostic purposes that may not even > be > >> asked for. I'm very much in the "you shouldn't pay for what you don't > >> use" camp in that regard. (See my comments in JDK-8143176 referenced > by > >> Sean.) > >> > >> The ThreadStatisticInfo adds overhead to every thread object - and I'd > >> have to suspect there could be overlap with whatever flight recorder > >> sticks in there too. (Thread has become even more bloated in recent > time!). > >> > >> Cheers, > >> David > >> > >> > >>> Thanks, > >>> Gunter > >>> > >>> On 04.06.18, 01:13, "David Holmes" <david.hol...@oracle.com> > wrote: > >>> > >>> Hi Gunter, Chris, > >>> > >>> Sorry just trying to catch up and this is only a partial look > so far ... > >>> > >>> BTW these changes are not limited to serviceability code so > adding > in > >>> runtime team as well. > >>> > >>> On 2/06/2018 9:12 AM, Chris Plummer wrote: > >>> > Hi Gunter, > >>> > > >>> > On 6/1/18 3:17 AM, Haug, Gunter wrote: > >>> >> Hi Chris, > >>> >> > >>> >> thanks for looking into this! > >>> >> > >>> >> Re the synchronization: The value is stored only in a VM > operation at > >>> >> a safepoint by the VM thread. I was of the opinion, that this > could > >>> >> not be interrupted by a second VM operation (of the same > type). Or > >> am > >>> >> I missing something else? > >>> > I was really thinking more about the temporary changing of > >>> > PrintExtendedThreadInfo, not the value stored in the VMOp. You > may > >> be be > >>> > correct that no more than one VMOp is executing, but while it > is > >>> > executing it is has changed the value of > PrintExtendedThreadInfo, > >> which > >>> > might have an impact on anything else that triggers printing > thread info > >>> > while the VMOp is executing. > >>> > >>> Even if nothing else can possibly be running during the > safepoint I > find > >>> it extremely bad form to change a command-line flag in this way, > >>> particularly from a safepoint! > >>> > >>> If this flag is intended to be dynamically enabled as part of a > dcmd > >>> then it should also be a Manageable flag IMHO. > >>> > >>> That said ... > >>> > >>> >> I did think about passing an argument to the various > print_on > >> member > >>> >> functions of the thread classes, but this would require > rather > >>> >> extensive code changes for a rather tiny extension. > Therefore I > feel > >>> >> doing it like this is the lesser evil. > >>> > >>> ... it's obviously not truly a global setting, but one that is > needed on > >>> a per-print-request basis. The flag is just the default, but if > the > >>> default is off you still want to enable extended printing on a > >>> per-request basis. > >>> > >>> The current print_on mechanics is not set up for this kind of > >>> flexibility. I think this needs more thought. > >>> > >>> --- > >>> > >>> Re osThread.cpp shared code changes ... I really hate to see > all the > >>> ifdefs in there. Please add a pd_print_on function to support > this if > >>> you truly want platform specific information. > >>> > >>> --- > >>> > >>> threadStatisticInfo.hpp > >>> > >>> typo: getElepsedTime() -> getElapsedTime() > >>> > >>> Thanks, > >>> David > >>> > >>> > >>> > >>> >> > >>> >> Re thread_dump(): I think it's correct or, well, at least it > works ;-) > >>> >> In fact jstack will transfer the "-e" and "-l" in only one > string, > >>> >> i.e. op->arg(0). > >>> > So you are saying that op->arg(0) is "-e -l" when using > jstack? I > think > >>> > you really need to clean up the parsing. As it stands right > now > passing, > >>> > I get the feeling that if a user erroneously asks for help by > using > >>> > "jcmd <pid> Thread.Print -help", it will end up executing > with -e > an -l > >>> > enabled, and no failure for the invalid "-help" option. > >>> >> Christoph has already explained my reasoning. But I agree, > it's > not > >>> >> nice and I would be happy to do it like Christoph suggested. > >>> > I'll respond separately to his suggestion. > >>> > > >>> > thanks, > >>> > > >>> > Chris > >>> >> > >>> >> And typo fixed, sorry. > >>> >> > >>> >> Thanks again, > >>> >> Gunter > >>> >> > >>> >> On 01.06.18, 00:03, "Chris Plummer" > <chris.plum...@oracle.com> > >> wrote: > >>> >> > >>> >> Hi Gunter, > >>> >> globals.hpp: fix typo "informatiuon" > >>> >> I worry a little bit about the synchronizing (if that's > the right > >>> >> word) > >>> >> of PrintExtendedThreadInfo and the dcmd's -e flag. When > using -e, > >>> >> you > >>> >> are temporarily enabling PrintExtendedThreadInfo if it > was > false. > >>> >> This > >>> >> temporarily changes the behavior of thread dumps, and > could > >>> >> impact other > >>> >> uses that happen in parallel. Also, could two > simultaneous > uses > >>> >> of -e > >>> >> result in PrintExtendedThreadInfo not getting restored > properly? > >>> >> thread_dump() doesn't look right. It looks like you are > iterating > >>> >> char > >>> >> by char over the argument, and expect something like > "-el" to > be > >>> >> specified rather then "-e -l". The loop should be > iterating over > >>> >> op->arg(i), not op->arg(0)[i]. > >>> >> The rest of the changes look fine. > >>> >> thanks, > >>> >> Chris > >>> >> On 5/30/18 8:12 AM, Haug, Gunter wrote: > >>> >> > Hi all, > >>> >> > > >>> >> > As Chris proposed, I have made an the extended output > >>> >> switchable. There is an VM flag (PrintExtendedThreadInfo), > which is > >>> >> false by default. Moreover, there is an Option (-e) which > can be > used > >>> >> with jcmd Thread.print as well as with jstack. The -e option > >>> >> essentially sets PrintExtendedThreadInfo true just for the > respective > >>> >> thread dump. > >>> >> > > >>> >> > Here is the updated webrev: > >>> >> > > >>> >> > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v2 > >>> >> > > >>> >> > (https://bugs.openjdk.java.net/browse/JDK-8200720) > >>> >> > > >>> >> > Thanks, > >>> >> > Gunter > >>> >> > > >>> >> > > >>> >> > On 02.05.18, 17:07, "serviceability-dev on behalf of > Haug, > >>> >> Gunter" <serviceability-dev-boun...@openjdk.java.net on > behalf of > >>> >> gunter.h...@sap.com> wrote: > >>> >> > > >>> >> > Hi Chris, > >>> >> > > >>> >> > Thanks for looking into this. > >>> >> > You're right, there is a little more we have. We > have > >>> >> implemented an IO tracing mechanism which - rather as a > byproduct - > >>> >> keeps track of bytes read and written per thread. However, > this > of > >>> >> course requires changes not only in hotspot. We would be > happy to > >>> >> contribute this as well, but this is a far bigger change and > will > >>> >> probably lead to a far bigger discussion. Anyway, with the > number of > >>> >> bytes read, the number of classes defined doesn't look that > arbitrary > >>> >> anymore, as one can correlate IO to class loading. > >>> >> > > >>> >> > Regarding the verbose option I think that's a > good idea! > >>> >> > > >>> >> > Thanks again, > >>> >> > Gunter > >>> >> > > >>> >> > On 01.05.18, 22:55, "Chris Plummer" > >>> >> <chris.plum...@oracle.com> wrote: > >>> >> > > >>> >> > Hi Gunter, > >>> >> > > >>> >> > The output you are adding is all useful. I > think the > >>> >> question is (and > >>> >> > I'd like to see a few people chime in on > this for this > >>> >> review) is > >>> >> > whether or not all of it is the appropriate > for a > >>> >> thread's stack dump. > >>> >> > For example, defined_classes is on the > fringe of what > >>> >> I would call > >>> >> > generally useful info in a stack dump. Sure, > there > >>> >> might be that rare > >>> >> > case when you need it, but how often > compared to > other > >>> >> useful info > >>> >> > maintained on a per thread basis. How many > other bits > >>> >> of useful info are > >>> >> > not being printed in favor of > defined_classes? It > >>> >> seems you have more in > >>> >> > the queue. How many? I'm worried about how > cluttered > >>> >> the stack dumps > >>> >> > will get. Maybe we should add some sort of > verbose > >>> >> thread dumping > >>> >> > option. Just a thought. > >>> >> > > >>> >> > As for the implementation, overall it looks > good, but > >>> >> I can't speak to > >>> >> > whether or not you are doing proper > accounting of > >>> >> defined_classes and > >>> >> > bytes allocated. You'll need input from > someone with > >>> >> more knowledge of > >>> >> > those areas. We'll also need to do some > testing to > >>> >> make sure tool tests > >>> >> > are not impacted. > >>> >> > > >>> >> > thanks, > >>> >> > > >>> >> > Chris > >>> >> > > >>> >> > On 4/30/18 2:51 AM, Haug, Gunter wrote: > >>> >> > > Hi, > >>> >> > > > >>> >> > > this is an update to an RFR I posted on > hotspot-dev, > >>> >> but it is probably more suitable to post it here. Can I > please have > a > >>> >> review and a sponsor for the following enhancement: > >>> >> > > > >>> >> > > > http://cr.openjdk.java.net/~ghaug/webrevs/8200720.v1 > >>> >> > > > https://bugs.openjdk.java.net/browse/JDK-8200720 > >>> >> > > > >>> >> > > We at SAP have extended the thread dumps > (obtained > >>> >> by jstack or jcmd) by several fields providing thread > specific > >>> >> information. These extensions are quite popular with our > support > >> team. > >>> >> With some knowledge of the architecture of the application, > they > >> often > >>> >> allow for quick and simple diagnosis of a running system. > Therefore > >> we > >>> >> would like to contribute these enhancements. > >>> >> > > > >>> >> > > I took a few simple examples here, namely > cpu time, > >>> >> elapsed time since thread creation, bytes allocated and > classes > >>> >> defined by the thread and the pthread-id or equivalent on > platforms > >>> >> where it makes sense. Provided it is known how the > application > >> should > >>> >> behave, a misbehaving thread can identified easily. > >>> >> > > > >>> >> > > There is no measurable overhead for this > >>> >> enhancement. However, it may be a problem that the format of > the > >>> >> output is changed. Tools parsing the output may have to be > changed. > >>> >> > > > >>> >> > > Here is an example of the output generated: > >>> >> > > > >>> >> > > > ------------------------------------------------------ > >>> >> > > > >>> >> > > "main" #1 prio=5 os_prio=31 cpu=6300.65ms > >>> >> elapsed=123.28s allocated=242236760B defined_classes=1725 > >>> >> tid=0x00007fa13a806000 nid=0x1c03 pthread-id=0x109708000 > waiting > >> on > >>> >> condition [0x0000000109707000] > >>> >> > > java.lang.Thread.State: TIMED_WAITING > (sleeping) > >>> >> > > JavaThread state: _thread_blocked > >>> >> > > Thread: 0x00007fa13a806000 [0x1c03] State: > >>> >> _at_safepoint _has_called_back 0 _at_poll_safepoint 0 > >>> >> > > JavaThread state: _thread_blocked > >>> >> > > at java.lang.Thread.sleep(java.base/Native > Method) > >>> >> > > ... > >>> >> > > > ------------------------------------------------------ > >>> >> > > > >>> >> > > As mentioned above, we have a whole bunch > of > other > >>> >> enhancements to the thread dump similar to this one and would > be > >>> >> willing to contribute them if there is any interest. > >>> >> > > > >>> >> > > Thanks and best regards, > >>> >> > > Gunter > >>> >> > > > >>> >> > > > >>> >> > > >>> >> > > >>> >> > > >>> >> > > >>> >> > > >>> >> > >>> > > >>> > > >>> > >>> >