Yasumasa, > Should I change the Assignee on JBS (JDK-8165869) to you?
Yes. Please do it. -Dmitry On 2016-10-25 12:29, Yasumasa Suenaga wrote: > Hi Dmitry, > >> It's hard to fix this issue without access to Oracle testing >> infrastructure. > > Agree. > I'm not an Oracle employee. Thus I cannot access it. > > Should I change the Assignee on JBS (JDK-8165869) to you? > > > Thanks, > > Yasumasa > > > On 2016/10/25 15:09, Dmitry Samersoff wrote: >> Yasumasa, >> >> It's hard to fix this issue without access to Oracle testing >> infrastructure. >> >> I'm working on the issue but unfortunately I was caught by some internal >> problems so it progresses slowly than I expected. >> >> I'm sorry about it. >> >> -Dmitry >> >> >> On 2016-10-24 13:24, Yasumasa Suenaga wrote: >>> Hi Dmitry, David, >>> >>> I want to resume to work for this issue because this issue has been >>> marked P3. >>> Dmitry, do you have any idea for it? >>> >>> IMHO, we can fix as below: >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8165869/webrev.00/ >>> >>> According to David, JvmtiExport::load_agent_library() should return zero >>> when the operation which is kicked by AttachListener is succeeded. >>> AttachListener will write response code from >>> JvmtiExport::load_agent_library() at first, and the client (e.g. >>> HotSpotVirtualMachine.java) checks it. >>> >>> Thus I keep current implementation about return code, and I added the >>> code to send error message to the client. >>> It works fine with jdk/test/com/sun/tools/attach/BasicTests.java . >>> >>> What do you think about it? >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2016/09/15 20:08, Dmitry Samersoff wrote: >>>> Yasumasa, David, >>>> >>>> I'm looking into this issue and come back as soon as I have a clear >>>> picture. >>>> >>>> It takes some time. Sorry! >>>> >>>> -Dmitry >>>> >>>> On 2016-09-15 14:01, Yasumasa Suenaga wrote: >>>>> Hi David, >>>>> >>>>> I agree the call sequence which you write in the case of "jcmd" >>>>> operation. >>>>> However, we should think about "load" operation in AttachListener. >>>>> >>>>> We can access JvmtiExport::load_agent_library() through two routes: >>>>> >>>>> Route "load": AttachListener -> JvmtiExport::load_agent_library() >>>>> Route "jcmd": AttachListener -> jcmd() in attachListener.cpp -> >>>>> DCmd::parse_and_execute() >>>>> >>>>> "load" sets error code (it means first data in the stream) when >>>>> JvmtiExport::load_agent_library() returns JNI_ERR. >>>>> OTOH "jcmd" sets error code if exception is occurred ONLY. >>>>> >>>>> If we try to load invalid JVMTI agent, "load" writes JNI_ERR to >>>>> stream, >>>>> however "jcmd" writes JNI_OK because pending exception is not >>>>> available >>>>> at this point. >>>>> Currently, JCmd.java shows "Command executed successfully" when it >>>>> cannot read the message from the socket. In case of this issue, >>>>> JVMTIAgentLoadDCmd does not write any message because it cannot attach >>>>> the agent. >>>>> If return code which is in the top of stream should mean whether >>>>> communication with AttachListener is succeeded, I think HotSpot should >>>>> write error message or error code to the stream for JCmd.java . >>>>> >>>>> I'm not sure this email is the answer for you. >>>>> Sorry for my English. >>>>> >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2016/09/15 18:43, David Holmes wrote: >>>>>> Hi Yasumasa, >>>>>> >>>>>> On 15/09/2016 1:56 PM, Yasumasa Suenaga wrote: >>>>>>> Hi, >>>>>>> >>>>>>>> If this is done through jcmd then jcmd needs to know how to >>>>>>>> "unwrap" >>>>>>>> the information it gets back from the Dcmd. >>>>>>> >>>>>>> In case of JVMTI.agent_load dcmd, I think we should be able to >>>>>>> get the >>>>>>> response as below: >>>>>>> >>>>>>> 1. Invalid JVMTI agent >>>>>>> 2. Agent_OnAttach() did not return JNI_OK >>>>>>> 3. Succeeded >>>>>>> >>>>>>> Currently, JvmtiExport::load_agent_library() returns JNI_OK to >>>>>>> caller, >>>>>>> and jcmd() in attachListener.cpp returns JNI_ERR if exception is >>>>>>> occurred (in Agent_OnAttach()). >>>>>> >>>>>> Can you respond to my comment in the bug report about the chain of >>>>>> calls and explain exactly where you think things are going wrong in >>>>>> this scenario. I'm having a lot of trouble trying to see how the >>>>>> information flows back through the layers. >>>>>> >>>>>>> IMHO, HotSpot should return error code (in top of the stream: >>>>>>> written by >>>>>>> AttachListener at first). So we should be able to get some error >>>>>>> code in >>>>>>> case 1. and 2. Then the client (jcmd or other methods) can decide to >>>>>>> parse text information in the stream. >>>>>> >>>>>> It returns the high-level response code first "did communication with >>>>>> the target VM succeed", then the actual action response code. That >>>>>> seems the right thing to do to me. It is up to the callers reading >>>>>> the >>>>>> stream to understand how to read it. To me the issue lies >>>>>> somewhere on >>>>>> the jcmd/dcmd side. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> >>>>>>> Sorry for my English. >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>> On 2016/09/14 7:03, David Holmes wrote: >>>>>>>> On 13/09/2016 10:31 PM, Yasumasa Suenaga wrote: >>>>>>>>> Thanks David! >>>>>>>>> If we should not change the meaning of return code from >>>>>>>>> JvmtiExport::load_agent_library(), we should print return code as >>>>>>>>> below: >>>>>>>>> ------------------- >>>>>>>>> diff -r 0cf03b9d9b1f src/share/vm/prims/jvmtiExport.cpp >>>>>>>>> --- a/src/share/vm/prims/jvmtiExport.cpp Mon Sep 12 18:59:13 >>>>>>>>> 2016 >>>>>>>>> +0000 >>>>>>>>> +++ b/src/share/vm/prims/jvmtiExport.cpp Tue Sep 13 21:12:14 >>>>>>>>> 2016 >>>>>>>>> +0900 >>>>>>>>> @@ -2412,6 +2412,10 @@ >>>>>>>>> result = JNI_OK; >>>>>>>>> } >>>>>>>>> } >>>>>>>>> + // Print error code if Agent_OnAttach cannot be executed >>>>>>>>> + if (result != JNI_OK) { >>>>>>>>> + st->print_cr("%d", result); >>>>>>>>> + } >>>>>>>>> return result; >>>>>>>>> } >>>>>>>>> ------------------- >>>>>>>> >>>>>>>> Not sure I see the point. "return result" will put the error code >>>>>>>> into >>>>>>>> the socket stream and that error will be seen and responded to. I >>>>>>>> don't expect anything to then read further into the stream to >>>>>>>> see the >>>>>>>> "result" repeated a second time. In other words if execute() >>>>>>>> doesn't >>>>>>>> see a zero result then you go no further; if execute sees a zero >>>>>>>> result then the actual action may have had an issue so you >>>>>>>> proceed to >>>>>>>> read the "action's result". >>>>>>>> >>>>>>>>> It can pass com/sun/tools/attach/BasicTests.java, and we can get >>>>>>>>> -1 if >>>>>>>>> we try to attach invalid agent. >>>>>>>> >>>>>>>> Can you please outline your test case for this again. If this is >>>>>>>> done >>>>>>>> through jcmd then jcmd needs to know how to "unwrap" the >>>>>>>> information >>>>>>>> it gets back from the Dcmd. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>>> >>>>>>>>> On 2016/09/13 17:06, Dmitry Samersoff wrote: >>>>>>>>>> David, >>>>>>>>>> >>>>>>>>>> Thank you for the evaluation. >>>>>>>>>> >>>>>>>>>>> With that in mind I suspect the real fix for the original issue >>>>>>>>>>> is to >>>>>>>>>>> have Dcmd recognize that it also has to read two "results". >>>>>>>>>>> Though I'm >>>>>>>>>>> not sure how exactly. >>>>>>>>>> >>>>>>>>>> This logic seems completely broken for me. But I don't see >>>>>>>>>> anything we >>>>>>>>>> can do right now - for jdk 9. >>>>>>>>>> >>>>>>>>>> It requires careful changes of both - code and tests. >>>>>>>>>> >>>>>>>>>> I can help with this task but not today. >>>>>>>>>> >>>>>>>>>> -Dmitry >>>>>>>>>> >>>>>>>>>> On 2016-09-13 08:08, David Holmes wrote: >>>>>>>>>>> On 13/09/2016 1:53 PM, David Holmes wrote: >>>>>>>>>>>> On 13/09/2016 1:30 PM, Yasumasa Suenaga wrote: >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> >>>>>>>>>>>>> I could reproduce and I added a comment to JBS: >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869?focusedCommentId=14000623&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14000623 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> In case of BasicTests.java, I think it is a test bug. >>>>>>>>>>>> >>>>>>>>>>>> I don't agree as yet. I have not been able to isolate the exact >>>>>>>>>>>> difference between what happens with your fix and what happens >>>>>>>>>>>> without. >>>>>>>>>>>> I know it relates to the presence of the error code, but also >>>>>>>>>>>> how we >>>>>>>>>>>> get >>>>>>>>>>>> AgentInitializationException instead of AgentLoadException. I >>>>>>>>>>>> need >>>>>>>>>>>> to be >>>>>>>>>>>> able to run the test outside of jtreg but that is proving to be >>>>>>>>>>>> very >>>>>>>>>>>> difficult to arrange. :( >>>>>>>>>>> >>>>>>>>>>> I finally managed to connect all the pieces on this. >>>>>>>>>>> >>>>>>>>>>> The basic problem is that with the proposed change >>>>>>>>>>> VirtualMachineImpl.execute() will throw AgentLoadException, >>>>>>>>>>> which >>>>>>>>>>> then >>>>>>>>>>> leads to the InternalError. Without the change we reach this >>>>>>>>>>> block in >>>>>>>>>>> HotspotVirtualMachine.loadAgentLibrary: >>>>>>>>>>> >>>>>>>>>>> int result = readInt(in); >>>>>>>>>>> if (result != 0) { >>>>>>>>>>> throw new AgentInitializationException("Agent_OnAttach >>>>>>>>>>> failed", >>>>>>>>>>> result); >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> and so get the expected AgentInitializationException. >>>>>>>>>>> >>>>>>>>>>> Looking at the proposed change in jvmtiExport.cpp if we have the >>>>>>>>>>> original: >>>>>>>>>>> >>>>>>>>>>> st->print_cr("%d", result); >>>>>>>>>>> result = JNI_OK; >>>>>>>>>>> >>>>>>>>>>> then execute() manages to read a zero completion status from the >>>>>>>>>>> stream, >>>>>>>>>>> and then loadAgentLibrary is able to read the actual "result" - >>>>>>>>>>> whether >>>>>>>>>>> zero or not - from the stream. This is because the >>>>>>>>>>> AttachListener >>>>>>>>>>> code >>>>>>>>>>> calls op->complete(result, &st) where st is the stream where we >>>>>>>>>>> wrote >>>>>>>>>>> the result value above in print_cr. Then if we look at, for >>>>>>>>>>> example, >>>>>>>>>>> LinuxAttachOperation::complete, we will write "result" to the >>>>>>>>>>> socket >>>>>>>>>>> first, followed by the contents of st. Hence on a successful >>>>>>>>>>> operation >>>>>>>>>>> we can read 0 for execute, and then 0 for loadAgent. On error we >>>>>>>>>>> read 0 >>>>>>>>>>> for execute and the actual error code for loadAgent. With the >>>>>>>>>>> proposed >>>>>>>>>>> change execute() sees the real result (instead of JNI_OK) and so >>>>>>>>>>> throws >>>>>>>>>>> AgentLoadException. >>>>>>>>>>> >>>>>>>>>>> So in summary there are two different error indicators written >>>>>>>>>>> into >>>>>>>>>>> the >>>>>>>>>>> stream, and we need the first to be zero unless some major error >>>>>>>>>>> with >>>>>>>>>>> the actual attach functionality occurred - otherwise even if the >>>>>>>>>>> "load" >>>>>>>>>>> failed in some other way, it is treated as a secondary error. >>>>>>>>>>> >>>>>>>>>>> With that in mind I suspect the real fix for the original issue >>>>>>>>>>> is to >>>>>>>>>>> have Dcmd recognize that it also has to read two "results". >>>>>>>>>>> Though I'm >>>>>>>>>>> not sure how exactly. >>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> ----- >>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> If it is accepted, I will upload webrev for this redo task. >>>>>>>>>>>>> However, I cannot access (and fix) Oracle internal test. Can >>>>>>>>>>>>> someone >>>>>>>>>>>>> help me? >>>>>>>>>>>>> (I cannot access JPRT. So I need a sponsor.) >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 2016/09/13 9:00, David Holmes wrote: >>>>>>>>>>>>>> I think I see the problem. The attach framework tries to >>>>>>>>>>>>>> load the >>>>>>>>>>>>>> "instrument" agent library. Prior to 8164913 if this fails it >>>>>>>>>>>>>> actually >>>>>>>>>>>>>> appears to succeed. Now it fails, and that error propagates >>>>>>>>>>>>>> through. >>>>>>>>>>>>>> I'm guessing, at the moment, that the failure is due to a >>>>>>>>>>>>>> missing >>>>>>>>>>>>>> module related option. >>>>>>>>>>>>>> >>>>>>>>>>>>>> David >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 13/09/2016 9:54 AM, David Holmes wrote: >>>>>>>>>>>>>>> Hi Yasumasa, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 13/09/2016 9:23 AM, Yasumasa Suenaga wrote: >>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hmm, it has been backouted... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I agree to David. This error seems to be a test bug. >>>>>>>>>>>>>>>> Can you share the test? I want to evaluate it. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sorry we can't share the tests. I took a quick look and it >>>>>>>>>>>>>>> seems it >>>>>>>>>>>>>>> may >>>>>>>>>>>>>>> be related to the result code parsing (that I thought we >>>>>>>>>>>>>>> ended up >>>>>>>>>>>>>>> not >>>>>>>>>>>>>>> touching!). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Can you run a test of HotSpotVirtualMachine.loadAgent(null) >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>> see >>>>>>>>>>>>>>> what >>>>>>>>>>>>>>> happens? We see AgentLoadException from the code that >>>>>>>>>>>>>>> internally >>>>>>>>>>>>>>> tries >>>>>>>>>>>>>>> to load the "instrument" agent: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Exception in thread "main" Failure: Unexpected exception >>>>>>>>>>>>>>> during >>>>>>>>>>>>>>> test >>>>>>>>>>>>>>> execution: java.lang.InternalError: instrument library is >>>>>>>>>>>>>>> missing in >>>>>>>>>>>>>>> target VM >>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>> Caused by: java.lang.InternalError: instrument library is >>>>>>>>>>>>>>> missing in >>>>>>>>>>>>>>> target VM >>>>>>>>>>>>>>> ... >>>>>>>>>>>>>>> Caused by: com.sun.tools.attach.AgentLoadException: >>>>>>>>>>>>>>> Failed to >>>>>>>>>>>>>>> load >>>>>>>>>>>>>>> agent >>>>>>>>>>>>>>> library >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I do not agree to fix this bug in JDK 10 because VM >>>>>>>>>>>>>>>> might lie >>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> JVMTI agent could not be attached. IMHO this bug should be >>>>>>>>>>>>>>>> fixed >>>>>>>>>>>>>>>> in 9 >>>>>>>>>>>>>>>> GA, and we should fix testcase(s). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I agree. It has to be noted that "we" failed to run all the >>>>>>>>>>>>>>> appropriate >>>>>>>>>>>>>>> tests before pushing this, which would have discovered the >>>>>>>>>>>>>>> problem - >>>>>>>>>>>>>>> assuming it is actually a problem with the change and not an >>>>>>>>>>>>>>> unrelated >>>>>>>>>>>>>>> issue in our testing environment. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Unfortunately I have some high priority tasks to handle >>>>>>>>>>>>>>> right >>>>>>>>>>>>>>> now, >>>>>>>>>>>>>>> so I >>>>>>>>>>>>>>> can't go into this in any more depth at the moment. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> David >>>>>>>>>>>>>>> ----- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Yasumasa >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2016/09/13 6:15, David Holmes wrote: >>>>>>>>>>>>>>>>> For the record it looks like the tests involved are >>>>>>>>>>>>>>>>> broken, in >>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>> because the VM used to lie about the success of attaching >>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>> agent, the >>>>>>>>>>>>>>>>> tests expected different exceptions to occur. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> David >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 13/09/2016 3:11 AM, harold seigel wrote: >>>>>>>>>>>>>>>>>> Looks good. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Harold >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 9/12/2016 1:05 PM, Christian Tornqvist wrote: >>>>>>>>>>>>>>>>>>> Hi everyone, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Please review this (clean) backout of the change for >>>>>>>>>>>>>>>>>>> JDK-8164913, it >>>>>>>>>>>>>>>>>>> failed >>>>>>>>>>>>>>>>>>> several tests in the nightly testing. The failures are >>>>>>>>>>>>>>>>>>> tracked in: >>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165869 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Webrev: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ctornqvi/webrev/8165881/webrev.00/ >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Bug: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8165881 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Christian >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>> >>>> >> >> -- Dmitry Samersoff Oracle Java development team, Saint Petersburg, Russia * I would love to change the world, but they won't give me the sources.
