Hi Thomas, Thank you for your comment. I uploaded new webrev:
http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.02/ 2018-06-14 3:34 GMT+09:00 Thomas Stüfe <thomas.stu...@gmail.com>: > Hi Yasumasa, > > looks good to me. > > small nit: > import java.nio.charset.* -> import java.nio.charset.Charset; Fixed. > We loose the assert for type now from byteArrayValue(). Can you re-add > it to the location where you call CStringUtils? I added the assertion to check vector length. I did not add type assertion because if-statement branches by data type. > We do not seem to need byteArrayValue(), can this be removed? I cannot agree that. byteArrayValue() is public method (but it is not used AFAICS). However, similar getter methods are privided from PerfDataEntry (e.g. charArrayValue()). So I think we should not remove byteArrayValue(). Regards, Yasumasa > Best Regards, Thomas > > > > On Wed, Jun 13, 2018 at 3:33 PM, Yasumasa Suenaga <yasue...@gmail.com> wrote: >> Thanks Thomas, >> >>> That said, I'd probably just add a variant to >>> CStringUtilities::getString which takes encoding as an argument, and >>> then pass "US-ASCII" to remain compatible with the current version of >>> jsnap. >> >> >> I agree with you. My opinion is one of the ideals. >> Anyway, I uploaded new webrev which is added charset support in >> CStringUtils: >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.01/ >> >> >> Thanks, >> >> Yasumasa >> >> >> >> On 2018/06/13 21:46, Thomas Stüfe wrote: >>> >>> Disclaimer: I am Reviewer but not versed in this corner of the jdk, >>> nor have knowledge of the history, so take what I suggest with a grain >>> of salt. >>> >>> That said, I'd probably just add a variant to >>> CStringUtilities::getString which takes encoding as an argument, and >>> then pass "US-ASCII" to remain compatible with the current version of >>> jsnap. >>> >>> ..Thomas >>> >>> On Wed, Jun 13, 2018 at 2:37 PM, Yasumasa Suenaga <yasue...@gmail.com> >>> wrote: >>>> >>>> Hi Thomas, >>>> >>>> Thank you for your comment. >>>> >>>> CStringUtilities.java says the String which is constructed from byte >>>> array >>>> should use Charset.defaultCharset(). [1] >>>> I'm not convinced how should we fix - US-ASCII or default charset. >>>> >>>> IMHO we should use default charset. However I concern I need to get CSR >>>> approvement. >>>> (In addition, I want to refactor CStringUtilities. I guess we can make >>>> more >>>> simple code :-) >>>> >>>> >>>> What do you think about it? >>>> >>>> Yasumasa >>>> >>>> >>>> [1] >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/7bf4f1b5e438/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/CStringUtilities.java#l73 >>>> >>>> >>>> >>>> On 2018/06/13 21:18, Thomas Stüfe wrote: >>>>> >>>>> >>>>> Hi Yasumasa, >>>>> >>>>> your patch will change behavior of jsnap if file.encoding is set to >>>>> something different than US-ASCII. >>>>> >>>>> Old version uses, hard wired, US-ASCII. >>>>> >>>>> You now use jvm/hotspot/utilities/CStringUtilities.java, which uses >>>>> >>>>> private static String encoding = System.getProperty("file.encoding", >>>>> "US-ASCII"); >>>>> >>>>> and thus may use a different encoding. >>>>> >>>>> Best Regards, Thomas >>>>> >>>>> >>>>> >>>>> On Wed, Jun 13, 2018 at 2:09 PM, Yasumasa Suenaga <yasue...@gmail.com> >>>>> wrote: >>>>>> >>>>>> >>>>>> PING: Could you review it? >>>>>> I want to merge this change before JDK 11 RDP 1. >>>>>> >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.00/ >>>>>>>>> >>>>>>>>> JBS: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8204531 >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2018/06/08 15:38, Yasumasa Suenaga wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> Hi David, >>>>>>> >>>>>>> 2018-06-08 10:47 GMT+09:00 David Holmes <david.hol...@oracle.com>: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Hi Yasumasa, >>>>>>>> >>>>>>>> On 7/06/2018 10:43 PM, Yasumasa Suenaga wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Hi all, >>>>>>>>> >>>>>>>>> Please review this change: >>>>>>>>> >>>>>>>>> webrev: >>>>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.00/ >>>>>>>>> >>>>>>>>> JBS: >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8204531 >>>>>>>>> >>>>>>>>> >>>>>>>>> We can use `jhsdb jsnap` to check all PerfData. >>>>>>>>> String values in PerfData are defined as jbyte array, but JSnap >>>>>>>>> cannot >>>>>>>>> handle it well as following: >>>>>>>>> >>>>>>>>> ``` >>>>>>>>> $ jhsdb jsnap --pid 28542 --all | less >>>>>>>>> >>>>>>>>> sun.gc.cause=No >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> GC^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ >>>>>>>>> ``` >>>>>>>>> >>>>>>>>> You can see this value via `less` and `vim` on Linux. `^@` shows it >>>>>>>>> is >>>>>>>>> non-ascii character. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> So your synopsis says "remove unused chars following \0". Is that >>>>>>>> really >>>>>>>> what is happening? I would have expected "new String(bytes, >>>>>>>> encoding)" >>>>>>>> to >>>>>>>> stop processing bytes when it encounters a NUL! >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> String c'tor will continue to parse byte array even if it finds '\0'. >>>>>>> You can see it on JShell as below: >>>>>>> >>>>>>> ------------- >>>>>>> jshell> var bytes = new byte[]{(byte)'a', (byte)'b', (byte)'c', >>>>>>> (byte)'\0', (byte)'d', (byte)'e', (byte)'f'} >>>>>>> bytes ==> byte[7] { 97, 98, 99, 0, 100, 101, 102 } >>>>>>> >>>>>>> jshell> new String(bytes, "US-ASCII") >>>>>>> $2 ==> "abc\000def" >>>>>>> ------------- >>>>>>> >>>>>>> >>>>>>>>> PerfDataEntry has null-terminated C string. So we should restore as >>>>>>>>> it >>>>>>>>> in >>>>>>>>> Java layer. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> If the issue is really "junk" beyond the \0 nul-terminator, how did >>>>>>>> we >>>>>>>> get >>>>>>>> such values? Shouldn't the byte array already be terminated at the >>>>>>>> NUL? >>>>>>>> Or >>>>>>>> is this just a raw representation of an array in the VM using the >>>>>>>> exact >>>>>>>> length of the array regardless of content? >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> I think JSnap shows the array regardless of content. >>>>>>> >>>>>>> >>>>>>>> I see CStringUtilities.getString() both stops when it encounters a 0 >>>>>>>> value, >>>>>>>> and uses the default file encoding (else ASCII). >>>>>>>> >>>>>>>> The fix seems perfectly reasonable, I'm just unclear where exactly >>>>>>>> this >>>>>>>> "bad" data should have been stopped. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> >>>>>>> Yasumasa >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> Yasumasa >>>>>>>>> >>>>>>>> >>>>>> >>>> >>