Hi Yasumasa, your patch will change behavior of jsnap if file.encoding is set to something different than US-ASCII.
Old version uses, hard wired, US-ASCII. You now use jvm/hotspot/utilities/CStringUtilities.java, which uses private static String encoding = System.getProperty("file.encoding", "US-ASCII"); and thus may use a different encoding. Best Regards, Thomas On Wed, Jun 13, 2018 at 2:09 PM, Yasumasa Suenaga <yasue...@gmail.com> wrote: > PING: Could you review it? > I want to merge this change before JDK 11 RDP 1. > >>>> webrev: >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.00/ >>>> >>>> JBS: >>>> https://bugs.openjdk.java.net/browse/JDK-8204531 > > > > Yasumasa > > > On 2018/06/08 15:38, Yasumasa Suenaga wrote: >> >> Hi David, >> >> 2018-06-08 10:47 GMT+09:00 David Holmes <david.hol...@oracle.com>: >>> >>> Hi Yasumasa, >>> >>> On 7/06/2018 10:43 PM, Yasumasa Suenaga wrote: >>>> >>>> >>>> Hi all, >>>> >>>> Please review this change: >>>> >>>> webrev: >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8204531/webrev.00/ >>>> >>>> JBS: >>>> https://bugs.openjdk.java.net/browse/JDK-8204531 >>>> >>>> >>>> We can use `jhsdb jsnap` to check all PerfData. >>>> String values in PerfData are defined as jbyte array, but JSnap cannot >>>> handle it well as following: >>>> >>>> ``` >>>> $ jhsdb jsnap --pid 28542 --all | less >>>> >>>> sun.gc.cause=No >>>> >>>> GC^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ >>>> ``` >>>> >>>> You can see this value via `less` and `vim` on Linux. `^@` shows it is >>>> non-ascii character. >>> >>> >>> >>> So your synopsis says "remove unused chars following \0". Is that really >>> what is happening? I would have expected "new String(bytes, encoding)" to >>> stop processing bytes when it encounters a NUL! >> >> >> String c'tor will continue to parse byte array even if it finds '\0'. >> You can see it on JShell as below: >> >> ------------- >> jshell> var bytes = new byte[]{(byte)'a', (byte)'b', (byte)'c', >> (byte)'\0', (byte)'d', (byte)'e', (byte)'f'} >> bytes ==> byte[7] { 97, 98, 99, 0, 100, 101, 102 } >> >> jshell> new String(bytes, "US-ASCII") >> $2 ==> "abc\000def" >> ------------- >> >> >>>> PerfDataEntry has null-terminated C string. So we should restore as it >>>> in >>>> Java layer. >>> >>> >>> >>> If the issue is really "junk" beyond the \0 nul-terminator, how did we >>> get >>> such values? Shouldn't the byte array already be terminated at the NUL? >>> Or >>> is this just a raw representation of an array in the VM using the exact >>> length of the array regardless of content? >> >> >> I think JSnap shows the array regardless of content. >> >> >>> I see CStringUtilities.getString() both stops when it encounters a 0 >>> value, >>> and uses the default file encoding (else ASCII). >>> >>> The fix seems perfectly reasonable, I'm just unclear where exactly this >>> "bad" data should have been stopped. >> >> >> Thanks! >> >> >> Yasumasa >> >> >>> Thanks, >>> David >>> >>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>> >