Re: The store for byte strings

2018-06-09 Thread John Rose
On Jun 9, 2018, at 12:18 PM, Xueming Shen wrote:
> Ideally I would assume we would want to have a UTF-8 internal storage for String, even though in theory UTF-8 is supposed to be used externally and UTF-16 to be the internal one.
Separately from my point about ByteSequence, I agree that "doubling

Re: The store for byte strings

2018-06-09 Thread John Rose
I'm glad to see you are thinking about this, Florian. You appear to be aiming at a way to compactly store and manipulate series of octets (in an arbitrary encoding) with an emphasis on using those octets to represent strings, in the usual sense of character sequences. Would you agree that this

Re: The store for byte strings

2018-06-09 Thread Xueming Shen
On 6/9/18, 3:27 AM, Florian Weimer wrote: Lately I've been thinking about string representation. The world turned out not to be UCS-2 or UTF-16, after all, and we often have to deal with strings generally encoded as ASCII or UTF-8, but they aren't always encoded this way (and there might not even

Re: Review Request: 8204648: test/jdk/tools/launchers/SourceMode.java fails with long shebang line

2018-06-09 Thread joe darcy
Skipping the shebang tests is fine as a workaround, Mandy; thanks, -Joe On 6/8/2018 9:57 PM, mandy chung wrote: I ran into some issues with the shebang tests. Since Jon is on vacation, I revised the patch to skip the shebang test temporarily until he returns. Mandy diff --git

Re: RFR: 8199871: Deprecate pack200 and unpack200 tools

2018-06-09 Thread Henry Jen
I revised the webrev so that the warning shows only the executable name for unpack200 instead of the full path on Windows, which I believe is what was intended all along. The test is also revised to use unpack200.exe on the Windows platform. The updated version is at

The store for byte strings

2018-06-09 Thread Florian Weimer
Lately I've been thinking about string representation. The world turned out not to be UCS-2 or UTF-16, after all, and we often have to deal with strings generally encoded as ASCII or UTF-8, but they aren't always encoded this way (and there might not even be a charset declaration, see the ELF