Hi Aleksey,

While it's true that the denser format will require fewer cachelines, my experience is that most strings fit in a single cacheline's worth of storage, maybe two in some cases; there's just a ton of them in the heap. So the heap footprint should be substantially reduced, but I'm not sure the cache pollution will be significantly reduced.
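To put rough numbers on the cacheline point, here is a back-of-the-envelope sketch in Java. It assumes 64-byte cachelines, a 16-byte array header (64-bit HotSpot with compressed oops), and a backing array that happens to start on a cacheline boundary; the class and method names are made up for illustration.

```java
class CacheLineSketch {
    static final int CACHE_LINE_BYTES = 64;   // assumption: typical x86 line size
    static final int ARRAY_HEADER_BYTES = 16; // assumption: 64-bit HotSpot, compressed oops

    // Cachelines touched by a backing array of `length` characters,
    // assuming the array starts on a cacheline boundary (best case).
    static int cacheLines(int length, int bytesPerChar) {
        int bytes = ARRAY_HEADER_BYTES + length * bytesPerChar;
        return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES; // round up
    }

    public static void main(String[] args) {
        for (int len : new int[] {8, 16, 24, 48, 100}) {
            System.out.println(len + " chars: char[] " + cacheLines(len, 2)
                    + " line(s), byte[] " + cacheLines(len, 1) + " line(s)");
        }
    }
}
```

Under these assumptions the 1-byte encoding only saves a cacheline once a string is longer than 24 characters; anything shorter fits in one line either way.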
There's currently no vectorization of char[] scanning (or any vectorization other than memcpy, for that matter). Are you referring to the recent Intel contributions here, or is there a plan to further improve vectorization in time for this JEP? Just curious.

I agree that string fusion is separate from this change, and we've discussed this before. It just seems to me that the lack of fusion is a bigger perf problem today, since even tiny/small strings (very common, IME) incur the indirection and bloat overhead, so I would have liked to see that addressed first. If you're saying that's fully on Valhalla's plate, OK, but I haven't seen anything proposed there yet.

Thanks

sent from my phone

On Jun 1, 2015 4:50 AM, "Aleksey Shipilev" <aleksey.shipi...@oracle.com> wrote:

> On 05/18/2015 05:35 PM, Vitaly Davidovich wrote:
> > This part is a bit unclear for the proposed changes. While it's true that
> > single byte encoding will be denser than two byte, most string ops end up
> > walking the backing store linearly; prefetch (either implicit h/w or
> > software-assisted) could hide the memory access latency.
>
> It will still pollute the caches though, and generally incur more
> instructions to be executed (e.g. think about the vectorized scan of the
> char[] array -- the compressed version will take 2x less instructions).
>
> > Personally, what I'd like to see is fusing storage of String with its
> > backing data, irrespective of encoding (i.e. removing the indirection to
> > fetch the char[] or byte[]).
>
> This is not the target for this JEP, and the groundwork for
> String-char[] fusion is handled elsewhere (I put my hopes at Valhalla
> that will explore the exact path to add the "exotic" object shapes into
> the runtime).
>
> String-char[] fusion neither conflicts with the Compact String
> optimization, nor provides the alternative. Removing the "excess"
> headers from backing char[] array would solve the "static" overhead in
> Strings, while the String compaction would further compact the backing
> storage.
>
> Thanks,
> -Aleksey.
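For a rough sense of how the "static" overhead and the compaction savings compare on a small string, here is a sketch under assumed HotSpot numbers: 12-byte object header, 16-byte array header, 4-byte compressed references, 8-byte object alignment. The fused layout is purely hypothetical and is not something specified by the JEP or by Valhalla; all names are invented for illustration.

```java
class StringFootprintSketch {
    static final int OBJECT_HEADER = 12; // assumption: 64-bit HotSpot, compressed oops
    static final int ARRAY_HEADER = 16;  // assumption: object header + 4-byte length
    static final int REF = 4;            // assumption: compressed reference
    static final int INT = 4;
    static final int ALIGN = 8;          // assumption: default object alignment

    // Round a size up to the next multiple of the object alignment.
    static int align(int bytes) {
        return (bytes + ALIGN - 1) / ALIGN * ALIGN;
    }

    // Today's shape: String object (header + value ref + hash) plus a separate backing array.
    static int separate(int length, int bytesPerChar) {
        int stringObject = align(OBJECT_HEADER + REF + INT);
        int array = align(ARRAY_HEADER + length * bytesPerChar);
        return stringObject + array;
    }

    // Hypothetical fused shape: one header, hash, length, then the data inline.
    static int fused(int length, int bytesPerChar) {
        return align(OBJECT_HEADER + INT + INT + length * bytesPerChar);
    }

    public static void main(String[] args) {
        int len = 10;
        System.out.println("separate char[]: " + separate(len, 2) + " bytes");
        System.out.println("separate byte[]: " + separate(len, 1) + " bytes");
        System.out.println("fused, 2-byte:   " + fused(len, 2) + " bytes");
        System.out.println("fused, 1-byte:   " + fused(len, 1) + " bytes");
    }
}
```

With these assumed sizes, a 10-character string takes 64 bytes today, 56 bytes with compaction alone, 40 bytes with fusion alone, and 32 bytes with both, so the two savings are independent and add up.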