On 1/19/16 11:43 AM, Martin Buchholz wrote:
Many years ago I considered implementing this cool feature.
I thought that few would find it worth the cost - it would be hard to
keep the cost low if this feature is used only rarely.  You might want
an expiring cache of character name mappings, and the JDK doesn't have
such a thing yet.

As a matter of fact, the compressed data file is about 130k in the file system. The inflated runtime data for the name string table is about 700k, the cp->name lookup table is about 160k, and the name->cp lookup mapping is about 400k+ (a little more space might still be cut from the homemade hashmap...). So the overall runtime cost is about 1.2mb for this "cool" feature. Yes, it's a little bigger than the zt_tw charset, but considering that you get a round-trip mapping between all the codepoints and their names, 1.3mb might not be that expensive, given that a "normal"
pic now takes a couple of mb of memory.
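
For illustration, here is a minimal sketch of the round trip those two lookup tables serve. It assumes the public entry points are Character.getName(int) for cp->name and Character.codePointOf(String) for name->cp, as in Java 9 and later; the class and method names are made up for the example and are not taken from the webrev.

```java
final class NameRoundTrip {
    // Hypothetical helper: maps a code point to its Unicode name and back.
    // Character.getName returns null for unassigned code points, so this
    // sketch is only meant for assigned ones.
    static int roundTrip(int codePoint) {
        String name = Character.getName(codePoint);   // cp -> name lookup
        return Character.codePointOf(name);           // name -> cp lookup
    }

    public static void main(String[] args) {
        int cp = 'A';
        System.out.println(Character.getName(cp));
        System.out.println(roundTrip(cp) == cp);
    }
}
```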

How about helping take a look to see if we can squeeze out more space? I really need
a reviewer :-)

-sherman


(I haven't actually reviewed the implementation)



On Mon, Jan 18, 2016 at 11:52 PM, Xueming Shen <xueming.s...@oracle.com> wrote:
Hi,

Please help review the change to add \N support in regex.
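
As a rough sketch of what the construct looks like from the user side, assuming the \N{name} syntax shipped in java.util.regex.Pattern in Java 9 and later (the class and method names below are made up for the example):

```java
import java.util.regex.Pattern;

final class NamedCharRegex {
    // Hypothetical helper: true if the input is exactly the one character
    // whose Unicode name is WHITE SMILEY FACE (U+263A).
    static boolean isSmiley(String input) {
        return Pattern.matches("\\N{WHITE SMILEY FACE}", input);
    }

    public static void main(String[] args) {
        System.out.println(isSmiley("\u263A"));
        System.out.println(isSmiley("A"));
    }
}
```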

Issue: https://bugs.openjdk.java.net/browse/JDK-8147531
webrev: http://cr.openjdk.java.net/~sherman/8147531/webrev

This is one of the items we were planning to address via JEP111
http://openjdk.java.net/jeps/111
https://bugs.openjdk.java.net/browse/JDK-8046101

Some of the constructs were already added in earlier releases. I'm
planning to address the rest separately, as individual RFEs.

Thanks,
Sherman
