Re: RFR JDK-8147531,To add named character construct \N{...} to support Unicode name property

Xueming Shen Thu, 21 Jan 2016 19:46:32 -0800

On 1/19/16 11:43 AM, Martin Buchholz wrote:

Many years ago I considered implementing this cool feature.
I thought that few would find it worth the cost - it would be hard to
keep the cost low if this feature is used only rarely.  You might want
an expiring cache of character name mappings, and the JDK doesn't have
such a thing yet.

As a matter of fact, the compressed data file is about 130k in the
filesystem. The inflated runtime data for the name string table is
about 700k. The cp->name lookup table is about 160k, and the name->cp
lookup mapping is about 400k+ (a little more space might be cut from
the homemade hashmap...). So the overall runtime cost is about 1.2mb
for this "cool" feature. Yes, it's a little bigger than the zt_tw
charset, but considering that you get a round-trip mapping between all
the codepoints and their names, 1.3mb might not be that expensive,
given that a "normal" pic now takes a couple of mb of memory.
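
For readers who want to see what the round-trip mapping looks like from user code, here is a minimal sketch. It assumes a JDK 9 or later runtime, where `Character.getName(int)` and `Character.codePointOf(String)` are available; the class and method names (`NameRoundTripDemo`, `roundTrips`) are made up for illustration and are not taken from the webrev.

```java
public class NameRoundTripDemo {

    // Returns true if codePoint -> name -> codePoint gets back to the start.
    // Character.getName returns null for unassigned code points.
    static boolean roundTrips(int codePoint) {
        String name = Character.getName(codePoint);
        return name != null && Character.codePointOf(name) == codePoint;
    }

    public static void main(String[] args) {
        // cp -> name lookup
        System.out.println(Character.getName('A'));
        // name -> cp lookup, printed as hex
        System.out.println(Integer.toHexString(
                Character.codePointOf("WHITE SMILING FACE")));
        // round trip for one sample code point (U+263A)
        System.out.println(roundTrips(0x263A));
    }
}
```

Both lookup tables mentioned above are exercised here: `getName` uses the cp->name direction and `codePointOf` the name->cp direction.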

How about you help take a look to see if we can squeeze out more
space? I really need a reviewer :-)

-sherman


(I haven't actually reviewed the implementation)

On Mon, Jan 18, 2016 at 11:52 PM, Xueming Shen <xueming.s...@oracle.com> wrote:

Hi,

Please help review the change to add \N support in regex.

Issue: https://bugs.openjdk.java.net/browse/JDK-8147531
webrev: http://cr.openjdk.java.net/~sherman/8147531/webrev
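
As a rough illustration of the proposed construct, the sketch below matches a character by its Unicode name using `\N{...}` in `java.util.regex.Pattern`. It assumes a JDK 9 or later runtime, where the construct is supported; the class and helper names (`NamedCharRegexDemo`, `isNamedChar`) are hypothetical and only serve the example.

```java
import java.util.regex.Pattern;

public class NamedCharRegexDemo {

    // True if input consists of exactly the one character whose
    // Unicode name is unicodeName.
    static boolean isNamedChar(String unicodeName, String input) {
        Pattern p = Pattern.compile("\\N{" + unicodeName + "}");
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNamedChar("LATIN SMALL LETTER A", "a"));
        // U+263A is WHITE SMILING FACE
        System.out.println(isNamedChar("WHITE SMILING FACE", "\u263A"));
        System.out.println(isNamedChar("LATIN SMALL LETTER A", "b"));
    }
}
```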

This is one of the items we were planning to address via JEP111
http://openjdk.java.net/jeps/111
https://bugs.openjdk.java.net/browse/JDK-8046101

Some of the constructs had already been added in an earlier release.
I'm planning to address the rest separately, as individual RFEs.

Thanks,
Sherman
