Yuri Gaevsky wrote:
Hi Sherman,

A couple of minor comments:
  - There is a typo (Uniocde) in Character.UnicodeScript.forName(java.lang.String):
        "Returns the UnicodeScript with the given Uniocde script name or the script
         name alias. "
  - Shouldn't the method be more specific in respect of inner spaces, underscores
    and so on (as [1] does)?

Regards,
-Yuri

[1] http://java.sun.com/javase/6/docs/api/java/lang/Character.UnicodeBlock.html#forName(java.lang.String)



Thanks Yuri.

The typo has been fixed and the webrev has been updated.

The difference between block names and script names is that the block names defined by Unicode in Blocks.txt[1] use the space character and the hyphen as separators (instead of the underscore), for example "Latin-1 Supplement", which makes it impossible to use the name directly as an identifier in Java. UnicodeBlock.forName() therefore has to accept both the original/canonical block name and the "text representation" of the UnicodeBlock identifier.
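
To illustrate the block-name case, here is a minimal sketch against the existing Character.UnicodeBlock API. The class name BlockNameLookup and the helper sameBlock are made up for this example; only forName() and the LATIN_1_SUPPLEMENT constant come from the JDK.

```java
final class BlockNameLookup {

    /** Hypothetical helper: true if both spellings resolve to the same block constant. */
    static boolean sameBlock(String first, String second) {
        return Character.UnicodeBlock.forName(first) == Character.UnicodeBlock.forName(second);
    }

    public static void main(String[] args) {
        // Canonical name as written in Blocks.txt (space and hyphen as separators).
        Character.UnicodeBlock canonical = Character.UnicodeBlock.forName("Latin-1 Supplement");

        // Text representation of the Java identifier (underscores only).
        Character.UnicodeBlock identifier = Character.UnicodeBlock.forName("LATIN_1_SUPPLEMENT");

        // Both lookups should yield the same constant.
        System.out.println(canonical == identifier);
        System.out.println(canonical == Character.UnicodeBlock.LATIN_1_SUPPLEMENT);
    }
}
```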

For the script name, while tr24[2] states that "the presence of hyphen or underscore is optional", Scripts.txt[3] strictly uses only the underscore in script names. I considered also allowing "loose-match" for the script name, to accept names that use a space or hyphen in place of "_", but decided to stick with the canonical name (there are actually only a few names that need this). Well, I'm still open on this one, if people think "loose-match" is important.

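For anyone who does want the loose match on the caller's side, a rough sketch follows. It assumes forName() behaves strictly as described above (underscore-separated names and aliases only). The class LooseScriptNames and the helpers looseForName and strictAccepts are hypothetical and not part of the proposed API. The sketch only maps spaces and hyphens to underscores; it does not cover dropping the separator entirely, which tr24 also permits.

```java
final class LooseScriptNames {

    /**
     * Hypothetical helper: replaces spaces and hyphens with underscores,
     * then delegates to the strict Character.UnicodeScript.forName().
     */
    static Character.UnicodeScript looseForName(String scriptName) {
        String normalized = scriptName.trim().replace(' ', '_').replace('-', '_');
        return Character.UnicodeScript.forName(normalized);
    }

    /** Hypothetical helper: true if the strict lookup accepts the name as given. */
    static boolean strictAccepts(String scriptName) {
        try {
            Character.UnicodeScript.forName(scriptName);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when the name is neither a script name nor an alias.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(strictAccepts("Old_Italic"));   // canonical spelling from Scripts.txt
        System.out.println(strictAccepts("Old Italic"));   // space instead of underscore
        System.out.println(looseForName("Old Italic"));    // normalized before the strict lookup
        System.out.println(looseForName("old-italic"));    // hyphen and lower case
    }
}
```
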
I added "The en_US locale's case mapping rules are used to provide case-insensitive string comparisons for script name validation", as suggested.
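
A small sketch of why a fixed locale matters for the case mapping: under Turkish rules, lower-case "i" upper-cases to a dotted capital I, so a locale-dependent mapping would no longer produce "LATIN". The class ScriptNameCase and the helper upperTurkish are hypothetical names for this illustration; the Turkish locale is just a convenient counterexample.

```java
import java.util.Locale;

final class ScriptNameCase {

    /** Hypothetical helper: upper-cases using Turkish case mapping rules. */
    static String upperTurkish(String s) {
        return s.toUpperCase(Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        // Lookups are case-insensitive, so all of these name the same constant.
        System.out.println(Character.UnicodeScript.forName("latin"));
        System.out.println(Character.UnicodeScript.forName("LaTiN"));

        // With en_US rules "latin" maps to "LATIN" ...
        System.out.println("latin".toUpperCase(Locale.US).equals("LATIN"));

        // ... but with Turkish rules the 'i' becomes U+0130, so the result differs.
        System.out.println(upperTurkish("latin").equals("LATIN"));
    }
}
```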

I also replaced the "Character.UnicodeScript object/instance" with "constant" in several places to be consistent with
the inherited methods valueOf() and values().

The RFE IDs are:
4860714: Make Unicode scripts available for use in regular expressions
6945564: Unicode script support in Character class
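
As a rough illustration of what the two RFEs cover together, the sketch below uses the script syntax of java.util.regex as I understand it (the "Is" prefix and the "script=" keyword) alongside Character.UnicodeScript.of(). The class ScriptRegexDemo and the helper allInScript are hypothetical; the regex syntax itself is not spelled out in this mail, so treat it as an assumption to check against the Pattern javadoc.

```java
import java.util.regex.Pattern;

final class ScriptRegexDemo {

    /** Hypothetical helper: true if every character of text belongs to the named script. */
    static boolean allInScript(String text, String scriptName) {
        return Pattern.matches("\\p{Is" + scriptName + "}+", text);
    }

    public static void main(String[] args) {
        String greek = "\u03B1\u03B2\u03B3"; // alpha, beta, gamma

        // Script lookup for a single code point (6945564).
        System.out.println(Character.UnicodeScript.of(greek.codePointAt(0)));

        // Script classes in regular expressions (4860714).
        System.out.println(allInScript(greek, "Greek"));
        System.out.println(allInScript("abc", "Greek"));
        System.out.println(Pattern.matches("\\p{script=Latin}+", "abc"));
    }
}
```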

-Sherman


[1] http://www.unicode.org/Public/5.2.0/ucd/Blocks.txt
[2] http://www.unicode.org/reports/tr24/
[3] http://www.unicode.org/Public/UNIDATA/Scripts.txt
