Sherman,

thanks for pointing me to icu4j. I must admit, I had never heard of it. :-(
I took a quick look at http://icu-project.org/download/4.2.html#ICU4J etc. Very interesting stuff.

Now I'm wondering: why don't they use the Character class? Are they missing those 3..4 methods? ;-) Maybe it's time to talk with them, so they could reduce their footprint, which in turn would reduce the JDK footprint.

-Ulf


On 07.09.2009 20:27, Xueming Shen wrote:

Ulf,

sun.text.normalizer might not be a good example; it is an "independent third-party" package that we licensed. We usually try our best to keep this kind of package as-is for maintenance reasons: you don't want to redo the same trivial updates all over the place whenever there is a version upgrade. Yes, I did make some modifications when I dropped in the package years ago, when I was the owner. Those were for a better functional fit with the rest of the code and mostly for better performance (especially faster startup). You should be able to "smell" that it is part of icu4j, so it has its own UTF-16 handling :-)

sherman


Ulf Zibis wrote:

Looking deeper in the classes you can find many use scenarios.
E.g. class sun.text.normalizer.UTF16 is more or less a duplicate of sun.nio.cs.Surrogate.


I'm not saying we can NOT add isBMP() (I know icu4j's UCharacter class does have one), just
believe it's arguably not necessary.

Same as the pair

-- static char highSurrogate(int codePoint);

--> sun.text.normalizer.UTF16.getLeadSurrogate(int char32)

-- static char lowSurrogate(int codePoint);

--> sun.text.normalizer.UTF16.getTrailSurrogate(int char32)

-- CharBuffer.putCodePoint(int)

Maybe it would be better to add appendCodePoint(int cp) to CharSequence, or something similar.
See: sun.text.normalizer.UTF16.append(StringBuffer target, int char32)
See: java.text.CharacterIterator.CodePointIterator
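
To make the methods discussed above concrete, here is a minimal sketch of what they might look like. The class name CodePointSketch is hypothetical, and the method names and signatures simply mirror the proposals in this thread; this is not taken from the JDK or icu4j sources, and putCodePoint is written as a static helper rather than a real CharBuffer method.

```java
import java.nio.BufferOverflowException;
import java.nio.CharBuffer;

// Hypothetical helper class; names mirror the methods proposed in this thread.
final class CodePointSketch {
    private CodePointSketch() {}

    // True if the code point fits in a single char (Basic Multilingual Plane).
    static boolean isBMP(int codePoint) {
        return (codePoint >>> 16) == 0;
    }

    // Leading (high) surrogate of a supplementary code point.
    // The result is only meaningful for code points above U+FFFF.
    static char highSurrogate(int codePoint) {
        int offset = Character.MIN_HIGH_SURROGATE
                - (Character.MIN_SUPPLEMENTARY_CODE_POINT >>> 10);
        return (char) ((codePoint >>> 10) + offset);
    }

    // Trailing (low) surrogate of a supplementary code point.
    static char lowSurrogate(int codePoint) {
        return (char) ((codePoint & 0x3FF) + Character.MIN_LOW_SURROGATE);
    }

    // Assumed stand-in for the proposed CharBuffer.putCodePoint(int):
    // writes one or two chars, never half of a surrogate pair.
    static CharBuffer putCodePoint(CharBuffer target, int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("Invalid code point: " + codePoint);
        }
        if (isBMP(codePoint)) {
            target.put((char) codePoint);
        } else {
            if (target.remaining() < 2) {
                throw new BufferOverflowException();
            }
            target.put(highSurrogate(codePoint));
            target.put(lowSurrogate(codePoint));
        }
        return target;
    }

    public static void main(String[] args) {
        CharBuffer buf = CharBuffer.allocate(4);
        putCodePoint(buf, 'A');
        putCodePoint(buf, 0x1D11E); // MUSICAL SYMBOL G CLEF, outside the BMP
        buf.flip();
        System.out.println(buf.length()); // number of chars written
    }
}
```

The surrogate arithmetic can be cross-checked against the existing Character.toChars(int), which returns the same pair for a supplementary code point.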



-Ulf




