Sherman,
thanks for insisting on icu4j. I must admit, I had never heard of it. :-(
I took a quick look at http://icu-project.org/download/4.2.html#ICU4J
etc. Very interesting stuff.
Now I'm wondering: why don't they use the Character class? Are they
missing those 3..4 methods? ;-)
Maybe it's time to talk with them, so they could reduce their
footprint, which in turn would reduce the JDK footprint.
-Ulf
On 07.09.2009 20:27, Xueming Shen wrote:
Ulf,
sun.text.normalizer might not be a good example; it is an "independent
third-party" package we licensed. Usually
we try our best to keep this kind of package as-is, for maintenance
reasons: you don't want to repeat the same trivial
updates all over the place again and again whenever there is a version
upgrade. Yes, I did make some modifications
when I dropped in the package years ago, when I was the owner; that was
to make it fit functionally better with the rest
of the code and mostly for better performance (especially faster
startup). You should be able to "smell" that it is
part of icu4j, so it has its own utf16 handling:-)
sherman
Ulf Zibis wrote:
Looking deeper into the classes you can find many use cases.
E.g. class sun.text.normalizer.UTF16 is more or less a duplicate of
sun.nio.cs.Surrogate.
I'm not saying we can NOT add isBMP() (I know icu4j's UCharacter
class does have one), I just
believe it's arguably not necessary.
Same as the pair
-- static char highSurrogate(int codePoint);
--> sun.text.normalizer.UTF16.getLeadSurrogate(int char32)
-- static char lowSurrogate(int codePoint);
--> sun.text.normalizer.UTF16.getTrailSurrogate(int char32)
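For illustration, here is a minimal sketch of what such helpers might compute. The class name SurrogateSketch and the three method bodies are assumptions made for this example, not the actual JDK or icu4j code; only the Character constants and Character.toChars are existing API.

```java
class SurrogateSketch {
    // Hypothetical helper: true if the code point fits into a single char (plane 0).
    static boolean isBMP(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // Hypothetical helper: leading surrogate of a supplementary code point.
    // The caller is assumed to pass a value in the range 0x10000..0x10FFFF.
    static char highSurrogate(int codePoint) {
        return (char) ((codePoint >>> 10)
                + (Character.MIN_HIGH_SURROGATE
                   - (Character.MIN_SUPPLEMENTARY_CODE_POINT >>> 10)));
    }

    // Hypothetical helper: trailing surrogate of a supplementary code point.
    static char lowSurrogate(int codePoint) {
        return (char) ((codePoint & 0x3FF) + Character.MIN_LOW_SURROGATE);
    }

    public static void main(String[] args) {
        int cp = 0x1D11E; // MUSICAL SYMBOL G CLEF, outside the BMP
        char[] expected = Character.toChars(cp); // existing API, used as reference
        System.out.println(isBMP(cp));
        System.out.println(highSurrogate(cp) == expected[0]);
        System.out.println(lowSurrogate(cp) == expected[1]);
    }
}
```

The sketch does no range checking; a real API would have to decide whether to validate the argument or leave that to the caller.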
-- CharBuffer.putCodePoint(int)
Maybe it would be better to add appendCodePoint(int cp) to
CharSequence, or something similar.
See: sun.text.normalizer.UTF16.append(StringBuffer target, int char32)
See: java.text.CharacterIterator.CodePointIterator
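A rough sketch of both variants, just to make the idea concrete. The class CodePointAppendSketch and both helper methods are hypothetical; they only combine existing calls (Character.toChars, CharBuffer.put(char[]), Appendable.append). Since CharSequence itself is read-only, the sketch targets Appendable instead, which is an assumption about where such a method could live.

```java
import java.io.IOException;
import java.nio.CharBuffer;

class CodePointAppendSketch {
    // Hypothetical stand-in for the proposed CharBuffer.putCodePoint(int):
    // writes one or two chars, depending on the code point.
    static CharBuffer putCodePoint(CharBuffer target, int codePoint) {
        return target.put(Character.toChars(codePoint));
    }

    // Hypothetical appendCodePoint for any Appendable
    // (StringBuilder, StringBuffer, CharBuffer, Writer, ...).
    static <A extends Appendable> A appendCodePoint(A target, int codePoint)
            throws IOException {
        target.append(new String(Character.toChars(codePoint)));
        return target;
    }

    public static void main(String[] args) throws IOException {
        int cp = 0x1D11E; // supplementary code point, needs a surrogate pair

        CharBuffer buf = CharBuffer.allocate(4);
        putCodePoint(buf, 'A'); // BMP code point, one char
        putCodePoint(buf, cp);  // supplementary code point, two chars
        buf.flip();
        System.out.println(buf.length());

        StringBuilder sb = appendCodePoint(new StringBuilder(), cp);
        System.out.println(sb.codePointAt(0) == cp);
    }
}
```

Character.toChars throws IllegalArgumentException for an invalid code point, so both helpers inherit that check. StringBuilder and StringBuffer already offer appendCodePoint(int); the generic variant would only matter for the other Appendable implementations.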
-Ulf