Sherman,
thanks for your detailed review.
Am 26.08.2009 20:02, Xueming Shen schrieb:
Ulf,
"get rid of sun.nio.cs.Surrogate" itself is not a sufficient enough
reason to add bunch of new supplementary
support related APIs into Character and CharBuffer classes. You
probably need to demonstrate more use
scenarios to show/prove why these new APIs are needed, why they can
not be easily achieved by using existing
APIs and especially you should ask yourself first if these APIs
will/should be used/needed by "general public"
or they are just "specifically" needed by your current
application/project.
Well yes, most of the use scenarios I see in charset processing. ;-) I
guess one can find more use scenarios in existing JDK e.g. visualization
and keyboard input of surrogate chars / supplementary code points in
Swing package. On the other hand I have total code base and footprint
size in mind.
Alternatively subclassing sun.nio.cs.Surrogate from Character would be
straight forward. (Unfortunately Character class is final. Is that
mandatory?)
... and this would not cover my enhancements in CharBuffer.
For example, the isBMP(int), it might be convenient, but it can be
easily archived by the one line code
(int)(char)codePoint == codePoint;
We still have isSupplementaryCodePoint(int), so IMHO additionally having
isBMPCodePoint(int) would be a good pair.
On the other hand, there are still many public methods in the API witch
can be "easily archived by one line code", especially in class
Character, even plenty of duplicate (or XYZ+1 valued) public constants.
e.g:
public static UnicodeBlock of(char c) {
return of((int)c);
}
or more readable form
codePoint < Character.MIN_SUPPLEMENTARY_COE_POINT;
correction: ;-)
codePoint >= MIN_CODE_POINT) && codePoint <
MIN_SUPPLEMENTARY_CODE_POINT
I'm not saying we can NOT add isBMP() (I know icu4j's UCharacter class
does have one), just
believe it's arguably not necessary.
Same as the pair
-- static char highSurrogate(int codePoint);
-- static char lowSurrogate(int codePoint);
-- CharBuffer.putCodePoint(int)
In "normal use case", the Character.toChars(int codePoint) is good
enough cover them all.
Hm, yes, but what's about performance ?
- instantiating a new char[]
- boxing the char(s) into
- invoking *bulk* method for just 1 (potentially, but rare 2) char(s)
(resolves to put(char[], int, int) including checkBounds(int, int, int) )
And yes, we can make CharBuffer#putCodePoint(int) using /Charset
helpers/ sun.nio.cs.Surrogate#high(int) + sun.nio.cs.Surrogate#low(int)
if there is no correspondent in Character class. But is that good design ?
-Ulf