Ok, I get it now.
But I am pretty sure that supporting supplementary characters will not
be as simple as just using the int methods in the Character class.
Every iteration over a character sequence would need new logic, and I am
pretty sure the effort would outweigh the benefit of supporting
supplementary characters (unless you do big business in China, where
the government demands that support).
"A\uD840\uDC00B".length() returns 4 even though it only has 3 characters
because a char represents only a UTF-16 code unit, not a code point that can be
uniquely mapped to a character.
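
To make the code unit vs. code point distinction concrete, here is a small sketch. The class and method names are made up for illustration; only String.length() and String.codePointCount() from the JDK are used.

```java
// Hypothetical demo class; the names are illustrative only.
final class CodeUnitDemo {
    /** Number of UTF-16 code units, which is what String.length() reports. */
    static int codeUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points; a surrogate pair counts once. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = "A\uD840\uDC00B"; // 'A', U+20000, 'B'
        System.out.println(codeUnits(s));  // prints 4
        System.out.println(codePoints(s)); // prints 3
    }
}
```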
So iterating over a character sequence looking for whitespace would have to
be something like
char[] ca = str.toCharArray();
for (int i = 0; i < ca.length; i++) {
    int cp = Character.codePointAt(ca, i);
    // a supplementary code point occupies two chars, so skip the low surrogate
    if (Character.charCount(cp) == 2) i++;
    if (Character.isWhitespace(cp)) {
        ...
    }
    ...
}
And so on and so forth.
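
For comparison, a minimal sketch of the same kind of scan written directly against the String, advancing the index by Character.charCount(cp) instead of patching up i inside the loop. The class and method names are hypothetical, and this is only one possible way to write it, not something taken from H2.

```java
// Hypothetical helper; not part of H2.
final class WhitespaceScan {
    /**
     * Returns the index (in UTF-16 code units) of the first whitespace
     * code point in str, or -1 if there is none.
     */
    static int indexOfWhitespace(String str) {
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            if (Character.isWhitespace(cp)) {
                return i;
            }
            // advance by 1 for BMP characters, by 2 for surrogate pairs
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

The loop body stays about as short as the char-based version, but every loop that walks a string would still have to be rewritten this way.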
So even though you are right, I think supporting supplementary characters
all the way might be difficult,
since iterating over characters is such a common task
and the use of supplementary characters is so rare.
Just casting a char to int will not make any difference.
- rami
On 27.6.2011 0:33, cowwoc wrote:
Hi Rami,
Yes. See
http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
for more information.
Gili
On 26/06/2011 5:32 PM, Rami Ojares wrote:
But those characters cannot be represented as chars inside the JVM. And
what about String? Can it contain characters that are not of type char?
- rami
On 27.6.2011 0:27, cowwoc wrote:
Hi Rami,
You're right that nbsp is treated the same by both methods (my
mistake!) but H2 should still use the int variant because it
accounts for Unicode characters that don't fit in 16 bits.
Gili
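
A small sketch of where the char and int variants can actually disagree. It uses Character.isLetter rather than isWhitespace, on the assumption that there is no whitespace character outside the 16-bit range to test with. The class and method names are made up for illustration.

```java
// Hypothetical demo class; the names are illustrative only.
final class SupplementaryDemo {
    // U+20000, a CJK ideograph that does not fit in a single char
    static final int CP = 0x20000;

    /** The int variant sees the whole code point. */
    static boolean letterByCodePoint() {
        return Character.isLetter(CP);
    }

    /** The char variant only ever sees one surrogate at a time. */
    static boolean letterByChar() {
        char[] units = Character.toChars(CP); // high and low surrogate
        return Character.isLetter(units[0]) || Character.isLetter(units[1]);
    }

    public static void main(String[] args) {
        System.out.println(letterByCodePoint()); // prints true
        System.out.println(letterByChar());      // prints false
    }
}
```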
On 26/06/2011 5:08 PM, Rami Ojares wrote:
On 26.6.2011 23:21, cowwoc wrote:
we should really be using isWhitespace(int) because it is
newer/better. In general you're supposed to ignore the methods
that take a char parameter.
I don't understand what you mean.
A recent JDK 1.6 returns the following:
Character.isSpaceChar('\u00A0') -> true
Character.isSpaceChar((int) '\u00A0') -> true
Character.isWhitespace('\u00A0') -> false
Character.isWhitespace((int) '\u00A0') -> false
So to me there seems to be no difference between char and int methods.
(Note: The argument character is a nbsp.)
- rami
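
The four calls above can be reproduced with a sketch like the following. The class and method names are hypothetical; the nbsp is written as an escape so it cannot be confused with an ordinary space.

```java
// Hypothetical demo class; the names are illustrative only.
final class NbspDemo {
    static final char NBSP = '\u00A0'; // no-break space

    /** Results of the char and int variants, in the order listed above. */
    static boolean[] results() {
        return new boolean[] {
            Character.isSpaceChar(NBSP),
            Character.isSpaceChar((int) NBSP),
            Character.isWhitespace(NBSP),
            Character.isWhitespace((int) NBSP)
        };
    }

    public static void main(String[] args) {
        // prints [true, true, false, false]
        System.out.println(java.util.Arrays.toString(results()));
    }
}
```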