Consider, for example, the type varchar(n).
To enforce the upper limit it is no longer enough to check
string.length().
One would have to iterate through the string to find out how
many characters it contains.
- rami
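A minimal sketch of such a length check, assuming the limit n is meant to count Unicode code points rather than UTF-16 code units (the thread does not say how H2 defines it). String.codePointCount is a standard JDK method that performs the iteration internally; the names VarcharLimit and fitsVarchar are hypothetical and not part of H2.

```java
final class VarcharLimit {
    // Hypothetical helper: true if value holds at most maxChars code points.
    static boolean fitsVarchar(String value, int maxChars) {
        // codePointCount counts a surrogate pair as one character.
        return value.codePointCount(0, value.length()) <= maxChars;
    }

    public static void main(String[] args) {
        String s = "A\uD840\uDC00B"; // 'A', U+20000, 'B'
        System.out.println(s.length() <= 3);   // code units: 4, so this check fails
        System.out.println(fitsVarchar(s, 3)); // code points: 3, so this check passes
    }
}
```

The count is still linear in the length of the string, so the cost concern raised above remains; only the loop is hidden.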
On 27.6.2011 2:38, cowwoc wrote:
Correct:
http://stackoverflow.com/questions/1527856/how-can-i-iterate-through-the-unicode-codepoints-of-a-java-string
For what it's worth, I don't find codepoint iteration that much
more complicated than char iteration. The main obstacle is to retrofit
existing code.
All we need is something like FindBugs to point out which code
snippets need to be retrofitted.
Gili
On 26/06/2011 6:27 PM, Rami Ojares wrote:
Ok, I get it now.
But I am pretty sure that supporting supplementary characters will
not be as simple as just using the int methods in the Character class.
Every iteration over a character sequence would need new logic, and I am
pretty sure the effort would outweigh the benefit of supporting
supplementary characters (unless you do big business in China,
where the government demands the support).
"A\uD840\uDC00B".length() returns 4 even though the string has only 3
characters, because
a char represents only a UTF-16 code unit, not a code point that can be
uniquely mapped to a character.
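The difference between code units and code points can be checked directly with standard JDK calls. This is only an illustration of the statement above; the class name SurrogatePairDemo and its two helper methods are made up for the example.

```java
final class SurrogatePairDemo {
    // Number of UTF-16 code units (what String.length() reports).
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points (a surrogate pair counts once).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = "A\uD840\uDC00B";
        System.out.println(codeUnits(s));                           // 4
        System.out.println(codePoints(s));                          // 3
        System.out.println(Integer.toHexString(s.codePointAt(1)));  // 20000
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // true
    }
}
```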
So iterating a character sequence looking for whitespace would have
to be something like
    char[] ca = str.toCharArray();
    for (int i = 0; i < ca.length; i++) {
        int cp = Character.codePointAt(ca, i);
        // A supplementary code point occupies two chars; skip the low surrogate.
        if (Character.charCount(cp) == 2) i++;
        if (Character.isWhitespace(cp)) {
            ...
        }
        ...
    }
And so on and so forth.
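For comparison, here is a self-contained variant of that loop which works on the String directly and advances the index by Character.charCount. It is a sketch only: the class WhitespaceScan and the method indexOfWhitespace are hypothetical names, and returning the char index of the match is a choice made for the example.

```java
final class WhitespaceScan {
    // Returns the char index of the first whitespace code point, or -1 if none.
    static int indexOfWhitespace(String str) {
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            if (Character.isWhitespace(cp)) {
                return i;
            }
            // Advance by 1 for BMP characters, by 2 for surrogate pairs.
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfWhitespace("A\uD840\uDC00 B"));
        System.out.println(indexOfWhitespace("AB"));
    }
}
```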
So even though you are right, I think supporting supplementary
characters all the way might be difficult,
since iterating over characters is such a common task
and usage of supplementary characters is so rare.
Just casting a char to int will not make any difference.
- rami
On 27.6.2011 0:33, cowwoc wrote:
Hi Rami,
Yes. See
http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
for more information.
Gili
On 26/06/2011 5:32 PM, Rami Ojares wrote:
But those characters cannot be represented as chars inside the JVM.
And what about String? Can it contain characters that are not of type
char?
- rami
On 27.6.2011 0:27, cowwoc wrote:
Hi Rami,
You're right that nbsp is treated the same by both methods (my
mistake!) but H2 should still use the int variant because it
accounts for Unicode characters that don't fit in 16 bits.
Gili
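A small sketch of where the two overloads differ, using Character.isLetter and the supplementary character U+20000 purely as an illustration (the thread itself discusses isWhitespace, and the class and method names here are made up). For a character in the 16-bit range such as nbsp the char and int overloads agree; for a supplementary character only the int overload sees the whole code point.

```java
final class IntVariantDemo {
    // char overload: sees a single UTF-16 code unit.
    static boolean letterByChar(char c) {
        return Character.isLetter(c);
    }

    // int overload: sees a full code point.
    static boolean letterByCodePoint(int cp) {
        return Character.isLetter(cp);
    }

    public static void main(String[] args) {
        char nbsp = '\u00A0';
        // Both overloads agree for a 16-bit character.
        System.out.println(Character.isWhitespace(nbsp) == Character.isWhitespace((int) nbsp)); // true

        String s = "\uD840\uDC00"; // U+20000, a CJK ideograph
        System.out.println(letterByChar(s.charAt(0)));           // false: a lone surrogate
        System.out.println(letterByCodePoint(s.codePointAt(0))); // true
    }
}
```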
On 26/06/2011 5:08 PM, Rami Ojares wrote:
On 26.6.2011 23:21, cowwoc wrote:
we should really be using isWhitespace(int) because it is
newer/better. In general you're supposed to ignore the methods
that take a char parameter.
I don't understand what you mean.
A recent JDK 1.6 returns the following:
Character.isSpaceChar('\u00A0') -> true
Character.isSpaceChar((int) '\u00A0') -> true
Character.isWhitespace('\u00A0') -> false
Character.isWhitespace((int) '\u00A0') -> false
So to me there seems to be no difference between the char and int
methods.
(Note: the argument character is a non-breaking space, nbsp.)
- rami
--
You received this message because you are subscribed to the Google Groups "H2
Database" group.