Artur Biesiadowski <[EMAIL PROTECTED]> writes:

> 1) isWhiteSpace should be isWhitespace

Fixed.

> 2) readChar should not return null on not-defined characters.

I'll look into this.

> 3) There is no need for putting char ch field in CharAttr - it is not
> used anywhere

I'll look into this.

> 4) There is a lot of errors in implementation - 279 compared to 118 in
> JDK. More details later

I'm surprised to hear this.  You are using UnicodeData-2.1.2.txt,
correct?

I'm aware of some bugs with isTitleCase, which have yet to be fixed,
but overall, as I was writing the class, I tested most of the border
cases.

> 5) Performance is terrible.

How _exactly_ are you testing our Character?  Are you simply replacing
Sun's Character with ours, and running the test again with the JDK?
If you're using Japhar, then that changes everything...

Classpath's Character keeps information on the Unicode characters out
on secondary storage.  Each time you look up information about a
character, the disk must be accessed, and information read off.

Things could be optimized to assume that once you've asked for data
about a specific character, it's likely that you'll ask for more data
on that same character.  Such a change should yield roughly a 20x
performance increase for your test.
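
As a rough illustration of that idea, here is a minimal sketch of a
one-entry cache sitting in front of a slow lookup.  The names
LastCharCache, slowLookup and misses are hypothetical and are not part
of Classpath; the slow lookup merely stands in for the disk read.

```java
import java.util.function.IntFunction;

/** Hypothetical one-entry cache in front of a slow per-character lookup. */
final class LastCharCache<T> {
    private final IntFunction<T> slowLookup; // stands in for the disk read
    private int cachedChar = -1;             // -1 means nothing is cached yet
    private T cachedAttr;
    private int misses;                      // number of slow lookups performed

    LastCharCache(IntFunction<T> slowLookup) {
        this.slowLookup = slowLookup;
    }

    /** Returns the attributes for ch, reusing the previous result if ch repeats. */
    T get(char ch) {
        if (ch != cachedChar) {
            cachedAttr = slowLookup.apply(ch);
            cachedChar = ch;
            misses++;
        }
        return cachedAttr;
    }

    int misses() {
        return misses;
    }
}
```

With this in place, a run of queries about the same character costs
one slow lookup instead of one per query.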

Yet another option is to stick the entire Unicode attribute database
into memory -- it's around 21k.
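
A minimal sketch of the in-memory approach, assuming the input is
lines in the semicolon-separated UnicodeData format (code point in
hex, name, general category, ...).  The class InMemoryCategories is
hypothetical, keeps only the general category, and makes no attempt
to reproduce the compact 21k layout.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory table of general categories, loaded once. */
final class InMemoryCategories {
    private final Map<Integer, String> categoryByCodePoint = new HashMap<>();

    /** Parses UnicodeData-style lines: field 0 is the hex code point, field 2 the category. */
    InMemoryCategories(List<String> unicodeDataLines) {
        for (String line : unicodeDataLines) {
            String[] fields = line.split(";", -1);
            int codePoint = Integer.parseInt(fields[0], 16);
            categoryByCodePoint.put(codePoint, fields[2]);
        }
    }

    /** Returns the general category, or "Cn" (unassigned) if the character is not listed. */
    String category(char ch) {
        return categoryByCodePoint.getOrDefault((int) ch, "Cn");
    }
}
```

After the one-time load, every lookup is a hash probe with no disk
access at all.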

> This test should go into mauve soon, but first I would like to correct
> some places where I'm not sure what behaviour should be considered
> normal - look at the errors reported and try to find any which should
> not be reported.

I'll get back to you shortly.

-- 
Paul Fisher * [EMAIL PROTECTED]
