On 2/27/07, Raul Benito <[EMAIL PROTECTED]> wrote:
Hi Jason,
Sorry for the delay.
See my comments inline
On 2/23/07, jason marshall <[EMAIL PROTECTED] > wrote:
> Raul,
>
> I'm not sure I can be as helpful as Yvan, having a more modest and
> polite test suite, but I have a bit of Unicode and specifically UTF-8
> en/decoding experience, and I might be able to make a few
> observations. I'm curious about your comments about how some Unicode
> characters are not being handled properly. Which ones are you having
> trouble with? The new 32 bit characters, 0, something else?
Great, your help is really appreciated. I have just created a test that checks
my encoding against the String.getBytes("UTF-8") implementation for the
first 2**16 chars, and they are all equal except for character 0xd8ff.
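A comparison of this kind might look roughly like the sketch below. It is not the test from this thread: `encodeUnit` is a hypothetical stand-in for the hand-written encoder, and `StandardCharsets.UTF_8` (Java 7+) is used in place of the `"UTF-8"` string argument only to avoid the checked exception. Which values get reported depends on the JDK; recent JDKs encode an unpaired surrogate as a replacement byte, so the whole surrogate range may show up rather than a single value.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Utf8CompareSketch {
    // Hypothetical hand-written encoder for a single UTF-16 code unit.
    // It applies the 1-, 2- and 3-byte UTF-8 layouts and does not
    // treat surrogates specially.
    static byte[] encodeUnit(char c) {
        if (c < 0x80) {
            return new byte[] { (byte) c };
        }
        if (c < 0x800) {
            return new byte[] {
                (byte) (0xC0 | (c >> 6)),
                (byte) (0x80 | (c & 0x3F))
            };
        }
        return new byte[] {
            (byte) (0xE0 | (c >> 12)),
            (byte) (0x80 | ((c >> 6) & 0x3F)),
            (byte) (0x80 | (c & 0x3F))
        };
    }

    // Collects every code unit in 0..0xFFFF whose encoding differs
    // from what the JDK produces for a one-character String.
    static List<Integer> mismatches() {
        List<Integer> differing = new ArrayList<>();
        for (int i = 0; i < 0x10000; i++) {
            char c = (char) i;
            byte[] expected = String.valueOf(c).getBytes(StandardCharsets.UTF_8);
            if (!Arrays.equals(expected, encodeUnit(c))) {
                differing.add(i);
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        for (int unit : mismatches()) {
            System.out.printf("mismatch at 0x%04X%n", unit);
        }
    }
}
```

Any value reported outside 0xD800-0xDFFF would point at a real bug in the hand-written encoder; values inside that range need a closer look at how each side treats unpaired surrogates.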
Okay, you're a bit outside of my realm here, but have you considered
the possibility that the error is Sun's, and not yours? 0xd800-0xd8ff
don't appear to be characters, per se; they're high surrogates, the
first 2 bytes of a 4-byte UTF-16 sequence. You might need to look to a
third party for confirmation of the right encoding for that character
(one page suggests 0xED 0xA3 0xBF is correct).
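For reference, the byte values quoted above are what the generic three-byte UTF-8 bit layout gives when applied directly to 0xD8FF. The sketch below only shows that arithmetic; `threeByteLayout` is a hypothetical helper, and the sketch makes no claim about which behaviour is right for an unpaired surrogate, which strict UTF-8 treats as ill-formed.

```java
class SurrogateSketch {
    // Applies the three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx)
    // to a 16-bit value without checking whether it is a surrogate.
    static int[] threeByteLayout(int unit) {
        return new int[] {
            0xE0 | (unit >> 12),          // top 4 bits
            0x80 | ((unit >> 6) & 0x3F),  // middle 6 bits
            0x80 | (unit & 0x3F)          // low 6 bits
        };
    }

    public static void main(String[] args) {
        char unit = (char) 0xD8FF;
        // 0xD8FF falls in the high-surrogate range 0xD800-0xDBFF.
        System.out.println(Character.isHighSurrogate(unit)); // prints "true"
        int[] b = threeByteLayout(unit);
        System.out.printf("%02X %02X %02X%n", b[0], b[1], b[2]); // prints "ED A3 BF"
    }
}
```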
> You say in your comments that the problem is fixed in HEAD, but I'm
> looking at HEAD
>
>
http://svn.apache.org/viewvc/xml/security/trunk/src/org/apache/xml/security/c14n/implementations/CanonicalizerBase.java?view=markup
>
> And the code still seems to be using 8th bit checks throughout.
Can you point me to where you think it is incorrect, or give me a test case? I
would really appreciate it.
Just search for "0x80". On the revision I reviewed, there were a
number of them in that file, and almost all of them were wrapped
around a call to UtfHelpper.
> If you want to make this code go faster, your better bet is to split
> up the methods in UTFHelpper so that Hotspot can inline the fast-path
> into the callers. That'll get you the same effect with saner
> code. For example:
>
Great idea, I will try to do it.
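The example from the original message is not preserved in this copy of the thread. As an illustration only, a fast-path/slow-path split of the kind suggested could look like the sketch below. The class and method names are hypothetical and are not taken from UtfHelpper or CanonicalizerBase, and surrogate pairs are deliberately left out to keep the sketch short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class FastPathSketch {
    // Fast path: a tiny method that handles ASCII itself, which makes it
    // a plausible candidate for inlining into its callers.
    static void writeChar(char c, OutputStream out) throws IOException {
        if (c < 0x80) {
            out.write(c);
            return;
        }
        writeMultiByte(c, out);
    }

    // Slow path: kept in its own method so the fast path stays small.
    // Handles 2- and 3-byte sequences only (no surrogate pairs).
    static void writeMultiByte(char c, OutputStream out) throws IOException {
        if (c < 0x800) {
            out.write(0xC0 | (c >> 6));
        } else {
            out.write(0xE0 | (c >> 12));
            out.write(0x80 | ((c >> 6) & 0x3F));
        }
        out.write(0x80 | (c & 0x3F));
    }

    // Convenience wrapper used to exercise the two methods above.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            for (int i = 0; i < s.length(); i++) {
                writeChar(s.charAt(i), out);
            }
        } catch (IOException e) {
            // ByteArrayOutputStream does not throw; rethrow defensively.
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : encode("A\u00E9\u20AC")) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

With this shape the callers no longer need their own 0x80 checks: they call `writeChar` unconditionally and leave it to the JIT to inline the short ASCII branch.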
> I'm pretty sure that even the 1.3 Hotspot will be happy with this
> code, but I haven't tested it (I'm having some trouble building the
> code from the source release, and work doesn't allow svn access
> through the firewall, for various reasons, a couple of which are
> understandable).
We should create some nightly build & publish mechanism. I will try to see
how other projects handle this.
> Good luck, and keep us posted on your ETA for a 1.4.1 release.
Thanks,
I will try to see how many bug reports we have received (the MS-Office bug
looks promising).
Regards,
Raul
> Thanks,
> Jason
>
>
--
- Jason