Re: java-nio-charset-enhanced -- Milestone 4 is released

Ulf Zibis Sun, 29 Mar 2009 12:49:44 -0700

On 29.03.2009 20:27, Martin Buchholz wrote:

On Fri, Mar 27, 2009 at 15:44, Ulf Zibis <ulf.zi...@gmx.de> wrote:


I have also coded such a test for full-scan comparison:
See CharsetsTest + LegacyCharset (it retrieves the legacy charsets by
reflection directly from rt.jar of the patched JDK) here:
https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/trunk/test/sun/nio/cs/

It cost me several nights to get all code points equal, given my special
mixture of range-limited direct maps and a full-range indirect map.
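
A minimal sketch of what such a full-scan comparison can look like, restricted to single-byte input. This is not the CharsetsTest code from the repository; the class name FullScanCompare and the method firstDifference are hypothetical, and two standard charsets stand in for the "legacy" and "patched" implementations. It decodes every byte value with both charsets and reports the first one on which they disagree.

```java
import java.nio.charset.Charset;

final class FullScanCompare {

    /** Decodes one byte with cs; undecodable input becomes the charset's replacement string. */
    static String decodeByte(Charset cs, int b) {
        return new String(new byte[] { (byte) b }, cs);
    }

    /** Returns the first byte value (0..255) that a and b decode differently, or -1 if none. */
    static int firstDifference(Charset a, Charset b) {
        for (int i = 0; i < 256; i++) {
            if (!decodeByte(a, i).equals(decodeByte(b, i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Charset latin1 = Charset.forName("ISO-8859-1");
        Charset ascii = Charset.forName("US-ASCII");
        System.out.println("latin1 vs latin1: " + firstDifference(latin1, latin1));
        System.out.println("latin1 vs ascii:  " + firstDifference(latin1, ascii));
    }
}
```

A real comparison would also have to scan the encoding direction over all chars and cover multi-byte input, which this sketch leaves out.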


It does look like you've written a lot of good tests.
It would be nice not to have an explicit list of charsets in
CharsetsTest.java.PARAMETERS.

The advantage of this list is that I can disable charsets by line-commenting them, which speeds up the test while debugging special cases.

I guess it's a list of charsets subject to single-byte testing?


Yes, + charsets depending on those. E.g. EUC-JP depends on JIS-X-0201.

If so, better documentation would be good.
Charsets named ISO-8859-* are guaranteed to be single-byte,
it might be good to include those programmatically,
by filtering Charset.availableCharsets().

Good idea, but how to catch those that internally use single-byte charsets such as JIS-X-0201?
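
A minimal sketch of the suggested programmatic filter, assuming that matching on the canonical name prefix "ISO-8859-" is sufficient. The class name Iso8859Filter and the method iso8859Charsets are hypothetical. It does not answer the open question above: charsets such as EUC-JP that only use a single-byte charset internally cannot be found by name and would still need an explicit list.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

final class Iso8859Filter {

    /** Collects every installed charset whose canonical name starts with "ISO-8859-". */
    static List<Charset> iso8859Charsets() {
        List<Charset> result = new ArrayList<Charset>();
        // availableCharsets() maps canonical names to Charset objects.
        for (Charset cs : Charset.availableCharsets().values()) {
            if (cs.name().startsWith("ISO-8859-")) {
                result.add(cs);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Charset cs : iso8859Charsets()) {
            System.out.println(cs.name());
        }
    }
}
```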

Why include EUC-JP but not UTF-8?


UTF-8 is not affected by my changes to the single-byte charsets.

It's probably still a good idea to get inspiration from my Find*Bugs


I'll keep this in mind.

tests which test many other things like
complete compatibility of exceptions in case of invalid input.


I see, this would affect our discussion about malformed().

Concerning the malformed length on an invalid low surrogate, I have now understood your philosophy while hacking the UTF-8 coder. As a result, I've filed a bug:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6798515
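
For reference, the "malformed length" is the value that CoderResult.length() reports when a coder rejects its input. The following is a minimal sketch that shows where this value surfaces when the JDK's UTF-8 encoder is fed a lone low surrogate; it does not reproduce the bug report, and the class name MalformedLengthDemo and the method encodeOnce are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

final class MalformedLengthDemo {

    /** Runs one low-level encode step with the default REPORT action and returns its result. */
    static CoderResult encodeOnce(String input) {
        CharsetEncoder enc = Charset.forName("UTF-8").newEncoder();
        ByteBuffer out = ByteBuffer.allocate(16);
        return enc.encode(CharBuffer.wrap(input), out, true);
    }

    public static void main(String[] args) {
        CoderResult r = encodeOnce("\uDC00"); // low surrogate without a preceding high surrogate
        if (r.isMalformed()) {
            // length() is only defined for error results, hence the guard.
            System.out.println("malformed, length = " + r.length());
        } else {
            System.out.println(r);
        }
    }
}
```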

Concerning \uFFFE and \uFFFF, I still think that they are invalid, as these code points don't have any valid meaning from the Java VM's side, so why should they be seen as possibly mappable to other character encodings? Handling of a BOM etc. should be done elsewhere, e.g. in coder initialization or the flush() method.
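
To make the behaviour under discussion concrete, here is a small sketch showing that the JDK's UTF-8 encoder treats both code points as mappable and emits three bytes for each. The class name NoncharacterDemo and its helper methods are hypothetical.

```java
import java.nio.charset.Charset;

final class NoncharacterDemo {

    /** Encodes a single char to UTF-8 through the String API. */
    static byte[] utf8(char c) {
        return String.valueOf(c).getBytes(Charset.forName("UTF-8"));
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("U+FFFE -> " + hex(utf8('\uFFFE')));
        System.out.println("U+FFFF -> " + hex(utf8('\uFFFF')));
    }
}
```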

The problem is more human.  One would like to give credit for good ideas
or good analysis, but the only official way to give credit in a commit
message is via a simple
Contributed-by: email-address
which raises legal doubts even when there is no copyrighted material.
I guess one can abuse the Summary: field to squeeze in thank-yous,
but it's pretty obvious that you are circumventing the process.

The last paragraph is difficult for me to understand in English. Could you please translate it?


-Ulf
