On Fri, Mar 27, 2009 at 15:44, Ulf Zibis <ulf.zi...@gmx.de> wrote:
> On 27.03.2009 22:49, Martin Buchholz wrote:
>>
>> Again, Ulf, I love the sort of stuff you're doing.
>>
>
> Much thanks again for the flowers. :-)
>
>> I hope to be able to contribute some engineering
>> to your effort myself someday.
>>
>> In the meantime, we need some infrastructure to guarantee that
>> the behavior of the charsets is completely unchanged as we optimize.
>> I have some code left behind at Sun to do that, i.e. compare different
>> JDKs w.r.t. charset compatibility.
>> Hopefully Sun engineers can resurrect that code and perhaps put it
>> into a public mercurial repo somewhere.
>>
>> Another approach is to take the code in tests like my
>> Find{En,De}coderBugs.java tests which compare direct
>> vs. regular buffers, and retarget it to compare two different jdks.
>>
>
> I also have coded such a test for full-scan comparison:
> See CharsetsTest + LegacyCharset (it retrieves the legacy charsets by
> reflection directly from rt.jar of the patched JDK) here:
> https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/trunk/test/sun/nio/cs/
>
> It cost me several nights to get all code points equal, given my special
> mixture of range-limited direct maps and a full-range indirect map.

It does look like you've written a lot of good tests.

It would be nice not to have an explicit list of charsets in
CharsetsTest.java.PARAMETERS. I guess it's a list of charsets
subject to single-byte testing? If so, better documentation would
be good. Charsets named ISO-8859-* are guaranteed to be single-byte,
so it might be good to include those programmatically, by filtering
Charset.availableCharsets(). Why include EUC-JP but not UTF-8?

It's probably still a good idea to get inspiration from my
Find*Bugs tests, which test many other things, such as complete
compatibility of exceptions in case of invalid input.

>> It's too difficult to give credit to external contributors.
>> One problem is that the Contributed-by: line is a red flag to
>> lawyers and other folks that might cause the legality of the change
>> to be questioned without end. Let's try to get Ulf a proper commit bit
>> and make sure the legal questions come to an end.
>>
>
> Aren't "Contributed-by" and "author" comments usual practice in open source
> products?
> Even in Sun's JRL source author was mentioned. I think, the lawyer guys and
> girls from Sun should rethink that subject.
> Ok, we will see ...

The problem is more human. One would like to give credit for
good ideas or good analysis, but the only official way to give
credit in a commit message is via a simple
Contributed-by: email-address
which raises legal doubts even when there is no copyrighted material.
I guess one can abuse the Summary: field to squeeze in thank-yous,
but then it's pretty obvious that you are circumventing the process.

Martin
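
As a rough illustration of the ISO-8859-* suggestion above, the
filtering might look something like the sketch below. The class name
Iso8859Charsets and the method find() are made up for this example
and are not part of CharsetsTest; the only JDK calls used are
Charset.availableCharsets() and Charset.name().

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not taken from CharsetsTest.java.
public class Iso8859Charsets {

    // Collects every installed charset whose canonical name starts
    // with "ISO-8859-" (compared case-insensitively).
    static List<Charset> find() {
        List<Charset> result = new ArrayList<Charset>();
        // availableCharsets() returns a map from canonical name to Charset.
        for (Charset cs : Charset.availableCharsets().values()) {
            String name = cs.name().toUpperCase(Locale.ROOT);
            if (name.startsWith("ISO-8859-")) {
                result.add(cs);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the charsets that would be picked up for single-byte testing.
        for (Charset cs : find()) {
            System.out.println(cs.name());
        }
    }
}
```

A test could then iterate over the result of find() instead of a
hand-maintained list, keeping an explicit list only for the charsets
that cannot be recognized by name.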