There is a bug in the gnu.java.io.encode/decode classes. It is not
apparent at first glance, but it is there.

Most classes there use lookup_table for lookups. They share the actual
translation code from an abstract superclass. This superclass has a
static field called lookup_table on which it operates. Subclasses assign
their own arrays to this variable, hoping that there will be a copy of
the static variable for each subclass, but that ISN'T true. They all
assign their own tables to the same static variable, overwriting
previous assignments. Thus all 8-bit encoders/decoders will behave like
the last one created.
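
The effect described above can be reproduced without the Classpath
sources. The following is only a minimal sketch: EightBitBase, DecoderX
and DecoderY are made-up stand-ins for the real decoder classes, and the
tables are filled with a single dummy character.

```java
import java.util.Arrays;

// Hypothetical stand-ins for the decoder hierarchy; names are illustrative.
abstract class EightBitBase {
    // One field for the whole hierarchy: subclasses do NOT get their own copy.
    static char[] lookup_table;

    char decode(int b) {
        return lookup_table[b & 0xFF];
    }
}

class DecoderX extends EightBitBase {
    DecoderX() {
        lookup_table = new char[256];
        Arrays.fill(lookup_table, 'X');   // dummy table: everything maps to 'X'
    }
}

class DecoderY extends EightBitBase {
    DecoderY() {
        lookup_table = new char[256];
        Arrays.fill(lookup_table, 'Y');   // dummy table: everything maps to 'Y'
    }
}

class SharedStaticDemo {
    public static void main(String[] args) {
        EightBitBase x = new DecoderX();
        EightBitBase y = new DecoderY();   // overwrites the table set by DecoderX
        System.out.println(x.decode(0));   // prints Y, not X
        System.out.println(y.decode(0));   // prints Y
    }
}
```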

It can be fixed in a few ways. convertToChars can be copied into each
subclass and made to use that subclass's own lookup table instead of the
shared lookup_table. Alternatively, DecoderEightBitLookup can become
non-abstract with an additional instance field holding a reference to
the lookup table (there will be a small performance penalty - statics
are generally faster than instance variables).
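
A rough sketch of the second option, assuming a constructor that takes
the table. TableDecoder, InstanceTableDemo and makeTable are hypothetical
names, the convertToChars signature is a guess, and the tables are
made-up data rather than real charset mappings.

```java
// Hypothetical names throughout; this is not the actual Classpath code.
class TableDecoder {
    // Each instance keeps a reference to its own table; nothing is shared
    // through a static field on a superclass.
    private final char[] lookupTable;

    TableDecoder(char[] lookupTable) {
        this.lookupTable = lookupTable;
    }

    // Converts len bytes starting at buf[bufOffset] into cbuf[cbufOffset...].
    int convertToChars(byte[] buf, int bufOffset, int len,
                       char[] cbuf, int cbufOffset) {
        for (int i = 0; i < len; i++) {
            cbuf[cbufOffset++] = lookupTable[buf[bufOffset++] & 0xFF];
        }
        return len;
    }
}

class InstanceTableDemo {
    // Builds a made-up 256-entry table mapping byte b to (char) (base + b).
    static char[] makeTable(int base) {
        char[] t = new char[256];
        for (int i = 0; i < 256; i++) {
            t[i] = (char) (base + i);
        }
        return t;
    }

    public static void main(String[] args) {
        TableDecoder first = new TableDecoder(makeTable(0));
        TableDecoder second = new TableDecoder(makeTable(0x400));
        byte[] in = { 0x41 };
        char[] out = new char[1];
        first.convertToChars(in, 0, 1, out, 0);
        System.out.println((int) out[0]);   // 65
        second.convertToChars(in, 0, 1, out, 0);
        System.out.println((int) out[0]);   // 1089
    }
}
```

The tables themselves can still be static constants in each subclass;
only the reference used by the shared code moves into the instance.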

Small performance improvement - in convertToChars, cbuf_offset and
buf_offset could be incremented directly to save a few iadd opcodes.

Note - encoders use 128 KB each. I know that this is the simplest
approach, but it is really a lot of memory. I think some other way
should be used - for example, an explicit check for <= 127 and, if that
fails, a binary search in the encode table (which would have to be in
the form of a sorted {char, byte} map). This way we find the encoding in
7 checks in the worst case and save a lot of memory.
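
A minimal sketch of that idea, assuming the charset agrees with ASCII
for chars <= 127. CompactEncoder is a hypothetical class, the '?'
fallback for unmappable chars is an assumption, and the demo table holds
only four ISO-8859-2 entries for illustration; a full table would have
up to 128 entries.

```java
import java.util.Arrays;

// Hypothetical encoder: ASCII fast path plus binary search over a small
// sorted table, instead of one table entry per possible char.
class CompactEncoder {
    private final char[] chars;   // non-ASCII chars, sorted ascending
    private final byte[] bytes;   // bytes[i] is the encoding of chars[i]

    CompactEncoder(char[] sortedChars, byte[] bytes) {
        this.chars = sortedChars;
        this.bytes = bytes;
    }

    byte encode(char c) {
        if (c <= 127) {
            return (byte) c;                      // assumed identical to ASCII
        }
        int idx = Arrays.binarySearch(chars, c);  // about 7 probes for 128 entries
        return idx >= 0 ? bytes[idx] : (byte) '?';  // '?' fallback is an assumption
    }
}

class CompactEncoderDemo {
    // Partial ISO-8859-2 table, sorted by char: U+0104, U+0105, U+0141, U+0142.
    static CompactEncoder partialLatin2() {
        char[] chars = { '\u0104', '\u0105', '\u0141', '\u0142' };
        byte[] bytes = { (byte) 0xA1, (byte) 0xB1, (byte) 0xA3, (byte) 0xB3 };
        return new CompactEncoder(chars, bytes);
    }

    public static void main(String[] args) {
        CompactEncoder enc = partialLatin2();
        System.out.println(enc.encode('A') & 0xFF);        // 65
        System.out.println(enc.encode('\u0141') & 0xFF);   // 163
        System.out.println(enc.encode('\u20AC') & 0xFF);   // 63, not in the table
    }
}
```

With 128 entries the two arrays take a few hundred bytes plus array
headers.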


Artur

Here is code that demonstrates the bug (lookup_tables should be
different for different classloaders)
------------------------------

import gnu.java.io.decode.*;

public class test
{

        static class A extends Decoder8859_1
        {
                public A() {super(null);}
                public String toString() { return lookup_table.toString();}
        }
        
        static class B extends Decoder8859_2
        {
                public B() {super(null);}
                public String toString() { return lookup_table.toString();}
        }

        public static void main( String argv[] )
        {
                System.out.println((new A()).toString() );
                System.out.println((new B()).toString() );
        }

}
