Dear All,

I think that Santiago was actually right in saying that there is a bug
in GenericElement, although his solution might not be correct. I've had
similar problems and therefore I would like to share some thoughts with
you.

In GenericElement we have:

    public final String toString(String codeset)
    {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        BufferedOutputStream bos = new BufferedOutputStream(baos);
        String out = null;
        try
        {
            output(bos);   // <== here is the problem
            bos.flush();
            out = baos.toString(codeset);
            bos.close();
            baos.close();
        }
        catch ...
    }

Now the output(bos) function looks like this:

    public void output(OutputStream out)
    {
        try
        {
            out.write(createStartTag().getBytes());

            if(getFilterState())
                out.write(getFilter().process(getTagText()).getBytes());
            else
                out.write(getTagText().getBytes());

            if (getNeedClosingTag())
                out.write(createEndTag().getBytes());

        }
        catch (IOException ioe)
        {
            ioe.printStackTrace(new PrintWriter(out));
        }
    }

It calls getBytes() on every String. This method converts the String
into bytes according to the platform's default character encoding,
storing the result in a new byte array. In the toString(..) method we
specify the codeset, but in the output method we convert all the Strings
into bytes according to the default character encoding. At this point we
may lose information if the default encoding does not support certain
characters, because the bytes are written with one encoding and later
decoded with another.
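
To make the effect concrete, here is a small standalone sketch (not ECS
code). The class name EncodingLossDemo and the helper roundTrip are made up
for illustration, and US-ASCII merely stands in for a platform default
encoding that cannot represent the character in question. It encodes a
String with one charset and decodes it with another, which is what
output() followed by toString(codeset) amounts to:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of ECS.
public class EncodingLossDemo {

    // Encode with one charset, then decode with another. This mirrors
    // output() (encode) followed by toString(codeset) (decode).
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] bytes = text.getBytes(encodeWith);
        baos.write(bytes, 0, bytes.length);
        return new String(baos.toByteArray(), decodeWith);
    }

    public static void main(String[] args) {
        String name = "Zieli\u0144ski"; // contains U+0144, not representable in US-ASCII

        // Assumption: US-ASCII plays the role of a limited default encoding.
        String lossy = roundTrip(name, StandardCharsets.US_ASCII, StandardCharsets.UTF_8);
        // Same charset on both sides, as a fixed output() would do.
        String intact = roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println("mismatched charsets preserve text: " + lossy.equals(name));  // false
        System.out.println("matching charsets preserve text:   " + intact.equals(name)); // true
    }
}
```

With mismatched charsets the unmappable character is replaced during
encoding (by '?' for US-ASCII), so no choice of codeset in toString() can
bring it back.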

I will be glad to submit a patch for it...
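
A rough sketch of the direction such a patch could take, so that output()
and toString() agree on the encoding. This is only an illustration under
assumptions, not the actual change to ECS: the class EncodingAwareElement
and the two-argument output(out, codeset) are hypothetical, the tag helpers
are simplified stand-ins for the real GenericElement methods, the filter
branch is left out, and rethrowing on error is my choice rather than what
ECS does.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical stand-in for GenericElement, reduced to the encoding issue.
public class EncodingAwareElement {
    private final String tag;
    private final String text;

    public EncodingAwareElement(String tag, String text) {
        this.tag = tag;
        this.text = text;
    }

    // Simplified stand-ins for the real ECS helpers.
    String createStartTag() { return "<" + tag + ">"; }
    String createEndTag()   { return "</" + tag + ">"; }
    String getTagText()     { return text; }

    // Assumed new overload: every String is encoded with the caller's
    // codeset instead of the platform default.
    public void output(OutputStream out, String codeset) throws IOException {
        out.write(createStartTag().getBytes(codeset));
        out.write(getTagText().getBytes(codeset));
        out.write(createEndTag().getBytes(codeset));
    }

    public final String toString(String codeset) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        BufferedOutputStream bos = new BufferedOutputStream(baos);
        try {
            output(bos, codeset);          // encode with the requested codeset
            bos.flush();
            return baos.toString(codeset); // decode with the same codeset
        } catch (IOException ioe) {
            // Also covers UnsupportedEncodingException, a subclass of IOException.
            throw new IllegalArgumentException("Cannot render with codeset " + codeset, ioe);
        }
    }
}
```

For example, new EncodingAwareElement("p", "Zieli\u0144ski").toString("UTF-8")
returns the element text unchanged, whatever the platform default is.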

Cheers,

Krzysztof
 
Jon Stevens wrote:
> 
> on 11/17/2000 4:32 PM, "Santiago Gala" <[EMAIL PROTECTED]> wrote:
> 
> > I had to patch org.apache.ecs.GenericElement.java, which plainly did not
> > know how to convert multibyte characters back to a String.
>
> If you pass the correct encoding to the toString() method of a
> ByteArrayOutputStream, the Javadoc clearly states that it does the
> translation between bytes and characters. ECS is clearly doing this
> correctly. It appears as though you are not though.


--
------------------------------------------------------------
To subscribe:        [EMAIL PROTECTED]
To unsubscribe:      [EMAIL PROTECTED]
Archives and Other:  <http://java.apache.org/main/mail.html>
Problems?:           [EMAIL PROTECTED]
