http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5096

PageData.getInputStream() returns XML doc with invalid encoding

           Summary: PageData.getInputStream() returns XML doc with invalid
                    encoding
           Product: Tomcat 4
           Version: 4.0.1 Final
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Major
          Priority: Other
         Component: Jasper
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


The XML View of a JSP page is an XML document without an encoding declaration.
According to the XML spec, a document that begins with neither an encoding
declaration nor a byte order mark must be encoded in UTF-8, but the PageData
implementation in Jasper returns a stream in the platform's default encoding
instead.
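
To illustrate why the choice of encoding matters, here is a minimal sketch
(not Jasper code; the class and method names are made up for this example,
and it uses java.nio.charset.StandardCharsets, which is newer than the JDKs
Tomcat 4 runs on). It assumes a platform whose default encoding is
ISO-8859-1 and checks whether the resulting bytes are well-formed UTF-8:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class DefaultEncodingDemo {
    // Hypothetical helper: what getBytes() would yield on a platform whose
    // default encoding is ISO-8859-1 (an assumption made for illustration).
    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // What a parser expects when the document has no encoding declaration.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Strict check: a fresh decoder reports malformed input instead of
    // silently replacing it.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u00e5\u00e4\u00f6"; // the characters å, ä, ö
        System.out.println("ISO-8859-1 length: " + latin1(text).length
                + ", valid UTF-8: " + isValidUtf8(latin1(text)));
        System.out.println("UTF-8 length: " + utf8(text).length
                + ", valid UTF-8: " + isValidUtf8(utf8(text)));
    }
}
```

Pure ASCII content encodes identically in both charsets, which is probably
why the problem only shows up once a page contains national characters.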

The error can be fixed in the org.apache.jasper.compiler.XmlOutputter class,
by hardcoding UTF-8 in the getPageData() method:

    PageData getPageData() {
        StringBuffer buff = new StringBuffer();
        AttributesImpl attrs = new AttributesImpl();
        
        append("jsp:root", rootAttrs, buff, false);
        buff.append(sb.toString());
        buff.append("</jsp:root>");
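        // Current code (encodes with the platform's default encoding):
        //InputStream is = new ByteArrayInputStream(buff.toString().getBytes());
        InputStream is = null;
        try {
            is = new ByteArrayInputStream(buff.toString().getBytes("UTF-8"));
        }
        catch (java.io.UnsupportedEncodingException e) {
            // Should never happen: every Java platform is required to support UTF-8.
            // Fail loudly rather than hand a null stream to PageDataImpl.
            throw new IllegalStateException("UTF-8 not supported: " + e.getMessage());
        }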
        //System.out.println("XmlOutputter: \n" + buff);
        PageData pageData = new PageDataImpl(is);
        return pageData;
    }

Without this patch, the TagLibraryValidator used in the Jakarta Taglibs
"standard" library (i.e. the JSTL RI, EA2) throws a SAXParseException if
the page contains European national characters (e.g. å, ä, ö):

  Character conversion error: "Malformed UTF-8 char -- is an XML encoding 
  declaration missing?" (line number may be too low).
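
The failure can be reproduced outside Tomcat with a rough sketch like the
one below. It is not the validator's code: the class name, the sample
document, and the use of the JDK's built-in JAXP parser are all assumptions
for illustration, and the exact exception type and message depend on the
parser in use. The sketch parses the same declaration-less document once as
UTF-8 bytes and once as ISO-8859-1 bytes:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class XmlViewEncodingCheck {
    // Sample document with no encoding declaration; contains å, ä, ö.
    static final String XML = "<root>\u00e5\u00e4\u00f6</root>";

    // Returns true if the document, encoded with the given charset,
    // parses without error.
    static boolean parses(Charset charset) {
        byte[] bytes = XML.getBytes(charset);
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (SAXException | IOException e) {
            // Depending on the parser, malformed bytes surface as a
            // SAXParseException or as an IOException subclass.
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 bytes parse: "
                + parses(StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1 bytes parse: "
                + parses(StandardCharsets.ISO_8859_1));
    }
}
```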
