When I was working with the LE versions of Tomcat, I believe it was 
defaulting to whatever parser is in j2sdk1.4.0. I think this is Crimson, 
but only because the code for Crimson is in the src.zip of the j2sdk. 
What I saw with that was the following parsing error:

PARSE error at line 1 column -1
org.xml.sax.SAXParseException: Character conversion error: "Malformed UTF-8 char -- is an XML encoding declaration missing?" (line number may be too low).

So I got sick of trying to solve that and went and got all the full 
versions. Now the parser is whatever version of Xerces ships with the 
full versions: 1.4.4, I think, though I'm not sure about Tomcat 4.1. 
Does that use Xerces2?

Then the error becomes less obvious:
PARSE error at line 1 column 1
org.xml.sax.SAXParseException: The markup in the document preceding the root element must be well-formed.

So I don't think it's the parser specifically. If I pull the tld files 
out of the Jar and reference them directly in the web.xml file, they 
work fine with no exceptions. If there's a tld in any of my custom tag 
jar files, I get the exception. This doesn't seem to happen with 
standard.jar, and I'm copying its layout and the version information at 
the beginning of the file exactly, except for the encoding being UTF-8 
instead of ISO...

The beginning of my files looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE taglib PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN"
        "http://java.sun.com/dtd/web-jsptaglibrary_1_2.dtd">
<taglib>
    <tlib-version>1.1</tlib-version>
    <jsp-version>1.2</jsp-version>
    <short-name>Conditional</short-name>

The beginning of the tld files in standard.jar looks like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE taglib PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN"
  "http://java.sun.com/dtd/web-jsptaglibrary_1_2.dtd">
<taglib>
    <tlib-version>1.0</tlib-version>
    <jsp-version>1.2</jsp-version>
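
Both headers above are what a text editor shows, which can hide a byte 
order mark or stray bytes in front of "<?xml". A minimal sketch, in 
Java, of dumping the first raw bytes of a tld as it sits inside the jar, 
read through a jar: URL. The class name TldHeadDump and the method 
headHex are made up for illustration, and this is not necessarily how 
Tomcat itself reads the entry.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.JarURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.file.Path;

// Hypothetical diagnostic helper; not part of Tomcat or any library.
final class TldHeadDump {

    /** Returns up to n leading bytes of a jar entry as space-separated hex. */
    static String headHex(Path jar, String entry, int n) throws IOException {
        // jar:<url-of-jar>!/<entry> is the standard jar URL syntax.
        URL url = URI.create("jar:" + jar.toUri() + "!/" + entry).toURL();
        JarURLConnection conn = (JarURLConnection) url.openConnection();
        conn.setUseCaches(false); // avoid keeping the jar cached and open
        StringBuilder hex = new StringBuilder();
        try (InputStream in = conn.getInputStream()) {
            int b;
            // Each byte contributes three characters ("xx ") to the buffer.
            while (hex.length() < n * 3 && (b = in.read()) != -1) {
                hex.append(String.format("%02x ", b));
            }
        }
        return hex.toString().trim();
    }
}
```

Calling headHex with the path to the jar, the entry name of the tld, and 
a small count such as 16 gives the leading bytes. A UTF-8 file with no 
byte order mark should start with 3c 3f 78 6d 6c ("<?xml"); a leading 
ef bb bf would be a UTF-8 byte order mark.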

I'm generating my tld files using Forte4J because I like the code 
generation capabilities. It forces the tld files to be UTF-8. I really 
don't think that should be much of a problem, especially since the tld 
works fine when it is outside of the jar. This seems to have something 
to do with the mechanism that gets the tld from the Jar file (which I 
have absolutely no knowledge of, and don't want to explore in any detail 
personally). Could it be trying to force a specific encoding onto the 
parser? Or could it be failing to set the encoding appropriately in the 
parser? I know that when you set a Char/Stream on the parser you have to 
tell the parser its encoding. I would suspect there is a difference in 
the code between pulling the tld from the Jar and pulling it from the 
file, probably something like getting the resource from a 
JarURLConnection vs. pulling it from some resource location.
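
To make the byte stream side of that concrete: a JAXP parser given raw 
bytes reads the encoding from the XML declaration itself, so a 
"Malformed UTF-8" style failure can be reproduced in general whenever 
the bytes do not match the declared encoding. A minimal sketch, assuming 
the JAXP parser bundled with a current JDK rather than Crimson or Xerces 
1.4.4. TldEncodingSketch, doc, and parses are names invented here, and 
nothing in it shows that this is what happens to the tld files above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; all names are illustrative only.
final class TldEncodingSketch {

    /** Builds a tiny tld-like document: declared one way, encoded another. */
    static byte[] doc(String declared, Charset actual) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + declared + "\"?>"
                + "<taglib><short-name>Caf\u00e9</short-name></taglib>";
        return xml.getBytes(actual);
    }

    /** True if the bytes parse as well-formed XML from a byte stream. */
    static boolean parses(byte[] bytes) {
        try {
            // Byte-stream InputSource: the parser detects the encoding itself.
            InputSource source = new InputSource(new ByteArrayInputStream(bytes));
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(source, new DefaultHandler());
            return true;
        } catch (Exception e) {
            // Covers SAXParseException and I/O-level conversion errors.
            return false;
        }
    }
}
```

Under those assumptions, a document declared and encoded as UTF-8, or 
declared and encoded as ISO-8859-1, should parse, while one declared as 
UTF-8 but actually written out as ISO-8859-1 should fail, because the 
lone byte 0xE9 is not a valid UTF-8 sequence.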

I'll play with the encodings again, but I doubt it will affect anything 
in the long run. Again, I don't get any problems when validating this 
file directly using Xerces 1.4.4.

-Mark Diggory



Jean-Francois Arcand wrote:

> Which version of Xerces are you using? If it's 2.2, there is a bug 
> associated with the problem:
>
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=13282
>
> -- Jeanfrancois
>
>
>
> Mark R. Diggory wrote:
>
>> I keep getting these parsing exceptions when I try to load my custom 
>> taglibs (from JAR files) on Tomcat 4.0.3, 4.0.5, 4.1 on Windows 2000/XP.
>>
>>> Starting service Tomcat-Standalone
>>> Apache Tomcat/4.0.3
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error: 
>>> "Malformed UTF-8 char
>>>  -- is an XML encoding declaration missing?" (line number may be too 
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error: 
>>> "Malformed UTF-8 char
>>>  -- is an XML encoding declaration missing?" (line number may be too 
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error: 
>>> "Malformed UTF-8 char
>>>  -- is an XML encoding declaration missing?" (line number may be too 
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error: 
>>> "Malformed UTF-8 char
>>>  -- is an XML encoding declaration missing?" (line number may be too 
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error: 
>>> "Malformed UTF-8 char
>>>  -- is an XML encoding declaration missing?" (line number may be too 
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error: 
>>> "Malformed UTF-8 char
>>>  -- is an XML encoding declaration missing?" (line number may be too 
>>> low).
>>> No tags
>>> No tags
>>
>>
>>
>>
>> I know this is coming from some parsing error when the tld is parsed. 
>> But even if I put the tld file into a different encoding (ISO-8859-1), 
>> I still get the exceptions.
>>
>> example tld header:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <!DOCTYPE taglib PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN"
>>         "http://java.sun.com/dtd/web-jsptaglibrary_1_2.dtd">
>> <taglib>
>> ...
>>
>> -Mark Diggory
>>
>>
>>
>> --
>> To unsubscribe, e-mail:   
>> <mailto:[EMAIL PROTECTED]>
>> For additional commands, e-mail: 
>> <mailto:[EMAIL PROTECTED]>
>>
>>