When I was working with the LE versions of Tomcat, I believe it was
defaulting to whatever parser ships in j2sdk1.4.0. I think this is Crimson,
but only because the Crimson source is in the j2sdk src.zip. What I saw
with that setup was the following parsing error:
PARSE error at line 1 column -1
org.xml.sax.SAXParseException: Character conversion error: "Malformed UTF-8 char -- is an XML encoding declaration missing?" (line number may be too low).
So I got sick of trying to solve that and went and got all the full
versions. I think the Xerces bundled with the full versions is 1.4.4, but
I'm not sure about Tomcat 4.1. Does that use Xerces2?
With the full versions the error becomes less obvious:
PARSE error at line 1 column 1
org.xml.sax.SAXParseException: The markup in the document preceding the root element must be well-formed.
So I don't think it's the parser specifically. If I pull the tld files out
of the Jar and reference them directly in the web.xml file, they work
fine with no exceptions. If there's a tld in any of my custom tag jar
files, I get the exception. This doesn't seem to happen with
standard.jar, and I'm copying its layout and the version
information at the beginning of the file exactly, except for the encoding
being UTF-8 instead of ISO...
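One way to compare the copy inside the Jar with the extracted copy is to look at the raw bytes. This is only a minimal sketch, assuming a current JDK rather than the 1.4 SDK mentioned above. The class name TldByteCheck and the entry name META-INF/taglib.tld are placeholders of mine, not anything taken from Tomcat. It dumps the first bytes of the entry (to spot a byte order mark or junk before "<?xml") and checks whether the whole entry decodes as strict UTF-8:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class TldByteCheck {

    // True if the bytes decode as UTF-8 with no malformed or unmappable sequences.
    static boolean isStrictUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Hex dump of the first n bytes, e.g. "EF BB BF" for a UTF-8 byte order mark.
    static String hexPrefix(byte[] bytes, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(n, bytes.length); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    // Reads one entry of a Jar fully into memory.
    static byte[] readEntry(String jarPath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) throw new IOException("No such entry: " + entryName);
            try (InputStream in = jar.getInputStream(entry)) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
                return out.toByteArray();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the tag Jar; args[1]: entry name such as META-INF/taglib.tld
        byte[] bytes = readEntry(args[0], args[1]);
        System.out.println("first bytes:  " + hexPrefix(bytes, 16));
        System.out.println("strict UTF-8: " + isStrictUtf8(bytes));
    }
}
```

If the entry in the Jar and the extracted file give different answers here, the problem is in how the Jar was built rather than in the parser.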
The beginning of my files looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE taglib PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN"
"http://java.sun.com/dtd/web-jsptaglibrary_1_2.dtd">
<taglib>
<tlib-version>1.1</tlib-version>
<jsp-version>1.2</jsp-version>
<short-name>Conditional</short-name>
The beginning of the tld files in standard.jar looks like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE taglib PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN"
"http://java.sun.com/dtd/web-jsptaglibrary_1_2.dtd">
<taglib>
<tlib-version>1.0</tlib-version>
<jsp-version>1.2</jsp-version>
I'm generating my tld files using Forte4J because I like the code
generation capabilities. It forces the tld files to be UTF-8. I really
don't think that should be much of a problem, especially when the
tld works fine outside of the jar. This seems to have something to
do with the mechanism that gets the tld from the Jar file (which I have
absolutely no knowledge of and no wish to explore in any detail
personally). Could it be trying to force a specific encoding onto the
parser? Or could it be failing to set the encoding appropriately in the
parser? As far as I know, when you hand the parser a character stream
instead of a byte stream, the encoding has to be chosen by the caller.
I would suspect there is a difference in the code between pulling the tld
from the Jar and pulling it from a file, probably something like getting
the resource from a JarURLConnection vs. pulling it from some resource
location.
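To illustrate the question about forcing an encoding, here is a rough sketch. It is not Tomcat's actual loading code, which I have not looked at, and it assumes a current JDK with its built-in JAXP parser. The class name JarTldParse, the sample tld and the entry name are made up for the example. It parses the same tld from a jar: URL twice: once as a byte stream, where the parser reads the XML declaration itself, and once through a Reader with an encoding chosen by the caller, where the declaration is ignored:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class JarTldParse {

    // Collects character data; any external DTD is resolved to an empty
    // stream so that nothing is fetched over the network.
    static class TextHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            return new InputSource(new StringReader(""));
        }
    }

    static String parse(InputSource source) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TextHandler handler = new TextHandler();
        parser.parse(source, handler);
        return handler.text.toString();
    }

    // Byte stream: the parser picks the encoding from the XML declaration.
    static String parseAsBytes(URL tldUrl) throws Exception {
        try (InputStream in = tldUrl.openStream()) {
            return parse(new InputSource(in));
        }
    }

    // Character stream: the caller's charset wins, the declaration is ignored.
    static String parseAsChars(URL tldUrl, String charset) throws Exception {
        try (Reader reader = new InputStreamReader(tldUrl.openStream(), charset)) {
            return parse(new InputSource(reader));
        }
    }

    // Builds a throwaway Jar holding one UTF-8 tld and returns a jar: URL to it.
    static URL makeSampleJar() throws Exception {
        File jar = File.createTempFile("tags", ".jar");
        jar.deleteOnExit();
        String tld = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<taglib><short-name>Caf\u00e9</short-name></taglib>";
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry("META-INF/taglib.tld"));
            out.write(tld.getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return URI.create("jar:" + jar.toURI() + "!/META-INF/taglib.tld").toURL();
    }

    public static void main(String[] args) throws Exception {
        URL url = makeSampleJar();
        System.out.println("as bytes:      " + parseAsBytes(url));
        System.out.println("as ISO-8859-1: " + parseAsChars(url, "ISO-8859-1"));
    }
}
```

The byte-stream parse returns the accented name intact, while the forced ISO-8859-1 parse returns it garbled. So if the Jar path wraps the stream in a Reader (or otherwise overrides the encoding) and the file path does not, the two could behave differently on the same tld.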
I'll play with the encodings again, but I doubt it will affect anything
in the long run. Again, I don't get any problems when validating this
file directly using Xerces 1.4.4.
-Mark Diggory
Jean-Francois Arcand wrote:
> Which version of Xerces are you using? If it's 2.2, there is a bug
> associated with the problem:
>
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=13282
>
> -- Jeanfrancois
>
>
>
> Mark R. Diggory wrote:
>
>> I keep getting these parsing exceptions when I try to load my custom
>> taglibs (from JAR files) on Tomcat 4.0.3, 4.0.5, 4.1 on Windows 2000/XP.
>>
>>> Starting service Tomcat-Standalone
>>> Apache Tomcat/4.0.3
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error:
>>> "Malformed UTF-8 char
>>> -- is an XML encoding declaration missing?" (line number may be too
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error:
>>> "Malformed UTF-8 char
>>> -- is an XML encoding declaration missing?" (line number may be too
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error:
>>> "Malformed UTF-8 char
>>> -- is an XML encoding declaration missing?" (line number may be too
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error:
>>> "Malformed UTF-8 char
>>> -- is an XML encoding declaration missing?" (line number may be too
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error:
>>> "Malformed UTF-8 char
>>> -- is an XML encoding declaration missing?" (line number may be too
>>> low).
>>> PARSE error at line 1 column -1
>>> org.xml.sax.SAXParseException: Character conversion error:
>>> "Malformed UTF-8 char
>>> -- is an XML encoding declaration missing?" (line number may be too
>>> low).
>>> No tags
>>> No tags
>>
>>
>>
>>
>> I know this is coming from some parsing error when the tld is parsed.
>> But even if I put the tld file into a different encoding (ISO-8859-1),
>> I still get the exceptions.
>>
>> example tld header:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <!DOCTYPE taglib PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag
>> Library 1.2//EN"
>> "http://java.sun.com/dtd/web-jsptaglibrary_1_2.dtd">
>> <taglib>
>> ...
>>
>> -Mark Diggory
>>
>>
>>
>> --
>> To unsubscribe, e-mail:
>> <mailto:[EMAIL PROTECTED]>
>> For additional commands, e-mail:
>> <mailto:[EMAIL PROTECTED]>