Sent: Monday, February 18, 2002 8:23 PM
Subject: [dom4j-user] special entities and JAXP
While the application I'm writing utilizes dom4j, this
question isn't dom4j-specific. I offer it up to the list simply because
many of you are proficient with Java and JAXP. I suspect this is a very
simple problem, but one I can't quite decipher yet. I apologize if the
question is misplaced.
I have written a simple editor which utilizes a JTextArea to
hold and visualize text. I have written a simple method to check for
well-formedness (you try to come up with a better adjective) of an XML
document. I am using the JAXP classes in conjunction with JDK 1.4.
The text within the JTextArea is extracted using a
getDocument().getText() method to obtain a String object.
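For reference, the extraction step looks roughly like the sketch below. Document.getText takes an offset and a length, so the whole text is read from 0 to getLength(). The class name TextAreaDump and the helper showControlChars are made up for this post and are not part of my editor. The helper is only one guess at how to make carriage returns and line feeds visible before the text reaches the parser, and the code is written for a current JDK rather than 1.4.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;

class TextAreaDump {
    // Reads the entire content of a Swing text document as a String.
    static String textOf(Document doc) {
        try {
            return doc.getText(0, doc.getLength());
        } catch (BadLocationException e) {
            // Not expected: the range 0..getLength() is always within bounds.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical debugging helper: spells out CR, LF and TAB so they show up in a log.
    static String showControlChars(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\r': out.append("\\r"); break;
                case '\n': out.append("\\n"); break;
                case '\t': out.append("\\t"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

With a JTextArea named textArea (an assumed variable name), the call would be TextAreaDump.showControlChars(TextAreaDump.textOf(textArea.getDocument())).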
Well-formedness is determined in the following manner:
public void wellform(String text)
        throws IOException, SAXException, ParserConfigurationException
{
    SAXParser parser = factory.newSAXParser();
    StringReader reader = new StringReader(text);
    parser.parse(new InputSource(reader), new XMLHandler(holder));
}

private static SAXParserFactory factory = SAXParserFactory.newInstance();
XMLHandler is a class I've written to catch certain thrown
exceptions. It extends the DefaultHandler adapter just to simplify
matters.
Here's what I don't understand with regard to my application:
If I write a simple one line XML document, say:
<a>testing</a>
everything works just fine and dandy... the document is
declared well-formed, and life goes merrily on its way.
However, if I add a second line directly underneath the
first line, any line at all, like:
<b>testing</b>
I get back an exception saying:
Illegal character at end of document, <.
Line 2, column 0.
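In case anyone wants to try this without my editor, the sketch below feeds the same two inputs to the parser as plain Strings, with no JTextArea involved. The class name WellFormedRepro and the method isWellFormed are invented for this post, and a bare DefaultHandler stands in for my XMLHandler, so this is an approximation of my setup and not the actual application code. The wording of the exception message will probably differ between parser implementations.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedRepro {
    private static final SAXParserFactory FACTORY = SAXParserFactory.newInstance();

    // Returns true if the parser accepts the text, false if it reports a SAXException.
    static boolean isWellFormed(String text) {
        try {
            SAXParser parser = FACTORY.newSAXParser();
            // DefaultHandler rethrows fatal errors, so a rejected document ends up in the catch below.
            parser.parse(new InputSource(new StringReader(text)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            System.out.println("Rejected: " + e.getMessage());
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The one-line document from above.
        System.out.println(isWellFormed("<a>testing</a>"));
        // The same document with the second line added underneath.
        System.out.println(isWellFormed("<a>testing</a>\n<b>testing</b>"));
    }
}
```

If the second input is rejected here as well, then the JTextArea is presumably not what introduces the problem, since the String is built by hand and contains only a single line feed between the two lines.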
Now, what is apparently happening is that the less-than character is
being interpreted as a special entity. What I don't understand is why it
only affects the second line of text and not the first. Is there a
carriage return or something that JTextArea sticks in?
Is there some feature that I'm supposed to specify for the
XMLReader or SAXParser to avoid this problem? Are there hidden
characters coming from the JTextArea I should be aware of, or is there some
method I need to modify in my custom ContentHandler (XMLHandler) to overcome
this issue?
Hope this makes sense to some of you,
Rob