Re: [dom4j-user] special entities and JAXP

Robert J. Lebowitz Mon, 18 Feb 2002 21:20:28 -0800

After some further experimentation, I determined that the character encoding of the text from my JTextArea is Cp1252 and not UTF-8. Is there some simple way to set the encoding of a StyledDocument used in a JTextArea?

I'm assuming that character encoding is the source of my problem here, though it may not be the only issue.
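For reference, a minimal sketch of where encoding can and cannot come into play. A Java String is a sequence of UTF-16 chars, so text handed to the parser through a StringReader involves no byte encoding at all. Encoding only matters once the text is turned into bytes, and then the InputSource can be told which one was used. The class name EncodingSketch and both method names are hypothetical, and StandardCharsets assumes Java 7 or later rather than the JDK 1.4 mentioned in the original message.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EncodingSketch {
    // Parse from characters: the parser sees chars, so no byte encoding applies.
    public static void parseChars(String text) throws Exception {
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(text)), new DefaultHandler());
    }

    // Parse from bytes: declare the same encoding that produced the bytes.
    public static void parseBytes(String text) throws Exception {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        InputSource source = new InputSource(new ByteArrayInputStream(bytes));
        source.setEncoding("UTF-8");
        SAXParserFactory.newInstance().newSAXParser()
            .parse(source, new DefaultHandler());
    }

    public static void main(String[] args) throws Exception {
        String text = "<a>caf\u00e9</a>"; // contains a non-ASCII character
        parseChars(text);
        parseBytes(text);
        System.out.println("both parsed");
    }
}
```

If both calls succeed on the same text, the StringReader path used in the wellform method is unlikely to be affected by the platform default encoding.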

Rob

----- Original Message -----

From: Robert J. Lebowitz

To: [EMAIL PROTECTED]

Sent: Monday, February 18, 2002 8:23 PM

Subject: [dom4j-user] special entities and JAXP

While the application I'm writing utilizes dom4j, this question isn't dom4j-specific. I offer it up to the list simply because many of you are proficient with Java and JAXP. I suspect this is a very simple problem, but one I can't quite decipher yet. I apologize if the question is misplaced.

I have written a simple editor which utilizes a JTextArea to hold and visualize text. I have also written a simple method to check for well-formedness (you try to come up with a better adjective) of an XML document. I am using the JAXP classes in conjunction with JDK 1.4.

The text within the JTextArea is extracted using a getDocument().getText() method to obtain a String object.

Well-formedness is determined in the following manner:

    public void wellform(String text) throws IOException, SAXException, ParserConfigurationException {
        SAXParser parser = factory.newSAXParser();

        StringReader reader = new StringReader(text);
        parser.parse(new InputSource(reader), new XMLHandler(holder));
    }

    private static SAXParserFactory factory = SAXParserFactory.newInstance();

XMLHandler is a class I've written to catch certain thrown exceptions. It extends the DefaultHandler adapter just to simplify matters.
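The XMLHandler source is not shown here, so the following is only a sketch of what such a handler might look like, not the actual class. It assumes the handler records each reported problem with its line and column and rethrows fatal errors so that parsing stops. The names CollectingHandlerSketch, CollectingHandler, messages, and check are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class CollectingHandlerSketch {
    // Hypothetical stand-in for XMLHandler: records problems as they are reported.
    public static class CollectingHandler extends DefaultHandler {
        public final List<String> messages = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) { record("warning", e); }

        @Override
        public void error(SAXParseException e) { record("error", e); }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            record("fatal", e);
            throw e; // well-formedness violations are fatal, so stop parsing
        }

        private void record(String level, SAXParseException e) {
            messages.add(level + " at line " + e.getLineNumber()
                + ", column " + e.getColumnNumber() + ": " + e.getMessage());
        }
    }

    // Parses the text and returns whatever the handler recorded.
    public static List<String> check(String text) throws Exception {
        CollectingHandler handler = new CollectingHandler();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(text)), handler);
        } catch (SAXException e) {
            // Already recorded by fatalError; nothing more to do here.
        }
        return handler.messages;
    }

    public static void main(String[] args) throws Exception {
        for (String message : check("<a>testing</a>\n<b>testing</b>")) {
            System.out.println(message);
        }
    }
}
```

Collecting the messages rather than only throwing makes it easier to show the line and column to the user of the editor.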

Here's what I don't understand with regards to my application:

If I write a simple one line XML document, say:

<a>testing</a>

everything works just fine and dandy... the document is declared well-formed, and life goes merrily on its way.

However, if I add a second line directly underneath the first line, any line at all, like:

<b>testing</b>

I get back an exception saying:

Illegal character at end of document, <.

Line 2, column 0.

Now, I know that what is apparently happening is that the less-than character is being interpreted as a special entity. What I don't understand is why it only affects the second line of text and not the first. Is there a carriage return or something that JTextArea sticks in?

Is there some feature that I'm supposed to specify for the XMLReader or SAXParser to avoid this problem? Are there hidden characters coming from the JTextArea I should be aware of, or is there some method I need to modify in my custom ContentHandler (XMLHandler) to overcome this issue?
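One thing worth ruling out, independent of JTextArea: the XML 1.0 specification requires a well-formed document to have exactly one root element, so a second top-level element is rejected by any conforming parser regardless of hidden characters. The sketch below feeds plain Java strings straight to the parser to reproduce that. The class name SingleRootSketch, the method isWellFormed, and the wrapping element doc are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class SingleRootSketch {
    // Returns true if the text parses without a fatal (well-formedness) error.
    public static boolean isWellFormed(String text) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(text)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String oneRoot = "<a>testing</a>";
        String twoRoots = "<a>testing</a>\n<b>testing</b>";
        String wrapped = "<doc>\n<a>testing</a>\n<b>testing</b>\n</doc>";

        System.out.println("one root:  " + isWellFormed(oneRoot));
        System.out.println("two roots: " + isWellFormed(twoRoots));
        System.out.println("wrapped:   " + isWellFormed(wrapped));
    }
}
```

If the two-element text fails here as well, the error comes from the document structure rather than from the text component, and wrapping both elements in a single enclosing element should make it pass.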

Hope this makes sense to some of you,

Rob
