Re: [dom4j-dev] fix for writing numeric entities in XMLWriter

Terry Steichen Thu, 20 Jun 2002 20:00:00 -0700

Hi Dan,

I've had this nagging problem in that I read in an HTML file with, for
example, &#150;&#150; in it.  I then 'Tidy' it up and use dom4j to extract
the details and construct a Document and write it out using XMLWriter.  What
I end up seeing is ?? where the &#150;&#150; was.  It should (sort of) look
like '--'.  Would your fix handle this, or will it only handle characters
that are actually embedded as raw numbers?


Regards,

Terry

----- Original Message -----
From: "Dan Jacobs" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, June 19, 2002 11:10 PM
Subject: [dom4j-dev] fix for writing numeric entities in XMLWriter


> I found and fixed a simple bug in XMLWriter.java.  A fixed version is
> attached, but not checked into the source repository.  I'll leave that
> for the folks who maintain the sources.
>
> If you parse and print out (using asXML) an XML document containing a
> numeric entity reference such as &#160; (the code for &nbsp;) the code
> was just writing out a byte with a value of 160 (decimal).  The fix
> (abstracted below) is to check for characters with integer codes below
> 32 and above 126 (excluding standard whitespace characters) and encode
> them as numeric entities.  The same fix is made in two places in
> XMLWriter.java.
>
> Thanks for reading.
> -- Dan Jacobs
>
>             char c;     // declaration and assignment added by Dan Jacobs
>             switch( c = text.charAt(i) ) {
>                 case '<' :
>                     entity = "&lt;";
>                     break;
>                 case '>' :
>                     entity = "&gt;";
>                     break;
>                 case '&' :
>                     entity = "&amp;";
>                     break;
>
>                 //!!! Begin code added by Dan Jacobs !!!//
>                 case '\t': case '\n': case '\r':
>                     // don't encode standard whitespace characters
>                     break;
>                 default:
>                     // encode low and high characters as entities
>                     if ((c < 32) || (c >= 127))
>                         entity = "&#" + (int)c + ";";
>                     break;
>                 //!!! End code added by Dan Jacobs !!!//
>             }
>
> --
> Daniel S. Jacobs
> President, Model Objects Group
> Object-Orient Software Engineering
> Java & Web Application Development
>
>
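
For reference, the escaping rule in the quoted patch can be tried on its own, outside XMLWriter. The sketch below is only an illustration: the class name NumericEntityEscape and the helpers escape() and throughAscii() are made up for this example and are not part of dom4j. It also shows one possible (unconfirmed) source of the ?? symptom: Java substitutes '?' when a character cannot be represented in the output encoding, and US-ASCII is used here purely as an assumed stand-in for whatever encoding the real writer had.

```java
import java.nio.charset.StandardCharsets;

public class NumericEntityEscape {

    // Hypothetical helper mirroring the rule in the quoted patch; not dom4j API.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '&':  out.append("&amp;"); break;
                case '\t': case '\n': case '\r':
                    // standard whitespace is passed through unchanged
                    out.append(c);
                    break;
                default:
                    // low and high characters become numeric entities
                    if (c < 32 || c >= 127) {
                        out.append("&#").append((int) c).append(';');
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Round-trips text through US-ASCII (an assumed encoding for this demo);
    // characters the encoding cannot represent come back as '?'.
    static String throughAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String text = "a \u0096\u0096 b";   // two characters with code 150
        System.out.println(throughAscii(text));          // a ?? b
        System.out.println(throughAscii(escape(text)));  // a &#150;&#150; b
    }
}
```

Under these assumptions, a character with code 150 that actually reaches the writer would come out as &#150; instead of being replaced. Whether that covers the HTML case depends on what Tidy and the parser put into the tree beforehand. The sketch works char by char, as the patch does, so a character outside the Basic Multilingual Plane would be written as two surrogate values rather than as one code point.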


--------------------------------------------------------------------------------


> /*
>  * Copyright 2001 (C) MetaStuff, Ltd. All Rights Reserved.
>  *
>  * This software is open source.
>  * See the bottom of this file for the licence.
>  *
>  * $Id: XMLWriter.java,v 1.46 2002/02/14 11:55:46 jstrachan Exp $
>  */
>
> package org.dom4j.io;
>
> import java.io.BufferedOutputStream;
> import java.io.BufferedWriter;
> import java.io.ByteArrayOutputStream;
> import java.io.IOException;
> import java.io.OutputStream;
> import java.io.OutputStreamWriter;
> import java.io.StringWriter;
> import java.io.UnsupportedEncodingException;
> import java.io.Writer;
> import java.util.HashMap;
> import java.util.HashSet;
> import java.util.Iterator;
> import java.util.LinkedList;
> import java.util.List;
> import java.util.Map;
> import java.util.Set;
> import java.util.StringTokenizer;
>
> import org.dom4j.Attribute;
> import org.dom4j.CDATA;
> import org.dom4j.CharacterData;
> import org.dom4j.Comment;
> import org.dom4j.DocumentType;
> import org.dom4j.Document;
> import org.dom4j.Element;
> import org.dom4j.Entity;
> import org.dom4j.Namespace;
> import org.dom4j.Node;
> import org.dom4j.ProcessingInstruction;
> import org.dom4j.Text;
>
> import org.dom4j.tree.NamespaceStack;
>
> import org.xml.sax.Attributes;
> import org.xml.sax.ContentHandler;
> import org.xml.sax.DTDHandler;
> import org.xml.sax.InputSource;
> import org.xml.sax.Locator;
> import org.xml.sax.SAXException;
> import org.xml.sax.SAXNotRecognizedException;
> import org.xml.sax.SAXNotSupportedException;
> import org.xml.sax.XMLReader;
> import org.xml.sax.ext.LexicalHandler;
> import org.xml.sax.helpers.XMLFilterImpl;
>
> /**<p><code>XMLWriter</code> takes a DOM4J tree and formats it to a
>   * stream as XML.
>   * It can also take SAX events, so it can be used by SAX clients, as this
>   * object implements the {@link ContentHandler} and {@link LexicalHandler}
>   * interfaces. This formatter performs typical document
>   * formatting.  The XML declaration and processing instructions are
>   * always on their own lines. An {@link OutputFormat} object can be
>   * used to define how whitespace is handled when printing and allows
>   * various configuration options, such as to allow suppression of the XML
>   * declaration, the encoding declaration or whether empty documents are
>   * collapsed.</p>
>   *
>   * <p> There are <code>write(...)</code> methods to print any of the
>   * standard DOM4J classes, including <code>Document</code> and
>   * <code>Element</code>, to either a <code>Writer</code> or an
>   * <code>OutputStream</code>.  Warning: using your own
>   * <code>Writer</code> may cause the writer's preferred character
>   * encoding to be ignored.  If you use encodings other than UTF8, we
>   * recommend using the method that takes an OutputStream instead.
>   * </p>
>   *
>   * @author <a href="mailto:[EMAIL PROTECTED]">James Strachan</a>
>   * @author Joseph Bowbeer
>   * @version $Revision: 1.46 $
>   */
> public class XMLWriter extends XMLFilterImpl implements LexicalHandler {
>
>     protected static final String[] LEXICAL_HANDLER_NAMES = {
>         "http://xml.org/sax/properties/lexical-handler",
>         "http://xml.org/sax/handlers/LexicalHandler"
>     };
>
>     private static final boolean ESCAPE_TEXT = true;
>     private static final boolean SUPPORT_PAD_TEXT = false;
>
>     protected static final OutputFormat DEFAULT_FORMAT = new OutputFormat();
>
>     /** Stores the last type of node written so algorithms can refer to
>       * the previous node type */
>     protected int lastOutputNodeType;
>
>     /** The Writer used to output to */
>     protected Writer writer;
>
>     /** The stack of namespaces written so far */
>     private NamespaceStack namespaceStack = new NamespaceStack();
>
>     /** The format used by this writer */
>     private OutputFormat format;
>     /** The initial number of indentations (so you can print a whole
>         document indented, if you like) **/
>     private int indentLevel = 0;
>
>     /** buffer used when escaping strings */
>     private StringBuffer buffer = new StringBuffer();
>
>     /** Whether a flush should occur after writing a document */
>     private boolean autoFlush;
>
>     /** Lexical handler we should delegate to */
>     private LexicalHandler lexicalHandler;
>
>     /** Whether comments should appear inside DTD declarations - defaults
>       * to false */
>     private boolean showCommentsInDTDs;
>
>     /** Is the writer currently inside a DTD definition? */
>     private boolean inDTD;
>
>
>     public XMLWriter(Writer writer) {
>         this( writer, DEFAULT_FORMAT );
>     }
>
>     public XMLWriter(Writer writer, OutputFormat format) {
>         this.writer = writer;
>         this.format = format;
>     }
>
>     public XMLWriter() {
>         this.format = DEFAULT_FORMAT;
>         this.writer = new BufferedWriter( new OutputStreamWriter( System.out ) );
>         this.autoFlush = true;
>     }
>
>     public XMLWriter(OutputStream out) throws UnsupportedEncodingException {
>         this.format = DEFAULT_FORMAT;
>         this.writer = createWriter(out, format.getEncoding());
>         this.autoFlush = true;
>     }
>
>     public XMLWriter(OutputStream out, OutputFormat format) throws UnsupportedEncodingException {
>         this.format = format;
>         this.writer = createWriter(out, format.getEncoding());
>         this.autoFlush = true;
>     }
>
>     public XMLWriter(OutputFormat format) throws UnsupportedEncodingException {
>         this.format = format;
>         this.writer = createWriter( System.out, format.getEncoding() );
>         this.autoFlush = true;
>     }
>
>
>     public void setWriter(Writer writer) {
>         this.writer = writer;
>         this.autoFlush = false;
>     }
>
>     public void setOutputStream(OutputStream out) throws UnsupportedEncodingException {
>         this.writer = createWriter(out, format.getEncoding());
>         this.autoFlush = true;
>     }
>
>
>     /** Set the initial indentation level.  This can be used to output
>       * a document (or, more likely, an element) starting at a given
>       * indent level, so it's not always flush against the left margin.
>       * Default: 0
>       *
>       * @param indentLevel the number of indents to start with
>       */
>     public void setIndentLevel(int indentLevel) {
>         this.indentLevel = indentLevel;
>     }
>
>     /** Flushes the underlying Writer */
>     public void flush() throws IOException {
>         writer.flush();
>     }
>
>     /** Closes the underlying Writer */
>     public void close() throws IOException {
>         writer.close();
>     }
>
>     /** Writes the new line text to the underlying Writer */
>     public void println() throws IOException {
>         writer.write( format.getLineSeparator() );
>     }
>
>     /** Writes the given {@link Attribute}.
>       *
>       * @param attribute <code>Attribute</code> to output.
>       */
>     public void write(Attribute attribute) throws IOException {
>         writeAttribute(attribute);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>
>     /** <p>This will print the <code>Document</code> to the current Writer.</p>
>      *
>      * <p> Warning: using your own Writer may cause the writer's
>      * preferred character encoding to be ignored.  If you use
>      * encodings other than UTF8, we recommend using the method that
>      * takes an OutputStream instead.  </p>
>      *
>      * <p>Note: as with all Writers, you may need to flush() yours
>      * after this method returns.</p>
>      *
>      * @param doc <code>Document</code> to format.
>      * @throws <code>IOException</code> - if there's any problem writing.
>      **/
>     public void write(Document doc) throws IOException {
>         writeDeclaration();
>
>         if (doc.getDocType() != null) {
>             indent();
>             writeDocType(doc.getDocType());
>         }
>
>         for ( int i = 0, size = doc.nodeCount(); i < size; i++ ) {
>             Node node = doc.node(i);
>             writeNode( node );
>         }
>         writePrintln();
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** <p>Writes the <code>{@link Element}</code>, including
>       * its <code>{@link Attribute}</code>s, and its value, and all
>       * its content (child nodes) to the current Writer.</p>
>       *
>       * @param element <code>Element</code> to output.
>       */
>     public void write(Element element) throws IOException {
>         writeElement(element);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>
>     /** Writes the given {@link CDATA}.
>       *
>       * @param cdata <code>CDATA</code> to output.
>       */
>     public void write(CDATA cdata) throws IOException {
>         writeCDATA( cdata.getText() );
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link Comment}.
>       *
>       * @param comment <code>Comment</code> to output.
>       */
>     public void write(Comment comment) throws IOException {
>         writeComment( comment.getText() );
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link DocumentType}.
>       *
>       * @param docType <code>DocumentType</code> to output.
>       */
>     public void write(DocumentType docType) throws IOException {
>         writeDocType(docType);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>
>     /** Writes the given {@link Entity}.
>       *
>       * @param entity <code>Entity</code> to output.
>       */
>     public void write(Entity entity) throws IOException {
>         writeEntity( entity );
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>
>     /** Writes the given {@link Namespace}.
>       *
>       * @param namespace <code>Namespace</code> to output.
>       */
>     public void write(Namespace namespace) throws IOException {
>         writeNamespace(namespace);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link ProcessingInstruction}.
>       *
>       * @param processingInstruction <code>ProcessingInstruction</code> to output.
>       */
>     public void write(ProcessingInstruction processingInstruction) throws IOException {
>         writeProcessingInstruction(processingInstruction);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** <p>Prints out a {@link String}, performing
>       * the necessary entity escaping and whitespace stripping.</p>
>       *
>       * @param text is the text to output
>       */
>     public void write(String text) throws IOException {
>         writeString(text);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link Text}.
>       *
>       * @param text <code>Text</code> to output.
>       */
>     public void write(Text text) throws IOException {
>         writeString(text.getText());
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link Node}.
>       *
>       * @param node <code>Node</code> to output.
>       */
>     public void write(Node node) throws IOException {
>         writeNode(node);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given object which should be a String, a Node or a List
>       * of Nodes.
>       *
>       * @param object is the object to output.
>       */
>     public void write(Object object) throws IOException {
>         if (object instanceof Node) {
>             write((Node) object);
>         }
>         else if (object instanceof String) {
>             write((String) object);
>         }
>         else if (object instanceof List) {
>             List list = (List) object;
>             for ( int i = 0, size = list.size(); i < size; i++ ) {
>                 write( list.get(i) );
>             }
>         }
>         else if (object != null) {
>             throw new IOException( "Invalid object: " + object );
>         }
>     }
>
>
>     /** <p>Writes the opening tag of an {@link Element},
>       * including its {@link Attribute}s
>       * but without its content.</p>
>       *
>       * @param element <code>Element</code> to output.
>       */
>     public void writeOpen(Element element) throws IOException {
>         writer.write("<");
>         writer.write( element.getQualifiedName() );
>         writeAttributes(element);
>         writer.write(">");
>     }
>
>     /** <p>Writes the closing tag of an {@link Element}</p>
>       *
>       * @param element <code>Element</code> to output.
>       */
>     public void writeClose(Element element) throws IOException {
>         writeClose( element.getQualifiedName() );
>     }
>
>
>     // XMLFilterImpl methods
>     //-------------------------------------------------------------------------
>     public void parse(InputSource source) throws IOException, SAXException {
>         installLexicalHandler();
>         super.parse(source);
>     }
>
>
>     public void setProperty(String name, Object value) throws SAXNotRecognizedException, SAXNotSupportedException {
>         for (int i = 0; i < LEXICAL_HANDLER_NAMES.length; i++) {
>             if (LEXICAL_HANDLER_NAMES[i].equals(name)) {
>                 setLexicalHandler((LexicalHandler) value);
>                 return;
>             }
>         }
>         super.setProperty(name, value);
>     }
>
>     public Object getProperty(String name) throws SAXNotRecognizedException, SAXNotSupportedException {
>         for (int i = 0; i < LEXICAL_HANDLER_NAMES.length; i++) {
>             if (LEXICAL_HANDLER_NAMES[i].equals(name)) {
>                 return getLexicalHandler();
>             }
>         }
>         return super.getProperty(name);
>     }
>
>     public void setLexicalHandler (LexicalHandler handler) {
>         if (handler == null) {
>             throw new NullPointerException("Null lexical handler");
>         }
>         else {
>             this.lexicalHandler = handler;
>         }
>     }
>
>     public LexicalHandler getLexicalHandler(){
>         return lexicalHandler;
>     }
>
>
>     // ContentHandler interface
>     //-------------------------------------------------------------------------
>     public void setDocumentLocator(Locator locator) {
>         super.setDocumentLocator(locator);
>     }
>
>     public void startDocument() throws SAXException {
>         try {
>             writeDeclaration();
>             super.startDocument();
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>     public void endDocument() throws SAXException {
>         super.endDocument();
>     }
>
>     public void startPrefixMapping(String prefix, String uri) throws SAXException {
>         super.startPrefixMapping(prefix, uri);
>     }
>
>     public void endPrefixMapping(String prefix) throws SAXException {
>         super.endPrefixMapping(prefix);
>     }
>
>
>     public void startElement(String namespaceURI, String localName, String qName, Attributes attributes) throws SAXException {
>         try {
>             writePrintln();
>             indent();
>             writer.write("<");
>             writer.write(qName);
>             writeAttributes( attributes );
>             writer.write(">");
>             ++indentLevel;
>             lastOutputNodeType = Node.ELEMENT_NODE;
>
>             super.startElement( namespaceURI, localName, qName, attributes );
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>     public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
>         try {
>             --indentLevel;
>             if ( lastOutputNodeType == Node.ELEMENT_NODE ) {
>                 writePrintln();
>                 indent();
>             }
>
>             // XXXX: need to determine this using a stack and checking for
>             // content / children
>             boolean hadContent = true;
>             if ( hadContent ) {
>                 writeClose(qName);
>             }
>             else {
>                 writeEmptyElementClose(qName);
>             }
>             lastOutputNodeType = Node.ELEMENT_NODE;
>
>             super.endElement( namespaceURI, localName, qName );
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>     public void characters(char[] ch, int start, int length) throws SAXException {
>         try {
>             write( new String( ch, start, length ) );
>
>             super.characters(ch, start, length);
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>     public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
>         super.ignorableWhitespace(ch, start, length);
>     }
>
>     public void processingInstruction(String target, String data) throws SAXException {
>         try {
>             indent();
>             writer.write("<?");
>             writer.write(target);
>             writer.write(" ");
>             writer.write(data);
>             writer.write("?>");
>             writePrintln();
>             lastOutputNodeType = Node.PROCESSING_INSTRUCTION_NODE;
>
>             super.processingInstruction(target, data);
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>
>
>     // DTDHandler interface
>     //-------------------------------------------------------------------------
>     public void notationDecl(String name, String publicID, String systemID) throws SAXException {
>         super.notationDecl(name, publicID, systemID);
>     }
>
>     public void unparsedEntityDecl(String name, String publicID, String systemID, String notationName) throws SAXException {
>         super.unparsedEntityDecl(name, publicID, systemID, notationName);
>     }
>
>
>     // LexicalHandler interface
>     //-------------------------------------------------------------------------
>     public void startDTD(String name, String publicID, String systemID) throws SAXException {
>         inDTD = true;
>         try {
>             writeDocType(name, publicID, systemID);
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.startDTD(name, publicID, systemID);
>         }
>     }
>
>     public void endDTD() throws SAXException {
>         inDTD = false;
>         if (lexicalHandler != null) {
>             lexicalHandler.endDTD();
>         }
>     }
>
>     public void startCDATA() throws SAXException {
>         try {
>             writer.write( "<![CDATA[" );
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.startCDATA();
>         }
>     }
>
>     public void endCDATA() throws SAXException {
>         try {
>             writer.write( "]]>" );
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.endCDATA();
>         }
>     }
>
>     public void startEntity(String name) throws SAXException {
>         try {
>             writeEntityRef(name);
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.startEntity(name);
>         }
>     }
>
>     public void endEntity(String name) throws SAXException {
>         if (lexicalHandler != null) {
>             lexicalHandler.endEntity(name);
>         }
>     }
>
>     public void comment(char[] ch, int start, int length) throws SAXException {
>         if ( showCommentsInDTDs || ! inDTD ) {
>             try {
>                 writeComment( new String(ch, start, length) );
>             }
>             catch (IOException e) {
>                 handleException(e);
>             }
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.comment(ch, start, length);
>         }
>     }
>
>
>
>     // Implementation methods
>     //-------------------------------------------------------------------------
>     protected void writeElement(Element element) throws IOException {
>         int size = element.nodeCount();
>         String qualifiedName = element.getQualifiedName();
>
>         writePrintln();
>         indent();
>
>         writer.write("<");
>         writer.write(qualifiedName);
>
>         int previouslyDeclaredNamespaces = namespaceStack.size();
>         Namespace ns = element.getNamespace();
>         if (isNamespaceDeclaration( ns ) ) {
>             namespaceStack.push(ns);
>             writeNamespace(ns);
>         }
>
>         // Print out additional namespace declarations
>         boolean textOnly = true;
>         for ( int i = 0; i < size; i++ ) {
>             Node node = element.node(i);
>             if ( node instanceof Namespace ) {
>                 Namespace additional = (Namespace) node;
>                 if (isNamespaceDeclaration( additional ) ) {
>                     namespaceStack.push(additional);
>                     writeNamespace(additional);
>                 }
>             }
>             else if ( node instanceof Element) {
>                 textOnly = false;
>             }
>         }
>
>         writeAttributes(element);
>
>         lastOutputNodeType = Node.ELEMENT_NODE;
>
>         if ( size <= 0 ) {
>             writeEmptyElementClose(qualifiedName);
>         }
>         else {
>             writer.write(">");
>             if ( textOnly ) {
>                 // we have at least one text node so lets assume
>                 // that its non-empty
>                 writeElementContent(element);
>             }
>             else {
>                 // we know it's not null or empty from above
>                 ++indentLevel;
>
>                 writeElementContent(element);
>
>                 --indentLevel;
>
>                 writePrintln();
>                 indent();
>             }
>             writer.write("</");
>             writer.write(qualifiedName);
>             writer.write(">");
>         }
>
>         // remove declared namespaceStack from stack
>         while (namespaceStack.size() > previouslyDeclaredNamespaces) {
>             namespaceStack.pop();
>         }
>
>         lastOutputNodeType = Node.ELEMENT_NODE;
>     }
>
>     /** Outputs the content of the given element. If whitespace trimming is
>      * enabled then all adjacent text nodes are appended together before
>      * the whitespace trimming occurs to avoid problems with multiple
>      * text nodes being created due to text content that spans parser
>      * buffers in a SAX parser.
>      */
>     protected void writeElementContent(Element element) throws IOException {
>         if (format.isTrimText()) {
>             // concatenate adjacent text nodes together
>             // so that whitespace trimming works properly
>             Text lastTextNode = null;
>             StringBuffer buffer = null;
>             for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
>                 Node node = element.node(i);
>                 if ( node instanceof Text ) {
>                     if ( lastTextNode == null ) {
>                         lastTextNode = (Text) node;
>                     }
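>                     else {
>                         // create the buffer only once, so that a third or
>                         // later adjacent text node does not discard earlier text
>                         if ( buffer == null ) {
>                             buffer = new StringBuffer( lastTextNode.getText() );
>                         }
>                         buffer.append( ((Text) node).getText() );
>                     }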
>                 }
>                 else {
>                     if ( lastTextNode != null ) {
>                         if ( buffer != null ) {
>                             writeString( buffer.toString() );
>                             buffer = null;
>                         }
>                         else {
>                             writeString( lastTextNode.getText() );
>                         }
>                         lastTextNode = null;
>                     }
>                     writeNode(node);
>                 }
>             }
>             if ( lastTextNode != null ) {
>                 if ( buffer != null ) {
>                     writeString( buffer.toString() );
>                     buffer = null;
>                 }
>                 else {
>                     writeString( lastTextNode.getText() );
>                 }
>                 lastTextNode = null;
>             }
>         }
>         else {
>             for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
>                 Node node = element.node(i);
>                 writeNode(node);
>             }
>         }
>     }
>     protected void writeCDATA(String text) throws IOException {
>         writer.write( "<![CDATA[" );
>         writer.write( text );
>         writer.write( "]]>" );
>
>         lastOutputNodeType = Node.CDATA_SECTION_NODE;
>     }
>
>     protected void writeDocType(DocumentType docType) throws IOException {
>         if (docType != null) {
>             docType.write( writer );
>             //writeDocType( docType.getElementName(), docType.getPublicID(), docType.getSystemID() );
>             writePrintln();
>         }
>     }
>
>
>     protected void writeNamespace(Namespace namespace) throws IOException {
>         if ( namespace != null ) {
>             String prefix = namespace.getPrefix();
>             writer.write(" xmlns");
>             if (prefix != null && prefix.length() > 0) {
>                 writer.write(":");
>                 writer.write(prefix);
>             }
>             writer.write("=\"");
>             writer.write(namespace.getURI());
>             writer.write("\"");
>         }
>     }
>
>     protected void writeProcessingInstruction(ProcessingInstruction processingInstruction) throws IOException {
>         //indent();
>         writer.write( "<?" );
>         writer.write( processingInstruction.getName() );
>         writer.write( " " );
>         writer.write( processingInstruction.getText() );
>         writer.write( "?>" );
>         writePrintln();
>
>         lastOutputNodeType = Node.PROCESSING_INSTRUCTION_NODE;
>     }
>
>     protected void writeString(String text) throws IOException {
>         if ( text != null && text.length() > 0 ) {
>             if ( ESCAPE_TEXT ) {
>                 text = escapeElementEntities(text);
>             }
>
>             if ( SUPPORT_PAD_TEXT ) {
>                 if (lastOutputNodeType == Node.ELEMENT_NODE) {
>                     String padText = getPadText();
>                     if ( padText != null ) {
>                         writer.write(padText);
>                     }
>                 }
>             }
>
>             if (format.isTrimText()) {
>                 boolean first = true;
>                 StringTokenizer tokenizer = new StringTokenizer(text);
>                 while (tokenizer.hasMoreTokens()) {
>                     String token = tokenizer.nextToken();
>                     if ( first ) {
>                         first = false;
>                         if ( lastOutputNodeType == Node.TEXT_NODE ) {
>                             writer.write(" ");
>                         }
>                     }
>                     else {
>                         writer.write(" ");
>                     }
>                     writer.write(token);
>                     lastOutputNodeType = Node.TEXT_NODE;
>                 }
>             }
>             else {
>                 lastOutputNodeType = Node.TEXT_NODE;
>                 writer.write(text);
>             }
>         }
>     }
>
>
>     protected void writeNode(Node node) throws IOException {
>         int nodeType = node.getNodeType();
>         switch (nodeType) {
>             case Node.ELEMENT_NODE:
>                 writeElement((Element) node);
>                 break;
>             case Node.ATTRIBUTE_NODE:
>                 writeAttribute((Attribute) node);
>                 break;
>             case Node.TEXT_NODE:
>                 writeString(node.getText());
>                 //write((Text) node);
>                 break;
>             case Node.CDATA_SECTION_NODE:
>                 writeCDATA(node.getText());
>                 break;
>             case Node.ENTITY_REFERENCE_NODE:
>                 writeEntity((Entity) node);
>                 break;
>             case Node.PROCESSING_INSTRUCTION_NODE:
>                 writeProcessingInstruction((ProcessingInstruction) node);
>                 break;
>             case Node.COMMENT_NODE:
>                 writeComment(node.getText());
>                 break;
>             case Node.DOCUMENT_NODE:
>                 write((Document) node);
>                 break;
>             case Node.DOCUMENT_TYPE_NODE:
>                 writeDocType((DocumentType) node);
>                 break;
>             case Node.NAMESPACE_NODE:
>                 // Will be output with attributes
>                 //write((Namespace) node);
>                 break;
>             default:
>                 throw new IOException( "Invalid node type: " + node );
>         }
>     }
>
>
>
>
>     protected void installLexicalHandler() {
>         XMLReader parent = getParent();
>         if (parent == null) {
>             throw new NullPointerException("No parent for filter");
>         }
>         // try to register for lexical events
>         for (int i = 0; i < LEXICAL_HANDLER_NAMES.length; i++) {
>             try {
>                 parent.setProperty(LEXICAL_HANDLER_NAMES[i], this);
>                 break;
>             }
>             catch (SAXNotRecognizedException ex) {
>                 // ignore
>             }
>             catch (SAXNotSupportedException ex) {
>                 // ignore
>             }
>         }
>     }
>
>     protected void writeDocType(String name, String publicID, String systemID)
>             throws IOException {
>         boolean hasPublic = false;
>
>         writer.write("<!DOCTYPE ");
>         writer.write(name);
>         if ((publicID != null) && (!publicID.equals(""))) {
>             writer.write(" PUBLIC \"");
>             writer.write(publicID);
>             writer.write("\"");
>             hasPublic = true;
>         }
>         if ((systemID != null) && (!systemID.equals(""))) {
>             if (!hasPublic) {
>                 writer.write(" SYSTEM");
>             }
>             writer.write(" \"");
>             writer.write(systemID);
>             writer.write("\"");
>         }
>         writer.write(">");
>         writePrintln();
>     }
>
>     protected void writeEntity(Entity entity) throws IOException {
>         writeEntityRef( entity.getName() );
>     }
>
>     protected void writeEntityRef(String name) throws IOException {
>         writer.write( "&" );
>         writer.write( name );
>         writer.write( ";" );
>
>         lastOutputNodeType = Node.ENTITY_REFERENCE_NODE;
>     }
>
>     protected void writeComment(String text) throws IOException {
>         if (format.isNewlines()) {
>             if ( lastOutputNodeType != Node.COMMENT_NODE ) {
>                 println();
>             }
>             indent();
>         }
>         writer.write( "<!--" );
>         writer.write( text );
>         writer.write( "-->" );
>
>         writePrintln();
>
>         lastOutputNodeType = Node.COMMENT_NODE;
>     }
>
>     /** Writes the attributes of the given element
>       *
>       */
>     protected void writeAttributes( Element element ) throws IOException {
>
>         // I do not yet handle the case where the same prefix maps to
>         // two different URIs. For attributes on the same element
>         // this is illegal; but as yet we don't throw an exception
>         // if someone tries to do this
>         for ( int i = 0, size = element.attributeCount(); i < size; i++ ) {
>             Attribute attribute = element.attribute(i);
>             Namespace ns = attribute.getNamespace();
>             if (ns != null && ns != Namespace.NO_NAMESPACE
>                     && ns != Namespace.XML_NAMESPACE) {
>                 String prefix = ns.getPrefix();
>                 String uri = namespaceStack.getURI(prefix);
>                 // output a new namespace declaration
>                 if (!ns.getURI().equals(uri)) {
>                     writeNamespace(ns);
>                     namespaceStack.push(ns);
>                 }
>             }
>
>             writer.write(" ");
>             writer.write(attribute.getQualifiedName());
>             writer.write("=\"");
>             writeEscapeAttributeEntities(attribute.getValue());
>             writer.write("\"");
>         }
>     }
>
>     protected void writeAttribute(Attribute attribute) throws IOException {
>         writer.write(" ");
>         writer.write(attribute.getQualifiedName());
>         writer.write("=");
> 
>         writer.write("\"");
>         
>         writeEscapeAttributeEntities(attribute.getValue());
>         
>         writer.write("\"");
>         lastOutputNodeType = Node.ATTRIBUTE_NODE;
>     }
> 
>     protected void writeAttributes(Attributes attributes) throws IOException {
>         for (int i = 0, size = attributes.getLength(); i < size; i++) {
>             writeAttribute( attributes, i );
>         }
>     }
> 
>     protected void writeAttribute(Attributes attributes, int index)
>             throws IOException {
>         writer.write(" ");
>         writer.write(attributes.getQName(index));
>         writer.write("=\"");        
>         writeEscapeAttributeEntities(attributes.getValue(index));
>         writer.write("\"");
>     }
> 
>     
>     
>     protected void indent() throws IOException {
>         String indent = format.getIndent();
>         if ( indent != null && indent.length() > 0 ) {
>             for ( int i = 0; i < indentLevel; i++ ) {
>                 writer.write(indent);
>             }
>         }
>     }
>
>     /**
>      * <p>
>      * This will print a new line only if the newlines flag was set to true
>      * </p>
>      *
>      * @param out <code>Writer</code> to write to
>      */
>     protected void writePrintln() throws IOException  {
>         if (format.isNewlines()) {
>             writer.write( format.getLineSeparator() );
>         }
>     }
>
>     /**
>      * Get an OutputStreamWriter, use preferred encoding.
>      */
>     protected Writer createWriter(OutputStream outStream, String encoding)
>             throws UnsupportedEncodingException {
>         return new BufferedWriter(
>             new OutputStreamWriter( outStream, encoding )
>         );
>     }
> 
>     /**
>      * <p>
>      * This will write the declaration to the given Writer.
>      *   Assumes XML version 1.0 since we don't directly know.
>      * </p>
>      */
>     protected void writeDeclaration() throws IOException {
>         String encoding = format.getEncoding();
>         
>         // Only print if declaration is not suppressed
>         if (! format.isSuppressDeclaration()) {
>             // Assume 1.0 version
>             if (encoding.equals("UTF8")) {
>                 writer.write("<?xml version=\"1.0\"");
>                 if (!format.isOmitEncoding()) {
>                     writer.write(" encoding=\"UTF-8\"");
>                 }
>                 writer.write("?>");
>             } else {
>                 writer.write("<?xml version=\"1.0\"");
>                 if (! format.isOmitEncoding()) {
>                     writer.write(" encoding=\"" + encoding + "\"");
>                 }
>                 writer.write("?>");
>             }
>             println();
>         }        
>     }    
> 
>     protected void writeClose(String qualifiedName) throws IOException {
>         writer.write("</");
>         writer.write(qualifiedName);
>         writer.write(">");
>     }
> 
> 
>     protected void writeEmptyElementClose(String qualifiedName) throws IOException {
>         // Simply close up
>         if (! isExpandEmptyElements()) {
>             writer.write("/>");
>         } else {
>             writer.write("></");
>             writer.write(qualifiedName);
>             writer.write(">");
>         }
>     }
>
>     protected boolean isExpandEmptyElements() {
>         return format.isExpandEmptyElements();
>     }
>
>
>     /** This will take the pre-defined entities in XML 1.0 and
>       * convert their character representation to the appropriate
>       * entity reference, suitable for XML element content.
>       */
>     protected String escapeElementEntities(String text) {
>         char[] block = null;
>         int i, last = 0, size = text.length();
>         for ( i = 0; i < size; i++ ) {
>             String entity = null;
>             char c;     // declaration and assignment added by Dan Jacobs
>             switch( c = text.charAt(i) ) {
>                 case '<' :
>                     entity = "&lt;";
>                     break;
>                 case '>' :
>                     entity = "&gt;";
>                     break;
>                 case '&' :
>                     entity = "&amp;";
>                     break;
>
>                 //!!! Begin code added by Dan Jacobs !!!//
>                 case '\t': case '\n': case '\r':
>                     // don't encode standard whitespace characters
>                     break;
>                 default:
>                     // encode low and high characters as entities
>                     if ((c < 32) || (c >= 127))
>                         entity = "&#" + (int)c + ";";
>                     break;
>                 //!!! End code added by Dan Jacobs !!!//
>             }
>             if (entity != null) {
>                 if ( block == null ) {
>                     block = text.toCharArray();
>                 }
>                 buffer.append(block, last, i - last);
>                 buffer.append(entity);
>                 last = i + 1;
>             }
>         }
>         if ( last == 0 ) {
>             return text;
>         }
>         if ( last < size ) {
>             if ( block == null ) {
>                 block = text.toCharArray();
>             }
>             buffer.append(block, last, i - last);
>         }
>         String answer = buffer.toString();
>         buffer.setLength(0);
>         return answer;
>     }
>
>     protected void writeEscapeAttributeEntities(String text) throws IOException {
>         if ( text != null ) {
>             String escapedText = escapeAttributeEntities( text );
>             writer.write( escapedText );
>         }
>     }
>     /** This will take the pre-defined entities in XML 1.0 and
>       * convert their character representation to the appropriate
>       * entity reference, suitable for XML attributes.
>       */
>     protected String escapeAttributeEntities(String text) {
>         char[] block = null;
>         int i, last = 0, size = text.length();
>         for ( i = 0; i < size; i++ ) {
>             String entity = null;
>             char c;     // declaration and assignment added by Dan Jacobs
>             switch( c = text.charAt(i) ) {
>                 case '<' :
>                     entity = "&lt;";
>                     break;
>                 case '>' :
>                     entity = "&gt;";
>                     break;
>                 case '\'' :
>                     entity = "&apos;";
>                     break;
>                 case '\"' :
>                     entity = "&quot;";
>                     break;
>                 case '&' :
>                     entity = "&amp;";
>                     break;
>
>                 //!!! Begin code added by Dan Jacobs !!!//
>                 case '\t': case '\n': case '\r':
>                     // don't encode standard whitespace characters
>                     break;
>                 default:
>                     // encode low and high characters as entities
>                     if ((c < 32) || (c >= 127))
>                         entity = "&#" + (int)c + ";";
>                     break;
>                 //!!! End code added by Dan Jacobs !!!//
>             }
>             if (entity != null) {
>                 if ( block == null ) {
>                     block = text.toCharArray();
>                 }
>                 buffer.append(block, last, i - last);
>                 buffer.append(entity);
>                 last = i + 1;
>             }
>         }
>         if ( last == 0 ) {
>             return text;
>         }
>         if ( last < size ) {
>             if ( block == null ) {
>                 block = text.toCharArray();
>             }
>             buffer.append(block, last, i - last);
>         }
>         String answer = buffer.toString();
>         buffer.setLength(0);
>         return answer;
>     }
>
>     protected boolean isNamespaceDeclaration( Namespace ns ) {
>         if (ns != null && ns != Namespace.NO_NAMESPACE
>                 && ns != Namespace.XML_NAMESPACE) {
>             String uri = ns.getURI();
>             if ( uri != null && uri.length() > 0 ) {
>                 if ( ! namespaceStack.contains( ns ) ) {
>                     return true;
>
>                 }
>             }
>         }
>         return false;
>     }
>
>     protected void handleException(IOException e) throws SAXException {
>         throw new SAXException(e);
>     }
>
>     protected String getPadText() {
>         return null;
>     }
> }
>
>
>
>
> /*
>  * Redistribution and use of this software and associated documentation
>  * ("Software"), with or without modification, are permitted provided
>  * that the following conditions are met:
>  *
>  * 1. Redistributions of source code must retain copyright
>  *    statements and notices.  Redistributions must also contain a
>  *    copy of this document.
>  *
>  * 2. Redistributions in binary form must reproduce the
>  *    above copyright notice, this list of conditions and the
>  *    following disclaimer in the documentation and/or other
>  *    materials provided with the distribution.
>  *
>  * 3. The name "DOM4J" must not be used to endorse or promote
>  *    products derived from this Software without prior written
>  *    permission of MetaStuff, Ltd.  For written permission,
>  *    please contact [EMAIL PROTECTED]
>  *
>  * 4. Products derived from this Software may not be called "DOM4J"
>  *    nor may "DOM4J" appear in their names without prior written
>  *    permission of MetaStuff, Ltd. DOM4J is a registered
>  *    trademark of MetaStuff, Ltd.
>  *
>  * 5. Due credit should be given to the DOM4J Project
>  *    (http://dom4j.org/).
>  *
>  * THIS SOFTWARE IS PROVIDED BY METASTUFF, LTD. AND CONTRIBUTORS
>  * ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT
>  * NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
>  * FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL
>  * METASTUFF, LTD. OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
>  * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
>  * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
>  * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
>  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
>  * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
>  * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
>  * OF THE POSSIBILITY OF SUCH DAMAGE.
>  *
>  * Copyright 2001 (C) MetaStuff, Ltd. All Rights Reserved.
>  *
>  * $Id: XMLWriter.java,v 1.46 2002/02/14 11:55:46 jstrachan Exp $
>  */
>
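
For anyone who wants to try the escaping rule from the quoted patch outside of dom4j, here is a minimal stand-alone sketch. It is not the dom4j code: the class name NumericEntitySketch and the method escapeText are hypothetical, and it builds the result with a local StringBuilder instead of the shared buffer field used in XMLWriter. It only shows the rule itself: escape <, > and &, pass tab, newline and carriage return through, and write everything else below 32 or at or above 127 as a decimal character reference.

```java
// Illustrative sketch only. NumericEntitySketch and escapeText are
// hypothetical names and are not part of dom4j.
final class NumericEntitySketch {

    /**
     * Escapes markup characters and writes characters outside the
     * printable ASCII range as decimal character references.
     */
    static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '&':
                    out.append("&amp;");
                    break;
                case '\t': case '\n': case '\r':
                    // standard whitespace is written unchanged
                    out.append(c);
                    break;
                default:
                    if (c < 32 || c >= 127) {
                        // low and high characters become &#NNN;
                        out.append("&#").append((int) c).append(';');
                    } else {
                        out.append(c);
                    }
                    break;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+00A0 is the no-break space, U+0096 is what &#150; parses to
        System.out.println(escapeText("a < b\u00A0&\u0096"));
        // prints: a &lt; b&#160;&amp;&#150;
    }
}
```

Two caveats apply to this rule in general. Control characters below 32 other than tab, newline and carriage return are not legal in XML 1.0 even when written as character references, so a stricter writer might reject or drop them. Also, encoding one char at a time splits a surrogate pair into two references, so characters outside the Basic Multilingual Plane would need to be handled by code point.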



_______________________________________________
dom4j-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dom4j-dev
