Hi Dan,

I've had a nagging problem: I read in an HTML file with, for example, –– in it. I then 'Tidy' it up, use dom4j to extract the details and construct a Document, and write it out using XMLWriter. What I end up seeing is ?? where the –– was; it should (sort of) look like '--'. Would your fix handle this, or will it only handle characters that are actually embedded as raw numbers?
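In case it helps narrow this down, here is a minimal sketch of what I suspect is going on. It uses only the JDK (no Tidy or dom4j involved) and assumes the '?' appears because the output encoding cannot represent the en dash (U+2013). The class name EnDashDemo and the helpers escapeNonAscii and roundTripAscii are made up for illustration; escapeNonAscii only mirrors the range check in your patch and is not the XMLWriter code itself.

```java
import java.nio.charset.StandardCharsets;

public class EnDashDemo {

    // Hypothetical helper mirroring the patch's range check; not part of dom4j.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean whitespace = (c == '\t' || c == '\n' || c == '\r');
            if (!whitespace && (c < 32 || c >= 127)) {
                // write the character as a decimal numeric character reference
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Encodes to US-ASCII and decodes again, standing in for a narrow output encoding.
    // String.getBytes(Charset) replaces unmappable characters with '?'.
    static String roundTripAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String dashes = "\u2013\u2013"; // two en dashes
        System.out.println(roundTripAscii(dashes));                 // prints ??
        System.out.println(roundTripAscii(escapeNonAscii(dashes))); // prints &#8211;&#8211;
    }
}
```

If that assumption holds, escaping by character code should cover my case too, since it works on the parsed character no matter how it was written in the source file. One caveat I can see: escaping one char at a time would split characters outside the Basic Multilingual Plane into two surrogate references, which I believe is not valid XML.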

Regards,
Terry

----- Original Message -----
From: "Dan Jacobs" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, June 19, 2002 11:10 PM
Subject: [dom4j-dev] fix for writing numeric entities in XMLWriter

> I found and fixed a simple bug in XMLWriter.java. A fixed version is
> attached, but not checked into the source repository. I'll leave that
> for the folks who maintain the sources.
>
> If you parse and print out (using asXML) an XML document containing a
> numeric entity reference such as &#160; (the code for &nbsp;) the code
> was just writing out a byte with a value of 160 (decimal). The fix
> (abstracted below) is to check for characters with integer codes below
> 32 and above 126 (excluding standard whitespace characters) and encode
> them as numeric entities. The same fix is made in two places in
> XMLWriter.java.
>
> Thanks for reading.
> -- Dan Jacobs
>
>     char c; // declaration and assignment added by Dan Jacobs
>     switch( c = text.charAt(i) ) {
>         case '<' :
>             entity = "&lt;";
>             break;
>         case '>' :
>             entity = "&gt;";
>             break;
>         case '&' :
>             entity = "&amp;";
>             break;
>
>         //!!! Begin code added by Dan Jacobs !!!//
>         case '\t': case '\n': case '\r':
>             // don't encode standard whitespace characters
>             break;
>         default:
>             // encode low and high characters as entities
>             if ((c < 32) || (c >= 127))
>                 entity = "&#" + (int)c + ";";
>             break;
>         //!!! End code added by Dan Jacobs !!!//
>     }
>
> --
> Daniel S. Jacobs
> President, Model Objects Group
> Object-Orient Software Engineering
> Java & Web Application Development
>
> --------------------------------------------------------------------------------
> /*
>  * Copyright 2001 (C) MetaStuff, Ltd. All Rights Reserved.
>  *
>  * This software is open source.
>  * See the bottom of this file for the licence.
>  *
>  * $Id: XMLWriter.java,v 1.46 2002/02/14 11:55:46 jstrachan Exp $
>  */
>
> package org.dom4j.io;
>
> import java.io.BufferedOutputStream;
> import java.io.BufferedWriter;
> import java.io.ByteArrayOutputStream;
> import java.io.IOException;
> import java.io.OutputStream;
> import java.io.OutputStreamWriter;
> import java.io.StringWriter;
> import java.io.UnsupportedEncodingException;
> import java.io.Writer;
> import java.util.HashMap;
> import java.util.HashSet;
> import java.util.Iterator;
> import java.util.LinkedList;
> import java.util.List;
> import java.util.Map;
> import java.util.Set;
> import java.util.StringTokenizer;
>
> import org.dom4j.Attribute;
> import org.dom4j.CDATA;
> import org.dom4j.CharacterData;
> import org.dom4j.Comment;
> import org.dom4j.DocumentType;
> import org.dom4j.Document;
> import org.dom4j.Element;
> import org.dom4j.Entity;
> import org.dom4j.Namespace;
> import org.dom4j.Node;
> import org.dom4j.ProcessingInstruction;
> import org.dom4j.Text;
>
> import org.dom4j.tree.NamespaceStack;
>
> import org.xml.sax.Attributes;
> import org.xml.sax.ContentHandler;
> import org.xml.sax.DTDHandler;
> import org.xml.sax.InputSource;
> import org.xml.sax.Locator;
> import org.xml.sax.SAXException;
> import org.xml.sax.SAXNotRecognizedException;
> import org.xml.sax.SAXNotSupportedException;
> import org.xml.sax.XMLReader;
> import org.xml.sax.ext.LexicalHandler;
> import org.xml.sax.helpers.XMLFilterImpl;
>
> /**<p><code>XMLWriter</code> takes a DOM4J tree and formats it to a
>  * stream as XML.
>  * It can also take SAX events too so can be used by SAX clients as this object
>  * implements the {@link ContentHandler} and {@link LexicalHandler} interfaces.
>  * as well. This formatter performs typical document
>  * formatting. The XML declaration and processing instructions are
>  * always on their own lines.
>  * An {@link OutputFormat} object can be
>  * used to define how whitespace is handled when printing and allows various
>  * configuration options, such as to allow suppression of the XML declaration,
>  * the encoding declaration or whether empty documents are collapsed.</p>
>  *
>  * <p> There are <code>write(...)</code> methods to print any of the
>  * standard DOM4J classes, including <code>Document</code> and
>  * <code>Element</code>, to either a <code>Writer</code> or an
>  * <code>OutputStream</code>. Warning: using your own
>  * <code>Writer</code> may cause the writer's preferred character
>  * encoding to be ignored. If you use encodings other than UTF8, we
>  * recommend using the method that takes an OutputStream instead.
>  * </p>
>  *
>  * @author <a href="mailto:[EMAIL PROTECTED]">James Strachan</a>
>  * @author Joseph Bowbeer
>  * @version $Revision: 1.46 $
>  */
> public class XMLWriter extends XMLFilterImpl implements LexicalHandler {
>
>     protected static final String[] LEXICAL_HANDLER_NAMES = {
>         "http://xml.org/sax/properties/lexical-handler",
>         "http://xml.org/sax/handlers/LexicalHandler"
>     };
>
>     private static final boolean ESCAPE_TEXT = true;
>     private static final boolean SUPPORT_PAD_TEXT = false;
>
>     protected static final OutputFormat DEFAULT_FORMAT = new OutputFormat();
>
>     /** Stores the last type of node written so algorithms can refer to the
>      * previous node type */
>     protected int lastOutputNodeType;
>
>     /** The Writer used to output to */
>     protected Writer writer;
>
>     /** The Stack of namespaceStack written so far */
>     private NamespaceStack namespaceStack = new NamespaceStack();
>
>     /** The format used by this writer */
>     private OutputFormat format;
>     /** The initial number of indentations (so you can print a whole
>         document indented, if you like) **/
>     private int indentLevel = 0;
>
>     /** buffer used when escaping strings */
>     private StringBuffer buffer = new StringBuffer();
>
>     /** Whether a flush should occur after writing a document */
>     private boolean autoFlush;
>
>     /** Lexical handler we should delegate to */
>     private LexicalHandler lexicalHandler;
>
>     /** Whether comments should appear inside DTD declarations - defaults to false */
>     private boolean showCommentsInDTDs;
>
>     /** Is the writer currently inside a DTD definition? */
>     private boolean inDTD;
>
>
>     public XMLWriter(Writer writer) {
>         this( writer, DEFAULT_FORMAT );
>     }
>
>     public XMLWriter(Writer writer, OutputFormat format) {
>         this.writer = writer;
>         this.format = format;
>     }
>
>     public XMLWriter() {
>         this.format = DEFAULT_FORMAT;
>         this.writer = new BufferedWriter( new OutputStreamWriter( System.out ) );
>         this.autoFlush = true;
>     }
>
>     public XMLWriter(OutputStream out) throws UnsupportedEncodingException {
>         this.format = DEFAULT_FORMAT;
>         this.writer = createWriter(out, format.getEncoding());
>         this.autoFlush = true;
>     }
>
>     public XMLWriter(OutputStream out, OutputFormat format) throws UnsupportedEncodingException {
>         this.format = format;
>         this.writer = createWriter(out, format.getEncoding());
>         this.autoFlush = true;
>     }
>
>     public XMLWriter(OutputFormat format) throws UnsupportedEncodingException {
>         this.format = format;
>         this.writer = createWriter( System.out, format.getEncoding() );
>         this.autoFlush = true;
>     }
>
>
>     public void setWriter(Writer writer) {
>         this.writer = writer;
>         this.autoFlush = false;
>     }
>
>     public void setOutputStream(OutputStream out) throws UnsupportedEncodingException {
>         this.writer = createWriter(out, format.getEncoding());
>         this.autoFlush = true;
>     }
>
>
>     /** Set the initial indentation level. This can be used to output
>      * a document (or, more likely, an element) starting at a given
>      * indent level, so it's not always flush against the left margin.
>      * Default: 0
>      *
>      * @param indentLevel the number of indents to start with
>      */
>     public void setIndentLevel(int indentLevel) {
>         this.indentLevel = indentLevel;
>     }
>
>     /** Flushes the underlying Writer */
>     public void flush() throws IOException {
>         writer.flush();
>     }
>
>     /** Closes the underlying Writer */
>     public void close() throws IOException {
>         writer.close();
>     }
>
>     /** Writes the new line text to the underlying Writer */
>     public void println() throws IOException {
>         writer.write( format.getLineSeparator() );
>     }
>
>     /** Writes the given {@link Attribute}.
>      *
>      * @param attribute <code>Attribute</code> to output.
>      */
>     public void write(Attribute attribute) throws IOException {
>         writeAttribute(attribute);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>
>     /** <p>This will print the <code>Document</code> to the current Writer.</p>
>      *
>      * <p> Warning: using your own Writer may cause the writer's
>      * preferred character encoding to be ignored. If you use
>      * encodings other than UTF8, we recommend using the method that
>      * takes an OutputStream instead. </p>
>      *
>      * <p>Note: as with all Writers, you may need to flush() yours
>      * after this method returns.</p>
>      *
>      * @param doc <code>Document</code> to format.
>      * @throws <code>IOException</code> - if there's any problem writing.
>      **/
>     public void write(Document doc) throws IOException {
>         writeDeclaration();
>
>         if (doc.getDocType() != null) {
>             indent();
>             writeDocType(doc.getDocType());
>         }
>
>         for ( int i = 0, size = doc.nodeCount(); i < size; i++ ) {
>             Node node = doc.node(i);
>             writeNode( node );
>         }
>         writePrintln();
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** <p>Writes the <code>{@link Element}</code>, including
>      * its <code>{@link Attribute}</code>s, and its value, and all
>      * its content (child nodes) to the current Writer.</p>
>      *
>      * @param element <code>Element</code> to output.
>      */
>     public void write(Element element) throws IOException {
>         writeElement(element);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>
>     /** Writes the given {@link CDATA}.
>      *
>      * @param cdata <code>CDATA</code> to output.
>      */
>     public void write(CDATA cdata) throws IOException {
>         writeCDATA( cdata.getText() );
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link Comment}.
>      *
>      * @param comment <code>Comment</code> to output.
>      */
>     public void write(Comment comment) throws IOException {
>         writeComment( comment.getText() );
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link DocumentType}.
>      *
>      * @param docType <code>DocumentType</code> to output.
>      */
>     public void write(DocumentType docType) throws IOException {
>         writeDocType(docType);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>
>     /** Writes the given {@link Entity}.
>      *
>      * @param entity <code>Entity</code> to output.
>      */
>     public void write(Entity entity) throws IOException {
>         writeEntity( entity );
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>
>     /** Writes the given {@link Namespace}.
>      *
>      * @param namespace <code>Namespace</code> to output.
>      */
>     public void write(Namespace namespace) throws IOException {
>         writeNamespace(namespace);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link ProcessingInstruction}.
>      *
>      * @param processingInstruction <code>ProcessingInstruction</code> to output.
>      */
>     public void write(ProcessingInstruction processingInstruction) throws IOException {
>         writeProcessingInstruction(processingInstruction);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** <p>Print out a {@link String}, Performs
>      * the necessary entity escaping and whitespace stripping.</p>
>      *
>      * @param text is the text to output
>      */
>     public void write(String text) throws IOException {
>         writeString(text);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link Text}.
>      *
>      * @param text <code>Text</code> to output.
>      */
>     public void write(Text text) throws IOException {
>         writeString(text.getText());
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given {@link Node}.
>      *
>      * @param node <code>Node</code> to output.
>      */
>     public void write(Node node) throws IOException {
>         writeNode(node);
>
>         if ( autoFlush ) {
>             flush();
>         }
>     }
>
>     /** Writes the given object which should be a String, a Node or a List
>      * of Nodes.
>      *
>      * @param object is the object to output.
>      */
>     public void write(Object object) throws IOException {
>         if (object instanceof Node) {
>             write((Node) object);
>         }
>         else if (object instanceof String) {
>             write((String) object);
>         }
>         else if (object instanceof List) {
>             List list = (List) object;
>             for ( int i = 0, size = list.size(); i < size; i++ ) {
>                 write( list.get(i) );
>             }
>         }
>         else if (object != null) {
>             throw new IOException( "Invalid object: " + object );
>         }
>     }
>
>
>     /** <p>Writes the opening tag of an {@link Element},
>      * including its {@link Attribute}s
>      * but without its content.</p>
>      *
>      * @param element <code>Element</code> to output.
>      */
>     public void writeOpen(Element element) throws IOException {
>         writer.write("<");
>         writer.write( element.getQualifiedName() );
>         writeAttributes(element);
>         writer.write(">");
>     }
>
>     /** <p>Writes the closing tag of an {@link Element}</p>
>      *
>      * @param element <code>Element</code> to output.
>      */
>     public void writeClose(Element element) throws IOException {
>         writeClose( element.getQualifiedName() );
>     }
>
>
>     // XMLFilterImpl methods
>     //-------------------------------------------------------------------------
>     public void parse(InputSource source) throws IOException, SAXException {
>         installLexicalHandler();
>         super.parse(source);
>     }
>
>
>     public void setProperty(String name, Object value) throws SAXNotRecognizedException, SAXNotSupportedException {
>         for (int i = 0; i < LEXICAL_HANDLER_NAMES.length; i++) {
>             if (LEXICAL_HANDLER_NAMES[i].equals(name)) {
>                 setLexicalHandler((LexicalHandler) value);
>                 return;
>             }
>         }
>         super.setProperty(name, value);
>     }
>
>     public Object getProperty(String name) throws SAXNotRecognizedException, SAXNotSupportedException {
>         for (int i = 0; i < LEXICAL_HANDLER_NAMES.length; i++) {
>             if (LEXICAL_HANDLER_NAMES[i].equals(name)) {
>                 return getLexicalHandler();
>             }
>         }
>         return super.getProperty(name);
>     }
>
>     public void setLexicalHandler (LexicalHandler handler) {
>         if (handler == null) {
>             throw new NullPointerException("Null lexical handler");
>         }
>         else {
>             this.lexicalHandler = handler;
>         }
>     }
>
>     public LexicalHandler getLexicalHandler(){
>         return lexicalHandler;
>     }
>
>
>     // ContentHandler interface
>     //-------------------------------------------------------------------------
>     public void setDocumentLocator(Locator locator) {
>         super.setDocumentLocator(locator);
>     }
>
>     public void startDocument() throws SAXException {
>         try {
>             writeDeclaration();
>             super.startDocument();
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>     public void endDocument() throws SAXException {
>         super.endDocument();
>     }
>
>     public void startPrefixMapping(String prefix, String uri) throws SAXException {
>         super.startPrefixMapping(prefix, uri);
>     }
>
>     public void endPrefixMapping(String prefix) throws SAXException {
>         super.endPrefixMapping(prefix);
>     }
>
>
>     public void startElement(String namespaceURI, String
>             localName, String qName, Attributes attributes) throws SAXException {
>         try {
>             writePrintln();
>             indent();
>             writer.write("<");
>             writer.write(qName);
>             writeAttributes( attributes );
>             writer.write(">");
>             ++indentLevel;
>             lastOutputNodeType = Node.ELEMENT_NODE;
>
>             super.startElement( namespaceURI, localName, qName, attributes );
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>     public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
>         try {
>             --indentLevel;
>             if ( lastOutputNodeType == Node.ELEMENT_NODE ) {
>                 writePrintln();
>                 indent();
>             }
>
>             // XXXX: need to determine this using a stack and checking for
>             // content / children
>             boolean hadContent = true;
>             if ( hadContent ) {
>                 writeClose(qName);
>             }
>             else {
>                 writeEmptyElementClose(qName);
>             }
>             lastOutputNodeType = Node.ELEMENT_NODE;
>
>             super.endElement( namespaceURI, localName, qName );
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>     public void characters(char[] ch, int start, int length) throws SAXException {
>         try {
>             write( new String( ch, start, length ) );
>
>             super.characters(ch, start, length);
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>     public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
>         super.ignorableWhitespace(ch, start, length);
>     }
>
>     public void processingInstruction(String target, String data) throws SAXException {
>         try {
>             indent();
>             writer.write("<?");
>             writer.write(target);
>             writer.write(" ");
>             writer.write(data);
>             writer.write("?>");
>             writePrintln();
>             lastOutputNodeType = Node.PROCESSING_INSTRUCTION_NODE;
>
>             super.processingInstruction(target, data);
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>     }
>
>
>
>     // DTDHandler interface
>     //-------------------------------------------------------------------------
>     public void notationDecl(String name, String publicID, String systemID) throws SAXException {
>         super.notationDecl(name,
>                 publicID, systemID);
>     }
>
>     public void unparsedEntityDecl(String name, String publicID, String systemID, String notationName) throws SAXException {
>         super.unparsedEntityDecl(name, publicID, systemID, notationName);
>     }
>
>
>     // LexicalHandler interface
>     //-------------------------------------------------------------------------
>     public void startDTD(String name, String publicID, String systemID) throws SAXException {
>         inDTD = true;
>         try {
>             writeDocType(name, publicID, systemID);
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.startDTD(name, publicID, systemID);
>         }
>     }
>
>     public void endDTD() throws SAXException {
>         inDTD = false;
>         if (lexicalHandler != null) {
>             lexicalHandler.endDTD();
>         }
>     }
>
>     public void startCDATA() throws SAXException {
>         try {
>             writer.write( "<![CDATA[" );
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.startCDATA();
>         }
>     }
>
>     public void endCDATA() throws SAXException {
>         try {
>             writer.write( "]]>" );
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.endCDATA();
>         }
>     }
>
>     public void startEntity(String name) throws SAXException {
>         try {
>             writeEntityRef(name);
>         }
>         catch (IOException e) {
>             handleException(e);
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.startEntity(name);
>         }
>     }
>
>     public void endEntity(String name) throws SAXException {
>         if (lexicalHandler != null) {
>             lexicalHandler.endEntity(name);
>         }
>     }
>
>     public void comment(char[] ch, int start, int length) throws SAXException {
>         if ( showCommentsInDTDs || !
>                 inDTD ) {
>             try {
>                 writeComment( new String(ch, start, length) );
>             }
>             catch (IOException e) {
>                 handleException(e);
>             }
>         }
>
>         if (lexicalHandler != null) {
>             lexicalHandler.comment(ch, start, length);
>         }
>     }
>
>
>
>     // Implementation methods
>     //-------------------------------------------------------------------------
>     protected void writeElement(Element element) throws IOException {
>         int size = element.nodeCount();
>         String qualifiedName = element.getQualifiedName();
>
>         writePrintln();
>         indent();
>
>         writer.write("<");
>         writer.write(qualifiedName);
>
>         int previouslyDeclaredNamespaces = namespaceStack.size();
>         Namespace ns = element.getNamespace();
>         if (isNamespaceDeclaration( ns ) ) {
>             namespaceStack.push(ns);
>             writeNamespace(ns);
>         }
>
>         // Print out additional namespace declarations
>         boolean textOnly = true;
>         for ( int i = 0; i < size; i++ ) {
>             Node node = element.node(i);
>             if ( node instanceof Namespace ) {
>                 Namespace additional = (Namespace) node;
>                 if (isNamespaceDeclaration( additional ) ) {
>                     namespaceStack.push(additional);
>                     writeNamespace(additional);
>                 }
>             }
>             else if ( node instanceof Element) {
>                 textOnly = false;
>             }
>         }
>
>         writeAttributes(element);
>
>         lastOutputNodeType = Node.ELEMENT_NODE;
>
>         if ( size <= 0 ) {
>             writeEmptyElementClose(qualifiedName);
>         }
>         else {
>             writer.write(">");
>             if ( textOnly ) {
>                 // we have at least one text node so lets assume
>                 // that its non-empty
>                 writeElementContent(element);
>             }
>             else {
>                 // we know it's not null or empty from above
>                 ++indentLevel;
>
>                 writeElementContent(element);
>
>                 --indentLevel;
>
>                 writePrintln();
>                 indent();
>             }
>             writer.write("</");
>             writer.write(qualifiedName);
>             writer.write(">");
>         }
>
>         // remove declared namespaceStack from stack
>         while (namespaceStack.size() > previouslyDeclaredNamespaces) {
>             namespaceStack.pop();
>         }
>
>         lastOutputNodeType = Node.ELEMENT_NODE;
>     }
>
>     /** Outputs the content of the given element.
>      * If whitespace trimming is
>      * enabled then all adjacent text nodes are appended together before
>      * the whitespace trimming occurs to avoid problems with multiple
>      * text nodes being created due to text content that spans parser buffers
>      * in a SAX parser.
>      */
>     protected void writeElementContent(Element element) throws IOException {
>         if (format.isTrimText()) {
>             // concatenate adjacent text nodes together
>             // so that whitespace trimming works properly
>             Text lastTextNode = null;
>             StringBuffer buffer = null;
>             for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
>                 Node node = element.node(i);
>                 if ( node instanceof Text ) {
>                     if ( lastTextNode == null ) {
>                         lastTextNode = (Text) node;
>                     }
>                     else {
>                         buffer = new StringBuffer( lastTextNode.getText() );
>                         buffer.append( ((Text) node).getText() );
>                     }
>                 }
>                 else {
>                     if ( lastTextNode != null ) {
>                         if ( buffer != null ) {
>                             writeString( buffer.toString() );
>                             buffer = null;
>                         }
>                         else {
>                             writeString( lastTextNode.getText() );
>                         }
>                         lastTextNode = null;
>                     }
>                     writeNode(node);
>                 }
>             }
>             if ( lastTextNode != null ) {
>                 if ( buffer != null ) {
>                     writeString( buffer.toString() );
>                     buffer = null;
>                 }
>                 else {
>                     writeString( lastTextNode.getText() );
>                 }
>                 lastTextNode = null;
>             }
>         }
>         else {
>             for ( int i = 0, size = element.nodeCount(); i < size; i++ ) {
>                 Node node = element.node(i);
>                 writeNode(node);
>             }
>         }
>     }
>     protected void writeCDATA(String text) throws IOException {
>         writer.write( "<![CDATA[" );
>         writer.write( text );
>         writer.write( "]]>" );
>
>         lastOutputNodeType = Node.CDATA_SECTION_NODE;
>     }
>
>     protected void writeDocType(DocumentType docType) throws IOException {
>         if (docType != null) {
>             docType.write( writer );
>             //writeDocType( docType.getElementName(), docType.getPublicID(), docType.getSystemID() );
>             writePrintln();
>         }
>     }
>
>
>     protected void writeNamespace(Namespace namespace) throws IOException {
>         if ( namespace != null ) {
>             String prefix = namespace.getPrefix();
>             writer.write(" xmlns");
>             if (prefix != null && prefix.length() > 0) {
>                 writer.write(":");
>                 writer.write(prefix);
>             }
>             writer.write("=\"");
>             writer.write(namespace.getURI());
>             writer.write("\"");
>         }
>     }
>
>     protected void writeProcessingInstruction(ProcessingInstruction processingInstruction) throws IOException {
>         //indent();
>         writer.write( "<?" );
>         writer.write( processingInstruction.getName() );
>         writer.write( " " );
>         writer.write( processingInstruction.getText() );
>         writer.write( "?>" );
>         writePrintln();
>
>         lastOutputNodeType = Node.PROCESSING_INSTRUCTION_NODE;
>     }
>
>     protected void writeString(String text) throws IOException {
>         if ( text != null && text.length() > 0 ) {
>             if ( ESCAPE_TEXT ) {
>                 text = escapeElementEntities(text);
>             }
>
>             if ( SUPPORT_PAD_TEXT ) {
>                 if (lastOutputNodeType == Node.ELEMENT_NODE) {
>                     String padText = getPadText();
>                     if ( padText != null ) {
>                         writer.write(padText);
>                     }
>                 }
>             }
>
>             if (format.isTrimText()) {
>                 boolean first = true;
>                 StringTokenizer tokenizer = new StringTokenizer(text);
>                 while (tokenizer.hasMoreTokens()) {
>                     String token = tokenizer.nextToken();
>                     if ( first ) {
>                         first = false;
>                         if ( lastOutputNodeType == Node.TEXT_NODE ) {
>                             writer.write(" ");
>                         }
>                     }
>                     else {
>                         writer.write(" ");
>                     }
>                     writer.write(token);
>                     lastOutputNodeType = Node.TEXT_NODE;
>                 }
>             }
>             else {
>                 lastOutputNodeType = Node.TEXT_NODE;
>                 writer.write(text);
>             }
>         }
>     }
>
>
>     protected void writeNode(Node node) throws IOException {
>         int nodeType = node.getNodeType();
>         switch (nodeType) {
>             case Node.ELEMENT_NODE:
>                 writeElement((Element) node);
>                 break;
>             case Node.ATTRIBUTE_NODE:
>                 writeAttribute((Attribute) node);
>                 break;
>             case Node.TEXT_NODE:
>                 writeString(node.getText());
>                 //write((Text) node);
>                 break;
>             case Node.CDATA_SECTION_NODE:
>                 writeCDATA(node.getText());
>                 break;
>             case Node.ENTITY_REFERENCE_NODE:
>                 writeEntity((Entity) node);
>                 break;
>             case Node.PROCESSING_INSTRUCTION_NODE:
>                 writeProcessingInstruction((ProcessingInstruction) node);
>                 break;
>             case Node.COMMENT_NODE:
>                 writeComment(node.getText());
>                 break;
>             case Node.DOCUMENT_NODE:
>                 write((Document) node);
>                 break;
>             case Node.DOCUMENT_TYPE_NODE:
>                 writeDocType((DocumentType) node);
>                 break;
>             case Node.NAMESPACE_NODE:
>                 // Will be output with attributes
>                 //write((Namespace) node);
>                 break;
>             default:
>                 throw new IOException( "Invalid node type: " + node );
>         }
>     }
>
>
>
>
>     protected void installLexicalHandler() {
>         XMLReader parent = getParent();
>         if (parent == null) {
>             throw new NullPointerException("No parent for filter");
>         }
>         // try to register for lexical events
>         for (int i = 0; i < LEXICAL_HANDLER_NAMES.length; i++) {
>             try {
>                 parent.setProperty(LEXICAL_HANDLER_NAMES[i], this);
>                 break;
>             }
>             catch (SAXNotRecognizedException ex) {
>                 // ignore
>             }
>             catch (SAXNotSupportedException ex) {
>                 // ignore
>             }
>         }
>     }
>
>     protected void writeDocType(String name, String publicID, String systemID) throws IOException {
>         boolean hasPublic = false;
>
>         writer.write("<!DOCTYPE ");
>         writer.write(name);
>         if ((publicID != null) && (!publicID.equals(""))) {
>             writer.write(" PUBLIC \"");
>             writer.write(publicID);
>             writer.write("\"");
>             hasPublic = true;
>         }
>         if ((systemID != null) && (!systemID.equals(""))) {
>             if (!hasPublic) {
>                 writer.write(" SYSTEM");
>             }
>             writer.write(" \"");
>             writer.write(systemID);
>             writer.write("\"");
>         }
>         writer.write(">");
>         writePrintln();
>     }
>
>     protected void writeEntity(Entity entity) throws IOException {
>         writeEntityRef( entity.getName() );
>     }
>
>     protected void writeEntityRef(String name) throws IOException {
>         writer.write( "&" );
>         writer.write( name );
>         writer.write( ";" );
>
>         lastOutputNodeType = Node.ENTITY_REFERENCE_NODE;
>     }
>
>     protected void writeComment(String text) throws IOException {
>         if (format.isNewlines()) {
>             if ( lastOutputNodeType != Node.COMMENT_NODE ) {
>                 println();
>             }
>             indent();
>         }
>         writer.write( "<!--" );
>         writer.write( text );
>         writer.write( "-->" );
>
>         writePrintln();
>
>         lastOutputNodeType = Node.COMMENT_NODE;
>     }
>
>     /** Writes the attributes of the given element
>      *
>      */
>     protected void writeAttributes( Element element ) throws IOException {
>
>         // I do not yet handle the case where the same prefix maps to
>         // two different URIs. For attributes on the same element
>         // this is illegal; but as yet we don't throw an exception
>         // if someone tries to do this
>         for ( int i = 0, size = element.attributeCount(); i < size; i++ ) {
>             Attribute attribute = element.attribute(i);
>             Namespace ns = attribute.getNamespace();
>             if (ns != null && ns != Namespace.NO_NAMESPACE && ns != Namespace.XML_NAMESPACE) {
>                 String prefix = ns.getPrefix();
>                 String uri = namespaceStack.getURI(prefix);
>                 if (!ns.getURI().equals(uri)) { // output a new namespace declaration
>                     writeNamespace(ns);
>                     namespaceStack.push(ns);
>                 }
>             }
>
>             writer.write(" ");
>             writer.write(attribute.getQualifiedName());
>             writer.write("=\"");
>             writeEscapeAttributeEntities(attribute.getValue());
>             writer.write("\"");
>         }
>     }
>
>     protected void writeAttribute(Attribute attribute) throws IOException {
>         writer.write(" ");
>         writer.write(attribute.getQualifiedName());
>         writer.write("=");
>
>         writer.write("\"");
>
>         writeEscapeAttributeEntities(attribute.getValue());
>
>         writer.write("\"");
>         lastOutputNodeType = Node.ATTRIBUTE_NODE;
>     }
>
>     protected void writeAttributes(Attributes attributes) throws IOException {
>         for (int i = 0, size = attributes.getLength(); i < size; i++) {
>             writeAttribute( attributes, i );
>         }
>     }
>
>     protected void writeAttribute(Attributes attributes, int index) throws IOException {
>         writer.write(" ");
>         writer.write(attributes.getQName(index));
>         writer.write("=\"");
>         writeEscapeAttributeEntities(attributes.getValue(index));
>         writer.write("\"");
>     }
>
>
>
>     protected void indent() throws IOException {
>         String indent = format.getIndent();
>         if ( indent != null && indent.length() >
>                 0 ) {
>             for ( int i = 0; i < indentLevel; i++ ) {
>                 writer.write(indent);
>             }
>         }
>     }
>
>     /**
>      * <p>
>      * This will print a new line only if the newlines flag was set to true
>      * </p>
>      *
>      * @param out <code>Writer</code> to write to
>      */
>     protected void writePrintln() throws IOException {
>         if (format.isNewlines()) {
>             writer.write( format.getLineSeparator() );
>         }
>     }
>
>     /**
>      * Get an OutputStreamWriter, use preferred encoding.
>      */
>     protected Writer createWriter(OutputStream outStream, String encoding) throws UnsupportedEncodingException {
>         return new BufferedWriter(
>             new OutputStreamWriter( outStream, encoding )
>         );
>     }
>
>     /**
>      * <p>
>      * This will write the declaration to the given Writer.
>      * Assumes XML version 1.0 since we don't directly know.
>      * </p>
>      */
>     protected void writeDeclaration() throws IOException {
>         String encoding = format.getEncoding();
>
>         // Only print if declaration is not suppressed
>         if (! format.isSuppressDeclaration()) {
>             // Assume 1.0 version
>             if (encoding.equals("UTF8")) {
>                 writer.write("<?xml version=\"1.0\"");
>                 if (!format.isOmitEncoding()) {
>                     writer.write(" encoding=\"UTF-8\"");
>                 }
>                 writer.write("?>");
>             } else {
>                 writer.write("<?xml version=\"1.0\"");
>                 if (! format.isOmitEncoding()) {
>                     writer.write(" encoding=\"" + encoding + "\"");
>                 }
>                 writer.write("?>");
>             }
>             println();
>         }
>     }
>
>     protected void writeClose(String qualifiedName) throws IOException {
>         writer.write("</");
>         writer.write(qualifiedName);
>         writer.write(">");
>     }
>
>     protected void writeEmptyElementClose(String qualifiedName) throws IOException {
>         // Simply close up
>         if (!
>                 isExpandEmptyElements()) {
>             writer.write("/>");
>         } else {
>             writer.write("></");
>             writer.write(qualifiedName);
>             writer.write(">");
>         }
>     }
>
>     protected boolean isExpandEmptyElements() {
>         return format.isExpandEmptyElements();
>     }
>
>
>     /** This will take the pre-defined entities in XML 1.0 and
>      * convert their character representation to the appropriate
>      * entity reference, suitable for XML attributes.
>      */
>     protected String escapeElementEntities(String text) {
>         char[] block = null;
>         int i, last = 0, size = text.length();
>         for ( i = 0; i < size; i++ ) {
>             String entity = null;
>             char c; // declaration and assignment added by Dan Jacobs
>             switch( c = text.charAt(i) ) {
>                 case '<' :
>                     entity = "&lt;";
>                     break;
>                 case '>' :
>                     entity = "&gt;";
>                     break;
>                 case '&' :
>                     entity = "&amp;";
>                     break;
>
>                 //!!! Begin code added by Dan Jacobs !!!//
>                 case '\t': case '\n': case '\r':
>                     // don't encode standard whitespace characters
>                     break;
>                 default:
>                     // encode low and high characters as entities
>                     if ((c < 32) || (c >= 127))
>                         entity = "&#" + (int)c + ";";
>                     break;
>                 //!!! End code added by Dan Jacobs !!!//
>             }
>             if (entity != null) {
>                 if ( block == null ) {
>                     block = text.toCharArray();
>                 }
>                 buffer.append(block, last, i - last);
>                 buffer.append(entity);
>                 last = i + 1;
>             }
>         }
>         if ( last == 0 ) {
>             return text;
>         }
>         if ( last < size ) {
>             if ( block == null ) {
>                 block = text.toCharArray();
>             }
>             buffer.append(block, last, i - last);
>         }
>         String answer = buffer.toString();
>         buffer.setLength(0);
>         return answer;
>     }
>
>     protected void writeEscapeAttributeEntities(String text) throws IOException {
>         if ( text != null ) {
>             String escapedText = escapeAttributeEntities( text );
>             writer.write( escapedText );
>         }
>     }
>     /** This will take the pre-defined entities in XML 1.0 and
>      * convert their character representation to the appropriate
>      * entity reference, suitable for XML attributes.
>      */
>     protected String escapeAttributeEntities(String text) {
>         char[] block = null;
>         int i, last = 0, size = text.length();
>         for ( i = 0; i < size; i++ ) {
>             String entity = null;
>             char c; // declaration and assignment added by Dan Jacobs
>             switch( c = text.charAt(i) ) {
>                 case '<' :
>                     entity = "&lt;";
>                     break;
>                 case '>' :
>                     entity = "&gt;";
>                     break;
>                 case '\'' :
>                     entity = "&apos;";
>                     break;
>                 case '\"' :
>                     entity = "&quot;";
>                     break;
>                 case '&' :
>                     entity = "&amp;";
>                     break;
>
>                 //!!! Begin code added by Dan Jacobs !!!//
>                 case '\t': case '\n': case '\r':
>                     // don't encode standard whitespace characters
>                     break;
>                 default:
>                     // encode low and high characters as entities
>                     if ((c < 32) || (c >= 127))
>                         entity = "&#" + (int)c + ";";
>                     break;
>                 //!!! End code added by Dan Jacobs !!!//
>             }
>             if (entity != null) {
>                 if ( block == null ) {
>                     block = text.toCharArray();
>                 }
>                 buffer.append(block, last, i - last);
>                 buffer.append(entity);
>                 last = i + 1;
>             }
>         }
>         if ( last == 0 ) {
>             return text;
>         }
>         if ( last < size ) {
>             if ( block == null ) {
>                 block = text.toCharArray();
>             }
>             buffer.append(block, last, i - last);
>         }
>         String answer = buffer.toString();
>         buffer.setLength(0);
>         return answer;
>     }
>
>     protected boolean isNamespaceDeclaration( Namespace ns ) {
>         if (ns != null && ns != Namespace.NO_NAMESPACE && ns != Namespace.XML_NAMESPACE) {
>             String uri = ns.getURI();
>             if ( uri != null && uri.length() > 0 ) {
>                 if ( ! namespaceStack.contains( ns ) ) {
>                     return true;
>
>                 }
>             }
>         }
>         return false;
>     }
>
>     protected void handleException(IOException e) throws SAXException {
>         throw new SAXException(e);
>     }
>
>     protected String getPadText() {
>         return null;
>     }
> }
>
>
>
>
> /*
>  * Redistribution and use of this software and associated documentation
>  * ("Software"), with or without modification, are permitted provided
>  * that the following conditions are met:
>  *
>  * 1. Redistributions of source code must retain copyright
>  *    statements and notices.  Redistributions must also contain a
>  *    copy of this document.
>  *
>  * 2. Redistributions in binary form must reproduce the
>  *    above copyright notice, this list of conditions and the
>  *    following disclaimer in the documentation and/or other
>  *    materials provided with the distribution.
>  *
>  * 3. The name "DOM4J" must not be used to endorse or promote
>  *    products derived from this Software without prior written
>  *    permission of MetaStuff, Ltd.  For written permission,
>  *    please contact [EMAIL PROTECTED]
>  *
>  * 4. Products derived from this Software may not be called "DOM4J"
>  *    nor may "DOM4J" appear in their names without prior written
>  *    permission of MetaStuff, Ltd. DOM4J is a registered
>  *    trademark of MetaStuff, Ltd.
>  *
>  * 5. Due credit should be given to the DOM4J Project
>  *    (http://dom4j.org/).
>  *
>  * THIS SOFTWARE IS PROVIDED BY METASTUFF, LTD. AND CONTRIBUTORS
>  * ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT
>  * NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
>  * FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL
>  * METASTUFF, LTD. OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
>  * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
>  * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
>  * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
>  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
>  * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
>  * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
>  * OF THE POSSIBILITY OF SUCH DAMAGE.
>  *
>  * Copyright 2001 (C) MetaStuff, Ltd. All Rights Reserved.
>  *
>  * $Id: XMLWriter.java,v 1.46 2002/02/14 11:55:46 jstrachan Exp $
>  */
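
For anyone who wants to try the escaping idea from the quoted fix outside of dom4j, here is a minimal standalone sketch. The class name NumericEntityEscaper and the method escapeText are made up for illustration and are not part of dom4j. It follows the same rule as the quoted patch (leave tab, newline and carriage return alone, encode anything below 32 or at 127 and above as a numeric entity), with one assumed difference: it walks code points rather than chars, so a character outside the Basic Multilingual Plane becomes one reference instead of two surrogate halves. Note that XML 1.0 does not allow control characters below 32 (other than tab, LF and CR) even as character references, so that part mirrors the patch rather than the spec.

```java
// Hypothetical helper, not part of dom4j: a standalone sketch of the
// numeric-entity escaping rule used in the quoted XMLWriter patch.
public class NumericEntityEscaper {

    /** Escapes markup characters and writes non-ASCII characters as numeric entities. */
    public static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            // Assumption: iterate by code point so surrogate pairs stay together.
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '\t': case '\n': case '\r':
                    // standard whitespace is passed through unchanged
                    out.append((char) cp);
                    break;
                default:
                    if (cp < 32 || cp >= 127) {
                        // same threshold as the quoted patch
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.append((char) cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a &#8211;&#8211; b &amp; c&#160;&lt;d&gt;
        System.out.println(escapeText("a \u2013\u2013 b & c\u00A0<d>"));
    }
}
```

With this rule an en dash (U+2013) already present in the text as a character comes out as &#8211;, so the output stays plain ASCII whatever encoding the writer uses.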