Hello,

I would like to edit OpenOffice.org documents (XML format) with dom4j.
To do this, I use the OpenOffice.org Flat XML format (the whole document
in one XML file instead of four separate documents).

The document begins with this tag:

<office:document xmlns:office="http://openoffice.org/2000/office"
xmlns:style="http://openoffice.org/2000/style"
xmlns:text="http://openoffice.org/2000/text"
xmlns:table="http://openoffice.org/2000/table"
xmlns:draw="http://openoffice.org/2000/drawing"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:meta="http://openoffice.org/2000/meta"
xmlns:number="http://openoffice.org/2000/datastyle"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns:chart="http://openoffice.org/2000/chart"
xmlns:dr3d="http://openoffice.org/2000/dr3d"
xmlns:math="http://www.w3.org/1998/Math/MathML"
xmlns:form="http://openoffice.org/2000/form"
xmlns:script="http://openoffice.org/2000/script"
xmlns:config="http://openoffice.org/2001/config" office:class="text"
office:version="1.0">

The document also contains an OLE object. This means that the document
contains the XML code of another document. So you can find another
<office:document> tag inside the document. It looks like this:

<office:document xmlns:office="http://openoffice.org/2000/office"
xmlns:style="http://openoffice.org/2000/style"
xmlns:text="http://openoffice.org/2000/text"
xmlns:table="http://openoffice.org/2000/table"
xmlns:draw="http://openoffice.org/2000/drawing"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:meta="http://openoffice.org/2000/meta"
xmlns:number="http://openoffice.org/2000/datastyle"
xmlns:presentation="http://openoffice.org/2000/presentation"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns:chart="http://openoffice.org/2000/chart"
xmlns:dr3d="http://openoffice.org/2000/dr3d"
xmlns:math="http://www.w3.org/1998/Math/MathML"
xmlns:form="http://openoffice.org/2000/form"
xmlns:script="http://openoffice.org/2000/script"
xmlns:config="http://openoffice.org/2001/config" office:class="drawing"
office:version="1.0">

It's nearly the same. It additionally contains the xmlns:presentation
attribute and the value of office:class is "drawing" instead of "text".

When I parse the document with dom4j and Xerces and write it back to a
file without editing the Document, the OLE object isn't visible anymore.
This happens because every namespace declaration on the second
<office:document> tag that duplicates one on the root element has
disappeared. The tag then looks like this:

<office:document
xmlns:presentation="http://openoffice.org/2000/presentation"
office:class="drawing" office:version="1.0">
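
One way to confirm this on the written file is to list the xmlns
declarations that each <office:document> element carries itself. The
following is only a minimal sketch that uses the JDK's built-in DOM parser
instead of dom4j; the class NamespaceDeclarationCheck and its method
declaredPrefixes are hypothetical names chosen for illustration, and the
file content is assumed to be passed in as a string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of dom4j or OpenOffice.org.
final class NamespaceDeclarationCheck {

    /**
     * Returns, for every element with the given qualified name (in document
     * order, root included), the sorted prefixes declared directly on it.
     */
    static List<TreeSet<String>> declaredPrefixes(String xml, String qName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<TreeSet<String>> result = new ArrayList<>();
            NodeList elements = doc.getElementsByTagName(qName);
            for (int i = 0; i < elements.getLength(); i++) {
                Element element = (Element) elements.item(i);
                TreeSet<String> prefixes = new TreeSet<>();
                NamedNodeMap attributes = element.getAttributes();
                for (int j = 0; j < attributes.getLength(); j++) {
                    Attr attr = (Attr) attributes.item(j);
                    // In the DOM, xmlns:* declarations show up as attributes.
                    if (attr.getName().startsWith("xmlns:")) {
                        prefixes.add(attr.getName().substring("xmlns:".length()));
                    }
                }
                result.add(prefixes);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Calling declaredPrefixes(content, "office:document") on the original file
and on the rewritten file and comparing the second entry of each result
shows which declarations were dropped from the nested element.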

When I add the lost namespaces manually, the OLE object is visible again.
So it seems that OpenOffice.org needs those duplicate namespace
declarations on the second <office:document> tag.
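
The manual step can also be scripted. The sketch below is an assumption
about how one might do it with the JDK's DOM API, not with dom4j: it copies
every xmlns:* declaration of the root element onto each nested element with
the same tag name, unless that element already declares the prefix. The
class NamespaceRedeclarer and its methods are hypothetical, and whether the
copied declarations survive when the tree is written out depends on the
serializer used, which this sketch does not cover.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of dom4j or OpenOffice.org.
final class NamespaceRedeclarer {

    /** Parses the XML text into a namespace-aware DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /**
     * Copies the root's xmlns:* declarations onto every descendant that has
     * the root's tag name. Returns the number of declarations added.
     */
    static int redeclareOnNested(Document doc) {
        Element root = doc.getDocumentElement();
        NamedNodeMap rootAttrs = root.getAttributes();
        NodeList nested = root.getElementsByTagName(root.getTagName());
        int added = 0;
        for (int i = 0; i < nested.getLength(); i++) {
            Element inner = (Element) nested.item(i);
            for (int j = 0; j < rootAttrs.getLength(); j++) {
                Attr attr = (Attr) rootAttrs.item(j);
                String name = attr.getName();
                // Skip prefixes the nested element already declares itself.
                if (name.startsWith("xmlns:") && !inner.hasAttribute(name)) {
                    inner.setAttributeNS(
                            XMLConstants.XMLNS_ATTRIBUTE_NS_URI, name, attr.getValue());
                    added++;
                }
            }
        }
        return added;
    }
}
```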

How can I force dom4j not to remove those namespaces?

I use this simple code:

import java.io.BufferedWriter;
import java.io.FileWriter;
import org.dom4j.Document;
import org.dom4j.io.SAXReader;

// fsxw is the file extension for OpenOffice.org flat XML files
SAXReader reader = new SAXReader();
Document docu = reader.read("/tmp/MyFile.fsxw");
FileWriter fileWriter = new FileWriter("/tmp/NewFile.fsxw");
BufferedWriter bufferedWriter = new BufferedWriter(fileWriter);
docu.write(bufferedWriter);
bufferedWriter.close();

The result is the same when using an XMLWriter object to write the
Document.



_______________________________________________
dom4j-user mailing list
https://lists.sourceforge.net/lists/listinfo/dom4j-user
