Well, I'm not sure there's much left to help with. If the document doesn't render correctly once its whitespace is reformatted, then we'll just have to change the source document. I'm unable to reproduce the rowspan/colspan issue after formatting the document. So really the only thing left is the broken formatting after a DOM manipulation, in this case the merging of the document heads.

As requested, here's my code. It's somewhat spread out because I have more of a framework than a single program. I've pieced it together into an attached .txt file.

Thanks,
--Evan





Mike Skells wrote:
Hi,
I was looking for source XML + parse code + DOM model printed, so that we can all give more precise help.

Mike

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Evan Kirkconnell
Sent: 27 June 2006 00:27
Cc: dom4j-user@lists.sourceforge.net
Subject: Re: [dom4j-user] Extra Attributes

Oops, forgot the attachments.

I wasn't able to reproduce the rendering difference by just removing the rowspan and colspan attributes. I guess the pretty printing affected this.
--Evan

Evan Kirkconnell wrote:
I think I'm just going to live with it all until we have a new style template, which is supposed to happen in a month or so. Hopefully this will resolve issues like this. The new one is not supposed to have tables.
As for what I'm doing... I'm taking a style template page and merging it with my generated content. The style template display is breaking because of rowspan/colspan attributes as well as whitespace differences.
You can see this in the attached files. template.html is the template, and new.html is the output after merging. I removed the merged (generated) body content so that you just see the template differences. You can also see the issue I was talking about in my response to your pretty print suggestion, where the serialization isn't correct where the two heads are merged. (I left the merged head content in there.)

It's easier to see the code changes if you reformat the html, but this also causes rendering differences.

--Evan


Mike Skells wrote:
Hi,
Can you provide an example (code + data) of what you are doing, what you expect, and what you are getting?

-Mike

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Evan Kirkconnell
Sent: 24 June 2006 03:04
To: Mike Skells
Cc: dom4j-user@lists.sourceforge.net
Subject: Re: [dom4j-user] Extra Attributes

Thanks for the response Mike,
I still have a dilemma though, because if I set Xerces' http://apache.org/xml/features/nonvalidating/load-external-dtd to false, it stops adding the rowspan="1"'s, but it also removes all the &nbsp;'s from the document. Do you have any suggestions?
I'm going to manually remove the added attributes, but I don't really like this solution.
--Evan
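
One possible way around this dilemma is to leave DTD loading on but hand the parser a cut-down DTD that declares the entities and no attribute defaults. The sketch below shows the idea with the JDK's own DOM parser rather than dom4j. The class name `EntitiesOnlyDtd` and the one-line stand-in DTD are made up for illustration; a real setup would need the full XHTML entity sets.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class EntitiesOnlyDtd {
    // Hypothetical stand-in for the XHTML DTD: entity declarations only, no ATTLIST defaults.
    static final String ENTITIES_ONLY = "<!ENTITY nbsp \"&#160;\">";

    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Answer every external DTD/entity request with the cut-down declarations,
            // so nothing is fetched from the network.
            EntityResolver resolver =
                (publicId, systemId) -> new InputSource(new StringReader(ENTITIES_ONLY));
            builder.setEntityResolver(resolver);
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
            + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">"
            + "<html><body><table><tr><td>a&nbsp;b</td></tr></table></body></html>";
        Element td = (Element) parse(xml).getElementsByTagName("td").item(0);
        System.out.println("attributes on td: " + td.getAttributes().getLength());
        System.out.println("text length: " + td.getTextContent().length());
    }
}
```

dom4j's SAXReader has a setEntityResolver method, so the same resolver could presumably be plugged in there; that wiring is not tested here.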

Mike Skells wrote:
Hi Evan,
This is not DOM4J doing this, I don't think, as it does not understand HTML! Perhaps it is a validating SAX parser, or something else in the parse chain.

Mike

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Evan Kirkconnell
Sent: 22 June 2006 23:01
To: dom4j-user@lists.sourceforge.net
Subject: [dom4j-user] Extra Attributes

DOM4J seems to be adding attributes to my document. I don't know if this is when it's loaded, or when it's serialized, and I'd guess it has something to do with the XHTML DTD.
It's adding rowspan="1" and colspan="1" to my td's and like clear="none" or something like that to br's. I'm using a pre-existing style document, and this is changing the layout.
I know it sounds kind of silly, but it is changing how the browsers render the content. Is there a way to turn this off?
--Evan
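
The extra rowspan="1" and colspan="1" values are consistent with attribute defaults declared in a DTD, which a parser reports as if they had been written in the document. The sketch below is one way to drop them at the SAX level, using only JDK classes rather than dom4j. It assumes the underlying parser passes `Attributes2` to its handlers (the JDK's built-in parser does). The class name `SpecifiedOnlyFilter` and the helper `attributes` are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.Attributes2;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: forwards only attributes that were literally present in the source.
class SpecifiedOnlyFilter extends XMLFilterImpl {
    SpecifiedOnlyFilter(XMLReader parent) { super(parent); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (atts instanceof Attributes2) {
            Attributes2 a2 = (Attributes2) atts;
            AttributesImpl kept = new AttributesImpl();
            for (int i = 0; i < atts.getLength(); i++) {
                // isSpecified is false for values the parser filled in from a DTD default.
                if (a2.isSpecified(i)) {
                    kept.addAttribute(atts.getURI(i), atts.getLocalName(i), atts.getQName(i),
                                      atts.getType(i), atts.getValue(i));
                }
            }
            atts = kept;
        }
        super.startElement(uri, localName, qName, atts);
    }

    // Parses xml and lists every attribute seen as "name=value", with or without the filter.
    static List<String> attributes(String xml, boolean filter) {
        List<String> seen = new ArrayList<>();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            if (filter) reader = new SpecifiedOnlyFilter(reader);
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String u, String l, String q, Attributes a) {
                    for (int i = 0; i < a.getLength(); i++) {
                        seen.add(a.getQName(i) + "=" + a.getValue(i));
                    }
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE table [<!ATTLIST td rowspan CDATA \"1\" colspan CDATA \"1\">]>"
                   + "<table><tr><td colspan=\"2\">x</td></tr></table>";
        System.out.println("unfiltered: " + attributes(xml, false));
        System.out.println("filtered:   " + attributes(xml, true));
    }
}
```

dom4j's SAXReader accepts an XMLFilter through setXMLFilter, so a filter of this kind could in principle sit in front of the reader; that wiring is not tested here.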


_______________________________________________
dom4j-user mailing list
dom4j-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dom4j-user


Load the document with: 

        public static Document getDocument(URI uri) throws Exception{
                try{
                        SAXReader reader = getSAXReader();
                        Document doc = null;

                        long start = System.currentTimeMillis();

                        if( "file".equals(uri.getScheme()) ){
                                doc = reader.read( new File(uri) );
                                doc.addComment("DOM document originally created from file in "+(System.currentTimeMillis() - start)+" milliseconds");
                        }else{
                                doc = reader.read( uri.toURL() );
                                doc.addComment("DOM document originally created from URL in "+(System.currentTimeMillis() - start)+" milliseconds");
                        }

                        fixNamespaces(doc);

                        return doc;
                }catch(Exception e){
                        throw new Exception("Error getting document from: "+uri.toString(), e);
                }
        }
        
        public static SAXReader getSAXReader() throws org.xml.sax.SAXException{
                if(saxReader == null){
                        saxReader = new SAXReader();
                        //saxReader.setFeature("http://xml.org/sax/features/namespaces", false);
                        //saxReader.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
                        //saxReader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
                        //saxReader.setFeature("http://apache.org/xml/features/dom/include-ignorable-whitespace", false);
                        saxReader.setValidation(false);
                        //saxReader.setEntityResolver(null);
                        saxReader.setDocumentFactory(df);
                        //saxReader.setStripWhitespaceText(true);
                }
                return saxReader;
        }
        
        public static void fixNamespaces(Document doc){
                Element root = doc.getRootElement();
                if(removeNamespaces && root.getNamespace() != Namespace.NO_NAMESPACE) removeNamespaces( root.content() ); //recursively remove namespaces
        }
        
        This is also relevant:
        
        public void setDoc(Document doc, boolean runStyleModifications) throws Exception{
                if(doc == null) throw new IllegalStateException("setDoc called with null document passed.");
                root = doc.getRootElement();

                original = root.getNamespace();

                Node n = root.selectSingleNode("/html/body");
                if(n != null) body = (Element)n;
                else body = null;

                n = root.selectSingleNode("/html/head");
                if(n != null) head = (Element)n;
                else head = null;

                this.doc = doc;
                if(runStyleModifications && DOMStyle.runModifications) servlet.modify(this);  //takes us to the merging code
        }
    

Print the document with:

    public static void print(Node node, PrintWriter pw, Namespace original) throws IOException{
        if(node instanceof Document && original != null) unfixNamespaces( (Document)node, original );
           
        OutputFormat of = OutputFormat.createPrettyPrint();
        of.setIndent("\t");           
        of.setLineSeparator("\n");
        //of.setTrimText(trim);
        of.setExpandEmptyElements(false);
        //of.setNewlines(true);
        //of.setNewLineAfterNTags(1);
        //of.setXHTML(true);
       
        //StringBufferWriter sbw = new StringBufferWriter();       
       
        XMLWriter writer = new XMLWriter(pw, of);
        writer.setEscapeText(true);
        writer.setResolveEntityRefs(false);
        writer.write(node);
        pw.flush();
       
    }
    
    public static void unfixNamespaces(Document doc, Namespace original){
        Element root = doc.getRootElement();
        if(removeNamespaces && original != null) setNamespaces(root.content(), original); //recursive
    }

Here's the template merging code:

        public void modify(DOMBase base) throws Exception{

        Element body = base.body;
        Element head = base.head;
        String title = base.getTitle();
        Date modified = base.getModified();
       

        Document styleDoc = base.getDocument(new URI("http://www.uark.edu/staff/uarkinfo/public_html/cstemplate/cstemplate_link.htm"), true);
       
        base.setDoc(styleDoc, false);
       
        body.setQName( new QName( "div", body.getNamespace() ) );
        List l = body.attributes();
        for(int i=0; i<l.size(); i++){
            Attribute att = (Attribute)l.get(i);
            if( mergeAttributes.contains( att.getName().toLowerCase() ) ){
                base.body.addAttribute( att.getName(), base.body.attributeValue( att.getName() ) + att.getValue() );
            }
        }
        if(body != null) base.replaceComment("content", body);
           
        if(base.head != null && head != null) base.head.content().addAll(head.content());
       
        base.setTitle(base.getTitle()+title);
       
        String filename = "generated document";
        if(base.uri != null){           
            if( "file".equals(base.uri.getScheme()) ) filename = new File(base.uri).getName();
        }      
      
        base.replaceComment("url", filename);      
       
        base.replaceComment("last_update", ""+modified);
        
    }
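
For comparison, here is a rough analog of the head-merging step written against the JDK's W3C DOM instead of dom4j. It is only a sketch: `HeadMerge`, `parse` and `mergeHeads` are hypothetical names, and it copies each child with importNode so that no node ends up belonging to two documents. Whether that has any bearing on the serialization problem described in this thread is not established.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class HeadMerge {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Appends a deep copy of every child of the generated <head> to the template <head>.
    static void mergeHeads(Document template, Document generated) {
        Element target = (Element) template.getElementsByTagName("head").item(0);
        Element source = (Element) generated.getElementsByTagName("head").item(0);
        if (target == null || source == null) return;
        for (Node child = source.getFirstChild(); child != null; child = child.getNextSibling()) {
            // importNode creates a copy owned by the template document; the source is untouched.
            target.appendChild(template.importNode(child, true));
        }
    }

    public static void main(String[] args) {
        Document template = parse("<html><head><title>T</title></head><body/></html>");
        Document generated = parse("<html><head><meta name=\"a\"/></head><body/></html>");
        mergeHeads(template, generated);
        Node head = template.getElementsByTagName("head").item(0);
        System.out.println("children in merged head: " + head.getChildNodes().getLength());
    }
}
```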