Re: [dom4j-user] Extra Attributes

Evan Kirkconnell Mon, 26 Jun 2006 08:13:28 -0700

Well, I'm not sure there's much to help with anymore. If the document isn't rendered correctly when the whitespace is formatted, then we'll just have to change the source document. I'm unable to reproduce the rowspan/colspan issue after formatting the document. So, really the only thing left is the broken formatting after a DOM manipulation, in this case the merging of the document heads.

As requested, here's my code. It's somewhat spread out because I have more of a framework than a single program. I've pieced it together into an attached .txt file.


Thanks,
--Evan





Mike Skells wrote:

Hi,
I was looking for source XML + parse code + dom model printed, so that we
can all give more precise help.

Mike
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Evan Kirkconnell
Sent: 27 June 2006 00:27
Cc: [email protected]
Subject: Re: [dom4j-user] Extra Attributes

Oops, forgot the attachments.
I wasn't able to reproduce the rendering difference by just removing the rowspan and colspan attributes. I guess the pretty printing affected this.
--Evan

Evan Kirkconnell wrote:
I think I'm just going to live with it all until we have a new style template, which is supposed to happen in a month or so. Hopefully this will resolve issues like this. The new one is not supposed to have tables.
As for what I'm doing... I'm taking a style template page and merging it with my generated content. The style template display is breaking because of rowspan/colspan attributes as well as whitespace differences.
You can see this in the attached files. template.html is the template, and new.html is the output after merging. I removed the merged (generated) body content so that you just see the template differences. You can also see the issue I was talking about in my response to your pretty print suggestion, where the serialization isn't correct where the two heads are merged. (I left the merged head content in there.)
It's easier to see the code changes if you reformat the html, but this also causes rendering differences.

--Evan


Mike Skells wrote:
Hi,
Can you provide an example (code + data) of what you are doing, what
you expect, and what you are getting?

-Mike
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Evan Kirkconnell
Sent: 24 June 2006 03:04
To: Mike Skells
Cc: [email protected]
Subject: Re: [dom4j-user] Extra Attributes

Thanks for the response Mike,
I still have a dilemma though, because if I set Xerces' http://apache.org/xml/features/nonvalidating/load-external-dtd to false, it stops adding the rowspan="1"'s, but it also removes all the &nbsp;'s from the document. Do you have any suggestions?
I'm going to manually remove the added attributes, but I don't really like this solution.
--Evan
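
One possible way around the dilemma above, sketched with the JDK's built-in parser rather than dom4j so that it stands alone: keep external DTD loading switched on, but answer the request for the XHTML DTD with a tiny stand-in that declares only the character entities. Because the stand-in has no ATTLIST declarations, there are no defaults to add, while &nbsp; still resolves. The class name EntitiesOnlyDtd and the one-entity stand-in are assumptions made for illustration; a real template would need every entity it uses declared. The SAXReader in the attached code has a setEntityResolver call commented out, which is presumably where a resolver like this would plug in, though that wiring is untested here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper: serves a stand-in DTD that declares entities only.
final class EntitiesOnlyDtd {

    // Assumption: only &nbsp; is needed. Add further <!ENTITY ...> lines as required.
    static final String ENTITY_DECLARATIONS = "<!ENTITY nbsp \"&#160;\">";

    // Any request for a *.dtd system id gets the stand-in; everything else
    // falls through to the parser's default resolution (null means "default").
    static final EntityResolver RESOLVER = (publicId, systemId) ->
            systemId != null && systemId.endsWith(".dtd")
                    ? new InputSource(new StringReader(ENTITY_DECLARATIONS))
                    : null;

    // Parses the XML text without validation, using the stand-in DTD.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(RESOLVER);
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    private EntitiesOnlyDtd() {}
}
```

With this in place, a document carrying the XHTML Transitional DOCTYPE should parse without fetching the real DTD, so no rowspan, colspan, or clear defaults are injected, and &nbsp; comes through as the character U+00A0.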

Mike Skells wrote:
Hi Evan,
This is not DOM4J doing this, I don't think, as it does not understand HTML!
Perhaps it is a validating SAX parser, or something else in the parse chain.

Mike
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Evan Kirkconnell
Sent: 22 June 2006 23:01
To: [email protected]
Subject: [dom4j-user] Extra Attributes

DOM4J seems to be adding attributes to my document. I don't know if this is when it's loaded, or when it's serialized, and I'd guess it has something to do with the XHTML DTD.
It's adding rowspan="1" and colspan="1" to my td's and like clear="none" or something like that to br's. I'm using a pre-existing style document, and this is changing the layout.
I know it sounds kind of silly, but it is changing how the browsers render the content. Is there a way to turn this off?
--Evan
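
For what it's worth, the extra attributes described above match the defaults that the XHTML DTD declares in its ATTLIST entries, and a SAX parser reports such defaults alongside the attributes actually written in the file. SAX2 lets you tell the two apart through Attributes2.isSpecified. Below is a minimal sketch, assuming the underlying parser supplies Attributes2 (the JDK's bundled parser does): an XMLFilterImpl subclass that forwards only attributes present in the source text. The class name SpecifiedAttributesFilter and the attributesSeen helper are made up for this example. I believe dom4j's SAXReader accepts a filter of this kind through setXMLFilter, but that hookup is not exercised here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.Attributes2;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: drops attributes that came from DTD defaults.
final class SpecifiedAttributesFilter extends XMLFilterImpl {

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (atts instanceof Attributes2) {
            Attributes2 source = (Attributes2) atts;
            AttributesImpl kept = new AttributesImpl();
            for (int i = 0; i < atts.getLength(); i++) {
                // isSpecified is false for values filled in from the DTD.
                if (source.isSpecified(i)) {
                    kept.addAttribute(atts.getURI(i), atts.getLocalName(i),
                            atts.getQName(i), atts.getType(i), atts.getValue(i));
                }
            }
            atts = kept;
        }
        super.startElement(uri, localName, qName, atts);
    }

    // Demo helper: parses the text and lists "element@attribute=value"
    // for every attribute that survives the filter.
    static List<String> attributesSeen(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();

        SpecifiedAttributesFilter filter = new SpecifiedAttributesFilter();
        filter.setParent(parser);

        List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    seen.add(qName + "@" + atts.getQName(i) + "=" + atts.getValue(i));
                }
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }
}
```

Filtering during the parse avoids a second pass over the tree to strip attributes afterwards, and it leaves alone any rowspan or colspan that the template author wrote explicitly.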
_______________________________________________
dom4j-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dom4j-user

Load the document with: 

        public static Document getDocument(URI uri) throws Exception{
                try{
                        SAXReader reader = getSAXReader();
                        Document doc = null;

                        long start = System.currentTimeMillis();

                        if( "file".equals(uri.getScheme()) ){
                                doc = reader.read( new File(uri) );
                                doc.addComment("DOM document originally created from file in "+(System.currentTimeMillis() - start)+" milliseconds");
                        }else{
                                doc = reader.read( uri.toURL() );
                                doc.addComment("DOM document originally created from URL in "+(System.currentTimeMillis() - start)+" milliseconds");
                        }

                        fixNamespaces(doc);

                        return doc;
                }catch(Exception e){
                        throw new Exception("Error getting document from: "+uri.toString(), e);
                }
        }
        
        public static SAXReader getSAXReader() throws org.xml.sax.SAXException{
                if(saxReader == null){
                        saxReader = new SAXReader();
                        //saxReader.setFeature("http://xml.org/sax/features/namespaces", false);
                        //saxReader.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
                        //saxReader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
                        //saxReader.setFeature("http://apache.org/xml/features/dom/include-ignorable-whitespace", false);
                        saxReader.setValidation(false);
                        //saxReader.setEntityResolver(null);
                        saxReader.setDocumentFactory(df);
                        //saxReader.setStripWhitespaceText(true);
                }
                return saxReader;
        }
        
        public static void fixNamespaces(Document doc){
                Element root = doc.getRootElement();
                if(removeNamespaces && root.getNamespace() != Namespace.NO_NAMESPACE) removeNamespaces( root.content() ); //recursively remove namespaces
        }
        
        This is also relevant:
        
        public void setDoc(Document doc, boolean runStyleModifications) throws Exception{
                if(doc == null) throw new IllegalStateException("setDoc called with null document passed.");
                root = doc.getRootElement();

                original = root.getNamespace();

                Node n = root.selectSingleNode("/html/body");
                if(n != null) body = (Element)n;
                else body = null;

                n = root.selectSingleNode("/html/head");
                if(n != null) head = (Element)n;
                else head = null;

                this.doc = doc;
                if(runStyleModifications && DOMStyle.runModifications) servlet.modify(this);  //takes us to the merging code
        }
    

Print the document with:

    public static void print(Node node, PrintWriter pw, Namespace original) throws IOException{
        if(node instanceof Document && original != null) unfixNamespaces( (Document)node, original );
           
        OutputFormat of = OutputFormat.createPrettyPrint();
        of.setIndent("\t");           
        of.setLineSeparator("\n");
        //of.setTrimText(trim);
        of.setExpandEmptyElements(false);
        //of.setNewlines(true);
        //of.setNewLineAfterNTags(1);
        //of.setXHTML(true);
       
        //StringBufferWriter sbw = new StringBufferWriter();       
       
        XMLWriter writer = new XMLWriter(pw, of);
        writer.setEscapeText(true);
        writer.setResolveEntityRefs(false);
        writer.write(node);
        pw.flush();
       
    }
    
    public static void unfixNamespaces(Document doc, Namespace original){
        Element root = doc.getRootElement();
        if(removeNamespaces && original != null) setNamespaces(root.content(), original); //recursive
    }

Here's the template merging code:

        public void modify(DOMBase base) throws Exception{

        Element body = base.body;
        Element head = base.head;
        String title = base.getTitle();
        Date modified = base.getModified();
       

        Document styleDoc = base.getDocument(new URI("http://www.uark.edu/staff/uarkinfo/public_html/cstemplate/cstemplate_link.htm"), true);
       
        base.setDoc(styleDoc, false);
       
        body.setQName( new QName( "div", body.getNamespace() ) );
        List l = body.attributes();
        for(int i=0; i<l.size(); i++){
            Attribute att = (Attribute)l.get(i);
            if( mergeAttributes.contains( att.getName().toLowerCase() ) ){
                base.body.addAttribute( att.getName(), base.body.attributeValue( att.getName() ) + att.getValue() );
            }
        }
        if(body != null) base.replaceComment("content", body);
           
        if(base.head != null && head != null) base.head.content().addAll(head.content());
       
        base.setTitle(base.getTitle()+title);
       
        String filename = "generated document";
        if(base.uri != null){           
            if( "file".equals(base.uri.getScheme()) ) filename = new File(base.uri).getName();
        }      
      
        base.replaceComment("url", filename);      
       
        base.replaceComment("last_update", ""+modified);
        
    }
