Thanks for getting back to me, Stephen. Since my last post I actually got some xpaths working, but only by using a more explicit form like "//*[name()='body']". I also tried the simple program you sent and I get the same results: no xpaths work unless I use the explicit name test.

Maybe I'm just remembering things wrong, but it seemed like I could just use an xpath like "//body" when I was writing XSLT stylesheets. When you run this program, are you able to get any results for xpaths in the form "//foo", where foo is a couple of levels down from the root?

thanks again,

Ben


PS: Oh, I tried your code reading in an html file... both with and without the DTD... and for now, I'm not using different namespaces so the standard xhtml namespace is fine...
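[Editor's sketch] The behaviour described above is consistent with XPath 1.0 name tests: an unprefixed test such as "//body" only matches elements in no namespace, while name() and local-name() compare strings and ignore the namespace. The following is a minimal sketch of that difference. It uses the JDK's built-in javax.xml.xpath API rather than dom4j so that it runs without extra jars; the class name NameTestDemo and the inline sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NameTestDemo {
    // Hypothetical sample: a tiny XHTML document with the default namespace declared.
    static final String XHTML =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<head><title>test page</title></head>"
            + "<body><p>SITE LINKS:</p></body></html>";

    // Parses the XML namespace-aware and returns how many nodes the expression selects.
    static int count(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed name test: only matches elements in no namespace.
        System.out.println("//body -> " + count(XHTML, "//body"));
        // String comparisons on the element name: the namespace is ignored.
        System.out.println("//*[name()='body'] -> " + count(XHTML, "//*[name()='body']"));
        System.out.println("//*[local-name()='body'] -> " + count(XHTML, "//*[local-name()='body']"));
    }
}
```

With this sample the unprefixed test selects nothing and the two string-based tests each select the one body element. XSLT 1.0 applies the same rule, so "//body" in a stylesheet would only have matched documents whose elements were in no namespace.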




Stephen C. Upton wrote:
Ben,

You don't *need* to specify a namespace, and it works if you don't. Not sure
if that is one of your requirements, though. I don't use namespaces in my
xml documents when parsing with dom4j, and all works fine. Can't speak to
whether dom4j *should* handle namespaces since I don't use them. I took out
your DOCTYPE line and the xpaths work fine for me, but again, won't help you
if you need to specify some namespace.
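[Editor's sketch] One possible explanation for why removing the DOCTYPE line changes the result: the XHTML 1.0 DTDs declare a #FIXED xmlns attribute on the html element, so a parser that reads the DTD can place every element in the XHTML namespace even when the document itself never declares it. The sketch below imitates that with a small internal DTD subset instead of the real W3C DTD (an assumption, to avoid a network fetch), and uses the JDK's parser and XPath API rather than dom4j. The class name DoctypeNamespaceDemo is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DoctypeNamespaceDemo {
    // The document body carries no xmlns declaration of its own.
    static final String PLAIN =
            "<html><head><title>test page</title></head>"
            + "<body><p>SITE LINKS:</p></body></html>";

    // Stand-in for the XHTML DTD: an internal subset with a #FIXED xmlns default.
    static final String WITH_DTD =
            "<!DOCTYPE html [\n"
            + "  <!ELEMENT html (head, body)>\n"
            + "  <!ATTLIST html xmlns CDATA #FIXED \"http://www.w3.org/1999/xhtml\">\n"
            + "]>\n" + PLAIN;

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample", e);
        }
    }

    // Namespace URI of the root element, or null if it is in no namespace.
    static String rootNamespace(String xml) {
        return parse(xml).getDocumentElement().getNamespaceURI();
    }

    // Number of nodes selected by the expression in the parsed document.
    static int count(String xml, String expression) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, parse(xml), XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("no DOCTYPE:   ns=" + rootNamespace(PLAIN)
                + ", //body hits=" + count(PLAIN, "//body"));
        System.out.println("with DOCTYPE: ns=" + rootNamespace(WITH_DTD)
                + ", //body hits=" + count(WITH_DTD, "//body"));
    }
}
```

If this is what is happening in the thread, "//body" would be expected to match only in the run without the DOCTYPE, which would fit the report above. Whether dom4j's SAXReader behaves the same way depends on the underlying SAX parser and its settings.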

HTH
Steve

BTW, here's the class I wrote to do your test. Thought it might help, but
it's pretty simple.

import java.io.*;
import org.dom4j.*;
import org.dom4j.io.*;
import org.dom4j.xpath.*;
import java.net.MalformedURLException;


public class XPathTest {
        private String fileName, xpathExpression;
        private Document document;
        private Element root;

        XPathTest(String fileName, String xpathExpression){
                this.fileName = fileName;
                this.xpathExpression = xpathExpression;
        }

        void go(){
                Node n = root.selectSingleNode(xpathExpression);
                System.out.println("Root name is: " + root.getName());
                System.out.println("xpath:" + xpathExpression + ":");

                if ( n == null ) {
                        System.out.println("Node is empty");
                        return;
                }
                System.out.println("Name of node: " + n.getName());
                System.out.println("text of node: " + n.getText());
        }

        public void openXMLFile() {
                try {
                        File file = new File(fileName);
                        SAXReader xmlReader = new SAXReader();
                        document = xmlReader.read(file);
                        root = document.getRootElement();
                } catch (MalformedURLException mue) {
                        System.out.println("Can not form URL from this file: " + fileName);
                        mue.printStackTrace(System.out);
                        System.exit(-1);
                } catch (DocumentException de) {
                        System.out.println("Error attempting to parse: " + fileName);
                        de.printStackTrace(System.out);
                        System.exit(-1);
                }
        }

        public static void main(String[] args){
                XPathTest xpathTest = new XPathTest(args[0], args[1]);
                xpathTest.openXMLFile();
                xpathTest.go();
        }
}

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ben Munat
Sent: Thursday, January 15, 2004 12:10 PM
To: [EMAIL PROTECTED]
Subject: Re: [dom4j-user] xpath not working


Thanks for getting back to me Martin. That's very strange that what I pasted into my email doesn't have the namespace declaration on it. The actual file definitely does, because XmEdil flags the doc as not well-formed when I open it without the namespace declaration.


So, not sure how I got that (actually, I think it was the console output of my little dom4j test, which is weird: it outputs the whole doc when I select "/" but strips the namespace off?), but all of my test xpaths in XmEdil have been with the namespace declaration. I hadn't tried putting the xhtml: prefix on my xpath selections, but that still didn't work.

I'm as stumped as ever, though I guess I should find an xpath mailing list... it would appear that it's not just dom4j's xpath implementation that is the problem. I can't get any xpaths (beyond /) to work in this other program either. Sigh.

b



Bradley, Martin wrote:

Ben,

A couple of things to get started:
1. This document isn't well formed xml. Try to check the well formedness
in your editor and you'll get a message about not being able to define a
default namespace in the DTD. You need <html
xmlns="http://www.w3.org/1999/xhtml"> to replace your html element to
define a default namespace for the rest of the document (and/or check
out this link
http://www.biglist.com/lists/xsl-list/archives/200104/msg00568.html Jeni
describes other ways to define the namespace using the DTD - I don't do
this much so I would just confuse the issue more)
2. the reason you aren't getting anything back is due to the lack of a
namespace or default namespace
3. After doing (1) use your xpath tool and set a namespace with a prefix
(or set a default if you can)  I used
xmlns:xhtml="http://www.w3.org/1999/xhtml"

Then to your original example do //xhtml:[EMAIL PROTECTED] and you'll get a node
list of the elements.

I'm not sure how to do this in dom4j (I'm just learning the api's
myself) but you should be able to set a namespace and prefix. If this
doesn't fix your problem I hope it sheds some light on what's actually
happening.
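[Editor's sketch] The prefix binding described above can be illustrated as follows. This is a minimal sketch, not dom4j code: it uses the JDK's javax.xml.xpath API with a hand-written NamespaceContext, and the class names PrefixBindingDemo and XhtmlContext as well as the sample document are assumptions. The prefix "xhtml" is only a label chosen on the XPath side; what has to match is the namespace URI.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PrefixBindingDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Hypothetical sample: default namespace declared, no prefixes in the document.
    static final String SAMPLE =
            "<html xmlns=\"" + XHTML_NS + "\"><body>"
            + "<div class=\"navbox\"><p>SITE LINKS:</p><a href=\"#\">A link</a></div>"
            + "<div class=\"content\"><h1>Test page!</h1><p>Some text.</p></div>"
            + "</body></html>";

    // Maps the prefix "xhtml" to the XHTML namespace URI for XPath evaluation.
    static final class XhtmlContext implements NamespaceContext {
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return "xhtml".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
        }

        public String getPrefix(String uri) {
            return XHTML_NS.equals(uri) ? "xhtml" : null;
        }

        public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            if (prefix == null) {
                return Collections.<String>emptyList().iterator();
            }
            return Collections.singletonList(prefix).iterator();
        }
    }

    // Evaluates the expression with the "xhtml" prefix bound and returns the hit count.
    static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new XhtmlContext());
            NodeList hits = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("//p -> " + count("//p"));
        System.out.println("//xhtml:p -> " + count("//xhtml:p"));
        System.out.println("//xhtml:div[@class='content']/xhtml:p -> "
                + count("//xhtml:div[@class='content']/xhtml:p"));
    }
}
```

Note that the attribute test stays unprefixed: a default namespace declaration applies to elements, not to attributes. dom4j offers a comparable way to bind prefixes on its XPath objects, but that API is not shown here.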

I've got a busy day today, but I'll try to look at this later if I get a
chance.

Hope this helps!
Marty
-----Original Message-----
From: Ben Munat [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 15, 2004 1:57 AM
To: [EMAIL PROTECTED]
Subject: Re: [dom4j-user] xpath not working


McDonough, Michael wrote:
> While a valid html file, do you know whether it is a well-formed XML
> document?

Bradley, Martin wrote:

> do you have a way to externally test the xpath expression into your html
> doc?  I use Xselerator.  If you don't, post the sample html and I can
> verify the path is returning a nodelist.



Thanks for getting back to me... good idea. I used a prog called XmEdil which has an xpath tester (and tests the doc for validity and well-formedness when I open it). And, it seems I am doing something wrong with the xpath: when I select "/" I still get the whole doc, but when I put in any other xpath (//head, //p, etc.) I get no results.


Now, when I select "/", the xpath tester lists its result as being of type "document"... does this mean that I have to "get into" the document
before my xpaths will work? I don't remember any such thing from my xpath/xsl class.


And actually, I already tried running the dom4j xpath methods after selecting the root node of the document:

        Element root = d.getRootElement();
        Node n = root.selectSingleNode("//p");

That doesn't work either. :-(
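[Editor's sketch] On the question of whether one has to "get into" the document first: in XPath 1.0 an expression starting with "//" is absolute, so it searches from the root of the document that contains the context node, whichever node that is. Only a relative form such as ".//p" depends on the context node. The following minimal sketch shows this on a document without namespaces. It uses the JDK's javax.xml.xpath API rather than dom4j, and the class name ContextNodeDemo and the sample markup are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ContextNodeDemo {
    // Hypothetical sample in no namespace, so unprefixed name tests match.
    static final String SAMPLE =
            "<html><head><title>test page</title></head><body>"
            + "<div class=\"navbox\"><p>SITE LINKS:</p></div>"
            + "<div class=\"content\"><p>Some text.</p></div>"
            + "</body></html>";

    static final XPath XPATH = XPathFactory.newInstance().newXPath();

    static Document parse() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(SAMPLE)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample", e);
        }
    }

    // Counts the nodes the expression selects when evaluated against the given context node.
    static int count(String expression, Node context) {
        try {
            NodeList hits = (NodeList) XPATH.evaluate(expression, context, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    // Returns the first div element, used as a deeper context node.
    static Node firstDiv(Document doc) {
        try {
            return (Node) XPATH.evaluate("/html/body/div[1]", doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("could not find first div", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse();
        Node div = firstDiv(doc);
        System.out.println("//p from document:     " + count("//p", doc));
        System.out.println("//p from root element: " + count("//p", doc.getDocumentElement()));
        System.out.println("//p from first div:    " + count("//p", div));
        System.out.println(".//p from first div:   " + count(".//p", div));
    }
}
```

Here the three "//p" evaluations select both paragraphs regardless of the context node, while ".//p" from the first div selects only the paragraph inside it. So an empty result for "//p" on every context node points at the name test itself rather than at where the evaluation starts.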
        
Anyway, here's the html... hope you guys have more ideas.

Thanks,

Ben

PS: Just noticed that the xpath builder window has a tab called "nodes" which shows a list of the elements and attributes in my doc (p, div, a, class, etc). So, it's even figuring out that there are sub nodes within this "document" element... but I can't select them with xpaths!?

-------------------------------------------------

Using this as test doc:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">


<html>
<head>
<title>
test page
</title>
</head>
<body>
<div class="navbox">
<p>
SITE LINKS:
</p>
<a href="#">A link</a><br />
<a href="#">A link again</a><br />
<a href="#">Another link</a><br />
<a href="#">A link too</a><br />
</div>
<div class="content">
<h1>
Test page!
</h1>
<p>
Lobore et dolore nagna aliquam erat volupat. At enim ad minimin veniami quis nostrud
exercitation lorem ipsum dolor sit amet, consectetur adips cing elit, diam nonnumy
eiusmod tempor incidunt ut ullamcorper suscripit laboris nisi ut alquip exea commodo
consequat, consectetur adips cing elit.
</p>
</div>
</body>
</html>







------------------------------------------------------- This SF.net email is sponsored by: Perforce Software. Perforce is the Fast Software Configuration Management System offering advanced branching capabilities and atomic changes on 50+ platforms. Free Eval! http://www.perforce.com/perforce/loadprog.html _______________________________________________ dom4j-user mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/dom4j-user

NOTICE: This E-mail may contain confidential information. If you are not
the addressee or the intended recipient please do not read this E-mail
and please immediately delete this e-mail message and any attachments
from your workstation or network mail system. If you are the addressee
or the intended recipient and you save or print a copy of this E-mail,
please place it in an appropriate file, depending on whether
confidential information is contained in the message.









-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
_______________________________________________
dom4j-user mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/dom4j-user








