Re: Extracting xml from html

2007-09-19 Thread Stefan Behnel
[EMAIL PROTECTED] wrote: >>row = tree.find("//Row") >>print row.findtext("primaryowner") >>print row.findtext("customeraddress") > > I tried this your way and Laurent's way and both give me this error: > > AttributeError: 'NoneType' object has no attribute 'findtext' Well, error hand
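
For readers following along: the AttributeError quoted above is what you get when find() matches nothing and returns None, and findtext() is then called on that None. Here is a minimal sketch of guarding against it. It uses the standard library's xml.etree.ElementTree instead of lxml (an assumption made so the snippet is self-contained; the find/findtext calls have the same shape). The sample document and the helper name first_row_fields are hypothetical, and only the element names Row, primaryowner and customeraddress come from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element names come from the thread.
SAMPLE = (
    "<root><Row>"
    "<primaryowner>Jane Doe</primaryowner>"
    "<customeraddress>1 Main St</customeraddress>"
    "</Row></root>"
)


def first_row_fields(xml_text, fields=("primaryowner", "customeraddress")):
    """Return a dict of field texts from the first Row, or None if no Row exists."""
    root = ET.fromstring(xml_text)
    # find() returns None when nothing matches, so check before using the result.
    row = root.find(".//Row")
    if row is None:
        return None
    # findtext() itself returns None for a missing child element.
    return {name: row.findtext(name) for name in fields}


if __name__ == "__main__":
    print(first_row_fields(SAMPLE))
    print(first_row_fields("<root/>"))
```

The point is only the "is None" check. Whether to return None, raise a clearer exception, or skip the document is a design choice the thread leaves open.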

Re: Extracting xml from html

2007-09-19 Thread kyosohma
On Sep 19, 3:13 am, Stefan Behnel <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > Does this make sense? It works pretty well, but I don't really > > understand everything that I'm doing. > > > def Parser(filename): > > It's uncommon to give a function a capitalised name, unless it's a fac

Re: Extracting xml from html

2007-09-19 Thread Stefan Behnel
[EMAIL PROTECTED] wrote: > Does this make sense? It works pretty well, but I don't really > understand everything that I'm doing. > > def Parser(filename): It's uncommon to give a function a capitalised name, unless it's a factory function (which this isn't). > parser = etree.HTMLParser() >

Re: Extracting xml from html

2007-09-19 Thread Laurent Pointal
[EMAIL PROTECTED] a écrit : > On Sep 18, 1:56 am, Stefan Behnel <[EMAIL PROTECTED]> wrote: >> [EMAIL PROTECTED] wrote: >>> I am attempting to extract some XML from an HTML document that I get >>> returned from a form based web page. For some reason, I cannot figure >>> out how to do this. >>> Here'

Re: Extracting xml from html

2007-09-19 Thread Stefan Behnel
George Sakkis wrote: > Given that you can do in 2 lines what > took you around 15 with lxml, I wouldn't think it twice. Don't judge a tool by beginner's code. Stefan

Re: Extracting xml from html

2007-09-18 Thread George Sakkis
On Sep 18, 3:31 pm, [EMAIL PROTECTED] wrote: > On Sep 17, 4:51 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> > wrote: > > > > > En Mon, 17 Sep 2007 17:31:19 -0300, <[EMAIL PROTECTED]> escribió: > > > > I am attempting to extract some XML from an HTML document that I get > > > returned from a form bas

Re: Extracting xml from html

2007-09-18 Thread kyosohma
On Sep 18, 1:56 am, Stefan Behnel <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > I am attempting to extract some XML from an HTML document that I get > > returned from a form based web page. For some reason, I cannot figure > > out how to do this. > > Here's a sample of the html: > > >

Re: Extracting xml from html

2007-09-18 Thread kyosohma
On Sep 17, 4:51 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote: > En Mon, 17 Sep 2007 17:31:19 -0300, <[EMAIL PROTECTED]> escribió: > > > I am attempting to extract some XML from an HTML document that I get > > returned from a form based web page. For some reason, I cannot figure > > out how to

Re: Extracting xml from html

2007-09-18 Thread Paul Boddie
On 17 Sep, 23:14, [EMAIL PROTECTED] wrote: > > I have lxml installed and I appear to also have libxml2dom installed. > I know lxml has decent docs, but I don't see much for yours. Is this > the only place to go:http://www.boddie.org.uk/python/libxml2dom.html > ? Unfortunately yes, with regard to o

Re: Extracting xml from html

2007-09-18 Thread Stefan Behnel
[EMAIL PROTECTED] wrote: > I am attempting to extract some XML from an HTML document that I get > returned from a form based web page. For some reason, I cannot figure > out how to do this. > Here's a sample of the html: > > > > lots of screwy text including divs and spans > > 1126264 >

Re: Extracting xml from html

2007-09-17 Thread Gabriel Genellina
En Mon, 17 Sep 2007 17:31:19 -0300, <[EMAIL PROTECTED]> escribió: > I am attempting to extract some XML from an HTML document that I get > returned from a form based web page. For some reason, I cannot figure > out how to do this. I thought I could use the minidom module to do it, > but all I get

Re: Extracting xml from html

2007-09-17 Thread kyosohma
On Sep 17, 4:01 pm, Paul Boddie <[EMAIL PROTECTED]> wrote: > On 17 Sep, 22:31, [EMAIL PROTECTED] wrote: > > > > > What's the best way to get at the XML? Do I need to somehow parse it > > using the HTMLParser and then parse that with minidom or what? > > Probably easiest is to use an XML processing

Re: Extracting xml from html

2007-09-17 Thread Paul Boddie
On 17 Sep, 22:31, [EMAIL PROTECTED] wrote: > > What's the best way to get at the XML? Do I need to somehow parse it > using the HTMLParser and then parse that with minidom or what? Probably easiest is to use an XML processing toolkit or library which supports HTML parsing. Since the libxml2 librar
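
On the quoted question about using HTMLParser: below is a rough sketch of a standard-library-only alternative. It is not the libxml2-based route suggested in this reply. It assumes the wanted values sit in simply nested elements named primaryowner and customeraddress (names taken from elsewhere in the thread). The class FieldCollector, the function extract_fields and the sample HTML are all hypothetical.

```python
from html.parser import HTMLParser


class FieldCollector(HTMLParser):
    """Collect the text content of a few named elements from messy HTML."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None   # name of the wanted tag we are inside, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag in self.wanted:
            self.current = tag
            self.fields.setdefault(tag, "")

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        # Text may arrive in several chunks, so append rather than assign.
        if self.current is not None:
            self.fields[self.current] += data


def extract_fields(html_text, wanted=("primaryowner", "customeraddress")):
    collector = FieldCollector(wanted)
    collector.feed(html_text)
    collector.close()
    return {name: text.strip() for name, text in collector.fields.items()}


# Hypothetical sample; the real page from the thread is not reproduced here.
SAMPLE_HTML = (
    "<html><body><div>noise</div><xml><Row>"
    "<primaryowner>Jane Doe</primaryowner>"
    "<customeraddress>1 Main St</customeraddress>"
    "</Row></xml></body></html>"
)

if __name__ == "__main__":
    print(extract_fields(SAMPLE_HTML))
```

This only handles flat fields. For several Row elements or nested structure, a tree-building parser such as the ones discussed in this thread is likely the better fit.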