
Re: [BangPypers] HTML Parsing in python

2009-10-20 Thread Anand Balachandran Pillai

On Thu, Sep 10, 2009 at 7:44 PM, Puneet Aggarwal look4pun...@gmail.com wrote: Thanks all for the suggestions. I think I will start with BeautifulSoup (3.0.7a) and will experiment with the other suggested libs if it does not fit my requirements or if I face issues with it. You are not going

Re: [BangPypers] HTML Parsing in python

2009-10-20 Thread Yuvi Panda

I use lxml.html. Just as good, and MUCH faster. A pain to install though. On Tue, Oct 20, 2009 at 6:32 PM, Anand Balachandran Pillai abpil...@gmail.com wrote: On Thu, Sep 10, 2009 at 7:44 PM, Puneet Aggarwal look4pun...@gmail.com wrote: Thanks all for the suggestions. I think I will start

Re: [BangPypers] HTML Parsing in python

2009-10-20 Thread srid

On Tue, Oct 20, 2009 at 6:34 PM, Yuvi Panda yuvipa...@gmail.com wrote: I use lxml.html. Just as good, and MUCH faster. A pain to install though. If you're using ActivePython, the following command is just enough to get lxml installed on Mac, Linux or Windows: $ pypm install lxml

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Anand Chitipothu

2009/9/10 Puneet Aggarwal look4pun...@gmail.com: Hi BangPypers, Can anyone suggest a good library for HTML parsing in Python? I googled and found a few libraries: BeautifulSoup, HTMLParser, SGMLParser etc. Can anyone suggest which one I should go for, from your experience? I recommend

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Baiju M

On Thu, Sep 10, 2009 at 2:29 PM, Puneet Aggarwal look4pun...@gmail.com wrote: Hi BangPypers, Can anyone suggest a good library for HTML parsing in Python? http://code.google.com/p/html5lib/ -- Baiju M

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Noufal Ibrahim

On Thu, Sep 10, 2009 at 3:41 PM, Anand Chitipothu anandol...@gmail.com wrote: 2009/9/10 Puneet Aggarwal look4pun...@gmail.com: Hi BangPypers, Can anyone suggest a good library for HTML parsing in Python? I googled and found a few libraries: BeautifulSoup, HTMLParser, SGMLParser etc. Can

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Ramkumar R

or use cElementTree (the ElementTree implementation in C). ElementTree is an XML parser. Forget that I mentioned it if you're only going to be parsing HTML.
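The point about ElementTree being an XML parser can be sketched as follows. This is a minimal illustration, assuming a current Python 3, where the standard `xml.etree.ElementTree` module already uses the C accelerator and the separate `cElementTree` name no longer exists. The helper name `parses_as_xml` is made up for this example.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup):
    """Return True if ElementTree accepts the markup as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for anything that is not well-formed XML,
        # which includes a lot of ordinary real-world HTML.
        return False


well_formed = "<p>Hello <b>world</b></p>"
typical_html = "<p>Hello <br> world</p>"  # <br> is never closed

print(parses_as_xml(well_formed))
print(parses_as_xml(typical_html))
```

An unclosed `<br>` is routine in HTML but is rejected by an XML parser, which is why ElementTree on its own is a poor fit for scraped pages.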

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Ramkumar R

+1 Beautiful Soup The author is no longer interested in maintaining BeautifulSoup (see http://www.crummy.com/software/BeautifulSoup/3.1-problems.html). The BeautifulSoup port to Python 3.x is pretty terrible, as it's based on the error-intolerant HTMLParser. While it's a fantastic library for

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread S.Ramaswamy

On Thu, Sep 10, 2009 at 2:29 PM, Puneet Aggarwal look4pun...@gmail.com wrote: Hi BangPypers, Can anyone suggest a good library for HTML parsing in Python? I googled and found a few libraries: BeautifulSoup, HTMLParser, SGMLParser etc. Can anyone suggest which one I should go for, from your

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Baishampayan Ghose

Can anyone suggest a good library for HTML parsing in Python? I googled and found a few libraries: BeautifulSoup, HTMLParser, SGMLParser etc. Can anyone suggest which one I should go for, from your experience? BeautifulSoup was OK, but now it's broken. Use lxml, it's very good.

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Dhananjay Nene

Do you require tolerance for non-well-formed XML/HTML? If yes, you may consider sgmlop http://effbot.org/zone/sgmlop-index.htm On Thu, Sep 10, 2009 at 7:07 PM, Baishampayan Ghose b.gh...@gmail.com wrote: Can anyone suggest a good library for HTML parsing in Python? I googled and found a few

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Puneet Aggarwal

Thanks all for the suggestions. I think I will start with BeautifulSoup (3.0.7a) and will experiment with the other suggested libs if it does not fit my requirements or if I face issues with it. On Thu, Sep 10, 2009 at 7:07 PM, Baishampayan Ghose b.gh...@gmail.com wrote: Can anyone suggest a

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread Puneet Aggarwal

Hi Dhananjay, My requirement is simple. I need to extract information from a page. But the pages can be malformed HTML or any junk HTML, so tolerance is required. Thanks, Puneet On Thu, Sep 10, 2009 at 7:33 PM, Dhananjay Nene dhananjay.n...@gmail.com wrote: Do you require tolerance
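The requirement above (pulling information out of possibly malformed pages) can be sketched with nothing but the standard library. This is only an illustration, not what anyone in the thread used: it assumes a current Python 3, where `html.parser.HTMLParser` is lenient about broken markup, unlike the 2009-era HTMLParser criticised elsewhere in the thread. The class `LinkTextExtractor`, the function `extract_links`, and the sample markup are all hypothetical.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collect (href, text) pairs for <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Stray or mismatched end tags other than </a> are simply ignored.
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return every (href, text) pair found in the given markup."""
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Junk input: unclosed <p> and <div>, an unquoted attribute, a stray </b>.
junk = ("<html><body><p>Unclosed paragraph<a href='/one'>First</a>"
        "<div><a href=two.html>Second</a></b></body>")

for href, text in extract_links(junk):
    print(href, text)
```

For heavier scraping, the libraries recommended in the thread (lxml, html5lib, BeautifulSoup) also build a full document tree, which this event-based sketch does not.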

Re: [BangPypers] HTML Parsing in python

2009-09-10 Thread srid

On Thu, Sep 10, 2009 at 6:37 AM, Baishampayan Ghose b.gh...@gm BeautifulSoup was OK, but now it's broken. Use lxml, it's very good. http://codespeak.net/lxml/ IanB has an interesting blog post on using lxml to parse HTML:


14 matches
