Re: Hands-on HTML Table Parser/Matrix?

robert Sun, 06 Jul 2008 07:46:34 -0700

Tim Cook wrote:

On Sun, 2008-07-06 at 14:40 +0200, robert wrote:
Often I want to extract some web table contents. Formats are mostly static, simple text & numbers in it, other tags to be stripped off. So a simple & fast approach would be ok.
Which of the different modules around is easiest to use, stable, up-to-date, with iterator access or, ideally, matrix access (without need for callback functions, classes... for basic tasks)?

> There are couple of HTML examples using Pyparsing here:
>
> http://pyparsing.wikispaces.com/Examples
>
>

Hm, nothing specific to HTML tables there.

Meanwhile:

I dislike "ClientTable" (file-centric, too many parsing errors in the real world).

"TableParse" works. Very simple & fast 70-liner: regexp -> matrix, plus strip/clean/HTML-entities conversion. Fast hands-on success. It consciously doesn't separate nested tables and such complexities, but it still works for simple hands-on tasks in the real world.
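
For illustration, the regexp -> matrix idea can be sketched roughly like this. This is not the TableParse code itself, just a minimal sketch of the same kind of approach; the names table_to_matrix and clean_cell are made up here, and it assumes Python 3 (for html.unescape) and flat, non-nested tables.

```python
import re
from html import unescape

# Patterns for rows, cells, and any leftover tags. Hypothetical names,
# not taken from TableParse. Nested tables are NOT handled.
ROW_RE = re.compile(r"<tr\b[^>]*>(.*?)</tr\s*>", re.IGNORECASE | re.DOTALL)
CELL_RE = re.compile(r"<t[dh]\b[^>]*>(.*?)</t[dh]\s*>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def clean_cell(fragment):
    """Strip inner tags, convert HTML entities, collapse whitespace."""
    text = TAG_RE.sub("", fragment)
    text = unescape(text)
    return " ".join(text.split())


def table_to_matrix(html):
    """Return a list of rows, each a list of cleaned cell strings."""
    matrix = []
    for row in ROW_RE.findall(html):
        cells = [clean_cell(cell) for cell in CELL_RE.findall(row)]
        if cells:  # skip rows that contain no td/th cells
            matrix.append(cells)
    return matrix


if __name__ == "__main__":
    sample = (
        "<table><tr><th>Name</th><th>Price</th></tr>"
        "<tr><td><b>Tea</b> &amp; milk</td><td> 1.50 </td></tr></table>"
    )
    for row in table_to_matrix(sample):
        print(row)
```

The result is a plain list of lists, so cells can be reached as matrix[row][col] or iterated row by row, without callbacks or classes. Anything beyond simple, well-formed tables would need a real HTML parser.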



Robert
--
http://mail.python.org/mailman/listinfo/python-list
