In "lynx-dev Using other HTML parsers in Lynx"
[25/Jul/2000 Tue 19:30:40]
Mooneer Salem wrote:
I don't know much about programming, but some of this is of interest
to me.
> For fun, I decided to write a library that parses HTML. :)
>
> After writing the library I decided to write a demo application using it
> (which prints an English representation of the HTML passed into it and
> which also acts as a benchmark app.) According to the demonstration
> program, it parsed the PostgreSQL-HOWTO (300kb in HTML format) in
> 0.16 seconds, which is pretty impressive for a Celeron 500 with 192MB RAM
> which also acts as a pretty busy DNS, Web and database server.
>
> Here's the question: how hard would it be to implement this parser into
> Lynx?
That depends on how integrated you want it to be. It could be as easy as
defining a new DOWNLOADER [your app] in lynx.cfg.
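
As a rough illustration of that route: a DOWNLOADER line in lynx.cfg takes a
menu description, a shell command in which %s stands for the file Lynx has
just fetched, and a TRUE/FALSE option. The way prettyHTML is invoked here
(reading the file on standard input) is an assumption, since the original
message does not show its command-line usage.

```
# Hypothetical entry: offer the fetched page to the prettyHTML demo program.
# "prettyHTML < %s" is an assumed invocation; adjust to the real usage.
DOWNLOADER:View through prettyHTML:prettyHTML < %s | less:FALSE
```

With an entry like this, the description should appear as a choice on the
download menu once a document has been downloaded.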
What really interests me is this: who feels like teaching it to parse
JavaScript as well (if you're releasing the source and leaving
it open to changes)?
People seem to think adding support for JavaScript isn't practical,
and maybe impossible, in Lynx because of its one-pass HTML parsing.
The idea of a separate application that *could* make sense of JavaScript
and then pass the results to Lynx popped into my head some time ago,
but I don't know how to build it.
Here's an old message about it:
http://www.flora.org/lynx-dev/html/month072000/msg00034.html
The follow-ups are probably more interesting than my input.
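
One way to picture the "separate application" idea is as a filter that sits
in front of Lynx: it finds each script element, hands the script text to
some evaluator, and splices whatever HTML comes back into the page. The
sketch below is only that outline. The names preprocess, evaluate and
drop_script are hypothetical, the default evaluator simply discards the
script, and no real JavaScript is interpreted anywhere.

```python
import re

# Matches one <script ...>...</script> element, case-insensitively and
# across line breaks. A regex is enough for a sketch; a real tool would
# use a proper HTML parser.
SCRIPT_RE = re.compile(
    r"<script\b[^>]*>(.*?)</script\s*>",
    re.IGNORECASE | re.DOTALL,
)


def drop_script(source):
    """Placeholder evaluator: ignore the script and emit no HTML."""
    return ""


def preprocess(html, evaluate=drop_script):
    """Replace every script element with the HTML that evaluate() returns.

    evaluate receives the text between the script tags. A JavaScript
    interpreter would be plugged in here; none is provided.
    """
    return SCRIPT_RE.sub(lambda match: evaluate(match.group(1)), html)
```

A small wrapper that reads a page on standard input and writes the result
of preprocess() to standard output could then be piped into "lynx -stdin",
so Lynx itself would still see plain HTML in a single pass.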
> The parser can be found at
> http://devel.usnuk.net/libhtmlparse-0.1-alpha1.tar.gz,
> and you can see the uncompressed archive at
> http://devel.usnuk.net/libhtmlparse-0.1-alpha1/.
> demo/test.c is the benchmark application I ran, while demo/prettyHTML is
> another demo app I wrote for the library (it makes HTML more clear and
> readable by adding tabs)
>
> --
> Mooneer Salem
> Sysadmin, Ultraspeed UK (http://www.ultraspeed.co.uk/)
> GPLTrans (http://www.translator.cx/)
> Personal Home Page (http://msalem.translator.cx/)
Patrick