On Wed, 19 Jan 2005, Peter Stevens wrote:
> Edward Peschko wrote:
[...]
> >If there was an integration between LWP and seamonkey, what form
> >of integration would people feel would be most useful?
[...]
> I think seamonkey integration would be a good thing and see it as an
> alternative to mech. Essentially the same methods as mech, but there
> would be two advantages:
>
>    1. Since the browser is supported by (an increasing number of)
>       websites, there will be fewer issues of "it works under
>       Firefox/IE6/etc, but not with my script".
>    2. support for javascript. A lot of sites use javascript to do
>       argument checking before dispatching to the actual link. I'd like
>       to invoke a method 'click on the button', have it do the
>       javascript and get/post/whatever the link. As it is, I have to use
>       hard coded URLs or do regex matching on the javascript to find
>       where the button actually posts to. Inelegant at best and fragile
>       at worst.
[...]

Do you mean spidermonkey (Mozilla's JavaScript interpreter)?

Or do you mean Mozilla itself, through XPCOM? (Wasn't seamonkey the
original project to get a working browser out of Netscape's source code?
Or is there some project now to make the Mozilla source code usable as a
library?)

The latter would be essentially a replacement for LWP, rather than
something that you would integrate with it.

If you mean the former, that doesn't remove the need for LWP and
mechanize.  I got a first attempt at automatic JavaScript interpretation
working on top of the Python ports of mechanize and parts of LWP:

http://wwwsearch.sourceforge.net/DOMForm/
http://wwwsearch.sourceforge.net/python-spidermonkey/
http://wwwsearch.sourceforge.net/mechanize/


If there's a good HTML DOM parser for Perl, it should be fairly easy to get
something like this working with a few changes to Claes Jacobsson's
JavaScript module (a Perl wrapper of spidermonkey, which I borrowed from
when doing the stuff above).
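
To give a rough idea of the parsing half of that job, here is a minimal
sketch in Python (chosen only because the ports above are in Python). It
uses just the standard library's html.parser to collect each form and the
inline event handlers inside it, which is the raw material a JavaScript
interpreter would then have to run against a DOM. The HandlerCollector
class, the collect_forms helper, and the sample page are all hypothetical
names made up for illustration; nothing here executes any JavaScript or
builds a real DOM, and it is not the code from the projects linked above.

```python
from html.parser import HTMLParser


class HandlerCollector(HTMLParser):
    """Collect forms plus any inline event handlers found inside them.

    Hypothetical helper for illustration only; not part of any library.
    """

    EVENT_ATTRS = ("onclick", "onsubmit")

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per <form> seen so far
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action"),
                "method": (attrs.get("method") or "get").lower(),
                "handlers": [],
            }
            self.forms.append(self._current)
        if self._current is not None:
            # Record (tag, attribute, script source) for each handler.
            for name in self.EVENT_ATTRS:
                if attrs.get(name):
                    self._current["handlers"].append((tag, name, attrs[name]))

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def collect_forms(html):
    """Parse an HTML string and return the list of collected forms."""
    parser = HandlerCollector()
    parser.feed(html)
    parser.close()
    return parser.forms


# Made-up sample page: a button whose click is handled by JavaScript.
SAMPLE = """
<form action="/search" method="POST" onsubmit="return check(this)">
  <input type="text" name="q">
  <input type="button" value="Go" onclick="this.form.submit()">
</form>
"""

forms = collect_forms(SAMPLE)
```

The hard part is everything after this step: giving those handler strings
a browser-like DOM to run against, so that something like
"this.form.submit()" actually resolves to the right request.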

Never did anything with it, though: I think it would be a LOT of work to
make it work really well, certainly if nobody has already written a
browser-style (rather than standards-compliant!) HTML DOM for you.  I had
to hack a DOM together from somebody's unmaintained pre-standards
implementation of the HTML DOM.  The tree builder literally gave me a
headache (the version on my web site is certainly very incorrect, though
if anybody is interested in doing a Perl version, I can probably dig out
some patches that people sent me to make it work something approaching
correctly).

I don't want to put people off, though: a module that gives a useful level
of compatibility with real browsers that is much better than my effort is
quite doable in somebody's spare time, I think. It would be nice to see it
done well -- browsers are such heavy things to drag into your code when
all you want to do is fetch one lousy URL without poring over somebody
else's JavaScript!


John
