Re: automating javascript data forms

John J Lee Sat, 22 Jan 2005 13:05:20 -0800

On Wed, 19 Jan 2005, Peter Stevens wrote:
> Edward Peschko wrote:
[...]
> >If there was an integration between LWP and seamonkey, what form
> >of integration would people feel would be most useful?
[...]
> I think seamonkey integration would be a good thing and see it as an
> alternative to mech. Essentially the same methods as mech, but there
> would be two advantages:
>
>    1. Since the browser is supported by (an increasing number of)
>       websites, there will be fewer issues of "it works under
>       Firefox/IE6/etc, but not with my script".
>    2. support for javascript. A lot of sites use javascript to do
>       argument checking before dispatching to the actual link. I'd like
>       to invoke a method 'click on the button', have it do the
>       javascript and get/post/whatever the link. As it is, I have to use
>       hard coded URLs or do regex matching on the javascript to find
>       where the button actually posts to. Inelegant at best and fragile
>       at worst.
[...]
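
For concreteness, the regex workaround described in point 2 might look
roughly like the Python sketch below. Everything in it is an assumption
made up for illustration (the page fragment, the submitOrder function,
the /cgi-bin/order.pl URL and the find_post_target helper); it only shows
why the approach is fragile, not how any particular site works.

```python
import re

# Hypothetical page fragment: the button calls a JavaScript function that
# checks the input and then sets the form's target itself.
PAGE = """
<script>
function submitOrder(form) {
    if (form.qty.value == '') { alert('Quantity required'); return false; }
    form.action = '/cgi-bin/order.pl';
    form.submit();
}
</script>
<form name="order">
  <input type="text" name="qty">
  <input type="button" value="Order" onclick="submitOrder(this.form)">
</form>
"""

# Match an assignment of a string literal to some object's .action property.
# This stops working as soon as the site builds the URL dynamically or
# restructures its script, which is the fragility complained about above.
ACTION_RE = re.compile(r"""\.action\s*=\s*["']([^"']+)["']""")


def find_post_target(html):
    """Return the first URL assigned to a form's action in the script, or None."""
    match = ACTION_RE.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(find_post_target(PAGE))
```

A script would then post to the returned URL directly, skipping whatever
checks the JavaScript performs, which is exactly what a real interpreter
hooked up to a DOM would make unnecessary.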


Do you mean spidermonkey (Mozilla's JavaScript interpreter)?

Or do you mean Mozilla itself, through XPCOM? (Wasn't seamonkey the
original project to get a working browser out of Netscape's source code?
Or is there some project now to make the Mozilla source code usable as a
library?)

The latter would be essentially a replacement for LWP, rather than
something that you would integrate with it.

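If you mean the former, that doesn't remove the need for LWP and
mechanize: an interpreter only runs the script, and something still has
to fetch the pages and submit the forms.  I got a first attempt at
automatic JavaScript interpretation working for the Python ports of
mechanize and of parts of LWP: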

http://wwwsearch.sourceforge.net/DOMForm/
http://wwwsearch.sourceforge.net/python-spidermonkey/
http://wwwsearch.sourceforge.net/mechanize/


If there's a good HTML DOM parser for Perl, it would be fairly easy to get
something like this working with a few changes to Claes Jakobsson's
JavaScript module (a Perl wrapper of spidermonkey, which I borrowed from
when doing the stuff above).

Never did anything with it, though: I think it would be a LOT of work to
make it work really well, certainly if nobody has already written a
browser-style (rather than standards-compliant!) HTML DOM for you.  I had
to hack a DOM together from somebody's unmaintained pre-standards
implementation of the HTML DOM.  The tree builder literally gave me a
headache (the version on my web site is certainly very incorrect, though
if anybody is interested in doing a Perl version, I can probably dig out
some patches people sent me that get it to something approaching
correct).

I don't want to put people off, though: a module much better than my
effort, one that gives a useful level of compatibility with real browsers,
is quite doable in somebody's spare time, I think. It would be nice to see it
done well -- browsers are such heavy things to drag into your code when
all you want to do is fetch one lousy URL without poring over somebody
else's JavaScript!


John
