On Wed, 19 Jan 2005, Peter Stevens wrote:
> Edward Peschko wrote:
[...]
> >If there was an integration between LWP and seamonkey, what form
> >of integration would people feel would be most useful?
[...]
> I think seamonkey integration would be a good thing and see it as an
> alternative to mech. Essentially the same methods as mech, but there
> would be two advantages:
>
> 1. Since the browser is supported by (an increasing number of)
>    websites, there will be fewer issues of "it works under
>    Firefox/IE6/etc, but not with my script".
> 2. support for javascript. A lot of sites use javascript to do
>    argument checking before dispatching to the actual link. I'd like
>    to invoke a method 'click on the button', have it do the
>    javascript and get/post/whatever the link. As it is, I have to use
>    hard coded URLs or do regex matching on the javascript to find
>    where the button actually posts to. Inelegant at best and fragile
>    at worst.
[...]

Do you mean spidermonkey (Mozilla's JavaScript interpreter)? Or do you
mean Mozilla itself, through XP-COM? (Wasn't seamonkey the original
project to get a working browser out of Netscape's source code? Or is
there some project now to make the Mozilla source code usable as a
library?)

The latter would be essentially a replacement for LWP, rather than
something that you would integrate with it. If you mean the former,
that doesn't remove the need for LWP and mechanize.

I got a first attempt at automatic JavaScript interpretation working
for the Python port of mechanize and parts of LWP:

http://wwwsearch.sourceforge.net/DOMForm/
http://wwwsearch.sourceforge.net/python-spidermonkey/
http://wwwsearch.sourceforge.net/mechanize/

If there's a good HTML DOM parser for Perl, it will be fairly easy to
get something like this working with a few changes to Claes Jacobssen's
JavaScript module (Perl wrapper of spidermonkey, which I borrowed from
when doing the stuff above).

Never did anything with it, though: I think it would be a LOT of work
to make it work really well, certainly if nobody has already written a
browser-style (rather than standards-compliant!) HTML DOM for you. I
had to hack a DOM together from somebody's unmaintained pre-standards
implementation of the HTML DOM. The tree builder literally gave me a
headache (the version on my web site is certainly very incorrect,
though if anybody is interested in doing a Perl version, I can probably
dig out some patches that people sent me to make it work something
approaching correctly).

I don't want to put people off, though: a module that gives a useful
level of compatibility with real browsers that is much better than my
effort is quite doable in somebody's spare time, I think. It would be
nice to see it done well -- browsers are such heavy things to drag into
your code when all you want to do is fetch one lousy URL without poring
over somebody else's JavaScript!


John