On Thu, 6 Apr 2006, Peter Stevens wrote:
[...]
    One typical use of Javascript is to perform argument checking before
    posting to the server. The URL you want is probably just buried in
    the Javascript function. Do a regular expression match on
    $mech->content() to find the link that you want and $mech->get
    it directly (this assumes that you know what you are looking for in
    advance).

    In more difficult cases, the Javascript is used for URL mangling to
    satisfy the needs of some middleware. In this case you need to
    figure out what the Javascript is doing (why are these URLs always
    really long?). There is probably some function with one or more
    arguments which calculates the new URL.
[...]
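
For the simple case described in the quoted text, a minimal sketch of the regular-expression approach might look like the following. The page content, the openReport function, the URL and the extract_js_url helper are all made up for illustration, and the pattern assumes the script assigns a quoted string to window.location; a real page will likely need a different pattern.

```perl
use strict;
use warnings;

# Hypothetical page content, standing in for what $mech->content() returns.
my $content = <<'HTML';
<a href="#" onclick="return openReport();">Report</a>
<script type="text/javascript">
function openReport() {
    window.location = '/reports/view.cgi?id=42';
    return false;
}
</script>
HTML

# Hypothetical helper: return the first quoted URL assigned to
# window.location, or undef if the pattern does not match.
sub extract_js_url {
    my ($html) = @_;
    return $html =~ /window\.location\s*=\s*(['"])(.*?)\1/ ? $2 : undef;
}

my $url = extract_js_url($content);
print defined $url ? "found: $url\n" : "no URL found\n";

# With a live WWW::Mechanize object you would then fetch it directly:
#     $mech->get($url) if defined $url;
```

The sample is kept free of network access so it runs on its own; in a real script $content would come from $mech->content() after fetching the page.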

Another very common thing that's important for would-be scrapers is manipulation of forms (adding form controls and list items, submitting forms). In a sense that's just URL manipulation, of course, but in the FAQ it might be useful to draw people's attention to this specific case.

Script can also set cookies.


John
