Feature request: it would be useful to add the ability to pre-filter fetched
documents with an external program.

Here's my problem. I'd like to download a site recursively, but all links
are made via JavaScript function calls, e.g. <a
href="javascript:go_to('/some/url.html')">. There is a hidden form on the
page, and this JS function performs some field manipulation and submits the
form (it is used to check authentication). I can add the
"--post-data=login=someuser&passwd=mysecret" parameter to wget and then
convert these URLs to normal form - <a href="/some/url.html"> - via an
external filter (sed, awk, perl). The problem is that I can't run the
external filter automatically after the fetch but before URL extraction.
Hence this feature request.
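As an illustration of the kind of filter I mean, a one-line sed script could
rewrite the javascript: links into plain hrefs that wget's link extractor can
follow (the go_to() name comes from the example above; the exact pattern would
of course depend on the site):

```shell
# Rewrite <a href="javascript:go_to('/some/url.html')"> into
# <a href="/some/url.html"> so the recursive downloader can follow it.
sed "s/href=\"javascript:go_to('\([^']*\)')\"/href=\"\1\"/g" page.html
```

Today this can only be run by hand on already-saved files, which defeats
recursive retrieval.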

If there is another solution to my problem, I'd be interested to hear it.


Actually, I started to write a patch to support an --output-filter=CMD
parameter, but I'm afraid I have no time to finish it. I guess the right
place to insert this filter is between the 'fd_read_body' and 'write_data'
functions.
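With such a parameter, the whole job from my example could then be a single
invocation; this is only a sketch of the proposed (not yet existing) option,
combined with wget options that do exist today:

```shell
# Hypothetical usage: --output-filter is the proposed new option,
# everything else is standard wget.
wget --recursive \
     --post-data='login=someuser&passwd=mysecret' \
     --output-filter="sed \"s/href=\\\"javascript:go_to('\\([^']*\\)')\\\"/href=\\\"\\1\\\"/g\"" \
     http://www.example.com/
```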


-- 
 WBR, Sergey Martynoff
  "Webmaster Agency"
   http://www.wm.ru

