RE: Wget - relative links within a script call aren't followed
No way, sorry. wget does not support JavaScript, so there is no way to have it follow that kind of link.

Heiko

--
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED] [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax

-----Original Message-----
From: Raydeen A. Gallogly [mailto:[EMAIL PROTECTED]]
Sent: Friday, March 12, 2004 4:20 PM
To: [EMAIL PROTECTED]
Subject: Wget - relative links within a script call aren't followed

I'm new to Wget but have learned a lot in the last week. We are successfully running Wget to mirror a website that sits on the other side of a firewall within our own agency. We can retrieve all relative links from existing HTML files, with the exception of those that are contained within a script.

For example, this is an excerpt from a script call that loads an image within an HTML document and is not being followed:

MM_preloadImages('pix/lats_but_lite.gif',)

The only fix we have been able to implement so far is to have the webmaster of the site we want to mirror create a small HTML file named 'wgetfixes.html' and link to it from the home page using style (display:none;) so that users won't see it. Within that file, the webmaster lists every file called from within their scripts individually, using the following syntax:

<img src=pix/lat_but_lite.gif>

This works fine, but I'm hopeful that there is a better way using a switch within Wget.

Thanks for any input, it is truly appreciated.

- Raydeen

Raydeen Gallogly
Web Manager
NYS Department of Health, Wadsworth Center
http://www.wadsworth.org
email: [EMAIL PROTECTED]
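The 'wgetfixes.html' workaround described in the message above could be generated rather than typed by hand. What follows is only a minimal sketch, not something from the thread: the function name `build_wgetfixes` and the idea of scripting the file are assumptions, and the list of paths still has to be supplied by whoever knows what the site's scripts load.

```python
from html import escape


def build_wgetfixes(paths):
    """Return a minimal HTML page with one <img> tag per path.

    `paths` is a hand-maintained list of relative image paths that the
    site's scripts load. This helper is hypothetical, not part of Wget.
    """
    tags = "\n".join(
        '<img src="{}">'.format(escape(p, quote=True)) for p in paths
    )
    return "<html><body>\n" + tags + "\n</body></html>\n"


if __name__ == "__main__":
    # Path taken from the example in the original message.
    page = build_wgetfixes(["pix/lat_but_lite.gif"])
    with open("wgetfixes.html", "w") as fh:
        fh.write(page)
```

Because the page contains ordinary img tags, Wget's HTML parser can follow them like any other relative link once the page is linked from the home page.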
RE: Wget - relative links within a script call aren't followed
It surely would be nice if some day WGET could support JavaScript. Is that something to put on the wish list, or is it substantially impossible to implement? Do folks use JavaScript to load images in order to thwart 'bots such as WGET?

I run into the same problem regularly, and simply create a series of lines in a batch file that download each of the images by explicit filename. Very doable, but it requires manual setup rather than having WGET follow the links automatically. This will test for and download files that are known to be there, but it won't find files that are newly added.

Thanks,
Fred Holmes
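The batch-file approach of downloading each image by explicit filename can also be expressed as a URL list handed to Wget with its -i (--input-file) option. The sketch below is an illustration under assumptions: the base URL http://www.example.org/site/ and the function name `build_url_list` are placeholders, not values from the thread.

```python
from urllib.parse import urljoin


def build_url_list(base_url, relative_paths):
    """Resolve each known relative path against the site's base URL."""
    return [urljoin(base_url, path) for path in relative_paths]


if __name__ == "__main__":
    # Hypothetical base URL; the image path is the one quoted in the thread.
    urls = build_url_list(
        "http://www.example.org/site/",
        ["pix/lats_but_lite.gif"],
    )
    # One URL per line, which is the format `wget -i urls.txt` reads.
    with open("urls.txt", "w") as fh:
        fh.write("\n".join(urls) + "\n")
```

Running `wget -x -i urls.txt` afterwards fetches each listed file and recreates the directory structure. As noted in the message, this only covers files already known to exist; newly added images have to be added to the list by hand.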
RE: Wget - relative links within a script call aren't followed
This has been discussed several times in the past. For a complete solution a LOT of work would be needed (a complete JavaScript engine would be necessary for a start), and there are also several semantic problems: for example, if a picture is loaded only during mouseover, without a preload, we still would not get it, since there is no mouse.

Possibly some very partial, incomplete solution would be possible, but frankly that would be an ugly hack.

Heiko
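To make the "very partial, incomplete solution" concrete, here is a sketch of what such a hack might look like outside of Wget: a regular expression that pulls quoted string arguments out of MM_preloadImages(...) calls in an already-downloaded page. This is an assumption-laden illustration, not a Wget feature; the function name `extract_preload_paths` is hypothetical, and the pattern knows nothing about JavaScript beyond this one call shape.

```python
import re

# Matches the argument list of a literal MM_preloadImages(...) call.
# Assumes the arguments contain no ")" character.
CALL_RE = re.compile(r"MM_preloadImages\s*\(([^)]*)\)")
# Matches a single- or double-quoted string literal.
STRING_RE = re.compile(r"""(['"])(.*?)\1""")


def extract_preload_paths(html):
    """Return quoted string arguments found in MM_preloadImages calls."""
    paths = []
    for call in CALL_RE.finditer(html):
        for _quote, value in STRING_RE.findall(call.group(1)):
            if value not in paths:  # keep first occurrence, preserve order
                paths.append(value)
    return paths


if __name__ == "__main__":
    sample = "<body onLoad=\"MM_preloadImages('pix/lats_but_lite.gif',)\">"
    print(extract_preload_paths(sample))
```

The limits mentioned in the thread still apply: paths built at run time, images loaded only on mouseover, and any function other than the one hard-coded here are all missed, which is why this stays a hack rather than a substitute for a JavaScript engine.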