Deleting files locally that are no longer present on the remote?
Hi. I have just started testing wget on a Windows 2000 PC. I am using the mirror functionality, and it seems to work fine. Sometimes files on the remote FTP server are removed deliberately. Is it possible to have wget remove those files locally as well, so that the local copy is a true 100% mirror of the server? Best Regards / Venlig Hilsen Lars Myrthu Rasmussen -- Rohde Schwarz Technology Center A/S Tel.: +45 96 73 88 88 http://www.rohdeschwarz.dk Lars Myrthu Rasmussen SW Developer Tel.: +45 96 73 88 34 mailto:[EMAIL PROTECTED]
Windows Schedule tool for starting/stopping wget?
Hi. I am calling the wget program via a .bat file on a Windows 2000 PC, and it works fine. I need to schedule both the start and the stop of this job: wget must not start before the afternoon, and it must be stopped at a specified time even if it has not finished, because other sync jobs have to run. I have tried the simple Scheduled Tasks facility in Windows 2000, but it will not stop the wget process again at a specified time. Does anyone have a hint about what I should do or use? Best Regards / Venlig Hilsen Lars Myrthu Rasmussen
Re: wget with openssl problems
On Tue, Jun 24, 2003 at 02:41:50PM -0400, Jim Ennis wrote: Hello, I am trying to compile wget-1.8.2 on Solaris 9 with openssl-0.9.7b. Don't. Wget is seriously broken with the SSL extensions; see my messages from a month or two ago. (Not that anyone replied :P) Check out curl perhaps? http://curl.haxx.se tjc -- Turning and turning in the widening gyre The falcon cannot hear the falconer; Things fall apart, the centre cannot hold; Mere anarchy is loosed upon the world.
Re: Windows Schedule tool for starting/stopping wget?
No such facility currently exists in wget. This is a question of job control and is better directed at your operating system. On Thu, 3 Jul 2003 [EMAIL PROTECTED] wrote: Hi. I am calling the wget program via a .bat file on a Windows 2000 PC, and it works fine. I need to schedule both the start and the stop of this job: wget must not start before the afternoon, and it must be stopped at a specified time even if it has not finished, because other sync jobs have to run. I have tried the simple Scheduled Tasks facility in Windows 2000, but it will not stop the wget process again at a specified time. Does anyone have a hint about what I should do or use? Best Regards / Venlig Hilsen Lars Myrthu Rasmussen
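A common workaround (not a wget feature) is to schedule a second job that kills the wget process at the stop time — on Windows 2000, for example, with kill.exe from the Resource Kit, and on newer Windows with taskkill. The POSIX shell sketch below shows the same start-then-kill-at-deadline pattern; sleep 60 stands in for the long-running wget job, and the short sleep stands in for waiting until the scheduled stop time.

```shell
#!/bin/sh
# Sketch of the "second scheduled job kills the first" pattern.
# sleep 60 stands in for the long-running wget job; the 2-second
# sleep stands in for waiting until the scheduled stop time.
sleep 60 &
JOB_PID=$!

sleep 2
if kill "$JOB_PID" 2>/dev/null; then
    echo "job stopped at deadline"
else
    echo "job had already finished"
fi
wait "$JOB_PID" 2>/dev/null || true
```

On Windows the two halves would be two Scheduled Tasks: one that starts the wget .bat file at the start time, and one that runs the process killer at the stop time.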
Re: Deleting files locally that are no longer present on the remote?
The feature to locally delete mirrored files that were not downloaded from the server during the most recent wget --mirror run has been requested before. On Thu, 3 Jul 2003 [EMAIL PROTECTED] wrote: Hi. I have just started testing wget on a Windows 2000 PC. I am using the mirror functionality, and it seems to work fine. Sometimes files on the remote FTP server are removed deliberately. Is it possible to have wget remove those files locally as well, so that the local copy is a true 100% mirror of the server? Best Regards / Venlig Hilsen Lars Myrthu Rasmussen
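Until such a feature exists, one workaround is a post-processing script: after a mirror run, delete any local file that does not appear in a list of the server's current files. The sketch below is illustrative only — mirror-demo and remote-files.txt are hypothetical names, and to stay self-contained the script first builds a toy mirror tree; in real use the directory would be wget's output and the list would come from the server's FTP listing.

```shell
#!/bin/sh
# Sketch only: delete local files no longer present in the remote listing.
# To keep the example self-contained it first creates a toy mirror tree;
# in real use MIRROR_DIR would be wget's output directory and
# remote-files.txt the server's current file list (both names hypothetical).
MIRROR_DIR=mirror-demo
mkdir -p "$MIRROR_DIR/sub"
echo keep  > "$MIRROR_DIR/a.txt"
echo stale > "$MIRROR_DIR/sub/b.txt"
printf 'a.txt\n' > remote-files.txt   # sub/b.txt was removed on the server

find "$MIRROR_DIR" -type f | while read -r path; do
    # Strip the mirror prefix so paths compare against the remote list.
    rel=${path#"$MIRROR_DIR"/}
    # grep -F -x: match the whole line literally, no regex surprises.
    if ! grep -Fxq "$rel" remote-files.txt; then
        echo "deleting stale file: $rel"
        rm -- "$path"
    fi
done
```

After the run, a.txt survives and sub/b.txt is gone, which is the "100% mirror" behavior asked about above.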
Re: wget problem
Rajesh wrote: Wget is not mirroring the web site properly. For example, it is not copying symbolic links from the main web server. The target directories do exist on the mirror server. wget can only mirror what can be seen from the web. Symbolic links will be treated as hard references (assuming that some web page points to them). If you cannot get there from http://www.sl.nsw.gov.au/ via your browser, wget won't get the page. Also, some servers change their behavior depending on the client. You may need to use a user agent that looks like a browser to mirror some sites. For example: wget --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" will make it look like wget is really Internet Explorer running on Windows XP. (The quotes are needed because the string contains spaces.) Another problem is that some of the files are different on the mirror web server. For example, compare the two attached files: penrith1.cfm is the file after wget copied it from the main server, and penrith1.cfm.org is the actual file sitting on the main server. wget is storing what the web server returned, which may or may not be the precise file stored on your system. In particular, I notice that penrith1.cfm contains <!--Requested: 17:30:40 Thursday 3 July 2003 -->. That implies that all or part of the output is generated programmatically. You might try using wget to replicate an FTP version of the website. Then again, perhaps wget is the wrong tool for your task. Have you considered using secure copy (scp) instead? HTH, Tony
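One detail of the user-agent example above is worth spelling out: because the string contains spaces, the shell (or .bat interpreter) splits an unquoted version into several separate arguments, and wget never sees the full string. A small sketch — count_args is a hypothetical stand-in that just reports how many arguments it received, not part of wget (the unquoted line also omits the parentheses and semicolons, which the shell would otherwise treat as syntax):

```shell
#!/bin/sh
# Sketch: why the user-agent string must be quoted. count_args is an
# illustrative stand-in that reports how many arguments it received.
count_args() { echo "$#"; }

# Unquoted, the spaces split the string into four arguments:
UNQUOTED=$(count_args --user-agent=Mozilla/4.0 compatible MSIE 6.0)
# Quoted, wget would see one single --user-agent option:
QUOTED=$(count_args "--user-agent=Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)")
echo "unquoted splits into $UNQUOTED arguments; quoted stays as $QUOTED"
```

This is one common reason the trick "didn't work" for people who copied the command verbatim.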
Re: wget problem
Rajesh wrote: Thanks for your reply. I have tried using the command wget --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", but it didn't work. Adding the user agent helps some people -- I think most often with web servers from the evil empire. I have one more question. In each directory I have a welcome.cfm file on the main server (the DirectoryIndex order is welcome.cfm welcome.htm welcome.html index.html). But when I run wget on the mirror server, wget renames welcome.cfm to index.html and downloads it to the mirror server. Why does it change the file name from welcome.cfm to index.html? It appears to me that wget assumes that the result of getting a directory (such as http://www.sl.nsw.gov.au/collections/) is index.html. (See the debug output below.) How can I mirror a web site using scp? I can only copy one file at a time using scp. The following works for me: scp -r [EMAIL PROTECTED]:path/to/directory . ** The promised debug output:

wget http://www.sl.nsw.gov.au/collections --debug
DEBUG output created by Wget 1.8.1 on linux-gnu.
--20:16:36--  http://www.sl.nsw.gov.au/collections
           => `collections'
Resolving www.sl.nsw.gov.au... done.
Caching www.sl.nsw.gov.au => 192.231.59.40
Connecting to www.sl.nsw.gov.au[192.231.59.40]:80... connected.
Created socket 3.
Releasing 0x810dc38 (new refcount 1).
---request begin---
GET /collections HTTP/1.0
User-Agent: Wget/1.8.1
Host: www.sl.nsw.gov.au
Accept: */*
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response... HTTP/1.1 301 Moved Permanently
Date: Fri, 04 Jul 2003 03:16:36 GMT
Server: Apache/1.3.19 (Unix)
Location: http://www.sl.nsw.gov.au/collections/
Connection: close
Content-Type: text/html; charset=iso-8859-1
Location: http://www.sl.nsw.gov.au/collections/ [following]
Closing fd 3
--20:16:37--  http://www.sl.nsw.gov.au/collections/
           => `index.html'
Found www.sl.nsw.gov.au in host_name_addresses_map (0x810dc38)
Connecting to www.sl.nsw.gov.au[192.231.59.40]:80... connected.
Created socket 3.
Releasing 0x810dc38 (new refcount 1).
---request begin---
GET /collections/ HTTP/1.0
User-Agent: Wget/1.8.1
Host: www.sl.nsw.gov.au
Accept: */*
Connection: Keep-Alive
---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Fri, 04 Jul 2003 03:16:37 GMT
Server: Apache/1.3.19 (Unix)
Connection: close
Content-Type: text/html; charset=iso-8859-1
Length: unspecified [text/html]
[ <=> ] 21,284  20.83K/s
Closing fd 3
20:16:38 (20.83 KB/s) - `index.html' saved [21284]
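The renaming question can be read directly off that debug output: wget names the local file after the last path component of the final URL, and a directory URL ending in "/" (which the 301 redirect produces) has no file component, so wget falls back to index.html. It never sees the server-side name welcome.cfm at all. A sketch of that rule — local_name is a hypothetical helper for illustration, not actual wget code:

```shell
#!/bin/sh
# Sketch: how wget derives a local file name from a URL path. A trailing
# slash means there is no file component, so it falls back to index.html.
# local_name is illustrative only, not actual wget code.
local_name() {
    case $1 in
        */) echo "index.html" ;;
        *)  echo "${1##*/}" ;;
    esac
}

local_name /collections     # before the 301 redirect: "collections"
local_name /collections/    # after the redirect: "index.html"
```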