Deleting files locally that are no longer present on the remote?

2003-07-03 Thread Lars . Rasmussen
Hi

Just started to test wget on a Win2000 PC. I'm using the mirror functionality,
and it seems to work OK.
Sometimes files on the remote FTP server are removed deliberately. I wonder if
it's possible to have wget remove those files locally as well, so that the
local server holds a true 100% mirror?

Best Regards / Venlig Hilsen
Lars Myrthu Rasmussen

--
Rohde & Schwarz Technology Center A/S
Tel.: +45 96 73 88 88
http://www.rohdeschwarz.dk

Lars Myrthu Rasmussen
SW Developer
Tel.: +45 96 73 88 34
mailto:[EMAIL PROTECTED]








Windows Schedule tool for starting/stopping wget?

2003-07-03 Thread Lars . Rasmussen
Hi

I'm calling the wget program via a .bat file on a Win2000 PC. Works OK.
I have to schedule the start/stop of this, so I'm sure wget does not start
before the afternoon, and then stops at a specified time even if it's not
finished, because other sync jobs have to run.

I have tried the simple Scheduled Tasks facility in Win2000, but it will not
stop the wget process again at a specified time.

Anyone got a hint on what I have to do/use?

Best Regards / Venlig Hilsen
Lars Myrthu Rasmussen

--
Rohde & Schwarz Technology Center A/S
Tel.: +45 96 73 88 88
http://www.rohdeschwarz.dk

Lars Myrthu Rasmussen
SW Developer
Tel.: +45 96 73 88 34
mailto:[EMAIL PROTECTED]








Re: wget with openssl problems

2003-07-03 Thread Toby Corkindale
On Tue, Jun 24, 2003 at 02:41:50PM -0400, Jim Ennis wrote:
 Hello,
 
 I am trying to compile wget-1.8.2 on Solaris 9 with openssl-0.9.7b. The

Don't. Wget is seriously broken with the SSL extensions; see my messages a
month or two ago. (Not that anyone replied :P)

Check out curl perhaps?
http://curl.haxx.se
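
For grabbing a single file over HTTPS, something like this works (the URL is
hypothetical; -O saves under the remote file name):

    curl -O https://www.example.com/file.tar.gz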

tjc

-- 
Turning and turning in the widening gyre
The falcon cannot hear the falconer;
Things fall apart, the centre cannot hold;
Mere anarchy is loosed upon the world.


Re: Windows Schedule tool for starting/stopping wget?

2003-07-03 Thread Aaron S. Hawley
No such facility currently exists in wget. This is a question of job control
and is better directed at your operating system.
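
As a workaround (an untested sketch: kill.exe is assumed to come from the
Windows 2000 Resource Kit, and the paths are hypothetical), you could schedule
two jobs with the built-in "at" command, one to start wget and one to
terminate it at the cut-off time:

    rem start the mirror run at 13:00 (start-wget.bat is a hypothetical
    rem batch file that invokes wget with your usual options)
    at 13:00 c:\scripts\start-wget.bat

    rem forcibly end any running wget at 17:00; kill.exe is a Resource
    rem Kit utility that can match a process by name (an assumption)
    at 17:00 "kill -f wget"

Killing wget mid-transfer can leave a partial file behind; a later run with
--mirror (or -c/--continue for single files) should pick up where it left off.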

On Thu, 3 Jul 2003 [EMAIL PROTECTED] wrote:

 Hi

 I'm calling the wget program via a .bat file on a Win2000 PC. Works OK.
 I have to schedule the start/stop of this, so I'm sure wget does not start
 before the afternoon, and then stops at a specified time even if it's not
 finished, because other sync jobs have to run.

 I have tried the simple Scheduled Tasks facility in Win2000, but it will not
 stop the wget process again at a specified time.

 Anyone got a hint on what I have to do/use?

 Best Regards / Venlig Hilsen
 Lars Myrthu Rasmussen


Re: Deleting files locally that are no longer present on the remote?

2003-07-03 Thread Aaron S. Hawley
The feature to locally delete mirrored files that were not downloaded from
the server on the most recent wget --mirror run has been requested before.
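
Until such a feature exists, one crude workaround (a hedged sketch for a Unix
shell; the directory names are hypothetical, and it re-fetches the whole site,
which may be too expensive) is to mirror into a fresh directory and then
delete anything in the live copy that the fresh run did not produce:

    # mirror into an empty scratch directory
    wget --mirror -P /tmp/fresh ftp://ftp.example.com/pub/

    # remove local files that are no longer on the server
    cd /srv/mirror
    find . -type f | while IFS= read -r f; do
      [ -f "/tmp/fresh/ftp.example.com/pub/$f" ] || rm -- "$f"
    done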

On Thu, 3 Jul 2003 [EMAIL PROTECTED] wrote:

 Hi

 Just started to test wget on a Win2000 PC. I'm using the mirror functionality,
 and it seems to work OK.
 Sometimes files on the remote FTP server are removed deliberately. I wonder if
 it's possible to have wget remove those files locally as well, so that the
 local server holds a true 100% mirror?

 Best Regards / Venlig Hilsen
 Lars Myrthu Rasmussen


Re: wget problem

2003-07-03 Thread Tony Lewis
Rajesh wrote:

 Wget is not mirroring the web site properly. For example, it is not copying
 symbolic links from the main web server. The target directories do exist on
 the mirror server.

wget can only mirror what can be seen from the web. Symbolic links will be
treated as hard references (assuming that some web page points to them).

If you cannot get there from http://www.sl.nsw.gov.au/ via your browser,
wget won't get the page.

Also, some servers change their behavior depending on the client. You may
need to use a user agent that looks like a browser to mirror some sites. For
example:

wget --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" http://www.sl.nsw.gov.au/

will make it look like wget is really Internet Explorer running on Windows
XP.

 Another problem is that some of the files are different on the mirror web
 server. Sorry to bother you again. For example, compare these two attached
 files.

 penrith1.cfm is the file after wget copied it from the main server.
 penrith1.cfm.org is the actual file sitting on the main server.

wget is storing what the web server returned, which may or may not be the
precise file stored on your system.

In particular, I notice that penrith1.cfm contains <!--Requested: 17:30:40
Thursday 3 July 2003 -->. That implies that all or part of the output is
generated programmatically.

You might try using wget to replicate an FTP version of the website.
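
For example (the path and credentials are hypothetical, and this assumes the
server exposes the same document tree over FTP):

    wget --mirror ftp://username:password@www.sl.nsw.gov.au/docroot/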

Then again, perhaps wget is the wrong tool for your task. Have you
considered using secure copy (scp) instead?

HTH,

Tony



Re: wget problem

2003-07-03 Thread Tony Lewis
Rajesh wrote:

 Thanks for your reply. I have tried using the command
 wget --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", but
 it didn't work.

Adding the user agent helps some people -- I think most often with web
servers from the evil empire.

 I have one more question. In each directory I have a welcome.cfm file on the
 main server (DirectoryIndex order is welcome.cfm welcome.htm welcome.html
 index.html). But when I run wget on the mirror server, wget renames
 welcome.cfm to index.html and downloads it to the mirror server.

 Why does it change the file name from welcome.cfm to index.html?

It appears to me that wget assumes that the result of getting a directory
(such as http://www.sl.nsw.gov.au/collections/) is index.html. (See the
debug output below.)

 How can I mirror a web site using scp? I can only copy one file at a time
 using scp.

The following works for me: scp -r [EMAIL PROTECTED]:path/to/directory .


**
The promised debug output:

wget http://www.sl.nsw.gov.au/collections  --debug
DEBUG output created by Wget 1.8.1 on linux-gnu.

--20:16:36--  http://www.sl.nsw.gov.au/collections
           => `collections'
Resolving www.sl.nsw.gov.au... done.
Caching www.sl.nsw.gov.au => 192.231.59.40
Connecting to www.sl.nsw.gov.au[192.231.59.40]:80... connected.
Created socket 3.
Releasing 0x810dc38 (new refcount 1).
---request begin---
GET /collections HTTP/1.0
User-Agent: Wget/1.8.1
Host: www.sl.nsw.gov.au
Accept: */*
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... HTTP/1.1 301 Moved Permanently
Date: Fri, 04 Jul 2003 03:16:36 GMT
Server: Apache/1.3.19 (Unix)
Location: http://www.sl.nsw.gov.au/collections/
Connection: close
Content-Type: text/html; charset=iso-8859-1


Location: http://www.sl.nsw.gov.au/collections/ [following]
Closing fd 3
--20:16:37--  http://www.sl.nsw.gov.au/collections/
           => `index.html'
Found www.sl.nsw.gov.au in host_name_addresses_map (0x810dc38)
Connecting to www.sl.nsw.gov.au[192.231.59.40]:80... connected.
Created socket 3.
Releasing 0x810dc38 (new refcount 1).
---request begin---
GET /collections/ HTTP/1.0
User-Agent: Wget/1.8.1
Host: www.sl.nsw.gov.au
Accept: */*
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Fri, 04 Jul 2003 03:16:37 GMT
Server: Apache/1.3.19 (Unix)
Connection: close
Content-Type: text/html; charset=iso-8859-1


Length: unspecified [text/html]

    [ <=>                                  ] 21,284        20.83K/s

Closing fd 3
20:16:38 (20.83 KB/s) - `index.html' saved [21284]