Hi Tony,

Thanks for your reply. I tried the command wget --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", but it didn't work.
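The command quoted above has no URL or mirroring options, so for reference here is a minimal sketch of a fuller invocation. The start URL is the site named in the quoted reply; the --mirror and --no-parent flags and the DRY_RUN switch are assumptions added for illustration, not the options actually used in this thread.

```shell
#!/bin/sh
# Sketch only: the mirror flags and DRY_RUN are illustrative assumptions.
SITE_URL="http://www.sl.nsw.gov.au/"
USER_AGENT="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

mirror_site() {
    # --mirror turns on recursion with time-stamping;
    # --no-parent keeps wget at or below the starting directory.
    # When DRY_RUN is set, the command is printed instead of executed.
    ${DRY_RUN:+echo} wget --mirror --no-parent --user-agent="$USER_AGENT" "$1"
}

# Print the command for review; unset DRY_RUN to run it for real.
DRY_RUN=1
mirror_site "$SITE_URL"
```

If the server still answers differently with a browser-like user agent, comparing response headers for the same URL (for example with wget --server-response) is one way to see what it is keying on.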
I have one more question. Each directory on the main server has a welcome.cfm file (the DirectoryIndex order is welcome.cfm welcome.htm welcome.html index.html). But when I run wget on the mirror server, wget renames welcome.cfm to index.html and saves it under that name on the mirror server. Why does it change the file name from welcome.cfm to index.html?

How can I mirror a web site using scp? I can only copy one file at a time using scp.

Thanks,
Rajesh.

>From: "Tony Lewis" <[EMAIL PROTECTED]>
>To: "Rajesh" <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>, <[EMAIL PROTECTED]>
>Subject: Re: wget problem
>Date: Thu, 3 Jul 2003 07:46:33 -0700
>
>Rajesh wrote:
>
>> Wget is not mirroring the web site properly. For eg it is not copying
>> symbolic links from the main web server. The target directories do exist
>> on the mirror server.
>
>wget can only mirror what can be seen from the web. Symbolic links will be
>treated as hard references (assuming that some web page points to them).
>
>If you cannot get there from http://www.sl.nsw.gov.au/ via your browser,
>wget won't get the page.
>
>Also, some servers change their behavior depending on the client. You may
>need to use a user agent that looks like a browser to mirror some sites. For
>example:
>
>wget --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
>
>will make it look like wget is really Internet Explorer running on Windows
>XP.
>
>> Another problem is some of the files are different on the mirror web
>> server.
>> her you again. For eg: compare these 2 attached files.....
>>
>> penrith1.cfm is the file after wget copied from the main server.
>> penrith1.cfm.org is the actual file sitting on the main server.
>
>wget is storing what the web server returned, which may or may not be the
>precise file stored on your system.
>
>In particular, I notice that penrith1.cfm contains "<!--Requested: 17:30:40
>Thursday 3 July 2003 -->". That implies that all or part of the output is
>generated programmatically.
>
>You might try using wget to replicate an FTP version of the website.
>
>Then again, perhaps wget is the wrong tool for your task. Have you
>considered using secure copy (scp) instead?
>
>HTH,
>
>Tony

Unix System Administrator
State Library of NSW
Macquarie Street
Sydney - 2000
Email: [EMAIL PROTECTED]
Ph: 02-92731711

====================================
This email and any attachments to it are privileged and confidential. If you are not the intended recipient, please notify the sender and delete it. The contents of this email are not given or endorsed by the State Library of New South Wales unless otherwise indicated by an authorised officer of the Library. Copyright law may also apply to the contents of this email.
====================================
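On the scp question: scp accepts -r to copy a whole directory tree in one command, so it is not limited to one file at a time. The sketch below assumes SSH access from the mirror to the main server. The login, host name, and both paths are hypothetical placeholders, and the rsync variant is an alternative that was not suggested in this thread.

```shell
#!/bin/sh
# Hypothetical placeholders: REMOTE, REMOTE_DOCROOT and LOCAL_PARENT.
REMOTE="webadmin@main.example.org"
REMOTE_DOCROOT="/var/www/html"
LOCAL_PARENT="/srv/mirror"

copy_tree() {
    # -r recurses into directories; -p preserves modification times and modes.
    # scp follows symbolic links, so a link arrives as a copy of its target.
    # The result is placed in $LOCAL_PARENT/html.
    ${DRY_RUN:+echo} scp -r -p "$REMOTE:$REMOTE_DOCROOT" "$LOCAL_PARENT/"
}

sync_tree() {
    # Alternative, not from the thread: rsync -a keeps symbolic links as links
    # and only transfers files that changed since the last run.
    ${DRY_RUN:+echo} rsync -a -e ssh "$REMOTE:$REMOTE_DOCROOT/" "$LOCAL_PARENT/html/"
}

# Print both commands for review; unset DRY_RUN to execute them.
DRY_RUN=1
copy_tree
sync_tree
```

Either approach copies the .cfm source files themselves rather than the pages the web server generates from them, which is the difference Tony points out for penrith1.cfm.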

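On the welcome.cfm question: when wget requests a URL that ends in a slash, the URL carries no file name, and the server does not reveal which DirectoryIndex file it used to answer. wget therefore saves the response under its default name, index.html. The sketch below only imitates that naming rule for illustration. It is not wget's own code, and the info/ paths are made-up examples.

```shell
#!/bin/sh
# Illustration only: imitates how a requested URL path maps to a saved file name.
# DEFAULT_PAGE stands in for wget's built-in default, index.html.
DEFAULT_PAGE="index.html"

saved_name_for() {
    case "$1" in
        */) printf '%s%s\n' "$1" "$DEFAULT_PAGE" ;;  # directory URL: no name to reuse
        *)  printf '%s\n' "$1" ;;                    # explicit file name is kept
    esac
}

saved_name_for "info/"             # prints info/index.html
saved_name_for "info/welcome.cfm"  # prints info/welcome.cfm
```

Later wget releases document a --default-page=NAME option that changes this default name. Whether a given installation supports it is an assumption worth checking with wget --help before relying on it.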