When something like this happens, you need more information. Sometimes -v
and the logfile are enough; sometimes a look at the HTML helps.
In this case, opening that page in a browser showed me links pointing to
another host, so after a bit of tweaking the following seems to work fine
(it is running as I write this):

wget -vkKrp -l0 -np -e "robots=off" -H -Dwww.nt.ntnu.no,www.chembio.ntnu.no http://www.nt.ntnu.no/~skoge/book/

(that is all one command line; watch the wrap)
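
In case the option cluster is cryptic, here is the same command once more
with each flag glossed (my reading of the wget manual; the domain list is
of course specific to this site):

# -v  = --verbose            chatty output; useful when diagnosing
# -k  = --convert-links      rewrite links in the saved pages to work locally
# -K  = --backup-converted   keep a .orig copy of each file before rewriting
# -r  = --recursive          follow links recursively
# -p  = --page-requisites    also fetch the images/CSS a page needs
# -l0 = --level=0            no limit on the recursion depth
# -np = --no-parent          never ascend above ~skoge/book/
# -e robots=off              ignore robots.txt and the robots meta tag
# -H  = --span-hosts         allow following links onto other hosts...
# -D  = --domains=...        ...but only onto the hosts listed
wget -vkKrp -l0 -np -e "robots=off" -H \
     -Dwww.nt.ntnu.no,www.chembio.ntnu.no \
     http://www.nt.ntnu.no/~skoge/book/

The -H/-D pair is the important bit here: without it, wget stays on
www.nt.ntnu.no and never follows the links that point at
www.chembio.ntnu.no.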

Heiko

-- 
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907472
-- ITALY

> -----Original Message-----
> From: Yun MO [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, January 28, 2003 2:28 AM
> To: Max Bowsher
> Cc: [EMAIL PROTECTED]
> Subject: Re: Recursive download Problem.
> 
> 
> Dear  Max Bowsher,
> 
> Thank you for the answer. Yes, it works for some sites. But I have
> now encountered another site where I could not get the contents of
> the links.
> 
> The site is
>    http://www.nt.ntnu.no/~skoge/book/
> and I tried
>    wget -r -np -e robots=off http://www.nt.ntnu.no/~skoge/book/
> and
>    wget -r -np http://www.nt.ntnu.no/~skoge/book/
> 
> Both of them failed. Reading the robots.txt there, I found that it
> contains a comment line, written as follows:
> 
> User-agent: *           # directed to all spiders, not just Scooter
> Disallow: /RCS
> Disallow: /cards
> Disallow: /doc
> Disallow: /fag
> Disallow: /fakultet
> Disallow: /foot.shtml
> Disallow: /head.html
> Disallow: /index.shtml
> Disallow: /indexe.shtml
> Disallow: /info
> Disallow: /inst
> Disallow: /ntnubilder
> Disallow: /robots.txt
> Disallow: /usage
> Disallow: /userlist.shtml
> Disallow: /users
> 
> Could you help me solve the problem?
> 
> Thank you in advance.
> 
> Mo Yun
> 
> 
> Max Bowsher wrote:
> > Yun MO wrote:
> > 
> >>Dear Madam/Sir,
> >>
> >>I could not get all the files with the "wget -r" command for the
> >>following address. Would you help me?
> >>Thank you in advance.
> >>
> >>M.Y.
> >>-----------------------
> >>
> >><meta NAME="robots" CONTENT="noindex,nofollow">
> > 
> > 
> > Wget is obeying the robots instruction.
> > 
> > wget -e robots=off ...
> > 
> > will override it.
> > 
> > Max.
> > 
> 
> 
> -- 
> Yun Mo, Ph.D.
> 
> Technology Development Center, Tokyo Electron Ltd.
> 650 Mitsuzawa, Hosaka-cho, Nirasaki-shi, Yamanashi 407-0192, Japan
> 
> Phone: +81-551-23-4303   Fax: +81-551-23-4454
> E-mail: [EMAIL PROTECTED] / [EMAIL PROTECTED]
> 
