Copy/move downloaded pages

2004-02-05 Thread Dr. Martin Vanselow
I want to move/copy pages I have downloaded (which were saved with absolute file paths) on my hard disk. Is it possible to specify a local path as the source instead of an internet address? Or do you know a tool that converts the absolute links to relative ones?

I am working with DOS/Windows.

Thanks in advance,

Martin Vanselow
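A possible approach, assuming the pages can simply be fetched again (the URL below is only a placeholder), is wget's own -k/--convert-links option, which rewrites the links in the documents it downloads so that they work for local viewing:

   # re-fetch recursively and convert links for local viewing
   wget --recursive --convert-links http://example.com/

Note that -k only rewrites pages wget itself downloads; it does not touch files that are already on disk.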




Re: Why no -nc with -N?

2004-02-05 Thread Jens Rösner
Hi Dan,

I must admit that I don't fully understand your question.

-nc
means "no clobber": files that already exist locally are not
downloaded again, regardless of their age or size.

-N
means that a file is downloaded only if it is newer than the local copy (or if the size differs).

So these two options are mutually exclusive.
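As a quick illustration of the difference (example.com is just a placeholder):

   wget -nc http://example.com/file.html   # skip the download if file.html already exists locally
   wget -N http://example.com/file.html    # download only if the remote copy is newer (or differs in size)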
I could imagine that you want something like
wget --no-clobber --keep-server-time URL
right?
If I understand the manual correctly, the server's date should normally
be kept for HTTP, at least if you simply specify
wget URL
I just tested this and it works for me.
(With -S and/or -s you can print the http headers, if you need to.)

However, I noticed that quite a few servers do not provide a
Last-Modified header.
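A quick way to check what a particular server sends (the URL is a placeholder) is to print the response headers and look for Last-Modified:

   # -S / --server-response prints the HTTP headers for each request
   wget -S http://example.com/page.html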

Did this answer your question?
Jens






 I'd love to have an option so that, when mirroring, it
 will back up only files that are replaced because they
 are newer on the source system (time-stamping).

 Is there a reason these can't be enabled together?




Read error at byte ...

2004-02-05 Thread Nick Hogg
Hi...

Stuck on a problem with wget.

I am using the options --ignore-length -o wget.log -R
jpg,jpeg,gif,mpeg,mpg,avi,au,ps,pdf,mp3,tmp,bmp,png,tiff,mov,wmv,qt,wav,ogg,rm,ram,doc,ppt,xls,zip,tar,gz,bz2,rar,arj,swf
--random-wait --recursive --no-parent --directory-prefix=domain
--timeout=10 --tries=4, but the site will not come down. It gets the
index page and nothing else. The log says "(18.87 KB/s) - Read error at
byte 19768 (Operation timed out). Giving up." but I thought that
--ignore-length would stop that?

Am using GNU Wget 1.8.2

I have searched the mailing list and archives. Is there something silly
that I am missing?

Cheers

Nick

PS Please CC me!




Re: Why no -nc with -N?

2004-02-05 Thread Dan LeGate
You're right, I wasn't very clear.

What I want to do is mirror a site, but keep
backups of any local files that get replaced because
newer versions are being downloaded.

Upon reading the documentation again, I think I
originally misunderstood the file.1, file.2 renaming
scheme.

I *thought* what happened was if a newer file exists
on the remote server, the local copy would first be
renamed to file.1 and the newer copy would be
downloaded in its place.  Instead, it looks like the
newer copy gets pulled down as file.1 and then file.2,
if I'm reading the following from the documentation
correctly:

   When running Wget without -N, -nc, or -r, downloading the same
   file in the same directory will result in the original copy of
   file being preserved and the second copy being named file.1.  If
   that file is downloaded yet again, the third copy will be named
   file.2, and so on.
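In other words, repeated plain downloads of the same page (hypothetical URL) behave like this:

   wget http://example.com/index.html   # saved as index.html
   wget http://example.com/index.html   # saved as index.html.1, original kept
   wget http://example.com/index.html   # saved as index.html.2, and so on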

I was hoping for this to be the local copy being
renamed and the newer file taking its place.  Am I
reading this right?

What I'm trying to do is FTP (using ftp:// instead of
http://) a set of directories and files down (hence
needing to do recursive, like -m), but keep a backup
of older versions locally.  Can this be done with
wget?

Thanks,

Dan



Re: Why no -nc with -N?

2004-02-05 Thread Hrvoje Niksic
Dan LeGate [EMAIL PROTECTED] writes:

 What I want to do is mirror a site, but keep backups of any
 local files that get replaced because newer versions are being
 downloaded.

You might want to try the undocumented option `--backups', which does
what you want, i.e. forces the use of numbered backups.

Hmm.  Why is that option undocumented?  It seems useful in this case,
and, as far as I can tell, it works correctly.

 Upon reading the documentation again, I think I originally
 misunderstood the file.1, file.2 renaming scheme.

 I *thought* what happened was if a newer file exists on the remote
 server, the local copy would first be renamed to file.1 and the
 newer copy would be downloaded in its place.

That's what happens when you use `--backups'.  Normally, the local
copy is overwritten by the new copy.
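A minimal sketch of how that might look for the FTP mirroring case above (host and path are placeholders; later wget releases document the option as --backups=N, the number of rotated backups to keep):

   # mirror recursively; before overwriting a file, rotate the old copy to .1, .2, .3
   wget -m --backups=3 ftp://example.com/some/dir/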

 Instead, it looks like the newer copy gets pulled down as file.1 and
 then file.2, if I'm reading the following from the documentation
 correctly:

Umm, no.  -N turns off the numbered backups, which means that the file
is either not downloaded or is overwritten, depending on whether the
local file is older than the remote file or not.

    When running Wget without -N, -nc, or -r, downloading the same
    file in the same directory will result in the original copy of
    file being preserved and the second copy being named file.1.  If
    that file is downloaded yet again, the third copy will be named
    file.2, and so on.

"Without -N" is the key phrase.

 I was hoping for this to be the local copy being renamed and the
 newer file taking its place.  Am I reading this right?

You're not, but `--backups' does seem to do what you want.  Let us
know if it works for you.



Re: Read error at byte ...

2004-02-05 Thread Hrvoje Niksic
Nick Hogg [EMAIL PROTECTED] writes:

 Hi...

 Stuck on a problem with wget.

 Am using --ignore-length -o wget.log  -R
 jpg,jpeg,gif,mpeg,mpg,avi,au,ps,pdf,mp3,tmp,bmp,png,tiff,mov,wmv,qt,wav,ogg,rm,ram,doc,ppt,xls,zip,tar,gz,bz2,rar,arj,swf
 --random-wait --recursive --no-parent --directory-prefix=domain
 --timeout=10 --tries=4 as the options but the site will not come
 down. It gets the index page and nothing else. The log says (18.87
 KB/s) - Read error at byte 19768 (Operation timed out).Giving up. but
 I thought that the --ignore-length would stop that??

Did it give up immediately, or after the fourth try?  You might want
to bump the number of tries to something large, or even infinite.
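For example (placeholder URL; --tries accepts 0 or inf for unlimited retries, and a larger --timeout gives a slow server more room):

   # retry indefinitely and allow 60 seconds per read before giving up on a connection
   wget --tries=inf --timeout=60 --recursive --no-parent http://example.com/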