Copy/move downloaded pages
I want to move or copy pages I have already downloaded (they contain absolute file paths) to another location on my hard disk. Is it possible to specify a local path as the source instead of an Internet address? Or do you know of a tool that converts the absolute links to relative ones? I am working with DOS/Windows. Thanks in advance, Martin Vanselow
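For illustration: as far as wget's documented behaviour goes, it cannot rewrite links in files that are already on disk, but if the site can be fetched again, the -k (--convert-links) option rewrites absolute links to relative ones once the download finishes. A minimal sketch, with a placeholder URL:

    # Re-fetch the site and let wget rewrite links for local viewing.
    # -r  recurse through the site
    # -k  convert links to relative form after the download completes
    # -p  also fetch images and style sheets needed to display the pages
    wget -r -k -p http://www.example.com/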
Re: Why no -nc with -N?
Hi Dan,

I must admit that I don't fully understand your question. -nc means no-clobber: files that already exist locally are not downloaded again, regardless of their age or size. -N means that only newer files are downloaded (or files whose size differs). So these two options are mutually exclusive.

I imagine that you want something like

    wget --no-clobber --keep-server-time URL

right? If I understand the manual correctly, the server's date should normally be kept for HTTP anyway, at least if you simply run wget URL. I just tested this and it works for me. (With -S and/or -s you can print the HTTP headers if you need to.) However, I noticed that quite a few servers do not provide a Last-Modified header.

Did this answer your question?
Jens

> I'd love to have an option so that, when mirroring, it will back up only
> files that are replaced because they are newer on the source system
> (time-stamping). Is there a reason these can't be enabled together?
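As a minimal sketch of the time-stamping behaviour Jens describes (the URL is a placeholder; -S simply prints the server's response headers so the Last-Modified value can be checked):

    # Download only if the remote file is newer than the local copy,
    # and show the HTTP headers, including Last-Modified if the server sends it.
    wget -N -S http://www.example.com/page.html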
Read error at byte ...
Hi,

I'm stuck on a problem with wget. I am using the options

    --ignore-length -o wget.log -R jpg,jpeg,gif,mpeg,mpg,avi,au,ps,pdf,mp3,tmp,bmp,png,tiff,mov,wmv,qt,wav,ogg,rm,ram,doc,ppt,xls,zip,tar,gz,bz2,rar,arj,swf --random-wait --recursive --no-parent --directory-prefix=domain --timeout=10 --tries=4

but the site will not come down: wget gets the index page and nothing else. The log says

    (18.87 KB/s) - Read error at byte 19768 (Operation timed out). Giving up.

but I thought that --ignore-length would prevent that. I am using GNU Wget 1.8.2 and have searched the mailing list and archives. Is there something silly that I am missing?

Cheers,
Nick

PS Please CC me!
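For comparison, a sketch of the same invocation with a longer timeout and more retries, which is usually the first thing to try against a server that drops connections mid-transfer (the URL and directory prefix are placeholders):

    # Same recursive download, but allow 60 s per read, up to 20 retries,
    # and a pause between retries with --waitretry.
    wget --recursive --no-parent --directory-prefix=domain \
         --timeout=60 --tries=20 --waitretry=10 \
         --random-wait http://www.example.com/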
Re: Why no -nc with -N?
You're right, I wasn't very clear. What I want to do is mirror a site, but keep backups of any local files that get replaced because newer versions are being downloaded.

On reading the documentation again, I think I originally misunderstood the file.1, file.2 renaming scheme. I *thought* that if a newer file existed on the remote server, the local copy would first be renamed to file.1 and the newer copy would be downloaded in its place. Instead, it looks like the newer copy gets pulled down as file.1 and then file.2, if I'm reading the following from the documentation correctly:

> When running Wget without -N, -nc, or -r, downloading the same file in the
> same directory will result in the original copy of file being preserved and
> the second copy being named file.1. If that file is downloaded yet again,
> the third copy will be named file.2, and so on.

I was hoping for the local copy to be renamed and the newer file to take its place. Am I reading this right?

What I'm trying to do is pull a set of directories and files down over FTP (using ftp:// instead of http://, hence needing something recursive like -m), but keep a backup of older versions locally. Can this be done with wget?

Thanks,
Dan
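For reference, a sketch of the plain recursive FTP mirror Dan describes, before any backup behaviour is added (the host and path are placeholders):

    # -m turns on the options suitable for mirroring: recursion,
    # time-stamping, infinite depth, and keeping FTP directory listings.
    wget -m ftp://ftp.example.com/pub/project/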
Re: Why no -nc with -N?
Dan LeGate [EMAIL PROTECTED] writes:

> What I'm wanting to do is mirror a site, but keep backups of any local
> files that get replaced because newer versions are being downloaded.

You might want to try the undocumented option `--backups', which does what you want, i.e. forces the use of numbered backups. Hmm, why is that option undocumented? It seems useful in this case and, as far as I can tell, it works correctly.

> Upon reading the documentation again, I think I originally misunderstood
> the file.1, file.2 renaming scheme. I *thought* what happened was if a
> newer file exists on the remote server, the local copy would first be
> renamed to file.1 and the newer copy would be downloaded in its place.

That's what happens when you use `--backups'. Normally, the local copy is overwritten by the new copy.

> Instead, it looks like the newer copy gets pulled down as file.1 and then
> file.2, if I'm reading the following from the documentation correctly:

Umm, no. -N turns off the numbered backups, which means that the file is either not downloaded or is overwritten, depending on whether the local file is older than the remote file or not.

> When running Wget without -N, -nc, or -r, downloading the same file in the
> same directory will result in the original copy of file being preserved and
> the second copy being named file.1. If that file is downloaded yet again,
> the third copy will be named file.2, and so on.

"Without -N" is the key phrase.

> I was hoping for this to be the local copy being renamed and the newer
> file taking its place. Am I reading this right?

You're not, but `--backups' does seem to do what you want. Let us know if it works for you.
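A minimal sketch of combining this with mirroring, as the reply suggests. Note that --backups was undocumented in the wget versions discussed here; recent releases document it as --backups=N, where N is the number of rotated copies to keep, so the exact syntax may differ in older builds. The FTP URL is a placeholder:

    # Mirror the tree; when a newer remote file replaces a local one,
    # keep the previous local copy as file.1 (rotating to .2, .3, ...).
    wget -m --backups=3 ftp://ftp.example.com/pub/project/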
Re: Read error at byte ...
Nick Hogg [EMAIL PROTECTED] writes:

> but the site will not come down: wget gets the index page and nothing else.
> The log says
>
>     (18.87 KB/s) - Read error at byte 19768 (Operation timed out). Giving up.
>
> but I thought that --ignore-length would prevent that.

Did it give up immediately, or after the fourth try? You might want to bump the number of tries to something large, or even infinite.
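A sketch of what bumping the number of tries might look like; current wget manuals say that 0 or inf means infinite retrying. The URL is a placeholder:

    # Retry indefinitely, allow 60 s per read, and pause between retries.
    wget --tries=inf --timeout=60 --waitretry=30 \
         --recursive --no-parent http://www.example.com/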