Re: for the wishlist
Quoting Dan Harkless ([EMAIL PROTECTED]):

> > the file's size). This feature would enable the writing of cool
> > scripts to do something like multi-threaded retrieval at file
> > level. [...]
>
> Hi, Alec. You're the second person within a few days to ask for such a
> feature. I've added it to the TODO list.

I would object in this case. While the ability to retrieve only part of a file might make sense, we should keep in mind that a tool like wget should behave decently with respect to the servers on the other side of the connection.

I have nothing against starting several wget jobs to download data from several sites. But hammering a server by making X connections, each of them retrieving 1/X of the file in question, doesn't look like a very good policy to me. Has anyone actually measured how much can be gained this way, compared to, say, persistent HTTP connections? I would say that several partial downloads only make sense when the HTTP server limits the bandwidth of a single connection ...

-- jan

---+
Jan Prikryl [EMAIL PROTECTED] icq 83242638 | vr|vis center for virtual reality
                                          | and visualisation http://www.vrvis.at
---+
Re: Wget and i18n
Philipp Thomas [EMAIL PROTECTED] writes:

> * Hrvoje Niksic ([EMAIL PROTECTED]) [20010305 19:30]:
> > If you leave LC_CTYPE at the default, "C" locale, gettext converts
> > eight-bit characters to question marks.
>
> What should it do? Characters above 127 are undefined in LC_CTYPE for
> the "C" locale. So IMHO the only safe thing to do is to print a
> question mark instead.

IMHO the only reasonable thing is to pass those characters through as-is, which is what it used to do, and which worked perfectly. I had hoped the times when the default action was to strip the eighth bit were behind us. Even the GNU coding standards recommend that all applications be 8-bit clean.
Re: Wget and i18n
Philipp Thomas [EMAIL PROTECTED] writes:

> Ooops, yes my fingers were a bit too fast :-) Here they are, both
> safe-ctype.h and safe-ctype.c.

They look good to me. The only thing I don't get is this check:

#ifdef isalpha
#error "safe-ctype.h and ctype.h may not be used simultaneously"
#else

Is the error statement actually true, or is it only a warning that tries to enforce consistency in the application? Also, won't this trigger an error if a system header file, say string.h, happens to include ctype.h? (I know system header files should not do that because it pollutes the namespace, but older systems sometimes do.)
Re: Wget and i18n
* Hrvoje Niksic ([EMAIL PROTECTED]) [20010306 10:35]:

> #ifdef isalpha
> #error "safe-ctype.h and ctype.h may not be used simultaneously"
> #else
>
> Is the error statement actually true, or is this only a warning that
> tries to enforce consistency of the application?

The error statement is true. Remember that ctype.h is locale-dependent whereas safe-ctype is not. So, for instance, isprint (ctype.h) and ISPRINT (safe-ctype) could well produce different results. And as the intention is to get rid of the locale dependency, you have to block the inclusion of ctype.h.

The caveat with using safe-ctype is that it won't work with multibyte encodings or wchars. So in the end every use of is... needs to be checked anyway.

> Also, won't this trigger an error if a system header file, say
> string.h, happens to include ctype.h?

Yes, it would trigger in that case. But safe-ctype was originally developed for GCC, and as gcc is also used on old systems (one of them the original BSD), I guess we would have heard if safe-ctype broke things.

Philipp

-- Penguins shall save the dinosaurs -- Handelsblatt about Linux on S/390
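For readers outside C, the idea safe-ctype implements (classification from fixed ASCII tables, never from the current locale) can be sketched in a few lines of Python. The ISALPHA/ISPRINT names mirror safe-ctype's naming style, but this is an illustration of the principle, not the actual implementation:

```python
# Sketch of the idea behind safe-ctype: character classification based on
# fixed ASCII tables rather than the current locale, so results never change
# when the locale does. Names mirror safe-ctype's style; illustrative only.

_ALPHA = set(range(ord('a'), ord('z') + 1)) | set(range(ord('A'), ord('Z') + 1))

def ISALPHA(ch):
    """Locale-independent: only ASCII letters qualify, whatever the locale."""
    return ord(ch) in _ALPHA

def ISPRINT(ch):
    """Locale-independent: printable ASCII only (space through tilde)."""
    return 0x20 <= ord(ch) <= 0x7e
```

Unlike ctype.h's isalpha()/isprint(), these never flip their answer for eight-bit characters when LC_CTYPE changes, which is exactly the consistency the #error is protecting.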
Re: Wget
I'm confused. I thought 1.5.3 *did* display the dots, but I could be wrong. Please send queries like this to the list ([EMAIL PROTECTED]), not to me personally.

--
Csaba Ráduly, Software Engineer, Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9  UK support: +44 1235 559933
:-( sorry for the top-posting )-:

[EMAIL PROTECTED] (Timo Maier) wrote on 06/03/01 10:58, Subject: Re: Wget:

> Hi! The newest wget is the 1.6 release and the 1.7 development
> version. I have GNU Wget 1.5.3, which doesn't display the dots; it
> looks like this:
>
> ---
> Connecting to www.telekom.de:80... connected!
> HTTP request sent, awaiting response... 206 Partial content
> Length: 4,509,742 (4,267,794 to go) [application/octet-stream]
> 3.05Mb (236.28kb) done at 5.19 KB/s. time: 0:09:16 (0:04:05 left)
> ---
>
> Is it possible to implement this in new versions, too?
>
> TAM
> --
> OS/2 Warp4, Ducati 750SS '92
> You still have the freedom to learn and say what you wanna say
> http://tam.belchenstuermer.de
Re: Wget
On Tue, Mar 06, 2001 at 11:28:04AM, [EMAIL PROTECTED] wrote:

> I'm confused. I thought 1.5.3 *did* display the dots, but I could be
> wrong.

It does here:

1600K -> .......... .......... .......... .......... .......... [ 95%]
1650K -> .......... .......... .......... .......... .......... [ 98%]
1700K -> .......... .......... ...                              [100%]

10:07:04 (7.08 KB/s) - `SB16AWE32AWE64v19rexxBETA.zip' saved [1764803/1764803]

> Please send queries like this to the list ([EMAIL PROTECTED]), not to
> me personally.

--
John
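The dot display quoted above groups downloaded kilobytes into dots and clusters. A rough sketch of how one such line could be rendered, with the format string modelled on the quoted output (the exact layout and defaults are guesses, not wget's actual code):

```python
def dot_line(line_start_kb, total_kb, kb_per_dot=1, dots_per_cluster=10,
             clusters_per_line=5):
    """Render one progress line like `1600K -> .......... [ 95%]`.

    One dot per kb_per_dot kilobytes downloaded within this line's range;
    the trailing percentage is cumulative. Illustrative sketch only.
    """
    kb_per_line = kb_per_dot * dots_per_cluster * clusters_per_line
    clusters, kb = [], line_start_kb
    for _ in range(clusters_per_line):
        # number of dots this cluster earns, clamped to [0, dots_per_cluster]
        n = max(0, min(dots_per_cluster, (total_kb - kb) // kb_per_dot))
        clusters.append('.' * n)
        kb += kb_per_dot * dots_per_cluster
    done_kb = min(total_kb, line_start_kb + kb_per_line)
    pct = 100 * done_kb // total_kb
    return f"{line_start_kb}K -> {' '.join(clusters)} [{pct:3d}%]"
```

With the file from the quote (1764803 bytes, i.e. 1723K), this reproduces the 95%, 98%, and 100% lines shown above.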
Re: Wget and i18n
* Hrvoje Niksic ([EMAIL PROTECTED]) [20010306 11:21]:

> It is true that old systems use GCC, but I wonder if anyone tests
> *new* GCCs on these old systems...

Yes, they do. The patches to make gcc build on the original BSD are only present in the current CVS GCC.

Philipp

-- Penguins shall save the dinosaurs -- Handelsblatt about Linux on S/390
Re: Wget and i18n
Philipp Thomas [EMAIL PROTECTED] writes:

> * Hrvoje Niksic ([EMAIL PROTECTED]) [20010306 11:21]:
> > It is true that old systems use GCC, but I wonder if anyone tests
> > *new* GCCs on these old systems...
>
> Yes, they do. The patches to make gcc build on the original BSD are
> only present in the current CVS GCC.

OK, then the #error stays. If no one objects, I'll modify Wget to use these files. Thanks.
css js
hi

I know my question was answered on this list some months ago, but I couldn't find it anymore. What is the current status of recursively downloading css and js files?

tia
i.t
--
http://it97.dyn.dhs.org
--
IrmundThum
fancy logs
hi!

This is a (crazy) idea, but it could be useful (more or less): make wget add lines like

start-time end-time size status url

to a `central' log after downloading each file. `status' can be used to determine whether the download was ok, timed out, the connection was closed, etc. The `central log' could be ~/.wget_log, for example. Perhaps not very comfortable when mirroring, but wget could still add a single line for each url being mirrored.

As I said, this feature is perhaps arguable, but I'd like to know your opinion on it... thanx!

P! Vladi.
--
Vladi Belperchinov-Shabanski [EMAIL PROTECTED] [EMAIL PROTECTED]
Personal home page at http://www.biscom.net/~cade
DataMax Ltd. http://www.datamax.bg
No tears to cry, no feelings left...
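The proposed format is simple enough to sketch. A hypothetical logger appending one line per finished download, taking the field layout and the ~/.wget_log default from the proposal above (the function name and epoch timestamps are assumptions for illustration):

```python
import os

def log_download(url, size, status, start, end, logfile=None):
    """Append one `start-time end-time size status url` line to a central log.

    Sketch of the proposal above, not a wget feature. ~/.wget_log is the
    suggested default; timestamps are seconds since the epoch so the log
    stays trivial to parse and sort.
    """
    if logfile is None:
        logfile = os.path.expanduser("~/.wget_log")
    line = f"{int(start)} {int(end)} {size} {status} {url}\n"
    with open(logfile, "a") as f:  # append-only: one line per download
        f.write(line)
```

Because each record is a single whitespace-delimited line, the log works with the usual shell tools (grep for failed downloads, awk for total bytes, etc.), which is most of the appeal of the idea.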
-c question
hi!

`wget -c file' starts to download the file from the beginning if the file is already completely downloaded... why?! I expect wget to do nothing in this case: I asked it to download the file to the end (i.e. to continue, -c), and if the file is already complete there is nothing to do. There is nothing about this case in the man page, so I'd like to know whether there is an explanation for this behaviour; otherwise I think it should be considered `a bug'.

thanx for the attention!

P! Vladi.
--
Vladi Belperchinov-Shabanski [EMAIL PROTECTED] [EMAIL PROTECTED]
Personal home page at http://www.biscom.net/~cade
DataMax Ltd. http://www.datamax.bg
No tears to cry, no feelings left...
Re: Wget and i18n
* Hrvoje Niksic ([EMAIL PROTECTED]) [20010306 14:09]:

> OK, then the #error stays. If no one objects, I'll modify Wget to use
> these files.

I have the patches ready and am about to test them. So if you wait a bit, you'll get patches ready to apply.

Philipp

-- Penguins shall save the dinosaurs -- Handelsblatt about Linux on S/390
retrieving images referenced from existing file
Hi,

I mirrored a website to my computer using:

wget -k -r -tinf -np URL

Later I noticed that some of the files were missing, so I decided to run:

wget -k -r -tinf -nc -np URL

which in my opinion should do the job of looking at all pages and retrieving image files as needed. However, this only works if wget did not have to move to a different directory, i.e. only images in the directory the starting page is in were downloaded. None of the images from subdirectories were. I am using wget 1.5.3 on SuSE Linux 7.1; the files are written to a FAT filesystem (this causes utime-change errors).

Sebastian
--
Sebastian Bossung - [EMAIL PROTECTED] http://www.bossung.org
# Anybody who doesn't cut his speed at the
# sight of a police car is probably parked.
Re: for the wishlist
Jan Prikryl [EMAIL PROTECTED] writes:

> I would object in this case. While the ability to retrieve only part
> of a file might make sense, we should keep in mind that a tool like
> wget should behave decently with respect to the servers on the other
> side of the connection.
>
> I have nothing against starting several wget jobs to download data
> from several sites. But hammering a server by making X connections,
> each of them retrieving 1/X of the file in question, doesn't look like
> a very good policy to me. Has anyone actually measured how much can be
> gained this way, compared to, say, persistent HTTP connections? I
> would say that several partial downloads only make sense when the HTTP
> server limits the bandwidth of a single connection ...

Oops. I should have explicitly stated what I was thinking when I said I'd throw that on the TODO list. I think being able to specify a range to download would be a feature with a lot of uses, including bandwidth _conservation_ (e.g. downloading only the last 100K of a 100MB file). It'd also be helpful for people with those lousy HTTP proxies that throw in "Transfer interrupted." strings that break the --continue feature.

I agree completely that using this feature to try to speed up a download is probably misguided in most cases, and can be unfriendly to servers in the cases where it does work. Therefore I'm not willing to implement the special-purpose --split option the first guy suggested. If people really want to do this sort of thing, it'll be up to them to write their own wrappers for the --range option.
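For anyone writing such a wrapper, the range arithmetic involved is straightforward. A sketch, with hypothetical helper names, of splitting a file into HTTP byte ranges (inclusive bounds, as the Range header requires):

```python
def byte_ranges(total_size, parts):
    """Split total_size bytes into `parts` contiguous (first, last) byte
    ranges with inclusive bounds, suitable for `Range: bytes=first-last`
    headers. Sketch for the wrapper scripts discussed above; not a wget
    feature."""
    base, extra = divmod(total_size, parts)
    ranges, start = [], 0
    for i in range(parts):
        # spread the remainder over the first `extra` parts
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

def range_header(first, last):
    """Format one range as an HTTP/1.1 Range request header line."""
    return f"Range: bytes={first}-{last}"
```

The same helper covers the bandwidth-conservation case: the last 100K of a file of size `n` is simply `range_header(n - 102400, n - 1)`, and a suffix range like `Range: bytes=-102400` would also do.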
---
Dan Harkless                 | To help prevent SPAM contamination,
GNU Wget co-maintainer       | please do not mention this email
http://sunsite.dk/wget/      | address in Usenet posts -- thank you.
Re: css js
Irmund Thum [EMAIL PROTECTED] writes:

> hi
> I know my question was answered on this list some months ago, but I
> couldn't find it anymore. What is the current status of recursively
> downloading css and js files?

It works in 1.6, which you can find in the usual mirrors of ftp://ftp.gnu.org/pub/gnu/wget/.

--- Dan Harkless | To help prevent SPAM contamination, GNU Wget co-maintainer | please do not mention this email http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
Re: -c question
Vladi Belperchinov-Shabanski [EMAIL PROTECTED] writes:

> `wget -c file' starts to download the file from the beginning if the
> file is already completely downloaded... why?! I expect wget to do
> nothing in this case: I asked it to download the file to the end
> (i.e. to continue, -c), and if the file is already complete there is
> nothing to do. There is nothing about this case in the man page, so
> I'd like to know whether there is an explanation for this behaviour;
> otherwise I think it should be considered `a bug'.

Yes, it's a known bug and is documented in the current CVS version of wget.texi. With luck, the fix may be as simple as changing a single comparison operator. It's just that no one's had a chance to look at it. Feel free to peruse the source and send us a patch.

--- Dan Harkless | To help prevent SPAM contamination, GNU Wget co-maintainer | please do not mention this email http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
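The decision -c has to make can be summarized as a three-way choice. A hypothetical sketch, not wget's code; note that a comparison operator that is off by one at the local == remote boundary produces exactly the reported re-download-from-scratch behaviour:

```python
def continue_action(local_size, remote_size):
    """What `wget -c` should do for an existing local file, as a sketch
    of the intended behaviour discussed above (not wget's actual code)."""
    if local_size == 0:
        return ("download", 0)         # nothing local yet: fetch from scratch
    if local_size < remote_size:
        return ("resume", local_size)  # partial file: continue from its end
    return ("skip", None)              # already complete: do nothing
```

A buggy `<=` in place of the `<` above (or the reverse, depending on how the real check is written) would send a fully downloaded file down the "download from byte 0" path, which matches the symptom in the report.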
Re: retrieving images referenced from existing file
Sebastian Bossung [EMAIL PROTECTED] writes:

> I mirrored a website to my computer using:
>
> wget -k -r -tinf -np URL
>
> Later I noticed that some of the files were missing, so I decided to
> run:
>
> wget -k -r -tinf -nc -np URL
>
> which in my opinion should do the job of looking at all pages and
> retrieving image files as needed. However, this only works if wget did
> not have to move to a different directory, i.e. only images in the
> directory the starting page is in were downloaded.

Get Wget 1.6 (see http://sunsite.dk/wget/). It has a new -p / --page-requisites option that downloads everything necessary to display a given page, regardless of what directories those images, stylesheets, etc. reside in.

--- Dan Harkless | To help prevent SPAM contamination, GNU Wget co-maintainer | please do not mention this email http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
Re: retrieving images referenced from existing file
Sebastian Bossung [EMAIL PROTECTED] writes:

> Hi Dan, will the -p option in wget 1.6 look at each .html file that is
> already on my hard disk to see if anything needed for it is missing?
> 1.5.3 does not seem to do this (I am only talking about images here,
> no css or the like).

It will, but this depends on -N being specified and timestamp support working properly, and you mentioned your FAT file system screws this up. Well, as long as it always thinks local files are newer than the server versions, rather than the other way around, you can get the prerequisites for previously-downloaded HTML files without having to re-download them.

--- Dan Harkless | To help prevent SPAM contamination, GNU Wget co-maintainer | please do not mention this email http://sunsite.dk/wget/ | address in Usenet posts -- thank you.
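A sketch of the timestamping decision in the spirit of -N (hypothetical, not wget's code). It makes the point above concrete: a filesystem that always reports local files as newer simply pushes the decision toward False for files already on disk, which is the harmless direction:

```python
def should_redownload(local_mtime, remote_mtime,
                      local_size=None, remote_size=None):
    """Decide whether to re-fetch a file, sketching the -N idea:
    re-download when the remote copy is newer, or when both sizes are
    known and disagree. Illustrative only, not wget's implementation."""
    if remote_mtime > local_mtime:
        return True   # server copy is newer
    if local_size is not None and remote_size is not None \
            and local_size != remote_size:
        return True   # same age or older, but sizes differ: refetch
    return False      # local copy looks up to date: keep it
```

With FAT's broken utimes making `local_mtime` artificially large, complete local files still return False here, so the page requisites run skips them instead of re-downloading everything.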
Re: retrieving images referenced from existing file
Hi again,

I am now using wget 1.6 with -p. It seems to work, although I won't be able to tell for sure before tomorrow morning :-). I also noticed that the server is usually pretty fast, but appears to "hang" from time to time (the web browser times out). This might have been a problem on the first run, but I didn't scroll back the whole console. Does wget maintain an error-log file?

Sebastian

On Tuesday 06 March 2001 21:06 Dan Harkless wrote:

> Get Wget 1.6 (see http://sunsite.dk/wget/). It has a new -p /
> --page-requisites option that downloads everything necessary to
> display a given page, regardless of what directories those images,
> stylesheets, etc. reside in.

--
Sebastian Bossung - [EMAIL PROTECTED] http://www.bossung.org
# Anybody who doesn't cut his speed at the
# sight of a police car is probably parked.