Re: wget -O writes empty file on failure
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

>>> what are you exactly suggesting to do? to keep the current behavior
>>> of -O allowing only single downloads?
>>
>> Huh?  -O doesn't currently allow only single downloads -- multiple
>> downloads are appended to the same output file.
>
> let me rephrase. are you suggesting to modify the current behavior
> of -O to allow only single downloads?

That would IMHO be too incompatible a change.  The documentation should
clarify what -O does and refer the user to another option if what they
really want is --save-to (or --save-file-name, or however we want to
name it).  This other option should do the renaming and possibly accept
only one download.
Re: wget -O writes empty file on failure
Hrvoje Niksic wrote:
> Mauro Tortonesi <[EMAIL PROTECTED]> writes:
>>> That seems to break the principle of least surprise as well.  If
>>> such an option is specified, maybe Wget should simply refuse to
>>> accept more than a single URL.
>>
>> what are you exactly suggesting to do? to keep the current behavior
>> of -O allowing only single downloads?
>
> Huh?  -O doesn't currently allow only single downloads -- multiple
> downloads are appended to the same output file.

let me rephrase. are you suggesting to modify the current behavior of
-O to allow only single downloads? if you think the current behavior is
fine, we should probably at least document that the -k, -N, -p options
won't work with -O, and exit printing an error message in case they are
used.

> I'm suggesting that we should include an option named --save-to
> (perhaps shortened to -s) that allows specifying an output file
> independent of what is in the URL.  In that case we might want to
> disallow multiple URLs, or multiple URLs could save to .1, .2, etc.

good idea.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi                           http://www.tortonesi.com
University of Ferrara - Dept. of Eng.     http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool   http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux             http://www.deepspace6.net
Ferrara Linux User Group                  http://www.ferrara.linux.it
Re: wget -O writes empty file on failure
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

>> That seems to break the principle of least surprise as well.  If such
>> an option is specified, maybe Wget should simply refuse to accept
>> more than a single URL.
>
> what are you exactly suggesting to do? to keep the current behavior
> of -O allowing only single downloads?

Huh?  -O doesn't currently allow only single downloads -- multiple
downloads are appended to the same output file.

I'm suggesting that we should include an option named --save-to
(perhaps shortened to -s) that allows specifying an output file
independent of what is in the URL.  In that case we might want to
disallow multiple URLs, or multiple URLs could save to .1, .2, etc.
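[Editor's note: the .1, .2 naming idea above can be sketched from the shell. This is only an illustration of which file names multiple URLs might produce; the --save-to option, the base name "page", and the example URLs are hypothetical, not implemented in Wget.]

```shell
# Sketch of the proposed numbered-suffix naming for multiple URLs
# given to a hypothetical "--save-to page" option: the first URL
# would save to "page", later ones to "page.1", "page.2", ...
base="page"
i=0
for url in http://example.org/a http://example.org/b http://example.org/c; do
  if [ "$i" -eq 0 ]; then
    out="$base"          # first download keeps the requested name
  else
    out="$base.$i"       # later downloads get a numeric suffix
  fi
  echo "$url -> $out"
  i=$((i + 1))
done
```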
Re: wget -O writes empty file on failure
Hrvoje Niksic wrote:
> Mauro Tortonesi <[EMAIL PROTECTED]> writes:
>
> The semantics of -O are well-defined, but they're not what people
> expect.  In other words, -O breaks the "principle of least surprise".

it does indeed.

>> in this case, the redirection command would simply write all the
>> downloaded data to the output without performing any
>> transformation. on the other hand, the output filename command could
>> perform more complex operations,
>
> That seems to break the principle of least surprise as well.  If such
> an option is specified, maybe Wget should simply refuse to accept
> more than a single URL.

what are you exactly suggesting to do? to keep the current behavior of
-O allowing only single downloads?
Re: wget -O writes empty file on failure
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

> you might be actually right. the real problem here is that the
> semantics of -O are too generic and not well-defined.

The semantics of -O are well-defined, but they're not what people
expect.  In other words, -O breaks the "principle of least surprise".

> in this case, the redirection command would simply write all the
> downloaded data to the output without performing any
> transformation. on the other hand, the output filename command could
> perform more complex operations,

That seems to break the principle of least surprise as well.  If such
an option is specified, maybe Wget should simply refuse to accept more
than a single URL.
Re: wget -O writes empty file on failure
Hrvoje Niksic wrote:
> Mauro Tortonesi <[EMAIL PROTECTED]> writes:
>> the following patch (just committed into the trunk) should solve the
>> problem.
>
> I don't think that patch is such a good idea.  -O, as currently
> implemented, is simply a way to specify redirection.  You can think
> of it as analogous to "command > file" in the shell.  In that light,
> leaving empty files makes perfect sense (that's what the shell does
> with "nosuchcommand > foo").
>
> Most people, on the other hand, expect -O to simply change the
> destination file name of the current download (and fail to even
> consider what should happen when multiple URLs are submitted to
> Wget).  For them, the current behavior doesn't make sense.
>
> Until -O is changed to really just change the destination file name,
> I believe the current behavior should be retained.

you might be actually right. the real problem here is that the
semantics of -O are too generic and not well-defined. as you say, we
should split the redirection and output filename functions into two
different commands.

in this case, the redirection command would simply write all the
downloaded data to the output without performing any transformation. on
the other hand, the output filename command could perform more complex
operations, like saving downloaded resources in a temporary file,
parsing them for new URLs (maybe also providing a programming hook for
external parsers), writing the resources to their destination, or
archiving them in a well-defined format in case of multiple downloads.

what do you think?
Re: wget -O writes empty file on failure
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

> the following patch (just commited into the trunk) should solve the
> problem.

I don't think that patch is such a good idea.  -O, as currently
implemented, is simply a way to specify redirection.  You can think of
it as analogous to "command > file" in the shell.  In that light,
leaving empty files makes perfect sense (that's what the shell does
with "nosuchcommand > foo").

Most people, on the other hand, expect -O to simply change the
destination file name of the current download (and fail to even
consider what should happen when multiple URLs are submitted to Wget).
For them, the current behavior doesn't make sense.

Until -O is changed to really just change the destination file name, I
believe the current behavior should be retained.
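[Editor's note: the shell analogy above is easy to reproduce. The shell opens the redirection target before running the command, so a failing command still leaves an empty file behind; "nosuchcommand" and "foo" are the placeholder names from the message.]

```shell
# Reproduce "nosuchcommand > foo": the shell creates foo first,
# then fails to find the command, leaving an empty file behind.
rm -f foo
nosuchcommand > foo 2>/dev/null || true   # command fails (status 127)
ls -l foo                                 # foo exists with zero bytes
```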
Re: wget -O writes empty file on failure
Avis, Ed wrote:
>If you try to download a file with wget and the download fails, then
>normally wget writes no output.
>
>% wget --quiet http://www.gnu.org/nonexistent
>% ls -l nonexistent
>ls: nonexistent: No such file or directory
>
>This is good behaviour. It would be confusing to create the output file
>when nothing has been downloaded. But with the -O option, wget creates
>an output file whether successful or not:
>
>% wget -O out --quiet http://www.gnu.org/nonexistent
>% ls -l out
>-rw-rw-r-- 1 avised avised 0 2006-01-17 08:57 out
>
>I think that the behaviour with -O should be consistent with the
>behaviour without -O, that is, no output file should be created in the
>case of 404 errors or other total failures to download anything.
>
>
the following patch (just committed into the trunk) should solve the
problem. i am not so sure that checking that the amount of downloaded
bytes is zero is a safe condition for the removal of the output
document. a better solution would probably be to count the actual number
of successful downloads, to avoid the deletion of valid zero-sized
resources when -O is used.
Index: init.c
===================================================================
--- init.c	(revision 2104)
+++ init.c	(working copy)
@@ -1427,7 +1427,15 @@
   /* Free external resources, close files, etc.  */
 
   if (output_stream)
-    fclose (output_stream);
+    {
+      fclose (output_stream);
+      if (opt.output_document
+          && !(total_downloaded_bytes > 0))
+        {
+          unlink (opt.output_document);
+        }
+    }
+
 
   /* No need to check for error because Wget flushes its output (and
      checks for errors) after any data arrives.  */
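[Editor's note: the concern above about valid zero-sized resources can be seen directly in the shell. Here `: > out` stands in for a successful -O download of a genuinely empty resource; the file name "out" is arbitrary.]

```shell
# ": > out" simulates a successful -O download of a valid zero-length
# resource.  A size-based deletion check, like the patch's
# total_downloaded_bytes test, cannot tell it apart from a failure:
rm -f out
: > out                  # successful "download", zero bytes written
if [ ! -s out ]; then    # size == 0, so the heuristic fires...
  rm -f out              # ...and deletes a perfectly valid file
fi
test -e out || echo "valid empty file was deleted"
```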
Re: wget -O writes empty file on failure
When Wget fetches a URL to store into a file with a URL-derived name,
it can easily open the output file after it knows that the download
has begun.  With "-O", multiple URLs are possible, and so Wget opens
the file before any download is attempted.  Consider:

      wget -O fred http://www.gnu.org/ http://www.gnu.org/nonexistent

Here, one fetch works, and the other does not.  Is that successful or
not?

Wget could probably be changed to delay opening the "-O" file until a
download succeeds, or it could detect any output to a "-O" file, and
do a delete-on-close if nothing is ever written to it, but it'd
probably be simpler for the fellow who specifies "-O" to do the check
himself.

      man wc

   Steven M. Schweda               (+1) 651-699-9818
   382 South Warwick Street        [EMAIL PROTECTED]
   Saint Paul  MN  55105-2547
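[Editor's note: the "do the check himself" approach above can be sketched as a small wrapper. Since a real download needs network access, `false` stands in here for a failing `wget --quiet -O out URL`; with real wget the equivalent one-liner would be `wget --quiet -O out "$url" || rm -f out`.]

```shell
# Remove the -O output file ourselves when the download fails.
# "false > out" simulates a failed download: the shell creates "out"
# via the redirection, then the command exits nonzero.
rm -f out
if ! false > out; then   # download failed, but "out" already exists
  rm -f out              # so remove the empty leftover by hand
fi
test -e out && echo "out exists" || echo "out removed"
```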
