Re: wget -O writes empty file on failure

2006-01-27 Thread Hrvoje Niksic
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

>>>what are you exactly suggesting to do? to keep the current behavior
>>>of -O allowing only single downloads?
>> Huh?  -O doesn't currently allow only single downloads -- multiple
>> downloads are appended to the same output file.
>
> let me rephrase. are you suggesting to modify the current behavior of
> -O to allow only single downloads?

That would IMHO be too incompatible a change.  The documentation
should clarify what -O does and refer the user to another option if what
they really want is --save-to (or --save-file-name or however we want
to name it).  This other option should do the renaming and possibly
accept only one download.


Re: wget -O writes empty file on failure

2006-01-27 Thread Mauro Tortonesi

Hrvoje Niksic wrote:

> Mauro Tortonesi <[EMAIL PROTECTED]> writes:
>
>>> That seems to break the principle of least surprise as well.  If such
>>> an option is specified, maybe Wget should simply refuse to accept more
>>> than a single URL.
>>
>> what are you exactly suggesting to do? to keep the current behavior
>> of -O allowing only single downloads?
>
> Huh?  -O doesn't currently allow only single downloads -- multiple
> downloads are appended to the same output file.


let me rephrase. are you suggesting to modify the current behavior of
-O to allow only single downloads? if you think the current behavior
is fine, we should probably at least document that the -k, -N and -p
options won't work with -O, and exit with an error message if they
are used.



> I'm suggesting that we should include an option named --save-to
> (perhaps shortened to -s) that allows specifying an output file
> independent of what is in the URL.  In that case we might want to
> disallow multiple URLs, or multiple URLs could save to .1,
> .2, etc.


good idea.

--
Aequam memento rebus in arduis servare mentem...

Mauro Tortonesi                          http://www.tortonesi.com

University of Ferrara - Dept. of Eng.    http://www.ing.unife.it
GNU Wget - HTTP/FTP file retrieval tool  http://www.gnu.org/software/wget
Deep Space 6 - IPv6 for Linux            http://www.deepspace6.net
Ferrara Linux User Group                 http://www.ferrara.linux.it


Re: wget -O writes empty file on failure

2006-01-25 Thread Hrvoje Niksic
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

>> That seems to break the principle of least surprise as well.  If such
>> an option is specified, maybe Wget should simply refuse to accept more
>> than a single URL.
>
> what are you exactly suggesting to do? to keep the current behavior
> of -O allowing only single downloads?

Huh?  -O doesn't currently allow only single downloads -- multiple
downloads are appended to the same output file.

I'm suggesting that we should include an option named --save-to
(perhaps shortened to -s) that allows specifying an output file
independent of what is in the URL.  In that case we might want to
disallow multiple URLs, or multiple URLs could save to .1,
.2, etc.
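
To make the numbered-suffix idea concrete, here is a minimal sketch in C. It is not Wget code: the function name save_to_name is hypothetical, and it assumes that the first URL is saved under the bare name and later ones under NAME.1, NAME.2 and so on, which the proposal above does not spell out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical naming scheme for a --save-to style option (an
   assumption, not Wget behavior): URL number 0 is saved to BASE,
   later ones to BASE.1, BASE.2, ...
   Writes the name into BUF.  Returns 0 on success, or -1 if BUF is
   too small to hold the whole name.  */
int
save_to_name (char *buf, size_t bufsize, const char *base, unsigned index)
{
  int n;

  assert (buf != NULL && base != NULL);

  if (index == 0)
    n = snprintf (buf, bufsize, "%s", base);
  else
    n = snprintf (buf, bufsize, "%s.%u", base, index);

  /* snprintf returns the length it wanted to write; anything that
     does not fit, terminating NUL included, counts as failure.  */
  return (n >= 0 && (size_t) n < bufsize) ? 0 : -1;
}
```

Refusing more than one URL, the other alternative mentioned above, would need no naming scheme at all.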


Re: wget -O writes empty file on failure

2006-01-25 Thread Mauro Tortonesi

Hrvoje Niksic wrote:

> Mauro Tortonesi <[EMAIL PROTECTED]> writes:
>
> The semantics of -O are well-defined, but they're not what people
> expect. In other words, -O breaks the "principle of least surprise".

it does indeed.

>> in this case, the redirection command would simply write all the
>> downloaded data to the output without performing any
>> transformation. on the other hand, the output filename command could
>> perform more complex operations,
>
> That seems to break the principle of least surprise as well.  If such
> an option is specified, maybe Wget should simply refuse to accept more
> than a single URL.


what are you exactly suggesting to do? to keep the current behavior of 
-O allowing only single downloads?




Re: wget -O writes empty file on failure

2006-01-24 Thread Hrvoje Niksic
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

> you might be actually right. the real problem here is that the
> semantics of -O are too generic and not well-defined.

The semantics of -O are well-defined, but they're not what people
expect.  In other words, -O breaks the "principle of least surprise".

> in this case, the redirection command would simply write all the
> downloaded data to the output without performing any
> transformation. on the other hand, the output filename command could
> perform more complex operations,

That seems to break the principle of least surprise as well.  If such
an option is specified, maybe Wget should simply refuse to accept more
than a single URL.


Re: wget -O writes empty file on failure

2006-01-24 Thread Mauro Tortonesi

Hrvoje Niksic wrote:

> Mauro Tortonesi <[EMAIL PROTECTED]> writes:
>
>> the following patch (just committed into the trunk) should solve the
>> problem.
>
> I don't think that patch is such a good idea.
>
> -O, as currently implemented, is simply a way to specify redirection.
> You can think of it as analogous to "command > file" in the shell.  In
> that light, leaving empty files makes perfect sense (that's what shell
> does with "nosuchcommand > foo").
>
> Most people, on the other hand, expect -O to simply change the
> destination file name of the current download (and fail to even
> consider what should happen when multiple URLs are submitted to Wget).
> For them, the current behavior doesn't make sense.
>
> Until -O is changed to really just change the destination file name, I
> believe the current behavior should be retained.


you might actually be right. the real problem here is that the semantics
of -O are too generic and not well-defined. as you say, we should split
the redirection and output filename functions into two different commands.


in this case, the redirection command would simply write all the
downloaded data to the output without performing any transformation. on
the other hand, the output filename command could perform more complex
operations, like saving downloaded resources in a temporary file,
parsing them for new URLs (maybe also providing a programming hook for
external parsers) and writing the resources to their destination,
archiving them in a well-defined format in case of multiple downloads.


what do you think?



Re: wget -O writes empty file on failure

2006-01-24 Thread Hrvoje Niksic
Mauro Tortonesi <[EMAIL PROTECTED]> writes:

> the following patch (just committed into the trunk) should solve the
> problem.

I don't think that patch is such a good idea.

-O, as currently implemented, is simply a way to specify redirection.
You can think of it as analogous to "command > file" in the shell.  In
that light, leaving empty files makes perfect sense (that's what shell
does with "nosuchcommand > foo").

Most people, on the other hand, expect -O to simply change the
destination file name of the current download (and fail to even
consider what should happen when multiple URLs are submitted to Wget).
For them, the current behavior doesn't make sense.

Until -O is changed to really just change the destination file name, I
believe the current behavior should be retained.
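
The redirection analogy can be illustrated with a small C sketch. It does not reproduce Wget's code: fetch_that_fails and redirect_then_fetch are hypothetical names, and the "fetch" only models a retrieval that fails before any data arrives. The point is that the output file is created when it is opened, so it exists (empty) whether or not anything is ever written to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a retrieval that fails (for example with
   a 404) before producing any data.  */
static int
fetch_that_fails (FILE *out)
{
  (void) out;                   /* nothing is ever written */
  return -1;
}

/* Open PATH first and only then attempt the retrieval, the way a
   shell redirection opens its target before running the command.
   Returns the size of PATH afterwards, or -1 on error.  */
long
redirect_then_fetch (const char *path)
{
  FILE *out;
  long size = -1;

  assert (path != NULL);

  out = fopen (path, "w");      /* the file exists from this point on */
  if (out == NULL)
    return -1;

  fetch_that_fails (out);       /* failure leaves the file untouched */

  if (fseek (out, 0L, SEEK_END) == 0)
    size = ftell (out);
  fclose (out);
  return size;
}
```

Calling redirect_then_fetch ("out") leaves a zero-length file named "out" behind, which matches what the original report observed with -O.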


Re: wget -O writes empty file on failure

2006-01-24 Thread Mauro Tortonesi
Avis, Ed wrote:

>If you try to download a file with wget and the download fails, then
>normally wget writes no output.
>
>% wget --quiet http://www.gnu.org/nonexistent
>% ls -l nonexistent
>ls: nonexistent: No such file or directory
>
>This is good behaviour.  It would be confusing to create the output file
>when nothing has been downloaded.  But with the -O option, wget creates
>an output file whether successful or not:
>
>% wget -O out --quiet http://www.gnu.org/nonexistent
>% ls -l out
>-rw-rw-r--  1 avised avised 0 2006-01-17 08:57 out
>
>I think that the behaviour with -O should be consistent with the
>behaviour without -O, that is, no output file should be created in the
>case of 404 errors or other total failures to download anything.
>  
>

the following patch (just committed into the trunk) should solve the
problem. however, i am not sure that a zero count of downloaded bytes
is a safe condition for removing the output document. a better solution
would probably be to count the actual number of successful downloads,
so that valid zero-sized resources are not deleted when -O is used.


Index: init.c
===================================================================
--- init.c  (revision 2104)
+++ init.c  (working copy)
@@ -1427,7 +1427,15 @@
   /* Free external resources, close files, etc. */
 
   if (output_stream)
-    fclose (output_stream);
+    {
+      fclose (output_stream);
+      if (opt.output_document
+          && !(total_downloaded_bytes > 0))
+        {
+          unlink (opt.output_document);
+        }
+    }
+
   /* No need to check for error because Wget flushes its output (and
      checks for errors) after any data arrives.  */




Re: wget -O writes empty file on failure

2006-01-17 Thread Steven M. Schweda
   When Wget fetches a URL to store into a file with a URL-derived name,
it can easily open the output file after it knows that the download has
begun.

   With "-O", multiple URLs are possible, and so Wget opens the file
before any download is attempted.  Consider:

  wget -O fred http://www.gnu.org/ http://www.gnu.org/nonexistent

   Here, one fetch works, and the other does not.  Is that successful or
not?

   Wget could probably be changed to delay opening the "-O" file until a
download succeeds, or it could detect any output to a "-O" file, and do
a delete-on-close if nothing is ever written to it, but it'd probably be
simpler for the fellow who specifies "-O" to do the check himself.

  man wc
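
The delete-on-close variant mentioned above could be sketched in C roughly as follows. This is an illustration under assumptions, not Wget code: close_output_discard_if_empty is a hypothetical helper, and it assumes the output is a regular, seekable file (so not "-O -" or a pipe).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of Wget: close STREAM, which was
   opened for writing at PATH, and remove the file if it is empty.
   Returns 1 if the file was removed, 0 if it was kept, -1 on error.
   Assumes PATH names a regular, seekable file.  */
int
close_output_discard_if_empty (FILE *stream, const char *path)
{
  long size;

  assert (stream != NULL && path != NULL);

  /* Seek to the end to learn how much data the file holds.  */
  if (fseek (stream, 0L, SEEK_END) != 0)
    {
      fclose (stream);
      return -1;
    }
  size = ftell (stream);

  if (fclose (stream) != 0 || size < 0)
    return -1;

  if (size == 0)
    return remove (path) == 0 ? 1 : -1;
  return 0;
}
```

Note that a size check like this cannot tell a failed download from a successful download of an empty resource, and neither can a check done by hand with wc.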



   Steven M. Schweda   (+1) 651-699-9818
   382 South Warwick Street    [EMAIL PROTECTED]
   Saint Paul  MN  55105-2547