Re: [Bug-wget] [Bug-Wget] Use of multipart/form-data when using body-file command

2013-04-16 Thread Tim Ruehsen
On Tuesday, 16 April 2013, Daniel Stenberg wrote:
 On Sun, 14 Apr 2013, Tim Rühsen wrote:
  I wanted to propose that we use Content-Type: multipart/form-data and
  send the whole file as-is when using the --body-file option. This
  allows us to add the long missing functionality to send files as
  attachments through wget, without having to change the working of the
  old options.
  
  Why not look at curl (see --form) and decide whether it is optimal or
  whether there is a better way for users to specify what they want to
  upload? Then implement the best option syntax.
 
 I'm the main author of the -F logic for curl so I'm very biased here. But
 let me just provide some data. (And I don't think the --form syntax curl
 uses is the optimal interface; it is just one I made up some 10-12 years
 ago, and we've stayed with it to maintain compatibility - and it works
 pretty well.)

 multipart/form-data posts consist of one or more parts, and most HTML
 forms use more than one. To allow a tool to mimic a browser well, you need
 to be able to fill in the other parts as well as the file upload (and you
 can even nest the parts, for example to upload multiple files within a
 single part). Users also occasionally want to alter the headers for
 specific form parts. (RFC 1867 has all the details on the format.)

I have used curl's --form for multi-file uploads (plus several key/value
pairs) since 2005, and switched to the curl library in 2010. So I am aware of
you and curl ;-)
BTW, thanks for that awesome tool.

For some closed-source projects I wrote a MIME parsing/composing library
that is used for email and HTTP. I mention this only to make clear that I
know what we are talking about here.

From RFC 1867:
The media-type multipart/form-data follows the rules of all multipart
   MIME data streams as outlined in RFC 1521

 The boundary string Giuseppe mentioned isn't really such a big deal if you
 ask me. You can easily make it in the same style as the browsers do (a
 prefix of dashes followed by a series of random letters), and if, like
 curl, you use 12 random hex digits, that still makes 16^12 =
 281474976710656 possible combos.

Sorry, I don't see the point here.
Giuseppe talked about the *user* specifying the boundary (if I understood
that correctly).
Wget should take care of these details and automatically create a boundary
(similar to what you suggested above). But those are details.
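
As a rough illustration of automatic boundary creation, the following Python
sketch builds a boundary from a run of dashes plus 12 random hex digits. The
function name, the number of dashes, and the choice of Python are assumptions
made for illustration only; this is not code from Wget or curl. Twelve hex
digits give 16^12 = 281474976710656 distinct suffixes.

```python
import secrets

HEX_DIGITS = 12  # number of random hex digits in the suffix


def make_boundary(dashes: int = 24) -> str:
    """Return a boundary: a run of dashes followed by random hex digits.

    The dash count is an arbitrary choice for this sketch.
    """
    # token_hex(n) returns 2*n hex digits, so request half as many bytes.
    return "-" * dashes + secrets.token_hex(HEX_DIGITS // 2)


# Number of distinct random suffixes with 12 hex digits.
COMBINATIONS = 16 ** HEX_DIGITS

if __name__ == "__main__":
    print(make_boundary())
    print(COMBINATIONS)
```

Because the tool generates the string itself, the user never has to see or
choose it.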

The main task would be to provide a user interface to specify as many MIME
parts as the user wants in one upload. And it is worth looking at how curl
does it - as you say, it works pretty well.
This includes the possibility to specify --body-file as often as needed and
also to specify the Content-Type.
Base64 encoding already exists in Wget, boundary creation is simple, a
recursive MIME structure isn't needed so far, and MIME parsing isn't needed.
So Wget already has everything needed to compose multipart/form-data bodies.
We just have to throw it together.
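
To make the composition step more concrete, here is a minimal Python sketch
that assembles a multipart/form-data body from several key/value parts and
several file parts. The function name encode_multipart and its argument
layout are hypothetical, the sketch does not escape quotes in names or
filenames, and it is not a statement about how Wget would implement this.

```python
def encode_multipart(fields, files, boundary):
    """Build a multipart/form-data body.

    fields: list of (name, value) string pairs
    files:  list of (name, filename, content_type, data_bytes) tuples
    Returns (content_type_header_value, body_bytes).
    """
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields:
        lines += [
            delimiter,
            f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
            b"",  # blank line separates part headers from part content
            value.encode("utf-8"),
        ]
    for name, filename, content_type, data in files:
        lines += [
            delimiter,
            (f'Content-Disposition: form-data; name="{name}"; '
             f'filename="{filename}"').encode("utf-8"),
            f"Content-Type: {content_type}".encode("ascii"),
            b"",
            data,  # file bytes are included as-is
        ]
    # The closing delimiter carries two extra trailing dashes.
    lines += [delimiter + b"--", b""]
    body = b"\r\n".join(lines)
    return f"multipart/form-data; boundary={boundary}", body


if __name__ == "__main__":
    header, body = encode_multipart(
        [("user", "tim")],
        [("upload", "a.bin", "application/octet-stream", b"\x00\x01")],
        "XYZ123",
    )
    print(header)
    print(body)
```

Each repeated file option on the command line would correspond to one more
entry in the files list.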

Regards, Tim



Re: [Bug-wget] [Bug-Wget] Use of multipart/form-data when using body-file command

2013-04-16 Thread Daniel Stenberg

On Tue, 16 Apr 2013, Tim Ruehsen wrote:

The boundary string Giuseppe mentioned isn't really such a big deal if you
ask me. You can easily make it in the same style as the browsers do (a
prefix of dashes followed by a series of random letters), and if, like curl,
you use 12 random hex digits, that still makes 16^12 = 281474976710656
possible combos.


Sorry, I don't see the point here. Giuseppe talked about the *user*
specifying the boundary (if I understood that correctly). Wget should take
care of these details and automatically create a boundary (similar to what
you suggested above). But those are details.


Sorry for being unclear. I just meant to say that it is probably not terribly
important to let users set the boundary string. We never had that option in
(lib)curl, and the only people who ever asked for it did so because they
didn't fully grasp what it is and how it works...


And of course this is just my personal opinion.

--

 / daniel.haxx.se



Re: [Bug-wget] [Bug-Wget] Use of multipart/form-data when using body-file command

2013-04-15 Thread Daniel Stenberg

On Sun, 14 Apr 2013, Tim Rühsen wrote:

I wanted to propose that we use Content-Type: multipart/form-data and send 
the whole file as-is when using the --body-file option. This allows us to 
add the long missing functionality to send files as attachments through 
wget, without having to change the working of the old options.


Why not look at curl (see --form) and decide whether it is optimal or whether
there is a better way for users to specify what they want to upload? Then
implement the best option syntax.


I'm the main author of the -F logic for curl so I'm very biased here. But let
me just provide some data. (And I don't think the --form syntax curl uses is
the optimal interface; it is just one I made up some 10-12 years ago, and
we've stayed with it to maintain compatibility - and it works pretty well.)


multipart/form-data posts consist of one or more parts, and most HTML forms
use more than one. To allow a tool to mimic a browser well, you need to be
able to fill in the other parts as well as the file upload (and you can even
nest the parts, for example to upload multiple files within a single part).
Users also occasionally want to alter the headers for specific form parts.
(RFC 1867 has all the details on the format.)


The boundary string Giuseppe mentioned isn't really such a big deal if you ask
me. You can easily make it in the same style as the browsers do (a prefix of
dashes followed by a series of random letters), and if, like curl, you use 12
random hex digits, that still makes 16^12 = 281474976710656 possible combos.


--

 / daniel.haxx.se

Re: [Bug-wget] [Bug-Wget] Use of multipart/form-data when using body-file command

2013-04-14 Thread Giuseppe Scrivano
Darshit Shah dar...@gmail.com writes:

 this change will break backward compatibility, we need a new option for
 that and leave the default unchanged.

 I am not suggesting that we change the working of the --post-file and
 --post-data commands. Unlike the patch I just submitted, we could decouple
 the --post-file/--post-data methods from the --body-file/--body-data
 methods, so that the older --post-file/data methods work in exactly the
 same way and only the NEW --body-* methods change.

I don't think this is a good idea.  It will create confusion, I prefer
them to work exactly in the same way so one day we can drop the --post-*
commands.


 How do you expect the file to be in this case?  Will wget do any 
 filtering?

 The file should be read in binary mode and sent as-is. However, I am
 unsure if wget should do any filtering of filetypes.

I was thinking more of how to specify the boundary, for instance. Expecting
the file to be well formed doesn't seem nice to users.

At most, we could prevent the file from being an executable
binary. (ELF in Linux, Win32 EXE in Windows, etc.)

Why?


-- 
Giuseppe



Re: [Bug-wget] [Bug-Wget] Use of multipart/form-data when using body-file command

2013-04-14 Thread Darshit Shah

 I don't think this is a good idea.  It will create confusion, I prefer
 them to work exactly in the same way so one day we can drop the --post-*
 commands.

Okay, in that case, we could have a new option, something like
--body-attach, to signify an attached file.

 I was thinking more of how to specify the boundary, for instance. Expecting
 the file to be well formed doesn't seem nice to users.

I don't really follow this. Shouldn't wget simply forward the file
specified by the user? What kind of checks should wget implement?

 At most, we could prevent the file from being an executable
 binary. (ELF in Linux, Win32 EXE in Windows, etc.)

 Why?

Just a thought, to prevent misuse of Wget to send potentially malicious
code to servers.
Scratch that, if it doesn't sound good.
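
For reference, the kind of check floated here would amount to looking at the
first bytes of the file. The Python sketch below is an illustration only: the
function name is hypothetical, and the check is both easy to evade and prone
to false positives, since any file that happens to begin with "MZ" would
match.

```python
ELF_MAGIC = b"\x7fELF"   # first four bytes of an ELF binary
DOS_MZ_MAGIC = b"MZ"     # first two bytes of a DOS/Windows executable


def looks_like_executable(data: bytes) -> bool:
    """Return True if the data starts with an ELF or MZ signature."""
    return data.startswith(ELF_MAGIC) or data.startswith(DOS_MZ_MAGIC)


if __name__ == "__main__":
    print(looks_like_executable(b"\x7fELF\x02\x01\x01"))
    print(looks_like_executable(b"plain text upload"))
```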

-- 
Thanking You,
Darshit Shah
Research Lead, Code Innovation
Kill Code Phobia.
B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani


Re: [Bug-wget] [Bug-Wget] Use of multipart/form-data when using body-file command

2013-04-14 Thread Tim Rühsen
On Sunday, 14 April 2013, Darshit Shah wrote:
 Assuming that my previous patch adding --method, --body-file and
 --body-data options is accepted and merged into master,
 I wanted to propose that we use Content-Type: multipart/form-data and send
 the whole file as-is when using the --body-file option.
 This allows us to add the long missing functionality to send files as
 attachments through wget, without having to change the working of the old
 options.

Why not look at curl (see --form) and decide whether it is optimal or whether
there is a better way for users to specify what they want to upload? Then
implement the best option syntax.

 The only problem I currently see here is that there remains no way for a
 user to send body-data in a way that cannot be seen by another user who
 can run ps.

This is a different problem and could be solved by extending the -e option:
a leading special character could indicate that the following characters are
a filename. Wget would then parse that file for 'commands' (aka options).
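
A possible shape for that extension, sketched in Python under stated
assumptions: the marker character "@", the function name, and the rule of
skipping blank lines and "#" comments are all hypothetical choices, not
existing Wget behaviour. The point is only that the sensitive value lives in
the file, so it never appears in the process argument list that ps shows.

```python
def expand_execute_arg(arg, marker="@"):
    """Expand one -e style argument into a list of command strings.

    A plain argument is returned unchanged as a one-element list. An
    argument starting with the marker is treated as a filename, and each
    non-empty, non-comment line of that file becomes one command.
    """
    if not arg.startswith(marker):
        return [arg]
    commands = []
    with open(arg[len(marker):], encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if line and not line.startswith("#"):
                commands.append(line)
    return commands


if __name__ == "__main__":
    print(expand_execute_arg("name = value"))
```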

Regards, Tim