Re: [libmicrohttpd] How can I decode the form-data fields with MHD_http_unescape() function?

2016-02-16 Thread silvioprog
On Tue, Feb 16, 2016 at 9:29 AM, Evgeny Grin  wrote:

> There is a conflict between standards.
> HTML states that "+" must be decoded to space in url-encoding.
> https://www.w3.org/TR/html/forms.html#url-encoded-form-data
> RFC 3986 doesn't assume any special treatment of "+".
> https://tools.ietf.org/html/rfc3986


I found a draft that says "... and the plus sign may be used to represent
space characters.":

https://tools.ietf.org/html/draft-hoehrmann-urlencoded-01

But I don't know whether this draft was ever accepted as a standard.

However, I ran some tests with PHP and JS, which show that
MHD_http_unescape() is right:

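<?php
  $s = "Silvio Clécio";
  $enc = urlencode($s);       // form-style encoding: space becomes "+"
  $rawenc = rawurlencode($s); // RFC 3986 style: space becomes "%20"
  echo "enc: " . $enc . "\n";
  echo "rawenc: " . $rawenc . "\n";
  echo "\n";
  echo "urldecode(enc): " . urldecode($enc) . "\n";
  echo "urldecode(rawenc): " . urldecode($rawenc) . "\n";
  echo "\n";
  echo "rawurldecode(enc): " . rawurldecode($enc) . "\n";
  echo "rawurldecode(rawenc): " . rawurldecode($rawenc);
?>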

Result:

enc: Silvio+Cl%C3%A9cio
rawenc: Silvio%20Cl%C3%A9cio

urldecode(enc): Silvio Clécio
urldecode(rawenc): Silvio Clécio

rawurldecode(enc): Silvio+Clécio
rawurldecode(rawenc): Silvio Clécio

---

var s = "Silvio Clécio";
var enc = encodeURI(s);
var encCmp = encodeURIComponent(s);
console.log("enc: ", enc);
console.log("encCmp: ", encCmp);
console.log();
console.log("decodeURI(enc): ", decodeURI(enc));
console.log("decodeURI(encCmp): ", decodeURI(encCmp));
console.log();
console.log("decodeURIComponent(enc): ", decodeURIComponent(enc));
console.log("decodeURIComponent(encCmp): ", decodeURIComponent(encCmp));

Result:

enc:  Silvio%20Cl%C3%A9cio
encCmp:  Silvio%20Cl%C3%A9cio

decodeURI(enc):  Silvio Clécio
decodeURI(encCmp):  Silvio Clécio

decodeURIComponent(enc):  Silvio Clécio
decodeURIComponent(encCmp):  Silvio Clécio

And:

var s = "Silvio+Cl%C3%A9cio";
console.log("decodeURI(s): ", decodeURI(s));
console.log("decodeURIComponent(s): ", decodeURIComponent(s));

Result:

decodeURI(s):  Silvio+Clécio
decodeURIComponent(s):  Silvio+Clécio

So MHD_http_unescape() behaves like rawurldecode() and decodeURI()/
decodeURIComponent(). However, I still need to replace the plus characters,
because I don't know which format the end user will send. I am going to try
MHD_OPTION_UNESCAPE_CALLBACK ...
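
A minimal sketch of what such a callback could look like, assuming the
callback signature is size_t (*)(void *cls, struct MHD_Connection *c, char *s)
and that it must return the new length of s (please check this against
microhttpd.h). The names form_unescape and hex_value are hypothetical, and
the percent-decoding is done locally here only so that the sketch compiles
without the library; in a real program MHD_http_unescape() could be called
for that step instead.

```c
#include <stddef.h>

/* Opaque in this sketch; in a real program it comes from <microhttpd.h>. */
struct MHD_Connection;

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int hex_value(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Hypothetical callback: decodes s in place ("+" becomes a space, "%XX"
 * becomes one byte) and returns the length of the decoded string.
 * Malformed "%" sequences are copied through unchanged. */
size_t form_unescape(void *cls, struct MHD_Connection *conn, char *s)
{
  char *rd = s; /* read position */
  char *wr = s; /* write position, never ahead of rd */
  int hi;
  int lo;

  (void) cls;
  (void) conn;
  while ('\0' != *rd)
  {
    if ('+' == *rd)
    {
      *wr++ = ' ';
      rd++;
    }
    else if ('%' == *rd &&
             (hi = hex_value(rd[1])) >= 0 &&
             (lo = hex_value(rd[2])) >= 0)
    {
      *wr++ = (char) (hi * 16 + lo);
      rd += 3;
    }
    else
    {
      *wr++ = *rd++;
    }
  }
  *wr = '\0';
  return (size_t) (wr - s);
}

/* Registration, as far as I understand the option (an assumption):
 *
 *   MHD_start_daemon(flags, port, NULL, NULL, &handler, NULL,
 *                    MHD_OPTION_UNESCAPE_CALLBACK, &form_unescape, NULL,
 *                    MHD_OPTION_END);
 */
```

Because the "+" is handled while scanning the still-encoded input, a literal
plus sent as %2B stays a plus instead of turning into a space.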

Thanks for the answers, -CG and -EG! :-)

--
Silvio Clécio


Re: [libmicrohttpd] How can I decode the form-data fields with MHD_http_unescape() function?

2016-02-16 Thread Evgeny Grin

Maybe we should change the decoding algorithm.
According to the current HTML standard
https://www.w3.org/TR/html5/forms.html#url-encoded-form-data the "+"
symbol must be decoded as a space. That standard relies on the older
RFC 1738 for encoding/decoding: http://www.ietf.org/rfc/rfc1738.txt
But in the "path" part of a URI, a space must be encoded as %20.

-- 
Best Wishes,
Evgeny Grin
On 16.02.2016 12:26, Christian Grothoff wrote:
> For (2), as you said, MHD_http_unescape does not replace the "+".
> The reason is that there are some situations in the HTTP protocol
> where that is undesirable.  So yes, in your case, you need to
> yourself replace the "+" with space, and that should be the only
> further transformation required.
> 
> I'm not sure I understand question (1) -- what else would you pass?
> -- but the answer seems to be "yes".
> 
> On 02/16/2016 03:32 AM, silvioprog wrote:
>> I have the following form data:
>> 
>> Content-Type: multipart-form-data
>> id=10=Silvio+Cl%C3%A9cio
>> ... some uploads here ...
>> 
>> But when I decode it with MHD_http_unescape() function:
>> 
>> ..., char *key, ...
>> MHD_http_unescape(key);
>> 
>> I got:
>> 
>> Silvio+Clécio
>> 
>> But I need the full unescaped value, that is "Silvio Clécio"
>> (without quotes). So, I have two questions:
>> 
>> 1. is it correct to pass only the reference of my variable to
>> this function?
>> 2. is it correct to replace all "+" occurrences after using the
>> HTTP unescape function, or do I need to replace other chars?
>> 



Re: [libmicrohttpd] How can I decode the form-data fields with MHD_http_unescape() function?

2016-02-16 Thread Christian Grothoff
For (2), as you said, MHD_http_unescape does not replace the "+". The
reason is that there are some situations in the HTTP protocol where that
is undesirable.  So yes, in your case, you need to replace the "+" with a
space yourself, and that should be the only further transformation required.

I'm not sure I understand question (1) -- what else would you pass? --
but the answer seems to be "yes".
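
As a rough illustration of both points, here is a small sketch of an in-place
"+" replacement. The helper name plus_to_space is hypothetical. Like
MHD_http_unescape(), it works directly on the writable, NUL-terminated
buffer, so passing the char * itself is enough. One detail worth noting:
if the replacement runs after percent-decoding, a literal plus that was sent
as %2B would also become a space, so this sketch assumes the replacement is
done first.

```c
/* Hypothetical helper: turn every '+' into a space, in place.
 * Percent sequences such as "%2B" are left untouched. */
void plus_to_space(char *s)
{
  for (; '\0' != *s; s++)
  {
    if ('+' == *s)
      *s = ' ';
  }
}

/* Intended use inside a handler, where value is a writable buffer:
 *
 *   plus_to_space(value);      // "Silvio+Cl%C3%A9cio" -> "Silvio Cl%C3%A9cio"
 *   MHD_http_unescape(value);  // then percent-decode in place
 */
```

Both calls modify the buffer they are given, so a string literal or other
read-only memory must be copied first.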

Happy hacking!

Christian

On 02/16/2016 03:32 AM, silvioprog wrote:
> Hello,
> 
> I have following form-data:
> 
> Content-Type: multipart-form-data
> id=10=Silvio+Cl%C3%A9cio
> ... some uploads here ...
> 
> But when I decode it with MHD_http_unescape() function:
> 
> ..., char *key, ...
> MHD_http_unescape(key);
> 
> I got:
> 
> Silvio+Clécio
> 
> But I need the full unescaped value, that is "Silvio Clécio" (without
> quotes). So, I have two questions:
> 
> 1. is it correct to pass only the reference of my variable to this function?
> 2. is it correct to replace all "+" occurrences after using the HTTP
> unescape function, or do I need to replace other chars?
> 
> Thank you!
> 





[libmicrohttpd] How can I decode the form-data fields with MHD_http_unescape() function?

2016-02-15 Thread silvioprog
Hello,

I have the following form data:

Content-Type: multipart-form-data
id=10=Silvio+Cl%C3%A9cio
... some uploads here ...

But when I decode it with MHD_http_unescape() function:

..., char *key, ...
MHD_http_unescape(key);

I got:

Silvio+Clécio

But I need the full unescaped value, that is "Silvio Clécio" (without
quotes). So, I have two questions:

1. is it correct to pass only the reference of my variable to this function?
2. is it correct to replace all "+" occurrences after using the HTTP
unescape function, or do I need to replace other chars?

Thank you!

-- 
Silvio Clécio