Going back to my first post, I linked to this:
http://stackoverflow.com/questions/1005676/urls-and-plus-signs
Per the definition of application/x-www-form-urlencoded:
http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1
"Control names and values are escaped. Space characters are replaced
by `+', and then reserved characters are escaped as described in [RFC1738]
<http://www.w3.org/TR/html401/references.html#ref-RFC1738>,"
The whole +=space thing only applies to form-encoded data, which in a URL
means the query portion, not the path (the filename).
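
To make the distinction concrete, here is a minimal sketch using Python's standard urllib.parse. The path is the one from the REQUEST_URI in the debug log quoted below; the query string "q=a+b%2Bc" is made up purely for illustration, and none of this is radosgw code.

```python
from urllib.parse import parse_qs, unquote, unquote_plus, urlsplit

# Path from the REQUEST_URI in the debug log; the query string is a
# hypothetical addition so both halves of the URL can be compared.
url = (
    "http://example.com/ubuntu/pool/main/a/adduser/"
    "adduser_3.113+nmu3ubuntu3_all.deb?q=a+b%2Bc"
)

parts = urlsplit(url)

# Path: only %XX escapes are decoded; '+' stays a literal plus sign.
path = unquote(parts.path)

# Query: form-urlencoded rules apply, so '+' means space and %2B means '+'.
query = parse_qs(parts.query)

# What comes out if form decoding is applied to the path by mistake.
wrong_path = unquote_plus(parts.path)

print(path)
print(query)
print(wrong_path)
```

The last value has a space where the plus sign was, which matches the s->object value in the log.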
I've done some testing with nginx, and this is how it behaves:
On the server, somewhere in the webroot:
$ echo space > "test file"
Then, from a client:
$ wget --spider "http://example.com/test/test file"
Spider mode enabled. Check if remote file exists.
--2014-06-26 11:46:54-- http://example.com/test/test%20file
Connecting to example.com:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6 [application/octet-stream]
Remote file exists.
$ wget --spider "http://example.com/test/test+file"
Spider mode enabled. Check if remote file exists.
--2014-06-26 11:46:57-- http://example.com/test/test+file
Connecting to example.com:80... connected.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!
These tests were done against the standard filesystem only; I wasn't using
radosgw for this. Feel free to repeat with the web server of your
choice; you'll find the same thing happens.
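
For anyone without a web server handy, the same check can be sketched with Python's built-in http.server standing in for nginx. This assumes Python 3.7 or later; probe is a hypothetical helper name, and the server is a throwaway one bound to localhost, not the setup used in the test above.

```python
import functools
import http.client
import http.server
import tempfile
import threading
from pathlib import Path


def probe(names=("/test%20file", "/test+file")):
    """Serve a temp dir containing 'test file'; return the HTTP status per path."""
    with tempfile.TemporaryDirectory() as root:
        # Same fixture as above: a file whose name contains a space.
        Path(root, "test file").write_text("space\n")

        handler = functools.partial(
            http.server.SimpleHTTPRequestHandler, directory=root
        )
        # Port 0 lets the OS pick a free port.
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            results = {}
            for name in names:
                conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
                conn.request("HEAD", name)
                results[name] = conn.getresponse().status
                conn.close()
            return results
        finally:
            server.shutdown()
            server.server_close()


if __name__ == "__main__":
    for name, status in probe().items():
        print(name, status)
```

On a stock CPython this should report 200 for /test%20file and 404 for /test+file, since the handler percent-decodes the path but leaves '+' alone.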
Decoding + to a space in the path is not the correct behavior. Only %XX
escapes should be decoded there.
On 6/26/2014 11:36 AM, Sylvain Munaut wrote:
> Hi,
>> Based on the debug log, radosgw is definitely the software that's
>> incorrectly parsing the URL. For example:
>> 2014-06-25 17:30:37.383134 7f7c6cfa9700 20
>> REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
>> 2014-06-25 17:30:37.383199 7f7c6cfa9700 10
>> s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb
>> s->bucket=ubuntu
>> I'll dig into this some more, but it definitely looks like radosgw is the
>> one that's unencoding the + character here. How else would it be receiving
>> the request_uri with the + in it, but then a little bit later the request
>> has a space in it instead?
> Note that AFAIK, in fastcgi, REQUEST_URI is _supposed_ to be an URL
> encoded version and should be URL-decoded by the fastcgi handler. So
> converting the + to ' ' seems valid to me.
> Cheers,
> Sylvain