Based on the debug log, radosgw is definitely the software that's
incorrectly parsing the URL. For example:
2014-06-25 17:30:37.383134 7f7c6cfa9700 20 REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383199 7f7c6cfa9700 10 s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb s->bucket=ubuntu
I'll dig into this some more, but it definitely looks like radosgw is
the one that's decoding the + character here. How else would it
receive the REQUEST_URI with the + in it, and then a little later
show the object name with a space instead?
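To illustrate the difference, here is a minimal Python sketch (this is not
radosgw code, and the helper names are made up for the example). It compares
path-style percent-decoding, which leaves a literal + alone, with form-style
decoding, which turns + into a space. The form-style result matches the
s->object value in the log above.

```python
from urllib.parse import unquote, unquote_plus

# Path copied from the REQUEST_URI line in the debug log.
REQUEST_URI = "/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb"


def decode_as_path(raw: str) -> str:
    """Percent-decode only; a literal '+' stays a '+' (URL path rules)."""
    return unquote(raw)


def decode_as_form(raw: str) -> str:
    """Form-style decoding; '+' is treated as an encoded space."""
    return unquote_plus(raw)


if __name__ == "__main__":
    print("path-style:", decode_as_path(REQUEST_URI))
    print("form-style:", decode_as_form(REQUEST_URI))
```

With %2B in place of the +, both helpers return the name with a literal +,
which would be consistent with the escaped request succeeding.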
On 6/26/2014 2:59 AM, Yehuda Sadeh wrote:
The gateway itself supports these kinds of characters. Usually we see
this issue when there's something in front of the web server (like a
load balancer) that modifies the requests. Another possibility is that
the web server configuration is rewriting the requests. In this
case it seems that you're using nginx, which is outside of our usual
test environment, so it might be related.
Yehuda
On Jun 25, 2014 5:39 PM, "Brian Rak" <[email protected]> wrote:
Unfortunately, both the client and the actual files are outside of my control
here. In the case that I noticed, the client is the Ubuntu installer, and
the files are part of the Ubuntu archive contents.
On 6/25/2014 8:07 PM, Gerard Toonstra wrote:
The + is a reserved character in URLs, which means it may have a
specific meaning in a specific part of the URL, but not everywhere.
The early form-encoding convention (urlencode) replaced spaces with +
characters after the question mark and in form fields sent with POST;
it does not apply to the path part of the URL.
The best option is to avoid these characters in filenames or to
percent-encode the URL explicitly.
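For a client that can be changed, explicit percent-encoding might look like
the sketch below. It uses Python's standard urllib.parse.quote; the function
name encode_object_path is hypothetical and only for illustration.

```python
from urllib.parse import quote


def encode_object_path(path: str) -> str:
    """Percent-encode a URL path, keeping '/' as the segment separator.

    '+' is not in quote()'s always-safe set, so it is emitted as %2B.
    """
    return quote(path, safe="/")


if __name__ == "__main__":
    print(encode_object_path(
        "/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb"
    ))
```

Note that this should only be applied to a raw, not yet encoded path:
running it on an already encoded path would turn each % into %25.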
Rgds,
G
On Wed, Jun 25, 2014 at 8:41 PM, Brian Rak <[email protected]> wrote:
ceph version 0.80.1 (a38fe1169b6d2ac98b427334c12d7cf81f809b74)
I'll try to take a look through the bug tracker, but I didn't see anything
obvious at first glance.
On 6/25/2014 7:33 PM, Gregory Farnum wrote:
Unfortunately Yehuda, who could best handle this, is out for a while,
but it sounds familiar, so I think you probably want to search the list
archives and the bug tracker (http://tracker.ceph.com/projects/rgw).
What version precisely are you on?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Wed, Jun 25, 2014 at 2:58 PM, Brian Rak <[email protected]> wrote:
I'm trying to find an issue with RadosGW and special characters in
filenames. Specifically, it seems that filenames with a + in them are not
being handled correctly, and that I need to explicitly escape them.
For example:
---request begin---
HEAD /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Will fail with a 404 error, but
---request begin---
HEAD /ubuntu/pool/main/a/adduser/adduser_3.113%2Bnmu3ubuntu3_all.deb HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
will work properly.
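The two requests above could be replayed side by side with a small script
like the following. This is only a sketch: it assumes the gateway is
reachable over plain HTTP, the function names head_status and
compare_plus_handling are hypothetical, and the host in the usage line is a
placeholder. Python's http.client sends the path as given (and speaks
HTTP/1.1 rather than the HTTP/1.0 that wget used here).

```python
import http.client


def head_status(host: str, path: str, port: int = 80, timeout: float = 10.0) -> int:
    """Send a HEAD request for `path` exactly as given; return the status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def compare_plus_handling(host: str, path: str, port: int = 80) -> dict:
    """Request the same object with a literal '+' and with '%2B'."""
    return {
        "literal": head_status(host, path, port),
        "encoded": head_status(host, path.replace("+", "%2B"), port),
    }


if __name__ == "__main__":
    # "gateway.example.com" is a placeholder; substitute the real endpoint.
    print(compare_plus_handling(
        "gateway.example.com",
        "/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb",
    ))
```

If the behavior described here is reproduced, "literal" would come back as
404 and "encoded" as a success status.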
I enabled debug mode on radosgw, and see this:
2014-06-25 17:30:37.383029 7f7ca7fff700 20 RGWWQ:
2014-06-25 17:30:37.383040 7f7ca7fff700 20 req: 0x7f7ca000b180
2014-06-25 17:30:37.383053 7f7ca7fff700 10 allocated request req=0x7f7ca0015ef0
2014-06-25 17:30:37.383064 7f7c6cfa9700 20 dequeued request req=0x7f7ca000b180
2014-06-25 17:30:37.383070 7f7c6cfa9700 20 RGWWQ: empty
2014-06-25 17:30:37.383121 7f7c6cfa9700 20 CONTENT_LENGTH=
2014-06-25 17:30:37.383123 7f7c6cfa9700 20 CONTENT_TYPE=
2014-06-25 17:30:37.383124 7f7c6cfa9700 20 DOCUMENT_ROOT=/etc/nginx/html
2014-06-25 17:30:37.383125 7f7c6cfa9700 20 DOCUMENT_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383126 7f7c6cfa9700 20 FCGI_ROLE=RESPONDER
2014-06-25 17:30:37.383127 7f7c6cfa9700 20 GATEWAY_INTERFACE=CGI/1.1
2014-06-25 17:30:37.383128 7f7c6cfa9700 20 HTTP_ACCEPT=*/*
2014-06-25 17:30:37.383129 7f7c6cfa9700 20 HTTP_CONNECTION=Keep-Alive
2014-06-25 17:30:37.383129 7f7c6cfa9700 20 HTTP_HOST=xxx
2014-06-25 17:30:37.383130 7f7c6cfa9700 20 HTTP_USER_AGENT=Wget/1.12 (linux-gnu)
2014-06-25 17:30:37.383131 7f7c6cfa9700 20 QUERY_STRING=
2014-06-25 17:30:37.383131 7f7c6cfa9700 20 REDIRECT_STATUS=200
2014-06-25 17:30:37.383132 7f7c6cfa9700 20 REMOTE_ADDR=yyy
2014-06-25 17:30:37.383133 7f7c6cfa9700 20 REMOTE_PORT=43855
2014-06-25 17:30:37.383134 7f7c6cfa9700 20 REQUEST_METHOD=HEAD
2014-06-25 17:30:37.383134 7f7c6cfa9700 20 REQUEST_URI=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383135 7f7c6cfa9700 20 SCRIPT_NAME=/ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb
2014-06-25 17:30:37.383136 7f7c6cfa9700 20 SERVER_ADDR=yyy
2014-06-25 17:30:37.383136 7f7c6cfa9700 20 SERVER_NAME=xxx
2014-06-25 17:30:37.383137 7f7c6cfa9700 20 SERVER_PORT=80
2014-06-25 17:30:37.383138 7f7c6cfa9700 20 SERVER_PROTOCOL=HTTP/1.0
2014-06-25 17:30:37.383138 7f7c6cfa9700 20 SERVER_SOFTWARE=nginx/1.4.6
2014-06-25 17:30:37.383140 7f7c6cfa9700 1 ====== starting new request req=0x7f7ca000b180 =====
2014-06-25 17:30:37.383152 7f7c6cfa9700 2 req 1:0.000013::HEAD /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb::initializing
2014-06-25 17:30:37.383158 7f7c6cfa9700 10 host=xxxx rgw_dns_name=xxxx
2014-06-25 17:30:37.383199 7f7c6cfa9700 10 s->object=ubuntu/pool/main/a/adduser/adduser_3.113 nmu3ubuntu3_all.deb s->bucket=ubuntu
2014-06-25 17:30:37.383207 7f7c6cfa9700 2 req 1:0.000068:s3:HEAD /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb::getting op
2014-06-25 17:30:37.383211 7f7c6cfa9700 2 req 1:0.000072:s3:HEAD /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb:get_obj:authorizing
2014-06-25 17:30:37.383218 7f7c6cfa9700 2 req 1:0.000079:s3:HEAD /ubuntu/pool/main/a/adduser/adduser_3.113+nmu3ubuntu3_all.deb:get_obj:reading permissions
2014-06-25 17:30:37.383268 7f7c6cfa9700 20 get_obj_state: rctx=0x7f7c6cfa8640 obj=.rgw:ubuntu state=0x7f7c6800c0a8 s->prefetch_data=0
2014-06-25 17:30:37.383279 7f7c6cfa9700 10 cache get: name=.rgw+ubuntu : miss
It seems that Ceph is attempting to URL-decode the filename even when it
shouldn't (going by
http://stackoverflow.com/questions/1005676/urls-and-plus-signs). Is this a
bug, or is this the desired behavior?
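For anyone wanting to check their own debug log for the same symptom, a
rough Python sketch follows. It assumes the log contains REQUEST_URI= and
s->object=... s->bucket=... entries shaped like the ones above, and that the
object name mirrors the request path, as it does in this log. The function
name find_plus_rewrite and the regular expressions are illustrative, not
part of any Ceph tooling.

```python
import re

# Patterns modeled on the radosgw debug lines quoted in this thread.
REQUEST_URI_RE = re.compile(r"REQUEST_URI=(\S+)")
PARSED_OBJECT_RE = re.compile(r"s->object=(.*?)\s+s->bucket=(\S+)")


def find_plus_rewrite(log_text: str):
    """Compare the received REQUEST_URI with the object name the gateway parsed.

    Returns None if either entry is missing; otherwise a dict whose
    'plus_became_space' flag is True when the only difference is that each
    '+' in the URI shows up as a space in the object name.
    """
    uri_match = REQUEST_URI_RE.search(log_text)
    obj_match = PARSED_OBJECT_RE.search(log_text)
    if uri_match is None or obj_match is None:
        return None
    raw = uri_match.group(1).lstrip("/")
    parsed = obj_match.group(1)
    return {
        "request_uri": uri_match.group(1),
        "object": parsed,
        "bucket": obj_match.group(2),
        "plus_became_space": "+" in raw and parsed == raw.replace("+", " "),
    }
```

A True flag means the URI reached the gateway with the + intact and was
changed afterwards, which points away from a proxy or load balancer in
front of it.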
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com