Hi,
Thanks for the nginx config. We tried it and it works like a charm: no more
stalling downloads. I wonder why Apache does this, though.
Sebastian
On 04.12.2013, at 10:53, Yann ROBIN wrote:
> Hi,
>
> Our conf:
> server {
>     listen 80;
>     listen [::]:80;
>
>     server_name radosgw-prod;
>
>     client_max_body_size 1000m;
>     error_log /var/log/nginx/radosgw-prod-error.log;
>     access_log off;
>
>     location / {
>         fastcgi_pass_header Authorization;
>         fastcgi_pass_request_headers on;
>
>         if ($request_method = PUT ) {
>             rewrite ^ /PUT$request_uri;
>         }
>
>         include fastcgi_params;
>         client_max_body_size 0;
>
>         fastcgi_busy_buffers_size 512k;
>         fastcgi_buffer_size 512k;
>         fastcgi_buffers 16 512k;
>         fastcgi_read_timeout 2s;
>         fastcgi_send_timeout 1s;
>         fastcgi_connect_timeout 1s;
>
>         fastcgi_next_upstream error timeout http_500 http_503;
>         fastcgi_pass ceph-rgw;
>     }
>
>     location /PUT/ {
>         internal;
>         fastcgi_pass_header Authorization;
>         fastcgi_pass_request_headers on;
>
>         include fastcgi_params;
>         client_max_body_size 0;
>         fastcgi_param CONTENT_LENGTH $content_length;
>
>         fastcgi_busy_buffers_size 512k;
>         fastcgi_buffer_size 512k;
>         fastcgi_buffers 16 512k;
>
>         fastcgi_pass ceph-rgw;
>     }
> }
>
> Content-Length is only sent with PUT requests because there was an issue with
> an older version of the RADOS gateway.
>
> DON'T activate keep-alive: connections are not closed on the radosgw side when
> the keep-alive option is activated, leading to too many open connections on
> the rgw.
> We use this configuration with a TCP socket, not a local one.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Sebastian
> Sent: Wednesday, 4 December 2013 10:29
> To: ceph-users
> Subject: Re: [ceph-users] radosgw daemon stalls on download of some files
>
> Hi,
>
> We are currently using the patched fastcgi version
> (2.4.7-0910042141-6-gd4fffda). Updating to a more recent version is currently
> blocked by http://tracker.ceph.com/issues/6453
>
> Is there any documentation for running radosgw with nginx? I can only find some
> mailing list posts with config snippets.
>
> Sebastian
>
> On 30.11.2013, at 20:46, Andrew Woodward wrote:
>
>> Are you using the Inktank-patched FastCGI server?
>> http://gitbuilder.ceph.com
>>
>> Alternatively, try another script server like nginx, as already suggested.
>>
>> On Nov 29, 2013 12:23 PM, "German Anders" <[email protected]> wrote:
>> Thanks a lot Sebastian, I'm going to try that. I'm also having an issue
>> while trying to test an rbd creation. I've installed the ceph-client on the
>> deploy server:
>>
>> ceph@ceph-deploy01:/etc/ceph$ sudo rbd -n client.ceph-test -k
>> /home/ceph/ceph-cluster/ceph.client.admin.keyring create --size 10240
>> cephdata
>> 2013-11-29 15:20:25.683930 7fcd9979c780 0 librados:
>> client.ceph-openstack authentication error (1) Operation not permitted
>> rbd: couldn't connect to the cluster!
>>
>> Does anyone know what the issue could be here? Maybe it has something to do
>> with keys, or maybe not...
>>
>> Thanks in advance,
>>
>> Best regards,
>>
>> German Anders
>>
>>> --- Original message ---
>>> Subject: Re: [ceph-users] radosgw daemon stalls on download of some
>>> files
>>> From: Sebastian <[email protected]>
>>> To: ceph-users <[email protected]>
>>> Date: Friday, 29/11/2013 16:18
>>>
>>> Hi Yehuda,
>>>
>>>
>>>> It's interesting: the responses are received, but it seems that they
>>>> aren't being handled (hence the following pings). There are a few
>>>> things that you could look at. First, try to connect to the admin
>>>> socket and see if you get any useful information from there. This
>>>> could include in-flight requests; look for other requests that have
>>>> not completed. Also see if there's any indication of request throttling.
>>>
>>> Do you mean the methods mentioned here?
>>> http://ceph.com/docs/dumpling/radosgw/troubleshooting/
>>> Unfortunately the socket file is not present. Do I have to activate it in
>>> the config somehow? I could not find any reference to that in the docs. Is
>>> it already included in my radosgw version?
>>> radosgw -v
>>> ceph version 0.67.4 (ad85b8bfafea6232d64cb7ba76a8b6e8252fa0c7)
>>>
>>>> Another thing to look at would be the seemingly unrelated timeout
>>>> messages. These should not happen and might indicate that
>>>> something is holding you up that shouldn't be. Try searching for
>>>> the same thread id that is specified in these messages (omit the 0x
>>>> prefix), and see what the last thing it did was.
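
For reference, a minimal Python sketch of the search described above. It
assumes log lines shaped like the radosgw excerpts quoted in this thread,
where the id of the logging thread appears as its own whitespace-separated
hex token. The function names are made up for illustration and are not part
of any Ceph tooling.

```python
import re

# Matches the thread id inside a heartbeat timeout message, e.g.
# "... 'RGWProcess::m_tp thread 0x7f00934bb700' had timed out after 600"
TIMEOUT_RE = re.compile(r"thread 0x([0-9a-f]+)' had timed out")


def timed_out_threads(lines):
    """Return the thread ids (without the 0x prefix) named in timeout messages."""
    ids = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match and match.group(1) not in ids:
            ids.append(match.group(1))
    return ids


def last_line_logged_by(lines, thread_id):
    """Return the last line whose whitespace-separated tokens include thread_id.

    Assumption: the logging thread's id is a standalone token on each line,
    as in the excerpts in this thread. Returns None if nothing matches.
    """
    last = None
    for line in lines:
        if thread_id in line.split():
            last = line
    return last
```

To use it, read the radosgw log into a list of lines, call
timed_out_threads() on it, and pass each returned id to
last_line_logged_by() to see what that thread logged last.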
>>>
>>> I checked that:
>>> http://pastebin.com/Z23PWwjt
>>> I do not see anything unusual before the messages happen, but maybe you see
>>> something odd.
>>>
>>>
>>>> You could also try turning on 'debug objecter = 20' to see if it
>>>> provides more info (it's very verbose though).
>>>>
>>>
>>> Did that, but that is way too verbose for me ;) I uploaded it here:
>>> http://pastebin.com/VBPAVP6z
>>> There might be some requests mixed into it, but the one for
>>> cdn/52974400c6dd6ca719000004/source.avi is the one that stalled.
>>>
>>>> How much are you loading the gateway before that happens? We've seen
>>>> a similar issue in the past that was related to the fcgi library
>>>> that is dynamically linked with the radosgw process (that is, not
>>>> the apache mod_fastcgi module). This, however, would only happen
>>>> when there's heavy load and the fd numbers handled by the radosgw
>>>> surpassed 1024 (buggy library that was using select() instead of poll()).
>>>
>>> There are not that many requests on the storage, maybe 10-20 req/min. The
>>> cluster serves as a source for a CDN, so once the resource is fetched it
>>> should not be fetched again soon. I checked for the open files, and there
>>> are only about 10-20 open file handles for the radosgw process. So this
>>> probably is not the issue.
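
A rough Python sketch of that check, for anyone who wants to repeat it. For
the select() problem it is the highest descriptor number that matters, not
only the count. The sketch assumes a Linux-style /proc/<pid>/fd directory
and the usual glibc FD_SETSIZE of 1024; the function names are hypothetical.

```python
import os

# FD_SETSIZE is 1024 on Linux with glibc; select() cannot watch
# descriptors numbered at or above it.
FD_SETSIZE = 1024


def fd_stats(fd_dir):
    """Return (count, highest) for the numeric entries in fd_dir.

    fd_dir is expected to look like /proc/<pid>/fd on Linux, where every
    entry is named after an open descriptor number. highest is -1 when
    the directory has no numeric entries.
    """
    numbers = [int(name) for name in os.listdir(fd_dir) if name.isdigit()]
    return len(numbers), max(numbers, default=-1)


def select_limit_exceeded(fd_dir):
    """Return True if any descriptor number is too large for select()."""
    _, highest = fd_stats(fd_dir)
    return highest >= FD_SETSIZE
```

For example, fd_stats("/proc/<pid>/fd") with the pid of the radosgw process
substituted in, run as a user allowed to read that directory.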
>>>
>>> Sebastian
>>>
>>>
>>>>
>>>> Yehuda
>>>>
>>>> On Fri, Nov 29, 2013 at 7:28 AM, Sebastian <[email protected]> wrote:
>>>>> Hi,
>>>>>
>>>>> thanks for the hint. I tried this again and noticed that the timeout
>>>>> message does seem to be unrelated. Here is the log file for a stalling
>>>>> request with debug turned on:
>>>>> http://pastebin.com/DcQuc9wP
>>>>>
>>>>> I cannot really find a real "error" in the log. The download
>>>>> stalls at about 500kb at that point though. Restarting radosgw fixes it
>>>>> for 1 download only; the next one is broken again. But as I said, this
>>>>> does not happen for all files.
>>>>>
>>>>> Sebastian
>>>>>
>>>>> On 27.11.2013, at 21:53, Yehuda Sadeh wrote:
>>>>>
>>>>>> On Wed, Nov 27, 2013 at 4:46 AM, Sebastian <[email protected]> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> we have a setup of 4 Servers running ceph and radosgw. We use it as an
>>>>>>> internal S3 service for our files. The Servers run Debian Squeeze with
>>>>>>> Ceph 0.67.4.
>>>>>>>
>>>>>>> The cluster has been running smoothly for quite a while, but we are
>>>>>>> currently experiencing issues with the radosgw. For some files the HTTP
>>>>>>> Download just stalls at around 500kb.
>>>>>>>
>>>>>>> The Apache error log just says:
>>>>>>> [error] [client ] FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: idle timeout (30 sec)
>>>>>>> [error] [client ] Handler for fastcgi-script returned invalid result code 1
>>>>>>>
>>>>>>> radosgw logging:
>>>>>>> 7f00bc66a700 1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f00934bb700' had timed out after 600
>>>>>>> 7f00bc66a700 1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f00ab4eb700' had timed out after 600
>>>>>>>
>>>>>>> The interesting thing is that the cluster health is fine and only some
>>>>>>> files are not working properly. Most of them work just fine. A restart
>>>>>>> of radosgw fixes the issue. The other ceph logs are also clean.
>>>>>>>
>>>>>>> Any idea why this happens?
>>>>>>>
>>>>>>
>>>>>> No, but you can turn on 'debug ms = 1' in your gateway's ceph.conf,
>>>>>> and that might give a better indication.
>>>>>>
>>>>>> Yehuda
>>>>>
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> [email protected]
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com