Thanks. I opened issues #4409 and #4410: the first to improve the
logging on such an error, the second to make sure we don't die a
horrible death.
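
To illustrate the failure mode described in the quoted message below (a process hitting its limit on open file descriptors), here is a minimal Python sketch. It is not radosgw code: the function name `exhaust_fds` and the limit of 64 are hypothetical, chosen only for illustration. It lowers the soft limit, opens descriptors until the kernel refuses, and reports the error rather than aborting.

```python
import errno
import os
import resource


def exhaust_fds(soft_limit=64):
    """Lower the soft fd limit, open descriptors until the kernel refuses,
    and return (errno, descriptors_opened) instead of crashing.

    `soft_limit` is an illustrative value, not a recommended setting.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never try to raise the limit; only lower it for the demonstration.
    if old_soft != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, old_soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    held = []
    try:
        while True:
            # Each open consumes one descriptor, much like an accepted socket.
            held.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        # Handle the failure gracefully: report it and carry on.
        return exc.errno, len(held)
    finally:
        # Release everything and restore the original limit.
        for fd in held:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))


if __name__ == "__main__":
    err, opened = exhaust_fds()
    print(os.strerror(err), "after opening", opened, "descriptors")
```

On Linux the returned error is `errno.EMFILE` ("Too many open files"), which is the same condition a server sees when accepting a connection once its descriptor table is full.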

On Mon, Mar 11, 2013 at 7:24 AM, Yann ROBIN <[email protected]> wrote:
> The socket error was due to nginx opening too many connections, thus reaching
> the gateway's limit on open file descriptors.
>
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Yann ROBIN
> Sent: Monday, March 11, 2013 00:40
> To: Yehuda Sadeh
> Cc: [email protected]
> Subject: RE: Rados gateway 0.58 crash in RGWProcess::_clear
>
> The setup is multiple nginx instances accessing the FastCGI module over TCP
> sockets. I found a ticket about an issue with multiple gateways:
> http://tracker.ceph.com/issues/2804
> This may be related.
>
> We'll test with only one nginx to see if we still have the issue.
>
> Thanks,
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Yehuda Sadeh
> Sent: Sunday, March 10, 2013 23:58
> To: Yann ROBIN
> Cc: [email protected]
> Subject: Re: Rados gateway 0.58 crash in RGWProcess::_clear
>
> On Sun, Mar 10, 2013 at 3:48 PM, Yann ROBIN <[email protected]> wrote:
>> Hi,
>>
>> We recently set up a cluster using version 0.58. We ran massively parallel
>> uploads to the gateway and saw radosgw restart every 5 to 10 minutes.
>> Here is the debug log:
>> https://gist.github.com/kYann/5130775
>>
>> Short version:
>>     -2> 2013-03-10 23:20:02.916521 7fc1376ee700  1 ====== req done req=0x237fc10 http_status=200 ======
>>     -1> 2013-03-10 23:20:02.916546 7fc1376ee700  1 RGWProcess::m_tp worker finish
>>      0> 2013-03-10 23:20:02.931847 7fc1efa5b780 -1 rgw/rgw_main.cc: In function 'virtual void RGWProcess::RGWWQ::_clear()' thread 7fc1efa5b780 time 2013-03-10 23:20:02.922020
>> rgw/rgw_main.cc: 175: FAILED assert(process->m_req_queue.empty())
>>
>> ceph version 0.58 (ba3f91e7504867a52a83399d60917e3414e8c3e2)
>> 1: (RGWRESTMgr_Admin::~RGWRESTMgr_Admin()+0) [0x474910]
>> 2: (ThreadPool::stop(bool)+0x1ed) [0x4909bd]
>> 3: (RGWProcess::run()+0x3c7) [0x473367]
>> 4: (main()+0x8b6) [0x447276]
>> 5: (__libc_start_main()+0xed) [0x7fc1ec92276d]
>> 6: /usr/bin/radosgw() [0x448871]
>>
>>
>
> This obviously shouldn't happen. Just note that this code should only be 
> reached either when trying to bring the gateway down, or when there's some 
> error on the fastcgi socket. Which web server are you using? Which fastcgi 
> module? How did you set up fastcgi?
>
> Thanks,
> Yehuda
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html