Thanks. I opened issues #4409 and #4410. The first is to improve the logging on such an error; the second is to make sure we don't die a horrible death.
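
For anyone who hits the same symptom before those are fixed, here is a minimal sketch of how a process can check how close it is to its open-file limit, the condition reported in the quoted message below. This is not radosgw code: the helper names and the 0.9 threshold are illustrative assumptions, the /proc lookup is Linux-only, and it only inspects the process that runs it.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds():
    """Count open descriptors of the current process via /proc (Linux only).

    Returns None where /proc/self/fd is unavailable.
    """
    try:
        # listdir briefly opens one descriptor of its own, so the count
        # can be one higher than the steady-state value.
        return len(os.listdir("/proc/self/fd"))
    except OSError:
        return None


def near_limit(open_fds, soft_limit, threshold=0.9):
    """True when open_fds is at or above threshold * soft_limit.

    The 0.9 default is an arbitrary illustrative choice.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return open_fds >= threshold * soft_limit


if __name__ == "__main__":
    soft, hard = fd_limits()
    used = count_open_fds()
    print(f"open fds: {used}, soft limit: {soft}, hard limit: {hard}")
    if used is not None and near_limit(used, soft):
        print("warning: close to the open-file limit")
```

Run inside (or adapted to) the affected service, this shows whether the descriptor count is climbing toward the soft limit under load, which is what a front end that opens too many connections would produce.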

On Mon, Mar 11, 2013 at 7:24 AM, Yann ROBIN <[email protected]> wrote:
> The socket error was due to nginx opening too many connections, thus reaching
> the limit of open fds for the gateway.
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Yann ROBIN
> Sent: Monday, March 11, 2013 00:40
> To: Yehuda Sadeh
> Cc: [email protected]
> Subject: RE: Rados gateway 0.58 crash in RGWProcess::_clear
>
> The setup is multiple nginx instances accessing the fastcgi module over a
> TCP socket. I found a ticket about a multiple-gateways issue:
> http://tracker.ceph.com/issues/2804 this may be related.
>
> We'll test with only one nginx to see if we still have the issue.
>
> Thanks,
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Yehuda Sadeh
> Sent: Sunday, March 10, 2013 23:58
> To: Yann ROBIN
> Cc: [email protected]
> Subject: Re: Rados gateway 0.58 crash in RGWProcess::_clear
>
> On Sun, Mar 10, 2013 at 3:48 PM, Yann ROBIN <[email protected]> wrote:
>> Hi,
>>
>> We recently set up a cluster using version 0.58. We did massive parallel
>> uploads to the gateway and saw radosgw restart every 5 to 10 minutes.
>> Here is the debug log:
>> https://gist.github.com/kYann/5130775
>>
>> Small version:
>> -2> 2013-03-10 23:20:02.916521 7fc1376ee700 1 ====== req done req=0x237fc10 http_status=200 ======
>> -1> 2013-03-10 23:20:02.916546 7fc1376ee700 1 RGWProcess::m_tp worker finish
>> 0> 2013-03-10 23:20:02.931847 7fc1efa5b780 -1 rgw/rgw_main.cc: In function 'virtual void RGWProcess::RGWWQ::_clear()' thread 7fc1efa5b780 time 2013-03-10 23:20:02.922020
>> rgw/rgw_main.cc: 175: FAILED assert(process->m_req_queue.empty())
>>
>> ceph version 0.58 (ba3f91e7504867a52a83399d60917e3414e8c3e2)
>> 1: (RGWRESTMgr_Admin::~RGWRESTMgr_Admin()+0) [0x474910]
>> 2: (ThreadPool::stop(bool)+0x1ed) [0x4909bd]
>> 3: (RGWProcess::run()+0x3c7) [0x473367]
>> 4: (main()+0x8b6) [0x447276]
>> 5: (__libc_start_main()+0xed) [0x7fc1ec92276d]
>> 6: /usr/bin/radosgw() [0x448871]
>>
>
> This obviously shouldn't happen. Just note that this code should only be
> reached either when trying to bring the gateway down, or when there's some
> error on the fastcgi socket. Which web server are you using? Which fastcgi
> module? How did you set up fastcgi?
>
> Thanks,
> Yehuda
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the
> body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
