Also, looking at the code, I see an extra call to FCGX_Finish_r():
diff --git a/src/rgw/rgw_main.cc b/src/rgw/rgw_main.cc
index 9a8aa5f..0aa7ded 100644
--- a/src/rgw/rgw_main.cc
+++ b/src/rgw/rgw_main.cc
@@ -669,8 +669,6 @@ void RGWFCGXProcess::handle_request(RGWRequest *r)
     dout(20) << "process_request() returned " << ret << dendl;
   }
-  FCGX_Finish_r(fcgx);
-
   delete req;
 }
Maybe this is a problem on the specific libfcgi version that you're using?
----- Original Message -----
> From: "Yehuda Sadeh-Weinraub" <[email protected]>
> To: "GuangYang" <[email protected]>
> Cc: [email protected], [email protected]
> Sent: Wednesday, June 24, 2015 2:21:04 PM
> Subject: Re: radosgw crash within libfcgi
>
>
>
> ----- Original Message -----
> > From: "GuangYang" <[email protected]>
> > To: "Yehuda Sadeh-Weinraub" <[email protected]>
> > Cc: [email protected], [email protected]
> > Sent: Wednesday, June 24, 2015 2:12:23 PM
> > Subject: RE: radosgw crash within libfcgi
> >
> > ----------------------------------------
> > > Date: Wed, 24 Jun 2015 17:04:05 -0400
> > > From: [email protected]
> > > To: [email protected]
> > > CC: [email protected]; [email protected]
> > > Subject: Re: radosgw crash within libfcgi
> > >
> > >
> > >
> > > ----- Original Message -----
> > >> From: "GuangYang" <[email protected]>
> > >> To: "Yehuda Sadeh-Weinraub" <[email protected]>
> > >> Cc: [email protected], [email protected]
> > >> Sent: Wednesday, June 24, 2015 1:53:20 PM
> > >> Subject: RE: radosgw crash within libfcgi
> > >>
> > >> Thanks Yehuda for the response.
> > >>
> > >> We already patched libfcgi to use poll instead of select to overcome the
> > >> limitation.
> > >>
> > >> Thanks,
> > >> Guang
> > >>
> > >>
> > >> ----------------------------------------
> > >>> Date: Wed, 24 Jun 2015 14:40:25 -0400
> > >>> From: [email protected]
> > >>> To: [email protected]
> > >>> CC: [email protected]; [email protected]
> > >>> Subject: Re: radosgw crash within libfcgi
> > >>>
> > >>>
> > >>>
> > >>> ----- Original Message -----
> > >>>> From: "GuangYang" <[email protected]>
> > >>>> To: [email protected], [email protected],
> > >>>> [email protected]
> > >>>> Sent: Wednesday, June 24, 2015 10:09:58 AM
> > >>>> Subject: radosgw crash within libfcgi
> > >>>>
> > >>>> Hello Cephers,
> > >>>> Recently we have had several radosgw daemon crashes with the
> > >>>> following kernel log:
> > >>>>
> > >>>> Jun 23 14:17:38 xxx kernel: radosgw[68180]: segfault at f0 ip
> > >>>> 00007ffa069996f2 sp 00007ff55c432710 error 6 in
> > >
> > > error 6 is sigabrt, right? With invalid pointer I'd expect to get
> > > segfault.
> > > Is the pointer actually invalid?
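A side note on the "error 6" question: on x86 Linux the number printed after "error" in a "segfault at ..." line is the hardware page-fault error code, not a signal number. Below is a minimal sketch of decoding its low three bits. The enum and the helper name describe_page_fault are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Low bits of the x86 page-fault error code, as printed by the kernel
// in "segfault at <addr> ip <ip> sp <sp> error <code>".
enum PageFaultBit : std::uint32_t {
  kProtection = 1u << 0,  // 0: page not present, 1: protection violation
  kWrite      = 1u << 1,  // 0: read access,      1: write access
  kUser       = 1u << 2   // 0: kernel mode,      1: user mode
};

// Hypothetical helper: renders the three bits above as readable text.
inline std::string describe_page_fault(std::uint32_t code) {
  std::string out;
  out += (code & kUser) ? "user-mode " : "kernel-mode ";
  out += (code & kWrite) ? "write " : "read ";
  out += (code & kProtection) ? "(protection violation)"
                              : "(page not present)";
  return out;
}
```

For the log line in this thread, describe_page_fault(6) gives "user-mode write (page not present)", which is a user-space write to an unmapped address (here 0xf0).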
> > Using (ip - {load address of the shared library}) to find the instruction
> > that caused this crash, objdump shows the crash happened at offset 46f2
> > (see below), which assigns -1 to FCGX_Request::ipcFd, but I don't quite
> > understand how/why it could crash there.
> >
> > 0000000000004690 <FCGX_Free>:
> > 4690: 48 89 5c 24 f0 mov %rbx,-0x10(%rsp)
> > 4695: 48 89 6c 24 f8 mov %rbp,-0x8(%rsp)
> > 469a: 48 83 ec 18 sub $0x18,%rsp
> > 469e: 48 85 ff test %rdi,%rdi
> > 46a1: 48 89 fb mov %rdi,%rbx
> > 46a4: 89 f5 mov %esi,%ebp
> > 46a6: 74 28 je 46d0 <FCGX_Free+0x40>
> > 46a8: 48 8d 7f 08 lea 0x8(%rdi),%rdi
> > 46ac: e8 67 e3 ff ff callq 2a18 <FCGX_FreeStream@plt>
> > 46b1: 48 8d 7b 10 lea 0x10(%rbx),%rdi
> > 46b5: e8 5e e3 ff ff callq 2a18 <FCGX_FreeStream@plt>
> > 46ba: 48 8d 7b 18 lea 0x18(%rbx),%rdi
> > 46be: e8 55 e3 ff ff callq 2a18 <FCGX_FreeStream@plt>
> > 46c3: 48 8d 7b 28 lea 0x28(%rbx),%rdi
> > 46c7: e8 d4 f4 ff ff callq 3ba0 <FCGX_PutS+0x40>
> > 46cc: 85 ed test %ebp,%ebp
> > 46ce: 75 10 jne 46e0 <FCGX_Free+0x50>
> > 46d0: 48 8b 5c 24 08 mov 0x8(%rsp),%rbx
> > 46d5: 48 8b 6c 24 10 mov 0x10(%rsp),%rbp
> > 46da: 48 83 c4 18 add $0x18,%rsp
> > 46de: c3 retq
> > 46df: 90 nop
> > 46e0: 31 f6 xor %esi,%esi
> > 46e2: 83 7b 4c 00 cmpl $0x0,0x4c(%rbx)
> > 46e6: 8b 7b 30 mov 0x30(%rbx),%edi
> > 46e9: 40 0f 94 c6 sete %sil
> > 46ed: e8 86 e6 ff ff callq 2d78 <OS_IpcClose@plt>
> > 46f2: c7 43 30 ff ff ff ff movl $0xffffffff,0x30(%rbx)
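The address arithmetic described above can be written out as a small sketch, using only the numbers from the kernel log line. The helper names are hypothetical. The second helper assumes the faulting instruction really is the movl at 46f2, which this thread has not confirmed.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the kernel log line quoted in this thread.
const std::uint64_t kFaultIp   = 0x00007ffa069996f2ULL;  // "ip"
const std::uint64_t kFaultAddr = 0xf0ULL;                // "segfault at f0"
const std::uint64_t kLibBase   = 0x00007ffa06995000ULL;  // libfcgi.so.0.0.0[base+size]
const std::uint64_t kLibSize   = 0xa000ULL;

// Hypothetical helper: offset of an instruction pointer inside a mapped
// library, i.e. the address to look up in the objdump output.
inline std::uint64_t offset_in_library(std::uint64_t ip, std::uint64_t base,
                                       std::uint64_t size) {
  assert(ip >= base && ip - base < size);  // ip must fall inside the mapping
  return ip - base;
}

// Hypothetical helper: for an access of the form disp(%reg), the register
// value that would produce the given fault address.
inline std::uint64_t implied_base_register(std::uint64_t fault_addr,
                                           std::uint64_t disp) {
  assert(fault_addr >= disp);
  return fault_addr - disp;
}
```

offset_in_library(kFaultIp, kLibBase, kLibSize) is 0x46f2, matching the objdump line. Under the stated assumption, implied_base_register(kFaultAddr, 0x30) is 0xc0, which would not be a plausible heap pointer for an FCGX_Request.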
>
> info registers?
>
> Not too familiar with the specific message, but it could be that
> OS_IpcClose() aborts (not highly unlikely) and it only dumps the return
> address of the current function (shouldn't be referenced as ip though).
>
> What's rbx? Is the memory at %rbx + 0x30 valid?
>
> Also, did you by any chance upgrade the binaries while the code was running?
> Is the code running over NFS?
>
> Yehuda
>
> > >
> > > Yehuda
> > >
> > >
> > >>>> libfcgi.so.0.0.0[7ffa06995000+a000] in
> > >>>> libfcgi.so.0.0.0[7ffa06995000+a000]
> > >>>>
> > >>>> Looking at the assembly, it seems to crash at this point -
> > >>>> http://github.com/sknown/fcgi/blob/master/libfcgi/fcgiapp.c#L2035,
> > >>>> which confused me. I tried, without any luck, to see whether any
> > >>>> other reference holding the FCGX_Request releases the handle.
> > >>>>
> > >>>> There are also other observations:
> > >>>> 1> Several radosgw daemons across different hosts crashed around the
> > >>>> same time.
> > >>>> 2> Apache's error log has some fcgi errors complaining about
> > >>>> ##idle timeout## during that time.
> > >>>>
> > >>>> Has anyone experienced a similar issue?
> > >>>>
> > >>>
> > >>> In the past we've had issues with libfcgi that were related to the
> > >>> number
> > >>> of open fds on the process (> 1024). The issue was a buggy libfcgi that
> > >>> was using select() instead of poll(), so this might be the issue you're
> > >>> noticing.
> > >>>
> > >>> Yehuda
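To illustrate the select() versus poll() point: an fd_set can only hold descriptors below FD_SETSIZE (commonly 1024), while poll() takes descriptors in an array and has no such limit. This is a minimal sketch with hypothetical helper names. It is not the libfcgi code and not the patch mentioned in this thread.

```cpp
#include <cassert>
#include <poll.h>
#include <sys/select.h>
#include <unistd.h>

// True if fd can legally be stored in an fd_set for select().
// Calling FD_SET() with fd >= FD_SETSIZE is undefined behavior.
inline bool fits_in_fd_set(int fd) {
  return fd >= 0 && fd < FD_SETSIZE;
}

// Hypothetical helper: wait for fd to become readable using poll().
// Returns true if data is readable within timeout_ms milliseconds.
// For brevity, EINTR and other errors are simply reported as "not readable".
inline bool wait_readable(int fd, int timeout_ms) {
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;
  int rc = poll(&pfd, 1, timeout_ms);
  return rc > 0 && (pfd.revents & POLLIN) != 0;
}
```

A process with more than FD_SETSIZE open descriptors can still call wait_readable() on any of them, whereas fits_in_fd_set() would return false for the high-numbered ones.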
> > >>> --
> > >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel"
> > >>> in
> > >>> the body of a message to [email protected]
> > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com