Hi Canh,
Thanks for the good finding. There is another way a well-known stream can end up deleted. Look at the code below, from proc_stream_open_msg():

    rc = lgs_mds_msg_send(cb, &msg, &evt->fr_dest, &evt->mds_ctxt,
                          MDS_SEND_PRIORITY_HIGH);

    // Checkpoint the opened stream
    if (ais_rv == SA_AIS_OK) {
      lgs_ckpt_stream_open(logStream, open_sync_param->client_id);
    }

If the active node is rebooted, or is split from its peer, after sending the OK reply to the log agent *and* before forwarding the update to the standby node, the numOpeners value on the standby will be one less than the actual number of connections to that stream.

I think we should log a Warning or Error when checkpointing the data fails, and never close a well-known stream even when numOpeners is zero (0).

Regards, Vu

From: Canh Van Truong <canh.v.tru...@dektech.com.au>
Sent: Friday, March 22, 2019 12:01 PM
To: 'Vu Minh Nguyen' <vu.m.ngu...@dektech.com.au>; lennart.l...@ericsson.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: RE: [PATCH 1/1] log: logd crash due to well known stream has numOpeners = 0 [#3018]

Hi Lennart and aVu,

Thanks for your review. I checked the code again, and my guess at the conditions for the crash is:

1/ The current "numOpeners" of a well-known stream is 2, which means just one client has opened the stream. LOGD gets a close request for that stream from the client. Closing the stream succeeds on the active node, but checkpointing to the standby has a problem and the stream is not closed on the standby.

2/ Now "numOpeners" is 1 on the active node and 2 on the standby node. One client still owns the well-known stream on the standby, although that client has already closed the stream on the active node.

3/ Another client opens the stream again. "numOpeners" is now 2 on both the active and the standby node. Only one client owns the stream on the active node, but two clients own it on the standby.

4/ The active node is rebooted. Assume that both clients are on the active node, so both clients go down.
The standby node, once it becomes active, will close that well-known stream 2 times, starting from numOpeners = 2. numOpeners then becomes zero for the well-known stream.

I don't see any other hint as to what causes the issue.

Regards
Canh

From: Vu Minh Nguyen <vu.m.ngu...@dektech.com.au>
Sent: Monday, March 18, 2019 2:14 PM
To: 'Canh Van Truong' <canh.v.tru...@dektech.com.au>; lennart.l...@ericsson.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: RE: [PATCH 1/1] log: logd crash due to well known stream has numOpeners = 0 [#3018]

Hi Canh,

The log stream is allocated with the new operator followed by parentheses, (), which means the numOpeners field is already zero-initialized:

    log_stream_t *stream = new (std::nothrow) log_stream_t();

So I don't think your change will address the issue. The bug may be located somewhere else.

Regards, Vu

> -----Original Message-----
> From: Canh Van Truong <canh.v.tru...@dektech.com.au>
> Sent: Thursday, March 14, 2019 6:49 PM
> To: lennart.l...@ericsson.com; vu.m.ngu...@dektech.com.au
> Cc: opensaf-devel@lists.sourceforge.net; Canh Van Truong <canh.v.tru...@dektech.com.au>
> Subject: [PATCH 1/1] log: logd crash due to well known stream has numOpeners = 0 [#3018]
>
> When the stream is created, numOpeners is not initialized and
> may start with an unexpected value (e.g. the max value of an unsigned int32).
> That is not correct. As a result, when a client closes the well-known stream,
> numOpeners may become 0, and the crash happens.
>
> "numOpeners" should be initialized to 0.
> ---
>  src/log/logd/lgs_stream.cc | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/log/logd/lgs_stream.cc b/src/log/logd/lgs_stream.cc
> index 28344c8cd..8ad0757b9 100644
> --- a/src/log/logd/lgs_stream.cc
> +++ b/src/log/logd/lgs_stream.cc
> @@ -719,6 +719,7 @@ log_stream_t *log_stream_new(const std::string &name, int stream_id) {
>    stream->severityFilter = 0x7f; /* by default all levels are allowed */
>    stream->isRtStream = SA_FALSE;
>    stream->dest_names.clear();
> +  stream->numOpeners = 0;  // Set the number of openers is 0 at creating stream
>
>    /* Initiate local or shared stream file descriptor dependant on shared or
>     * split file system
> --
> 2.15.1

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
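
Vu's point in the March 18 reply, that allocating with `new (std::nothrow) log_stream_t()` already zero-initializes numOpeners, can be checked with a small sketch. DemoStream below is a hypothetical stand-in, not the real log_stream_t, and the function name is made up for illustration. The sketch assumes the struct has no user-provided constructor; the thread does not show the definition of log_stream_t, so whether the same holds for it is an assumption. The storage is filled with 0xFF first so that a zero counter cannot be explained by memory that happened to be clean.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>
#include <string>

// Hypothetical stand-in for log_stream_t: no user-provided constructor,
// one non-trivial member (std::string) and one plain counter.
struct DemoStream {
  std::string name;
  uint32_t numOpeners;
};

// Constructs a DemoStream with value-initialization, `T()`, in storage that
// was deliberately filled with 0xFF, and returns the resulting numOpeners.
uint32_t NumOpenersAfterValueInit() {
  alignas(DemoStream) unsigned char storage[sizeof(DemoStream)];
  std::memset(storage, 0xFF, sizeof(storage));
  assert(storage[0] == 0xFF);  // sanity check: the storage really is poisoned

  // The parentheses request value-initialization. For a class without a
  // user-provided constructor, the object is zero-initialized first and the
  // implicit default constructor then runs for the std::string member.
  DemoStream *stream = new (storage) DemoStream();
  uint32_t result = stream->numOpeners;

  stream->~DemoStream();  // placement new requires an explicit destructor call
  return result;
}
```

Calling NumOpenersAfterValueInit() returns 0 under these assumptions, which is why an extra `stream->numOpeners = 0;` would not change behavior for such a type. If the type had a user-provided constructor that skipped the field, or were allocated without the parentheses, the field would be left uninitialized and the explicit assignment would matter.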