Hi Alex,

 

Sorry I got confuse so the comment is not clear, and maybe is wrong.

I try to rewrite it.

 

- Do we really need to add a flag variable for tracking node_added? If we 
implement cpnd_ckpt_node_del() before ckpt_shm_node_free_error and after 
agent_rsp2 with new goto mark that might seem clearer. I have a concern of 
adding flag(s) makes code harder to read.

 

Another minor point, this file uses tab indent, your code uses 2 space indent, 
please make it consistent.

 

Sincerely,

Hoang

 

From: Alex Jones [mailto:alex.jo...@genband.com] 
Sent: Friday, October 20, 2017 8:41 PM
To: Vo Minh Hoang <hoang.m...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1/1] ckptnd: fix crash during checkpoint open timeout with 
large sections [#1510]

 

Hi Hoang,

    I'm not sure what you are asking. Are you saying that all of the code under 
the ckpt_node_free_error label should be moved to the agent_rsp2 label? I don't 
understand why. Can you explain this in more detail?

    What we see is that out_evt->info.cpnd.error does return TRY_AGAIN, but 
there is a "goto ckpt_shm_node_free_error" right after that (line 1102 in the 
patch). Without changing this goto statement, then all the code that was moved 
to agent_rsp2 (per your suggestion) would get skipped, and nothing would be 
removed.

Alex

 

On 10/20/2017 02:28 AM, Vo Minh Hoang wrote:

  _____  

NOTICE: This email was received from an EXTERNAL sender

  _____  


Dear Alex,

I got confuse with this path as following:
>> sync from the active can timeout with errorcode SA_AIS_ERR_TRY_AGAIN
Does that mean out_evt->info.cpnd.error == SA_AIS_ERR_TRY_AGAIN?

Your current cpnd_ckpt_node_del() is added in ckpt_node_free_error.
If above is true, please consider moving ckpt_node_free_error() to
agent_rsp2 part, node_added flag might not need.

Sincerely,
Hoang

-----Original Message-----
From: Alex Jones [mailto:alex.jo...@genband.com] 
Sent: Tuesday, October 17, 2017 9:20 PM
To: Hoang Vo  <mailto:hoang.m...@dektech.com.au> <hoang.m...@dektech.com.au>
Cc: opensaf-devel@lists.sourceforge.net 
<mailto:opensaf-devel@lists.sourceforge.net> ; Alex Jones  
<mailto:alex.jo...@genband.com> <alex.jo...@genband.com>
Subject: [PATCH 1/1] ckptnd: fix crash during checkpoint open timeout with
large sections [#1510]

ckptnd crashes

When opening a collocated checkpoint replica where the active has large
numbers of sections (~200k), the sync from the active can timeout with
errorcode SA_AIS_ERR_TRY_AGAIN. In this case the code deletes the memory for
the node, but does not delete the node from the db. When the checkpoint
access is tried again, the freed memory for the node is still in the db, and
ckptnd crashes.

Delete the node from the db if the node is deleted during the open.
---
src/ckpt/ckptnd/cpnd_evt.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/src/ckpt/ckptnd/cpnd_evt.c b/src/ckpt/ckptnd/cpnd_evt.c index
2070163..a968f34 100644
--- a/src/ckpt/ckptnd/cpnd_evt.c
+++ b/src/ckpt/ckptnd/cpnd_evt.c
@@ -702,6 +702,7 @@ static uint32_t cpnd_evt_proc_ckpt_open(CPND_CB *cb,
CPND_EVT *evt,
CPSV_EVT send_evt, *out_evt = NULL;
SaConstStringT ckpt_name = NULL;
uint32_t rc = NCSCC_RC_SUCCESS;
+ bool node_added = false;
CPND_CPD_DEFERRED_REQ_NODE *node = NULL;
CPND_CKPT_CLIENT_NODE *cl_node = NULL;
CPND_CKPT_NODE *cp_node = NULL;
@@ -1026,6 +1027,8 @@ static uint32_t cpnd_evt_proc_ckpt_open(CPND_CB *cb,
CPND_EVT *evt,
goto ckpt_shm_node_free_error;
}

+ node_added = true;
+
if (out_evt->info.cpnd.info.ckpt_info.ckpt_rep_create ==
true &&
cp_node->create_attrib.maxSections == 1) {

@@ -1200,6 +1203,13 @@ ckpt_node_free_error:
if (cp_node->ret_tmr.is_active)
cpnd_tmr_stop(&cp_node->ret_tmr);
cpnd_ckpt_sec_map_destroy(&cp_node->replica_info);
+
+ if (node_added) {
+ rc = cpnd_ckpt_node_del(cb, cp_node);
+ if (rc == NCSCC_RC_FAILURE)
+ LOG_ER("cpnd client tree del failed"); }
+
m_MMGR_FREE_CPND_CKPT_NODE(cp_node);

agent_rsp:
--
2.9.5

 

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to