Dear Mahesh,

From what I have seen, in this case the retention timer cannot detect that CPND is only temporarily down, because its pid has changed.
If cpnd is only temporarily down, we do not need to clean up anything. If cpnd is permanently down, the drawback of this proposal is that the replica is not cleaned up. But if cpnd is permanently down, we have to reboot the node to recover, so I think this cleanup is not really necessary.

I also checked this implementation against the possible test cases and have not seen any side effects. Please consider it.

Thank you and best regards,
Hoang

-----Original Message-----
From: A V Mahesh [mailto:mahesh.va...@oracle.com]
Sent: Friday, February 10, 2017 10:40 AM
To: Hoang Vo <hoang.m...@dektech.com.au>; zoran.milinko...@ericsson.com
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1 of 1] cpd: to correct failover behavior of cpsv [#1765] V5

Hi Hoang,

The CPD_CPND_DOWN_RETENTION timer is there to recognize whether a CPND is temporarily or permanently down. It is started when a CPND goes down. If it expires, then based on cpd_evt_proc_timer_expiry() cpd recognizes that the CPND is completely down and does the cleanup; otherwise, if the cpnd rejoins within CPD_CPND_DOWN_RETENTION_TIME, the CPD_CPND_DOWN_RETENTION timer is stopped.

If we stop the CPD_CPND_DOWN_RETENTION timer in cpd_process_cpnd_down(), does cpd still recognize that the CPND is permanently down? cpd_process_cpnd_down() is called in multiple flows; can you please check all the flows to see whether stopping the CPD_CPND_DOWN_RETENTION timer has any impact?

-AVM

On 2/9/2017 1:35 PM, Hoang Vo wrote:
> src/ckpt/ckptd/cpd_proc.c | 11 ++++++++++-
> 1 files changed, 10 insertions(+), 1 deletions(-)
>
>
> problem:
> In case of multiple failovers, the cpnd is down for a moment, so
> there is no cpnd that has the specific checkpoint open. This leads to the retention timer being triggered.
> When cpnd is up again it has a different pid, so the retention timer is not stopped.
> The replica is deleted when the retention timer expires, while its information is still in the ckpt database.
> That causes problems.
>
> Fix:
> - Stop timer of removed node.
> - Update data in patricia trees (for retention value consistency).
>
> diff --git a/src/ckpt/ckptd/cpd_proc.c b/src/ckpt/ckptd/cpd_proc.c
> --- a/src/ckpt/ckptd/cpd_proc.c
> +++ b/src/ckpt/ckptd/cpd_proc.c
> @@ -679,7 +679,8 @@ uint32_t cpd_process_cpnd_down(CPD_CB *c
>  	cpd_cpnd_info_node_find_add(&cb->cpnd_tree, cpnd_dest, &cpnd_info, &add_flag);
>  	if (!cpnd_info)
>  		return NCSCC_RC_SUCCESS;
> -
> +	/* Stop timer before processing down */
> +	cpd_tmr_stop(&cpnd_info->cpnd_ret_timer);
>  	cref_info = cpnd_info->ckpt_ref_list;
>  
>  	while (cref_info) {
> @@ -989,6 +990,14 @@ uint32_t cpd_proc_retention_set(CPD_CB *
>  
>  	/* Update the retention Time */
>  	(*ckpt_node)->ret_time = reten_time;
> +	(*ckpt_node)->attributes.retentionDuration = reten_time;
> +
> +	/* Update the related patricia tree */
> +	CPD_CKPT_MAP_INFO *map_info = NULL;
> +	cpd_ckpt_map_node_get(&cb->ckpt_map_tree, (*ckpt_node)->ckpt_name, &map_info);
> +	if (map_info) {
> +		map_info->attributes.retentionDuration = reten_time;
> +	}
>  	return rc;
>  }
>
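
For readers following the thread, here is a toy C model of the behaviour under discussion. It is only a sketch, not OpenSAF code: the types and functions (toy_addr, toy_entry, toy_on_down, toy_on_up, toy_expiry_triggers_cleanup) are hypothetical, and it assumes that a CPND address is effectively a (node id, pid) pair and that the timer is stopped only when the rejoining address matches exactly. Under those assumptions, a restart with a new pid leaves the timer armed, which is the situation the patch description reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical and not part of OpenSAF.
 * Assumption: an address is identified by (node_id, pid). */
typedef struct {
    unsigned node_id;   /* unchanged across a restart */
    unsigned pid;       /* changes when the process restarts */
} toy_addr;

typedef struct {
    toy_addr addr;      /* address the entry was registered under */
    bool timer_armed;   /* stands in for the down-retention timer */
} toy_entry;

static bool toy_addr_equal(toy_addr a, toy_addr b)
{
    return a.node_id == b.node_id && a.pid == b.pid;
}

/* The process went down: arm the retention timer on its entry. */
static void toy_on_down(toy_entry *e)
{
    assert(e != NULL);
    e->timer_armed = true;
}

/* The process came up: disarm the timer only on an exact address match.
 * Returns true if a pending timer was disarmed. */
static bool toy_on_up(toy_entry *e, toy_addr new_addr)
{
    assert(e != NULL);
    if (e->timer_armed && toy_addr_equal(e->addr, new_addr)) {
        e->timer_armed = false;
        return true;
    }
    return false;
}

/* Timer expiry: cleanup would run only if the timer is still armed. */
static bool toy_expiry_triggers_cleanup(const toy_entry *e)
{
    assert(e != NULL);
    return e->timer_armed;
}
```

In this model, a rejoin with the same pid disarms the timer, while a rejoin with a new pid does not, so expiry would still trigger cleanup even though the process is back. Stopping the timer during down processing, as the patch proposes, avoids that at the cost Mahesh raises: a permanently down process is then no longer detected by expiry.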