Hi Hitoshi,

We think killing gateway nodes when only gateway nodes are left is a 
reasonable idea. The VM cannot read from or write to Sheepdog in this 
situation anyway. We can test more later :)
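
For illustration, the "only gateway nodes left" condition could be checked 
roughly as below. This is a minimal sketch, not sheepdog's actual code: 
`struct node`, its `nr_vnodes` field, and `epoch_is_gateway_only()` are 
hypothetical names, and it assumes a gateway is modeled as a node that 
carries zero vnodes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified node record; not sheepdog's own node struct. */
struct node {
	int nr_vnodes;	/* assumption: 0 means the node stores no objects */
};

/*
 * Return true when the membership is non-empty and no node in it
 * stores objects, i.e. the epoch consists of gateway nodes only.
 */
static bool epoch_is_gateway_only(const struct node *nodes, size_t nr_nodes)
{
	if (nr_nodes == 0)
		return false;

	for (size_t i = 0; i < nr_nodes; i++) {
		if (nodes[i].nr_vnodes > 0)
			return false;	/* at least one storage node remains */
	}
	return true;
}
```

A caller would evaluate this on each membership change and stop the local 
gateway when it returns true; how the daemon is actually stopped is left out 
of the sketch.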

Thanks,
Yang, Long

> -----Original Message-----
> From: "Hitoshi Mitake" <[email protected]>
> Sent: Wednesday, December 10, 2014
> To: "Yang Zhang" <[email protected]>
> Cc: "Hitoshi Mitake" <[email protected]>, [email protected], 
> [email protected], [email protected], "徐小�霜" <[email protected]>
> Subject: Re: [PATCH] sheep: don't clean stale dir if there are no enough nodes
> 
> At Fri, 5 Dec 2014 12:05:26 +0800 (GMT+08:00),
> Yang Zhang wrote:
> > 
> > Hi Hitoshi,
> > 
> > I've tested the patch. It didn't solve the problem: 'dog vdi list' still 
> > shows "object not found". 
> > Actually, it didn't clean the objects saved in the .stale dir, but it 
> > didn't recover them back to obj/ either.
> 
> Yang, Long,
> 
> Thanks for your testing. It worked well in my environment, so I think
> we tested differently. I'll share my testing method later.
> 
> > 
> > Also, I wonder: even if we recover the objects in the .stale dir, will 
> > they be the newest version?
> 
> Yes, the problem remains. The current behavior of sheepdog is odd. When
> nr_zones < the maximum number of copies, it should stop with the status
> SD_STATUS_WAIT, as in the initialization sequence. In addition, if the
> cluster is already SD_STATUS_OK, a newly joining node shouldn't provide
> its objects for the recovery process. All objects are already replicated
> correctly on the existing nodes.
> 
> BTW, what do you think about this idea: simply killing gateway nodes
> when an epoch becomes gateway-only? It would solve the problem in a
> simple way, and it doesn't hurt VMs because QEMU (and tgt) already have
> a reconnection feature. A gateway-only cluster doesn't contribute to
> reads and writes, so simply stopping it seems a reasonable idea to me.
> 
> My company doesn't use the gateway feature, so I'd like to hear your
> opinion.
> 
> Thanks,
> Hitoshi
> 
> > 
> > Thanks,
> > Yang
> > 
> > > -----Original Message-----
> > > From: "Hitoshi Mitake" <[email protected]>
> > > Sent: Thursday, December 4, 2014
> > > To: [email protected]
> > > Cc: [email protected], "Hitoshi Mitake" 
> > > <[email protected]>, [email protected], "张扬" 
> > > <[email protected]>, "徐小�霜" <[email protected]>
> > > Subject: Re: [PATCH] sheep: don't clean stale dir if there are no enough nodes
> > > 
> > > At Thu,  4 Dec 2014 16:05:39 +0900,
> > > Hitoshi Mitake wrote:
> > > > 
> > > > The current recovery process has a data-wipe bug. After an epoch that
> > > > consists only of gateway nodes, objects stored on dying nodes will be
> > > > wiped when those nodes join the cluster. This patch solves the
> > > > problem by removing the invalid call of sd_store->cleanup() during
> > > > recovery completion.
> > > > 
> > > > Related issue:
> > > > https://bugs.launchpad.net/sheepdog-project/+bug/1327037
> > > > 
> > > > Cc: [email protected]
> > > > Cc: 张扬 <[email protected]>
> > > > Cc: 徐小�霜 <[email protected]>
> > > > Signed-off-by: Hitoshi Mitake <[email protected]>
> > > > ---
> > > >  sheep/ops.c        |  5 +++--
> > > >  sheep/sheep_priv.h |  1 +
> > > >  sheep/vdi.c        | 12 ++++++++++++
> > > >  3 files changed, 16 insertions(+), 2 deletions(-)
> > > 
> > > 张扬, 徐小�霜, could you test this patch if you have time? It would be
> > > the simplest solution for the problem.
> > > 
> > > Thanks,
> > > Hitoshi
> > > 
> > > > 
> > > > diff --git a/sheep/ops.c b/sheep/ops.c
> > > > index a617a83..b418bda 100644
> > > > --- a/sheep/ops.c
> > > > +++ b/sheep/ops.c
> > > > @@ -726,8 +726,9 @@ static int cluster_recovery_completion(const struct sd_req *req,
> > > >                         sd_notice("all nodes are recovered, epoch %d", epoch);
> > > >                         last_gathered_epoch = epoch;
> > > >                         /* sd_store can be NULL if this node is a gateway */
> > > > -                       if (vnode_info->nr_zones >= ec_max_data_strip &&
> > > > -                           sd_store && sd_store->cleanup)
> > > > +                       if (vnode_info->nr_zones >=
> > > > +                           max(ec_max_data_strip, max_nr_copies)
> > > > +                           && sd_store && sd_store->cleanup)
> > > >                                 sd_store->cleanup();
> > > >                 }
> > > >         }
> > > > diff --git a/sheep/sheep_priv.h b/sheep/sheep_priv.h
> > > > index 5fc6b90..699f352 100644
> > > > --- a/sheep/sheep_priv.h
> > > > +++ b/sheep/sheep_priv.h
> > > > @@ -357,6 +357,7 @@ int inode_coherence_update(uint32_t vid, bool validate,
> > > >  void remove_node_from_participants(const struct node_id *left);
> > > >  
> > > >  extern int ec_max_data_strip;
> > > > +extern int max_nr_copies;
> > > >  
> > > >  int read_vdis(char *data, int len, unsigned int *rsp_len);
> > > >  int read_del_vdis(char *data, int len, unsigned int *rsp_len);
> > > > diff --git a/sheep/vdi.c b/sheep/vdi.c
> > > > index 1c8fb36..d815196 100644
> > > > --- a/sheep/vdi.c
> > > > +++ b/sheep/vdi.c
> > > > @@ -40,6 +40,12 @@ static struct sd_rw_lock vdi_state_lock = SD_RW_LOCK_INITIALIZER;
> > > >   */
> > > >  int ec_max_data_strip;
> > > >  
> > > > +/*
> > > > + * max_nr_copies represent max number of copies of replicated VDIs. It is used
> > > > + * for the same purpose of ec_max_data_strip.
> > > > + */
> > > > +int max_nr_copies;
> > > > +
> > > >  int sheep_bnode_writer(uint64_t oid, void *mem, unsigned int len,
> > > >                        uint64_t offset, uint32_t flags, int copies,
> > > >                        int copy_policy, bool create, bool direct)
> > > > @@ -171,6 +177,12 @@ int add_vdi_state(uint32_t vid, int nr_copies, bool snapshot, uint8_t cp)
> > > >                 sd_mutex_lock(&m);
> > > >                 ec_max_data_strip = max(d, ec_max_data_strip);
> > > >                 sd_mutex_unlock(&m);
> > > > +       } else {
> > > > +               static struct sd_mutex m = SD_MUTEX_INITIALIZER;
> > > > +
> > > > +               sd_mutex_lock(&m);
> > > > +               max_nr_copies = max(nr_copies, max_nr_copies);
> > > > +               sd_mutex_unlock(&m);
> > > >         }
> > > >  
> > > >         sd_debug("%" PRIx32 ", %d, %d", vid, nr_copies, cp);
> > > > -- 
> > > > 1.8.3.2
> > > > 
> > 
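
To make the change in the quoted patch easier to follow, the guard around 
sd_store->cleanup() can be sketched in isolation. This is a hedged 
illustration only: `old_guard()`, `new_guard()`, `max_int()`, and the 
`has_cleanup` flag are hypothetical stand-ins for the inline condition in 
cluster_recovery_completion(), and the example values assume that a 
gateway-only epoch reports nr_zones as 0.

```c
#include <stdbool.h>

/* Hypothetical helper standing in for the max() macro used in the patch. */
static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/* Condition before the patch: only the erasure-coding strip count is checked. */
static bool old_guard(int nr_zones, int ec_max_data_strip, bool has_cleanup)
{
	return nr_zones >= ec_max_data_strip && has_cleanup;
}

/* Condition after the patch: replicated VDIs' copy count is considered too. */
static bool new_guard(int nr_zones, int ec_max_data_strip,
		      int max_nr_copies, bool has_cleanup)
{
	return nr_zones >= max_int(ec_max_data_strip, max_nr_copies) &&
	       has_cleanup;
}
```

With no erasure-coded VDIs (ec_max_data_strip of 0) and 3-copy replicated 
VDIs, the old condition holds even when nr_zones is 0, while the new one 
holds only once at least 3 zones are present.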

-- 
sheepdog mailing list
[email protected]
http://lists.wpkg.org/mailman/listinfo/sheepdog
