At Mon, 1 Jun 2015 16:44:53 +0800,
Bingpeng Zhu wrote:
> 
> Hi Hitoshi,
>     I missed some important code in queue_peer_request(). The function
> has the following code:
> 
> 
>       if (req->rq.flags & SD_FLAG_CMD_RECOVERY)
>               req->rq.epoch = req->rq.obj.tgt_epoch;
> 
> 
> 
>    In the end, epoch and tgt_epoch have the same value, so it is not a bug. I'm
> sorry about my mistake.

Oops, I missed the code, too. Could you send a patch for adding a
description as a comment? If the above description is added to
peer_read_obj(), it will be helpful for other developers.
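
For reference, here is a minimal sketch of the behaviour described above.
It is only an illustration under stated assumptions: the struct, the flag
value and all sketch_* names are hypothetical stand-ins, not the actual
sheepdog definitions. It shows that once the receiving side overwrites
epoch with tgt_epoch for recovery requests, the "epoch < current epoch"
test already selects the stale replica.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value, for illustration only. */
#define SKETCH_FLAG_RECOVERY 0x1

/* Simplified stand-in for the request header fields discussed here. */
struct sketch_req {
	uint32_t epoch;     /* corresponds to hdr.epoch */
	uint16_t flags;     /* corresponds to hdr.flags */
	uint32_t tgt_epoch; /* corresponds to hdr.obj.tgt_epoch */
};

/*
 * Models the override quoted from queue_peer_request(): for recovery
 * requests, epoch is replaced by tgt_epoch before the request is handled.
 */
static void sketch_queue_peer_request(struct sketch_req *rq)
{
	if (rq->flags & SKETCH_FLAG_RECOVERY)
		rq->epoch = rq->tgt_epoch;
}

/*
 * Models the check described for default_read(): a request whose epoch
 * is older than the current one is served from the stale directory.
 */
static bool sketch_reads_stale(const struct sketch_req *rq, uint32_t cur_epoch)
{
	return rq->epoch < cur_epoch;
}
```

With a recovery request carrying epoch 5 and tgt_epoch 3 on a cluster at
epoch 5, the override sets epoch to 3, so the stale check succeeds. A
request without the recovery flag keeps epoch 5 and is not treated as
stale.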

Thanks,
Hitoshi

> 
> 
> Bingpeng
> 
> 
> ------------------ Original ------------------
> From:  "Hitoshi Mitake";<[email protected]>;
> Date:  Sat, May 30, 2015 06:11 PM
> To:  "Bingpeng Zhu"<[email protected]>; 
> Cc:  "sheepdog"<[email protected]>; 
> Subject:  Re: [sheepdog] question about the epoch field in reading peer request header
> 
> 
> 
> At Sat, 30 May 2015 01:12:12 +0800,
> Bingpeng Zhu wrote:
> > 
> > Hi all:
> >     I have a question about the epoch information in the SD_OP_READ_PEER
> > request header. I'm not sure whether I misunderstand the code or it is a bug.
> >     When we recover an erasure-coded object, we first need to read the
> > remaining replicas to rebuild the lost one. In the function
> > read_erasure_object(), we initialize the SD_OP_READ_PEER request header
> > with the following code:
> > 
> > 
> >     sd_init_req(&hdr, SD_OP_READ_PEER);
> >     hdr.epoch = epoch;
> >     hdr.flags = SD_FLAG_CMD_RECOVERY;
> >     hdr.data_length = rlen;
> >     hdr.obj.oid = oid;
> >     hdr.obj.tgt_epoch = tgt_epoch;
> >     hdr.obj.ec_index = idx;
> > 
> > 
> > 
> >     I think hdr.epoch is the current epoch of the cluster and hdr.obj.tgt_epoch
> > is the historical epoch from which we want to read the stale replica. The
> > target node calls peer_read_obj() to process the SD_OP_READ_PEER request.
> > peer_read_obj() sets iocb.epoch = hdr->epoch and then passes iocb to
> > sd_store->read(). In default_read(), we use iocb->epoch < sys_epoch() to
> > judge whether the request is against an older epoch, in which case the
> > replica must be read from the stale directory. I think we use the wrong
> > epoch here. We should use hdr.obj.tgt_epoch rather than hdr.epoch to make
> > this judgement. Can anyone answer my question?
> 
> Hi Bingpeng,
> 
> Thanks for pointing this out! Your argument seems to be right. We need to
> use tgt_epoch. The problem seems to be shared with the recovery process of
> ordinary replicated objects.
> 
> Could you send a patch for solving this problem?
> 
> Thanks,
> Hitoshi
> 
> > 
> > 
> > Thanks.
> > Bingpeng
> > -- 
> > sheepdog mailing list
> > [email protected]
> > https://lists.wpkg.org/mailman/listinfo/sheepdog
> 
