At Mon, 1 Jun 2015 16:44:53 +0800,
Bingpeng Zhu wrote:
>
> Hi Hitoshi,
> I missed some important code in queue_peer_request(). The function
> has the following code:
>
>     if (req->rq.flags & SD_FLAG_CMD_RECOVERY)
>         req->rq.epoch = req->rq.obj.tgt_epoch;
>
> In the end, epoch and tgt_epoch have the same value, so it is not a
> bug. I'm sorry about my mistake.
Oops, I missed that code, too. Could you send a patch that adds the
above description as a comment? If it is added to peer_read_obj(), it
will be helpful for other developers.

Thanks,
Hitoshi

>
> Bingpeng
>
> ------------------ Original ------------------
> From: "Hitoshi Mitake" <[email protected]>
> Date: Sat, May 30, 2015 06:11 PM
> To: "Bingpeng Zhu" <[email protected]>
> Cc: "sheepdog" <[email protected]>
> Subject: Re: [sheepdog] question about the epoch field in reading peer
> request header
>
> At Sat, 30 May 2015 01:12:12 +0800,
> Bingpeng Zhu wrote:
> >
> > Hi all,
> > I have a question about the epoch information in the SD_OP_READ_PEER
> > request header. I'm not sure whether I misunderstand the code or it
> > is a bug. When we recover an erasure-coded object during recovery,
> > we first need to read the remaining replicas to rebuild the lost
> > replica. In read_erasure_object(), we initialize the SD_OP_READ_PEER
> > request header with the following code:
> >
> >     sd_init_req(&hdr, SD_OP_READ_PEER);
> >     hdr.epoch = epoch;
> >     hdr.flags = SD_FLAG_CMD_RECOVERY;
> >     hdr.data_length = rlen;
> >     hdr.obj.oid = oid;
> >     hdr.obj.tgt_epoch = tgt_epoch;
> >     hdr.obj.ec_index = idx;
> >
> > I think hdr.epoch is the current epoch of the cluster and
> > hdr.obj.tgt_epoch is the historical epoch from which we want to read
> > the stale replica. The target node calls peer_read_obj() to process
> > the SD_OP_READ_PEER request. peer_read_obj() sets
> > iocb.epoch = hdr->epoch and then passes iocb to sd_store->read(). In
> > default_read(), we use iocb->epoch < sys_epoch() to decide whether
> > the request is against an older epoch, in which case the replica
> > must be read from the stale directory. I think we use the wrong
> > epoch here: we should use hdr.obj.tgt_epoch rather than hdr.epoch to
> > make that decision. Can anyone answer my question?
>
> Hi Bingpeng,
>
> Thanks for pointing this out! Your argument seems to be right. We
> need to use tgt_epoch. The problem seems to be shared with the
> recovery process of ordinary replicated objects.
>
> Could you send a patch for solving this problem?
>
> Thanks,
> Hitoshi
>
> >
> > Thanks.
> > Bingpeng

--
sheepdog mailing list
[email protected]
https://lists.wpkg.org/mailman/listinfo/sheepdog
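
The conclusion reached in this thread can be sketched in C as follows.
This is a minimal, simplified illustration and not the sheepdog source:
the struct, the sketch_ function names, and the numeric flag value are
assumptions made for the example. Only the field names, the override
in queue_peer_request(), and the epoch < sys_epoch() comparison come
from the messages in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder value (assumption); the real constant is defined in
 * sheepdog's own headers. */
#define SD_FLAG_CMD_RECOVERY 0x0008

/* Simplified stand-in for the request header. It keeps only the
 * fields discussed in the thread. */
struct sketch_req {
	uint32_t epoch;     /* hdr.epoch: current cluster epoch on send */
	uint16_t flags;     /* hdr.flags */
	uint32_t tgt_epoch; /* hdr.obj.tgt_epoch: epoch to read from */
};

/* Mirrors the override quoted from queue_peer_request(): a recovery
 * request has its epoch replaced by the target epoch. */
static void sketch_queue_peer_request(struct sketch_req *rq)
{
	if (rq->flags & SD_FLAG_CMD_RECOVERY)
		rq->epoch = rq->tgt_epoch;
}

/* Mirrors the check described for default_read(): a request against
 * an older epoch is served from the stale directory. */
static bool sketch_reads_stale(const struct sketch_req *rq,
			       uint32_t sys_epoch)
{
	return rq->epoch < sys_epoch;
}

/* Hypothetical walk-through: recovery at cluster epoch 5 that wants
 * the replica as it existed at epoch 3. */
static void sketch_example(void)
{
	struct sketch_req rq = {
		.epoch = 5, .flags = SD_FLAG_CMD_RECOVERY, .tgt_epoch = 3,
	};

	sketch_queue_peer_request(&rq);
	assert(rq.epoch == rq.tgt_epoch);   /* epoch now equals tgt_epoch */
	assert(sketch_reads_stale(&rq, 5)); /* 3 < 5: stale directory */
}
```

Because the override runs before the request reaches the read path,
comparing the request epoch with the system epoch gives the same answer
as comparing tgt_epoch would, which is why the behaviour first reported
as a bug turned out to be correct.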
