At Sat, 30 May 2015 01:12:12 +0800,
Bingpeng Zhu wrote:
>
> Hi all:
>     I have a question about the epoch information in the SD_OP_READ_PEER
> request header. I'm not sure whether I misunderstand the code or it is a
> bug. When we recover an erasure-coded object during recovery, we first
> need to read the remaining replicas to rebuild the lost replica. In
> read_erasure_object(), we initialize the SD_OP_READ_PEER request header
> with the following code:
>
>     sd_init_req(&hdr, SD_OP_READ_PEER);
>     hdr.epoch = epoch;
>     hdr.flags = SD_FLAG_CMD_RECOVERY;
>     hdr.data_length = rlen;
>     hdr.obj.oid = oid;
>     hdr.obj.tgt_epoch = tgt_epoch;
>     hdr.obj.ec_index = idx;
>
> I think hdr.epoch is the current epoch of the cluster and
> hdr.obj.tgt_epoch is the historical epoch from which we want to read the
> stale replica. The target node calls peer_read_obj() to process the
> SD_OP_READ_PEER request. peer_read_obj() sets iocb.epoch = hdr->epoch and
> then passes iocb to sd_store->read(). In default_read(), we use
> iocb->epoch < sys_epoch() to judge whether the request is against an
> older epoch, in which case the replica must be read from the stale
> directory. I think we use the wrong epoch here. We should use
> hdr.obj.tgt_epoch rather than hdr.epoch to make that judgement. Can
> anyone answer my question?

Hi Bingpeng,

Thanks for pointing this out! Your argument seems to be right: we need to
use tgt_epoch. The problem also seems to affect the recovery process of
ordinary replicated objects.

Could you send a patch to fix this problem?

Thanks,
Hitoshi

>
> Thanks.
> Bingpeng
>
> --
> sheepdog mailing list
> [email protected]
> https://lists.wpkg.org/mailman/listinfo/sheepdog
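
For illustration, the difference between the two checks discussed in this
thread can be sketched as below. This is only a minimal model, not the
actual sheepdog code: struct read_peer_req and both helper functions are
hypothetical names made up for this example, and the real request header
and default_read() carry more state than is shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified stand-in for the two epoch fields of the
 * SD_OP_READ_PEER request header quoted above. Not the real layout.
 */
struct read_peer_req {
	uint32_t epoch;     /* cluster epoch when the request was sent */
	uint32_t tgt_epoch; /* historical epoch the wanted replica belongs to */
};

/*
 * Check as described in the report: the sender's current epoch is
 * compared with the receiver's epoch. During recovery both are normally
 * the same, so the stale directory is never selected.
 */
static inline bool wants_stale_by_epoch(const struct read_peer_req *req,
					uint32_t sys_epoch)
{
	return req->epoch < sys_epoch;
}

/*
 * Proposed check: compare the target epoch instead, so that a request
 * for a replica from an older epoch is served from the stale directory.
 */
static inline bool wants_stale_by_tgt_epoch(const struct read_peer_req *req,
					    uint32_t sys_epoch)
{
	return req->tgt_epoch < sys_epoch;
}
```

With epoch = 5, tgt_epoch = 3 and a receiver at epoch 5, the first helper
returns false (the read goes to the current object directory) while the
second returns true (the read goes to the stale directory), which is the
mismatch described above.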
