2014-10-20 9:07 GMT+02:00 Hitoshi Mitake <[email protected]>:
> This patchset removes a bug in recovery process. Current recovery
> process can lose data of erasure coded VDIs when a number of nodes is
> smaller than a number of data stripes.
>
> The same thing can be found here:
> https://github.com/sheepdog/sheepdog/tree/ec-recovery
>
> Valerio, could you test it?

It seems to work fine.

I used -c 2:1 and killed all nodes but one. I then rejoined the cluster
with the second node and checked the md5sum of the vdi: it matches the
one calculated before killing the nodes.

dog vdi read test | md5sum
8886bddd205a7698a8194594c76e61b5  -

dog vdi read test | md5sum
8886bddd205a7698a8194594c76e61b5  -

I notice that a lot of INFO and ERROR messages get printed in sheep.log.
My testing environment has only one vdi of 800M. In a real cluster with
terabytes of data, the log would probably become huge.

...
Oct 21 12:09:18 INFO [main] recover_object_main(908) object recovery progress 47%
Oct 21 12:09:18 ERROR [rw 14158] sheep_exec_req(1170) failed Failed to find requested tag, remote address: 192.168.10.5:7000, op name: GET_EPOCH
Oct 21 12:09:18 ALERT [rw 14158] rollback_vnode_info(117) cannot get epoch 0
Oct 21 12:09:18 ALERT [rw 14158] rollback_vnode_info(118) clients may see old data
Oct 21 12:09:18 ERROR [rw 14158] read_erasure_object(230) can not read 7c2b2500000085 idx 0
Oct 21 12:09:18 INFO [main] recover_object_main(908) object recovery progress 48%
Oct 21 12:09:18 ERROR [rw 13514] sheep_exec_req(1170) failed Failed to find requested tag, remote address: 192.168.10.5:7000, op name: GET_EPOCH
Oct 21 12:09:18 ALERT [rw 13514] rollback_vnode_info(117) cannot get epoch 0
Oct 21 12:09:18 ALERT [rw 13514] rollback_vnode_info(118) clients may see old data
Oct 21 12:09:18 ERROR [rw 13514] read_erasure_object(230) can not read 7c2b2500000086 idx 0
Oct 21 12:09:18 ERROR [rw 14158] sheep_exec_req(1170) failed Failed to find requested tag, remote address: 192.168.10.5:7000, op name: GET_EPOCH
Oct 21 12:09:18 ALERT [rw 14158] rollback_vnode_info(117) cannot get epoch 0
Oct 21 12:09:18 ALERT [rw 14158] rollback_vnode_info(118) clients may see old data
Oct 21 12:09:18 ERROR [rw 14158] read_erasure_object(230) can not read 7c2b2500000088 idx 1
Oct 21 12:09:18 INFO [main] recover_object_main(908) object recovery progress 50%
...
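
To get a feel for which messages dominate sheep.log during recovery, one could count entries per log level and function. The following is only a rough sketch, assuming every line follows the layout seen in the excerpt (date, time, level, [thread], function(line), message). The names LINE_RE, summarize and SAMPLE are made up for illustration and are not part of sheepdog.

```python
import re
from collections import Counter

# Assumed layout, taken from the sheep.log excerpt in this mail:
#   "Oct 21 12:09:18 ERROR [rw 14158] sheep_exec_req(1170) failed ..."
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+"   # date and time
    r"(?P<level>[A-Z]+)\s+"                  # INFO, ERROR, ALERT, ...
    r"\[[^\]]+\]\s+"                         # thread tag, e.g. [main] or [rw 14158]
    r"(?P<func>\w+)\(\d+\)"                  # function name and source line
)


def summarize(lines):
    """Count log entries per (level, function); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("level"), match.group("func"))] += 1
    return counts


# A few lines copied from the excerpt, used here as sample input.
SAMPLE = [
    "Oct 21 12:09:18 INFO [main] recover_object_main(908) object recovery progress 47%",
    "Oct 21 12:09:18 ERROR [rw 14158] sheep_exec_req(1170) failed Failed to find "
    "requested tag, remote address: 192.168.10.5:7000, op name: GET_EPOCH",
    "Oct 21 12:09:18 ALERT [rw 14158] rollback_vnode_info(117) cannot get epoch 0",
    "Oct 21 12:09:18 ALERT [rw 14158] rollback_vnode_info(118) clients may see old data",
    "Oct 21 12:09:18 ERROR [rw 14158] read_erasure_object(230) can not read 7c2b2500000085 idx 0",
]

if __name__ == "__main__":
    # Print the most frequent (level, function) pairs first.
    for (level, func), n in summarize(SAMPLE).most_common():
        print(f"{n:6d}  {level:5s}  {func}")
```

To run it against a real log, pass an open file object instead of SAMPLE, e.g. summarize(open(path)) with path pointing at the sheep.log of the node in question.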

--
sheepdog mailing list
[email protected]
http://lists.wpkg.org/mailman/listinfo/sheepdog
