Hi, everyone.

I read the source code. Could the following happen? A "WRITE" op on OBJECT X 
is followed by a series of ops, the last of which is a "READ" op on the same 
object issued by the "rbd export" command. The "WRITE" op modifies the 
ObjectContext (obc) of OBJECT X to add the new "snap" object, but the modified 
obc is evicted from the SharedLRU cache "object_contexts" before the filestore 
threads write the "WRITE" op to the underlying file system and before the 
"READ" op looks up its obc. If the "READ" op then loads its obc from the 
underlying file system, still before the "WRITE" op has been applied, it would 
get an outdated obc that points to the "HEAD" object of OBJECT X, provided no 
other modification to OBJECT X has been executed since the snapshot was 
created. If this is possible, the result would be an inconsistent snapshot 
view.

Is this correct?
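
To make the interleaving I am asking about concrete, here is a toy model in 
Python. It is not Ceph code: the class ToyObjectStore and all of its methods 
are hypothetical names invented for this sketch, and it only replays the 
ordering described above. It says nothing about whether Ceph actually allows 
the obc to be evicted at that point.

```python
class ToyObjectStore:
    """Toy model of a context cache in front of slower persistent metadata.

    Hypothetical illustration only; none of these names exist in Ceph.
    """

    def __init__(self):
        self.on_disk = {}  # object name -> snapshot clones persisted so far
        self.cache = {}    # object name -> in-memory context (list of clones)
        self.queue = []    # updates accepted but not yet persisted

    def get_context(self, name):
        # On a cache miss, rebuild the context from persisted metadata.
        if name not in self.cache:
            self.cache[name] = list(self.on_disk.get(name, []))
        return self.cache[name]

    def write(self, name, snap_id):
        # First write after a snapshot: record a clone for snap_id in the
        # in-memory context and queue the update for persistence.
        ctx = self.get_context(name)
        ctx.append(snap_id)
        self.queue.append((name, list(ctx)))

    def evict(self, name):
        # Drop the in-memory context, as an LRU cache might.
        self.cache.pop(name, None)

    def flush(self):
        # Persist every queued update.
        for name, clones in self.queue:
            self.on_disk[name] = clones
        self.queue.clear()

    def resolve_snap_read(self, name, snap_id):
        # A read at snap_id is served from the clone if the context knows
        # about it, otherwise from the head object.
        return "clone" if snap_id in self.get_context(name) else "head"


store = ToyObjectStore()
store.write("X", 1)                      # WRITE updates the cached context
store.evict("X")                         # context evicted before the flush
stale = store.resolve_snap_read("X", 1)  # READ reloads old metadata
store.flush()                            # WRITE finally reaches disk
store.evict("X")
fresh = store.resolve_snap_read("X", 1)  # READ now sees the clone
print(stale, fresh)
```

In this model the first read resolves to the head object and the second to 
the clone, which is the inconsistency I am worried about.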

From: ceph-users [mailto:[email protected]] On Behalf Of Zhongyan Gu
Sent: February 20, 2017 18:47
To: ceph-users; Jason Dillaman; [email protected]
Subject: Re: [ceph-users] Rbd export-diff bug? rbd export-diff generates different 
incremental files

Could this be a synchronization issue in which multiple clients access the 
same object: one client (the vm/qemu) is updating the object while another 
client (the ceph rbd export/export-diff execution) is reading the content of 
the same object? How does Ceph ensure consistency in this case?

Zhongyan

On Mon, Feb 20, 2017 at 11:21 AM, Zhongyan Gu <[email protected]> wrote:
BTW, we use the hammer version with the following fix. That issue was also 
reported by us during earlier backup testing.
https://github.com/ceph/ceph/pull/12218/files
librbd: diffs to clone's first snapshot should include parent diffs


Zhongyan

On Mon, Feb 20, 2017 at 11:13 AM, Zhongyan Gu <[email protected]> wrote:

Hi Sage and Jason,
My company is building a backup system based on the rbd export-diff and 
import-diff cmds.
However, in recent tests we found some strange behavior in the export-diff cmd. 
Long story short: sometimes, when we repeatedly execute 
rbd export-diff --from-snap snap1 image@snap2 - | md5sum, md5sum returns 
different values.
The details are:
We use two ceph rbd clusters: A for online vm usage and B for backup usage.
Each vm image in question is cloned from a parent image. Initially our backup 
system does a full backup with the rbd export/import cmds. Then every day we 
do an incremental backup with the rbd export-diff/import-diff cmds.
To make sure the data is consistent, we also compare the md5 of the online vm 
image@snapN with that of the backup vm image@snapN.
Our tests found that for some vm images the md5 check sometimes fails: the 
online vm image@snapN doesn’t match the backup vm image@snapN.
To narrow down the issue, we manually generated the incremental file with 
rbd export-diff between the specific snaps and found that its md5 didn’t match 
that of the file generated by the backup scripts.
Comparing those two binary files, we found only a small difference: some bytes 
are not the same.
I wonder whether this could be an export-diff bug. As far as I know, if we 
create two snaps, the diff between the two snaps should always be the same. So 
why doesn’t export-diff work as expected, and why does it return different 
md5 sums? Is there some corner case that was not well considered, or does 
anyone else have the same experience? BTW, we ran an fio I/O workload in the 
vms for 24 hours during the backup test.
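
For anyone who wants to try to reproduce this, the check described above can 
be sketched in Python roughly as follows. This is a minimal sketch, not our 
actual backup script: the helper names stream_md5, repeat_digests and 
differing_offsets are made up for this example, and the only rbd invocation 
assumed is the export-diff command quoted earlier.

```python
import hashlib
import subprocess


def stream_md5(cmd, chunk_size=1 << 20):
    """Run cmd and return the MD5 hex digest of its stdout, read in chunks."""
    md5 = hashlib.md5()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        for chunk in iter(lambda: proc.stdout.read(chunk_size), b""):
            md5.update(chunk)
    # Leaving the with-block waits for the process, so returncode is set.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
    return md5.hexdigest()


def repeat_digests(cmd, runs=3):
    """Run the same command several times and collect one digest per run."""
    return [stream_md5(cmd) for _ in range(runs)]


def differing_offsets(a, b, limit=20):
    """Return up to `limit` offsets where two byte strings differ.

    Only the overlapping prefix is compared; a length mismatch has to be
    checked separately.
    """
    offsets = []
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            offsets.append(i)
            if len(offsets) == limit:
                break
    return offsets
```

Calling repeat_digests(["rbd", "export-diff", "--from-snap", "snap1", 
"image@snap2", "-"], runs=5) and checking whether the result holds more than 
one distinct value mirrors the md5sum test above. differing_offsets can then 
be applied to the contents of two saved diff files to locate the bytes that 
differ.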
 
Thanks,
Zhongyan


_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
