Hello list,
I have the following cluster:
ceph status
    cluster a2fba9c1-4ca2-46d8-8717-a8e42db14bb0
     health HEALTH_OK
     monmap e2: 5 mons at
            {alxc10=xxxxx:6789/0,alxc11=xxxxx:6789/0,alxc5=xxxxx:6789/0,alxc6=xxxx:6789/0,alxc7=xxxxx:6789/0}
            election epoch 196, quorum 0,1,2,3,4 alxc10,alxc5,alxc6,alxc7,alxc11
     mdsmap e797: 1/1/1 up {0=alxc11.xxxx=up:active}, 2 up:standby
     osdmap e11243: 50 osds: 50 up, 50 in
      pgmap v3563774: 8192 pgs, 3 pools, 1954 GB data, 972 kobjects
            4323 GB used, 85071 GB / 89424 GB avail
                8192 active+clean
  client io 168 MB/s rd, 11629 kB/s wr, 3447 op/s
It's running ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432) and
kernel 4.4.14
I have multiple rbd devices which are used as the root filesystems (ext4) for
lxc-based containers. At some point I want to create an rbd snapshot; the
sequence of operations I do is:
1. fsfreeze -f /path/to/where/ext4-ontop-of-rbd-is-mounted
2. rbd snap create "${CEPH_POOL_NAME}/${name-of-blockdev}@${name-of-snapshot}"
3. fsfreeze -u /path/to/where/ext4-ontop-of-rbd-is-mounted
<= At this point normal container operation continues =>
4. Mount the newly created snapshot to a 2nd location as read-only and rsync
the files from it to a remote server.
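The four steps above can be sketched as a small shell function. Everything
here is illustrative (the function name, pool/image/snapshot names, mount
points, and the /dev/rbd/<pool>/<image>@<snap> udev path are placeholders,
not my actual setup), and the function only prints the commands it would run
so it can be reviewed before use on a real cluster:

```shell
#!/bin/sh
# Sketch of the freeze -> snapshot -> thaw -> backup sequence.
# Each command is echoed instead of executed so the script is safe
# to inspect; drop the leading "echo" to run it for real.
snapshot_and_backup() {
    pool="$1"; image="$2"; snap="$3"; mnt="$4"; remote="$5"

    echo fsfreeze -f "$mnt"                            # 1. quiesce the ext4 fs
    echo rbd snap create "${pool}/${image}@${snap}"    # 2. point-in-time snapshot
    echo fsfreeze -u "$mnt"                            # 3. thaw; container resumes

    # 4. map and mount the snapshot read-only, then rsync it off-host
    echo rbd map "${pool}/${image}@${snap}" --read-only
    echo mount -o ro "/dev/rbd/${pool}/${image}@${snap}" /mnt/snap
    echo rsync -a /mnt/snap/ "${remote}:/backups/${image}/"
}

# Example invocation with made-up names:
snapshot_and_backup rbd container-root nightly /var/lib/lxc/c1/rootfs backuphost
```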
However, as soon as I start rsyncing to the remote server, certain files in
the snapshot are reported as corrupted.
fsfreeze implies a filesystem sync, and I also tested manually doing
sync/syncfs on the filesystem being snapshotted, both before and after the
freeze; the corruption is still present. So it's unlikely there are dirty
buffers left in the page cache.
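For reference, the manual flush I tried looks like this (the flush_fs name
and the mount-point argument are placeholders; `sync -f`, which issues
syncfs(2) for just the one filesystem, needs coreutils >= 8.24):

```shell
#!/bin/sh
# Explicitly flush dirty pages for the filesystem about to be snapshotted.
flush_fs() {
    mnt="$1"
    sync               # sync(2): flush every mounted filesystem
    sync -f "$mnt"     # syncfs(2): flush only the fs containing $mnt
    echo "flushed $mnt"
}

# Example: flush the root filesystem.
flush_fs /
```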
I'm using the kernel rbd driver on the clients. My current theory is that
some cache other than the Linux page cache is not being flushed. Reading the
documentation implies that only librbd does its own separate caching, but
I'm not using librbd.
Any ideas would be much appreciated.
Regards,
Nikolay
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com