Hmm, perhaps there's a secondary bug. Can you send the output from strace, i.e. strace.log, after running:

    cat snap1.diff | strace -f -o strace.log rbd merge-diff - snap2.diff combined.diff

for a case where it fails?

Josh

On 12/09/2015 08:38 PM, Alex Gorbachev wrote:

More oddity: retrying several times, the merge-diff sometimes works and sometimes does not, using the same source files.

On Wed, Dec 9, 2015 at 10:15 PM, Alex Gorbachev wrote:

Hi Josh, looks like I celebrated too soon:

On Wed, Dec 9, 2015 at 2:25 PM, Josh Durgin wrote:

This is the problem:

http://tracker.ceph.com/issues/14030

As a workaround, you can pass the first diff in via stdin, e.g.:

    cat snap1.diff | rbd merge-diff - snap2.diff combined.diff

one test worked - merging the initial full export (export-diff with just one snapshot) but the second one failed (merging two incremental diffs):

    root@lab2-b1:/data/volume1# cat scrun1-120720151502.bck | rbd merge-diff - scrun1-120720151504.bck scrun1-part04.bck
    Merging image diff: 13% complete...failed.
    rbd: merge-diff error

I am not sure how to run gdb in such scenario with stdin/stdout

Thanks,
Alex

Josh

On 12/08/2015 11:11 PM, Josh Durgin wrote:

On 12/08/2015 10:44 PM, Alex Gorbachev wrote:

Hi Josh,

On Mon, Dec 7, 2015 at 6:50 PM, Josh Durgin wrote:

On 12/07/2015 03:29 PM, Alex Gorbachev wrote:

When trying to merge two results of rbd export-diff, the following error occurs:

    iss@lab2-b1:~$ rbd export-diff --from-snap autosnap120720151500 spin1/scrun1@autosnap120720151502 /data/volume1/scrun1-120720151502.bck
    iss@lab2-b1:~$ rbd export-diff --from-snap autosnap120720151504 spin1/scrun1@autosnap120720151504 /data/volume1/scrun1-120720151504.bck
    iss@lab2-b1:~$ rbd merge-diff /data/volume1/scrun1-120720151502.bck /data/volume1/scrun1-120720151504.bck /data/volume1/mrg-scrun1-0204.bck
    Merging image diff: 11% complete...failed.
    rbd: merge-diff error

That's all the output and I have found this link http://tracker.ceph.com/issues/12911 but not sure if the patch should have already been in hammer or how to get it?

That patch fixed a bug that was only present after hammer, due to parallelizing export-diff. You're likely seeing a different (possibly new) issue.

Unfortunately there's not much output we can enable for export-diff in hammer. Could you try running the command via gdb to figure out where and why it's failing? Make sure you have librbd-dbg installed, then send the output from gdb doing:

    gdb --args rbd merge-diff /data/volume1/scrun1-120720151502.bck \
        /data/volume1/scrun1-120720151504.bck /data/volume1/mrg-scrun1-0204.bck
    break rbd.cc:1931
    break rbd.cc:1935
    break rbd.cc:1967
    break rbd.cc:1985
    break rbd.cc:1999
    break rbd.cc:2008
    break rbd.cc:2021
    break rbd.cc:2053
    break rbd.cc:2098
    run
    # (it will run now, stopping when it hits the error)
    info locals

Will do - how does one load librbd-dbg? I have the following on the system:

    librbd-dev - RADOS block device client library (development files)
    librbd1-dbg - debugging symbols for librbd1

Is librbd1-dbg sufficient?

Yes, I just forgot the 1 in the package name.

Also a question - the merge-diff really stitches the two diff files together, not really merges, correct? For example, in the following workflow:

    export-diff from full image - 10 GB
    export-diff from snap1 - 2 GB
    export-diff from snap2 - 1 GB

My resulting merge export file would be 13 GB, correct?

It does merge overlapping sections, i.e. part of snap1 that was overwritten in snap2, so the merged diff may be smaller than the original two.

Josh
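
Since the failure is intermittent with the same source files, it may be easier to loop the traced command until one run fails and keep only that run's strace log. What follows is a minimal sketch in POSIX sh, not a tested procedure. It assumes rbd exits non-zero when it prints "rbd: merge-diff error" (not confirmed in this thread); strace itself exits with the status of the traced command. The names retry_until_fail, merge_once and ATTEMPT are made up for illustration, and snap1.diff, snap2.diff and combined.diff are the placeholder file names used above.

```shell
# Run a command up to MAX times and stop at the first failure.
# ATTEMPT holds the current attempt number and is exported so the
# command can use it, e.g. to name a per-attempt log file.
# Returns 1 at the first failing attempt, 0 if every attempt succeeded.
retry_until_fail() {
    max=$1
    shift
    ATTEMPT=1
    while [ "$ATTEMPT" -le "$max" ]; do
        export ATTEMPT
        if ! "$@"; then
            echo "attempt $ATTEMPT failed" >&2
            return 1
        fi
        ATTEMPT=$((ATTEMPT + 1))
    done
    return 0
}

# One traced merge attempt (hypothetical helper). The log of a successful
# run is deleted; the log of a failing run is left in strace.$ATTEMPT.log.
# Assumption: rbd returns a non-zero exit status on "merge-diff error".
merge_once() {
    rm -f combined.diff
    cat snap1.diff | strace -f -o "strace.$ATTEMPT.log" \
        rbd merge-diff - snap2.diff combined.diff \
        && rm -f "strace.$ATTEMPT.log"
}

# Usage (not run here): retry_until_fail 20 merge_once
```

If a run fails, ATTEMPT still holds the failing attempt number afterwards, so the log to send is strace.$ATTEMPT.log.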
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
