Re: [ceph-users] Object lost
Hi Jason, I've been able to rebuild some of the images, but all of them are corrupted at this time; your procedure appears OK, though. Thanks!

2016-09-22 15:07 GMT+02:00 Jason Dillaman :
> You can do something like the following:
>
> # create a sparse file the size of your image
> $ dd if=/dev/zero of=rbd_export bs=1 count=0 seek=
>
> # import the data blocks
> $ POOL=images
> $ PREFIX=rbd_data.1014109cf92e
> $ BLOCK_SIZE=512
> $ for x in $(rados --pool ${POOL} ls | grep ${PREFIX} | sort) ; do
>     rm -rf tmp_object
>     rados --pool ${POOL} get $x tmp_object
>     SUFFIX=0x$(echo ${x} | cut -d. -f3)
>     OFFSET=$((SUFFIX * 0x400000 / BLOCK_SIZE))
>     echo ${x} @ ${OFFSET}
>     dd conv=notrunc if=tmp_object of=rbd_export seek=${OFFSET} bs=${BLOCK_SIZE}
> done
>
> On Thu, Sep 22, 2016 at 5:27 AM, Fran Barrera wrote:
> > Hi Jason,
> >
> > I've followed your steps and now I can list all available data blocks of my
> > image, but I don't know how to rebuild a sparse image. I found this script
> > (https://raw.githubusercontent.com/smmoore/ceph/master/rbd_restore.sh) and
> > https://www.sebastien-han.fr/blog/2015/01/29/ceph-recover-a-rbd-image-from-a-dead-cluster/
> > but I don't know if this can help me.
> >
> > Any suggestions?
> >
> > Thanks.
> >
> > 2016-09-21 22:35 GMT+02:00 Jason Dillaman :
> >>
> >> Unfortunately, it sounds like the image's header object was lost
> >> during your corruption event. While you can manually retrieve the
> >> image data blocks from the cluster, undoubtedly many might be lost
> >> and/or corrupted as well.
> >>
> >> You'll first need to determine the internal id of your image:
> >> $ rados --pool images getomapval rbd_directory name_07e54256-d123-4e61-a23a-7f8008340751
> >> value (16 bytes) :
> >> 0c 00 00 00 31 30 31 34 31 30 39 63 66 39 32 65   |1014109cf92e|
> >> 0010
> >>
> >> In my example above, the image id (1014109cf92e in this case) is the
> >> string starting after the first four bytes (the id length). I can then
> >> use the rados tool to list all available data blocks:
> >>
> >> $ rados --pool images ls | grep rbd_data.1014109cf92e | sort
> >> rbd_data.1014109cf92e.
> >> rbd_data.1014109cf92e.000b
> >> rbd_data.1014109cf92e.0010
> >>
> >> The sequence of hex numbers at the end of each data object is the
> >> object number, and it represents the byte offset within the image
> >> (4MB * object number = byte offset, assuming the default 4MB object
> >> size and no fancy striping enabled).
> >>
> >> You should be able to script something up to rebuild a sparse image
> >> with whatever data is still available in your cluster.
> >>
> >> On Wed, Sep 21, 2016 at 11:12 AM, Fran Barrera wrote:
> >> > Hello,
> >> >
> >> > I have a Ceph Jewel cluster with 4 OSDs and only one monitor,
> >> > integrated with Openstack Mitaka.
> >> >
> >> > Two OSDs were down; with a service restart one of them was recovered.
> >> > The cluster began to recover and was OK. Finally the disk of the other
> >> > OSD was corrupted, and the solution was to format it and recreate the
> >> > OSD.
> >> >
> >> > Now the cluster is OK, but the problem now is with some of the images
> >> > stored in Ceph.
> >> >
> >> > $ rbd list -p images | grep 07e54256-d123-4e61-a23a-7f8008340751
> >> > 07e54256-d123-4e61-a23a-7f8008340751
> >> >
> >> > $ rbd export -p images 07e54256-d123-4e61-a23a-7f8008340751 /tmp/image.img
> >> > 2016-09-21 17:07:00.889379 7f51f9520700 -1 librbd::image::OpenRequest:
> >> > failed to retreive immutable metadata: (2) No such file or directory
> >> > rbd: error opening image 07e54256-d123-4e61-a23a-7f8008340751: (2) No
> >> > such file or directory
> >> >
> >> > Ceph can list the image but nothing more, for example an export, so
> >> > Openstack can not retrieve this image. I tried to repair the PG but it
> >> > appears OK. Is there any solution for this?
> >> >
> >> > Kind Regards,
> >> > Fran.
> >>
> >> --
> >> Jason
>
> --
> Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
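For reference, the id-extraction step Jason describes (the first four bytes of the omap value are a little-endian length, the rest is the ASCII image id) can be reproduced with standard tools. This is only a sketch using the sample value bytes from the thread; the `omap_value` file name is made up, and `od -t u4` assumes a little-endian host:

```shell
# Recreate the 16-byte omap value shown above:
# 0x0c 0x00 0x00 0x00 (length = 12, little-endian) followed by "1014109cf92e".
printf '\014\000\000\000%s' 1014109cf92e > omap_value

# First 4 bytes read as an unsigned 32-bit integer -> the id length.
len=$(od -An -t u4 -N4 omap_value | tr -d ' ')

# The id itself starts at byte 4 and runs for ${len} bytes.
id=$(dd if=omap_value bs=1 skip=4 count="${len}" 2>/dev/null)
echo "image id: ${id}"     # prints: image id: 1014109cf92e
```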
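The object-number-to-offset rule from the thread ("4MB * object number = byte offset") can be checked in isolation. A small sketch, using hypothetical full-length object names (the listing above shows them truncated) and the default 4 MiB (0x400000) object size:

```shell
# The hex suffix of each data object name is its object number; multiplying by
# the 4 MiB default object size gives that object's byte offset in the image.
for name in rbd_data.1014109cf92e.000000000000000b \
            rbd_data.1014109cf92e.0000000000000010 ; do
  suffix=0x$(echo "${name}" | cut -d. -f3)
  offset=$((suffix * 0x400000))      # 0x400000 bytes = 4 MiB
  echo "${name} -> byte offset ${offset}"
done
# prints:
# rbd_data.1014109cf92e.000000000000000b -> byte offset 46137344
# rbd_data.1014109cf92e.0000000000000010 -> byte offset 67108864
```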
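The rebuild loop itself can be exercised at toy scale without a cluster. A sketch, assuming a made-up 16 MiB image size and a fake payload standing in for `rados get`; the offset arithmetic is the same as in the loop in the thread, with the 4 MiB object size written out as 0x400000:

```shell
BLOCK_SIZE=512
IMAGE_SIZE=$((16 * 1024 * 1024))    # assumed toy size; use the real image size

# Sparse destination file: nothing is written, only the logical size is set.
dd if=/dev/zero of=rbd_export bs=1 count=0 seek="${IMAGE_SIZE}" 2>/dev/null

# Fake data object at object number 0x2 (stands in for `rados get`).
printf 'fake object payload' > tmp_object
SUFFIX=0x0000000000000002
OFFSET=$((SUFFIX * 0x400000 / BLOCK_SIZE))   # offset in 512-byte sectors
dd conv=notrunc if=tmp_object of=rbd_export seek="${OFFSET}" bs="${BLOCK_SIZE}" 2>/dev/null

# The payload now sits at byte 8388608 (2 * 4 MiB) inside the sparse image,
# while the file's logical size is still 16 MiB.
dd if=rbd_export bs=1 skip=$((2 * 0x400000)) count=19 2>/dev/null; echo
```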