Are the new OSDs running 0.94.5, or did they get the latest .6 packages? Are you also using cache tiering? We ran into a problem with individual RBD objects getting corrupted when using 0.94.6 with a cache tier and min_read_recency_for_promote > 1. Our only remedy for corruption that had already happened was to restore from backup. Setting min_read_recency_for_promote to 1, or making sure the OSDs were running .5, was enough to prevent it from happening, though we currently do both.
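In case it is useful, something along these lines should show the daemon versions and drop the recency setting; "cache-pool" below is just a placeholder for the name of your cache tier pool:

    # report the version each running OSD daemon is built from
    ceph tell osd.* version

    # check the current promote recency on the cache tier pool
    ceph osd pool get cache-pool min_read_recency_for_promote

    # promote on the first read; this was enough to stop the corruption for us
    ceph osd pool set cache-pool min_read_recency_for_promote 1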
Mike

On Fri, Apr 29, 2016 at 9:41 AM, Robert Sander
<[email protected]> wrote:
> Hi,
>
> yesterday we ran into a strange bug / mysterious issue with a Hammer
> 0.94.5 storage cluster.
>
> We added OSDs and the cluster started the backfilling. Suddenly one of
> the running VMs complained that it lost a partition in a 2TB RBD.
>
> After resetting the VM it could not boot any more as the RBD has no
> partition info at the start. :(
>
> It looks like the data in the objects has been changed somehow.
>
> How is that possible? Any ideas?
>
> The VM was restored from a backup but we would still like to know how
> this happened and maybe restore some data that was not backed up before
> the crash.
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Mandatory disclosures per §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Managing director: Peer Heinlein -- Registered office: Berlin
