Are the new OSDs running 0.94.5, or did they get the latest .6 packages? Are
you also using cache tiering? We ran into a problem with individual RBD
objects getting corrupted when using 0.94.6 with a cache tier
and min_read_recency_for_promote > 1. Our only remedy for corruption
that had already happened was to restore from backup.
Setting min_read_recency_for_promote to 1, or making sure the OSDs were
running .5, was sufficient to prevent it from happening, though we currently
do both.
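
If it helps, commands along these lines can be used to check both
conditions (the pool name "cache-pool" below is just a placeholder for
whatever your cache tier pool is actually called):

    # report the version each OSD is actually running
    ceph tell osd.* version

    # inspect the current recency setting on the cache pool
    ceph osd pool get cache-pool min_read_recency_for_promote

    # drop it to 1, which was enough to stop the corruption for us
    ceph osd pool set cache-pool min_read_recency_for_promote 1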

Mike

On Fri, Apr 29, 2016 at 9:41 AM, Robert Sander <[email protected]> wrote:

> Hi,
>
> Yesterday we ran into a strange, still unexplained issue with a Hammer
> 0.94.5 storage cluster.
>
> We added OSDs and the cluster started backfilling. Suddenly one of
> the running VMs complained that it had lost a partition in a 2TB RBD.
>
> After resetting the VM it could not boot any more, as the RBD had no
> partition info at the start. :(
>
> It looks like the data in the objects has been changed somehow.
>
> How is that possible? Any ideas?
>
> The VM was restored from a backup, but we would still like to know how
> this happened, and maybe recover some data that was not backed up before
> the crash.
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Mandatory disclosures per §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Managing Director: Peer Heinlein -- Registered office: Berlin