After 4 months of testing we decided to go live and store real VDIs in production.
However, on that very same day something suddenly went wrong.

The last copy of the VDI in Ceph was corrupted.
Fixing the filesystem made it possible to open it again, but mysqld never came back online, not even after reinstalling it; it only started after forcing InnoDB recovery, and by then too many databases were corrupted. Finally, after the last reboot, the VDI became inaccessible and no longer starts.
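To be clear, by "forcing InnoDB recovery" I mean starting mysqld with forced recovery and raising the level step by step, roughly like this (the exact levels shown are only illustrative):

   # raise one level at a time; higher levels skip more of the damaged data
   # and are only meant for dumping out whatever is still readable
   mysqld --innodb-force-recovery=1
   # ...
   mysqld --innodb-force-recovery=6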

At the same time, 3 other VDIs got stuck and became inaccessible; any attempt to access them returns this error:
cannot open /dev/xvdb: Input/output error
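Every kind of access to the device from inside the guest fails that way; for example, commands along these lines (purely illustrative) hit it:

   fdisk -l /dev/xvdb
   dd if=/dev/xvdb of=/dev/null bs=1M count=1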

If I run ceph -s from the Dom0, everything seems all right:

        cluster 33693979-3177-44f4-bcb8-1b3bbc658352
         health HEALTH_OK
         monmap e2: 2 mons at 
{ceph-node1=xxxxxx:6789/0,ceph-node2=yyyyyy:6789/0}
                election epoch 218, quorum 0,1 ceph-node1,ceph-node2
         osdmap e2560: 8 osds: 8 up, 8 in
                flags sortbitwise,require_jewel_osds
          pgmap v8311677: 484 pgs, 3 pools, 940 GB data, 444 kobjects
                1883 GB used, 5525 GB / 7408 GB avail
                     484 active+clean
      client io 482 kB/s wr, 0 op/s rd, 5 op/s wr

Everything seems OK.
I can also see the pool and the RBD images:

   NAME                                                                                  SIZE PARENT FMT PROT LOCK
   VHD-1649fde4-6637-43d8-b815-656e4080887d                                           102400M          2
   VHD-1bcd9d72-d8fe-4fc2-ad6a-ce84b18e4340                                           102400M          2
   VHD-239ea931-5ddd-4aaf-bc89-b192641f6dcf                                              200G          2
   VHD-3e16395d-7dad-4680-a7ad-7f398da7fd9e                                              200G          2
   VHD-41a76fe7-c9ff-4082-adb4-43f3120a9106                                           102400M          2
   VHD-41a76fe7-c9ff-4082-adb4-43f3120a9106@SNAP-346d01f4-cbd4-4e8a-af8f-473c2de3c60d 102400M          2 yes
   VHD-48fdb12d-110b-419c-9330-0f05827fe41e                                           102400M          2
   VHD-602b05be-395d-442e-bd68-7742deaf97bd                                              200G          2
   VHD-691e4fb4-5f7b-4bc1-af7b-98780d799067                                           102400M          2
   VHD-6da2154e-06fd-4063-8af5-ae86ae61df50                                            10240M          2      excl
   VHD-8631ab86-c85c-407b-9e15-bd86e830ba74                                              200G          2
   VHD-97cbc3d2-519c-4d11-b795-7a71179e193c                                           102400M          2
   VHD-97cbc3d2-519c-4d11-b795-7a71179e193c@SNAP-ad9ab028-57d9-48e4-9d13-a9a02b06e68a 102400M          2 yes
   VHD-acb2a9b0-e98d-474e-aa42-ed4e5534ddbe                                           102400M          2      excl
   VHD-adaa639e-a7a4-4723-9c18-0c2b3ab9d99f                                           102400M          2
   VHD-bb0f40c7-206b-4023-b56f-4a70d02d5a58                                              200G          2
   VHD-bb0f40c7-206b-4023-b56f-4a70d02d5a58@SNAP-f8b1cc40-4a9b-4550-8bfa-7fad467e726b    200G          2 yes
   VHD-c8aca7bd-1e37-4af4-b642-f267602e210f                                           102400M          2
   VHD-c8aca7bd-1e37-4af4-b642-f267602e210f@SNAP-0e830e68-c7b4-49a9-8210-4c0b4ecca762 102400M          2 yes
   VHD-cf2139ac-b1c4-404d-87da-db8f992a3e72                                           102400M          2
   VHD-cf2139ac-b1c4-404d-87da-db8f992a3e72@SNAP-4fb59ee5-3256-4e75-9d48-facdd818d754 102400M          2 yes

However, they are unbootable and unreadable, even when attached as a secondary drive.
The hypervisor refuses to load them into the VMs.

What can I do?
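For instance, would a raw read of the images from the Dom0, along these lines, be a sensible way to check whether they are at least still readable at the RBD level? (Just a sketch: the pool name "rbd" is a guess, and the image name is one of those listed above.)

   # export the whole image and discard it, just to see whether every object reads
   rbd export rbd/VHD-1649fde4-6637-43d8-b815-656e4080887d - > /dev/null

   # or map it and read it raw (the /dev/rbd0 device name may differ)
   rbd map rbd/VHD-1649fde4-6637-43d8-b815-656e4080887d
   dd if=/dev/rbd0 of=/dev/null bs=4M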

