Re: [ceph-users] Help recovering failed cluster

2016-06-12 Thread Anthony D'Atri
> Current cluster health:
> cluster 537a3e12-95d8-48c3-9e82-91abbfdf62e0
>  health HEALTH_WARN
>     5 pgs degraded
>     8 pgs down
>     48 pgs incomplete
>     3 pgs recovering
>     1 pgs recovery_wait
>     76 pgs stale
>     5 pgs stuck degraded
>     48 pgs stuck inactive
>     76 pgs stuck stale
>     53 pgs stuck unclean
>     5 pgs stuck undersized
>     5 pgs undersized



First I have to remark on you having 7 mons.  Your cluster is very small - many 
clusters with hundreds of OSDs are happy with 5.  At the Vancouver OpenStack 
summit there was a discussion on the number of mons; the consensus was that 5 is 
generally plenty and that with 7 or more the traffic among them really starts to 
become excessive.  YMMV of course.
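
If you do decide to drop back to 5 mons, the usual procedure is roughly the 
following (the mon name 6 is just an example taken from your monmap - check 
"ceph mon dump" first, and remove only one mon at a time while the rest are 
healthy and in quorum):

   service ceph stop mon.6        # or: systemctl stop ceph-mon@6
   ceph mon remove 6              # removes it from the monmap
   # then archive or delete /var/lib/ceph/mon/ceph-6 and update ceph.conf / mon_host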

Assuming you have size on your pools set to 3 and min_size set to 2, this might 
be one of those times where temporarily setting min_size on the pools to 1 does 
the trick or at least helps.  I suspect in your case it wouldn’t completely 
heal the cluster but it might improve it and allow recovery to proceed.  Later 
you'd revert to the usual setting, since running with min_size 1 leaves you only 
one more OSD failure away from losing data.
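
Roughly, and assuming a pool named rbd (a placeholder - substitute your actual 
pool names), the sequence would be something like:

   ceph osd pool get rbd min_size      # note the current value first
   ceph osd pool set rbd min_size 1    # let PGs go active with a single replica
   # watch "ceph -w" / "ceph health detail" while recovery proceeds
   ceph osd pool set rbd min_size 2    # revert once the affected PGs are active+clean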

— Anthony
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Help recovering failed cluster

2016-06-10 Thread John Blackwood
Had a little bit of help in IRC and was asked to attach the OSD tree, health 
detail, and crush map. The PG dump is included at the link below - it was too 
big to attach directly.

https://drive.google.com/open?id=0B3Dsc6YwKik_T0NPZm1oYmdLT0k
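
For anyone following along, that information can be gathered with the standard 
Ceph CLI, along these lines (crushtool is used to decompile the binary crush 
map):

   ceph osd tree > osd_tree.txt
   ceph health detail > health_detail.txt
   ceph pg dump > pg_dump.txt
   ceph osd getcrushmap -o crushmap.bin
   crushtool -d crushmap.bin -o crushmap.txt   # decompile to readable form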



-- 
JOHN BLACKWOOD

P: 905 444 9166 
F: 905 668 8778
Chief Technical Officer
j...@kaiinnovations.com 
www.kaiinnovations.com 
Ontario Manitoba

> On Jun 10, 2016, at 4:25 PM, John Blackwood  wrote:
> 
> 
> We're looking for some assistance recovering data from a failed ceph cluster; 
> or some help determining if it is even possible to recover any data.
> 
> Background:
> - We were using Ceph with Proxmox following the instructions Proxmox provides 
> (https://pve.proxmox.com/wiki/Ceph_Server), which seem fairly close to the 
> Ceph recommendations except that the storage is on the same physical systems 
> that the virtual machines are running on. 
> - Some of our Proxmox nodes use ZFS, and there is a rare bug where ZFS + 
> Proxmox clustering can result in Proxmox hanging indefinitely.
> - We were using HA on our Proxmox nodes, which means that when they hang, they 
> are rebooted (hard) automatically.
> - Hard reboots are bad for file systems.
> - Hard reboots mean that Ceph tries to recover - meaning more systems hitting 
> the bug, followed by more system restarts and general mayhem.
> 
> We first ran into issues overnight, and at some point during the process one 
> of the file systems on an OSD was corrupted. We managed to stabilize the 
> systems; however, we've not been able to recover the critical data from the 
> pool (about 5-10%). 
> 
> Current cluster health:
> cluster 537a3e12-95d8-48c3-9e82-91abbfdf62e0
>  health HEALTH_WARN
> 5 pgs degraded
> 8 pgs down
> 48 pgs incomplete
> 3 pgs recovering
> 1 pgs recovery_wait
> 76 pgs stale
> 5 pgs stuck degraded
> 48 pgs stuck inactive
> 76 pgs stuck stale
> 53 pgs stuck unclean
> 5 pgs stuck undersized
> 5 pgs undersized
> 74 requests are blocked > 32 sec
> recovery 14656/6951979 objects degraded (0.211%)
> recovery 20585/6951979 objects misplaced (0.296%)
> recovery 5/3348270 unfound (0.000%)
>  monmap e7: 7 mons at 
> {0=10.11.0.126:6789/0,1=10.11.0.125:6789/0,2=10.11.0.124:6789/0,3=10.11.0.123:6789/0,4=10.11.0.122:6789/0,5=10.11.0.119:6789/0,6=10.11.0.121:6789/0}
> election epoch 482, quorum 0,1,2,3,4,5,6 5,6,4,3,2,1,0
>  osdmap e15746: 16 osds: 16 up, 16 in; 5 remapped pgs
>   pgmap v10200890: 3072 pgs, 3 pools, 12914 GB data, 3269 kobjects
> 26923 GB used, 23327 GB / 50250 GB avail
> 14656/6951979 objects degraded (0.211%)
> 20585/6951979 objects misplaced (0.296%)
> 5/3348270 unfound (0.000%)
> 2943 active+clean
>   76 stale+active+clean
>   40 incomplete
>    8 down+incomplete
>    3 active+recovering+undersized+degraded+remapped
>    1 active+recovery_wait+undersized+degraded+remapped
>    1 active+undersized+degraded+remapped
> 
> There are two RBDs which we are looking to recover (out of about 130), 
> totalling about 200GB of data. Those RBDs do not appear to be using any of 
> the PGs which are incomplete or down, but they do seem to use ones which are 
> stale+active+clean, so if we read from the mapped RBD it blocks indefinitely.
> 
> We were looking at http://ceph.com/community/incomplete-pgs-oh-my/ as a means 
> of recovering the incomplete PGs, since it does seem that the complete copies 
> of those PGs are on the corrupted OSD, and most or all were able to be 
> exported without issue; however, I'm not sure if this is the correct way to 
> go or if I should be looking at something else. 
> 
> -- 
> JOHN BLACKWOOD
> 
> P: 905 444 9166 
> F: 905 668 8778
> Chief Technical Officer
> j...@kaiinnovations.com 
> www.kaiinnovations.com 
> Ontario Manitoba
> 
> DISCLAIMER: This email and any files transmitted with it are confidential and 
> intended solely for the use of the individual or entity to whom they are 
> addressed. If you have received it by mistake, please let us know by email 
> reply and delete it from your system; you should not disseminate, distribute 
> or copy this email.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Help recovering failed cluster

2016-06-10 Thread John Blackwood

We're looking for some assistance recovering data from a failed ceph cluster; 
or some help determining if it is even possible to recover any data.

Background:
- We were using Ceph with Proxmox following the instructions Proxmox provides 
(https://pve.proxmox.com/wiki/Ceph_Server), which seem fairly close to the 
Ceph recommendations except that the storage is on the same physical systems 
that the virtual machines are running on. 
- Some of our Proxmox nodes use ZFS, and there is a rare bug where ZFS + 
Proxmox clustering can result in Proxmox hanging indefinitely.
- We were using HA on our Proxmox nodes, which means that when they hang, they 
are rebooted (hard) automatically.
- Hard reboots are bad for file systems.
- Hard reboots mean that Ceph tries to recover - meaning more systems hitting 
the bug, followed by more system restarts and general mayhem.

We first ran into issues overnight, and at some point during the process one of 
the file systems on an OSD was corrupted. We managed to stabilize the systems; 
however, we've not been able to recover the critical data from the pool (about 
5-10%). 

Current cluster health:
cluster 537a3e12-95d8-48c3-9e82-91abbfdf62e0
 health HEALTH_WARN
5 pgs degraded
8 pgs down
48 pgs incomplete
3 pgs recovering
1 pgs recovery_wait
76 pgs stale
5 pgs stuck degraded
48 pgs stuck inactive
76 pgs stuck stale
53 pgs stuck unclean
5 pgs stuck undersized
5 pgs undersized
74 requests are blocked > 32 sec
recovery 14656/6951979 objects degraded (0.211%)
recovery 20585/6951979 objects misplaced (0.296%)
recovery 5/3348270 unfound (0.000%)
 monmap e7: 7 mons at 
{0=10.11.0.126:6789/0,1=10.11.0.125:6789/0,2=10.11.0.124:6789/0,3=10.11.0.123:6789/0,4=10.11.0.122:6789/0,5=10.11.0.119:6789/0,6=10.11.0.121:6789/0}
election epoch 482, quorum 0,1,2,3,4,5,6 5,6,4,3,2,1,0
 osdmap e15746: 16 osds: 16 up, 16 in; 5 remapped pgs
  pgmap v10200890: 3072 pgs, 3 pools, 12914 GB data, 3269 kobjects
26923 GB used, 23327 GB / 50250 GB avail
14656/6951979 objects degraded (0.211%)
20585/6951979 objects misplaced (0.296%)
5/3348270 unfound (0.000%)
2943 active+clean
  76 stale+active+clean
  40 incomplete
   8 down+incomplete
   3 active+recovering+undersized+degraded+remapped
   1 active+recovery_wait+undersized+degraded+remapped
   1 active+undersized+degraded+remapped

There are two RBDs which we are looking to recover (out of about 130), 
totalling about 200GB of data. Those RBDs do not appear to be using any of the 
PGs which are incomplete or down, but they do seem to use ones which are 
stale+active+clean, so if we read from the mapped RBD it blocks indefinitely.
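
For what it's worth, one rough way to check which PGs a given image touches is 
the following (the pool and image names are placeholders, and note that listing 
the pool's objects can itself block if it touches an inactive PG):

   rbd info rbd/vm-100-disk-1 | grep block_name_prefix
   # e.g. block_name_prefix: rbd_data.1a2b3c4d5e6f
   rados -p rbd ls | grep rbd_data.1a2b3c4d5e6f > objects.txt
   while read obj; do ceph osd map rbd "$obj"; done < objects.txt
   # each output line shows the PG and acting OSDs for that object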

We were looking at http://ceph.com/community/incomplete-pgs-oh-my/ as a means of 
recovering the incomplete PGs, since it does seem that the complete copies of 
those PGs are on the corrupted OSD, and most or all were able to be exported 
without issue; however, I'm not sure if this is the correct way to go or if I 
should be looking at something else. 
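
For context, the core of that approach is roughly the following (a sketch only - 
the OSD id, PG id, and paths are placeholders, and the OSD has to be stopped 
before running the tool):

   service ceph stop osd.3           # or: systemctl stop ceph-osd@3
   ceph-objectstore-tool --op export --pgid 4.ff \
       --data-path /var/lib/ceph/osd/ceph-3 \
       --journal-path /var/lib/ceph/osd/ceph-3/journal \
       --file /root/4.ff.export
   # the export can later be imported into a spare OSD with --op import,
   # after which Ceph can backfill the PG back to its proper home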

-- 
JOHN BLACKWOOD

P: 905 444 9166 
F: 905 668 8778
Chief Technical Officer
j...@kaiinnovations.com 
www.kaiinnovations.com 
Ontario Manitoba

-- 
DISCLAIMER: This email and any files transmitted with it are confidential 
and intended solely for the use of the individual or entity to whom they 
are addressed. If you have received it by mistake, please let us know by 
email reply and delete it from your system; you should not disseminate, 
distribute or copy this email.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com