Hello community, I need help fixing a long-standing Ceph problem. The cluster is unhealthy and multiple OSDs are down. When I try to restart the OSDs, I get this error:
2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)
Environment: 4 nodes, OSD + monitor, latest Firefly, CentOS 6.5, kernel 3.17.2-1.el6.elrepo.x86_64
I tried upgrading from 0.80.7 to 0.80.8, but no luck.
I tried the CentOS stock kernel 2.6.32, but no luck.
Memory is not a problem: more than 150 GB is free.
Has anyone ever faced this problem?
Cluster status:
cluster 2bd3283d-67ef-4316-8b7e-d8f4747eae33
health HEALTH_WARN 7334 pgs degraded; 1185 pgs down; 1 pgs incomplete; 1735 pgs peering; 8938 pgs stale; 1736 pgs stuck inactive; 8938 pgs stuck stale; 10320 pgs stuck unclean; recovery 6061/31080 objects degraded (19.501%); 111/196 in osds are down; clock skew detected on mon.pouta-s02, mon.pouta-s03
monmap e3: 3 mons at {pouta-s01=10.XXX.50.1:6789/0,pouta-s02=10.XXX.50.2:6789/0,pouta-s03=10.XXX.50.3:6789/0}, election epoch 1312, quorum 0,1,2 pouta-s01,pouta-s02,pouta-s03
osdmap e26633: 239 osds: 85 up, 196 in
pgmap v60389: 17408 pgs, 13 pools, 42345 MB data, 10360 objects
4699 GB used, 707 TB / 711 TB avail
6061/31080 objects degraded (19.501%)
14 down+remapped+peering
39 active
3289 active+clean
547 peering
663 stale+down+peering
705 stale+active+remapped
1 active+degraded+remapped
1 stale+down+incomplete
484 down+peering
455 active+remapped
3696 stale+active+degraded
4 remapped+peering
23 stale+down+remapped+peering
51 stale+active
3637 active+degraded
3799 stale+active+clean
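To cross-check the health summary against the per-state counts above, here is a minimal Python sketch that adds up the PGs carrying each flag. The counts are copied from the pgmap output above; `tally_flags` is just a hypothetical helper name, not part of any Ceph tool.

```python
from collections import Counter

# Per-state PG counts, copied verbatim from the pgmap output above.
PG_STATES = """\
14 down+remapped+peering
39 active
3289 active+clean
547 peering
663 stale+down+peering
705 stale+active+remapped
1 active+degraded+remapped
1 stale+down+incomplete
484 down+peering
455 active+remapped
3696 stale+active+degraded
4 remapped+peering
23 stale+down+remapped+peering
51 stale+active
3637 active+degraded
3799 stale+active+clean
"""


def tally_flags(text):
    """Return (total PGs, Counter of PGs per individual state flag)."""
    totals = Counter()
    pg_total = 0
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip anything that is not "<count> <state>"
        count, state = int(parts[0]), parts[1]
        pg_total += count
        # A combined state such as "stale+down+peering" counts toward each flag.
        for flag in state.split("+"):
            totals[flag] += count
    return pg_total, totals


pg_total, flags = tally_flags(PG_STATES)
print("total pgs:", pg_total)
for flag, count in flags.most_common():
    print(f"{count:6d} {flag}")
```

The totals agree with the HEALTH_WARN line: 17408 PGs overall, with 8938 stale, 7334 degraded, 1735 peering and 1185 down.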
OSD logs:
2015-03-09 12:22:16.312774 7f760dac9700 -1 common/Thread.cc: In function 'void Thread::create(size_t)' thread 7f760dac9700 time 2015-03-09 12:22:16.311970
common/Thread.cc: 129: FAILED assert(ret == 0)
ceph version 0.80.8 (69eaad7f8308f21573c604f121956e64679a52a7)
1: (Thread::create(unsigned long)+0x8a) [0xaf41da]
2: (SimpleMessenger::add_accept_pipe(int)+0x6a) [0xae84fa]
3: (Accepter::entry()+0x265) [0xb5c635]
4: /lib64/libpthread.so.0() [0x3c8a6079d1]
5: (clone()+0x6d) [0x3c8a2e89dd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
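The backtrace shows the assert firing in Thread::create, called from SimpleMessenger::add_accept_pipe, i.e. while the OSD is starting a new thread. One thing that may be worth ruling out, and this is only an assumption and not a confirmed cause, is a host-wide thread or PID limit. The rough Python sketch below compares the number of threads currently running on a node with `kernel.pid_max` and `kernel.threads-max` as exposed under `/proc/sys/kernel`. The helper names (`read_int`, `count_threads`, `headroom`) are hypothetical.

```python
import os


def read_int(path):
    """Return the integer stored in a /proc file, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def count_threads(proc="/proc"):
    """Sum the 'Threads:' field over every process visible under /proc."""
    total = 0
    if not os.path.isdir(proc):
        return total
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc, entry, "status")) as f:
                for line in f:
                    if line.startswith("Threads:"):
                        total += int(line.split()[1])
                        break
        except (OSError, ValueError, IndexError):
            continue  # process exited or is unreadable; skip it
    return total


def headroom(used, limits):
    """Threads left before the tightest known limit; None if no limit is known."""
    known = [value for value in limits.values() if value is not None]
    return min(known) - used if known else None


if __name__ == "__main__":
    limits = {
        "pid_max": read_int("/proc/sys/kernel/pid_max"),
        "threads_max": read_int("/proc/sys/kernel/threads-max"),
    }
    used = count_threads()
    print("threads in use:", used)
    print("limits:", limits)
    print("headroom:", headroom(used, limits))
```

If the headroom on an OSD node is small or negative while OSDs are starting, the limits are a plausible suspect. If it is large, this particular explanation can probably be set aside.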
More information is available in the Ceph tracker issue:
http://tracker.ceph.com/issues/10988#change-49018
****************************************************************
Karan Singh
Systems Specialist , Storage Platforms
CSC - IT Center for Science,
Keilaranta 14, P. O. Box 405, FIN-02101 Espoo, Finland
mobile: +358 503 812758
tel. +358 9 4572001
fax +358 9 4572302
http://www.csc.fi/
****************************************************************
