Hello folks,

Can anybody give me a hint?

Communication and authentication with the mon are OK:

2019-03-25 14:16:25.342 7fa3af260700 1 -- 10.8.33.158:6789/0 <== osd.0 10.8.33.183:6800/293177 184 ==== auth(proto 2 2 bytes epoch 0) v1 ==== 32+0+0 (2260890001 0 0) 0x559759ffd680 con 0x559755487000
2019-03-25 14:16:25.342 7fa3af260700 10 mon.2@1(peon).auth v146 preprocess_query auth(proto 2 2 bytes epoch 0) v1 from osd.0 10.8.33.183:6800/293177
2019-03-25 14:16:25.342 7fa3af260700 10 mon.2@1(peon).auth v146 prep_auth() blob_size=2
2019-03-25 14:16:25.342 7fa3af260700 2 mon.2@1(peon) e1 send_reply 0x55976b3bf320 0x559754bb1200 auth_reply(proto 2 0 (0) Success) v1
2019-03-25 14:16:25.342 7fa3af260700 1 -- 10.8.33.158:6789/0 --> 10.8.33.183:6800/293177 -- auth_reply(proto 2 0 (0) Success) v1 -- 0x559754bb1200 con 0

But the OSD is still in the booting state.

The FSID seems correct, so I'm lost here.
There is nothing in the OSD logs (even with debug set to 20), except some 
complaints from the mgr, which rejects the OSD report because the OSD metadata 
is not complete (I guess because the OSD is still booting).

One thing to note: I got into this state after redeploying the VMs hosting 
the Ceph cluster, so the IP addresses have changed.

Can somebody help?

# ceph osd dump
epoch 15
fsid 5267611a-48f7-4979-823e-84531e104d63
created 2019-03-20 18:14:24.296267
modified 2019-03-22 14:38:45.816422
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 5
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client jewel
min_compat_client jewel
require_osd_release mimic
max_osd 3
osd.0 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists 32d92b43-6333-4c5c-8153-af373ce12e62
osd.1 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists 07b03870-1bd9-42f9-ac61-9e9be3b30e73
osd.2 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists b77f8ae8-82cf-4e31-9e36-f510698abf8e
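
For anyone who wants to check this kind of dump automatically, here is a minimal sketch that reads the per-OSD lines and lists the OSDs reported as down. It assumes each OSD entry sits on a single unwrapped line, and it only looks at the leading fields (id, up/down, in/out); the function names are made up for this example and are not part of any Ceph tooling.

```python
import re

# Matches only the leading fields of an OSD line from `ceph osd dump`
# plain-text output, e.g. "osd.0 down in weight 1 ...".
OSD_LINE = re.compile(
    r"^osd\.(?P<id>\d+)\s+(?P<up>up|down)\s+(?P<inout>in|out)\s+weight\s+(?P<weight>\S+)"
)

def parse_osd_states(dump_text):
    """Return {osd_id: (up_state, in_state)} for every osd.N line found."""
    states = {}
    for line in dump_text.splitlines():
        match = OSD_LINE.match(line.strip())
        if match:
            states[int(match.group("id"))] = (match.group("up"), match.group("inout"))
    return states

def down_osds(dump_text):
    """Return the sorted ids of OSDs whose state is 'down'."""
    return sorted(i for i, (up, _) in parse_osd_states(dump_text).items() if up == "down")

# The three OSD lines from the dump above, each on one line.
SAMPLE = """\
max_osd 3
osd.0 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists 32d92b43-6333-4c5c-8153-af373ce12e62
osd.1 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists 07b03870-1bd9-42f9-ac61-9e9be3b30e73
osd.2 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists b77f8ae8-82cf-4e31-9e36-f510698abf8e
"""

if __name__ == "__main__":
    print(down_osds(SAMPLE))
```

On the sample, all three OSDs come back as down while being marked in.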

Thank you !!
Vincent

From: PHARABOT Vincent
Sent: Friday, 22 March 2019 10:45
To: '[email protected]' <[email protected]>
Subject: OSD stuck in booting state

Hello cephers,

I need your help once again (still a Ceph beginner, sorry).

In one cluster I have 3 OSDs that are never seen as up; they stay stuck in the 
down state. The OSD processes are running, of course.

On the OSD side, the OSD has been stuck in the booting state for a long time.
It doesn't look like a network or communication issue between the OSD and the mon.

I guess something is wrong on the OSD side, but I could not figure out what so far.

Thanks a lot for your help!

# ceph -s
  cluster:
    id:     5267611a-48f7-4979-823e-84531e104d63
    health: HEALTH_WARN
            3 slow ops, oldest one blocked for 134780 sec, daemons [mon.1,mon.2] have slow ops.

  services:
    mon: 3 daemons, quorum 1,2,0
    mgr: mgr.2(active), standbys: mgr.0, mgr.1
    osd: 3 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

# ceph health detail
HEALTH_WARN 3 slow ops, oldest one blocked for 134795 sec, daemons [mon.1,mon.2] have slow ops.
SLOW_OPS 3 slow ops, oldest one blocked for 134795 sec, daemons [mon.1,mon.2] have slow ops.
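
To put the "blocked for" figure in perspective, here is a small sketch that pulls the number of seconds out of a health line and converts it to days, hours, minutes and seconds. The helper names are invented for this example; the regular expression only assumes the "oldest one blocked for N sec" wording shown above.

```python
import re

def blocked_seconds(health_line):
    """Extract N from '... oldest one blocked for N sec ...', or None if absent."""
    match = re.search(r"oldest one blocked for (\d+) sec", health_line)
    return int(match.group(1)) if match else None

def humanize(seconds):
    """Format a number of seconds as 'Dd Hh Mm Ss'."""
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"

LINE = ("SLOW_OPS 3 slow ops, oldest one blocked for 134795 sec, "
        "daemons [mon.1,mon.2] have slow ops.")

if __name__ == "__main__":
    print(humanize(blocked_seconds(LINE)))  # 1d 13h 26m 35s
```

So the oldest op has been blocked for more than 37 hours.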

# ceph osd tree
ID CLASS WEIGHT  TYPE NAME               STATUS REWEIGHT PRI-AFF
-1       2.92978 root default
-4       0.97659     host ip-10-8-33-183
 0   hdd 0.97659         osd.0             down        0 1.00000
-3       0.97659     host ip-10-8-64-158
 2       0.97659         osd.2             down        0 1.00000
-2       0.97659     host ip-10-8-85-231

# ceph osd dump
epoch 7
fsid 5267611a-48f7-4979-823e-84531e104d63
created 2019-03-20 18:14:24.296267
modified 2019-03-21 09:26:58.920300
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 5
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client jewel
min_compat_client jewel
require_osd_release mimic
max_osd 3
osd.0 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists,new 32d92b43-6333-4c5c-8153-af373ce12e62
osd.1 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists,new 07b03870-1bd9-42f9-ac61-9e9be3b30e73
osd.2 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists,new b77f8ae8-82cf-4e31-9e36-f510698abf8e

"ops": [
    {
        "description": "osd_boot(osd.0 booted 0 features 4611087854031142907 v17)",
        "initiated_at": "2019-03-22 08:47:20.243710",
        "age": 405.638170,
        "duration": 405.638185,
        "type_data": {
            "events": [
                {
                    "time": "2019-03-22 08:47:20.243710",
                    "event": "initiated"
                },
                {
                    "time": "2019-03-22 08:47:20.243710",
                    "event": "header_read"
                },
                {
                    "time": "2019-03-22 08:47:20.243713",
                    "event": "throttled"
                },
                {
                    "time": "2019-03-22 08:47:20.243766",
                    "event": "all_read"
                },
                {
                    "time": "2019-03-22 08:47:20.243821",
                    "event": "dispatched"
                },
                {
                    "time": "2019-03-22 08:47:20.243826",
                    "event": "mon:_ms_dispatch"
                },
                {
                    "time": "2019-03-22 08:47:20.243827",
                    "event": "mon:dispatch_op"
                },
                {
                    "time": "2019-03-22 08:47:20.243827",
                    "event": "psvc:dispatch"
                },
                {
                    "time": "2019-03-22 08:47:20.243828",
                    "event": "osdmap:wait_for_readable"
                },
                {
                    "time": "2019-03-22 08:47:20.243829",
                    "event": "osdmap:wait_for_finished_proposal"
                },
                {
                    "time": "2019-03-22 08:47:21.064088",
                    "event": "callback retry"
                },
                {
                    "time": "2019-03-22 08:47:21.064090",
                    "event": "psvc:dispatch"
                },
                {
                    "time": "2019-03-22 08:47:21.064091",
                    "event": "osdmap:wait_for_readable"
                },
                {
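
The op dump above is cut off, so the following sketch works only with the events visible in the excerpt. It counts how often each event name occurs in one op, which makes it easier to spot steps the op keeps coming back to. The dictionary layout mirrors the "type_data" / "events" structure shown above; the function names are hypothetical, and timestamps are left out for brevity.

```python
from collections import Counter

def event_counts(op):
    """Count occurrences of each event name in one op's event list."""
    return Counter(entry["event"] for entry in op["type_data"]["events"])

def repeated_events(op):
    """Return the sorted names of events that occur more than once."""
    return sorted(name for name, count in event_counts(op).items() if count > 1)

# Event names copied, in order, from the visible part of the dump.
VISIBLE_EVENTS = [
    "initiated", "header_read", "throttled", "all_read", "dispatched",
    "mon:_ms_dispatch", "mon:dispatch_op", "psvc:dispatch",
    "osdmap:wait_for_readable", "osdmap:wait_for_finished_proposal",
    "callback retry", "psvc:dispatch", "osdmap:wait_for_readable",
]

OP = {
    "description": "osd_boot(osd.0 booted 0 features 4611087854031142907 v17)",
    "type_data": {"events": [{"event": name} for name in VISIBLE_EVENTS]},
}

if __name__ == "__main__":
    print(repeated_events(OP))
```

In the visible part, "psvc:dispatch" and "osdmap:wait_for_readable" each appear twice, once before and once after "callback retry".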



OSD side:
[root@ip-10-8-33-183 ~]# ceph daemon osd.0 status
{
    "cluster_fsid": "5267611a-48f7-4979-823e-84531e104d63",
    "osd_fsid": "32d92b43-6333-4c5c-8153-af373ce12e62",
    "whoami": 0,
    "state": "booting",
    "oldest_map": 1,
    "newest_map": 17,
    "num_pgs": 200
}
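
The two checks done by eye here (FSID matches, state is not what it should be) can be scripted. This is a minimal sketch, assuming the status JSON has the fields shown above and that a healthy running OSD reports the state "active"; the function name and the returned messages are made up for this example.

```python
import json

def check_osd_status(status_json, expected_fsid):
    """Return a list of human-readable problems found in an OSD status document."""
    status = json.loads(status_json)
    problems = []
    if status.get("cluster_fsid") != expected_fsid:
        problems.append("cluster_fsid mismatch")
    # Assumption: "active" is the state of an OSD that has finished booting.
    if status.get("state") != "active":
        problems.append(f"state is {status.get('state')!r}, not 'active'")
    return problems

# Status output from osd.0, as pasted above.
STATUS = """
{
    "cluster_fsid": "5267611a-48f7-4979-823e-84531e104d63",
    "osd_fsid": "32d92b43-6333-4c5c-8153-af373ce12e62",
    "whoami": 0,
    "state": "booting",
    "oldest_map": 1,
    "newest_map": 17,
    "num_pgs": 200
}
"""

# Cluster fsid as printed by `ceph osd dump`.
CLUSTER_FSID = "5267611a-48f7-4979-823e-84531e104d63"

if __name__ == "__main__":
    for problem in check_osd_status(STATUS, CLUSTER_FSID):
        print(problem)
```

With the values above, only the state check fires; the FSID comparison passes.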

Vincent

_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
