Hi,
I appreciate the insistence that the directions be followed. I wholly agree.
The only liberty I took was to do a ‘yum update’ instead of just ‘yum update
ceph-osd’ and then reboot. (Also, my MDS runs on the MON hosts, so it got
updated a step early.)
As for the logs:
[2019-07-24 15:07:22,713][ceph_volume.main][INFO ] Running command: ceph-volume simple scan
[2019-07-24 15:07:22,714][ceph_volume.process][INFO ] Running command: /bin/systemctl show --no-pager --property=Id --state=running ceph-osd@*
[2019-07-24 15:07:27,574][ceph_volume.main][INFO ] Running command: ceph-volume simple activate --all
[2019-07-24 15:07:27,575][ceph_volume.devices.simple.activate][INFO ] activating OSD specified in /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] Required devices (block and data) not present for bluestore
[2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ] bluestore devices found: [u'data']
[2019-07-24 15:07:27,576][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
    instance.main()
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/main.py", line 33, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, in dispatch
    instance.main()
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 272, in main
    self.activate(args)
  File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 131, in activate
    self.validate_devices(osd_metadata)
  File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py", line 62, in validate_devices
    raise RuntimeError('Unable to activate bluestore OSD due to missing devices')
RuntimeError: Unable to activate bluestore OSD due to missing devices
(this is repeated for each of the 16 drives)
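The error above says the scanned metadata for OSD 0 lists a `data` device but no `block` device. A minimal Python sketch of that kind of check, run against a JSON file persisted by `ceph-volume simple scan`, might look like the following. The required key names (`block` and `data`) are taken from the error message; the helper names and the sample dictionary (including its nested `path` field) are hypothetical, and this is not ceph-volume's own code.

```python
import json

# Keys a bluestore OSD needs, per the error message in the log above.
REQUIRED_BLUESTORE_KEYS = ("block", "data")


def missing_bluestore_devices(metadata):
    """Return the required device keys absent from a scanned OSD's metadata."""
    return [key for key in REQUIRED_BLUESTORE_KEYS if key not in metadata]


def check_file(path):
    """Load one /etc/ceph/osd/<id>-<fsid>.json file and report missing keys."""
    with open(path) as handle:
        return missing_bluestore_devices(json.load(handle))


# Hypothetical metadata resembling the failing case: only 'data' was found.
sample = {"type": "bluestore", "data": {"path": "/dev/sda1"}}
print(missing_bluestore_devices(sample))  # → ['block']
```

Running `check_file` over each file in /etc/ceph/osd/ would show whether all 16 OSDs are missing the same key.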
Any other thoughts? (I’ll delete/create the OSDs with ceph-deploy otherwise.)
peter
Peter Eisch
Senior Site Reliability Engineer
T1.612.659.3228
virginpulse.com
|virginpulse.com/global-challenge
Australia | Bosnia and Herzegovina | Brazil | Canada | Singapore | Switzerland
| United Kingdom | USA
From: Alfredo Deza <[email protected]>
Date: Wednesday, July 24, 2019 at 3:02 PM
To: Peter Eisch <[email protected]>
Cc: Paul Emmerich <[email protected]>, "[email protected]"
<[email protected]>
Subject: Re: [ceph-users] Upgrading and lost OSDs
On Wed, Jul 24, 2019 at 3:49 PM Peter Eisch <[email protected]> wrote:
I’m at step 6. I updated/rebooted the host to complete “installing the new
packages and restarting the ceph-osd daemon” on the first OSD host. All the
systemctl definitions to start the OSDs were deleted, all the properties in
/var/lib/ceph/osd/ceph-* directories were deleted. All the files in
/var/lib/ceph/osd-lockbox, for comparison, were untouched and still present.
Peeking into step 7 I can run ceph-volume:
# ceph-volume simple scan /dev/sda1
Running command: /usr/sbin/cryptsetup status /dev/sda1
Running command: /usr/sbin/cryptsetup status 93fb5f2f-0273-4c87-a718-886d7e6db983
Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
Running command: /usr/sbin/cryptsetup status /dev/sda5
Running command: /bin/ceph --cluster ceph --name client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring /tmp/tmpF5F8t2/keyring config-key get dm-crypt/osd/93fb5f2f-0273-4c87-a718-886d7e6db983/luks
Running command: /bin/umount -v /tmp/tmpF5F8t2
stderr: umount: /tmp/tmpF5F8t2 (/dev/sda5) unmounted
Running command: /usr/sbin/cryptsetup --key-file - --allow-discards luksOpen /dev/sda1 93fb5f2f-0273-4c87-a718-886d7e6db983
Running command: /bin/mount -v /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 /tmp/tmpYK0WEV
stdout: mount: /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 mounted on /tmp/tmpYK0WEV.
--> broken symlink found /tmp/tmpYK0WEV/block -> /dev/mapper/a05b447c-c901-4690-a249-cc1a2d62a110
Running command: /usr/sbin/cryptsetup status /tmp/tmpYK0WEV/block_dmcrypt
Running command: /usr/sbin/cryptsetup status /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
Running command: /bin/umount -v /tmp/tmpYK0WEV
stderr: umount: /tmp/tmpYK0WEV (/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983) unmounted
Running command: /usr/sbin/cryptsetup remove /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
--> OSD 0 got scanned and metadata persisted to file: /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
--> To take over management of this scanned OSD, and disable ceph-disk and udev, run:
--> ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
#
#
# ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
--> Required devices (block and data) not present for bluestore
--> bluestore devices found: [u'data']
--> RuntimeError: Unable to activate bluestore OSD due to missing devices
#
The tool detected bluestore, or rather, it failed to find a journal associated
with /dev/sda1. Scanning a single partition can cause that. There is a flag
(--stdout) to print the findings to STDOUT instead of persisting them in
/etc/ceph/osd/.
Since this is a "whole system" upgrade, the instructions in the upgrade
documentation need to be followed:
ceph-volume simple scan
ceph-volume simple activate --all
If the `scan` command doesn't display any information (not even with the
--stdout flag), then the logs at /var/log/ceph/ceph-volume.log need to be
inspected. It would be useful to check any findings in there.
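When inspecting /var/log/ceph/ceph-volume.log, it can help to pull out only the ERROR entries. The following is a rough Python sketch, assuming each log entry sits on one line in the `[timestamp][logger][LEVEL ] message` layout seen in the log excerpt in this thread; the function names are hypothetical and the regular expression is a guess at that layout, not a documented format.

```python
import re

# Matches lines such as:
# [2019-07-24 15:07:27,576][ceph_volume][ERROR ] exception caught by decorator
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]"
    r"\[(?P<logger>[^\]]+)\]"
    r"\[(?P<level>[A-Z]+)\s*\]\s*"
    r"(?P<message>.*)$"
)


def error_messages(lines):
    """Yield (timestamp, logger, message) for every ERROR-level log line."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group("level") == "ERROR":
            yield (
                match.group("timestamp"),
                match.group("logger"),
                match.group("message"),
            )


def errors_in_file(path="/var/log/ceph/ceph-volume.log"):
    """Collect ERROR entries from the ceph-volume log file."""
    with open(path) as handle:
        return list(error_messages(handle))


# Lines copied from the log excerpt in this thread; traceback lines are skipped.
sample = [
    "[2019-07-24 15:07:27,574][ceph_volume.main][INFO ] Running command: ceph-volume simple activate --all",
    "[2019-07-24 15:07:27,576][ceph_volume][ERROR ] exception caught by decorator",
    "Traceback (most recent call last):",
]
for timestamp, logger_name, message in error_messages(sample):
    print(timestamp, logger_name, message)
# → 2019-07-24 15:07:27,576 ceph_volume exception caught by decorator
```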
Okay, this created /etc/ceph/osd/*.json. This is cool. Is there a command or
option which will read these files and mount the devices?
peter
From: Alfredo Deza <[email protected]>
Date: Wednesday, July 24, 2019 at 2:20 PM
To: Peter Eisch <[email protected]>
Cc: Paul Emmerich <[email protected]>, "[email protected]"
<[email protected]>
Subject: Re: [ceph-users] Upgrading and lost OSDs
On Wed, Jul 24, 2019 at 2:56 PM Peter Eisch <[email protected]> wrote:
Hi Paul,
To better answer your question, I'm following:
http://docs.ceph.com/docs/nautilus/releases/nautilus/
At step 6, upgrade OSDs, I jumped on an OSD host and did a full 'yum update'
for patching the host and rebooted to pick up the current centos kernel.
If you are at Step 6 then it is *crucial* to understand that the tooling used
to create the OSDs is no longer available and Step 7 *is absolutely required*.
ceph-volume has to scan the system and give you the output of all OSDs found so
that it can persist them in /etc/ceph/osd/*.json files and then can later be
"activated".
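The persisted file names tie directly to the per-OSD activate command: the scan output elsewhere in this thread shows a file named `0-93fb5f2f-0273-4c87-a718-886d7e6db983.json` and the matching `ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983`. A small Python sketch of that mapping, assuming the `<osd id>-<uuid>.json` naming holds for every file (the helper names are hypothetical, and the command is only built, not executed):

```python
import os
import re

# <osd id>-<osd uuid>.json, e.g. 0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
JSON_NAME = re.compile(r"^(?P<osd_id>\d+)-(?P<fsid>[0-9a-fA-F-]{36})\.json$")


def parse_scan_filename(path):
    """Split a scan result path into (osd_id, fsid), or return None."""
    match = JSON_NAME.match(os.path.basename(path))
    if match is None:
        return None
    return int(match.group("osd_id")), match.group("fsid")


def activate_command(path):
    """Build the per-OSD activate command line for one scan result file."""
    parsed = parse_scan_filename(path)
    if parsed is None:
        raise ValueError("not a scan result file: %s" % path)
    osd_id, fsid = parsed
    return ["ceph-volume", "simple", "activate", str(osd_id), fsid]


example = "/etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json"
print(" ".join(activate_command(example)))
# → ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
```

For a whole host, `ceph-volume simple activate --all` covers every file at once, so this per-file form is mainly useful for retrying a single OSD.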
I didn't run any specific commands to update only the ceph RPMs in this
process.
It is not clear if you are at Step 6 and wondering why OSDs are not up, or you
are past that and ceph-volume wasn't able to detect anything.
peter
From: Paul Emmerich <[email protected]>
Date: Wednesday, July 24, 2019 at 1:39 PM
To: Peter Eisch <[email protected]>
Cc: Xavier Trilla <[email protected]>, "[email protected]"
<[email protected]>
Subject: Re: [ceph-users] Upgrading and lost OSDs
On Wed, Jul 24, 2019 at 8:36 PM Peter Eisch <[email protected]> wrote:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.7T 0 disk
├─sda1 8:1 0 100M 0 part
├─sda2 8:2 0 1.7T 0 part
└─sda5 8:5 0 10M 0 part
sdb 8:16 0 1.7T 0 disk
├─sdb1 8:17 0 100M 0 part
├─sdb2 8:18 0 1.7T 0 part
└─sdb5 8:21 0 10M 0 part
sdc 8:32 0 1.7T 0 disk
├─sdc1 8:33 0 100M 0 part
That's ceph-disk, which was removed. Run "ceph-volume simple scan".
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
http://www.croit.io
Tel: +49 89 1896585 90
...
I'm thinking the OSD would start (I can recreate the .service definitions in
systemctl) if the above were mounted in a way like they are on another of my
hosts:
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.7T 0 disk
├─sda1 8:1 0 100M 0 part
│ └─97712be4-1234-4acc-8102-2265769053a5 253:17 0 98M 0 crypt /var/lib/ceph/osd/ceph-16
├─sda2 8:2 0 1.7T 0 part
│ └─049b7160-1234-4edd-a5dc-fe00faca8d89 253:16 0 1.7T 0 crypt
└─sda5 8:5 0 10M 0 part /var/lib/ceph/osd-lockbox/97712be4-9674-4acc-1234-2265769053a5
sdb 8:16 0 1.7T 0 disk
├─sdb1 8:17 0 100M 0 part
│ └─f03f0298-1234-42e9-8b28-f3016e44d1e2 253:26 0 98M 0 crypt /var/lib/ceph/osd/ceph-17
├─sdb2 8:18 0 1.7T 0 part
│ └─51177019-1234-4963-82d1-5006233f5ab2 253:30 0 1.7T 0 crypt
└─sdb5 8:21 0 10M 0 part /var/lib/ceph/osd-lockbox/f03f0298-1234-42e9-8b28-f3016e44d1e2
sdc 8:32 0 1.7T 0 disk
├─sdc1 8:33 0 100M 0 part
│ └─0184df0c-1234-404d-92de-cb71b1047abf 253:8 0 98M 0 crypt /var/lib/ceph/osd/ceph-18
├─sdc2 8:34 0 1.7T 0 part
│ └─fdad7618-1234-4021-a63e-40d973712e7b 253:13 0 1.7T 0 crypt
...
Thank you for your time on this,
peter
From: Xavier Trilla <[email protected]>
Date: Wednesday, July 24, 2019 at 1:25 PM
To: Peter Eisch <[email protected]>
Cc: "[email protected]" <[email protected]>
Subject: Re: [ceph-users] Upgrading and lost OSDs
Hi Peter,
I'm not sure, but maybe after some changes the OSDs are no longer being
recognized by the Ceph scripts.
Ceph used to use udev to detect the OSDs and then moved to LVM. Which kind of
OSDs are you running, Bluestore or Filestore? Which version did you use to
create them?
Cheers!
On 24 Jul 2019, at 20:04, Peter Eisch <[email protected]> wrote:
Hi,
I’m working through updating from 12.2.12/luminous to 14.2.2/nautilus on
CentOS 7.6. The managers are updated alright:
# ceph -s
  cluster:
    id:     2fdb5976-1234-4b29-ad9c-1ca74a9466ec
    health: HEALTH_WARN
            Degraded data redundancy: 24177/9555955 objects degraded (0.253%), 7 pgs degraded, 1285 pgs undersized
            3 monitors have not enabled msgr2
...
I updated ceph on an OSD host with 'yum update' and then rebooted to grab the
current kernel. Along the way, the contents of all the directories in
/var/lib/ceph/osd/ceph-*/ were deleted. Thus I have 16 OSDs down from this. I
can manage the undersized PGs, but I'd like to get these drives working again
without deleting each OSD and recreating it.
So far I've pulled the respective cephx key into the 'keyring' file and
populated 'bluestore' into the 'type' files but I'm unsure how to get the
lockboxes mounted to where I can get the OSDs running. The osd-lockbox
directory is otherwise untouched from when the OSDs were deployed.
Is there a way to run ceph-deploy or some other tool to rebuild the mounts for
the drives?
peter
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com