Re: [ceph-users] ceph-volume lvm create leaves half-built OSDs lying around

2019-09-11 Thread Alfredo Deza
On Wed, Sep 11, 2019 at 6:18 AM Matthew Vernon  wrote:
>
> Hi,
>
> We keep finding part-made OSDs (they appear not attached to any host,
> and down and out; but still counting towards the number of OSDs); we
> never saw this with ceph-disk. On investigation, this is because
> ceph-volume lvm create makes the OSD (ID and auth at least) too early in
> the process and is then unable to roll-back cleanly (because the
> bootstrap-osd credential isn't allowed to remove OSDs).
>
> As an example (very truncated):
>
> Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> -i - osd new 20cea174-4c1b-4330-ad33-505a03156c33
> Running command: vgcreate --force --yes
> ceph-9d66ec60-c71b-49e0-8c1a-e74e98eafb0e /dev/sdbh
>  stderr: Device /dev/sdbh not found (or ignored by filtering).
>   Unable to add physical volume '/dev/sdbh' to volume group
> 'ceph-9d66ec60-c71b-49e0-8c1a-e74e98eafb0e'.
> --> Was unable to complete a new OSD, will rollback changes
> --> OSD will be fully purged from the cluster, because the ID was generated
> Running command: ceph osd purge osd.828 --yes-i-really-mean-it
>  stderr: 2019-09-10 15:07:53.396528 7fbca2caf700 -1 auth: unable to find
> a keyring on
> /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,:
> (2) No such file or directory
>  stderr: 2019-09-10 15:07:53.397318 7fbca2caf700 -1 monclient:
> authenticate NOTE: no keyring found; disabled cephx authentication
> 2019-09-10 15:07:53.397334 7fbca2caf700  0 librados: client.admin
> authentication error (95) Operation not supported
>
Ah, this is tricky to solve for every case... ceph-volume is doing a
best effort here.

> This is annoying to have to clear up, and it seems to me it could be
> avoided by either:
>
> i) ceph-volume should (attempt to) set up the LVM volumes  before
> making the new OSD id

That would have helped in your particular case, where the failure is
observed when trying to create the LV. When the failure is on the Ceph
side, the problem is similar.

> or
> ii) allow the bootstrap-osd credential to purge OSDs

I wasn't aware that the bootstrap-osd credentials allow purging or
destroying OSDs; are you sure this is possible? If it is, I think
that would be reasonable to try.
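
For anyone cleaning these up by hand in the meantime, something along
these lines should work (run with an admin keyring rather than the
bootstrap-osd one; osd.828 is the leftover ID from the example above):

ceph auth get client.bootstrap-osd          # check what the bootstrap key is currently allowed to do
ceph osd purge 828 --yes-i-really-mean-it   # removes the half-made OSD along with its auth and crush entries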

>
> i) seems like clearly the better answer...?
>
> Regards,
>
> Matthew
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrading and lost OSDs

2019-07-26 Thread Alfredo Deza
On Thu, Jul 25, 2019 at 7:00 PM Bob R  wrote:

> I would try 'mv /etc/ceph/osd{,.old}' then run 'ceph-volume  simple scan'
> again. We had some problems upgrading due to OSDs (perhaps initially
> installed as firefly?) missing the 'type' attribute and iirc the
> 'ceph-volume simple scan' command refused to overwrite existing json files
> after I made some changes to ceph-volume.
>

Ooof. I could swear that this issue was fixed already, and it took me a
while to find out that it wasn't at all. We saw this a few months ago in
our Long Running Cluster used for dogfooding.

I've created a ticket to track this work at
http://tracker.ceph.com/issues/40987

But what you've done is exactly why we chose to persist the JSON files in
/etc/ceph/osd/*.json: so that an admin could tell if anything is missing
(or incorrect, like in this case) and make the changes needed.
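
For example, to check one of those JSON files for the missing 'type' key
(the <id>-<uuid> filename is a placeholder; use whatever scan produced):

python -m json.tool /etc/ceph/osd/<id>-<uuid>.json | grep -i '"type"'

If the key is absent it can be added by hand (e.g. "type": "filestore"
for an old filestore OSD) before re-running 'ceph-volume simple activate'.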



> Bob
>
> On Wed, Jul 24, 2019 at 1:24 PM Alfredo Deza  wrote:
>
>>
>>
>> On Wed, Jul 24, 2019 at 4:15 PM Peter Eisch 
>> wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> I appreciate the insistence that the directions be followed.  I wholly
>>> agree.  The only liberty I took was to do a ‘yum update’ instead of just
>>> ‘yum update ceph-osd’ and then reboot.  (Also, my MDS runs on the MON hosts,
>>> so it got updated a step early.)
>>>
>>>
>>>
>>> As for the logs:
>>>
>>>
>>>
>>> [2019-07-24 15:07:22,713][ceph_volume.main][INFO  ] Running command:
>>> ceph-volume  simple scan
>>>
>>> [2019-07-24 15:07:22,714][ceph_volume.process][INFO  ] Running command:
>>> /bin/systemctl show --no-pager --property=Id --state=running ceph-osd@*
>>>
>>> [2019-07-24 15:07:27,574][ceph_volume.main][INFO  ] Running command:
>>> ceph-volume  simple activate --all
>>>
>>> [2019-07-24 15:07:27,575][ceph_volume.devices.simple.activate][INFO  ]
>>> activating OSD specified in
>>> /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
>>>
>>> [2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ]
>>> Required devices (block and data) not present for bluestore
>>>
>>> [2019-07-24 15:07:27,576][ceph_volume.devices.simple.activate][ERROR ]
>>> bluestore devices found: [u'data']
>>>
>>> [2019-07-24 15:07:27,576][ceph_volume][ERROR ] exception caught by
>>> decorator
>>>
>>> Traceback (most recent call last):
>>>
>>>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py",
>>> line 59, in newfunc
>>>
>>> return f(*a, **kw)
>>>
>>>   File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148,
>>> in main
>>>
>>> terminal.dispatch(self.mapper, subcommand_args)
>>>
>>>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line
>>> 182, in dispatch
>>>
>>> instance.main()
>>>
>>>   File
>>> "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/main.py", line
>>> 33, in main
>>>
>>> terminal.dispatch(self.mapper, self.argv)
>>>
>>>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line
>>> 182, in dispatch
>>>
>>> instance.main()
>>>
>>>   File
>>> "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py",
>>> line 272, in main
>>>
>>> self.activate(args)
>>>
>>>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py",
>>> line 16, in is_root
>>>
>>> return func(*a, **kw)
>>>
>>>   File
>>> "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py",
>>> line 131, in activate
>>>
>>> self.validate_devices(osd_metadata)
>>>
>>>   File
>>> "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/activate.py",
>>> line 62, in validate_devices
>>>
>>> raise RuntimeError('Unable to activate bluestore OSD due to missing
>>> devices')
>>>
>>> RuntimeError: Unable to activate bluestore OSD due to missing devices
>>>
>>>
>>>
>>> (this is repeated for each of the 16 drives)
>>>
>>>
>>>
>>> Any other thoughts?  (I’ll delete/create the OSDs with ceph-deploy
>>> otherwise.)
>>>
>>
>> Try using `ceph-volu

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Alfredo Deza
>
> *From: *Alfredo Deza 
> *Date: *Wednesday, July 24, 2019 at 3:02 PM
> *To: *Peter Eisch 
> *Cc: *Paul Emmerich , "ceph-users@lists.ceph.com"
> 
> *Subject: *Re: [ceph-users] Upgrading and lost OSDs
>
>
>
>
>
> On Wed, Jul 24, 2019 at 3:49 PM Peter Eisch 
> wrote:
>
>
>
> I’m at step 6.  I updated/rebooted the host to complete “installing the
> new packages and restarting the ceph-osd daemon” on the first OSD host.
> All the systemctl definitions to start the OSDs were deleted, all the
> properties in /var/lib/ceph/osd/ceph-* directories were deleted.  All the
> files in /var/lib/ceph/osd-lockbox, for comparison, were untouched and
> still present.
>
>
>
> Peeking into step 7 I can run ceph-volume:
>
>
>
> # ceph-volume simple scan /dev/sda1
>
> Running command: /usr/sbin/cryptsetup status /dev/sda1
>
> Running command: /usr/sbin/cryptsetup status
> 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
>
> stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
>
> Running command: /usr/sbin/cryptsetup status /dev/sda5
>
> Running command: /bin/ceph --cluster ceph --name
> client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring
> /tmp/tmpF5F8t2/keyring config-key get
> dm-crypt/osd/93fb5f2f-0273-4c87-a718-886d7e6db983/luks
>
> Running command: /bin/umount -v /tmp/tmpF5F8t2
>
> stderr: umount: /tmp/tmpF5F8t2 (/dev/sda5) unmounted
>
> Running command: /usr/sbin/cryptsetup --key-file - --allow-discards
> luksOpen /dev/sda1 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/mount -v
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 /tmp/tmpYK0WEV
>
> stdout: mount: /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 mounted on
> /tmp/tmpYK0WEV.
>
> --> broken symlink found /tmp/tmpYK0WEV/block ->
> /dev/mapper/a05b447c-c901-4690-a249-cc1a2d62a110
>
> Running command: /usr/sbin/cryptsetup status /tmp/tmpYK0WEV/block_dmcrypt
>
> Running command: /usr/sbin/cryptsetup status
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/umount -v /tmp/tmpYK0WEV
>
> stderr: umount: /tmp/tmpYK0WEV
> (/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983) unmounted
>
> Running command: /usr/sbin/cryptsetup remove
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>
> --> OSD 0 got scanned and metadata persisted to file:
> /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
>
> --> To take over management of this scanned OSD, and disable ceph-disk and
> udev, run:
>
> --> ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> #
>
> #
>
> # ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> --> Required devices (block and data) not present for bluestore
>
> --> bluestore devices found: [u'data']
>
> -->  RuntimeError: Unable to activate bluestore OSD due to missing devices
>
> #
>
>
>
> The tool detected bluestore, or rather, it failed to find a journal
> associated with /dev/sda1. Scanning a single partition can cause that.
> There is a flag to spit out the findings to STDOUT instead of persisting
> them in /etc/ceph/osd/
>
>
>
> Since this is a "whole system" upgrade, then the upgrade documentation
> instructions need to be followed:
>
>
>
> ceph-volume simple scan
> ceph-volume simple activate --all
>
>
>
> If the `scan` command doesn't display any information (not even with the
> --stdout flag) then the logs at /var/log/ceph/ceph-volume.log need to be
> inspected. It would be useful to check any findings in there
>
>
>
>
> Okay, this created /etc/ceph/osd/*.json.  This is cool.  Is there a
> command or option which will read these files and mount the devices

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Alfredo Deza
On Wed, Jul 24, 2019 at 3:49 PM Peter Eisch 
wrote:

>
>
> I’m at step 6.  I updated/rebooted the host to complete “installing the
> new packages and restarting the ceph-osd daemon” on the first OSD host.
> All the systemctl definitions to start the OSDs were deleted, all the
> properties in /var/lib/ceph/osd/ceph-* directories were deleted.  All the
> files in /var/lib/ceph/osd-lockbox, for comparison, were untouched and
> still present.
>
>
>
> Peeking into step 7 I can run ceph-volume:
>
>
>
> # ceph-volume simple scan /dev/sda1
>
> Running command: /usr/sbin/cryptsetup status /dev/sda1
>
> Running command: /usr/sbin/cryptsetup status
> 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/mount -v /dev/sda5 /tmp/tmpF5F8t2
>
> stdout: mount: /dev/sda5 mounted on /tmp/tmpF5F8t2.
>
> Running command: /usr/sbin/cryptsetup status /dev/sda5
>
> Running command: /bin/ceph --cluster ceph --name
> client.osd-lockbox.93fb5f2f-0273-4c87-a718-886d7e6db983 --keyring
> /tmp/tmpF5F8t2/keyring config-key get
> dm-crypt/osd/93fb5f2f-0273-4c87-a718-886d7e6db983/luks
>
> Running command: /bin/umount -v /tmp/tmpF5F8t2
>
> stderr: umount: /tmp/tmpF5F8t2 (/dev/sda5) unmounted
>
> Running command: /usr/sbin/cryptsetup --key-file - --allow-discards
> luksOpen /dev/sda1 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/mount -v
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 /tmp/tmpYK0WEV
>
> stdout: mount: /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983 mounted on
> /tmp/tmpYK0WEV.
>
> --> broken symlink found /tmp/tmpYK0WEV/block ->
> /dev/mapper/a05b447c-c901-4690-a249-cc1a2d62a110
>
> Running command: /usr/sbin/cryptsetup status /tmp/tmpYK0WEV/block_dmcrypt
>
> Running command: /usr/sbin/cryptsetup status
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>
> Running command: /bin/umount -v /tmp/tmpYK0WEV
>
> stderr: umount: /tmp/tmpYK0WEV
> (/dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983) unmounted
>
> Running command: /usr/sbin/cryptsetup remove
> /dev/mapper/93fb5f2f-0273-4c87-a718-886d7e6db983
>
> --> OSD 0 got scanned and metadata persisted to file:
> /etc/ceph/osd/0-93fb5f2f-0273-4c87-a718-886d7e6db983.json
>
> --> To take over management of this scanned OSD, and disable ceph-disk and
> udev, run:
>
> --> ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> #
>
> #
>
> # ceph-volume simple activate 0 93fb5f2f-0273-4c87-a718-886d7e6db983
>
> --> Required devices (block and data) not present for bluestore
>
> --> bluestore devices found: [u'data']
>
> -->  RuntimeError: Unable to activate bluestore OSD due to missing devices
>
> #
>

The tool detected bluestore, or rather, it failed to find a journal
associated with /dev/sda1. Scanning a single partition can cause that.
There is a flag to spit out the findings to STDOUT instead of persisting
them in /etc/ceph/osd/

Since this is a "whole system" upgrade, the upgrade documentation
instructions need to be followed:

ceph-volume simple scan
ceph-volume simple activate --all


If the `scan` command doesn't display any information (not even with the
--stdout flag) then the logs at /var/log/ceph/ceph-volume.log need to be
inspected. It would be useful to check any findings in there
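
As a rough sketch, something like this shows what scan would persist
without writing anything, and surfaces any errors from earlier runs
(the device path is just the one from this thread):

ceph-volume simple scan --stdout /dev/sda1
grep -iE 'error|traceback' /var/log/ceph/ceph-volume.log | tail -n 20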


>
> Okay, this created /etc/ceph/osd/*.json.  This is cool.  Is there a
> command or option which will read these files and mount the devices?
>
>
>
> peter
>
>
>
>
>
>
> Peter Eisch
> Senior Site Reliability Engineer
> T *1.612.659.3228* <1.612.659.3228>

Re: [ceph-users] Upgrading and lost OSDs

2019-07-24 Thread Alfredo Deza
On Wed, Jul 24, 2019 at 2:56 PM Peter Eisch 
wrote:

> Hi Paul,
>
> To do better to answer you question, I'm following:
> http://docs.ceph.com/docs/nautilus/releases/nautilus/
>
> At step 6, upgrade OSDs, I jumped on an OSD host and did a full 'yum
> update' for patching the host and rebooted to pick up the current centos
> kernel.
>

If you are at Step 6 then it is *crucial* to understand that the tooling
used to create the OSDs is no longer available and Step 7 *is absolutely
required*.

ceph-volume has to scan the system and give you the output of all OSDs
found so that it can persist them in /etc/ceph/osd/*.json files; they
can later be "activated".


> I didn't do anything to specific commands for just updating the ceph RPMs
> in this process.
>
>
It is not clear if you are at Step 6 and wondering why OSDs are not up, or
you are past that and ceph-volume wasn't able to detect anything.


> peter
>
> Peter Eisch
> Senior Site Reliability Engineer
> T *1.612.659.3228* <1.612.659.3228>
>
> From: Paul Emmerich 
> Date: Wednesday, July 24, 2019 at 1:39 PM
> To: Peter Eisch 
> Cc: Xavier Trilla , "ceph-users@lists.ceph.com"
> 
> Subject: Re: [ceph-users] Upgrading and lost OSDs
>
> On Wed, Jul 24, 2019 at 8:36 PM Peter Eisch <peter.ei...@virginpulse.com> wrote:
> # lsblk
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda 8:0 0 1.7T 0 disk
> ├─sda1 8:1 0 100M 0 part
> ├─sda2 8:2 0 1.7T 0 part
> └─sda5 8:5 0 10M 0 part
> sdb 8:16 0 1.7T 0 disk
> ├─sdb1 8:17 0 100M 0 part
> ├─sdb2 8:18 0 1.7T 0 part
> └─sdb5 8:21 0 10M 0 part
> sdc 8:32 0 1.7T 0 disk
> ├─sdc1 8:33 0 100M 0 part
>
> That's ceph-disk which was removed, run "ceph-volume simple scan"
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at
> https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
>
> www.croit.io
> Tel: +49 89 1896585 90
>
>
> ...
> I'm thinking the OSD would start (I can recreate the .service definitions
> in systemctl) if the above were mounted in a way like they are on another
> of my hosts:
> # lsblk
> NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
> sda 8:0 0 1.7T 0 disk
> ├─sda1 8:1 0 100M 0 part
> │ └─97712be4-1234-4acc-8102-2265769053a5 253:17 0 98M 0 crypt
> /var/lib/ceph/osd/ceph-16
> ├─sda2 8:2 0 1.7T 0 part
> │ └─049b7160-1234-4edd-a5dc-fe00faca8d89 253:16 0 1.7T 0 crypt
> └─sda5 8:5 0 10M 0 part
> /var/lib/ceph/osd-lockbox/97712be4-9674-4acc-1234-2265769053a5
> sdb 8:16 0 1.7T 0 disk
> ├─sdb1 8:17 0 100M 0 part
> │ └─f03f0298-1234-42e9-8b28-f3016e44d1e2 253:26 0 98M 0 crypt
> /var/lib/ceph/osd/ceph-17
> ├─sdb2 8:18 0 1.7T 0 part
> │ └─51177019-1234-4963-82d1-5006233f5ab2 253:30 0 1.7T 0 crypt
> └─sdb5 8:21 0 10M 0 part
> /var/lib/ceph/osd-lockbox/f03f0298-1234-42e9-8b28-f3016e44d1e2
> sdc 8:32 0 1.7T 0 disk
> ├─sdc1 8:33 0 100M 

Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-05 Thread Alfredo Deza
On Fri, Jul 5, 2019 at 6:23 AM ST Wong (ITSC)  wrote:
>
> Hi,
>
>
>
> I intended to run just destroy and re-use the ID as stated in the
> manual, but it does not seem to work.
>
> It seems I’m unable to re-use the ID?

The OSD replacement guide does not mention anything about the crush and
auth commands. I believe you are now in a situation where the ID can no
longer be re-used, and ceph-volume will not create it for you when you
specify it on the CLI.

I don't know why there is so much attachment to these ID numbers; why
is it desirable to have that 71 number back again?
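
For what it's worth, the flow the replacement guide intends (and which
only works while the OSD entry still exists in the map) is roughly the
following, using the devices from the original message:

ceph osd destroy 71 --yes-i-really-mean-it
ceph-volume lvm create --bluestore --data /dev/data/lv01 --block.db /dev/db/lv01 --osd-id 71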
>
>
>
> Thanks.
>
> /stwong
>
>
>
>
>
> From: Paul Emmerich 
> Sent: Friday, July 5, 2019 5:54 PM
> To: ST Wong (ITSC) 
> Cc: Eugen Block ; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph-volume failed after replacing disk
>
>
>
>
>
> On Fri, Jul 5, 2019 at 11:25 AM ST Wong (ITSC)  wrote:
>
> Hi,
>
> Yes, I run the commands before:
>
> # ceph osd crush remove osd.71
> device 'osd.71' does not appear in the crush map
> # ceph auth del osd.71
> entity osd.71 does not exist
>
>
>
> which is probably the reason why you couldn't recycle the OSD ID.
>
>
>
> Either run just destroy and re-use the ID or run purge and not re-use the ID.
>
> Manually deleting auth and crush entries is no longer needed since purge was 
> introduced.
>
>
>
>
>
> Paul
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
>
>
>
> Thanks.
> /stwong
>
> -Original Message-
> From: ceph-users  On Behalf Of Eugen Block
> Sent: Friday, July 5, 2019 4:54 PM
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] ceph-volume failed after replacing disk
>
> Hi,
>
> did you also remove that OSD from crush and also from auth before recreating 
> it?
>
> ceph osd crush remove osd.71
> ceph auth del osd.71
>
> Regards,
> Eugen
>
>
> Zitat von "ST Wong (ITSC)" :
>
> > Hi all,
> >
> > We replaced a faulty disk out of N OSD and tried to follow steps
> > according to "Replacing and OSD" in
> > http://docs.ceph.com/docs/nautilus/rados/operations/add-or-rm-osds/,
> > but got error:
> >
> > # ceph osd destroy 71 --yes-i-really-mean-it
> > # ceph-volume lvm create --bluestore --data /dev/data/lv01 --osd-id 71 --block.db /dev/db/lv01
> > Running command: /bin/ceph-authtool --gen-print-key Running command:
> > /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring
> > /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> > -->  RuntimeError: The osd ID 71 is already in use or does not exist.
> >
> > ceph -s still shows  N OSDS.   I then remove with "ceph osd rm 71".
> >  Now "ceph -s" shows N-1 OSDS and id 71 doesn't appear in "ceph osd
> > ls".
> >
> > However, repeating the ceph-volume command still gets same error.
> > We're running CEPH 14.2.1.   I must have some steps missed.Would
> > anyone please help? Thanks a lot.
> >
> > Rgds,
> > /stwong
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph-volume ignores cluster name from ceph.conf

2019-06-28 Thread Alfredo Deza
On Fri, Jun 28, 2019 at 7:53 AM Stolte, Felix  wrote:
>
> Thanks for the update Alfredo. What steps need to be done to rename my 
> cluster back to "ceph"?

That is a tough one; the ramifications of a custom cluster name are
wild - it touches everything. I am not sure there is a step-by-step
guide on how to do this. I would personally recommend re-doing the
cluster (knowing well that this might not be possible in certain cases).
>
> The clustername is in several folder- and filenames etc
>
> Regards
> Felix
> -
> -
> Forschungszentrum Juelich GmbH
> 52425 Juelich
> Sitz der Gesellschaft: Juelich
> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
> Prof. Dr. Sebastian M. Schmidt
> -
> ---------
>
>
> Am 27.06.19, 15:09 schrieb "Alfredo Deza" :
>
> Although ceph-volume does a best-effort to support custom cluster
> names, the Ceph project does not support custom cluster names anymore
> even though you can still see settings/options that will allow you to
> set it.
>
> For reference see: https://bugzilla.redhat.com/show_bug.cgi?id=1459861
>
> On Thu, Jun 27, 2019 at 7:59 AM Stolte, Felix  
> wrote:
> >
> > Hi folks,
> >
> > I have a nautilus 14.2.1 cluster with a non-default cluster name 
> (ceph_stag instead of ceph). I set “cluster = ceph_stag” in 
> /etc/ceph/ceph_stag.conf.
> >
> > ceph-volume is using the correct config file but does not use the 
> specified clustername. Did I hit a bug or do I need to define the clustername 
> elsewere?
> >
> > Regards
> > Felix
> > IT-Services
> > Telefon 02461 61-9243
> > E-Mail: f.sto...@fz-juelich.de
> > 
> -
> > 
> -
> > Forschungszentrum Juelich GmbH
> > 52425 Juelich
> > Sitz der Gesellschaft: Juelich
> > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> > Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
> > Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
> > Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
> > Prof. Dr. Sebastian M. Schmidt
> > 
> -
> > 
> -
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pgs incomplete

2019-06-27 Thread Alfredo Deza
On Thu, Jun 27, 2019 at 10:36 AM ☣Adam  wrote:

> Well that caused some excitement (either that or the small power
> disruption did)!  One of my OSDs is now down because it keeps crashing
> due to a failed assert (stacktraces attached, also I'm apparently
> running mimic, not luminous).
>
> In the past a failed assert on an OSD has meant removing the disk,
> wiping it, re-adding it as a new one, and then have ceph rebuild it from
> other copies of the data.
>
> I did this all manually in the past, but I'm trying to get more familiar
> with ceph's commands.  Will the following commands do the same?
>
> ceph-volume lvm zap --destroy --osd-id 11
> # Presumably that has to be run from the node with OSD 11, not just
> # any ceph node?
> # Source: http://docs.ceph.com/docs/mimic/ceph-volume/lvm/zap


That looks correct, and yes, you would need to run it on the node with OSD 11.


>
> Do I need to remove the OSD (ceph osd out 11; wait for stabilization;
> ceph osd purge 11) before I do this, and run "ceph-deploy osd create"
> afterwards?
>

I think that what you need is essentially the same as the guide for
migrating from filestore to bluestore:

http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/
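
A rough, untested sketch of that flow for this OSD (assuming the data
is replicated elsewhere and the cluster is healthy; /dev/sdX is a
placeholder for the actual device) would be:

ceph osd out 11
# wait for the data to migrate off, then:
systemctl stop ceph-osd@11
ceph-volume lvm zap --destroy --osd-id 11
ceph osd destroy 11 --yes-i-really-mean-it
ceph-volume lvm create --bluestore --data /dev/sdX --osd-id 11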


> Thanks,
> Adam
>
>
> On 6/26/19 6:35 AM, Paul Emmerich wrote:
> > Have you tried: ceph osd force-create-pg ?
> >
> > If that doesn't work: use objectstore-tool on the OSD (while it's not
> > running) and use it to force mark the PG as complete. (Don't know the
> > exact command off the top of my head)
> >
> > Caution: these are obviously really dangerous commands
> >
> >
> >
> > Paul
> >
> >
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> > www.croit.io 
> > Tel: +49 89 1896585 90
> >
> >
> > On Wed, Jun 26, 2019 at 1:56 AM ☣Adam  > > wrote:
> >
> > How can I tell ceph to give up on "incomplete" PGs?
> >
> > I have 12 pgs which are "inactive, incomplete" that won't recover.  I
> > think this is because in the past I have carelessly pulled disks too
> > quickly without letting the system recover.  I suspect the disks that
> > have the data for these are long gone.
> >
> > Whatever the reason, I want to fix it so I have a clean cluser even
> if
> > that means losing data.
> >
> > I went through the "troubleshooting pgs" guide[1] which is excellent,
> > but didn't get me to a fix.
> >
> > The output of `ceph pg 2.0 query` includes this:
> > "recovery_state": [
> > {
> > "name": "Started/Primary/Peering/Incomplete",
> > "enter_time": "2019-06-25 18:35:20.306634",
> > "comment": "not enough complete instances of this PG"
> > },
> >
> > I've already restated all OSDs in various orders, and I changed
> min_size
> > to 1 to see if that would allow them to get fixed, but no such luck.
> > These pools are not erasure coded and I'm using the Luminous release.
> >
> > How can I tell ceph to give up on these PGs?  There's nothing
> identified
> > as unfound, so mark_unfound_lost doesn't help.  I feel like `ceph osd
> > lost` might be it, but at this point the OSD numbers have been reused
> > for new disks, so I'd really like to limit the damage to the 12 PGs
> > which are incomplete if possible.
> >
> > Thanks,
> > Adam
> >
> > [1]
> >
> http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com 
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph-volume ignores cluster name from ceph.conf

2019-06-27 Thread Alfredo Deza
Although ceph-volume makes a best effort to support custom cluster
names, the Ceph project no longer supports custom cluster names, even
though you can still see settings/options that will allow you to set
one.

For reference see: https://bugzilla.redhat.com/show_bug.cgi?id=1459861

On Thu, Jun 27, 2019 at 7:59 AM Stolte, Felix  wrote:
>
> Hi folks,
>
> I have a nautilus 14.2.1 cluster with a non-default cluster name (ceph_stag 
> instead of ceph). I set “cluster = ceph_stag” in /etc/ceph/ceph_stag.conf.
>
> ceph-volume is using the correct config file but does not use the specified 
> clustername. Did I hit a bug or do I need to define the clustername elsewere?
>
> Regards
> Felix
> IT-Services
> Telefon 02461 61-9243
> E-Mail: f.sto...@fz-juelich.de
> -
> -
> Forschungszentrum Juelich GmbH
> 52425 Juelich
> Sitz der Gesellschaft: Juelich
> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
> Prof. Dr. Sebastian M. Schmidt
> -
> -
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Changing the release cadence

2019-06-25 Thread Alfredo Deza
On Mon, Jun 17, 2019 at 4:09 PM David Turner  wrote:
>
> This was a little long to respond with on Twitter, so I thought I'd share my 
> thoughts here. I love the idea of a 12 month cadence. I like October because 
> admins aren't upgrading production within the first few months of a new 
> release. It gives it plenty of time to be stable for the OS distros as well 
> as giving admins something low-key to work on over the holidays with testing 
> the new releases in stage/QA.

October sounds ideal, but in reality, we haven't been able to release
right on time as long as I can remember. Realistically, if we set
October, we are probably going to get into November/December.

For example, Nautilus was set to release in February and we got it out
in late March (almost April).

Would love to see more of a discussion around solving the problem of
releasing when we say we are going to - so that we can then choose
what the cadence is.

>
> On Mon, Jun 17, 2019 at 12:22 PM Sage Weil  wrote:
>>
>> On Wed, 5 Jun 2019, Sage Weil wrote:
>> > That brings us to an important decision: what time of year should we
>> > release?  Once we pick the timing, we'll be releasing at that time *every
>> > year* for each release (barring another schedule shift, which we want to
>> > avoid), so let's choose carefully!
>>
>> I've put up a twitter poll:
>>
>> https://twitter.com/liewegas/status/1140655233430970369
>>
>> Thanks!
>> sage
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Lost OSD from PCIe error, recovered, HOW to restore OSD process

2019-05-19 Thread Alfredo Deza
On Thu, May 16, 2019 at 3:55 PM Mark Lehrer  wrote:

> > Steps 3-6 are to get the drive lvm volume back
>
> How much longer will we have to deal with LVM?  If we can migrate non-LVM
> drives from earlier versions, how about we give ceph-volume the ability to
> create non-LVM OSDs directly?
>

We aren't requiring LVM exclusively; there is, for example, a ZFS plugin
already, so I would say that if you want to have something like partitions,
you can have that as a plugin (one would need to be developed). We are
concentrating on LVM because we think that is the way to go.


>
>
> On Thu, May 16, 2019 at 1:20 PM Tarek Zegar  wrote:
>
>> FYI for anyone interested, below is how to recover from someone
>> removing an NVMe drive (the first two steps show how mine were removed and
>> brought back)
>> Steps 3-6 are to get the drive lvm volume back AND get the OSD daemon
>> running for the drive
>>
>> 1. echo 1 > /sys/block/nvme0n1/device/device/remove
>> 2. echo 1 > /sys/bus/pci/rescan
>> 3. vgcfgrestore ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841 ; vgchange -ay
>> ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841
>> 4. ceph auth add osd.122 osd 'allow *' mon 'allow rwx' -i
>> /var/lib/ceph/osd/ceph-122/keyring
>> 5. ceph-volume lvm activate --all
>> 6. You should see the drive somewhere in the ceph tree, move it to the
>> right host
>>
>> Tarek
>>
>>
>>
>>
>> From: "Tarek Zegar" 
>> To: Alfredo Deza 
>> Cc: ceph-users 
>> Date: 05/15/2019 10:32 AM
>> Subject: [EXTERNAL] Re: [ceph-users] Lost OSD from PCIe error,
>> recovered, to restore OSD process
>> Sent by: "ceph-users" 
>> --
>>
>>
>>
>> TLDR; I activated the drive successfully but the daemon won't start,
>> looks like it's complaining about mon config, idk why (there is a valid
>> ceph.conf on the host). Thoughts? I feel like it's close. Thank you
>>
>> I executed the command:
>> ceph-volume lvm activate --all
>>
>>
>> It found the drive and activated it:
>> --> Activating OSD ID 122 FSID a151bea5-d123-45d9-9b08-963a511c042a
>> 
>> --> ceph-volume lvm activate successful for osd ID: 122
>>
>>
>>
>> However, systemd would not start the OSD process 122:
>> May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]: 2019-05-15
>> 14:16:13.862 71970700 -1 monclient(hunting): handle_auth_bad_method
>> server allowed_methods [2] but i only support [2]
>> May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]: 2019-05-15
>> 14:16:13.862 7116f700 -1 monclient(hunting): handle_auth_bad_method
>> server allowed_methods [2] but i only support [2]
>> May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]:* failed to
>> fetch mon config (--no-mon-config to skip)*
>> May 15 14:16:13 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
>> Main process exited, code=exited, status=1/FAILURE
>> May 15 14:16:13 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service: 
>> *Failed
>> with result 'exit-code'.*
>> May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
>> Service hold-off time over, scheduling restart.
>> May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
>> Scheduled restart job, restart counter is at 3.
>> -- Subject: Automatic restarting of a unit has been scheduled
>> -- Defined-By: systemd
>> -- Support: *http://www.ubuntu.com/support*
>> <http://www.ubuntu.com/support>
>> --
>> -- Automatic restarting of the unit ceph-osd@122.service has been
>> scheduled, as the result for
>> -- the configured Restart= setting for the unit.
>> May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: Stopped Ceph object
>> storage daemon osd.122.
>> -- Subject: Unit ceph-osd@122.service has finished shutting down
>> -- Defined-By: systemd
>> -- Support: *http://www.ubuntu.com/support*
>> <http://www.ubuntu.com/support>
>> --
>> -- Unit ceph-osd@122.service has finished shutting down.
>> May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
>> Start request repeated too quickly.
>> May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
&

Re: [ceph-users] Lost OSD from PCIe error, recovered, to restore OSD process

2019-05-15 Thread Alfredo Deza
On Tue, May 14, 2019 at 7:24 PM Bob R  wrote:
>
> Does 'ceph-volume lvm list' show it? If so you can try to activate it with 
> 'ceph-volume lvm activate 122 74b01ec2--124d--427d--9812--e437f90261d4'

Good suggestion. If `ceph-volume lvm list` can see it, it can probably
activate it again. You can activate it with the OSD ID + OSD FSID, or
do:

ceph-volume lvm activate --all

You didn't say if the OSD wasn't coming up after trying to start it
(the systemd unit should still be there for ID 122), or if you tried
rebooting and that OSD didn't come up.

The systemd unit is tied to both the ID and FSID of the OSD, so it
shouldn't matter if the underlying device changed since ceph-volume
ensures it is the right one every time it activates.
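
If activation succeeds but the OSD still isn't up, checking the systemd
unit directly is a reasonable next step, for example:

ceph-volume lvm list
systemctl status ceph-osd@122 --no-pager
systemctl start ceph-osd@122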
>
> Bob
>
> On Tue, May 14, 2019 at 7:35 AM Tarek Zegar  wrote:
>>
>> Someone nuked and OSD that had 1 replica PGs. They accidentally did echo 1 > 
>> /sys/block/nvme0n1/device/device/remove
>> We got it back doing a echo 1 > /sys/bus/pci/rescan
>> However, it reenumerated as a different drive number (guess we didn't have 
>> udev rules)
>> They restored the LVM volume (vgcfgrestore 
>> ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841 ; vgchange -ay 
>> ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841)
>>
>> lsblk
>> nvme0n2 259:9 0 1.8T 0 diskc
>> ceph--8c81b2a3--6c8e--4cae--a3c0--e2d91f82d841-osd--data--74b01ec2--124d--427d--9812--e437f90261d4
>>  253:1 0 1.8T 0 lvm
>>
>> We are stuck here. How do we attach an OSD daemon to the drive? It was 
>> OSD.122 previously
>>
>> Thanks
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume ignores cluster name?

2019-05-13 Thread Alfredo Deza
On Mon, May 13, 2019 at 6:56 PM  wrote:
>
> All;
>
> I'm working on spinning up a demonstration cluster using ceph, and yes, I'm 
> installing it manually, for the purpose of learning.
>
> I can't seem to correctly create an OSD, as ceph-volume seems to only work if 
> the cluster name is the default.  If I rename my configuration file (at 
> /etc/ceph/) to ceph.conf, I can manage to create an OSD, but then it fails to 
> start.

If you rename your configuration file to something different from what
your OSDs were created with, you will end up with an OSD that thinks
it belongs to the older config file.

ceph-volume does support passing a configuration flag to set a custom
cluster name, and it should work fine, but beware that custom cluster
names are no longer supported even though the flag exists.

I bet that if you list your current OSDs with `ceph-volume lvm list
--format=json` you will see that the OSD has the older cluster name.
This means that the cluster name is "sticky" with the OSD, so that each
device that is part of an OSD knows which cluster it belongs to.

If you decide to keep using custom cluster names you will probably end
up with issues that are either annoying or plain unfixable.

For reference see: https://bugzilla.redhat.com/show_bug.cgi?id=1459861
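
A quick way to check (the exact tag name can vary slightly between
versions, so grep broadly):

ceph-volume lvm list --format=json | grep -i cluster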

>
> I've tried adding the --cluster argument to ceph-volume, but that doesn't 
> seem to affect anything.
>
> Any thoughts?
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director - Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Custom Ceph-Volume Batch with Mixed Devices

2019-05-10 Thread Alfredo Deza
On Fri, May 10, 2019 at 3:21 PM Lazuardi Nasution
 wrote:
>
> Hi Alfredo,
>
> Thank you for your answer, it is very helpful. Do you mean that
> --osds-per-device=3 is mistyped? It should be --osds-per-device=4 to create 4
> OSDs as expected, right? I'm trying to avoid creating them from manually
> created LVs, so that I keep the consistent Ceph way of VG and LV naming.

Typo, yes... good catch!

The VG/LV naming isn't a super advantage here because it was done to
avoid collisions when creating them programmatically :) I don't know
why you want to place OSDs in this way, which we aren't recommending
anywhere; you might as well go with what batch proposes.

>
> By the way, is it possible to do these two ceph-volume batch commands in a
> single ceph-ansible run, or should I run it twice with different configurations?
> If it is possible, what should I put in the configuration file?

This might be a good example of why I am recommending against it: tools
will probably not support it. I don't think you can make ceph-ansible do
this unless you are pre-creating the LVs, which, if you are using
Ansible, shouldn't be too hard anyway.
>
> Best regards,
>
> On Sat, May 11, 2019, 02:09 Alfredo Deza  wrote:
>>
>> On Fri, May 10, 2019 at 2:43 PM Lazuardi Nasution
>>  wrote:
>> >
>> > Hi,
>> >
>> > Let's say I have following devices on a host.
>> >
>> > /dev/sda
>> > /dev/sdb
>> > /dev/nvme0n1
>> >
>> > How can I do ceph-volume batch which create bluestore OSD on HDDs and NVME 
>> > (devided to be 4 OSDs) and put block.db of HDDs on the NVME too? Following 
>> > are what I'm expecting on created LVs.
>>
>> You can, but it isn't easy (batch is meant to be opinionated) and what
>> you are proposing is a bit of an odd scenario that doesn't fit well
>> with what the batch command will want to do, which is: create OSDs
>> from a list
>> of devices and do the most optimal layout possible.
>>
>> I would suggest strongly to just use `ceph-volume lvm create` with
>> pre-made LVs that you can pass into it to arrange things in the way
>> you need. However, you might still be able to force batch here by
>> defining
>> the block.db sizes in ceph.conf, otherwise ceph-volume falls back to
>> "as large as possible". Having defined a size (say, 10GB) you can do
>> this:
>>
>> ceph-volume lvm batch /dev/sda /dev/sdb /dev/nvme0n1
>> ceph-volume lvm batch --osds-per-device=3 /dev/nvme0n1
>>
>> Again, I highly recommend against this setup and trying to make batch
>> do this - not 100% it will work...
>> >
>> > /dev/sda: DATA0
>> > /dev/sdb: DATA1
>> > /dev/nvme0n1: DB0 | DB1 | DATA2 | DATA3 | DATA4 | DATA5
>> >
>> > Best regards,
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Custom Ceph-Volume Batch with Mixed Devices

2019-05-10 Thread Alfredo Deza
On Fri, May 10, 2019 at 2:43 PM Lazuardi Nasution
 wrote:
>
> Hi,
>
> Let's say I have following devices on a host.
>
> /dev/sda
> /dev/sdb
> /dev/nvme0n1
>
> How can I do ceph-volume batch which create bluestore OSD on HDDs and NVME 
> (devided to be 4 OSDs) and put block.db of HDDs on the NVME too? Following 
> are what I'm expecting on created LVs.

You can, but it isn't easy (batch is meant to be opinionated), and what
you are proposing is a bit of an odd scenario that doesn't fit well
with what the batch command wants to do, which is to create OSDs from a
list of devices and lay them out as optimally as possible.

I would strongly suggest just using `ceph-volume lvm create` with
pre-made LVs that you can pass into it to arrange things the way you
need. However, you might still be able to force batch here by defining
the block.db sizes in ceph.conf; otherwise ceph-volume falls back to
"as large as possible". Having defined a size (say, 10GB), you can do
this:

ceph-volume lvm batch /dev/sda /dev/sdb /dev/nvme0n1
ceph-volume lvm batch --osds-per-device=3 /dev/nvme0n1

Again, I highly recommend against this setup and against trying to make
batch do this - not 100% sure it will work...
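
If you do go the pre-made LV route instead, a minimal sketch could look
like the following (the VG/LV names and sizes are made up for
illustration; adjust to your capacity):

vgcreate ceph-nvme /dev/nvme0n1
lvcreate -L 10G -n db0 ceph-nvme
lvcreate -L 10G -n db1 ceph-nvme
lvcreate -L 400G -n data2 ceph-nvme        # repeat for data3..data5
ceph-volume lvm create --bluestore --data /dev/sda --block.db ceph-nvme/db0
ceph-volume lvm create --bluestore --data /dev/sdb --block.db ceph-nvme/db1
ceph-volume lvm create --bluestore --data ceph-nvme/data2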
>
> /dev/sda: DATA0
> /dev/sdb: DATA1
> /dev/nvme0n1: DB0 | DB1 | DATA2 | DATA3 | DATA4 | DATA5
>
> Best regards,
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume activate runs infinitely

2019-05-02 Thread Alfredo Deza
On Thu, May 2, 2019 at 8:28 AM Robert Sander
 wrote:
>
> Hi,
>
> On 02.05.19 13:40, Alfredo Deza wrote:
>
> > Can you give a bit more details on the environment? How dense is the
> > server? If the unit retries is fine and I was hoping at some point it
> > would see things ready and start activating (it does retry
> > indefinitely at the moment).
>
> It is a machine with 13 Bluestore OSDs on LVM with SSDs as Block.DB devices.
> The SSDs have also been setup with LVM. This has been done with "ceph-volume 
> lvm batch".
>
> The issue started with the latest Ubuntu updates (no Ceph updates involved)
> and the following reboot. The customer let the boot process run for over
> 30 minutes but the ceph-volume activation services (and wpa-supplicant + 
> logind)
> were not able to start.
>
> > Would also help to see what problems is it encountering as it can't
> > get to activate. There are two logs for this, one for the systemd unit
> > at /var/log/ceph/ceph-volume-systemd.log and the other one at
> > /var/log/ceph/ceph-volume.log that might
> > help.
>
> Like these entries?
>
> [2019-05-02 10:04:32,211][ceph_volume.process][INFO  ] stderr Job for 
> ceph-osd@21.service canceled.
> [2019-05-02 10:04:32,211][ceph_volume][ERROR ] exception caught by decorator
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 59, 
> in newfunc
> return f(*a, **kw)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/main.py", line 148, in 
> main
> terminal.dispatch(self.mapper, subcommand_args)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, 
> in dispatch
> instance.main()
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/main.py", 
> line 40, in main
> terminal.dispatch(self.mapper, self.argv)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, 
> in dispatch
> instance.main()
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, 
> in is_root
> return func(*a, **kw)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/trigger.py", 
> line 70, in main
> Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
>   File 
> "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 
> 339, in main
> self.activate(args)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, 
> in is_root
> return func(*a, **kw)
>   File 
> "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 
> 261, in activate
> return activate_bluestore(lvs, no_systemd=args.no_systemd)
>   File 
> "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 
> 196, in activate_bluestore
> systemctl.start_osd(osd_id)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/systemd/systemctl.py", 
> line 39, in start_osd
> return start(osd_unit % id_)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/systemd/systemctl.py", 
> line 8, in start
> process.run(['systemctl', 'start', unit])
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 153, 
> in run
> raise RuntimeError(msg)
> RuntimeError: command returned non-zero exit status: 1
>
>
> [2019-05-02 10:04:32,222][ceph_volume.process][INFO  ] stdout Running 
> command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-21
> --> Absolute path not found for executable: restorecon
> --> Ensure $PATH environment variable contains common executable locations
> Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-21
> Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir 
> --dev 
> /dev/ceph-block-393ba2fc-e970-4d48-8dcb-c6261dfdfe08/osd-block-931e2d94-63f6-4df8-baed-6873eb0123e2
>  --path /var/lib/ceph/osd/ceph-21 --no-mon-config
> Running command: /bin/ln -snf 
> /dev/ceph-block-393ba2fc-e970-4d48-8dcb-c6261dfdfe08/osd-block-931e2d94-63f6-4df8-baed-6873eb0123e2
>  /var/lib/ceph/osd/ceph-21/block
> Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-21/block
> Running command: /bin/chown -R ceph:ceph /dev/dm-12
> Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-21
> Running command: /bin/ln -snf 
> /dev/ceph-block-dbs-75eda181-946f-4a40-b4e0-8ecd60721398/osd-block-db-45ee9a1f-3ee2-4db9-a057-fd06fa1452e8
>  /var/lib/ceph/osd/ceph-21/block.db
> Running command: /bin/chown -h ceph:ceph 
> /dev/ceph-block-dbs-75eda181-946f-4a40-b4e0-8ecd60721398

Re: [ceph-users] ceph-volume activate runs infinitely

2019-05-02 Thread Alfredo Deza
On Thu, May 2, 2019 at 5:27 AM Robert Sander
 wrote:
>
> Hi,
>
> The ceph-volume@.service units on an Ubuntu 18.04.2 system
> run unlimited and do not finish.
>
> Only after we create this override config the system boots again:
>
> # /etc/systemd/system/ceph-volume@.service.d/override.conf
> [Unit]
> After=network-online.target local-fs.target time-sync.target ceph-mon.target
>
> It looks like "After=local-fs.target" (the original value) is not
> enough for the dependencies.

Can you give a bit more detail on the environment? How dense is the
server? If the unit retries, that is fine; I was hoping that at some
point it would see things ready and start activating (it does retry
indefinitely at the moment).

It would also help to see what problems it is encountering as it fails
to activate. There are two logs for this: one for the systemd unit at
/var/log/ceph/ceph-volume-systemd.log and the other at
/var/log/ceph/ceph-volume.log. Both might help.

The "After=" directive is just adding some wait time to start
activating here, so I wonder how is it that your OSDs didn't
eventually came up.
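
For example, something like this gathers the relevant state in one go:

systemctl list-units 'ceph-volume@*' --no-pager
journalctl -b -u 'ceph-volume@*' | tail -n 50
tail -n 50 /var/log/ceph/ceph-volume-systemd.log /var/log/ceph/ceph-volume.log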
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> https://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Amtsgericht Berlin-Charlottenburg - HRB 93818 B
> Geschäftsführer: Peer Heinlein - Sitz: Berlin
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to properly clean up bluestore disks

2019-04-18 Thread Alfredo Deza
On Thu, Apr 18, 2019 at 3:01 PM Sergei Genchev  wrote:
>
>  Thank you Alfredo
> I did not have any reasons to keep volumes around.
> I tried using ceph-volume to zap these stores, but none of the commands
> worked, including yours 'ceph-volume lvm zap 
> osvg-sdd-db/2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz'

If you do not want to keep them around, you would need to use --destroy
and pass the LV path as input:

ceph-volume lvm zap --destroy osvg-sdd-db/2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz

>
> I ended up manually removing LUKS volumes and then deleting LVM LV, VG, and PV
>
> cryptsetup remove /dev/mapper/AeV0iG-odWF-NRPE-1bVK-0mxH-OgHL-fneTzr
> cryptsetup remove /dev/mapper/2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz
> lvremove 
> /dev/ceph-f4efa78f-a467-4214-b550-81653da1c9bd/osd-block-097d59be-bbe6-493a-b785-48b259d2ff35
> sgdisk -Z /dev/sdd
>
> # ceph-volume lvm zap osvg-sdd-db/2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz
> Running command: /usr/sbin/cryptsetup status /dev/mapper/
> --> Zapping: osvg-sdd-db/2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz
> Running command: /usr/sbin/wipefs --all 
> osvg-sdd-db/2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz
>  stderr: wipefs: error: osvg-sdd-db/2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz: 
> probing initialization failed: No such file or directory
> -->  RuntimeError: command returned non-zero exit status: 1

In this case, you removed the LV so the wipefs failed because that LV
no longer exists. Do you have output on how it failed before?

>
>
> On Thu, Apr 18, 2019 at 10:10 AM Alfredo Deza  wrote:
>>
>> On Thu, Apr 18, 2019 at 10:55 AM Sergei Genchev  wrote:
>> >
>> >  Hello,
>> > I have a server with 18 disks, and 17 OSD daemons configured. One of the 
>> > OSD daemons failed to deploy with ceph-deploy. The reason for failing is 
>> > unimportant at this point, I believe it was race condition, as I was 
>> > running ceph-deploy inside while loop for all disks in this server.
>> >   Now I have two left over LVM dmcrypded volumes that I am not sure how 
>> > clean up. The command that failed and did not quite clean up after itself 
>> > was:
>> > ceph-deploy osd create --bluestore --dmcrypt --data /dev/sdd --block-db 
>> > osvg/sdd-db ${SERVERNAME}
>> >
>> > # lsblk
>> > ...
>> > sdd 8:48   0   7.3T  0 disk
>> > └─ceph--f4efa78f--a467--4214--b550--81653da1c9bd-osd--block--097d59be--bbe6--493a--b785--48b259d2ff35
>> >   253:32   0   7.3T  0 lvm
>> >   └─AeV0iG-odWF-NRPE-1bVK-0mxH-OgHL-fneTzr253:33   0   7.3T  0 crypt
>> >
>> > sds65:32   0 223.5G  0 disk
>> > ├─sds1 65:33   0   512M  0 part  
>> > /boot
>> > └─sds2 65:34   0   223G  0 part
>> >  ...
>> >├─osvg-sdd--db  253:80 8G  0 lvm
>> >│ └─2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz  253:34   0 8G  0 crypt
>> >
>> > # ceph-volume inventory /dev/sdd
>> >
>> > == Device report /dev/sdd ==
>> >
>> >  available False
>> >  rejected reasons  locked
>> >  path  /dev/sdd
>> >  scheduler modedeadline
>> >  rotational1
>> >  vendorSEAGATE
>> >  human readable size   7.28 TB
>> >  sas address   0x5000c500a6b1d581
>> >  removable 0
>> >  model ST8000NM0185
>> >  ro0
>> > --- Logical Volume ---
>> >  cluster name  ceph
>> >  name  
>> > osd-block-097d59be-bbe6-493a-b785-48b259d2ff35
>> >  osd id39
>> >  cluster fsid  8e7a3953-7647-4133-9b9a-7f4a2e2b7da7
>> >  type  block
>> >  block uuidAeV0iG-odWF-NRPE-1bVK-0mxH-OgHL-fneTzr
>> >  osd fsid  097d59be-bbe6-493a-b785-48b259d2ff35
>> >
>> > I was trying to run
>> > ceph-volume lvm zap --destroy /dev/sdd but it errored out. Osd id on this 
>> > volume is the same as on next drive, /dev/sde, and osd.39 daemon is 
>> > running. This command was trying to zap running osd.
>> >
>> > What is the proper way to clean both data and block db volumes, so I can 
>> > reru

Re: [ceph-users] How to properly clean up bluestore disks

2019-04-18 Thread Alfredo Deza
On Thu, Apr 18, 2019 at 10:55 AM Sergei Genchev  wrote:
>
>  Hello,
> I have a server with 18 disks, and 17 OSD daemons configured. One of the OSD 
> daemons failed to deploy with ceph-deploy. The reason for failing is 
> unimportant at this point, I believe it was race condition, as I was running 
> ceph-deploy inside while loop for all disks in this server.
>   Now I have two left over LVM dmcrypded volumes that I am not sure how clean 
> up. The command that failed and did not quite clean up after itself was:
> ceph-deploy osd create --bluestore --dmcrypt --data /dev/sdd --block-db 
> osvg/sdd-db ${SERVERNAME}
>
> # lsblk
> ...
> sdd 8:48   0   7.3T  0 disk
> └─ceph--f4efa78f--a467--4214--b550--81653da1c9bd-osd--block--097d59be--bbe6--493a--b785--48b259d2ff35
>   253:32   0   7.3T  0 lvm
>   └─AeV0iG-odWF-NRPE-1bVK-0mxH-OgHL-fneTzr253:33   0   7.3T  0 crypt
>
> sds65:32   0 223.5G  0 disk
> ├─sds1 65:33   0   512M  0 part  /boot
> └─sds2 65:34   0   223G  0 part
>  ...
>├─osvg-sdd--db  253:80 8G  0 lvm
>│ └─2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz  253:34   0 8G  0 crypt
>
> # ceph-volume inventory /dev/sdd
>
> == Device report /dev/sdd ==
>
>  available False
>  rejected reasons  locked
>  path  /dev/sdd
>  scheduler modedeadline
>  rotational1
>  vendorSEAGATE
>  human readable size   7.28 TB
>  sas address   0x5000c500a6b1d581
>  removable 0
>  model ST8000NM0185
>  ro0
> --- Logical Volume ---
>  cluster name  ceph
>  name  osd-block-097d59be-bbe6-493a-b785-48b259d2ff35
>  osd id39
>  cluster fsid  8e7a3953-7647-4133-9b9a-7f4a2e2b7da7
>  type  block
>  block uuidAeV0iG-odWF-NRPE-1bVK-0mxH-OgHL-fneTzr
>  osd fsid  097d59be-bbe6-493a-b785-48b259d2ff35
>
> I was trying to run
> ceph-volume lvm zap --destroy /dev/sdd but it errored out. Osd id on this 
> volume is the same as on next drive, /dev/sde, and osd.39 daemon is running. 
> This command was trying to zap running osd.
>
> What is the proper way to clean both data and block db volumes, so I can 
> rerun ceph-deploy again, and add them to the pool?
>

Do you want to keep the LVs around, or do you want to completely get rid of
them? If you are passing /dev/sdd to 'zap' you are telling the tool to
destroy everything that is in there, regardless of who owns it (including
running OSDs).

If you want to keep the LVs around you can omit the --destroy flag and pass
the LVs as input, or, if you are on a recent enough version, you can use
--osd-fsid to zap:

ceph-volume lvm zap osvg-sdd-db/2ukzAx-g9pZ-IyxU-Sp9h-fHv2-INNY-1vTpvz

If you don't want the LVs around you can add --destroy, but use the LV
as input (not the device)
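
For example, with the two leftover LVs from your output (vg/lv notation;
please double-check the names on your system before running anything
destructive):

ceph-volume lvm zap --destroy osvg/sdd-db
ceph-volume lvm zap --destroy ceph-f4efa78f-a467-4214-b550-81653da1c9bd/osd-block-097d59be-bbe6-493a-b785-48b259d2ff35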

> Thank you!
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] bluefs-bdev-expand experience

2019-04-12 Thread Alfredo Deza
On Thu, Apr 11, 2019 at 4:23 PM Yury Shevchuk  wrote:
>
> Hi Igor!
>
> I have upgraded from Luminous to Nautilus and now slow device
> expansion works indeed.  The steps are shown below to round up the
> topic.
>
> node2# ceph osd df
> ID CLASS WEIGHT  REWEIGHT SIZERAW USE DATAOMAPMETAAVAIL   
> %USE  VAR  PGS STATUS
>  0   hdd 0.22739  1.0 233 GiB  91 GiB  90 GiB 208 MiB 816 MiB 142 GiB 
> 38.92 1.04 128 up
>  1   hdd 0.22739  1.0 233 GiB  91 GiB  90 GiB 200 MiB 824 MiB 142 GiB 
> 38.92 1.04 128 up
>  3   hdd 0.227390 0 B 0 B 0 B 0 B 0 B 0 B 
> 00   0   down
>  2   hdd 0.22739  1.0 481 GiB 172 GiB  90 GiB 201 MiB 823 MiB 309 GiB 
> 35.70 0.96 128 up
> TOTAL 947 GiB 353 GiB 269 GiB 610 MiB 2.4 GiB 594 GiB 
> 37.28
> MIN/MAX VAR: 0.96/1.04  STDDEV: 1.62
>
> node2# lvextend -L+50G /dev/vg0/osd2
>   Size of logical volume vg0/osd2 changed from 400.00 GiB (102400 extents) to 
> 450.00 GiB (115200 extents).
>   Logical volume vg0/osd2 successfully resized.
>
> node2# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-2/
> inferring bluefs devices from bluestore path
> 2019-04-11 22:28:00.240 7f2e24e190c0 -1 bluestore(/var/lib/ceph/osd/ceph-2) 
> _lock_fsid failed to lock /var/lib/ceph/osd/ceph-2/fsid (is another ceph-osd 
> still running?)(11) Resource temporarily unavailable
> ...
> *** Caught signal (Aborted) **
> [two pages of stack dump stripped]
>
> My mistake in the first place: I tried to expand non-stopped osd again.
>
> node2# systemctl stop ceph-osd.target
>
> node2# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-2/
> inferring bluefs devices from bluestore path
> 0 : device size 0x4000 : own 0x[1000~3000] = 0x3000 : using 
> 0x8ff000
> 1 : device size 0x144000 : own 0x[2000~143fffe000] = 0x143fffe000 : using 
> 0x24dfe000
> 2 : device size 0x708000 : own 0x[30~4] = 0x4 : 
> using 0x0
> Expanding...
> 2 : expanding  from 0x64 to 0x708000
> 2 : size label updated to 483183820800
>
> node2# ceph-bluestore-tool show-label --dev /dev/vg0/osd2 | grep size
> "size": 483183820800,
>
> node2# ceph osd df
> ID CLASS WEIGHT  REWEIGHT SIZERAW USE DATAOMAPMETAAVAIL   
> %USE  VAR  PGS STATUS
>  0   hdd 0.22739  1.0 233 GiB  91 GiB  90 GiB 208 MiB 816 MiB 142 GiB 
> 38.92 1.10 128 up
>  1   hdd 0.22739  1.0 233 GiB  91 GiB  90 GiB 200 MiB 824 MiB 142 GiB 
> 38.92 1.10 128 up
>  3   hdd 0.227390 0 B 0 B 0 B 0 B 0 B 0 B 
> 00   0   down
>  2   hdd 0.22739  1.0 531 GiB 172 GiB  90 GiB 185 MiB 839 MiB 359 GiB 
> 32.33 0.91 128 up
> TOTAL 997 GiB 353 GiB 269 GiB 593 MiB 2.4 GiB 644 GiB 
> 35.41
> MIN/MAX VAR: 0.91/1.10  STDDEV: 3.37
>
> It worked: AVAIL = 594+50 = 644.  Great!
> Thanks a lot for your help.
>
> And one more question regarding your last remark is inline below.
>
> On Wed, Apr 10, 2019 at 09:54:35PM +0300, Igor Fedotov wrote:
> >
> > On 4/9/2019 1:59 PM, Yury Shevchuk wrote:
> > > Igor, thank you, Round 2 is explained now.
> > >
> > > Main aka block aka slow device cannot be expanded in Luminus, this
> > > functionality will be available after upgrade to Nautilus.
> > > Wal and db devices can be expanded in Luminous.
> > >
> > > Now I have recreated osd2 once again to get rid of the paradoxical
> > > cepf osd df output and tried to test db expansion, 40G -> 60G:
> > >
> > > node2:/# ceph-volume lvm zap --destroy --osd-id 2
> > > node2:/# ceph osd lost 2 --yes-i-really-mean-it
> > > node2:/# ceph osd destroy 2 --yes-i-really-mean-it
> > > node2:/# lvcreate -L1G -n osd2wal vg0
> > > node2:/# lvcreate -L40G -n osd2db vg0
> > > node2:/# lvcreate -L400G -n osd2 vg0
> > > node2:/# ceph-volume lvm create --osd-id 2 --bluestore --data vg0/osd2 
> > > --block.db vg0/osd2db --block.wal vg0/osd2wal
> > >
> > > node2:/# ceph osd df
> > > ID CLASS WEIGHT  REWEIGHT SIZE   USE AVAIL  %USE VAR  PGS
> > >   0   hdd 0.22739  1.0 233GiB 9.49GiB 223GiB 4.08 1.24 128
> > >   1   hdd 0.22739  1.0 233GiB 9.49GiB 223GiB 4.08 1.24 128
> > >   3   hdd 0.227390 0B  0B 0B00   0
> > >   2   hdd 0.22739  1.0 400GiB 9.49GiB 391GiB 2.37 0.72 128
> > >  TOTAL 866GiB 28.5GiB 837GiB 3.29
> > > MIN/MAX VAR: 0.72/1.24  STDDEV: 0.83
> > >
> > > node2:/# lvextend -L+20G /dev/vg0/osd2db
> > >Size of logical volume vg0/osd2db changed from 40.00 GiB (10240 
> > > extents) to 60.00 GiB (15360 extents).
> > >Logical volume vg0/osd2db successfully resized.
> > >
> > > node2:/# ceph-bluestore-tool bluefs-bdev-expand --path 
> > > /var/lib/ceph/osd/ceph-2/
> > > inferring bluefs devices from bluestore path
> > >   slot 0 /var/lib/ceph/osd/ceph-2//block.wal
> > >   slot 1 /var/lib/ceph/osd/ceph-2//block.db
> > >   slot 2 /var/lib/ceph/osd/ceph-2//block
> > > 0 : size 0x4000 : 

Re: [ceph-users] v14.2.0 Nautilus released

2019-03-20 Thread Alfredo Deza
On Tue, Mar 19, 2019 at 2:53 PM Benjamin Cherian
 wrote:
>
> Hi,
>
> I'm getting an error when trying to use the APT repo for Ubuntu bionic. Does 
> anyone else have this issue? Is the mirror sync actually still in progress? 
> Or was something setup incorrectly?
>
> E: Failed to fetch 
> https://download.ceph.com/debian-nautilus/dists/bionic/main/binary-amd64/Packages.bz2
>   File has unexpected size (15515 != 15488). Mirror sync in progress? [IP: 
> 158.69.68.124 443]
>Hashes of expected file:
> - Filesize:15488 [weak]
> - SHA256:d5ea08e095eeeaa5cc134b1661bfaf55280fcbf8a265d584a4af80d2a424ec17
> - SHA1:6da3a8aa17ed7f828f35f546cdcf923040e8e5b0 [weak]
> - MD5Sum:7e5a4ecea4a4edc3f483623d48b6efa4 [weak]
>Release file created at: Mon, 11 Mar 2019 18:44:46 +
>

This has now been fixed; let me know if you have any more issues.


>
> Thanks,
> Ben
>
>
> On Tue, Mar 19, 2019 at 7:24 AM Sean Purdy  wrote:
>>
>> Hi,
>>
>>
>> Will debian packages be released?  I don't see them in the nautilus repo.  I 
>> thought that Nautilus was going to be debian-friendly, unlike Mimic.
>>
>>
>> Sean
>>
>> On Tue, 19 Mar 2019 14:58:41 +0100
>> Abhishek Lekshmanan  wrote:
>>
>> >
>> > We're glad to announce the first release of Nautilus v14.2.0 stable
>> > series. There have been a lot of changes across components from the
>> > previous Ceph releases, and we advise everyone to go through the release
>> > and upgrade notes carefully.
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] v14.2.0 Nautilus released

2019-03-20 Thread Alfredo Deza
There aren't any Debian packages built for this release because we
haven't updated the infrastructure to build (and test) Debian packages
yet.

On Tue, Mar 19, 2019 at 10:24 AM Sean Purdy  wrote:
>
> Hi,
>
>
> Will debian packages be released?  I don't see them in the nautilus repo.  I 
> thought that Nautilus was going to be debian-friendly, unlike Mimic.
>
>
> Sean
>
> On Tue, 19 Mar 2019 14:58:41 +0100
> Abhishek Lekshmanan  wrote:
>
> >
> > We're glad to announce the first release of Nautilus v14.2.0 stable
> > series. There have been a lot of changes across components from the
> > previous Ceph releases, and we advise everyone to go through the release
> > and upgrade notes carefully.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume lvm batch OSD replacement

2019-03-19 Thread Alfredo Deza
On Tue, Mar 19, 2019 at 7:26 AM Dan van der Ster  wrote:
>
> On Tue, Mar 19, 2019 at 12:17 PM Alfredo Deza  wrote:
> >
> > On Tue, Mar 19, 2019 at 7:00 AM Alfredo Deza  wrote:
> > >
> > > On Tue, Mar 19, 2019 at 6:47 AM Dan van der Ster  
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > > We've just hit our first OSD replacement on a host created with
> > > > `ceph-volume lvm batch` with mixed hdds+ssds.
> > > >
> > > > The hdd /dev/sdq was prepared like this:
> > > ># ceph-volume lvm batch /dev/sd[m-r] /dev/sdac --yes
> > > >
> > > > Then /dev/sdq failed and was then zapped like this:
> > > >   # ceph-volume lvm zap /dev/sdq --destroy
> > > >
> > > > The zap removed the pv/vg/lv from sdq, but left behind the db on
> > > > /dev/sdac (see P.S.)
> > >
> > > That is correct behavior for the zap command used.
> > >
> > > >
> > > > Now we're replaced /dev/sdq and we're wondering how to proceed. We see
> > > > two options:
> > > >   1. reuse the existing db lv from osd.240 (Though the osd fsid will
> > > > change when we re-create, right?)
> > >
> > > This is possible but you are right that in the current state, the FSID
> > > and other cluster data exist in the LV metadata. To reuse this LV for
> > > a new (replaced) OSD
> > > then you would need to zap the LV *without* the --destroy flag, which
> > > would clear all metadata on the LV and do a wipefs. The command would
> > > need the full path to
> > > the LV associated with osd.240, something like:
> > >
> > > ceph-volume lvm zap /dev/ceph-osd-lvs/db-lv-240
> > >
> > > >   2. remove the db lv from sdac then run
> > > > # ceph-volume lvm batch /dev/sdq /dev/sdac
> > > >  which should do the correct thing.
> > >
> > > This would also work if the db lv is fully removed with --destroy
> > >
> > > >
> > > > This is all v12.2.11 btw.
> > > > If (2) is the prefered approached, then it looks like a bug that the
> > > > db lv was not destroyed by lvm zap --destroy.
> > >
> > > Since /dev/sdq was passed in to zap, just that one device was removed,
> > > so this is working as expected.
> > >
> > > Alternatively, zap has the ability to destroy or zap LVs associated
> > > with an OSD ID. I think this is not released yet for Luminous but
> > > should be in the next release (which seems to be what you want)
> >
> > Seems like 12.2.11 was released with the ability to zap by OSD ID. You
> > can also zap by OSD FSID, both way will zap (and optionally destroy if
> > using --destroy)
> > all LVs associated with the OSD.
> >
> > Full examples on this can be found here:
> >
> > http://docs.ceph.com/docs/luminous/ceph-volume/lvm/zap/#removing-devices
> >
> >
>
> Ohh that's an improvement! (Our goal is outsourcing the failure
> handling to non-ceph experts, so this will help simplify things.)
>
> In our example, the operator needs to know the osd id, then can do:
>
> 1. ceph-volume lvm zap --destroy --osd-id 240 (wipes sdq and removes
> the lvm from sdac for osd.240)
> 2. replace the hdd
> 3. ceph-volume lvm batch /dev/sdq /dev/sdac --osd-ids 240
>
> But I just remembered that the --osd-ids flag hasn't been backported
> to luminous, so we can't yet do that. I guess we'll follow the first
> (1) procedure to re-use the existing db lv.

It has! (I initially thought it hadn't been.) Check if `ceph-volume lvm zap
--help` has the flags available; they should appear in 12.2.11.
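
A quick, non-destructive way to confirm on a given host is just to grep the
help output, something like:

ceph-volume lvm zap --help | grep -E 'osd-id|osd-fsid'
ceph-volume lvm batch --help | grep osd-ids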
>
> -- dan
>
> > >
> > > >
> > > > Once we sort this out, we'd be happy to contribute to the ceph-volume
> > > > lvm batch doc.
> > > >
> > > > Thanks!
> > > >
> > > > Dan
> > > >
> > > > P.S:
> > > >
> > > > = osd.240 ==
> > > >
> > > >   [  db]
> > > > /dev/ceph-094c06db-98dc-47f6-a7e5-1092b099b372/osd-block-db-fa0e7927-dc3e-44d0-a8ce-1d8202fa75dd
> > > >
> > > >   type  db
> > > >   osd id240
> > > >   cluster fsid  b4f463a0-c671-43a8-bd36-e40ab8d233d2
> > > >   cluster name  ceph
> > > >   osd fsid  d4d1fb15-a30a-4325-8628-706772ee4294
> > > >   db device
> > > > /dev/ceph-094c06db-98dc-47f6-a7e5-1092b099b372/osd-block-db-fa0e7927-dc3e-44d0-a8ce-1d8202fa75dd
> > > >   encrypted 0
> > > >   db uuid   iWWdyU-UhNu-b58z-ThSp-Bi3B-19iA-06iJIc
> > > >   cephx lockbox secret
> > > >   block uuidu4326A-Q8bH-afPb-y7Y6-ftNf-TE1X-vjunBd
> > > >   block device
> > > > /dev/ceph-f78ff8a3-803d-4b6d-823b-260b301109ac/osd-data-9e4bf34d-1aa3-4c0a-9655-5dba52dcfcd7
> > > >   vdo   0
> > > >   crush device classNone
> > > >   devices   /dev/sdac
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume lvm batch OSD replacement

2019-03-19 Thread Alfredo Deza
On Tue, Mar 19, 2019 at 7:00 AM Alfredo Deza  wrote:
>
> On Tue, Mar 19, 2019 at 6:47 AM Dan van der Ster  wrote:
> >
> > Hi all,
> >
> > We've just hit our first OSD replacement on a host created with
> > `ceph-volume lvm batch` with mixed hdds+ssds.
> >
> > The hdd /dev/sdq was prepared like this:
> ># ceph-volume lvm batch /dev/sd[m-r] /dev/sdac --yes
> >
> > Then /dev/sdq failed and was then zapped like this:
> >   # ceph-volume lvm zap /dev/sdq --destroy
> >
> > The zap removed the pv/vg/lv from sdq, but left behind the db on
> > /dev/sdac (see P.S.)
>
> That is correct behavior for the zap command used.
>
> >
> > Now we're replaced /dev/sdq and we're wondering how to proceed. We see
> > two options:
> >   1. reuse the existing db lv from osd.240 (Though the osd fsid will
> > change when we re-create, right?)
>
> This is possible but you are right that in the current state, the FSID
> and other cluster data exist in the LV metadata. To reuse this LV for
> a new (replaced) OSD
> then you would need to zap the LV *without* the --destroy flag, which
> would clear all metadata on the LV and do a wipefs. The command would
> need the full path to
> the LV associated with osd.240, something like:
>
> ceph-volume lvm zap /dev/ceph-osd-lvs/db-lv-240
>
> >   2. remove the db lv from sdac then run
> > # ceph-volume lvm batch /dev/sdq /dev/sdac
> >  which should do the correct thing.
>
> This would also work if the db lv is fully removed with --destroy
>
> >
> > This is all v12.2.11 btw.
> > If (2) is the prefered approached, then it looks like a bug that the
> > db lv was not destroyed by lvm zap --destroy.
>
> Since /dev/sdq was passed in to zap, just that one device was removed,
> so this is working as expected.
>
> Alternatively, zap has the ability to destroy or zap LVs associated
> with an OSD ID. I think this is not released yet for Luminous but
> should be in the next release (which seems to be what you want)

Seems like 12.2.11 was released with the ability to zap by OSD ID. You can
also zap by OSD FSID; both ways will zap (and optionally destroy, if using
--destroy) all LVs associated with the OSD.

Full examples on this can be found here:

http://docs.ceph.com/docs/luminous/ceph-volume/lvm/zap/#removing-devices


>
> >
> > Once we sort this out, we'd be happy to contribute to the ceph-volume
> > lvm batch doc.
> >
> > Thanks!
> >
> > Dan
> >
> > P.S:
> >
> > = osd.240 ==
> >
> >   [  db]
> > /dev/ceph-094c06db-98dc-47f6-a7e5-1092b099b372/osd-block-db-fa0e7927-dc3e-44d0-a8ce-1d8202fa75dd
> >
> >   type  db
> >   osd id240
> >   cluster fsid  b4f463a0-c671-43a8-bd36-e40ab8d233d2
> >   cluster name  ceph
> >   osd fsid  d4d1fb15-a30a-4325-8628-706772ee4294
> >   db device
> > /dev/ceph-094c06db-98dc-47f6-a7e5-1092b099b372/osd-block-db-fa0e7927-dc3e-44d0-a8ce-1d8202fa75dd
> >   encrypted 0
> >   db uuid   iWWdyU-UhNu-b58z-ThSp-Bi3B-19iA-06iJIc
> >   cephx lockbox secret
> >   block uuidu4326A-Q8bH-afPb-y7Y6-ftNf-TE1X-vjunBd
> >   block device
> > /dev/ceph-f78ff8a3-803d-4b6d-823b-260b301109ac/osd-data-9e4bf34d-1aa3-4c0a-9655-5dba52dcfcd7
> >   vdo   0
> >   crush device classNone
> >   devices   /dev/sdac
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume lvm batch OSD replacement

2019-03-19 Thread Alfredo Deza
On Tue, Mar 19, 2019 at 6:47 AM Dan van der Ster  wrote:
>
> Hi all,
>
> We've just hit our first OSD replacement on a host created with
> `ceph-volume lvm batch` with mixed hdds+ssds.
>
> The hdd /dev/sdq was prepared like this:
># ceph-volume lvm batch /dev/sd[m-r] /dev/sdac --yes
>
> Then /dev/sdq failed and was then zapped like this:
>   # ceph-volume lvm zap /dev/sdq --destroy
>
> The zap removed the pv/vg/lv from sdq, but left behind the db on
> /dev/sdac (see P.S.)

That is correct behavior for the zap command used.

>
> Now we're replaced /dev/sdq and we're wondering how to proceed. We see
> two options:
>   1. reuse the existing db lv from osd.240 (Though the osd fsid will
> change when we re-create, right?)

This is possible, but you are right that in the current state the FSID and
other cluster data exist in the LV metadata. To reuse this LV for a new
(replaced) OSD you would need to zap the LV *without* the --destroy flag,
which would clear all metadata on the LV and do a wipefs. The command would
need the full path to the LV associated with osd.240, something like:

ceph-volume lvm zap /dev/ceph-osd-lvs/db-lv-240

>   2. remove the db lv from sdac then run
> # ceph-volume lvm batch /dev/sdq /dev/sdac
>  which should do the correct thing.

This would also work if the db lv is fully removed with --destroy

>
> This is all v12.2.11 btw.
> If (2) is the prefered approached, then it looks like a bug that the
> db lv was not destroyed by lvm zap --destroy.

Since /dev/sdq was passed in to zap, just that one device was removed,
so this is working as expected.

Alternatively, zap has the ability to destroy or zap LVs associated
with an OSD ID. I think this is not released yet for Luminous but
should be in the next release (which seems to be what you want)

>
> Once we sort this out, we'd be happy to contribute to the ceph-volume
> lvm batch doc.
>
> Thanks!
>
> Dan
>
> P.S:
>
> = osd.240 ==
>
>   [  db]
> /dev/ceph-094c06db-98dc-47f6-a7e5-1092b099b372/osd-block-db-fa0e7927-dc3e-44d0-a8ce-1d8202fa75dd
>
>   type  db
>   osd id240
>   cluster fsid  b4f463a0-c671-43a8-bd36-e40ab8d233d2
>   cluster name  ceph
>   osd fsid  d4d1fb15-a30a-4325-8628-706772ee4294
>   db device
> /dev/ceph-094c06db-98dc-47f6-a7e5-1092b099b372/osd-block-db-fa0e7927-dc3e-44d0-a8ce-1d8202fa75dd
>   encrypted 0
>   db uuid   iWWdyU-UhNu-b58z-ThSp-Bi3B-19iA-06iJIc
>   cephx lockbox secret
>   block uuidu4326A-Q8bH-afPb-y7Y6-ftNf-TE1X-vjunBd
>   block device
> /dev/ceph-f78ff8a3-803d-4b6d-823b-260b301109ac/osd-data-9e4bf34d-1aa3-4c0a-9655-5dba52dcfcd7
>   vdo   0
>   crush device classNone
>   devices   /dev/sdac
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD after OS reinstallation.

2019-02-22 Thread Alfredo Deza
On Fri, Feb 22, 2019 at 9:38 AM Marco Gaiarin  wrote:
>
> Mandi! Alfredo Deza
>   In chel di` si favelave...
>
> > The problem is that if there is no PARTUUID ceph-volume can't ensure
> > what device is the one actually pointing to data/journal. Being 'GPT'
> > alone will not be enough here :(
>
> Ok. There's some way to 'force' a PARTUUID, in a GPT or non-GPT
> partition, clearly even, if needed, destroying it?
>
>
> I've tried also to create a GPT partition in a DOS partition (eg, in a
> /dev/sda5), and seems that GPT partition get correctly created, but
> still (sub) partition have no PARTUUID...

It is possible to end up with partitions that have no PARTUUID, depending on
how they were created. We have an example in our docs with parted that will
produce what is needed (a GPT partition that does get a PARTUUID):

http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#partitioning
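
From memory, the recipe there boils down to something like this (example
device name, and the exact parted arguments may differ slightly from what
the docs show):

parted --script /dev/sdh mklabel gpt
parted --script /dev/sdh mkpart primary 1 100%
blkid -s PARTUUID -o value /dev/sdh1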

But then again... I would strongly suggest avoiding all of this and
just using the new way of doing OSDs with LVM

>
> --
> dott. Marco Gaiarin GNUPG Key ID: 240A3D66
>   Associazione ``La Nostra Famiglia''  http://www.lanostrafamiglia.it/
>   Polo FVG   -   Via della Bontà, 7 - 33078   -   San Vito al Tagliamento (PN)
>   marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797
>
> Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
>   http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
> (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA)
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD after OS reinstallation.

2019-02-21 Thread Alfredo Deza
On Thu, Feb 21, 2019 at 2:34 AM Анатолий Фуников
 wrote:
>
> It's strange but parted output for this disk (/dev/sdf) show me that it's GPT:
>
> (parted) print
> Model: ATA HGST HUS726020AL (scsi)
> Disk /dev/sdf: 2000GB
> Sector size (logical/physical): 512B/4096B
> Partition Table: gpt
>
> Number  Start   End SizeType  File system Flags
>  2 1049kB  1075MB  1074MBceph journal
>  1 1075MB  2000GB  1999GB  xfs   ceph data
>

The problem is that if there is no PARTUUID, ceph-volume can't reliably
determine which device actually points to the data/journal. Being 'GPT'
alone will not be enough here :(

> ср, 20 февр. 2019 г. в 17:11, Alfredo Deza :
>>
>> On Wed, Feb 20, 2019 at 8:40 AM Анатолий Фуников
>>  wrote:
>> >
>> > Thanks for the reply.
>> > blkid -s PARTUUID -o value /dev/sdf1 shows me nothing, but blkid /dev/sdf1 
>> > shows me this: /dev/sdf1: UUID="b03810e4-dcc1-46c2-bc31-a1e558904750" 
>> > TYPE="xfs"
>>
>> I think this is what happens with a non-gpt partition. GPT labels will
>> use a PARTUUID to identify the partition, and I just confirmed that
>> ceph-volume will enforce looking for PARTUUID if the JSON
>> identified a partition (vs. an LV).
>>
>> From what I briefly researched it is not possible to add a GPT label
>> on a non-gpt partition without losing data.
>>
>> My suggestion (if you confirm it is not possible to add the GPT label)
>> is to start the migration towards the new way of creating OSDs
>>
>> >
>> > ср, 20 февр. 2019 г. в 16:27, Alfredo Deza :
>> >>
>> >> On Wed, Feb 20, 2019 at 8:16 AM Анатолий Фуников
>> >>  wrote:
>> >> >
>> >> > Hello. I need to raise the OSD on the node after reinstalling the OS, 
>> >> > some OSD were made a long time ago, not even a ceph-disk, but a set of 
>> >> > scripts.
>> >> > There was an idea to get their configuration in json via ceph-volume 
>> >> > simple scan, and then on a fresh system I can make a ceph-volume simple 
>> >> > activate --file 
>> >> > /etc/ceph/osd/31-46eacafe-22b6-4433-8e5c-e595612d8579.json
>> >> > I do ceph-volume simple scan /var/lib/ceph/osd/ceph-31/, and got this 
>> >> > json: https://pastebin.com/uJ8WVZyV
>> >> > It seems everything is not bad, but in the data section I see a direct 
>> >> > link to the device /dev/sdf1, and the uuid field is empty. At the same 
>> >> > time, in the /dev/disk/by-partuuid directory I can find and substitute 
>> >> > this UUID in this json, and delete the direct link to the device in 
>> >> > this json.
>> >> > The question is: how correct is it and can I raise this OSD on a 
>> >> > freshly installed OS with this fixed json?
>> >>
>> >> It worries me that it is unable to find a uuid for the device. This is
>> >> important because paths like /dev/sdf1 are ephemeral and can change
>> >> after a reboot. The uuid is found by running the following:
>> >>
>> >> blkid -s PARTUUID -o value /dev/sdf1
>> >>
>> >> If that is not returning anything, then ceph-volume will probably not
>> >> be able to ensure this device is brought up correctly. You can correct
>> >> or add to anything in the JSON after a scan and rely on that, but then
>> >> again
>> >> without a partuuid I don't think this will work nicely
>> >>
>> >> > ___
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD after OS reinstallation.

2019-02-20 Thread Alfredo Deza
On Wed, Feb 20, 2019 at 10:21 AM Marco Gaiarin  wrote:
>
> Mandi! Alfredo Deza
>   In chel di` si favelave...
>
> > I think this is what happens with a non-gpt partition. GPT labels will
> > use a PARTUUID to identify the partition, and I just confirmed that
> > ceph-volume will enforce looking for PARTUUID if the JSON
> > identified a partition (vs. an LV).
> > From what I briefly researched it is not possible to add a GPT label
> > on a non-gpt partition without losing data.
>
> Ahem, how can i add a GPT label to a non-GPT partition (even loosing
> data)?

If you are coming from ceph-disk (or something else custom-made) and
don't care about losing data, why not fully migrate to the
new OSDs? 
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#rados-replacing-an-osd
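
A rough sketch of that replacement flow for a single OSD (destructive, so
only run it if you really don't need the data; the ID and device name below
are just placeholders):

ceph osd destroy <osd-id> --yes-i-really-mean-it
ceph-volume lvm zap /dev/sdX --destroy
ceph-volume lvm create --osd-id <osd-id> --data /dev/sdX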
>
> Seems the culprit around my 'Proxmox 4.4, Ceph hammer, OSD cache
> link...' thread...
>
>
> Thanks.
>
> --
> dott. Marco Gaiarin GNUPG Key ID: 240A3D66
>   Associazione ``La Nostra Famiglia''  http://www.lanostrafamiglia.it/
>   Polo FVG   -   Via della Bontà, 7 - 33078   -   San Vito al Tagliamento (PN)
>   marco.gaiarin(at)lanostrafamiglia.it   t +39-0434-842711   f +39-0434-842797
>
> Dona il 5 PER MILLE a LA NOSTRA FAMIGLIA!
>   http://www.lanostrafamiglia.it/index.php/it/sostienici/5x1000
> (cf 00307430132, categoria ONLUS oppure RICERCA SANITARIA)
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD after OS reinstallation.

2019-02-20 Thread Alfredo Deza
On Wed, Feb 20, 2019 at 8:40 AM Анатолий Фуников
 wrote:
>
> Thanks for the reply.
> blkid -s PARTUUID -o value /dev/sdf1 shows me nothing, but blkid /dev/sdf1 
> shows me this: /dev/sdf1: UUID="b03810e4-dcc1-46c2-bc31-a1e558904750" 
> TYPE="xfs"

I think this is what happens with a non-gpt partition. GPT labels will
use a PARTUUID to identify the partition, and I just confirmed that
ceph-volume will enforce looking for PARTUUID if the JSON
identified a partition (vs. an LV).

From what I briefly researched it is not possible to add a GPT label
on a non-gpt partition without losing data.

My suggestion (if you confirm it is not possible to add the GPT label)
is to start the migration towards the new way of creating OSDs

>
> ср, 20 февр. 2019 г. в 16:27, Alfredo Deza :
>>
>> On Wed, Feb 20, 2019 at 8:16 AM Анатолий Фуников
>>  wrote:
>> >
>> > Hello. I need to raise the OSD on the node after reinstalling the OS, some 
>> > OSD were made a long time ago, not even a ceph-disk, but a set of scripts.
>> > There was an idea to get their configuration in json via ceph-volume 
>> > simple scan, and then on a fresh system I can make a ceph-volume simple 
>> > activate --file /etc/ceph/osd/31-46eacafe-22b6-4433-8e5c-e595612d8579.json
>> > I do ceph-volume simple scan /var/lib/ceph/osd/ceph-31/, and got this 
>> > json: https://pastebin.com/uJ8WVZyV
>> > It seems everything is not bad, but in the data section I see a direct 
>> > link to the device /dev/sdf1, and the uuid field is empty. At the same 
>> > time, in the /dev/disk/by-partuuid directory I can find and substitute 
>> > this UUID in this json, and delete the direct link to the device in this 
>> > json.
>> > The question is: how correct is it and can I raise this OSD on a freshly 
>> > installed OS with this fixed json?
>>
>> It worries me that it is unable to find a uuid for the device. This is
>> important because paths like /dev/sdf1 are ephemeral and can change
>> after a reboot. The uuid is found by running the following:
>>
>> blkid -s PARTUUID -o value /dev/sdf1
>>
>> If that is not returning anything, then ceph-volume will probably not
>> be able to ensure this device is brought up correctly. You can correct
>> or add to anything in the JSON after a scan and rely on that, but then
>> again
>> without a partuuid I don't think this will work nicely
>>
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD after OS reinstallation.

2019-02-20 Thread Alfredo Deza
On Wed, Feb 20, 2019 at 8:16 AM Анатолий Фуников
 wrote:
>
> Hello. I need to raise the OSD on the node after reinstalling the OS, some 
> OSD were made a long time ago, not even a ceph-disk, but a set of scripts.
> There was an idea to get their configuration in json via ceph-volume simple 
> scan, and then on a fresh system I can make a ceph-volume simple activate 
> --file /etc/ceph/osd/31-46eacafe-22b6-4433-8e5c-e595612d8579.json
> I do ceph-volume simple scan /var/lib/ceph/osd/ceph-31/, and got this json: 
> https://pastebin.com/uJ8WVZyV
> It seems everything is not bad, but in the data section I see a direct link 
> to the device /dev/sdf1, and the uuid field is empty. At the same time, in 
> the /dev/disk/by-partuuid directory I can find and substitute this UUID in 
> this json, and delete the direct link to the device in this json.
> The question is: how correct is it and can I raise this OSD on a freshly 
> installed OS with this fixed json?

It worries me that it is unable to find a uuid for the device. This is
important because paths like /dev/sdf1 are ephemeral and can change
after a reboot. The uuid is found by running the following:

blkid -s PARTUUID -o value /dev/sdf1

If that is not returning anything, then ceph-volume will probably not
be able to ensure this device is brought up correctly. You can correct
or add to anything in the JSON after a scan and rely on that, but then
again
without a partuuid I don't think this will work nicely
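
For illustration, the bit of the scan JSON being discussed is roughly the
following (field names per your description of the scan output; the uuid
value is a placeholder for the PARTUUID that is currently missing):

    "data": {
        "path": "/dev/sdf1",
        "uuid": "<PARTUUID of /dev/sdf1>"
    }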

> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems with osd creation in Ubuntu 18.04, ceph 13.2.4-1bionic

2019-02-18 Thread Alfredo Deza
On Mon, Feb 18, 2019 at 2:46 AM Rainer Krienke  wrote:
>
> Hello,
>
> thanks for your answer, but zapping the disk did not make any
> difference. I still get the same error.  Looking at the debug output I
> found this error message that is probably the root of all trouble:
>
> # ceph-volume lvm prepare --bluestore --data /dev/sdg
> 
> stderr: 2019-02-18 08:29:25.544 7fdaa50ed240 -1
> bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid

This "unparsable uuid" line is (unfortunately) expected from
bluestore, and will show up when the OSD is being created for the
first time.

The error messaging was improved a bit (see
https://tracker.ceph.com/issues/22285 and PR
https://github.com/ceph/ceph/pull/20090 )

>
> I found the bugreport below that seems to be exactly that problem I have:
> http://tracker.ceph.com/issues/15386

This doesn't look like the same thing; you are hitting an assert:

 stderr: /build/ceph-13.2.4/src/os/bluestore/KernelDevice.cc: In
function 'virtual int KernelDevice::read(uint64_t, uint64_t,
ceph::bufferlist*, IOContext*, bool)' thread 7f3fcecb3240 time
2019-02-14 13:45:54.841130
 stderr: /build/ceph-13.2.4/src/os/bluestore/KernelDevice.cc: 821:
FAILED assert((uint64_t)r == len)

That looks like a valid issue to me; you might want to go and create a
new ticket at

https://tracker.ceph.com/projects/bluestore/issues/new


>
> However there seems to be no solution  up to now.
>
> Does anyone have more information how to get around this problem?
>
> Thanks
> Rainer
>
> Am 15.02.19 um 18:12 schrieb David Turner:
> > I have found that running a zap before all prepare/create commands with
> > ceph-volume helps things run smoother.  Zap is specifically there to
> > clear everything on a disk away to make the disk ready to be used as an
> > OSD.  Your wipefs command is still fine, but then I would lvm zap the
> > disk before continuing.  I would run the commands like [1] this.  I also
> > prefer the single command lvm create as opposed to lvm prepare and lvm
> > activate.  Try that out and see if you still run into the problems
> > creating the BlueStore filesystem.
> >
> > [1] ceph-volume lvm zap /dev/sdg
> > ceph-volume lvm prepare --bluestore --data /dev/sdg
> >
> > On Thu, Feb 14, 2019 at 10:25 AM Rainer Krienke  > > wrote:
> >
> > Hi,
> >
> > I am quite new to ceph and just try to set up a ceph cluster. Initially
> > I used ceph-deploy for this but when I tried to create a BlueStore osd
> > ceph-deploy fails. Next I tried the direct way on one of the OSD-nodes
> > using ceph-volume to create the osd, but this also fails. Below you can
> > see what  ceph-volume says.
> >
> > I ensured that there was no left over lvm VG and LV on the disk sdg
> > before I started the osd creation for this disk. The very same error
> > happens also on other disks not just for /dev/sdg. All the disk have 4TB
> > in size and the linux system is Ubuntu 18.04 and finally ceph is
> > installed in version 13.2.4-1bionic from this repo:
> > https://download.ceph.com/debian-mimic.
> >
> > There is a VG and two LV's  on the system for the ubuntu system itself
> > that is installed on two separate disks configured as software raid1 and
> > lvm on top of the raid. But I cannot imagine that this might do any harm
> > to cephs osd creation.
> >
> > Does anyone have an idea what might be wrong?
> >
> > Thanks for hints
> > Rainer
> >
> > root@ceph1:~# wipefs -fa /dev/sdg
> > root@ceph1:~# ceph-volume lvm prepare --bluestore --data /dev/sdg
> > Running command: /usr/bin/ceph-authtool --gen-print-key
> > Running command: /usr/bin/ceph --cluster ceph --name
> > client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> > -i - osd new 14d041d6-0beb-4056-8df2-3920e2febce0
> > Running command: /sbin/vgcreate --force --yes
> > ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b /dev/sdg
> >  stdout: Physical volume "/dev/sdg" successfully created.
> >  stdout: Volume group "ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b"
> > successfully created
> > Running command: /sbin/lvcreate --yes -l 100%FREE -n
> > osd-block-14d041d6-0beb-4056-8df2-3920e2febce0
> > ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b
> >  stdout: Logical volume "osd-block-14d041d6-0beb-4056-8df2-3920e2febce0"
> > created.
> > Running command: /usr/bin/ceph-authtool --gen-print-key
> > Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-0
> > --> Absolute path not found for executable: restorecon
> > --> Ensure $PATH environment variable contains common executable
> > locations
> > Running command: /bin/chown -h ceph:ceph
> > 
> > /dev/ceph-1433ffd0-0a80-481a-91f5-d7a47b78e17b/osd-block-14d041d6-0beb-4056-8df2-3920e2febce0
> > Running command: /bin/chown -R ceph:ceph /dev/dm-8
> > Running command: /bin/ln -s
> > 
> > 

Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-04 Thread Alfredo Deza
On Mon, Feb 4, 2019 at 4:43 AM Hector Martin  wrote:
>
> On 02/02/2019 05:07, Stuart Longland wrote:
> > On 1/2/19 10:43 pm, Alfredo Deza wrote:
> >>> The tmpfs setup is expected. All persistent data for bluestore OSDs
> >>> setup with LVM are stored in LVM metadata. The LVM/udev handler for
> >>> bluestore volumes create these tmpfs filesystems on the fly and populate
> >>> them with the information from the metadata.
> >> That is mostly what happens. There isn't a dependency on UDEV anymore
> >> (yay), but the reason why files are mounted on tmpfs
> >> is because *bluestore* spits them out on activation, this makes the
> >> path fully ephemeral (a great thing!)
> >>
> >> The step-by-step is documented in this summary section of  'activate'
> >> http://docs.ceph.com/docs/master/ceph-volume/lvm/activate/#summary
> >>
> >> Filestore doesn't have any of these capabilities and it is why it does
> >> have an actual existing path (vs. tmpfs), and the files come from the
> >> data partition that
> >> gets mounted.
> >>
> >
> > Well, for whatever reason, ceph-osd isn't calling the activate script
> > before it starts up.
> >
> > It is worth noting that the systems I'm using do not use systemd out of
> > simplicity.  I might need to write an init script to do that.  It wasn't
> > clear last weekend what commands I needed to run to activate a BlueStore
> > OSD.
>
> The way you do this on Gentoo is by writing the OSD FSID into
> /etc/conf.d/ceph-osd.. You need to make note of the ID when the OSD
> is first deployed.
>
> # echo "bluestore_osd_fsid=$(cat /var/lib/ceph/osd/ceph-0/fsid)" >
> /etc/conf.d/ceph-osd.0
>
> And then of course do the usual initscript symlink enable on Gentoo:
>
> # ln -s ceph /etc/init.d/ceph-osd.0
> # rc-update add ceph-osd.0 default
>
> This will then call `ceph-volume lvm activate` for that OSD for you
> before bringing it up, which will populate the tmpfs. It is the Gentoo
> OpenRC equivalent of enabling the systemd unit for that osd-fsid on
> systemd systems (but ceph-volume won't do it for you).

This is spot on. The whole systemd script for ceph-volume is merely passing
the ID and FSID over to ceph-volume, which ends up doing something like:

ceph-volume lvm activate ID FSID
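
And if you just want everything brought up in one go (say, from a single
local startup script), the blanket form should also work:

ceph-volume lvm activate --all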

>
> You may also want to add some dependencies for all OSDs depending on
> your setup (e.g. I run single-host and the mon has the OSD dm-crypt
> keys, so that has to come first):
>
> # echo 'rc_need="ceph-mon.0"' > /etc/conf.d/ceph-osd
>
> The Gentoo initscript setup for Ceph is unfortunately not very well
> documented. I've been meaning to write a blogpost about this to try to
> share what I've learned :-)
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://marcan.st/marcan.asc
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem replacing osd with ceph-deploy

2019-02-04 Thread Alfredo Deza
On Fri, Feb 1, 2019 at 6:07 PM Shain Miley  wrote:
>
> Hi,
>
> I went to replace a disk today (which I had not had to do in a while)
> and after I added it the results looked rather odd compared to times past:
>
> I was attempting to replace /dev/sdk on one of our osd nodes:
>
> #ceph-deploy disk zap hqosd7 /dev/sdk
> #ceph-deploy osd create --data /dev/sdk hqosd7
>
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /root/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/local/bin/ceph-deploy
> osd create --data /dev/sdk hqosd7
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  verbose   : False
> [ceph_deploy.cli][INFO  ]  bluestore : None
> [ceph_deploy.cli][INFO  ]  cd_conf   :
> 
> [ceph_deploy.cli][INFO  ]  cluster   : ceph
> [ceph_deploy.cli][INFO  ]  fs_type   : xfs
> [ceph_deploy.cli][INFO  ]  block_wal : None
> [ceph_deploy.cli][INFO  ]  default_release   : False
> [ceph_deploy.cli][INFO  ]  username  : None
> [ceph_deploy.cli][INFO  ]  journal   : None
> [ceph_deploy.cli][INFO  ]  subcommand: create
> [ceph_deploy.cli][INFO  ]  host  : hqosd7
> [ceph_deploy.cli][INFO  ]  filestore : None
> [ceph_deploy.cli][INFO  ]  func  :  at 0x7fa3b14b3398>
> [ceph_deploy.cli][INFO  ]  ceph_conf : None
> [ceph_deploy.cli][INFO  ]  zap_disk  : False
> [ceph_deploy.cli][INFO  ]  data  : /dev/sdk
> [ceph_deploy.cli][INFO  ]  block_db  : None
> [ceph_deploy.cli][INFO  ]  dmcrypt   : False
> [ceph_deploy.cli][INFO  ]  overwrite_conf: False
> [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
> /etc/ceph/dmcrypt-keys
> [ceph_deploy.cli][INFO  ]  quiet : False
> [ceph_deploy.cli][INFO  ]  debug : False
> [ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device
> /dev/sdk
> [hqosd7][DEBUG ] connected to host: hqosd7
> [hqosd7][DEBUG ] detect platform information from remote host
> [hqosd7][DEBUG ] detect machine type
> [hqosd7][DEBUG ] find the location of an executable
> [ceph_deploy.osd][INFO  ] Distro info: Ubuntu 16.04 xenial
> [ceph_deploy.osd][DEBUG ] Deploying osd to hqosd7
> [hqosd7][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
> [hqosd7][DEBUG ] find the location of an executable
> [hqosd7][INFO  ] Running command: /usr/sbin/ceph-volume --cluster ceph
> lvm create --bluestore --data /dev/sdk
> [hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
> [hqosd7][DEBUG ] Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> -i - osd new c98a11d1-9b7f-487e-8c69-72fc662927d4
> [hqosd7][DEBUG ] Running command: vgcreate --force --yes
> ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1 /dev/sdk
> [hqosd7][DEBUG ]  stdout: Physical volume "/dev/sdk" successfully created
> [hqosd7][DEBUG ]  stdout: Volume group
> "ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1" successfully created
> [hqosd7][DEBUG ] Running command: lvcreate --yes -l 100%FREE -n
> osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4
> ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1
> [hqosd7][DEBUG ]  stdout: Logical volume
> "osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4" created.
> [hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
> [hqosd7][DEBUG ] Running command: mount -t tmpfs tmpfs
> /var/lib/ceph/osd/ceph-81
> [hqosd7][DEBUG ] Running command: chown -R ceph:ceph /dev/dm-0
> [hqosd7][DEBUG ] Running command: ln -s
> /dev/ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1/osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4
> /var/lib/ceph/osd/ceph-81/block
> [hqosd7][DEBUG ] Running command: ceph --cluster ceph --name
> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
> mon getmap -o /var/lib/ceph/osd/ceph-81/activate.monmap
> [hqosd7][DEBUG ]  stderr: got monmap epoch 2
> [hqosd7][DEBUG ] Running command: ceph-authtool
> /var/lib/ceph/osd/ceph-81/keyring --create-keyring --name osd.81
> --add-key AQCyyFRcSwWqGBAAKZR8rcWIEknj/o3rsehOdA==
> [hqosd7][DEBUG ]  stdout: creating /var/lib/ceph/osd/ceph-81/keyring
> [hqosd7][DEBUG ]  stdout: added entity osd.81 auth auth(auid =
> 18446744073709551615 key=AQCyyFRcSwWqGBAAKZR8rcWIEknj/o3rsehOdA== with 0
> caps)
> [hqosd7][DEBUG ] Running command: chown -R ceph:ceph
> /var/lib/ceph/osd/ceph-81/keyring
> [hqosd7][DEBUG ] Running command: chown -R ceph:ceph
> /var/lib/ceph/osd/ceph-81/
> [hqosd7][DEBUG ] Running command: /usr/bin/ceph-osd --cluster ceph
> --osd-objectstore bluestore --mkfs -i 81 --monmap
> /var/lib/ceph/osd/ceph-81/activate.monmap --keyfile - --osd-data
> 

Re: [ceph-users] Problem replacing osd with ceph-deploy

2019-02-04 Thread Alfredo Deza
On Fri, Feb 1, 2019 at 6:35 PM Vladimir Prokofev  wrote:
>
> Your output looks a bit weird, but still, this is normal for bluestore. It 
> creates small separate data partition that is presented as XFS mounted in 
> /var/lib/ceph/osd, while real data partition is hidden as raw(bluestore) 
> block device.

That is not right for this output. It is using ceph-volume with LVM; there
are no partitions being created.

> It's no longer possible to check disk utilisation with df using bluestore.
> To check your osd capacity use 'ceph osd df'
>
> сб, 2 февр. 2019 г. в 02:07, Shain Miley :
>>
>> Hi,
>>
>> I went to replace a disk today (which I had not had to do in a while)
>> and after I added it the results looked rather odd compared to times past:
>>
>> I was attempting to replace /dev/sdk on one of our osd nodes:
>>
>> #ceph-deploy disk zap hqosd7 /dev/sdk
>> #ceph-deploy osd create --data /dev/sdk hqosd7
>>
>> [ceph_deploy.conf][DEBUG ] found configuration file at:
>> /root/.cephdeploy.conf
>> [ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/local/bin/ceph-deploy
>> osd create --data /dev/sdk hqosd7
>> [ceph_deploy.cli][INFO  ] ceph-deploy options:
>> [ceph_deploy.cli][INFO  ]  verbose   : False
>> [ceph_deploy.cli][INFO  ]  bluestore : None
>> [ceph_deploy.cli][INFO  ]  cd_conf   :
>> 
>> [ceph_deploy.cli][INFO  ]  cluster   : ceph
>> [ceph_deploy.cli][INFO  ]  fs_type   : xfs
>> [ceph_deploy.cli][INFO  ]  block_wal : None
>> [ceph_deploy.cli][INFO  ]  default_release   : False
>> [ceph_deploy.cli][INFO  ]  username  : None
>> [ceph_deploy.cli][INFO  ]  journal   : None
>> [ceph_deploy.cli][INFO  ]  subcommand: create
>> [ceph_deploy.cli][INFO  ]  host  : hqosd7
>> [ceph_deploy.cli][INFO  ]  filestore : None
>> [ceph_deploy.cli][INFO  ]  func  : > at 0x7fa3b14b3398>
>> [ceph_deploy.cli][INFO  ]  ceph_conf : None
>> [ceph_deploy.cli][INFO  ]  zap_disk  : False
>> [ceph_deploy.cli][INFO  ]  data  : /dev/sdk
>> [ceph_deploy.cli][INFO  ]  block_db  : None
>> [ceph_deploy.cli][INFO  ]  dmcrypt   : False
>> [ceph_deploy.cli][INFO  ]  overwrite_conf: False
>> [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
>> /etc/ceph/dmcrypt-keys
>> [ceph_deploy.cli][INFO  ]  quiet : False
>> [ceph_deploy.cli][INFO  ]  debug : False
>> [ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device
>> /dev/sdk
>> [hqosd7][DEBUG ] connected to host: hqosd7
>> [hqosd7][DEBUG ] detect platform information from remote host
>> [hqosd7][DEBUG ] detect machine type
>> [hqosd7][DEBUG ] find the location of an executable
>> [ceph_deploy.osd][INFO  ] Distro info: Ubuntu 16.04 xenial
>> [ceph_deploy.osd][DEBUG ] Deploying osd to hqosd7
>> [hqosd7][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
>> [hqosd7][DEBUG ] find the location of an executable
>> [hqosd7][INFO  ] Running command: /usr/sbin/ceph-volume --cluster ceph
>> lvm create --bluestore --data /dev/sdk
>> [hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
>> [hqosd7][DEBUG ] Running command: /usr/bin/ceph --cluster ceph --name
>> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
>> -i - osd new c98a11d1-9b7f-487e-8c69-72fc662927d4
>> [hqosd7][DEBUG ] Running command: vgcreate --force --yes
>> ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1 /dev/sdk
>> [hqosd7][DEBUG ]  stdout: Physical volume "/dev/sdk" successfully created
>> [hqosd7][DEBUG ]  stdout: Volume group
>> "ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1" successfully created
>> [hqosd7][DEBUG ] Running command: lvcreate --yes -l 100%FREE -n
>> osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4
>> ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1
>> [hqosd7][DEBUG ]  stdout: Logical volume
>> "osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4" created.
>> [hqosd7][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
>> [hqosd7][DEBUG ] Running command: mount -t tmpfs tmpfs
>> /var/lib/ceph/osd/ceph-81
>> [hqosd7][DEBUG ] Running command: chown -R ceph:ceph /dev/dm-0
>> [hqosd7][DEBUG ] Running command: ln -s
>> /dev/ceph-bbe0e44e-afc9-4cf1-9f1a-ed7d20f796c1/osd-block-c98a11d1-9b7f-487e-8c69-72fc662927d4
>> /var/lib/ceph/osd/ceph-81/block
>> [hqosd7][DEBUG ] Running command: ceph --cluster ceph --name
>> client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
>> mon getmap -o /var/lib/ceph/osd/ceph-81/activate.monmap
>> [hqosd7][DEBUG ]  stderr: got monmap epoch 2
>> [hqosd7][DEBUG ] Running command: ceph-authtool
>> /var/lib/ceph/osd/ceph-81/keyring --create-keyring --name osd.81
>> --add-key 

Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-01 Thread Alfredo Deza
On Fri, Feb 1, 2019 at 3:08 PM Stuart Longland
 wrote:
>
> On 1/2/19 10:43 pm, Alfredo Deza wrote:
> >>> I think mounting tmpfs for something that should be persistent is highly
> >>> dangerous.  Is there some flag I should be using when creating the
> >>> BlueStore OSD to avoid that issue?
> >>
> >> The tmpfs setup is expected. All persistent data for bluestore OSDs
> >> setup with LVM are stored in LVM metadata. The LVM/udev handler for
> >> bluestore volumes create these tmpfs filesystems on the fly and populate
> >> them with the information from the metadata.
> > That is mostly what happens. There isn't a dependency on UDEV anymore
> > (yay), but the reason why files are mounted on tmpfs
> > is because *bluestore* spits them out on activation, this makes the
> > path fully ephemeral (a great thing!)
> >
> > The step-by-step is documented in this summary section of  'activate'
> > http://docs.ceph.com/docs/master/ceph-volume/lvm/activate/#summary
> >
> > Filestore doesn't have any of these capabilities and it is why it does
> > have an actual existing path (vs. tmpfs), and the files come from the
> > data partition that
> > gets mounted.
> >
>
> Well, for whatever reason, ceph-osd isn't calling the activate script
> before it starts up.

ceph-osd doesn't call the activate script. Systemd is the one that
calls ceph-volume to activate OSDs.
>
> It is worth noting that the systems I'm using do not use systemd out of
> simplicity.  I might need to write an init script to do that.  It wasn't
> clear last weekend what commands I needed to run to activate a BlueStore
> OSD.

If deployed with ceph-volume, you can just do:

ceph-volume lvm activate --all

>
> For now though, sounds like tarring up the data directory, unmounting
> the tmpfs then unpacking the tar is a good-enough work-around.  That's
> what I've done for my second node (now I know of the problem) and so it
> should survive a reboot now.

There is no need to tar anything. Calling out to ceph-volume to
activate everything should just work.

>
> The only other two steps were to ensure `lvm` was marked to start at
> boot (so it would bring up all the volume groups) and that there was a
> UDEV rule in place to set the ownership on the LVM VGs for Ceph.

Right, you do need to ensure LVM is installed/enabled. But there is *for
sure* no need for UDEV rules to set any ownership for Ceph; that is a task
ceph-volume handles.

> --
> Stuart Longland (aka Redhatter, VK4MSL)
>
> I haven't lost my mind...
>   ...it's backed up on a tape somewhere.
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore deploys to tmpfs?

2019-02-01 Thread Alfredo Deza
On Fri, Feb 1, 2019 at 6:28 AM Burkhard Linke
 wrote:
>
> Hi,
>
> On 2/1/19 11:40 AM, Stuart Longland wrote:
> > Hi all,
> >
> > I'm just in the process of migrating my 3-node Ceph cluster from
> > BTRFS-backed Filestore over to Bluestore.
> >
> > Last weekend I did this with my first node, and while the migration went
> > fine, I noted that the OSD did not survive a reboot test: after
> > rebooting /var/lib/ceph/osd/ceph-0 was completely empty and
> > /etc/init.d/ceph-osd.0 (I run OpenRC init on Gentoo) would refuse to start.
> >
> > https://stuartl.longlandclan.id.au/blog/2019/01/28/solar-cluster-adventures-in-ceph-migration/
> >
> > I managed to recover it, but tonight I'm trying with my second node.
> > I've provisioned a temporary OSD (plugged in via USB3) for it to migrate
> > to using BlueStore.  The ceph cluster called it osd.4.
> >
> > One thing I note is that `ceph-volume` seems to have created a `tmpfs`
> > mount for the new OSD:
> >
> >> tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)
> > Admittedly this is just a temporary OSD, tomorrow I'll be blowing away
> > the *real* OSD on this node (osd.1) and provisioning it again using
> > BlueStore.
> >
> > I really don't want the ohh crap moment I had on Monday afternoon (as
> > one does on the Australia Day long weekend) frantically digging through
> > man pages and having to do the `ceph-bluestore-tool prime-osd-dir` dance.
> >
> > I think mounting tmpfs for something that should be persistent is highly
> > dangerous.  Is there some flag I should be using when creating the
> > BlueStore OSD to avoid that issue?
>
>
> The tmpfs setup is expected. All persistent data for bluestore OSDs
> setup with LVM are stored in LVM metadata. The LVM/udev handler for
> bluestore volumes create these tmpfs filesystems on the fly and populate
> them with the information from the metadata.

That is mostly what happens. There isn't a dependency on UDEV anymore (yay);
the files are mounted on tmpfs because *bluestore* spits them out on
activation, which makes the path fully ephemeral (a great thing!)

The step-by-step is documented in this summary section of  'activate'
http://docs.ceph.com/docs/master/ceph-volume/lvm/activate/#summary

Filestore doesn't have any of these capabilities and it is why it does
have an actual existing path (vs. tmpfs), and the files come from the
data partition that
gets mounted.

>
>
> All our ceph nodes do not have any persistent data in /var/lib/ceph/osd
> anymore:
>
> root@bcf-01:~# mount
> ...
>
> /dev/sdm1 on /boot type ext4 (rw,relatime,data=ordered)
> tmpfs on /var/lib/ceph/osd/ceph-125 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-128 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-130 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-3 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-1 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-2 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-129 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-5 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-127 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-131 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-6 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-4 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-126 type tmpfs (rw,relatime)
> tmpfs on /var/lib/ceph/osd/ceph-124 type tmpfs (rw,relatime)
> 
>
>
> This works fine on machines using systemd. If your setup does not
> support this, you might want to use the 'simple' ceph-volume mode
> instead of the 'lvm' one. AFAIK it uses the gpt partition type method
> that has been around for years.
>
> Regards,
>
> Burkhard
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-24 Thread Alfredo Deza
On Thu, Jan 24, 2019 at 4:13 PM mlausch  wrote:
>
>
>
> Am 24.01.19 um 22:02 schrieb Alfredo Deza:
> >>
> >> Ok with a new empty journal the OSD will not start. I have now rescued
> >> the data with dd and the recrypt it with a other key and copied the
> >> data back. This worked so far
> >>
> >> Now I encoded the key with base64 and put it to the key-value store.
> >> Also created the neccessary authkeys. Creating the json File by hand
> >> was quiet easy.
> >>
> >> But now there is one problem.
> >> ceph-disk opens the crypt like
> >> cryptsetup --key-file /etc/ceph/dmcrypt-keys/foobar ...
> >> ceph-volume pipes the key via stdin like this
> >> cat foobar | cryptsetup --key-file - ...
> >>
> >> The big problem. if the key is given via stdin cryptsetup hashes this
> >> key per default with some hash. Only if I set --hash plain it works. I
> >> think this is a bug in ceph-volume.
> >>
> >> Can someone confirm this?
> >
> > Ah right, this is when it was supported to have keys in a file.
> >
> > What type of keys do you have: LUKS or plain?
>
> I have both, plain and luks.
> At the moment I played around with the plain dmcrypt OSDs and run into
> this problem. I didn't test the luks crypted OSDs.

There is support in the JSON file to define the type of encryption with the key:

encryption_type

If this is undefined it will default to 'plain'. So that tells me that
we may indeed have a problem here. I'm not sure what might be needed
yet, but I do recall having some trouble trying to understand what
ceph-disk was doing. That is captured in this comment:
https://github.com/ceph/ceph/blob/v12.2.10/src/ceph-volume/ceph_volume/devices/simple/activate.py#L150-L155

Do you think that might be related?
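For reference, spelling it out in the hand-written file just means
adding the field explicitly, e.g. a fragment like this (hypothetical;
the surrounding keys are elided):

    {
        ...
        "encryption_type": "plain",
        ...
    }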
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-24 Thread Alfredo Deza
On Thu, Jan 24, 2019 at 3:17 PM Manuel Lausch  wrote:
>
>
>
> On Wed, 23 Jan 2019 16:32:08 +0100
> Manuel Lausch  wrote:
>
>
> > >
> > > The key api for encryption is *very* odd and a lot of its quirks are
> > > undocumented. For example, ceph-volume is stuck supporting naming
> > > files and keys 'lockbox'
> > > (for backwards compatibility) but there is no real lockbox anymore.
> > > Another quirk is that when storing the secret in the monitor, it is
> > > done using the following convention:
> > >
> > > dm-crypt/osd/{OSD FSID}/luks
> > >
> > > The 'luks' part there doesn't indicate anything about the type of
> > > encryption (!!) so regardless of the type of encryption (luks or
> > > plain) the key would still go there.
> > >
> > > If you manage to get the keys into the monitors you still wouldn't
> > > be able to scan OSDs to produce the JSON files, but you would be
> > > able to create the JSON file with the
> > > metadata that ceph-volume needs to run the OSD.
> >
> > I think it is not that problem to create the json files by myself.
> > Moving the Keys to the monitors and creating appropriate auth-keys
> > should be more or less easy as well.
> >
> > The problem I see is, that there are individual keys for the journal
> > and data partition while the new process useses only one key for both
> > partitions.
> >
> > maybe I can recreate the journal partition with the other key. But is
> > this possible? Are there important data ramaining on the journal after
> > clean stopping the OSD which I cannot throw away without trashing the
> > whole OSD?
> >
>
> Ok with a new empty journal the OSD will not start. I have now rescued
> the data with dd and the recrypt it with a other key and copied the
> data back. This worked so far
>
> Now I encoded the key with base64 and put it to the key-value store.
> Also created the neccessary authkeys. Creating the json File by hand
> was quiet easy.
>
> But now there is one problem.
> ceph-disk opens the crypt like
> cryptsetup --key-file /etc/ceph/dmcrypt-keys/foobar ...
> ceph-volume pipes the key via stdin like this
> cat foobar | cryptsetup --key-file - ...
>
> The big problem. if the key is given via stdin cryptsetup hashes this
> key per default with some hash. Only if I set --hash plain it works. I
> think this is a bug in ceph-volume.
>
> Can someone confirm this?

Ah right, this is when it was supported to have keys in a file.

What type of keys do you have: LUKS or plain?
>
> there is the related code I mean in ceph-volume
> https://github.com/ceph/ceph/blob/v12.2.10/src/ceph-volume/ceph_volume/util/encryption.py#L59
>
> Regards
> Manuel
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-23 Thread Alfredo Deza
On Wed, Jan 23, 2019 at 11:03 AM Dietmar Rieder
 wrote:
>
> On 1/23/19 3:05 PM, Alfredo Deza wrote:
> > On Wed, Jan 23, 2019 at 8:25 AM Jan Fajerski  wrote:
> >>
> >> On Wed, Jan 23, 2019 at 10:01:05AM +0100, Manuel Lausch wrote:
> >>> Hi,
> >>>
> >>> thats a bad news.
> >>>
> >>> round about 5000 OSDs are affected from this issue. It's not realy a
> >>> solution to redeploy this OSDs.
> >>>
> >>> Is it possible to migrate the local keys to the monitors?
> >>> I see that the OSDs with the "lockbox feature" has only one key for
> >>> data and journal partition and the older OSDs have individual keys for
> >>> journal and data. Might this be a problem?
> >>>
> >>> And a other question.
> >>> Is it a good idea to mix ceph-disk and ceph-volume managed OSDSs on one
> >>> host?
> >>> So I could only migrate newer OSDs to ceph-volume and deploy new
> >>> ones (after disk replacements) with ceph-volume until hopefuly there is
> >>> a solution.
> >> I might be wrong on this, since its been a while since I played with that. 
> >> But
> >> iirc you can't migrate a subset of ceph-disk OSDs to ceph-volume on one 
> >> host.
> >> Once you run ceph-volume simple activate, the ceph-disk systemd units and 
> >> udev
> >> profiles will be disabled. While the remaining ceph-disk OSDs will 
> >> continue to
> >> run, they won't come up after a reboot.
> >
> > This is correct, once you "activate" ceph-disk OSDs via ceph-volume
> > you are disabling all udev/systemd triggers for
> > those OSDs, so you must migrate all.
> >
> > I was assuming the question was more of a way to keep existing
> > ceph-disk OSDs and create new ceph-volume OSDs, which you can, as long
> > as this is not Nautilus or newer where ceph-disk doesn't exist
> >
>
> Will there be any plans to implement a command in ceph-volume that
> allows to create simple volumes like the ones that are migrated from
> ceph-disk using the scan and activate commands from ceph-volume?

The idea is that this is open-ended: you can create your OSDs in
whatever way you want, and add the information to the
JSON files to get them up and running.

Implementing a 'create' for it (if I'm understanding this correctly)
would imply having some sort of opinion on the creation process, which
would go against the intention of the command.

>
>
> >> I'm sure there's a way to get them running again, but I imagine you'd 
> >> rather not
> >> manually deal with that.
> >>>
> >>> Regards
> >>> Manuel
> >>>
> >>>
> >>> On Tue, 22 Jan 2019 07:44:02 -0500
> >>> Alfredo Deza  wrote:
> >>>
> >>>
> >>>> This is one case we didn't anticipate :/ We supported the wonky
> >>>> lockbox setup and thought we wouldn't need to go further back,
> >>>> although we did add support for both
> >>>> plain and luks keys.
> >>>>
> >>>> Looking through the code, it is very tightly couple to
> >>>> storing/retrieving keys from the monitors, and I don't know what
> >>>> workarounds might be possible here other than throwing away the OSD
> >>>> and deploying a new one (I take it this is not an option for you at
> >>>> all)
> >>>>
> >>>>
> >>> Manuel Lausch
> >>>
> >>> Systemadministrator
> >>> Storage Services
> >>>
> >>> 1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
> >>> 76135 Karlsruhe | Germany Phone: +49 721 91374-1847
> >>> E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de
> >>>
> >>> Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 5452
> >>>
> >>> Geschäftsführer: Thomas Ludwig, Jan Oetjen, Sascha Vollmer
> >>>
> >>>
> >>> Member of United Internet
> >>>
> >>> Diese E-Mail kann vertrauliche und/oder gesetzlich geschützte
> >>> Informationen enthalten. Wenn Sie nicht der bestimmungsgemäße Adressat
> >>> sind oder diese E-Mail irrtümlich erhalten haben, unterrichten Sie
> >>> bitte den Absender und vernichten Sie diese E-Mail. Anderen als dem
> >>> bestimmungsgemäßen Adressaten ist untersagt, diese E-Mail zu speichern,
> >>> weiterzuleiten oder ihren Inhalt auf welche Weise auch immer zu
> 

Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-23 Thread Alfredo Deza
On Wed, Jan 23, 2019 at 8:25 AM Jan Fajerski  wrote:
>
> On Wed, Jan 23, 2019 at 10:01:05AM +0100, Manuel Lausch wrote:
> >Hi,
> >
> >thats a bad news.
> >
> >round about 5000 OSDs are affected from this issue. It's not realy a
> >solution to redeploy this OSDs.
> >
> >Is it possible to migrate the local keys to the monitors?
> >I see that the OSDs with the "lockbox feature" has only one key for
> >data and journal partition and the older OSDs have individual keys for
> >journal and data. Might this be a problem?
> >
> >And a other question.
> >Is it a good idea to mix ceph-disk and ceph-volume managed OSDSs on one
> >host?
> >So I could only migrate newer OSDs to ceph-volume and deploy new
> >ones (after disk replacements) with ceph-volume until hopefuly there is
> >a solution.
> I might be wrong on this, since its been a while since I played with that. But
> iirc you can't migrate a subset of ceph-disk OSDs to ceph-volume on one host.
> Once you run ceph-volume simple activate, the ceph-disk systemd units and udev
> profiles will be disabled. While the remaining ceph-disk OSDs will continue to
> run, they won't come up after a reboot.

This is correct: once you "activate" ceph-disk OSDs via ceph-volume you
are disabling all udev/systemd triggers for those OSDs, so you must
migrate all of them.

I was assuming the question was more about keeping existing ceph-disk
OSDs while creating new ceph-volume OSDs, which you can do, as long as
this is not Nautilus or newer, where ceph-disk doesn't exist.

> I'm sure there's a way to get them running again, but I imagine you'd rather 
> not
> manually deal with that.
> >
> >Regards
> >Manuel
> >
> >
> >On Tue, 22 Jan 2019 07:44:02 -0500
> >Alfredo Deza  wrote:
> >
> >
> >> This is one case we didn't anticipate :/ We supported the wonky
> >> lockbox setup and thought we wouldn't need to go further back,
> >> although we did add support for both
> >> plain and luks keys.
> >>
> >> Looking through the code, it is very tightly couple to
> >> storing/retrieving keys from the monitors, and I don't know what
> >> workarounds might be possible here other than throwing away the OSD
> >> and deploying a new one (I take it this is not an option for you at
> >> all)
> >>
> >>
> >Manuel Lausch
> >
> >Systemadministrator
> >Storage Services
> >
> >1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
> >76135 Karlsruhe | Germany Phone: +49 721 91374-1847
> >E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de
> >
> >Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 5452
> >
> >Geschäftsführer: Thomas Ludwig, Jan Oetjen, Sascha Vollmer
> >
> >
> >Member of United Internet
> >
> >Diese E-Mail kann vertrauliche und/oder gesetzlich geschützte
> >Informationen enthalten. Wenn Sie nicht der bestimmungsgemäße Adressat
> >sind oder diese E-Mail irrtümlich erhalten haben, unterrichten Sie
> >bitte den Absender und vernichten Sie diese E-Mail. Anderen als dem
> >bestimmungsgemäßen Adressaten ist untersagt, diese E-Mail zu speichern,
> >weiterzuleiten oder ihren Inhalt auf welche Weise auch immer zu
> >verwenden.
> >
> >This e-mail may contain confidential and/or privileged information. If
> >you are not the intended recipient of this e-mail, you are hereby
> >notified that saving, distribution or use of the content of this e-mail
> >in any way is prohibited. If you have received this e-mail in error,
> >please notify the sender and delete the e-mail.
> >___
> >ceph-users mailing list
> >ceph-users@lists.ceph.com
> >http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> --
> Jan Fajerski
> Engineer Enterprise Storage
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nürnberg)
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-23 Thread Alfredo Deza
On Wed, Jan 23, 2019 at 4:01 AM Manuel Lausch  wrote:
>
> Hi,
>
> thats a bad news.
>
> round about 5000 OSDs are affected from this issue. It's not realy a
> solution to redeploy this OSDs.
>
> Is it possible to migrate the local keys to the monitors?
> I see that the OSDs with the "lockbox feature" has only one key for
> data and journal partition and the older OSDs have individual keys for
> journal and data. Might this be a problem?

I don't know what that would look like, but I think it is worth a try
if re-deploying OSDs is not feasible for you.

The key api for encryption is *very* odd and a lot of its quirks are
undocumented. For example, ceph-volume is stuck supporting naming
files and keys 'lockbox'
(for backwards compatibility) but there is no real lockbox anymore.
Another quirk is that when storing the secret in the monitor, it is
done using the following convention:

dm-crypt/osd/{OSD FSID}/luks

The 'luks' part there doesn't indicate anything about the type of
encryption (!!) so regardless of the type of encryption (luks or
plain) the key would still go there.

If you manage to get the keys into the monitors you still wouldn't be
able to scan OSDs to produce the JSON files, but you would be able to
create the JSON file with the
metadata that ceph-volume needs to run the OSD.

The contents are documented here:
http://docs.ceph.com/docs/master/ceph-volume/simple/scan/#json-contents
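Very roughly, for one OSD that would mean storing the secret under that
path and then writing the JSON file by hand, e.g. (a sketch; the fsid
and key file are placeholders, and double-check how the key needs to be
encoded before storing it):

    ceph config-key set dm-crypt/osd/<osd fsid>/luks <base64-encoded key>

and then the matching file under /etc/ceph/osd/ (named
<osd id>-<osd fsid>.json by convention) describes the data/journal
devices and the encryption fields.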

>
> And a other question.
> Is it a good idea to mix ceph-disk and ceph-volume managed OSDSs on one
> host?

I don't think it is a problem, but we don't test it so I can't say
with certainty.

> So I could only migrate newer OSDs to ceph-volume and deploy new
> ones (after disk replacements) with ceph-volume until hopefuly there is
> a solution.

I would strongly suggest implementing some automation to get all those
OSDs to 100% ceph-volume. The ceph-volume tooling for handling ceph-disk
OSDs is very robust and works very well, but it shouldn't be a long-term
solution for OSDs that were deployed with ceph-disk.
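For the OSDs that the 'simple' tooling does handle, the per-OSD
migration is just the scan/activate pair, roughly (a sketch with
placeholder id and fsid):

    ceph-volume simple scan /var/lib/ceph/osd/ceph-<id>
    ceph-volume simple activate <id> <osd fsid>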

>
> Regards
> Manuel
>
>
> On Tue, 22 Jan 2019 07:44:02 -0500
> Alfredo Deza  wrote:
>
>
> > This is one case we didn't anticipate :/ We supported the wonky
> > lockbox setup and thought we wouldn't need to go further back,
> > although we did add support for both
> > plain and luks keys.
> >
> > Looking through the code, it is very tightly couple to
> > storing/retrieving keys from the monitors, and I don't know what
> > workarounds might be possible here other than throwing away the OSD
> > and deploying a new one (I take it this is not an option for you at
> > all)
> >
> >
> Manuel Lausch
>
> Systemadministrator
> Storage Services
>
> 1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
> 76135 Karlsruhe | Germany Phone: +49 721 91374-1847
> E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de
>
> Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 5452
>
> Geschäftsführer: Thomas Ludwig, Jan Oetjen, Sascha Vollmer
>
>
> Member of United Internet
>
> Diese E-Mail kann vertrauliche und/oder gesetzlich geschützte
> Informationen enthalten. Wenn Sie nicht der bestimmungsgemäße Adressat
> sind oder diese E-Mail irrtümlich erhalten haben, unterrichten Sie
> bitte den Absender und vernichten Sie diese E-Mail. Anderen als dem
> bestimmungsgemäßen Adressaten ist untersagt, diese E-Mail zu speichern,
> weiterzuleiten oder ihren Inhalt auf welche Weise auch immer zu
> verwenden.
>
> This e-mail may contain confidential and/or privileged information. If
> you are not the intended recipient of this e-mail, you are hereby
> notified that saving, distribution or use of the content of this e-mail
> in any way is prohibited. If you have received this e-mail in error,
> please notify the sender and delete the e-mail.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] migrate ceph-disk to ceph-volume fails with dmcrypt

2019-01-22 Thread Alfredo Deza
On Tue, Jan 22, 2019 at 6:45 AM Manuel Lausch  wrote:
>
> Hi,
>
> we want upgrade our ceph clusters from jewel to luminous. And also want
> to migrate the osds to ceph-volume described in
> http://docs.ceph.com/docs/luminous/ceph-volume/simple/scan/#ceph-volume-simple-scan
>
> The clusters are running since dumpling and are setup with dmcrypt.
> Since dumpling there are until now three different types of dmcrypt
>
> plain dmcrypt with keys local
> luks with keys local
> luks with keys on the ceph monitors
>
> Now it seems only the last type can be migrated to ceph-volume.
>
> ceph-volume simple scan trys to mount a lockbox which does not exists
> on the older OSDs. Are those OSDs not supported with ceph-volume?

This is one case we didn't anticipate :/ We supported the wonky
lockbox setup and thought we wouldn't need to go further back,
although we did add support for both
plain and luks keys.

Looking through the code, it is very tightly coupled to
storing/retrieving keys from the monitors, and I don't know what
workarounds might be possible here other than throwing away the OSD
and deploying a new one (I take it this is not an option for you at all)


>
> This are the errors:
>
> # ceph-volume simple scan /var/lib/ceph/osd/ceph-183
>  stderr: lsblk: /var/lib/ceph/osd/ceph-183: not a block device
>  stderr: lsblk: /var/lib/ceph/osd/ceph-183: not a block device
> Running command: /usr/sbin/cryptsetup status 
> /dev/mapper/21ad7722-002f-464c-b460-a8976a7b4872
> Running command: /usr/sbin/cryptsetup status 
> 21ad7722-002f-464c-b460-a8976a7b4872
> Running command: mount -v  /tmp/tmp3t1WRC
>  stderr: mount:  is write-protected, mounting read-only
>  stderr: mount: unknown filesystem type '(null)'
> -->  RuntimeError: command returned non-zero exit status: 32
>
>
> and this is in the ceph-volume.log
>
> [2019-01-22 12:39:31,456][ceph_volume.process][INFO  ] Running command: 
> /usr/sbin/blkid -p /dev/mapper/9b68b7e9-854e-498a-8381-4eef128a9d7a
> [2019-01-22 12:39:31,533][ceph_volume.devices.simple.scan][INFO  ] detecting 
> if argument is a device or a directory: /var/lib/ceph/osd/ceph-183
> [2019-01-22 12:39:31,533][ceph_volume.devices.simple.scan][INFO  ] will scan 
> directly, path is a directory
> [2019-01-22 12:39:31,533][ceph_volume.devices.simple.scan][INFO  ] will scan 
> encrypted OSD directory at path: /var/lib/ceph/osd/ceph-183
> [2019-01-22 12:39:31,534][ceph_volume.process][INFO  ] Running command: 
> /usr/sbin/blkid -s PARTUUID -o value /dev/sdv1
> [2019-01-22 12:39:31,539][ceph_volume.process][INFO  ] stdout 
> 21ad7722-002f-464c-b460-a8976a7b4872
> [2019-01-22 12:39:31,540][ceph_volume.process][INFO  ] Running command: 
> /usr/sbin/cryptsetup status 21ad7722-002f-464c-b460-a8976a7b4872
> [2019-01-22 12:39:31,546][ceph_volume.process][INFO  ] stdout 
> /dev/mapper/21ad7722-002f-464c-b460-a8976a7b4872 is active and is in use.
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout type:PLAIN
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout cipher:  
> aes-cbc-essiv:sha256
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout keysize: 256 
> bits
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout key location: 
> dm-crypt
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout device:  
> /dev/sdv1
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout sector size:  
> 512
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout offset:  0 
> sectors
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout size:
> 7805646479 sectors
> [2019-01-22 12:39:31,547][ceph_volume.process][INFO  ] stdout mode:
> read/write
> [2019-01-22 12:39:31,548][ceph_volume.process][INFO  ] Running command: mount 
> -v  /tmp/tmp3t1WRC
> [2019-01-22 12:39:31,597][ceph_volume.process][INFO  ] stderr mount:  is 
> write-protected, mounting read-only
> [2019-01-22 12:39:31,622][ceph_volume.process][INFO  ] stderr mount: unknown 
> filesystem type '(null)'
> [2019-01-22 12:39:31,622][ceph_volume][ERROR ] exception caught by decorator
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 59, 
> in newfunc
> return f(*a, **kw)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/main.py", line 148, in 
> main
> terminal.dispatch(self.mapper, subcommand_args)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, 
> in dispatch
> instance.main()
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/main.py", 
> line 33, in main
> terminal.dispatch(self.mapper, self.argv)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/terminal.py", line 182, 
> in dispatch
> instance.main()
>   File "/usr/lib/python2.7/site-packages/ceph_volume/devices/simple/scan.py", 
> line 353, in main
> self.scan(args)
>   File "/usr/lib/python2.7/site-packages/ceph_volume/decorators.py", line 16, 
> in is_root
> return func(*a, 

Re: [ceph-users] Problem with OSDs

2019-01-21 Thread Alfredo Deza
On Sun, Jan 20, 2019 at 11:30 PM Brian Topping  wrote:
>
> Hi all, looks like I might have pooched something. Between the two nodes I 
> have, I moved all the PGs to one machine, reformatted the other machine, 
> rebuilt that machine, and moved the PGs back. In both cases, I did this by 
> taking the OSDs on the machine being moved from “out” and waiting for health 
> to be restored, then took them down.
>
> This worked great up to the point I had the mon/manager/rgw where they 
> started, all the OSDs/PGs on the other machine that had been rebuilt. The 
> next step was to rebuild the master machine, copy /etc/ceph and /var/lib/ceph 
> with cpio, then re-add new OSDs on the master machine as it were.
>
> This didn’t work so well. The master has come up just fine, but it’s not 
> connecting to the OSDs. Of the four OSDs, only two came up, and the other two 
> did not (IDs 1 and 3). For it's part, the OSD machine is reporting lines like 
> the following in it’s logs:
>
> > [2019-01-20 16:22:10,106][systemd][WARNING] failed activating OSD, retries 
> > left: 2
> > [2019-01-20 16:22:15,111][ceph_volume.process][INFO  ] Running command: 
> > /usr/sbin/ceph-volume lvm trigger 1-e3bfc69e-a145-4e19-aac2-5f888e1ed2ce
> > [2019-01-20 16:22:15,271][ceph_volume.process][INFO  ] stderr -->  
> > RuntimeError: could not find osd.1 with fsid 
> > e3bfc69e-a145-4e19-aac2-5f888e1ed2ce

When creating an OSD, ceph-volume will capture the ID and the FSID and
use these to create a systemd unit. When the system boots, it queries
LVM for devices that match that ID/FSID information.

Is it possible you attempted to create an OSD, it failed, and you tried
again? That would explain why there is a systemd unit with an FSID that
doesn't match. By the output, it does look like
you have an OSD 1, but with a different FSID (467... instead of
e3b...). You could try to disable the failing systemd unit with:

systemctl disable ceph-volume@lvm-1-e3bfc69e-a145-4e19-aac2-5f888e1ed2ce.service

(Follow up with OSD 3) and then run:

ceph-volume lvm activate --all

Hopefully that can get you back into activated OSDs
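To double-check which stale ceph-volume units are still enabled before
disabling them, something like this should list them (a sketch):

    find /etc/systemd/system | grep ceph-volume@lvm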
>
>
> I see this for the volumes:
>
> > [root@gw02 ceph]# ceph-volume lvm list
> >
> > == osd.1 ===
> >
> >   [block]
> > /dev/ceph-c7640f3e-0bf5-4d75-8dd4-00b6434c84d9/osd-block-4672bb90-8cea-4580-85f2-1e692811a05a
> >
> >   type  block
> >   osd id1
> >   cluster fsid  1cf94ce9-1323-4c43-865f-68f4ae9e6af3
> >   cluster name  ceph
> >   osd fsid  4672bb90-8cea-4580-85f2-1e692811a05a
> >   encrypted 0
> >   cephx lockbox secret
> >   block uuid3M5fen-JgsL-t4vz-bh3m-k3pf-hjBV-4R7Cff
> >   block device  
> > /dev/ceph-c7640f3e-0bf5-4d75-8dd4-00b6434c84d9/osd-block-4672bb90-8cea-4580-85f2-1e692811a05a
> >   vdo   0
> >   crush device classNone
> >   devices   /dev/sda3
> >
> > == osd.3 ===
> >
> >   [block]
> > /dev/ceph-f5f453df-1d41-4883-b0f8-d662c6ba8bea/osd-block-084cf33d-8a38-4c82-884a-7c88e3161479
> >
> >   type  block
> >   osd id3
> >   cluster fsid  1cf94ce9-1323-4c43-865f-68f4ae9e6af3
> >   cluster name  ceph
> >   osd fsid  084cf33d-8a38-4c82-884a-7c88e3161479
> >   encrypted 0
> >   cephx lockbox secret
> >   block uuidPSU2ba-6PbF-qhm7-RMER-lCkR-j58b-G9B6A7
> >   block device  
> > /dev/ceph-f5f453df-1d41-4883-b0f8-d662c6ba8bea/osd-block-084cf33d-8a38-4c82-884a-7c88e3161479
> >   vdo   0
> >   crush device classNone
> >   devices   /dev/sdb3
> >
> > == osd.5 ===
> >
> >   [block]
> > /dev/ceph-033e2bbe-5005-45d9-9ecd-4b541fe010bd/osd-block-e854930d-1617-4fe7-b3cd-98ef284643fd
> >
> >   type  block
> >   osd id5
> >   cluster fsid  1cf94ce9-1323-4c43-865f-68f4ae9e6af3
> >   cluster name  ceph
> >   osd fsid  e854930d-1617-4fe7-b3cd-98ef284643fd
> >   encrypted 0
> >   cephx lockbox secret
> >   block uuidF5YIfz-quO4-gbmW-rxyP-qXxe-iN7a-Po1mL9
> >   block device  
> > /dev/ceph-033e2bbe-5005-45d9-9ecd-4b541fe010bd/osd-block-e854930d-1617-4fe7-b3cd-98ef284643fd
> >   vdo   0
> >   crush device classNone
> >   devices   /dev/sdc3
> >
> > == osd.7 ===
> >
> >   [block]
> > /dev/ceph-1f3d4406-af86-4813-8d06-a001c57408fa/osd-block-5c0d0404-390e-4801-94a9-da52c104206f
> >
> >   type  block
> >   osd id7
> >   cluster fsid  1cf94ce9-1323-4c43-865f-68f4ae9e6af3
> >   

Re: [ceph-users] block.db on a LV? (Re: Mixed SSD+HDD OSD setup recommendation)

2019-01-18 Thread Alfredo Deza
On Fri, Jan 18, 2019 at 10:07 AM Jan Kasprzak  wrote:
>
> Alfredo,
>
> Alfredo Deza wrote:
> : On Fri, Jan 18, 2019 at 7:21 AM Jan Kasprzak  wrote:
> : > Eugen Block wrote:
> : > :
> : > : I think you're running into an issue reported a couple of times.
> : > : For the use of LVM you have to specify the name of the Volume Group
> : > : and the respective Logical Volume instead of the path, e.g.
> : > :
> : > : ceph-volume lvm prepare --bluestore --block.db ssd_vg/ssd00 --data 
> /dev/sda
> : > thanks, I will try it. In the meantime, I have discovered another way
> : > how to get around it: convert my SSDs from MBR to GPT partition table,
> : > and then create 15 additional GPT partitions for the respective block.dbs
> : > instead of 2x15 LVs.
> :
> : This is because ceph-volume can accept both LVs or GPT partitions for 
> block.db
> :
> : Another way around this, that doesn't require you to create the LVs is
> : to use the `batch` sub-command, that will automatically
> : detect your HDD and put data on it, and detect the SSD and create the
> : block.db LVs. The command could look something like:
> :
> :
> : ceph-volume lvm batch --bluestore /dev/sda /dev/sdb /dev/sdc /dev/sdd
> : /dev/nvme0n1
> :
> : Would create 4 OSDs, place data on: sda, sdb, sdc, and sdd. And create
> : 4 block.db LVs on nvme0n1
>
> Interesting. Thanks!
>
> Can the batch command accept also partitions instead of a whole
> device for block.db? I already have two partitions on my SSDs for
> root and swap.

Ah in that case, no. The idea is that it abstracts the handling in
such a way that it is as hands-off as possible, which is hard to
accomplish if there are pre-existing partitions on the SSDs. However, it
is still possible if the SSDs have LVs on them: the sub-command will
just figure out what free space is available and use that.
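If batch can't consume the partially partitioned SSDs, one manual
fallback is to carve the block.db LVs yourself and hand them to
`lvm create`, e.g. (a sketch with hypothetical names, assuming /dev/sdc3
is the spare partition on the SSD):

    pvcreate /dev/sdc3
    vgcreate ssd_db /dev/sdc3
    lvcreate -L 30G -n db-sda ssd_db
    ceph-volume lvm create --bluestore --data /dev/sda --block.db ssd_db/db-sda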


>
> -Yenya
>
> --
> | Jan "Yenya" Kasprzak  |
> | http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
>  This is the world we live in: the way to deal with computers is to google
>  the symptoms, and hope that you don't have to watch a video. --P. Zaitcev
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] dropping python 2 for nautilus... go/no-go

2019-01-18 Thread Alfredo Deza
On Fri, Jan 18, 2019 at 7:07 AM Hector Martin  wrote:
>
> On 17/01/2019 00:45, Sage Weil wrote:
> > Hi everyone,
> >
> > This has come up several times before, but we need to make a final
> > decision.  Alfredo has a PR prepared that drops Python 2 support entirely
> > in master, which will mean nautilus is Python 3 only.
> >
> > All of our distro targets (el7, bionic, xenial) include python 3, so that
> > isn't an issue.  However, it also means that users of python-rados,
> > python-rbd, and python-cephfs will need to be using python 3.
>
> I'm not sure dropping Python 2 support in Nautilus is reasonable...
> simply because Python 3 support isn't quite stable in Mimic yet - I just
> filed https://tracker.ceph.com/issues/37963 for ceph-volume being broken
> with Python 3 and dm-crypt :-)

These are exactly the type of things we can't really get to test,
because we rely on functional coverage. Since we currently build Ceph
with Python2 support, the binaries end up "choosing" the Python2
interpreter and so the tests all run under Python2.

The other issue is that we found it borderline impossible to toggle
Python2/Python3 builds to allow some builds to be Python3 so that we
can actually run some tests against them.

I do expect breakage though, so the sooner we get to switch the
better. Seems like we are leaning towards merging the
Python3-exclusive branch once Nautilus is out.

>
> I think there needs to be a release that supports both equally well to
> give people time to safely migrate over. Might be worth doing some
> tree-wide reviews (like that division thing) to hopefully squash more
> lurking Python 3 bugs.
>
> (just my 2c - maybe I got unlucky and otherwise things work well enough
> for everyone else in Py3; I'm certainly happy to get rid of Py2 ASAP).
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://marcan.st/marcan.asc
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] block.db on a LV? (Re: Mixed SSD+HDD OSD setup recommendation)

2019-01-18 Thread Alfredo Deza
On Fri, Jan 18, 2019 at 7:21 AM Jan Kasprzak  wrote:
>
> Eugen Block wrote:
> : Hi Jan,
> :
> : I think you're running into an issue reported a couple of times.
> : For the use of LVM you have to specify the name of the Volume Group
> : and the respective Logical Volume instead of the path, e.g.
> :
> : ceph-volume lvm prepare --bluestore --block.db ssd_vg/ssd00 --data /dev/sda
>
> Eugen,
>
> thanks, I will try it. In the meantime, I have discovered another way
> how to get around it: convert my SSDs from MBR to GPT partition table,
> and then create 15 additional GPT partitions for the respective block.dbs
> instead of 2x15 LVs.

This is because ceph-volume can accept either LVs or GPT partitions for block.db.

Another way around this, which doesn't require you to create the LVs,
is to use the `batch` sub-command, which will automatically detect your
HDD and put data on it, and detect the SSD and create the block.db LVs.
The command could look something like:


ceph-volume lvm batch --bluestore /dev/sda /dev/sdb /dev/sdc /dev/sdd
/dev/nvme0n1

Would create 4 OSDs, place data on: sda, sdb, sdc, and sdd. And create
4 block.db LVs on nvme0n1



>
> -Yenya
>
> --
> | Jan "Yenya" Kasprzak  |
> | http://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 |
>  This is the world we live in: the way to deal with computers is to google
>  the symptoms, and hope that you don't have to watch a video. --P. Zaitcev
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SLOW SSD's after moving to Bluestore

2018-12-12 Thread Alfredo Deza
On Tue, Dec 11, 2018 at 7:28 PM Tyler Bishop
 wrote:
>
> Now I'm just trying to figure out how to create filestore in Luminous.
> I've read every doc and tried every flag but I keep ending up with
> either a data LV of 100% on the VG or a bunch fo random errors for
> unsupported flags...

An LV with 100% of the VG sounds like it tried to deploy bluestore.
ceph-deploy will try to behave like that unless LVs are created by
hand.

A newer option would be to try the `ceph-volume lvm batch` command
directly on your server (not yet supported by ceph-deploy) to create all
the VGs/LVs needed, including detecting HDDs and SSDs and sending the
journals to the SSDs if any are present:

ceph-volume lvm batch --filestore /dev/sda /dev/sdb /dev/sdc

Would create 3 OSDs, one for each spinning drive (assuming these are
spinning), and colocate the journal on the device itself. To put the
journals on a separate device, a solid state device would need to be
added, for example:

ceph-volume lvm batch --filestore /dev/sda /dev/sdb /dev/sdc /dev/nvme0n1

Would create 3 OSDs again, but would put 3 journals on nvme0n1


>
> # ceph-disk prepare --filestore --fs-type xfs --data-dev /dev/sdb1
> --journal-dev /dev/sdb2 --osd-id 3
> usage: ceph-disk [-h] [-v] [--log-stdout] [--prepend-to-path PATH]
>  [--statedir PATH] [--sysconfdir PATH] [--setuser USER]
>  [--setgroup GROUP]
>
>
> {prepare,activate,activate-lockbox,activate-block,activate-journal,activate-all,list,suppress-activate,unsuppress-activate,deactivate,destroy,zap,trigger,fix}
>  ...
> ceph-disk: error: unrecognized arguments: /dev/sdb1
> On Tue, Dec 11, 2018 at 7:22 PM Christian Balzer  wrote:
> >
> >
> > Hello,
> >
> > On Tue, 11 Dec 2018 23:22:40 +0300 Igor Fedotov wrote:
> >
> > > Hi Tyler,
> > >
> > > I suspect you have BlueStore DB/WAL at these drives as well, don't you?
> > >
> > > Then perhaps you have performance issues with f[data]sync requests which
> > > DB/WAL invoke pretty frequently.
> > >
> > Since he explicitly mentioned using these SSDs with filestore AND the
> > journals on the same SSD I'd expect a similar impact aka piss-poor
> > performance in his existing setup (the 300 other OSDs).
> >
> > Unless of course some bluestore is significantly more sync happy than the
> > filestore journal and/or other bluestore particulars (reduced caching
> > space, not caching in some situations) are rearing their ugly heads.
> >
> > Christian
> >
> > > See the following links for details:
> > >
> > > https://www.percona.com/blog/2018/02/08/fsync-performance-storage-devices/
> > >
> > > https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
> > >
> > > The latter link shows pretty poor numbers for M500DC drives.
> > >
> > >
> > > Thanks,
> > >
> > > Igor
> > >
> > >
> > > On 12/11/2018 4:58 AM, Tyler Bishop wrote:
> > >
> > > > Older Crucial/Micron M500/M600
> > > > _
> > > >
> > > > *Tyler Bishop*
> > > > EST 2007
> > > >
> > > >
> > > > O:513-299-7108 x1000
> > > > M:513-646-5809
> > > > http://BeyondHosting.net 
> > > >
> > > >
> > > > This email is intended only for the recipient(s) above and/or
> > > > otherwise authorized personnel. The information contained herein and
> > > > attached is confidential and the property of Beyond Hosting. Any
> > > > unauthorized copying, forwarding, printing, and/or disclosing
> > > > any information related to this email is prohibited. If you received
> > > > this message in error, please contact the sender and destroy all
> > > > copies of this email and any attachment(s).
> > > >
> > > >
> > > > On Mon, Dec 10, 2018 at 8:57 PM Christian Balzer  > > > > wrote:
> > > >
> > > > Hello,
> > > >
> > > > On Mon, 10 Dec 2018 20:43:40 -0500 Tyler Bishop wrote:
> > > >
> > > > > I don't think thats my issue here because I don't see any IO to
> > > > justify the
> > > > > latency.  Unless the IO is minimal and its ceph issuing a bunch
> > > > of discards
> > > > > to the ssd and its causing it to slow down while doing that.
> > > > >
> > > >
> > > > What does atop have to say?
> > > >
> > > > Discards/Trims are usually visible in it, this is during a fstrim 
> > > > of a
> > > > RAID1 / :
> > > > ---
> > > > DSK |  sdb  | busy 81% |  read   0 | write  8587
> > > > | MBw/s 2323.4 |  avio 0.47 ms |
> > > > DSK |  sda  | busy 70% |  read   2 | write  8587
> > > > | MBw/s 2323.4 |  avio 0.41 ms |
> > > > ---
> > > >
> > > > The numbers tend to be a lot higher than what the actual interface 
> > > > is
> > > > capable of, clearly the SSD is reporting its internal activity.
> > > >
> > > > In any case, it should give a good insight of what is going on
> > > > activity
> > > > wise.
> > > > Also for posterity and curiosity, what kind of SSDs?

Re: [ceph-users] SLOW SSD's after moving to Bluestore

2018-12-12 Thread Alfredo Deza
On Tue, Dec 11, 2018 at 8:16 PM Mark Kirkwood
 wrote:
>
> Looks like the 'delaylog' option for xfs is the problem - no longer supported 
> in later kernels. See 
> https://github.com/torvalds/linux/commit/444a702231412e82fb1c09679adc159301e9242c
>
> Offhand I'm not sure where that option is being added (whether ceph-deploy or 
> ceph-volume), but you could just do surgery on whichever one is adding it...

The default flags that ceph-volume uses for mounting XFS are:

rw,noatime,inode64

These can be overridden by a ceph.conf entry:

osd_mount_options_xfs = rw,noatime,inode64
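So if your deployment is carrying the old flag set, dropping 'delaylog'
from that entry on the OSD hosts should let the mount succeed, e.g.
something like this (a sketch that keeps the rest of the options from
the failing mount):

    [osd]
    osd_mount_options_xfs = rw,noatime,noquota,logbsize=256k,logbufs=8,inode64,allocsize=4M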
>
> regards
>
> Mark
>
>
> On 12/12/18 1:33 PM, Tyler Bishop wrote:
>>
>>
>>> [osci-1001][DEBUG ] Running command: mount -t xfs -o 
>>> "rw,noatime,noquota,logbsize=256k,logbufs=8,inode64,allocsize=4M,delaylog" 
>>> /dev/ceph-7b308a5a-a8e9-48aa-86a9-39957dcbd1eb/osd-data-81522145-e31b-4325-83fd-6cfefc1b761f
>>>  /var/lib/ceph/osd/ceph-1
>>>
>>> [osci-1001][DEBUG ]  stderr: mount: unsupported option format: 
>>> "rw,noatime,noquota,logbsize=256k,logbufs=8,inode64,allocsize=4M,delaylog"
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 'ceph-deploy osd create' and filestore OSDs

2018-12-05 Thread Alfredo Deza
On Tue, Dec 4, 2018 at 6:44 PM Matthew Pounsett  wrote:
>
>
>
> On Tue, 4 Dec 2018 at 18:31, Vasu Kulkarni  wrote:
>>>
>>>
>>> Is there a way we can easily set that up without trying to use outdated 
>>> tools?  Presumably if ceph still supports this as the docs claim, there's a 
>>> way to get it done without using ceph-deploy?
>>
>> It might be more involved if you are trying to setup manually, you can give 
>> 1.5.38  a try(not that old) and see if it works 
>> https://pypi.org/project/ceph-deploy/1.5.38/

Vasu has pointed out pretty much everything correctly. If you don't
want the new syntax, 1.5.38 is what you want. Would like to point out
a couple of things here:

* ceph-deploy has a pretty good changelog that gets updated for every
release. The 2.0.0 release has a backwards incompatibility notice
explaining much of the issues you've raised:
http://docs.ceph.com/ceph-deploy/docs/changelog.html

* Deploying to directories is not supported anymore with ceph-deploy,
and will soon be impossible with the Ceph tooling altogether. If you are trying to
replicate production environments with smaller devices, you can still
do this
with some manual work: ceph-deploy (and ceph-volume on the remote
machine) can consume logical volumes, which could be easily set up to
be on a loop device. That is what we do for some of the functional
testing:
we create sparse files of 10GB and attach them to a loop device, then
create an LV on top.
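For reference, a throwaway loop-backed OSD can be put together with
something like this (a sketch; sizes, paths, and names are all
hypothetical):

    truncate -s 10G /srv/osd0.img
    losetup /dev/loop0 /srv/osd0.img
    pvcreate /dev/loop0
    vgcreate vg_osd0 /dev/loop0
    lvcreate -l 100%FREE -n osd0 vg_osd0
    ceph-volume lvm create --bluestore --data vg_osd0/osd0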

>>
>
>  Not that old now.. but eventually it will be. :)
>
> The goal was to revisit our deployment tools later (after getting the rest of 
> the service working) and replace ceph-deploy with direct configuration of all 
> the machines, so maybe having a deprecated piece of software around that 
> needs to be replaced will help with that motivation when the time comes.
>
> Thanks for your help!
>Matt
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-11-30 Thread Alfredo Deza
On Fri, Nov 30, 2018 at 3:10 PM Paul Emmerich  wrote:
>
> Am Mo., 8. Okt. 2018 um 23:34 Uhr schrieb Alfredo Deza :
> >
> > On Mon, Oct 8, 2018 at 5:04 PM Paul Emmerich  wrote:
> > >
> > > ceph-volume unfortunately doesn't handle completely hanging IOs too
> > > well compared to ceph-disk.
> >
> > Not sure I follow, would you mind expanding on what you mean by
> > "ceph-volume unfortunately doesn't handle completely hanging IOs" ?
> >
> > ceph-volume just provisions the OSD, nothing else. If LVM is hanging,
> > there is nothing we could do there, just like ceph-disk wouldn't be
> > able to do anything if the partitioning
> > tool would hang.
>
> Another follow-up for this since I ran into issues with ceph-volume
> again a few times in the last weeks:
> I've opened issues for the main problems that we are seeing since
> using ceph-volume
>
> http://tracker.ceph.com/issues/37490
> http://tracker.ceph.com/issues/37487
> http://tracker.ceph.com/issues/37492
>
> The summary is that most operations need to access *all* disks and
> that will cause problems if one of them is misbehaving.
> ceph-disk didn't have this problem (but a lot of other problems,
> overall we are more happy with ceph-volume)

Paul, thank you so much for opening these issues. It is sometimes hard
to prevent these sorts of "real world" usage problems.

None of them seem hard to tackle, I anticipate they will get done and
merged rather quickly.
>
> Paul
>
> >
> >
> >
> > > It needs to read actual data from each
> > > disk and it'll just hang completely if any of the disks doesn't
> > > respond.
> > >
> > > The low-level command to get the information from LVM is:
> > >
> > > lvs -o lv_tags
> > >
> > > this allows you to map a LV to an OSD id.
> > >
> > >
> > > Paul
> > > Am Mo., 8. Okt. 2018 um 12:09 Uhr schrieb Kevin Olbrich :
> > > >
> > > > Hi!
> > > >
> > > > Yes, thank you. At least on one node this works, the other node just 
> > > > freezes but this might by caused by a bad disk that I try to find.
> > > >
> > > > Kevin
> > > >
> > > > Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander 
> > > > :
> > > >>
> > > >> Hi,
> > > >>
> > > >> $ ceph-volume lvm list
> > > >>
> > > >> Does that work for you?
> > > >>
> > > >> Wido
> > > >>
> > > >> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
> > > >> > Hi!
> > > >> >
> > > >> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
> > > >> > Before I migrated from filestore with simple-mode to bluestore with 
> > > >> > lvm,
> > > >> > I was able to find the raw disk with "df".
> > > >> > Now, I need to go from LVM LV to PV to disk every time I need to
> > > >> > check/smartctl a disk.
> > > >> >
> > > >> > Kevin
> > > >> >
> > > >> >
> > > >> > ___
> > > >> > ceph-users mailing list
> > > >> > ceph-users@lists.ceph.com
> > > >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > > >> >
> > > >
> > > > ___
> > > > ceph-users mailing list
> > > > ceph-users@lists.ceph.com
> > > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> > >
> > >
> > > --
> > > Paul Emmerich
> > >
> > > Looking for help with your Ceph cluster? Contact us at https://croit.io
> > >
> > > croit GmbH
> > > Freseniusstr. 31h
> > > 81247 München
> > > www.croit.io
> > > Tel: +49 89 1896585 90
> > > ___
> > > ceph-users mailing list
> > > ceph-users@lists.ceph.com
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migration osds to Bluestore on Ubuntu 14.04 Trusty

2018-11-15 Thread Alfredo Deza
On Thu, Nov 15, 2018 at 8:57 AM Klimenko, Roman  wrote:
>
> Hi everyone!
>
> As I noticed, ceph-volume lacks Ubuntu Trusty compatibility  
> https://tracker.ceph.com/issues/23496
>
> So, I can't follow this instruction 
> http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/
>
> Do I have any other option to migrate my Filestore osds (Luminous 12.2.9)  to 
> Bluestore?
>
> P.S This is a test environment, so I can try anything

You could just use ceph-disk, but the way ceph-volume does bluestore
is more robust. I would try really hard to upgrade the OS so that you
can rely on ceph-volume.

>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Unhelpful behaviour of ceph-volume lvm batch with >1 NVME card for block.db

2018-11-14 Thread Alfredo Deza
On Wed, Nov 14, 2018 at 9:10 AM Matthew Vernon  wrote:
>
> Hi,
>
> We currently deploy our filestore OSDs with ceph-disk (via
> ceph-ansible), and I was looking at using ceph-volume as we migrate to
> bluestore.
>
> Our servers have 60 OSDs and 2 NVME cards; each OSD is made up of a
> single hdd, and an NVME partition for journal.
>
> If, however, I do:
> ceph-volume lvm batch /dev/sda /dev/sdb [...] /dev/nvme0n1 /dev/nvme1n1
> then I get (inter alia):
>
> Solid State VG:
>   Targets:   block.db  Total size: 1.82 TB
>   Total LVs: 2 Size per LV: 931.51 GB
>
>   Devices:   /dev/nvme0n1, /dev/nvme1n1
>
> i.e. ceph-volume is going to make a single VG containing both NVME
> devices, and split that up into LVs to use for block.db
>
> It seems to me that this is straightforwardly the wrong answer - either
> NVME failing will now take out *every* OSD on the host, whereas the
> obvious alternative (one VG per NVME, divide those into LVs) would give
> you just as good performance, but you'd only lose 1/2 the OSDs if an
> NVME card failed.
>
> Am I missing something obvious here?

This is exactly the intended behavior. The `lvm batch` sub-command is
meant to simplify LV management, and by doing so, it has to adhere to
some constraints.

These constraints (making a single VG out of both NVMe devices) make
the implementation far simpler and more robust, and allow us to
accommodate a lot of different scenarios, but I do see how this might
be unexpected.

>
> I appreciate I /could/ do it all myself, but even using ceph-ansible
> that's going to be very tiresome...
>

Right, so you are able to chop the devices up in any way you find more
acceptable (creating LVs and then passing them to `lvm create`)

There is a bit of wiggle room here, though: you could deploy half of it
first, which would force `lvm batch` to use just one NVMe:

ceph-volume lvm batch /dev/sda [...] /dev/nvme0n1

And then the rest of the devices:

ceph-volume lvm batch /dev/sdb [...] /dev/nvme1n1
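After each pass you can confirm which NVMe ended up backing which
block.db, for example with (a quick sketch):

    ceph-volume lvm list
    lvs -o lv_name,vg_name,lv_tags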

> Regards,
>
> Matthew
>
>
> --
>  The Wellcome Sanger Institute is operated by Genome Research
>  Limited, a charity registered in England with number 1021457 and a
>  company registered in England with number 2742969, whose registered
>  office is 215 Euston Road, London, NW1 2BE.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph 12.2.9 release

2018-11-08 Thread Alfredo Deza
On Thu, Nov 8, 2018 at 3:02 AM Janne Johansson  wrote:
>
> Den ons 7 nov. 2018 kl 18:43 skrev David Turner :
> >
> > My big question is that we've had a few of these releases this year that 
> > are bugged and shouldn't be upgraded to... They don't have any release 
> > notes or announcement and the only time this comes out is when users 
> > finally ask about it weeks later.  Why is this not proactively announced to 
> > avoid a problematic release and hopefully prevent people from installing 
> > it?  It would be great if there was an actual release notes saying not to 
> > upgrade to this version or something.
>
> I think the big question is why do these packages end up publicly so
> that scripts, updates and anyone not actively trying to hold back get
> exposed to them, then we are somehow supposed to notice that the
> accompanying release notes are lacking and then from that divinate
> that we shouldn't have upgraded into this release at all. This seems
> all backwards in most possible ways.
>
> I'm not even upset about releases having bugs, stuff happens, but the
> way people are forced into it, then it's somehow your fault for
> running ceph-deploy or yum/apt upgrade against official release-repos.

It isn't your fault (or anyone's in the community); we don't have a
good system in place for the community repos to manage this in a way
that would help when problems like this come up.

We are in a much better place than a few years ago though, when
packages had to be placed manually when creating repos, but more work
is needed to address multi-version support in deb repos
and yanking known bad versions out of the official/latest release location.

> It's almost as if it was meant to push people into slow-moving dists
> like Blue^H^H^H^HRedhat with ceph on top.
>
> --
> May the most significant bit of your life be positive.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Alfredo Deza
It is pretty difficult to know what step you are missing if all we are
seeing is the `activate --all` command.

Maybe try the OSDs one by one, capturing each command throughout the
process along with its output. In the filestore-to-bluestore guides we
never advertise `activate --all`, for example.

Something is missing here, and I can't tell what it is.
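For example, for one of the problem OSDs the explicit form would be
(a sketch; the fsid comes from `ceph-volume lvm list`):

    ceph-volume lvm list
    ceph-volume lvm activate 60 <osd fsid>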
On Tue, Nov 6, 2018 at 4:13 PM Hayashida, Mami  wrote:
>
> This is becoming even more confusing. I got rid of those 
> ceph-disk@6[0-9].service (which had been symlinked to /dev/null).  Moved 
> /var/lib/ceph/osd/ceph-6[0-9] to  /var/./osd_old/.  Then, I ran  
> `ceph-volume lvm activate --all`.  I got once again
>
> root@osd1:~# ceph-volume lvm activate --all
> --> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-1bf13d09fb3d
> Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
> --> Absolute path not found for executable: restorecon
> --> Ensure $PATH environment variable contains common executable locations
> Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev 
> /dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
>  stderr: failed to read label for /dev/hdd67/data67: (2) No such file or 
> directory
> -->  RuntimeError: command returned non-zero exit status: 1
>
> But when I ran `df` and `mount` ceph-67 is the only one that exists. (and in  
> /var/lib/ceph/osd/)
>
> root@osd1:~# df -h | grep ceph-6
> tmpfs   126G 0  126G   0% /var/lib/ceph/osd/ceph-67
>
> root@osd1:~# mount | grep ceph-6
> tmpfs on /var/lib/ceph/osd/ceph-67 type tmpfs (rw,relatime)
>
> root@osd1:~# ls /var/lib/ceph/osd/ | grep ceph-6
> ceph-67
>
> But in I cannot restart any of these 10 daemons (`systemctl start 
> ceph-osd@6[0-9]`).
>
> I am wondering if I should zap these 10 osds and start over although at this 
> point I am afraid even zapping may not be a simple task
>
>
>
> On Tue, Nov 6, 2018 at 3:44 PM, Hector Martin  wrote:
>>
>> On 11/7/18 5:27 AM, Hayashida, Mami wrote:
>> > 1. Stopped osd.60-69:  no problem
>> > 2. Skipped this and went to #3 to check first
>> > 3. Here, `find /etc/systemd/system | grep ceph-volume` returned
>> > nothing.  I see in that directory
>> >
>> > /etc/systemd/system/ceph-disk@60.service# and 61 - 69.
>> >
>> > No ceph-volume entries.
>>
>> Get rid of those, they also shouldn't be there. Then `systemctl
>> daemon-reload` and continue, see if you get into a good state. basically
>> feel free to nuke anything in there related to OSD 60-69, since whatever
>> is needed should be taken care of by the ceph-volume activation.
>>
>>
>> --
>> Hector Martin (hec...@marcansoft.com)
>> Public Key: https://mrcn.st/pub
>
>
>
>
> --
> Mami Hayashida
> Research Computing Associate
>
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy osd creation failed with multipath and dmcrypt

2018-11-06 Thread Alfredo Deza
On Tue, Nov 6, 2018 at 8:41 AM Pavan, Krish  wrote:
>
> Trying to created OSD with multipath with dmcrypt and it failed . Any 
> suggestion please?.

ceph-disk is known to have issues like this. It is already deprecated
in the Mimic release and will no longer be available for the upcoming
release (Nautilus).

I would strongly suggest you upgrade ceph-deploy to the 2.X.X series
which supports ceph-volume.
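With the 2.X.X series the syntax changes to the --data style, so the
equivalent invocation would look roughly like this (a sketch; please
double-check against the current ceph-deploy docs):

    ceph-deploy osd create --data /dev/mapper/mpathr --dmcrypt --bluestore ceph-store1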

>
> ceph-deploy --overwrite-conf osd create ceph-store1:/dev/mapper/mpathr 
> --bluestore --dmcrypt  -- failed
>
> ceph-deploy --overwrite-conf osd create ceph-store1:/dev/mapper/mpathr 
> --bluestore – worked
>
>
>
> the logs for fail
>
> [ceph-store12][WARNIN] command: Running command: /usr/sbin/restorecon -R 
> /var/lib/ceph/osd-lockbox/e15f1adc-feff-4890-a617-adc473e7331e/magic.68428.tmp
>
> [ceph-store12][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph 
> /var/lib/ceph/osd-lockbox/e15f1adc-feff-4890-a617-adc473e7331e/magic.68428.tmp
>
> [ceph-store12][WARNIN] Traceback (most recent call last):
>
> [ceph-store12][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in 
>
> [ceph-store12][WARNIN] load_entry_point('ceph-disk==1.0.0', 
> 'console_scripts', 'ceph-disk')()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5736, in run
>
> [ceph-store12][WARNIN] main(sys.argv[1:])
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5687, in main
>
> [ceph-store12][WARNIN] args.func(args)
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2108, in main
>
> [ceph-store12][WARNIN] Prepare.factory(args).prepare()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2097, in prepare
>
> [ceph-store12][WARNIN] self._prepare()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2171, in _prepare
>
> [ceph-store12][WARNIN] self.lockbox.prepare()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2861, in prepare
>
> [ceph-store12][WARNIN] self.populate()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2818, in populate
>
> [ceph-store12][WARNIN] get_partition_base(self.partition.get_dev()),
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 844, in 
> get_partition_base
>
> [ceph-store12][WARNIN] raise Error('not a partition', dev)
>
> [ceph-store12][WARNIN] ceph_disk.main.Error: Error: not a partition: 
> /dev/dm-215
>
> [ceph-store12][ERROR ] RuntimeError: command returned non-zero exit status: 1
>
> [ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-disk -v 
> prepare --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys --bluestore 
> --cluster ceph --fs-type btrfs -- /dev/mapper/mpathr
>
> [ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Alfredo Deza
On Mon, Nov 5, 2018 at 4:21 PM Hayashida, Mami  wrote:
>
> Yes, I still have the volume log showing the activation process for ssd0/db60 
> (and 61-69 as well).   I will email it to you directly as an attachment.

In the logs, I see that ceph-volume does set the permissions correctly:

[2018-11-02 16:20:07,238][ceph_volume.process][INFO  ] Running
command: chown -h ceph:ceph /dev/hdd60/data60
[2018-11-02 16:20:07,242][ceph_volume.process][INFO  ] Running
command: chown -R ceph:ceph /dev/dm-10
[2018-11-02 16:20:07,246][ceph_volume.process][INFO  ] Running
command: ln -s /dev/hdd60/data60 /var/lib/ceph/osd/ceph-60/block
[2018-11-02 16:20:07,249][ceph_volume.process][INFO  ] Running
command: ceph --cluster ceph --name client.bootstrap-osd --keyring
/var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o
/var/lib/ceph/osd/ceph-60/activate.monmap
[2018-11-02 16:20:07,530][ceph_volume.process][INFO  ] stderr got monmap epoch 2
[2018-11-02 16:20:07,547][ceph_volume.process][INFO  ] Running
command: ceph-authtool /var/lib/ceph/osd/ceph-60/keyring
--create-keyring --name osd.60 --add-key
AQBysdxbNgdBNhAA6NQ/UWDHqGAZfFuryCWfxQ==
[2018-11-02 16:20:07,579][ceph_volume.process][INFO  ] stdout creating
/var/lib/ceph/osd/ceph-60/keyring
added entity osd.60 auth auth(auid = 18446744073709551615
key=AQBysdxbNgdBNhAA6NQ/UWDHqGAZfFuryCWfxQ== with 0 caps)
[2018-11-02 16:20:07,583][ceph_volume.process][INFO  ] Running
command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-60/keyring
[2018-11-02 16:20:07,587][ceph_volume.process][INFO  ] Running
command: chown -R ceph:ceph /var/lib/ceph/osd/ceph-60/
[2018-11-02 16:20:07,591][ceph_volume.process][INFO  ] Running
command: chown -h ceph:ceph /dev/ssd0/db60
[2018-11-02 16:20:07,594][ceph_volume.process][INFO  ] Running
command: chown -R ceph:ceph /dev/dm-0

And the failures from osd.60 are *before* those successful chown calls
(15:39:00). I wonder if somehow in the process there was a missing
step and then it all got corrected. I am certain that the UDEV rule
should *not* need to be in place for this to work.

The changes in the path for /dev/dm-* is expected, as that is created
every time the system boots.
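
If you want to double-check by hand, something along these lines mirrors
what ceph-volume does at activation time (just a sketch for osd.60, using
the LV names from your logs; adjust for the other OSDs):

    ls -l /var/lib/ceph/osd/ceph-60/ /dev/hdd60/data60 /dev/ssd0/db60
    chown -h ceph:ceph /dev/hdd60/data60 /dev/ssd0/db60
    chown -R ceph:ceph /var/lib/ceph/osd/ceph-60/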

>
>
> On Mon, Nov 5, 2018 at 4:14 PM, Alfredo Deza  wrote:
>>
>> On Mon, Nov 5, 2018 at 4:04 PM Hayashida, Mami  
>> wrote:
>> >
>> > WOW.  With you two guiding me through every step, the 10 OSDs in question 
>> > are now added back to the cluster as Bluestore disks!!!  Here are my 
>> > responses to the last email from Hector:
>> >
>> > 1. I first checked the permissions and they looked like this
>> >
>> > root@osd1:/var/lib/ceph/osd/ceph-60# ls -l
>> > total 56
>> > -rw-r--r-- 1 ceph ceph 384 Nov  2 16:20 activate.monmap
>> > -rw-r--r-- 1 ceph ceph 10737418240 Nov  2 16:20 block
>> > lrwxrwxrwx 1 ceph ceph  14 Nov  2 16:20 block.db -> /dev/ssd0/db60
>> >
>> > root@osd1:~# ls -l /dev/ssd0/
>> > ...
>> > lrwxrwxrwx 1 root root 7 Nov  5 12:38 db60 -> ../dm-2
>> >
>> > root@osd1:~# ls -la /dev/
>> > ...
>> > brw-rw  1 root disk252,   2 Nov  5 12:38 dm-2
>>
>> This looks like a bug. You mentioned you are running 12.2.9, and we
>> haven't seen problems in ceph-volume that fail to update the
>> permissions on OSD devices. No one should need a UDEV rule to set the
>> permissions for
>> devices, this is a ceph-volume task.
>>
>> When a system starts and the OSD activation happens, it always ensures
>> that the permissions are set correctly. Could you find the section of
>> the logs in /var/log/ceph/ceph-volume.log that shows the activation
>> process for ssd0/db60 ?
>>
>> Hopefully you still have those around, it would help us determine why
>> the permissions aren't being set correctly.
>>
>> > ...
>> >
>> > 2. I then ran ceph-volume activate --all again.  Saw the same error for 
>> > osd.67 I described many emails ago..  None of the permissions changed.  I 
>> > tried restarting ceph-osd@60, but got the same error as before:
>> >
>> > 2018-11-05 15:34:52.001782 7f5a15744e00  0 set uid:gid to 64045:64045 
>> > (ceph:ceph)
>> > 2018-11-05 15:34:52.001808 7f5a15744e00  0 ceph version 12.2.9 
>> > (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable), process 
>> > ceph-osd, pid 36506
>> > 2018-11-05 15:34:52.021717 7f5a15744e00  0 pidfile_write: ignore empty 
>> > --pid-file
>> > 2018-11-05 15:34:52.033478 7f5a15744e00  0 load: jerasure load: lrc load: 
>> > isa
>> > 2018-11-05 15:34:52.033557 7f5a15744e00  1 bdev create path 
>> > /var/lib/ceph/osd/ceph-60/block type kernel
>> > 2018-11-05 15:34:52.

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Alfredo Deza
On Mon, Nov 5, 2018 at 4:04 PM Hayashida, Mami  wrote:
>
> WOW.  With you two guiding me through every step, the 10 OSDs in question are 
> now added back to the cluster as Bluestore disks!!!  Here are my responses to 
> the last email from Hector:
>
> 1. I first checked the permissions and they looked like this
>
> root@osd1:/var/lib/ceph/osd/ceph-60# ls -l
> total 56
> -rw-r--r-- 1 ceph ceph 384 Nov  2 16:20 activate.monmap
> -rw-r--r-- 1 ceph ceph 10737418240 Nov  2 16:20 block
> lrwxrwxrwx 1 ceph ceph  14 Nov  2 16:20 block.db -> /dev/ssd0/db60
>
> root@osd1:~# ls -l /dev/ssd0/
> ...
> lrwxrwxrwx 1 root root 7 Nov  5 12:38 db60 -> ../dm-2
>
> root@osd1:~# ls -la /dev/
> ...
> brw-rw  1 root disk252,   2 Nov  5 12:38 dm-2

This looks like a bug. You mentioned you are running 12.2.9, and we
haven't seen problems in ceph-volume that fail to update the
permissions on OSD devices. No one should need a UDEV rule to set the
permissions for
devices, this is a ceph-volume task.

When a system starts and the OSD activation happens, it always ensures
that the permissions are set correctly. Could you find the section of
the logs in /var/log/ceph/ceph-volume.log that shows the activation
process for ssd0/db60 ?

Hopefully you still have those around, it would help us determine why
the permissions aren't being set correctly.
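
For example, something like this should pull out the relevant section
(just a sketch; the LV name is the one from your setup):

    grep -n 'db60' /var/log/ceph/ceph-volume.log
    grep -n 'chown' /var/log/ceph/ceph-volume.log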

> ...
>
> 2. I then ran ceph-volume activate --all again.  Saw the same error for 
> osd.67 I described many emails ago..  None of the permissions changed.  I 
> tried restarting ceph-osd@60, but got the same error as before:
>
> 2018-11-05 15:34:52.001782 7f5a15744e00  0 set uid:gid to 64045:64045 
> (ceph:ceph)
> 2018-11-05 15:34:52.001808 7f5a15744e00  0 ceph version 12.2.9 
> (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable), process 
> ceph-osd, pid 36506
> 2018-11-05 15:34:52.021717 7f5a15744e00  0 pidfile_write: ignore empty 
> --pid-file
> 2018-11-05 15:34:52.033478 7f5a15744e00  0 load: jerasure load: lrc load: isa
> 2018-11-05 15:34:52.033557 7f5a15744e00  1 bdev create path 
> /var/lib/ceph/osd/ceph-60/block type kernel
> 2018-11-05 15:34:52.033572 7f5a15744e00  1 bdev(0x5651bd1b8d80 
> /var/lib/ceph/osd/ceph-60/block) open path /var/lib/ceph/osd/ceph-60/block
> 2018-11-05 15:34:52.033888 7f5a15744e00  1 bdev(0x5651bd1b8d80 
> /var/lib/ceph/osd/ceph-60/block) open size 10737418240 (0x28000, 10GiB) 
> block_size 4096 (4KiB) rotational
> 2018-11-05 15:34:52.033958 7f5a15744e00  1 
> bluestore(/var/lib/ceph/osd/ceph-60) _set_cache_sizes cache_size 1073741824 
> meta 0.4 kv 0.4 data 0.2
> 2018-11-05 15:34:52.033984 7f5a15744e00  1 bdev(0x5651bd1b8d80 
> /var/lib/ceph/osd/ceph-60/block) close
> 2018-11-05 15:34:52.318993 7f5a15744e00  1 
> bluestore(/var/lib/ceph/osd/ceph-60) _mount path /var/lib/ceph/osd/ceph-60
> 2018-11-05 15:34:52.319064 7f5a15744e00  1 bdev create path 
> /var/lib/ceph/osd/ceph-60/block type kernel
> 2018-11-05 15:34:52.319073 7f5a15744e00  1 bdev(0x5651bd1b8fc0 
> /var/lib/ceph/osd/ceph-60/block) open path /var/lib/ceph/osd/ceph-60/block
> 2018-11-05 15:34:52.319356 7f5a15744e00  1 bdev(0x5651bd1b8fc0 
> /var/lib/ceph/osd/ceph-60/block) open size 10737418240 (0x28000, 10GiB) 
> block_size 4096 (4KiB) rotational
> 2018-11-05 15:34:52.319415 7f5a15744e00  1 
> bluestore(/var/lib/ceph/osd/ceph-60) _set_cache_sizes cache_size 1073741824 
> meta 0.4 kv 0.4 data 0.2
> 2018-11-05 15:34:52.319491 7f5a15744e00  1 bdev create path 
> /var/lib/ceph/osd/ceph-60/block.db type kernel
> 2018-11-05 15:34:52.319499 7f5a15744e00  1 bdev(0x5651bd1b9200 
> /var/lib/ceph/osd/ceph-60/block.db) open path 
> /var/lib/ceph/osd/ceph-60/block.db
> 2018-11-05 15:34:52.319514 7f5a15744e00 -1 bdev(0x5651bd1b9200 
> /var/lib/ceph/osd/ceph-60/block.db) open open got: (13) Permission denied
> 2018-11-05 15:34:52.319648 7f5a15744e00 -1 
> bluestore(/var/lib/ceph/osd/ceph-60) _open_db add block 
> device(/var/lib/ceph/osd/ceph-60/block.db) returned: (13) Permission denied
> 2018-11-05 15:34:52.319666 7f5a15744e00  1 bdev(0x5651bd1b8fc0 
> /var/lib/ceph/osd/ceph-60/block) close
> 2018-11-05 15:34:52.598249 7f5a15744e00 -1 osd.60 0 OSD:init: unable to mount 
> object store
> 2018-11-05 15:34:52.598269 7f5a15744e00 -1  ** ERROR: osd init failed: (13) 
> Permission denied
>
> 3. Finally, I literally copied and pasted the udev rule Hector wrote out for 
> me, then rebooted the server.
>
> 4. I tried restarting ceph-osd@60 -- this time it came right up!!!  I was 
> able to start all the rest, including ceph-osd@67 which I thought did not get 
> activated by lvm.
>
> 5. I checked from the admin node and verified osd.60-69 are all in the 
> cluster as Bluestore OSDs and they indeed are.
>
> 
> Thank you SO MUCH, both of you, for putting up with my novice questions all 
> the way.  I am planning to convert the rest of the cluster the same way by 
> reviewing this entire thread to trace what steps need to be taken.
>
> Mami
>
> On Mon, Nov 5, 2018 

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Alfredo Deza
On Mon, Nov 5, 2018 at 12:54 PM Hayashida, Mami  wrote:
>
> I commented out those lines and, yes, I was able to restart the system and 
> all the Filestore OSDs are now running.  But when I cannot start converted 
> Bluestore OSDs (service).   When I look up the log for osd.60, this is what I 
> see:

Something/someone may have added those entries in fstab. It is
certainly not something that ceph-disk or ceph-volume would ever do.
Glad you found them and removed them.
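
A quick way to check whether any other stale entries are left behind
(just a sketch):

    grep -n ceph /etc/fstab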


>
> 2018-11-05 12:47:00.756794 7f1f2775ae00  0 set uid:gid to 64045:64045 
> (ceph:ceph)
> 2018-11-05 12:47:00.756821 7f1f2775ae00  0 ceph version 12.2.9 
> (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable), process 
> ceph-osd, pid 33706
> 2018-11-05 12:47:00.776554 7f1f2775ae00  0 pidfile_write: ignore empty 
> --pid-file
> 2018-11-05 12:47:00.788716 7f1f2775ae00  0 load: jerasure load: lrc load: isa
> 2018-11-05 12:47:00.788803 7f1f2775ae00  1 bdev create path 
> /var/lib/ceph/osd/ceph-60/block type kernel
> 2018-11-05 12:47:00.788818 7f1f2775ae00  1 bdev(0x564350f4ad80 
> /var/lib/ceph/osd/ceph-60/block) open path /var/lib/ceph/osd/ceph-60/block
> 2018-11-05 12:47:00.789179 7f1f2775ae00  1 bdev(0x564350f4ad80 
> /var/lib/ceph/osd/ceph-60/block) open size 10737418240 (0x28000, 10GiB) 
> block_size 4096 (4KiB) rotational
> 2018-11-05 12:47:00.789257 7f1f2775ae00  1 
> bluestore(/var/lib/ceph/osd/ceph-60) _set_cache_sizes cache_size 1073741824 
> meta 0.4 kv 0.4 data 0.2
> 2018-11-05 12:47:00.789286 7f1f2775ae00  1 bdev(0x564350f4ad80 
> /var/lib/ceph/osd/ceph-60/block) close
> 2018-11-05 12:47:01.075002 7f1f2775ae00  1 
> bluestore(/var/lib/ceph/osd/ceph-60) _mount path /var/lib/ceph/osd/ceph-60
> 2018-11-05 12:47:01.075069 7f1f2775ae00  1 bdev create path 
> /var/lib/ceph/osd/ceph-60/block type kernel
> 2018-11-05 12:47:01.075078 7f1f2775ae00  1 bdev(0x564350f4afc0 
> /var/lib/ceph/osd/ceph-60/block) open path /var/lib/ceph/osd/ceph-60/block
> 2018-11-05 12:47:01.075391 7f1f2775ae00  1 bdev(0x564350f4afc0 
> /var/lib/ceph/osd/ceph-60/block) open size 10737418240 (0x28000, 10GiB) 
> block_size 4096 (4KiB) rotational
> 2018-11-05 12:47:01.075450 7f1f2775ae00  1 
> bluestore(/var/lib/ceph/osd/ceph-60) _set_cache_sizes cache_size 1073741824 
> meta 0.4 kv 0.4 data 0.2
> 2018-11-05 12:47:01.075536 7f1f2775ae00  1 bdev create path 
> /var/lib/ceph/osd/ceph-60/block.db type kernel
> 2018-11-05 12:47:01.075544 7f1f2775ae00  1 bdev(0x564350f4b200 
> /var/lib/ceph/osd/ceph-60/block.db) open path 
> /var/lib/ceph/osd/ceph-60/block.db
> 2018-11-05 12:47:01.07 7f1f2775ae00 -1 bdev(0x564350f4b200 
> /var/lib/ceph/osd/ceph-60/block.db) open open got: (13) Permission denied
> 2018-11-05 12:47:01.075573 7f1f2775ae00 -1 
> bluestore(/var/lib/ceph/osd/ceph-60) _open_db add block 
> device(/var/lib/ceph/osd/ceph-60/block.db) returned: (13) Permission denied
> 2018-11-05 12:47:01.075589 7f1f2775ae00  1 bdev(0x564350f4afc0 
> /var/lib/ceph/osd/ceph-60/block) close
> 2018-11-05 12:47:01.346356 7f1f2775ae00 -1 osd.60 0 OSD:init: unable to mount 
> object store
> 2018-11-05 12:47:01.346378 7f1f2775ae00 -1  ** ERROR: osd init failed: (13) 
> Permission denied

If this has already been converted and it is ceph-volume trying to
start it, could you please try activating that one OSD with:

ceph-volume lvm activate <osd id> <osd fsid>

And paste the output here, and then if possible, capture the relevant
part of that CLI call from /var/log/ceph/ceph-volume.log
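
For example, for this OSD it would look roughly like this (sketch only;
take the real fsid from the `ceph-volume lvm list` output, the one below
is a placeholder):

    ceph-volume lvm list hdd60/data60
    ceph-volume lvm activate 60 <osd fsid from the output above>
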
>
>
>
>
> On Mon, Nov 5, 2018 at 12:34 PM, Hector Martin  wrote:
>>
>> On 11/6/18 2:01 AM, Hayashida, Mami wrote:
>> > I did find in /etc/fstab entries like this for those 10 disks
>> >
>> > /dev/sdh1   /var/lib/ceph/osd/ceph-60  xfs noatime,nodiratime 0 0
>> >
>> > Should I comment all 10 of them out (for osd.{60-69}) and try rebooting
>> > again?
>>
>> Yes. Anything that references any of the old partitions that don't exist
>> (/dev/sdh1 etc) should be removed. The disks are now full-disk LVM PVs
>> and should have no partitions.
>>
>> --
>> Hector Martin (hec...@marcansoft.com)
>> Public Key: https://mrcn.st/pub
>
>
>
>
> --
> Mami Hayashida
> Research Computing Associate
>
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Alfredo Deza
On Mon, Nov 5, 2018 at 11:51 AM Hector Martin  wrote:
>
> Those units don't get triggered out of nowhere, there has to be a
> partition table with magic GUIDs or a fstab or something to cause them
> to be triggered. The better way should be to get rid of that instead of
> overriding the ceph-disk service instances, I think.

"masking" or linking them to /dev/null is the recommended way of doing
this. It is what ceph-volume does when taking over ceph-disk managed
OSDs.

They get triggered by udev rules that are packaged and installed by Ceph.
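
For reference, something like this shows which stale instances are still
around before masking the template unit (a sketch):

    systemctl list-units --all 'ceph-disk@*'
    ln -sf /dev/null /etc/systemd/system/ceph-disk@.service
    systemctl daemon-reload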

>
> Given dev-sdh1.device is trying to start, I suspect you have them in
> /etc/fstab. You should have a look around /etc to see if you have any
> stray references to those devices or old ceph-disk OSDs.
>
> On 11/6/18 1:37 AM, Hayashida, Mami wrote:
> > Alright.  Thanks -- I will try this now.
> >
> > On Mon, Nov 5, 2018 at 11:36 AM, Alfredo Deza  wrote:
> >
> > On Mon, Nov 5, 2018 at 11:33 AM Hayashida, Mami  wrote:
> > >
> > > But I still have 50 other Filestore OSDs on the same node, though.  
> > Wouldn't doing it all at once (by not identifying the osd-id) be a problem 
> > for those?  I have not migrated data out of those 50 OSDs yet.
> >
> > Sure, like I said, if you want to do them one by one, then your
> > initial command is fine.
> >
> > >
> > > On Mon, Nov 5, 2018 at 11:31 AM, Alfredo Deza  wrote:
> > >>
> > >> On Mon, Nov 5, 2018 at 11:24 AM Hayashida, Mami  wrote:
> > >> >
> > >> > Thank you for all of your replies. Just to clarify...
> > >> >
> > >> > 1. Hector:  I did unmount the file system if what you meant was
> > unmounting the /var/lib/ceph/osd/ceph-$osd-id   for those disks (in
> > my case osd.60-69) before running the ceph-volume lvm zap command
> > >> >
> > >> > 2. Alfredo: so I can at this point run the "ln" command
> > (basically getting rid of the symbolic link) for each of those OSDs
> > I have converted?  For example
> > >> >
> > >> > ln -sf /dev/null /etc/systemc/system/ceph-disk@60.service
> > >> That will take care of OSD 60. This is fine if you want to do
> > them one
> > >> by one. To affect everything from ceph-disk, you would need to:
> > >>
> > >> ln -sf /dev/null /etc/systemd/system/ceph-disk@.service
> > >>
> > >> >
> > >> >Then reboot?
> > >> >
> > >> >
> > >> > On Mon, Nov 5, 2018 at 11:17 AM, Alfredo Deza  wrote:
> > >> >>
> > >> >> On Mon, Nov 5, 2018 at 10:43 AM Hayashida, Mami  wrote:
> > >> >> >
> > >> >> > Additional info -- I know that
> > /var/lib/ceph/osd/ceph-{60..69} are not mounted at this point (i.e.
> > mount | grep ceph-60, and 61-69, returns nothing.).  They don't show
> > up when I run "df", either.
> > >> >> >
> > >> >> > On Mon, Nov 5, 2018 at 10:15 AM, Hayashida, Mami  wrote:
> > >> >> >>
> > >> >> >> Well, over the weekend the whole server went down and is
> > now in the emergency mode. (I am running Ubuntu 16.04).  When I run
> > "journalctl  -p err -xb"   I see that
> > >> >> >>
> > >> >> >> systemd[1]: Timed out waiting for device dev-sdh1.device.
> > >> >> >> -- Subject: Unit dev-sdh1.device has failed
> > >> >> >> -- Defined-By: systemd
> > >> >> >> -- Support: http://lists.freeddesktop.org/..
> > >> >> >> --
> > >> >> >> -- Unit dev-sdh1.device has failed.
> > >> >> >>
> > >> >> >>
> > >> >> >> I see this for every single one of the newly-converted
> > Bluestore OSD disks (/dev/sd{h..q}1).
> > >> >>
> > >> >> This will happen with stale c

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Alfredo Deza
On Mon, Nov 5, 2018 at 11:33 AM Hayashida, Mami  wrote:
>
> But I still have 50 other Filestore OSDs on the same node, though.  Wouldn't 
> doing it all at once (by not identifying the osd-id) be a problem for those?  
> I have not migrated data out of those 50 OSDs yet.

Sure, like I said, if you want to do them one by one, then your
initial command is fine.

>
> On Mon, Nov 5, 2018 at 11:31 AM, Alfredo Deza  wrote:
>>
>> On Mon, Nov 5, 2018 at 11:24 AM Hayashida, Mami  
>> wrote:
>> >
>> > Thank you for all of your replies. Just to clarify...
>> >
>> > 1. Hector:  I did unmount the file system if what you meant was unmounting 
>> > the /var/lib/ceph/osd/ceph-$osd-id   for those disks (in my case 
>> > osd.60-69) before running the ceph-volume lvm zap command
>> >
>> > 2. Alfredo: so I can at this point run the "ln" command (basically getting 
>> > rid of the symbolic link) for each of those OSDs I have converted?  For 
>> > example
>> >
>> > ln -sf /dev/null /etc/systemc/system/ceph-disk@60.service
>> That will take care of OSD 60. This is fine if you want to do them one
>> by one. To affect everything from ceph-disk, you would need to:
>>
>> ln -sf /dev/null /etc/systemd/system/ceph-disk@.service
>>
>> >
>> >Then reboot?
>> >
>> >
>> > On Mon, Nov 5, 2018 at 11:17 AM, Alfredo Deza  wrote:
>> >>
>> >> On Mon, Nov 5, 2018 at 10:43 AM Hayashida, Mami  
>> >> wrote:
>> >> >
>> >> > Additional info -- I know that /var/lib/ceph/osd/ceph-{60..69} are not 
>> >> > mounted at this point (i.e.  mount | grep ceph-60, and 61-69, returns 
>> >> > nothing.).  They don't show up when I run "df", either.
>> >> >
>> >> > On Mon, Nov 5, 2018 at 10:15 AM, Hayashida, Mami 
>> >> >  wrote:
>> >> >>
>> >> >> Well, over the weekend the whole server went down and is now in the 
>> >> >> emergency mode. (I am running Ubuntu 16.04).  When I run "journalctl  
>> >> >> -p err -xb"   I see that
>> >> >>
>> >> >> systemd[1]: Timed out waiting for device dev-sdh1.device.
>> >> >> -- Subject: Unit dev-sdh1.device has failed
>> >> >> -- Defined-By: systemd
>> >> >> -- Support: http://lists.freeddesktop.org/
>> >> >> --
>> >> >> -- Unit dev-sdh1.device has failed.
>> >> >>
>> >> >>
>> >> >> I see this for every single one of the newly-converted Bluestore OSD 
>> >> >> disks (/dev/sd{h..q}1).
>> >>
>> >> This will happen with stale ceph-disk systemd units. You can disable 
>> >> those with:
>> >>
>> >> ln -sf /dev/null /etc/systemd/system/ceph-disk@.service
>> >>
>> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >>
>> >> >> On Mon, Nov 5, 2018 at 9:57 AM, Alfredo Deza  wrote:
>> >> >>>
>> >> >>> On Fri, Nov 2, 2018 at 5:04 PM Hayashida, Mami 
>> >> >>>  wrote:
>> >> >>> >
>> >> >>> > I followed all the steps Hector suggested, and almost everything 
>> >> >>> > seems to have worked fine.  I say "almost" because one out of the 
>> >> >>> > 10 osds I was migrating could not be activated even though 
>> >> >>> > everything up to that point worked just as well for that osd as the 
>> >> >>> > other ones. Here is the output for that particular failure:
>> >> >>> >
>> >> >>> > *
>> >> >>> > ceph-volume lvm activate --all
>> >> >>> > ...
>> >> >>> > --> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-XX
>> >> >>> > Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
>> >> >>> > --> Absolute path not found for executable: restorecon
>> >> >>> > --> Ensure $PATH environment variable contains common executable 
>> >> >>> > locations
>> >> >>> > Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir 
>> >> >>> > --dev /dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
>> >> >>> >  stderr: f

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Alfredo Deza
On Mon, Nov 5, 2018 at 11:24 AM Hayashida, Mami  wrote:
>
> Thank you for all of your replies. Just to clarify...
>
> 1. Hector:  I did unmount the file system if what you meant was unmounting 
> the /var/lib/ceph/osd/ceph-$osd-id   for those disks (in my case osd.60-69) 
> before running the ceph-volume lvm zap command
>
> 2. Alfredo: so I can at this point run the "ln" command (basically getting 
> rid of the symbolic link) for each of those OSDs I have converted?  For 
> example
>
> ln -sf /dev/null /etc/systemc/system/ceph-disk@60.service
That will take care of OSD 60. This is fine if you want to do them one
by one. To affect everything from ceph-disk, you would need to:

ln -sf /dev/null /etc/systemd/system/ceph-disk@.service

>
>Then reboot?
>
>
> On Mon, Nov 5, 2018 at 11:17 AM, Alfredo Deza  wrote:
>>
>> On Mon, Nov 5, 2018 at 10:43 AM Hayashida, Mami  
>> wrote:
>> >
>> > Additional info -- I know that /var/lib/ceph/osd/ceph-{60..69} are not 
>> > mounted at this point (i.e.  mount | grep ceph-60, and 61-69, returns 
>> > nothing.).  They don't show up when I run "df", either.
>> >
>> > On Mon, Nov 5, 2018 at 10:15 AM, Hayashida, Mami  
>> > wrote:
>> >>
>> >> Well, over the weekend the whole server went down and is now in the 
>> >> emergency mode. (I am running Ubuntu 16.04).  When I run "journalctl  -p 
>> >> err -xb"   I see that
>> >>
>> >> systemd[1]: Timed out waiting for device dev-sdh1.device.
>> >> -- Subject: Unit dev-sdh1.device has failed
>> >> -- Defined-By: systemd
>> >> -- Support: http://lists.freeddesktop.org/
>> >> --
>> >> -- Unit dev-sdh1.device has failed.
>> >>
>> >>
>> >> I see this for every single one of the newly-converted Bluestore OSD 
>> >> disks (/dev/sd{h..q}1).
>>
>> This will happen with stale ceph-disk systemd units. You can disable those 
>> with:
>>
>> ln -sf /dev/null /etc/systemd/system/ceph-disk@.service
>>
>>
>> >>
>> >>
>> >> --
>> >>
>> >> On Mon, Nov 5, 2018 at 9:57 AM, Alfredo Deza  wrote:
>> >>>
>> >>> On Fri, Nov 2, 2018 at 5:04 PM Hayashida, Mami  
>> >>> wrote:
>> >>> >
>> >>> > I followed all the steps Hector suggested, and almost everything seems 
>> >>> > to have worked fine.  I say "almost" because one out of the 10 osds I 
>> >>> > was migrating could not be activated even though everything up to that 
>> >>> > point worked just as well for that osd as the other ones. Here is the 
>> >>> > output for that particular failure:
>> >>> >
>> >>> > *
>> >>> > ceph-volume lvm activate --all
>> >>> > ...
>> >>> > --> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-XX
>> >>> > Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
>> >>> > --> Absolute path not found for executable: restorecon
>> >>> > --> Ensure $PATH environment variable contains common executable 
>> >>> > locations
>> >>> > Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir 
>> >>> > --dev /dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
>> >>> >  stderr: failed to read label for /dev/hdd67/data67: (2) No such file 
>> >>> > or directory
>> >>> > -->  RuntimeError: command returned non-zero exit status:
>> >>>
>> >>> I wonder if the /dev/sdo device where hdd67/data67 is located is
>> >>> available, or if something else is missing. You could try poking
>> >>> around with `lvs` and see if that LV shows up, also `ceph-volume lvm
>> >>> list hdd67/data67` can help here because it
>> >>> groups OSDs to LVs. If you run `ceph-volume lvm list --format=json
>> >>> hdd67/data67` you will also see all the metadata stored in it.
>> >>>
>> >>> Would be interesting to see that output to verify things exist and are
>> >>> usable for OSD activation.
>> >>>
>> >>> >
>> >>> > ***
>> >>> > I then checked to see if the rest of the migrated OSDs were back in by 
>> >>> > calling the ceph osd tree command from the admin node.  Since they 
>> >>> > 

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Alfredo Deza
On Mon, Nov 5, 2018 at 10:43 AM Hayashida, Mami  wrote:
>
> Additional info -- I know that /var/lib/ceph/osd/ceph-{60..69} are not 
> mounted at this point (i.e.  mount | grep ceph-60, and 61-69, returns 
> nothing.).  They don't show up when I run "df", either.
>
> On Mon, Nov 5, 2018 at 10:15 AM, Hayashida, Mami  
> wrote:
>>
>> Well, over the weekend the whole server went down and is now in the 
>> emergency mode. (I am running Ubuntu 16.04).  When I run "journalctl  -p err 
>> -xb"   I see that
>>
>> systemd[1]: Timed out waiting for device dev-sdh1.device.
>> -- Subject: Unit dev-sdh1.device has failed
>> -- Defined-By: systemd
>> -- Support: http://lists.freeddesktop.org/
>> --
>> -- Unit dev-sdh1.device has failed.
>>
>>
>> I see this for every single one of the newly-converted Bluestore OSD disks 
>> (/dev/sd{h..q}1).

This will happen with stale ceph-disk systemd units. You can disable those with:

ln -sf /dev/null /etc/systemd/system/ceph-disk@.service


>>
>>
>> --
>>
>> On Mon, Nov 5, 2018 at 9:57 AM, Alfredo Deza  wrote:
>>>
>>> On Fri, Nov 2, 2018 at 5:04 PM Hayashida, Mami  
>>> wrote:
>>> >
>>> > I followed all the steps Hector suggested, and almost everything seems to 
>>> > have worked fine.  I say "almost" because one out of the 10 osds I was 
>>> > migrating could not be activated even though everything up to that point 
>>> > worked just as well for that osd as the other ones. Here is the output 
>>> > for that particular failure:
>>> >
>>> > *
>>> > ceph-volume lvm activate --all
>>> > ...
>>> > --> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-XX
>>> > Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
>>> > --> Absolute path not found for executable: restorecon
>>> > --> Ensure $PATH environment variable contains common executable locations
>>> > Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev 
>>> > /dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
>>> >  stderr: failed to read label for /dev/hdd67/data67: (2) No such file or 
>>> > directory
>>> > -->  RuntimeError: command returned non-zero exit status:
>>>
>>> I wonder if the /dev/sdo device where hdd67/data67 is located is
>>> available, or if something else is missing. You could try poking
>>> around with `lvs` and see if that LV shows up, also `ceph-volume lvm
>>> list hdd67/data67` can help here because it
>>> groups OSDs to LVs. If you run `ceph-volume lvm list --format=json
>>> hdd67/data67` you will also see all the metadata stored in it.
>>>
>>> Would be interesting to see that output to verify things exist and are
>>> usable for OSD activation.
>>>
>>> >
>>> > ***
>>> > I then checked to see if the rest of the migrated OSDs were back in by 
>>> > calling the ceph osd tree command from the admin node.  Since they were 
>>> > not, I tried to restart the first of the 10 newly migrated Bluestore osds 
>>> > by calling
>>> >
>>> > ***
>>> > systemctl start ceph-osd@60
>>> >
>>> > At that point, not only this particular service could not be started, but 
>>> > ALL the OSDs (daemons) on the entire node shut down!
>>> >
>>> > **
>>> > root@osd1:~# systemctl status ceph-osd@60
>>> > ● ceph-osd@60.service - Ceph object storage daemon osd.60
>>> >Loaded: loaded (/lib/systemd/system/ceph-osd@.service; 
>>> > enabled-runtime; vendor preset: enabled)
>>> >Active: inactive (dead) since Fri 2018-11-02 15:47:20 EDT; 1h 9min ago
>>> >   Process: 3473621 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} 
>>> > --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
>>> >   Process: 3473147 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh 
>>> > --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
>>> >  Main PID: 3473621 (code=exited, status=0/SUCCESS)
>>> >
>>> > Oct 29 15:57:53 osd1.x.uky.edu ceph-osd[3473621]: 2018-10-29 
>>> > 15:57:53.868856 7f68adaece00 -1 osd.60 48106 log_to_monitors 
>>> > {default=true}
>>> > Oct 29 15:57:53 osd1.x.uky.edu ceph-osd[3473621]: 2018-10-29 
>>> > 15:57:53.874373 7f68adaece00 -1 osd.60 48106 mon_cmd_maybe_osd_

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Alfredo Deza
 storage daemon osd.70
>Loaded: loaded (/lib/systemd/system/ceph-osd@.service; enabled-runtime; 
> vendor preset: enabled)
>Active: inactive (dead) since Fri 2018-11-02 16:34:08 EDT; 2min 6s ago
>   Process: 3473629 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id 
> %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
>   Process: 3473153 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster 
> ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
>  Main PID: 3473629 (code=exited, status=0/SUCCESS)
>
> Oct 29 15:57:51 osd1..uky.edu ceph-osd[3473629]: 2018-10-29 
> 15:57:51.300563 7f530eec2e00 -1 osd.70 pg_epoch: 48095 pg[68.ces1( empty 
> local-lis/les=47489/47489 n=0 ec=6030/6030 lis/c 47488/47488 les/c/f 
> 47489/47489/0 47485/47488/47488) [138,70,203]p138(0) r=1 lpr=0 crt=0'0 
> unknown NO
> Oct 30 06:25:01 osd1..uky.edu ceph-osd[3473629]: 2018-10-30 
> 06:25:01.961743 7f52d8e44700 -1 received  signal: Hangup from  PID: 3485955 
> task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse 
> radosgw  UID: 0
> Oct 31 06:25:02 osd1..uky.edu ceph-osd[3473629]: 2018-10-31 
> 06:25:02.110920 7f52d8e44700 -1 received  signal: Hangup from  PID: 3500945 
> task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse 
> radosgw  UID: 0
> Nov 01 06:25:02 osd1..uky.edu ceph-osd[3473629]: 2018-11-01 
> 06:25:02.101568 7f52d8e44700 -1 received  signal: Hangup from  PID: 3514774 
> task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse 
> radosgw  UID: 0
> Nov 02 06:25:02 osd1..uky.edu ceph-osd[3473629]: 2018-11-02 
> 06:25:01.997633 7f52d8e44700 -1 received  signal: Hangup from  PID: 3528128 
> task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse 
> radosgw  UID: 0
> Nov 02 16:34:05 osd1..uky.edu ceph-osd[3473629]: 2018-11-02 
> 16:34:05.607714 7f52d8e44700 -1 received  signal: Terminated from  PID: 1 
> task name: /lib/systemd/systemd --system --deserialize 20  UID: 0
> Nov 02 16:34:05 osd1..uky.edu ceph-osd[3473629]: 2018-11-02 
> 16:34:05.607738 7f52d8e44700 -1 osd.70 48535 *** Got signal Terminated ***
> Nov 02 16:34:05 osd1..uky.edu systemd[1]: Stopping Ceph object storage 
> daemon osd.70...
> Nov 02 16:34:05 osd1..uky.edu ceph-osd[3473629]: 2018-11-02 
> 16:34:05.677348 7f52d8e44700 -1 osd.70 48535 shutdown
> Nov 02 16:34:08 osd1..uky.edu systemd[1]: Stopped Ceph object storage 
> daemon osd.70.
>
> **
>
> So, at this point, ALL the OSDs on that node have been shut down.
>
> For your information this is the output of lsblk command (selection)
> *
> root@osd1:~# lsblk
> NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
> sda  8:00 447.1G  0 disk
> ├─ssd0-db60252:0040G  0 lvm
> ├─ssd0-db61252:1040G  0 lvm
> ├─ssd0-db62252:2040G  0 lvm
> ├─ssd0-db63252:3040G  0 lvm
> ├─ssd0-db64252:4040G  0 lvm
> ├─ssd0-db65252:5040G  0 lvm
> ├─ssd0-db66252:6040G  0 lvm
> ├─ssd0-db67252:7040G  0 lvm
> ├─ssd0-db68252:8040G  0 lvm
> └─ssd0-db69252:9040G  0 lvm
> sdb  8:16   0 447.1G  0 disk
> ├─sdb1   8:17   040G  0 part
> ├─sdb2   8:18   040G  0 part
>
> .
>
> sdh  8:112  0   3.7T  0 disk
> └─hdd60-data60 252:10   0   3.7T  0 lvm
> sdi  8:128  0   3.7T  0 disk
> └─hdd61-data61 252:11   0   3.7T  0 lvm
> sdj  8:144  0   3.7T  0 disk
> └─hdd62-data62 252:12   0   3.7T  0 lvm
> sdk  8:160  0   3.7T  0 disk
> └─hdd63-data63 252:13   0   3.7T  0 lvm
> sdl  8:176  0   3.7T  0 disk
> └─hdd64-data64 252:14   0   3.7T  0 lvm
> sdm  8:192  0   3.7T  0 disk
> └─hdd65-data65 252:15   0   3.7T  0 lvm
> sdn  8:208  0   3.7T  0 disk
> └─hdd66-data66 252:16   0   3.7T  0 lvm
> sdo  8:224  0   3.7T  0 disk
> └─hdd67-data67 252:17   0   3.7T  0 lvm
> sdp  8:240  0   3.7T  0 disk
> └─hdd68-data68 252:18   0   3.7T  0 lvm
> sdq 65:00   3.7T  0 disk
> └─hdd69-data69 252:19   0   3.7T  0 lvm
> sdr 65:16   0   3.7T  0 disk
> └─sdr1  65:17   0   3.7T  0 part /var/lib/ceph/osd/ceph-70
> .
>
> As a Ceph novice, I am totally clueless about the next step at this point.  
> Any help would be appreciated.
>
> On Thu, Nov 1, 2018 at 3:16 PM, Hayashida, Mami  
> wrote:
>>
>> Thank you, both of you.  I will try this out very soon.
>>
>> On Wed, Oct 31, 2018 at 8:48 AM, Alfredo Deza  wrote:
>>>
>>> On Wed, Oct 31, 2018 at 8:28 AM Hayashida, Mami  
>>> wrote:
>>>

Re: [ceph-users] Filestore to Bluestore migration question

2018-10-31 Thread Alfredo Deza
On Wed, Oct 31, 2018 at 8:28 AM Hayashida, Mami  wrote:
>
> Thank you for your replies. So, if I use the method Hector suggested (by 
> creating PVs, VGs etc. first), can I add the --osd-id parameter to the 
> command as in
>
> ceph-volume lvm prepare --bluestore --data hdd0/data0 --block.db ssd/db0  
> --osd-id 0
> ceph-volume lvm prepare --bluestore --data hdd1/data1 --block.db ssd/db1  
> --osd-id 1
>
> so that Filestore -> Bluestore migration will not change the osd ID on each 
> disk?

That looks correct.

>
> And one more question.  Are there any changes I need to make to the ceph.conf 
> file?  I did comment out this line that was probably used for creating 
> Filestore (using ceph-deploy):  osd journal size = 40960

Since you've pre-created the LVs, the commented-out line will not
affect anything.

>
>
>
> On Wed, Oct 31, 2018 at 7:03 AM, Alfredo Deza  wrote:
>>
>> On Wed, Oct 31, 2018 at 5:22 AM Hector Martin  wrote:
>> >
>> > On 31/10/2018 05:55, Hayashida, Mami wrote:
>> > > I am relatively new to Ceph and need some advice on Bluestore migration.
>> > > I tried migrating a few of our test cluster nodes from Filestore to
>> > > Bluestore by following this
>> > > (http://docs.ceph.com/docs/luminous/rados/operations/bluestore-migration/)
>> > > as the cluster is currently running 12.2.9. The cluster, originally set
>> > > up by my predecessors, was running Jewel until I upgraded it recently to
>> > > Luminous.
>> > >
>> > > OSDs in each OSD host is set up in such a way that for ever 10 data HDD
>> > > disks, there is one SSD drive that is holding their journals.  For
>> > > example, osd.0 data is on /dev/sdh and its Filestore journal is on a
>> > > partitioned part of /dev/sda. So, lsblk shows something like
>> > >
>> > > sda   8:00 447.1G  0 disk
>> > > ├─sda18:1040G  0 part # journal for osd.0
>> > >
>> > > sdh   8:112  0   3.7T  0 disk
>> > > └─sdh18:113  0   3.7T  0 part /var/lib/ceph/osd/ceph-0
>> > >
>> >
>> > The BlueStore documentation states that the wal will automatically use
>> > the db volume if it fits, so if you're using a single SSD I think
>> > there's no good reason to split out the wal, if I'm understanding it
>> > correctly.
>>
>> This is correct, no need for wal in this case.
>>
>> >
>> > You should be using ceph-volume, since ceph-disk is deprecated. If
>> > you're sharing the SSD as wal/db for a bunch of OSDs, I think you're
>> > going to have to create the LVs yourself first. The data HDDs should be
>> > PVs (I don't think it matters if they're partitions or whole disk PVs as
>> > long as LVM discovers them) each part of a separate VG (e.g. hdd0-hdd9)
>> > containing a single LV. Then the SSD should itself be an LV for a
>> > separate shared SSD VG (e.g. ssd).
>> >
>> > So something like (assuming sda is your wal SSD and sdb and onwards are
>> > your OSD HDDs):
>> > pvcreate /dev/sda
>> > pvcreate /dev/sdb
>> > pvcreate /dev/sdc
>> > ...
>> >
>> > vgcreate ssd /dev/sda
>> > vgcreate hdd0 /dev/sdb
>> > vgcreate hdd1 /dev/sdc
>> > ...
>> >
>> > lvcreate -L 40G -n db0 ssd
>> > lvcreate -L 40G -n db1 ssd
>> > ...
>> >
>> > lvcreate -L 100%VG -n data0 hdd0
>> > lvcreate -L 100%VG -n data1 hdd1
>> > ...
>> >
>> > ceph-volume lvm prepare --bluestore --data hdd0/data0 --block.db ssd/db0
>> > ceph-volume lvm prepare --bluestore --data hdd1/data1 --block.db ssd/db1
>> > ...
>> >
>> > ceph-volume lvm activate --all
>> >
>> > I think it might be possible to just let ceph-volume create the PV/VG/LV
>> > for the data disks and only manually create the DB LVs, but it shouldn't
>> > hurt to do it on your own and just give ready-made LVs to ceph-volume
>> > for everything.
>>
>> Another alternative here is to use the new `lvm batch` subcommand to
>> do all of this in one go:
>>
>> ceph-volume lvm batch /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
>> /dev/sdf /dev/sdg /dev/sdh
>>
>> Will detect that sda is an SSD and will create the LVs for you for
>> block.db (one for each spinning disk). For each spinning disk, it will
>> place data on them.
>>
>> The one caveat is that you no longer control OSD IDs, and they are
>> created with whatever the monitors are giving out.
>>
>> This operation is not supported from ceph-deploy either.
>> >
>> > --
>> > Hector Martin (hec...@marcansoft.com)
>> > Public Key: https://marcan.st/marcan.asc
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
>
> --
> Mami Hayashida
> Research Computing Associate
>
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-10-31 Thread Alfredo Deza
On Wed, Oct 31, 2018 at 5:22 AM Hector Martin  wrote:
>
> On 31/10/2018 05:55, Hayashida, Mami wrote:
> > I am relatively new to Ceph and need some advice on Bluestore migration.
> > I tried migrating a few of our test cluster nodes from Filestore to
> > Bluestore by following this
> > (http://docs.ceph.com/docs/luminous/rados/operations/bluestore-migration/)
> > as the cluster is currently running 12.2.9. The cluster, originally set
> > up by my predecessors, was running Jewel until I upgraded it recently to
> > Luminous.
> >
> > OSDs in each OSD host is set up in such a way that for ever 10 data HDD
> > disks, there is one SSD drive that is holding their journals.  For
> > example, osd.0 data is on /dev/sdh and its Filestore journal is on a
> > partitioned part of /dev/sda. So, lsblk shows something like
> >
> > sda   8:00 447.1G  0 disk
> > ├─sda18:1040G  0 part # journal for osd.0
> >
> > sdh   8:112  0   3.7T  0 disk
> > └─sdh18:113  0   3.7T  0 part /var/lib/ceph/osd/ceph-0
> >
>
> The BlueStore documentation states that the wal will automatically use
> the db volume if it fits, so if you're using a single SSD I think
> there's no good reason to split out the wal, if I'm understanding it
> correctly.

This is correct, no need for wal in this case.

>
> You should be using ceph-volume, since ceph-disk is deprecated. If
> you're sharing the SSD as wal/db for a bunch of OSDs, I think you're
> going to have to create the LVs yourself first. The data HDDs should be
> PVs (I don't think it matters if they're partitions or whole disk PVs as
> long as LVM discovers them) each part of a separate VG (e.g. hdd0-hdd9)
> containing a single LV. Then the SSD should itself be an LV for a
> separate shared SSD VG (e.g. ssd).
>
> So something like (assuming sda is your wal SSD and sdb and onwards are
> your OSD HDDs):
> pvcreate /dev/sda
> pvcreate /dev/sdb
> pvcreate /dev/sdc
> ...
>
> vgcreate ssd /dev/sda
> vgcreate hdd0 /dev/sdb
> vgcreate hdd1 /dev/sdc
> ...
>
> lvcreate -L 40G -n db0 ssd
> lvcreate -L 40G -n db1 ssd
> ...
>
> lvcreate -L 100%VG -n data0 hdd0
> lvcreate -L 100%VG -n data1 hdd1
> ...
>
> ceph-volume lvm prepare --bluestore --data hdd0/data0 --block.db ssd/db0
> ceph-volume lvm prepare --bluestore --data hdd1/data1 --block.db ssd/db1
> ...
>
> ceph-volume lvm activate --all
>
> I think it might be possible to just let ceph-volume create the PV/VG/LV
> for the data disks and only manually create the DB LVs, but it shouldn't
> hurt to do it on your own and just give ready-made LVs to ceph-volume
> for everything.

Another alternative here is to use the new `lvm batch` subcommand to
do all of this in one go:

ceph-volume lvm batch /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
/dev/sdf /dev/sdg /dev/sdh

It will detect that sda is an SSD and create the block.db LVs for you
(one for each spinning disk), and it will place the data on each
spinning disk.

The one caveat is that you no longer control OSD IDs, and they are
created with whatever the monitors are giving out.

This operation is not supported from ceph-deploy either.
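
If you want to preview what batch would do before committing to it,
there is a report mode (a sketch; I believe the flag is available in
recent releases):

    ceph-volume lvm batch --report /dev/sda /dev/sdb /dev/sdc
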
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://marcan.st/marcan.asc
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] nfs-ganesha version in Ceph repos

2018-10-09 Thread Alfredo Deza
On Tue, Oct 9, 2018 at 1:39 PM Erik McCormick
 wrote:
>
> On Tue, Oct 9, 2018 at 1:27 PM Erik McCormick
>  wrote:
> >
> > Hello,
> >
> > I'm trying to set up an nfs-ganesha server with the Ceph FSAL, and
> > running into difficulties getting the current stable release running.
> > The versions in the Luminous repo is stuck at 2.6.1, whereas the
> > current stable version is 2.6.3. I've seen a couple of HA issues in
> > pre 2.6.3 versions that I'd like to avoid.
> >
>
> I should have been more specific that the ones I am looking for are for 
> Centos 7

You mean these repos: http://download.ceph.com/nfs-ganesha/ ?
>
> > I've also been attempting to build my own from source, but banging my
> > head against a wall as far as dependencies and config options are
> > concerned.
> >
> > If anyone reading this has the ability to kick off a fresh build of
> > the V2.6-stable branch with all the knobs turned properly for Ceph, or
> > can point me to a set of cmake configs and scripts that might help me
> > do it myself, I would be eternally grateful.
> >
> > Thanks,
> > Erik
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Alfredo Deza
On Mon, Oct 8, 2018 at 5:04 PM Paul Emmerich  wrote:
>
> ceph-volume unfortunately doesn't handle completely hanging IOs too
> well compared to ceph-disk.

Not sure I follow, would you mind expanding on what you mean by
"ceph-volume unfortunately doesn't handle completely hanging IOs" ?

ceph-volume just provisions the OSD, nothing else. If LVM is hanging,
there is nothing we could do there, just like ceph-disk wouldn't be
able to do anything if the partitioning tool hung.
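
Building on the lv_tags approach mentioned below, mapping an OSD id back
to its LV and underlying device can be done purely with LVM (a sketch;
osd id 12 is just an example):

    lvs -o lv_name,vg_name,devices,lv_tags | grep 'ceph.osd_id=12'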



> It needs to read actual data from each
> disk and it'll just hang completely if any of the disks doesn't
> respond.
>
> The low-level command to get the information from LVM is:
>
> lvs -o lv_tags
>
> this allows you to map a LV to an OSD id.
>
>
> Paul
> Am Mo., 8. Okt. 2018 um 12:09 Uhr schrieb Kevin Olbrich :
> >
> > Hi!
> >
> > Yes, thank you. At least on one node this works, the other node just freezes but this might be caused by a bad disk that I try to find.
> > freezes but this might by caused by a bad disk that I try to find.
> >
> > Kevin
> >
> > Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander 
> > :
> >>
> >> Hi,
> >>
> >> $ ceph-volume lvm list
> >>
> >> Does that work for you?
> >>
> >> Wido
> >>
> >> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
> >> > Hi!
> >> >
> >> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
> >> > Before I migrated from filestore with simple-mode to bluestore with lvm,
> >> > I was able to find the raw disk with "df".
> >> > Now, I need to go from LVM LV to PV to disk every time I need to
> >> > check/smartctl a disk.
> >> >
> >> > Kevin
> >> >
> >> >
> >> > ___
> >> > ceph-users mailing list
> >> > ceph-users@lists.ceph.com
> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fastest way to find raw device from OSD-ID? (osd -> lvm lv -> lvm pv -> disk)

2018-10-08 Thread Alfredo Deza
On Mon, Oct 8, 2018 at 6:09 AM Kevin Olbrich  wrote:
>
> Hi!
>
> Yes, thank you. At least on one node this works, the other node just freezes 
> but this might be caused by a bad disk that I try to find.

If it is freezing, you could maybe try running, on its own, the
command where it freezes (ceph-volume will log it to the terminal).
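
To avoid scanning every disk on the node that freezes, you can also
point it at a single device (a sketch; substitute the device you
suspect):

    ceph-volume lvm list /dev/sdd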


>
> Kevin
>
> Am Mo., 8. Okt. 2018 um 12:07 Uhr schrieb Wido den Hollander :
>>
>> Hi,
>>
>> $ ceph-volume lvm list
>>
>> Does that work for you?
>>
>> Wido
>>
>> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
>> > Hi!
>> >
>> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
>> > Before I migrated from filestore with simple-mode to bluestore with lvm,
>> > I was able to find the raw disk with "df".
>> > Now, I need to go from LVM LV to PV to disk every time I need to
>> > check/smartctl a disk.
>> >
>> > Kevin
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume: recreate OSD with same ID after drive replacement

2018-10-03 Thread Alfredo Deza
On Wed, Oct 3, 2018 at 3:52 PM Andras Pataki
 wrote:
>
> Ok, understood (for next time).
>
> But just as an update/closure to my investigation - it seems this is a
> feature of ceph-volume (that it can't just create an OSD from scratch
> with a given ID), not of base ceph.  The underlying ceph command (ceph
> osd new) very happily accepts an osd-id as an extra optional argument
> (after the fsid), and creates and osd with the given ID.  In fact, a
> quick change to ceph_volume (create_id function in prepare.py) will make
> ceph-volume recreate the OSD with a given ID.  I'm not a ceph-volume
> expert, but a feature to create an OSD with a given ID from scratch
> would be nice (given that the underlying raw ceph commands already
> support it).

That is something that I wasn't aware of, thanks for bringing it up.
I've created an issue on the tracker to accommodate for that behavior:

http://tracker.ceph.com/issues/36307
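
For reference, the underlying monitor command that accepts the optional
id looks roughly like this (placeholders, not real values):

    ceph osd new <osd-uuid> <osd-id>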

>
> Andras
>
> On 10/3/18 11:41 AM, Alfredo Deza wrote:
> > On Wed, Oct 3, 2018 at 11:23 AM Andras Pataki
> >  wrote:
> >> Thanks - I didn't realize that was such a recent fix.
> >>
> >> I've now tried 12.2.8, and perhaps I'm not clear on what I should have
> >> done to the OSD that I'm replacing, since I'm getting the error "The osd
> >> ID 747 is already in use or does not exist.".  The case is clearly the
> >> latter, since I've completely removed the old OSD (osd crush remove,
> >> auth del, osd rm, wipe disk).  Should I have done something different
> >> (i.e. not remove the OSD completely)?
> > Yeah, you completely removed it so now it can't be re-used. This is
> > the proper way if wanting to re-use the ID:
> >
> > http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#rados-replacing-an-osd
> >
> > Basically:
> >
> >  ceph osd destroy {id} --yes-i-really-mean-it
> >
> >> Searching the docs I see a command 'ceph osd destroy'.  What does that
> >> do (compared to my removal procedure, osd crush remove, auth del, osd rm)?
> >>
> >> Thanks,
> >>
> >> Andras
> >>
> >>
> >> On 10/3/18 10:36 AM, Alfredo Deza wrote:
> >>> On Wed, Oct 3, 2018 at 9:57 AM Andras Pataki
> >>>  wrote:
> >>>> After replacing failing drive I'd like to recreate the OSD with the same
> >>>> osd-id using ceph-volume (now that we've moved to ceph-volume from
> >>>> ceph-disk).  However, I seem to not be successful.  The command I'm 
> >>>> using:
> >>>>
> >>>> ceph-volume lvm prepare --bluestore --osd-id 747 --data H901D44/H901D44
> >>>> --block.db /dev/disk/by-partlabel/H901J44
> >>>>
> >>>> But it created an OSD the ID 601, which was the lowest it could allocate
> >>>> and ignored the 747 apparently.  This is with ceph 12.2.7. Any ideas?
> >>> Yeah, this was a problem that was fixed and released as part of 12.2.8
> >>>
> >>> The tracker issue is: http://tracker.ceph.com/issues/24044
> >>>
> >>> The Luminous PR is https://github.com/ceph/ceph/pull/23102
> >>>
> >>> Sorry for the trouble!
> >>>> Andras
> >>>>
> >>>> ___
> >>>> ceph-users mailing list
> >>>> ceph-users@lists.ceph.com
> >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume: recreate OSD with same ID after drive replacement

2018-10-03 Thread Alfredo Deza
On Wed, Oct 3, 2018 at 11:23 AM Andras Pataki
 wrote:
>
> Thanks - I didn't realize that was such a recent fix.
>
> I've now tried 12.2.8, and perhaps I'm not clear on what I should have
> done to the OSD that I'm replacing, since I'm getting the error "The osd
> ID 747 is already in use or does not exist.".  The case is clearly the
> latter, since I've completely removed the old OSD (osd crush remove,
> auth del, osd rm, wipe disk).  Should I have done something different
> (i.e. not remove the OSD completely)?

Yeah, you completely removed it so now it can't be re-used. This is
the proper way if wanting to re-use the ID:

http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#rados-replacing-an-osd

Basically:

ceph osd destroy {id} --yes-i-really-mean-it
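
So the rough flow for keeping the id across a drive replacement would be
something like this (a sketch based on the command from this thread;
then activate as usual):

    ceph osd destroy 747 --yes-i-really-mean-it
    ceph-volume lvm prepare --bluestore --osd-id 747 --data H901D44/H901D44 --block.db /dev/disk/by-partlabel/H901J44
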

> Searching the docs I see a command 'ceph osd destroy'.  What does that
> do (compared to my removal procedure, osd crush remove, auth del, osd rm)?
>
> Thanks,
>
> Andras
>
>
> On 10/3/18 10:36 AM, Alfredo Deza wrote:
> > On Wed, Oct 3, 2018 at 9:57 AM Andras Pataki
> >  wrote:
> >> After replacing failing drive I'd like to recreate the OSD with the same
> >> osd-id using ceph-volume (now that we've moved to ceph-volume from
> >> ceph-disk).  However, I seem to not be successful.  The command I'm using:
> >>
> >> ceph-volume lvm prepare --bluestore --osd-id 747 --data H901D44/H901D44
> >> --block.db /dev/disk/by-partlabel/H901J44
> >>
> >> But it created an OSD the ID 601, which was the lowest it could allocate
> >> and ignored the 747 apparently.  This is with ceph 12.2.7. Any ideas?
> > Yeah, this was a problem that was fixed and released as part of 12.2.8
> >
> > The tracker issue is: http://tracker.ceph.com/issues/24044
> >
> > The Luminous PR is https://github.com/ceph/ceph/pull/23102
> >
> > Sorry for the trouble!
> >> Andras
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-volume: recreate OSD with same ID after drive replacement

2018-10-03 Thread Alfredo Deza
On Wed, Oct 3, 2018 at 9:57 AM Andras Pataki
 wrote:
>
> After replacing failing drive I'd like to recreate the OSD with the same
> osd-id using ceph-volume (now that we've moved to ceph-volume from
> ceph-disk).  However, I seem to not be successful.  The command I'm using:
>
> ceph-volume lvm prepare --bluestore --osd-id 747 --data H901D44/H901D44
> --block.db /dev/disk/by-partlabel/H901J44
>
> But it created an OSD the ID 601, which was the lowest it could allocate
> and ignored the 747 apparently.  This is with ceph 12.2.7. Any ideas?

Yeah, this was a problem that was fixed and released as part of 12.2.8

The tracker issue is: http://tracker.ceph.com/issues/24044

The Luminous PR is https://github.com/ceph/ceph/pull/23102

Sorry for the trouble!
>
> Andras
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mimic: 3/4 OSDs crashed on "bluefs enospc"

2018-10-02 Thread Alfredo Deza
On Tue, Oct 2, 2018 at 10:23 AM Alex Litvak
 wrote:
>
> Igor,
>
> Thank you for your reply.  So what you are saying there are really no
> sensible space requirements for a collocated device? Even if I setup 30
> GB for DB (which I really wouldn't like to do due to a space waste
> considerations ) there is a chance that if this space feels up I will be
> in the same trouble under some heavy load scenario?

We do have good sizing recommendations for a separate block.db
partition. Roughly, it shouldn't be less than 4% of the size of the
data device.

http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#sizing
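
As a rough example: for the 800 GB devices mentioned above, 4% works out
to about 32 GB of block.db, and for a 4 TB spinner it would be around
160 GB.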

>
> On 10/2/2018 9:15 AM, Igor Fedotov wrote:
> > Even with a single device bluestore has a sort of implicit "BlueFS
> > partition" where DB is stored.  And it dynamically adjusts (rebalances)
> > the space for that partition in background. Unfortunately it might
> > perform that "too lazy" and hence under some heavy load it might end-up
> > with the lack of space for that partition. While main device still has
> > plenty of free space.
> >
> > I'm planning to refactor this re-balancing procedure in the future to
> > eliminate the root cause.
> >
> >
> > Thanks,
> >
> > Igor
> >
> >
> > On 10/2/2018 5:04 PM, Alex Litvak wrote:
> >> I am sorry for interrupting the thread, but my understanding always
> >> was that blue store on the single device should not care of the DB
> >> size, i.e. it would use the data part for all operations if DB is
> >> full.  And if it is not true, what would be sensible defaults on 800
> >> GB SSD?  I used ceph-ansible to build my cluster with system defaults
> >> and from I reading in this thread doesn't give me a good feeling at
> >> all. Document ion on the topic is very sketchy and online posts
> >> contradict each other some times.
> >>
> >> Thank you in advance,
> >>
> >> On 10/2/2018 8:52 AM, Igor Fedotov wrote:
> >>> May I have a repair log for that "already expanded" OSD?
> >>>
> >>>
> >>> On 10/2/2018 4:32 PM, Sergey Malinin wrote:
>  Repair goes through only when LVM volume has been expanded,
>  otherwise it fails with enospc as well as any other operation.
>  However, expanding the volume immediately renders bluefs unmountable
>  with IO error.
>  2 of 3 OSDs got bluefs log currupted (bluestore tool segfaults at
>  the very end of bluefs-log-dump), I'm not sure whether corruption
>  occurred before or after volume expansion.
> 
> 
> > On 2.10.2018, at 16:07, Igor Fedotov  wrote:
> >
> > You mentioned repair had worked before, is that correct? What's the
> > difference now except the applied patch? Different OSD? Anything else?
> >
> >
> > On 10/2/2018 3:52 PM, Sergey Malinin wrote:
> >
> >> It didn't work, emailed logs to you.
> >>
> >>
> >>> On 2.10.2018, at 14:43, Igor Fedotov  wrote:
> >>>
> >>> The major change is in get_bluefs_rebalance_txn function, it
> >>> lacked bluefs_rebalance_txn assignment..
> >>>
> >>>
> >>>
> >>> On 10/2/2018 2:40 PM, Sergey Malinin wrote:
>  PR doesn't seem to have changed since yesterday. Am I missing
>  something?
> 
> 
> > On 2.10.2018, at 14:15, Igor Fedotov  wrote:
> >
> > Please update the patch from the PR - it didn't update bluefs
> > extents list before.
> >
> > Also please set debug bluestore 20 when re-running repair and
> > collect the log.
> >
> > If repair doesn't help - would you send repair and startup logs
> > directly to me as I have some issues accessing ceph-post-file
> > uploads.
> >
> >
> > Thanks,
> >
> > Igor
> >
> >
> > On 10/2/2018 11:39 AM, Sergey Malinin wrote:
> >> Yes, I did repair all OSDs and it finished with 'repair
> >> success'. I backed up OSDs so now I have more room to play.
> >> I posted log files using ceph-post-file with the following IDs:
> >> 4af9cc4d-9c73-41c9-9c38-eb6c551047a0
> >> 20df7df5-f0c9-4186-aa21-4e5c0172cd93
> >>
> >>
> >>> On 2.10.2018, at 11:26, Igor Fedotov  wrote:
> >>>
> >>> You did repair for any of this OSDs, didn't you? For all of
> >>> them?
> >>>
> >>>
> >>> Would you please provide the log for both types (failed on
> >>> mount and failed with enospc) of failing OSDs. Prior to
> >>> collecting please remove existing ones prior and set debug
> >>> bluestore to 20.
> >>>
> >>>
> >>>
> >>> On 10/2/2018 2:16 AM, Sergey Malinin wrote:
>  I was able to apply patches to mimic, but nothing changed.
>  One osd that I had space expanded on fails with bluefs mount
>  IO error, others keep failing with enospc.
> 
> 
> > On 1.10.2018, at 

Re: [ceph-users] ceph-ansible

2018-09-21 Thread Alfredo Deza
On Thu, Sep 20, 2018 at 7:04 PM solarflow99  wrote:
>
> oh, was that all it was...  git clone https://github.com/ceph/ceph-ansible/
> I installed the notario  package from EPEL, 
> python2-notario-0.0.11-2.el7.noarch  and thats the newest they have

Hey Ken, I thought the latest versions were being packaged; is there
something I've missed? The tag format seems to have changed from
0.0.11.
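
In the meantime, pulling a newer notario straight from PyPI should
unblock you (assuming pip is available on that box; adjust for your
environment):

    pip install 'notario>=0.0.13'
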
>
>
>
>
> On Thu, Sep 20, 2018 at 3:57 PM Alfredo Deza  wrote:
>>
>> Not sure how you installed ceph-ansible, the requirements mention a
>> version of a dependency (the notario module) which needs to be 0.0.13
>> or newer, and you seem to be using an older one.
>>
>>
>> On Thu, Sep 20, 2018 at 6:53 PM solarflow99  wrote:
>> >
>> > Hi, tying to get this to do a simple deployment, and i'm getting a strange 
>> > error, has anyone seen this?  I'm using Centos 7, rel 5   ansible 2.5.3  
>> > python version = 2.7.5
>> >
>> > I've tried with mimic luninous and even jewel, no luck at all.
>> >
>> >
>> >
>> > TASK [ceph-validate : validate provided configuration] 
>> > **
>> > task path: 
>> > /home/jzygmont/ansible/ceph-ansible/roles/ceph-validate/tasks/main.yml:2
>> > Thursday 20 September 2018  14:05:18 -0700 (0:00:05.734)   0:00:37.439 
>> > 
>> > The full traceback is:
>> > Traceback (most recent call last):
>> >   File 
>> > "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 
>> > 138, in run
>> > res = self._execute()
>> >   File 
>> > "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", line 
>> > 561, in _execute
>> > result = self._handler.run(task_vars=variables)
>> >   File "/home/jzygmont/ansible/ceph-ansible/plugins/actions/validate.py", 
>> > line 43, in run
>> > notario.validate(host_vars, install_options, defined_keys=True)
>> > TypeError: validate() got an unexpected keyword argument 'defined_keys'
>> >
>> > fatal: [172.20.3.178]: FAILED! => {
>> > "msg": "Unexpected failure during module execution.",
>> > "stdout": ""
>> > }
>> >
>> > NO MORE HOSTS LEFT 
>> > **
>> >
>> > PLAY RECAP 
>> > **
>> > 172.20.3.178   : ok=25   changed=0unreachable=0failed=1
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-ansible

2018-09-20 Thread Alfredo Deza
Not sure how you installed ceph-ansible, the requirements mention a
version of a dependency (the notario module) which needs to be 0.0.13
or newer, and you seem to be using an older one.


On Thu, Sep 20, 2018 at 6:53 PM solarflow99  wrote:
>
> Hi, tying to get this to do a simple deployment, and i'm getting a strange 
> error, has anyone seen this?  I'm using Centos 7, rel 5   ansible 2.5.3  
> python version = 2.7.5
>
> I've tried with mimic luninous and even jewel, no luck at all.
>
>
>
> TASK [ceph-validate : validate provided configuration] 
> **
> task path: 
> /home/jzygmont/ansible/ceph-ansible/roles/ceph-validate/tasks/main.yml:2
> Thursday 20 September 2018  14:05:18 -0700 (0:00:05.734)   0:00:37.439 
> 
> The full traceback is:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", 
> line 138, in run
> res = self._execute()
>   File "/usr/lib/python2.7/site-packages/ansible/executor/task_executor.py", 
> line 561, in _execute
> result = self._handler.run(task_vars=variables)
>   File "/home/jzygmont/ansible/ceph-ansible/plugins/actions/validate.py", 
> line 43, in run
> notario.validate(host_vars, install_options, defined_keys=True)
> TypeError: validate() got an unexpected keyword argument 'defined_keys'
>
> fatal: [172.20.3.178]: FAILED! => {
> "msg": "Unexpected failure during module execution.",
> "stdout": ""
> }
>
> NO MORE HOSTS LEFT 
> **
>
> PLAY RECAP 
> **
> 172.20.3.178   : ok=25   changed=0unreachable=0failed=1
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] WAL/DB size

2018-09-07 Thread Alfredo Deza
On Fri, Sep 7, 2018 at 3:31 PM, Maged Mokhtar  wrote:
> On 2018-09-07 14:36, Alfredo Deza wrote:
>
> On Fri, Sep 7, 2018 at 8:27 AM, Muhammad Junaid 
> wrote:
>
> Hi there
>
> Asking the questions as a newbie. May be asked a number of times before by
> many but sorry, it is not clear yet to me.
>
> 1. The WAL device is just like journaling device used before bluestore. And
> CEPH confirms Write to client after writing to it (Before actual write to
> primary device)?
>
> 2. If we have lets say 5 OSD's (4 TB SAS) and 1 200GB SSD. Should we
> partition SSD in 10 partitions? Shoud/Can we set WAL Partition Size against
> each OSD as 10GB? Or what min/max we should set for WAL Partition? And can
> we set remaining 150GB as (30GB * 5) for 5 db partitions for all OSD's?
>
>
> A WAL partition would only help if you have a device faster than the
> SSD where the block.db would go.
>
> We recently updated our sizing recommendations for block.db at least
> 4% of the size of block (also referenced as the data device):
>
> http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#sizing
>
> In your case, what you want is to create 5 logical volumes from your
> 200GB at 40GB each, without a need for a WAL device.
>
>
>
> Thanks in advance. Regards.
>
> Muhammad Junaid
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> should not the db size depend on the number of objects stored rather than
> their storage size ? or is the new recommendation assuming some average
> object size ?

The latter. You are correct that it should depend on the number of
objects, but objects vary in size depending on the type of workload:
RGW objects are different from RBD objects. So we are taking a
baseline/average object size and recommending based on that, which
works out to roughly 4% of the size of the data device.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] WAL/DB size

2018-09-07 Thread Alfredo Deza
On Fri, Sep 7, 2018 at 9:02 AM, Muhammad Junaid  wrote:
> Thanks Alfredo. Just to clear that My configuration has 5 OSD's (7200 rpm
> SAS HDDS) which are slower than the 200G SSD. Thats why I asked for a 10G
> WAL partition for each OSD on the SSD.
>
> Are you asking us to do 40GB  * 5 partitions on SSD just for block.db?

Yes.

You don't need a separate WAL defined. It only makes sense when you
have something *faster* than where block.db will live.

In your case 'data' will go in the slower spinning devices, 'block.db'
will go in the SSD, and there is no need for WAL. You would only benefit
from WAL if you had another device, like an NVMe, where 2GB partitions
(or LVs) could be created for block.wal.
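
As a rough sketch (placeholder device names; here the SSD is /dev/sdk
and one of the SAS drives is /dev/sda), the 200GB SSD would be split
into five ~40GB LVs and each one passed as block.db:

    vgcreate ceph-db /dev/sdk
    lvcreate -l 20%VG -n db-1 ceph-db
    (repeat for db-2 through db-5; the last one can use -l 100%FREE to
    avoid rounding problems)
    ceph-volume lvm create --bluestore --data /dev/sda --block.db ceph-db/db-1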


>
> On Fri, Sep 7, 2018 at 5:36 PM Alfredo Deza  wrote:
>>
>> On Fri, Sep 7, 2018 at 8:27 AM, Muhammad Junaid 
>> wrote:
>> > Hi there
>> >
>> > Asking the questions as a newbie. May be asked a number of times before
>> > by
>> > many but sorry, it is not clear yet to me.
>> >
>> > 1. The WAL device is just like journaling device used before bluestore.
>> > And
>> > CEPH confirms Write to client after writing to it (Before actual write
>> > to
>> > primary device)?
>> >
>> > 2. If we have lets say 5 OSD's (4 TB SAS) and 1 200GB SSD. Should we
>> > partition SSD in 10 partitions? Shoud/Can we set WAL Partition Size
>> > against
>> > each OSD as 10GB? Or what min/max we should set for WAL Partition? And
>> > can
>> > we set remaining 150GB as (30GB * 5) for 5 db partitions for all OSD's?
>>
>> A WAL partition would only help if you have a device faster than the
>> SSD where the block.db would go.
>>
>> We recently updated our sizing recommendations for block.db at least
>> 4% of the size of block (also referenced as the data device):
>>
>>
>> http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#sizing
>>
>> In your case, what you want is to create 5 logical volumes from your
>> 200GB at 40GB each, without a need for a WAL device.
>>
>>
>> >
>> > Thanks in advance. Regards.
>> >
>> > Muhammad Junaid
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] WAL/DB size

2018-09-07 Thread Alfredo Deza
On Fri, Sep 7, 2018 at 8:27 AM, Muhammad Junaid  wrote:
> Hi there
>
> Asking the questions as a newbie. May be asked a number of times before by
> many but sorry, it is not clear yet to me.
>
> 1. The WAL device is just like journaling device used before bluestore. And
> CEPH confirms Write to client after writing to it (Before actual write to
> primary device)?
>
> 2. If we have lets say 5 OSD's (4 TB SAS) and 1 200GB SSD. Should we
> partition SSD in 10 partitions? Shoud/Can we set WAL Partition Size against
> each OSD as 10GB? Or what min/max we should set for WAL Partition? And can
> we set remaining 150GB as (30GB * 5) for 5 db partitions for all OSD's?

A WAL partition would only help if you have a device faster than the
SSD where the block.db would go.

We recently updated our sizing recommendations for block.db at least
4% of the size of block (also referenced as the data device):

http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#sizing

In your case, what you want is to create 5 logical volumes from your
200GB at 40GB each, without a need for a WAL device.


>
> Thanks in advance. Regards.
>
> Muhammad Junaid
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "no valid command found" when running "ceph-deploy osd create"

2018-09-04 Thread Alfredo Deza
On Sun, Sep 2, 2018 at 3:01 PM, David Wahler  wrote:
> On Sun, Sep 2, 2018 at 1:31 PM Alfredo Deza  wrote:
>>
>> On Sun, Sep 2, 2018 at 12:00 PM, David Wahler  wrote:
>> > Ah, ceph-volume.log pointed out the actual problem:
>> >
>> > RuntimeError: Cannot use device (/dev/storage/bluestore). A vg/lv path
>> > or an existing device is needed
>>
>> That is odd, is it possible that the error log wasn't the one that
>> matched what you saw on ceph-deploy's end?
>>
>> Usually ceph-deploy will just receive whatever ceph-volume produced.
>
> I tried again, running ceph-volume directly this time, just to see if
> I had mixed anything up. It looks like ceph-deploy is correctly
> reporting the output of ceph-volume. The problem is that ceph-volume
> only writes the relevant error message to the log file, and not to its
> stdout/stderr.
>
> Console output:
>
> rock64@rockpro64-1:~/my-cluster$ sudo ceph-volume --cluster ceph lvm
> create --bluestore --data /dev/storage/foobar
> Running command: /usr/bin/ceph-authtool --gen-print-key
> Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring
> /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> e7dd6d45-b556-461c-bad1-83d98a5a1afa
> --> Was unable to complete a new OSD, will rollback changes
> Running command: /usr/bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring
> /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.1
> --yes-i-really-mean-it
>  stderr: no valid command found; 10 closest matches:
> [...etc...]
>
> ceph-volume.log:
>
> [2018-09-02 18:49:21,415][ceph_volume.main][INFO  ] Running command:
> ceph-volume --cluster ceph lvm create --bluestore --data
> /dev/storage/foobar
> [2018-09-02 18:49:21,423][ceph_volume.process][INFO  ] Running
> command: /usr/bin/ceph-authtool --gen-print-key
> [2018-09-02 18:49:26,664][ceph_volume.process][INFO  ] stdout
> AQCxMIxb+SezJRAAGAP/HHtHLVbciSQnZ/c/qw==
> [2018-09-02 18:49:26,668][ceph_volume.process][INFO  ] Running
> command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd
> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> e7dd6d45-b556-461c-bad1-83d98a5a1afa
> [2018-09-02 18:49:27,685][ceph_volume.process][INFO  ] stdout 1
> [2018-09-02 18:49:27,686][ceph_volume.process][INFO  ] Running
> command: /bin/lsblk --nodeps -P -o
> NAME,KNAME,MAJ:MIN,FSTYPE,MOUNTPOINT,LABEL,UUID,RO,RM,MODEL,SIZE,STATE,OWNER,GROUP,MODE,ALIGNMENT,PHY-SEC,LOG-SEC,ROTA,SCHED,TYPE,DISC-ALN,DISC-GRAN,DISC-MAX,DISC-ZERO,PKNAME,PARTLABEL
> /dev/storage/foobar
> [2018-09-02 18:49:27,707][ceph_volume.process][INFO  ] stdout
> NAME="storage-foobar" KNAME="dm-1" MAJ:MIN="253:1" FSTYPE=""
> MOUNTPOINT="" LABEL="" UUID="" RO="0" RM="0" MODEL="" SIZE="100G"
> STATE="running" OWNER="root" GROUP="disk" MODE="brw-rw"
> ALIGNMENT="0" PHY-SEC="4096" LOG-SEC="512" ROTA="1" SCHED=""
> TYPE="lvm" DISC-ALN="0" DISC-GRAN="0B" DISC-MAX="0B" DISC-ZERO="0"
> PKNAME="" PARTLABEL=""
> [2018-09-02 18:49:27,708][ceph_volume.process][INFO  ] Running
> command: /bin/lsblk --nodeps -P -o
> NAME,KNAME,MAJ:MIN,FSTYPE,MOUNTPOINT,LABEL,UUID,RO,RM,MODEL,SIZE,STATE,OWNER,GROUP,MODE,ALIGNMENT,PHY-SEC,LOG-SEC,ROTA,SCHED,TYPE,DISC-ALN,DISC-GRAN,DISC-MAX,DISC-ZERO,PKNAME,PARTLABEL
> /dev/storage/foobar
> [2018-09-02 18:49:27,720][ceph_volume.process][INFO  ] stdout
> NAME="storage-foobar" KNAME="dm-1" MAJ:MIN="253:1" FSTYPE=""
> MOUNTPOINT="" LABEL="" UUID="" RO="0" RM="0" MODEL="" SIZE="100G"
> STATE="running" OWNER="root" GROUP="disk" MODE="brw-rw"
> ALIGNMENT="0" PHY-SEC="4096" LOG-SEC="512" ROTA="1" SCHED=""
> TYPE="lvm" DISC-ALN="0" DISC-GRAN="0B" DISC-MAX="0B" DISC-ZERO="0"
> PKNAME="" PARTLABEL=""
> [2018-09-02 18:49:27,720][ceph_volume.devices.lvm.prepare][ERROR ] lvm
> prepare was unable to complete
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/prepare.py",
> line 216, in safe_prepare
> self.prepare(args)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py",
> line 16, in is_root
> return func(*a, **kw)
>   File "/usr/lib/python2.7/dist-packages/ceph_volume/devi

Re: [ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-09-04 Thread Alfredo Deza
On Tue, Sep 4, 2018 at 3:59 AM, Wolfgang Lendl
 wrote:
> is downgrading from 12.2.7 to 12.2.5 an option? - I'm still suffering
> from high frequent osd crashes.
> my hopes are with 12.2.9 - but hope wasn't always my best strategy

12.2.8 just went out. I think that Adam or Radoslaw might have some
time to check those logs now

>
> br
> wolfgang
>
> On 2018-08-30 19:18, Alfredo Deza wrote:
>> On Thu, Aug 30, 2018 at 5:24 AM, Wolfgang Lendl
>>  wrote:
>>> Hi Alfredo,
>>>
>>>
>>> caught some logs:
>>> https://pastebin.com/b3URiA7p
>> That looks like there is an issue with bluestore. Maybe Radoslaw or
>> Adam might know a bit more.
>>
>>
>>> br
>>> wolfgang
>>>
>>> On 2018-08-29 15:51, Alfredo Deza wrote:
>>>> On Wed, Aug 29, 2018 at 2:06 AM, Wolfgang Lendl
>>>>  wrote:
>>>>> Hi,
>>>>>
>>>>> after upgrading my ceph clusters from 12.2.5 to 12.2.7  I'm experiencing 
>>>>> random crashes from SSD OSDs (bluestore) - it seems that HDD OSDs are not 
>>>>> affected.
>>>>> I destroyed and recreated some of the SSD OSDs which seemed to help.
>>>>>
>>>>> this happens on centos 7.5 (different kernels tested)
>>>>>
>>>>> /var/log/messages:
>>>>> Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation fault) **
>>>>> Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700 
>>>>> thread_name:bstore_kv_final
>>>>> Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470] general 
>>>>> protection ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in 
>>>>> libtcmalloc.so.4.4.5[7f8a997a8000+46000]
>>>>> Aug 29 10:24:08  systemd: ceph-osd@2.service: main process exited, 
>>>>> code=killed, status=11/SEGV
>>>>> Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered failed state.
>>>>> Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
>>>>> Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time over, 
>>>>> scheduling restart.
>>>>> Aug 29 10:24:28  systemd: Starting Ceph object storage daemon osd.2...
>>>>> Aug 29 10:24:28  systemd: Started Ceph object storage daemon osd.2.
>>>>> Aug 29 10:24:28  ceph-osd: starting osd.2 at - osd_data 
>>>>> /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
>>>>> Aug 29 10:24:35  ceph-osd: *** Caught signal (Segmentation fault) **
>>>>> Aug 29 10:24:35  ceph-osd: in thread 7f5f1e790700 thread_name:tp_osd_tp
>>>>> Aug 29 10:24:35  kernel: traps: tp_osd_tp[186933] general protection 
>>>>> ip:7f5f43103e63 sp:7f5f1e78a1c8 error:0 in 
>>>>> libtcmalloc.so.4.4.5[7f5f430cd000+46000]
>>>>> Aug 29 10:24:35  systemd: ceph-osd@0.service: main process exited, 
>>>>> code=killed, status=11/SEGV
>>>>> Aug 29 10:24:35  systemd: Unit ceph-osd@0.service entered failed state.
>>>>> Aug 29 10:24:35  systemd: ceph-osd@0.service failed
>>>> These systemd messages aren't usually helpful, try poking around
>>>> /var/log/ceph/ for the output on that one OSD.
>>>>
>>>> If those logs aren't useful either, try bumping up the verbosity (see
>>>> http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/#boot-time
>>>> )
>>>>> did I hit a known issue?
>>>>> any suggestions are highly appreciated
>>>>>
>>>>>
>>>>> br
>>>>> wolfgang
>>>>>
>>>>>
>>>>>
>>>>> ___
>>>>> ceph-users mailing list
>>>>> ceph-users@lists.ceph.com
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>
>>> --
>>> Wolfgang Lendl
>>> IT Systems & Communications
>>> Medizinische Universität Wien
>>> Spitalgasse 23 / BT 88 /Ebene 00
>>> A-1090 Wien
>>> Tel: +43 1 40160-21231
>>> Fax: +43 1 40160-921200
>>>
>>>
>
> --
> Wolfgang Lendl
> IT Systems & Communications
> Medizinische Universität Wien
> Spitalgasse 23 / BT 88 /Ebene 00
> A-1090 Wien
> Tel: +43 1 40160-21231
> Fax: +43 1 40160-921200
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "no valid command found" when running "ceph-deploy osd create"

2018-09-02 Thread Alfredo Deza
On Sun, Sep 2, 2018 at 12:00 PM, David Wahler  wrote:
> Ah, ceph-volume.log pointed out the actual problem:
>
> RuntimeError: Cannot use device (/dev/storage/bluestore). A vg/lv path
> or an existing device is needed

That is odd, is it possible that the error log wasn't the one that
matched what you saw on ceph-deploy's end?

Usually ceph-deploy will just receive whatever ceph-volume produced.
>
> When I changed "--data /dev/storage/bluestore" to "--data
> storage/bluestore", everything worked fine.
>
> I agree that the ceph-deploy logs are a bit confusing. I submitted a
> PR to add a brief note to the quick-start guide, in case anyone else
> makes the same mistake: https://github.com/ceph/ceph/pull/23879
>
Thanks for the PR!

> Thanks for the assistance!
>
> -- David
>
> On Sun, Sep 2, 2018 at 7:44 AM Alfredo Deza  wrote:
>>
>> There should be useful logs from ceph-volume in
>> /var/log/ceph/ceph-volume.log that might show a bit more here.
>>
>> I would also try the command that fails directly on the server (sans
>> ceph-deploy) to see what is it that is actually failing. Seems like
>> the ceph-deploy log output is a bit out of order (some race condition
>> here maybe)
>>
>>
>> On Sun, Sep 2, 2018 at 2:53 AM, David Wahler  wrote:
>> > Hi all,
>> >
>> > I'm attempting to get a small Mimic cluster running on ARM, starting
>> > with a single node. Since there don't seem to be any Debian ARM64
>> > packages in the official Ceph repository, I had to build from source,
>> > which was fairly straightforward.
>> >
>> > After installing the .deb packages that I built and following the
>> > quick-start guide
>> > (http://docs.ceph.com/docs/mimic/start/quick-ceph-deploy/), things
>> > seemed to be working fine at first, but I got this error when
>> > attempting to create an OSD:
>> >
>> > rock64@rockpro64-1:~/my-cluster$ ceph-deploy osd create --data
>> > /dev/storage/bluestore rockpro64-1
>> > [ceph_deploy.conf][DEBUG ] found configuration file at:
>> > /home/rock64/.cephdeploy.conf
>> > [ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy osd
>> > create --data /dev/storage/bluestore rockpro64-1
>> > [ceph_deploy.cli][INFO  ] ceph-deploy options:
>> > [ceph_deploy.cli][INFO  ]  verbose   : False
>> > [ceph_deploy.cli][INFO  ]  bluestore : None
>> > [ceph_deploy.cli][INFO  ]  cd_conf   :
>> > 
>> > [ceph_deploy.cli][INFO  ]  cluster   : ceph
>> > [ceph_deploy.cli][INFO  ]  fs_type   : xfs
>> > [ceph_deploy.cli][INFO  ]  block_wal : None
>> > [ceph_deploy.cli][INFO  ]  default_release   : False
>> > [ceph_deploy.cli][INFO  ]  username  : None
>> > [ceph_deploy.cli][INFO  ]  journal   : None
>> > [ceph_deploy.cli][INFO  ]  subcommand: create
>> > [ceph_deploy.cli][INFO  ]  host  : rockpro64-1
>> > [ceph_deploy.cli][INFO  ]  filestore : None
>> > [ceph_deploy.cli][INFO  ]  func  : > > osd at 0x7fa9ca0c80>
>> > [ceph_deploy.cli][INFO  ]  ceph_conf : None
>> > [ceph_deploy.cli][INFO  ]  zap_disk  : False
>> > [ceph_deploy.cli][INFO  ]  data  :
>> > /dev/storage/bluestore
>> > [ceph_deploy.cli][INFO  ]  block_db  : None
>> > [ceph_deploy.cli][INFO  ]  dmcrypt   : False
>> > [ceph_deploy.cli][INFO  ]  overwrite_conf: False
>> > [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
>> > /etc/ceph/dmcrypt-keys
>> > [ceph_deploy.cli][INFO  ]  quiet : False
>> > [ceph_deploy.cli][INFO  ]  debug : False
>> > [ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data
>> > device /dev/storage/bluestore
>> > [rockpro64-1][DEBUG ] connection detected need for sudo
>> > [rockpro64-1][DEBUG ] connected to host: rockpro64-1
>> > [rockpro64-1][DEBUG ] detect platform information from remote host
>> > [rockpro64-1][DEBUG ] detect machine type
>> > [rockpro64-1][DEBUG ] find the location of an executable
>> > [ceph_deploy.osd][INFO  ] Distro info: debian buster/sid sid
>> > [ceph_deploy.osd][DEBUG ] Deploying osd to rockpro64-1
>> > [rockpro64-1][DEBUG ] wr

Re: [ceph-users] Slow requests from bluestore osds

2018-09-02 Thread Alfredo Deza
On Sat, Sep 1, 2018 at 12:45 PM, Brett Chancellor
 wrote:
> Hi Cephers,
>   I am in the process of upgrading a cluster from Filestore to bluestore,
> but I'm concerned about frequent warnings popping up against the new
> bluestore devices. I'm frequently seeing messages like this, although the
> specific osd changes, it's always one of the few hosts I've converted to
> bluestore.
>
> 6 ops are blocked > 32.768 sec on osd.219
> 1 osds have slow requests
>
> I'm running 12.2.4, have any of you seen similar issues? It seems as though
> these messages pop up more frequently when one of the bluestore pgs is
> involved in a scrub.  I'll include my bluestore creation process below, in
> case that might cause an issue. (sdb, sdc, sdd are SATA, sde and sdf are
> SSD)

It would be useful to include what those warnings say. The ceph-volume
commands look OK to me.
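
It would also help to see what the OSD itself reports for those blocked
ops, e.g. on the node hosting osd.219:

    ceph daemon osd.219 dump_ops_in_flight
    ceph daemon osd.219 dump_historic_ops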

>
>
> ## Process used to create osds
> sudo ceph-disk zap /dev/sdb /dev/sdc /dev/sdd /dev/sdd /dev/sde /dev/sdf
> sudo ceph-volume lvm zap /dev/sdb
> sudo ceph-volume lvm zap /dev/sdc
> sudo ceph-volume lvm zap /dev/sdd
> sudo ceph-volume lvm zap /dev/sde
> sudo ceph-volume lvm zap /dev/sdf
> sudo sgdisk -n 0:2048:+133GiB -t 0: -c 1:"ceph block.db sdb" /dev/sdf
> sudo sgdisk -n 0:0:+133GiB -t 0: -c 2:"ceph block.db sdc" /dev/sdf
> sudo sgdisk -n 0:0:+133GiB -t 0: -c 3:"ceph block.db sdd" /dev/sdf
> sudo sgdisk -n 0:0:+133GiB -t 0: -c 4:"ceph block.db sde" /dev/sdf
> sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data
> /dev/sdb --block.db /dev/sdf1
> sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data
> /dev/sdc --block.db /dev/sdf2
> sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data
> /dev/sdd --block.db /dev/sdf3
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "no valid command found" when running "ceph-deploy osd create"

2018-09-02 Thread Alfredo Deza
There should be useful logs from ceph-volume in
/var/log/ceph/ceph-volume.log that might show a bit more here.

I would also try running the command that fails directly on the server
(sans ceph-deploy) to see what it is that is actually failing. It seems
like the ceph-deploy log output is a bit out of order (some race
condition here, maybe).


On Sun, Sep 2, 2018 at 2:53 AM, David Wahler  wrote:
> Hi all,
>
> I'm attempting to get a small Mimic cluster running on ARM, starting
> with a single node. Since there don't seem to be any Debian ARM64
> packages in the official Ceph repository, I had to build from source,
> which was fairly straightforward.
>
> After installing the .deb packages that I built and following the
> quick-start guide
> (http://docs.ceph.com/docs/mimic/start/quick-ceph-deploy/), things
> seemed to be working fine at first, but I got this error when
> attempting to create an OSD:
>
> rock64@rockpro64-1:~/my-cluster$ ceph-deploy osd create --data
> /dev/storage/bluestore rockpro64-1
> [ceph_deploy.conf][DEBUG ] found configuration file at:
> /home/rock64/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy osd
> create --data /dev/storage/bluestore rockpro64-1
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  verbose   : False
> [ceph_deploy.cli][INFO  ]  bluestore : None
> [ceph_deploy.cli][INFO  ]  cd_conf   :
> 
> [ceph_deploy.cli][INFO  ]  cluster   : ceph
> [ceph_deploy.cli][INFO  ]  fs_type   : xfs
> [ceph_deploy.cli][INFO  ]  block_wal : None
> [ceph_deploy.cli][INFO  ]  default_release   : False
> [ceph_deploy.cli][INFO  ]  username  : None
> [ceph_deploy.cli][INFO  ]  journal   : None
> [ceph_deploy.cli][INFO  ]  subcommand: create
> [ceph_deploy.cli][INFO  ]  host  : rockpro64-1
> [ceph_deploy.cli][INFO  ]  filestore : None
> [ceph_deploy.cli][INFO  ]  func  :  osd at 0x7fa9ca0c80>
> [ceph_deploy.cli][INFO  ]  ceph_conf : None
> [ceph_deploy.cli][INFO  ]  zap_disk  : False
> [ceph_deploy.cli][INFO  ]  data  :
> /dev/storage/bluestore
> [ceph_deploy.cli][INFO  ]  block_db  : None
> [ceph_deploy.cli][INFO  ]  dmcrypt   : False
> [ceph_deploy.cli][INFO  ]  overwrite_conf: False
> [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
> /etc/ceph/dmcrypt-keys
> [ceph_deploy.cli][INFO  ]  quiet : False
> [ceph_deploy.cli][INFO  ]  debug : False
> [ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data
> device /dev/storage/bluestore
> [rockpro64-1][DEBUG ] connection detected need for sudo
> [rockpro64-1][DEBUG ] connected to host: rockpro64-1
> [rockpro64-1][DEBUG ] detect platform information from remote host
> [rockpro64-1][DEBUG ] detect machine type
> [rockpro64-1][DEBUG ] find the location of an executable
> [ceph_deploy.osd][INFO  ] Distro info: debian buster/sid sid
> [ceph_deploy.osd][DEBUG ] Deploying osd to rockpro64-1
> [rockpro64-1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
> [rockpro64-1][WARNIN] osd keyring does not exist yet, creating one
> [rockpro64-1][DEBUG ] create a keyring file
> [rockpro64-1][DEBUG ] find the location of an executable
> [rockpro64-1][INFO  ] Running command: sudo /usr/sbin/ceph-volume
> --cluster ceph lvm create --bluestore --data /dev/storage/bluestore
> [rockpro64-1][DEBUG ] Running command: /usr/bin/ceph-authtool --gen-print-key
> [rockpro64-1][WARNIN] -->  RuntimeError: command returned non-zero
> exit status: 22
> [rockpro64-1][DEBUG ] Running command: /usr/bin/ceph --cluster ceph
> --name client.bootstrap-osd --keyring
> /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> 4903fff3-550c-4ce3-aa7d-97193627c6c0
> [rockpro64-1][DEBUG ] --> Was unable to complete a new OSD, will
> rollback changes
> [rockpro64-1][DEBUG ] Running command: /usr/bin/ceph --cluster ceph
> --name client.bootstrap-osd --keyring
> /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.0
> --yes-i-really-mean-it
> [rockpro64-1][DEBUG ]  stderr: no valid command found; 10 closest matches:
> [rockpro64-1][DEBUG ] osd tier add-cache   
> [rockpro64-1][DEBUG ] osd tier remove-overlay 
> [rockpro64-1][DEBUG ] osd out  [...]
> [rockpro64-1][DEBUG ] osd in  [...]
> [rockpro64-1][DEBUG ] osd down  [...]
> [rockpro64-1][DEBUG ]  stderr: osd unset
> full|pause|noup|nodown|noout|noin|nobackfill|norebalance|norecover|noscrub|nodeep-scrub|notieragent|nosnaptrim
> [rockpro64-1][DEBUG ] osd require-osd-release luminous|mimic
> {--yes-i-really-mean-it}
> [rockpro64-1][DEBUG ] osd erasure-code-profile ls
> [rockpro64-1][DEBUG ] osd set
> 

Re: [ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-08-30 Thread Alfredo Deza
On Thu, Aug 30, 2018 at 5:24 AM, Wolfgang Lendl
 wrote:
> Hi Alfredo,
>
>
> caught some logs:
> https://pastebin.com/b3URiA7p

That looks like there is an issue with bluestore. Maybe Radoslaw or
Adam might know a bit more.


>
> br
> wolfgang
>
> On 2018-08-29 15:51, Alfredo Deza wrote:
>> On Wed, Aug 29, 2018 at 2:06 AM, Wolfgang Lendl
>>  wrote:
>>> Hi,
>>>
>>> after upgrading my ceph clusters from 12.2.5 to 12.2.7  I'm experiencing 
>>> random crashes from SSD OSDs (bluestore) - it seems that HDD OSDs are not 
>>> affected.
>>> I destroyed and recreated some of the SSD OSDs which seemed to help.
>>>
>>> this happens on centos 7.5 (different kernels tested)
>>>
>>> /var/log/messages:
>>> Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation fault) **
>>> Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700 
>>> thread_name:bstore_kv_final
>>> Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470] general protection 
>>> ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in 
>>> libtcmalloc.so.4.4.5[7f8a997a8000+46000]
>>> Aug 29 10:24:08  systemd: ceph-osd@2.service: main process exited, 
>>> code=killed, status=11/SEGV
>>> Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered failed state.
>>> Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
>>> Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time over, scheduling 
>>> restart.
>>> Aug 29 10:24:28  systemd: Starting Ceph object storage daemon osd.2...
>>> Aug 29 10:24:28  systemd: Started Ceph object storage daemon osd.2.
>>> Aug 29 10:24:28  ceph-osd: starting osd.2 at - osd_data 
>>> /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
>>> Aug 29 10:24:35  ceph-osd: *** Caught signal (Segmentation fault) **
>>> Aug 29 10:24:35  ceph-osd: in thread 7f5f1e790700 thread_name:tp_osd_tp
>>> Aug 29 10:24:35  kernel: traps: tp_osd_tp[186933] general protection 
>>> ip:7f5f43103e63 sp:7f5f1e78a1c8 error:0 in 
>>> libtcmalloc.so.4.4.5[7f5f430cd000+46000]
>>> Aug 29 10:24:35  systemd: ceph-osd@0.service: main process exited, 
>>> code=killed, status=11/SEGV
>>> Aug 29 10:24:35  systemd: Unit ceph-osd@0.service entered failed state.
>>> Aug 29 10:24:35  systemd: ceph-osd@0.service failed
>> These systemd messages aren't usually helpful, try poking around
>> /var/log/ceph/ for the output on that one OSD.
>>
>> If those logs aren't useful either, try bumping up the verbosity (see
>> http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/#boot-time
>> )
>>> did I hit a known issue?
>>> any suggestions are highly appreciated
>>>
>>>
>>> br
>>> wolfgang
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>
> --
> Wolfgang Lendl
> IT Systems & Communications
> Medizinische Universität Wien
> Spitalgasse 23 / BT 88 /Ebene 00
> A-1090 Wien
> Tel: +43 1 40160-21231
> Fax: +43 1 40160-921200
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Error EINVAL: (22) Invalid argument While using ceph osd safe-to-destroy

2018-08-29 Thread Alfredo Deza
I am addressing the doc bug at https://github.com/ceph/ceph/pull/23801
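
For anyone hitting the same thing before that lands, a bash-friendly
version of that loop would be roughly (a sketch, with $ID being the
numeric OSD id):

    while ! ceph osd safe-to-destroy osd.$ID ; do sleep 60 ; done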

On Mon, Aug 27, 2018 at 2:08 AM, Eugen Block  wrote:
> Hi,
>
> could you please paste your osd tree and the exact command you try to
> execute?
>
>> Extra note, the while loop in the instructions look like it's bad.  I had
>> to change it to make it work in bash.
>
>
> The documented command didn't work for me either.
>
> Regards,
> Eugen
>
> Zitat von Robert Stanford :
>
>
>> I am following the procedure here:
>> http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/
>>
>>  When I get to the part to run "ceph osd safe-to-destroy $ID" in a while
>> loop, I get a EINVAL error.  I get this error when I run "ceph osd
>> safe-to-destroy 0" on the command line by itself, too.  (Extra note, the
>> while loop in the instructions look like it's bad.  I had to change it to
>> make it work in bash.)
>>
>>  I know my ID is correct because I was able to use it in the previous step
>> (ceph osd out $ID).  I also substituted $ID for the number on the command
>> line and got the same error.  Why isn't this working?
>>
>> Error: Error EINVAL: (22) Invalid argument While using ceph osd
>> safe-to-destroy
>>
>>  Thank you
>> R
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-08-29 Thread Alfredo Deza
On Wed, Aug 29, 2018 at 2:06 AM, Wolfgang Lendl
 wrote:
> Hi,
>
> after upgrading my ceph clusters from 12.2.5 to 12.2.7  I'm experiencing 
> random crashes from SSD OSDs (bluestore) - it seems that HDD OSDs are not 
> affected.
> I destroyed and recreated some of the SSD OSDs which seemed to help.
>
> this happens on centos 7.5 (different kernels tested)
>
> /var/log/messages:
> Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation fault) **
> Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700 thread_name:bstore_kv_final
> Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470] general protection 
> ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in 
> libtcmalloc.so.4.4.5[7f8a997a8000+46000]
> Aug 29 10:24:08  systemd: ceph-osd@2.service: main process exited, 
> code=killed, status=11/SEGV
> Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered failed state.
> Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
> Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time over, scheduling 
> restart.
> Aug 29 10:24:28  systemd: Starting Ceph object storage daemon osd.2...
> Aug 29 10:24:28  systemd: Started Ceph object storage daemon osd.2.
> Aug 29 10:24:28  ceph-osd: starting osd.2 at - osd_data 
> /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
> Aug 29 10:24:35  ceph-osd: *** Caught signal (Segmentation fault) **
> Aug 29 10:24:35  ceph-osd: in thread 7f5f1e790700 thread_name:tp_osd_tp
> Aug 29 10:24:35  kernel: traps: tp_osd_tp[186933] general protection 
> ip:7f5f43103e63 sp:7f5f1e78a1c8 error:0 in 
> libtcmalloc.so.4.4.5[7f5f430cd000+46000]
> Aug 29 10:24:35  systemd: ceph-osd@0.service: main process exited, 
> code=killed, status=11/SEGV
> Aug 29 10:24:35  systemd: Unit ceph-osd@0.service entered failed state.
> Aug 29 10:24:35  systemd: ceph-osd@0.service failed

These systemd messages aren't usually helpful; try poking around
/var/log/ceph/ for the log output of that one OSD.

If those logs aren't useful either, try bumping up the verbosity (see
http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/#boot-time
)
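
For boot-time problems that usually means something like this in
ceph.conf on the affected node before restarting the OSD (just an
example, pick the subsystems that look relevant):

    [osd]
    debug osd = 20
    debug bluestore = 20
    debug bluefs = 20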
>
> did I hit a known issue?
> any suggestions are highly appreciated
>
>
> br
> wolfgang
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-08-23 Thread Alfredo Deza
On Thu, Aug 23, 2018 at 11:32 AM, Hervé Ballans
 wrote:
> Le 23/08/2018 à 16:13, Alfredo Deza a écrit :
>
> What you mean is that, at this stage, I must directly declare the UUID paths
> in value of --block.db (i.e. replace /dev/nvme0n1p1 with its PARTUUID), that
> is ?
>
> No, this all looks correct. How does the ceph-volume.log and
> ceph-volume-systemd.log look when you are booting up for the OSDs that
> aren't coming up?
>
> Anything useful in there?
>
>
> ceph-volume.log (extract)
> [2018-08-20 11:26:29,430][ceph_volume.process][INFO  ] Running command:
> systemctl start ceph-osd@1
> [2018-08-20 11:26:32,268][ceph_volume.main][INFO  ] Running command:
> ceph-volume  lvm trigger 1-4a9954ce-0a0f-432b-a91d-eaacb45287d4
> [2018-08-20 11:26:32,269][ceph_volume.process][INFO  ] Running command:
> /sbin/lvs --noheadings --readonly --separator=";" -o
> lv_tags,lv_path,lv_name,vg_name,lv_uuid
> [2018-08-20 11:26:32,347][ceph_volume.process][INFO  ] stdout
> ceph.block_device=/dev/ceph-425a36ca-8b60-471d-80ed-19726714ea3a/osd-block-9380cd27-c0fe-4ede-9ed3-d09eff545037,ceph.block_uuid=kkykJn-EmUa-EvZj-Ffr7-dk06-VUG0-nhnUvB,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=838506b7-e0c6-4022-9e17-2d1cf9458be6,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/nvme0n1p4,ceph.db_uuid=ec4cf738-1627-4ddf-a7ef-91810fb8aaab,ceph.encrypted=0,ceph.osd_fsid=9380cd27-c0fe-4ede-9ed3-d09eff545037,ceph.osd_id=3,ceph.type=block,ceph.vdo=0";"/dev/ceph-425a36ca-8b60-471d-80ed-19726714ea3a/osd-block-9380cd27-c0fe-4ede-9ed3-d09eff545037";"osd-block-9380cd27-c0fe-4ede-9ed3-d09eff545037";"ceph-425a36ca-8b60-471d-80ed-19726714ea3a";"kkykJn-EmUa-EvZj-Ffr7-dk06-VUG0-nhnUvB
> [2018-08-20 11:26:32,347][ceph_volume.process][INFO  ] stdout
> ceph.block_device=/dev/ceph-597bbc7c-33ba-4020-a1c6-fe4297ad0421/osd-block-5d4af2fc-388c-4795-9d1a-53ad8aba56d8,ceph.block_uuid=WdXALj-0bCs-Ndfy-xGIN-q1rF-hcmg-utIiNB,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=838506b7-e0c6-4022-9e17-2d1cf9458be6,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/nvme0n1p8,ceph.db_uuid=9642a688-0991-4a19-b49a-994a22edaf60,ceph.encrypted=0,ceph.osd_fsid=5d4af2fc-388c-4795-9d1a-53ad8aba56d8,ceph.osd_id=7,ceph.type=block,ceph.vdo=0";"/dev/ceph-597bbc7c-33ba-4020-a1c6-fe4297ad0421/osd-block-5d4af2fc-388c-4795-9d1a-53ad8aba56d8";"osd-block-5d4af2fc-388c-4795-9d1a-53ad8aba56d8";"ceph-597bbc7c-33ba-4020-a1c6-fe4297ad0421";"WdXALj-0bCs-Ndfy-xGIN-q1rF-hcmg-utIiNB
> [2018-08-20 11:26:32,347][ceph_volume.process][INFO  ] stdout
> ceph.block_device=/dev/ceph-5dbc3f3a-d781-4d22-9e9c-8c7a74ce39fb/osd-block-b8e82f22-e993-4458-984b-90232b8b3d55,ceph.block_uuid=Q1Aeu5-bm6V-m9yB-O3vX-SFvX-8oHo-SNkrfe,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=838506b7-e0c6-4022-9e17-2d1cf9458be6,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/nvme0n1p3,ceph.db_uuid=c7bb1f33-65f7-4116-ae27-0d73dde82208,ceph.encrypted=0,ceph.osd_fsid=b8e82f22-e993-4458-984b-90232b8b3d55,ceph.osd_id=2,ceph.type=block,ceph.vdo=0";"/dev/ceph-5dbc3f3a-d781-4d22-9e9c-8c7a74ce39fb/osd-block-b8e82f22-e993-4458-984b-90232b8b3d55";"osd-block-b8e82f22-e993-4458-984b-90232b8b3d55";"ceph-5dbc3f3a-d781-4d22-9e9c-8c7a74ce39fb";"Q1Aeu5-bm6V-m9yB-O3vX-SFvX-8oHo-SNkrfe
> [2018-08-20 11:26:32,347][ceph_volume.process][INFO  ] stdout
> ceph.block_device=/dev/ceph-70e7b5f7-408d-4810-9d92-d6679d621db0/osd-block-02540fff-5478-4a67-bf5c-679c72150e8d,ceph.block_uuid=jp7YJ5-vNcj-93at-49LK-bSWT-ke4m-Y0Brzy,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=838506b7-e0c6-4022-9e17-2d1cf9458be6,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/nvme0n1p5,ceph.db_uuid=b5ea7d3a-8c90-4163-a2af-a4dfa045ced6,ceph.encrypted=0,ceph.osd_fsid=02540fff-5478-4a67-bf5c-679c72150e8d,ceph.osd_id=4,ceph.type=block,ceph.vdo=0";"/dev/ceph-70e7b5f7-408d-4810-9d92-d6679d621db0/osd-block-02540fff-5478-4a67-bf5c-679c72150e8d";"osd-block-02540fff-5478-4a67-bf5c-679c72150e8d";"ceph-70e7b5f7-408d-4810-9d92-d6679d621db0";"jp7YJ5-vNcj-93at-49LK-bSWT-ke4m-Y0Brzy
> [2018-08-20 11:26:32,347][ceph_volume.process][INFO  ] stdout
> ceph.block_device=/dev/ceph-766bd78c-ed1a-4e27-8b4d-7adc4c4f2f0d/osd-block-98bfb597-009b-4e88-bc5e-dd22587d21fe,ceph.block_uuid=enriOd-To2e-lLWi-Dc91-keUr-1tf7-4Hvrj2,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=838506b7-e0c6-4022-9e17-2d1cf9458be6,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/nvme0n1p1,ceph.db_uuid=99870da1-b7d1-479d-ad63-14407c593278,ceph.encrypted=0,ceph.osd_fsid=98bfb597-009b-4e88-bc5e-dd22587d21fe,ceph.osd_id=0,ceph.type=block,ceph.vdo=0";"/dev/ceph-766bd78c-ed1a-4e27-8b4d-7adc4c4f2f0d/osd-block-98bfb597-009b-4e88-bc5e-dd22587d21fe";"osd-block-98b

Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-08-23 Thread Alfredo Deza
On Thu, Aug 23, 2018 at 9:56 AM, Hervé Ballans
 wrote:
> Le 23/08/2018 à 15:20, Alfredo Deza a écrit :
>
> Thanks Alfredo for your reply. I'm using the very last version of Luminous
> (12.2.7) and ceph-deploy (2.0.1).
> I have no problem in creating my OSD, that's work perfectly.
> My issue only concerns the problem of the mount names of the NVMe partitions
> which change after a reboot when there are more than one NVMe device on the
> OSD node.
>
> ceph-volume is pretty resilient to partition changes because it stores
> the PARTUUID of the partition in LVM, and it queries
> it each time at boot. Note that for bluestore there is no mounting
> whatsoever. Have you created partitions with a PARTUUID on the nvme
> devices for block.db ?
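
If you want to double check that mapping by hand (a sketch, adjust the
partition name), compare the kernel's idea of the PARTUUID with the
ceph.db_uuid tag that ceph-volume stored:

    blkid -s PARTUUID -o value /dev/nvme0n1p1
    ceph-volume lvm list

The two values should match for the OSD in question.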
>
>
> Here is how I created my BlueStore OSDs (in the first OSD node) :
>
> 1) On the OSD node node-osd0, I first created block partitions on the NVMe
> device (PM1725a 800GB), like this :
>
> # parted /dev/nvme0n1 mklabel gpt
>
> # echo "1 0 10
> 2 10 20
> 3 20 30
> 4 30 40
> 5 40 50
> 6 50 60
> 7 60 70
> 8 70 80
> 9 80 90
> 10 90 100" | while read num beg end; do parted /dev/nvme0n1 mkpart $num
> $beg% $end%; done
>
> Extract of cat /proc/partitions :
>
>  2592  781412184 nvme1n1
>  2593  781412184 nvme0n1
>  2595   78140416 nvme0n1p1
>  2596   78141440 nvme0n1p2
>  2597   78140416 nvme0n1p3
>  2598   78141440 nvme0n1p4
>  2599   78141440 nvme0n1p5
>  259   10   78141440 nvme0n1p6
>  259   11   78140416 nvme0n1p7
>  259   12   78141440 nvme0n1p8
>  259   13   78141440 nvme0n1p9
>  259   15   78140416 nvme0n1p10
>
> 2) Then, from the admin node, I created my 10 first OSDs like this :
>
> echo "/dev/sda /dev/nvme0n1p1
> /dev/sdb /dev/nvme0n1p2
> /dev/sdc /dev/nvme0n1p3
> /dev/sdd /dev/nvme0n1p4
> /dev/sde /dev/nvme0n1p5
> /dev/sdf /dev/nvme0n1p6
> /dev/sdg /dev/nvme0n1p7
> /dev/sdh /dev/nvme0n1p8
> /dev/sdi /dev/nvme0n1p9
> /dev/sdj /dev/nvme0n1p10" | while read hdd db; do ceph-deploy osd create
> --debug --bluestore --data $hdd --block-db $db node-osd0; done
>
> What you mean is that, at this stage, I must directly declare the UUID paths
> in value of --block.db (i.e. replace /dev/nvme0n1p1 with its PARTUUID), that
> is ?

No, this all looks correct. How does the ceph-volume.log and
ceph-volume-systemd.log look when you are booting up for the OSDs that
aren't coming up?

Anything useful in there?
>
> Currently, I created 60 OSDs like that. The ceph cluster is HEALTH_OK and
> all osds are up and in. But I'm not yet in prodcution and there is only test
> data on it, so I can destroy everything and rebuild my OSDs.
> That's what you advise me to do there, taking care to specify the PARTUUID
> for the block.db instead of the device names ?
>
>
> For instance, if I have two NVMe devices, the first time, the first device
> is mounted with name /dev/nvme0n1 and the second device with name
> /dev/nvme1n1. After node restart, these names can be reversed, that is, the
> first device named /dev/nvme1n1 and the second one /dev/nvme0n1 ! The result
> is that OSDs no longer find their metadata and do not start up...
>
> This sounds very odd. Could you clarify where block and block.db are?
> Also useful here would be to take a look at
> /var/log/ceph/ceph-volume-systemd.log and ceph-volume.log to
> see how ceph-volume is trying to get this OSD up and running.
>
> Also useful would be to check `ceph-volume lvm list` to verify that
> regardless of the name change, it recognizes the correct partition
> mapped to the OSD
>
> Oops !
>
> # ceph-volume lvm list
> -->  KeyError: 'devices'

Can you re-run this like:


CEPH_VOLUME_DEBUG=1 ceph-volume lvm list

And paste the output? I think this has been fixed since, but want to
double check

>
> Thank you again,
> Hervé
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-08-23 Thread Alfredo Deza
On Thu, Aug 23, 2018 at 9:12 AM, Hervé Ballans
 wrote:
> Le 23/08/2018 à 12:51, Alfredo Deza a écrit :
>>
>> On Thu, Aug 23, 2018 at 5:42 AM, Hervé Ballans
>>  wrote:
>>>
>>> Hello all,
>>>
>>> I would like to continue a thread that dates back to last May (sorry if
>>> this
>>> is not a good practice ?..)
>>>
>>> Thanks David for your usefil tips on this thread.
>>> In my side, I created my OSDs with ceph-deploy (in place of ceph-volume)
>>> [1], but this is exactly the same context as this mentioned on this
>>> thread
>>> (hdd  drive for OSDs and wal/db partitions on NVMe device).
>>>
>>> The problem I encounter is that the script that fixes block.db partitions
>>> by
>>> their UUID works very well in live but does not resist to the reboot of
>>> the
>>> OSD node. If I restart the server, the symbolic links of block.db
>>> automatically go up with the device name /dev/nvme...
>>> The problem gets worse when we have 2 NVMe devices on the same node
>>> beacuse
>>> in this case, it happens that the paths to the block.db partitions are
>>> reversed and obviously OSDs don't start !
>>
>> You didn't mention what versions of ceph-deploy and Ceph you are
>> using. Since you brought up partitions and OSDs that are not coming
>> up, it seems
>> that is related to using ceph-disk and ceph-deploy 1.5.X
>>
>> I would suggest trying out the newer version of ceph-deploy (2.0.X)
>> and use ceph-volume, the one caveat being if you need a separate
>> block.db on the NVMe device
>> you would need to create the LV yourself.
>
>
> Thanks Alfredo for your reply. I'm using the very last version of Luminous
> (12.2.7) and ceph-deploy (2.0.1).
> I have no problem in creating my OSD, that's work perfectly.
> My issue only concerns the problem of the mount names of the NVMe partitions
> which change after a reboot when there are more than one NVMe device on the
> OSD node.

ceph-volume is pretty resilient to partition changes because it stores
the PARTUUID of the partition in LVM, and it queries
it each time at boot. Note that for bluestore there is no mounting
whatsoever. Have you created partitions with a PARTUUID on the nvme
devices for block.db ?

>
> For instance, if I have two NVMe devices, the first time, the first device
> is mounted with name /dev/nvme0n1 and the second device with name
> /dev/nvme1n1. After node restart, these names can be reversed, that is, the
> first device named /dev/nvme1n1 and the second one /dev/nvme0n1 ! The result
> is that OSDs no longer find their metadata and do not start up...

This sounds very odd. Could you clarify where block and block.db are?
Also useful here would be to take a look at
/var/log/ceph/ceph-volume-systemd.log and ceph-volume.log to
see how ceph-volume is trying to get this OSD up and running.

Also useful would be to check `ceph-volume lvm list` to verify that
regardless of the name change, it recognizes the correct partition
mapped to the OSD
>
>
>> Some of the manual steps are covered in the bluestore config
>> reference:
>> http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#block-and-block-db
>>>
>>> As I'm not yet in production, I can probably recreate all my OSDs by
>>> forcing
>>> the path to the block.db partitions with UUID, but I would like to know
>>> if
>>> there was a way to "freeze" the configuration of block.db paths by their
>>> UUID ("a posteriori") ?
>>>
>>> Or maybe (but this is more a system administration issue) that there is a
>>> way on Linux system to force an NVMe disk to be mounted with a fixed
>>> device
>>> name ? (I specify here that my NVMe partitions do not have a filesystem).
>>>
>>> Thanks for your help,
>>> Hervé
>>>
>>> [1] from admin node
>>> ceph-deploy osd create --debug --bluestore --data $hdd --block-db $db
>>> $osdnode
>>>
>>> Le 11/05/2018 à 18:46, David Turner a écrit :
>>>
>>> # Create the OSD
>>> echo "/dev/sdb /dev/nvme0n1p2
>>> /dev/sdc /dev/nvme0n1p3" | while read hdd db; do
>>>ceph-volume lvm create --bluestore --data $hdd --block.db $db
>>> done
>>>
>>> # Fix the OSDs to look for the block.db partition by UUID instead of its
>>> device name.
>>> for db in /var/lib/ceph/osd/*/block.db; do
>>>dev=$(readlink $db | grep -Eo
>>> nvme[[:digit:]]+n[[:digit:]]+p[[:digit:]]+
>>> || echo false)
>>>if [[ "$dev" != false ]]; then
>>>  uuid=$(ls -l /dev/disk/by-partuuid/ | awk '/'${dev}'$/ {print $9}')
>>>  ln -sf /dev/disk/by-partuuid/$uuid $db
>>>fi
>>> done
>>> systemctl restart ceph-osd.target
>>>
>>>
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-08-23 Thread Alfredo Deza
On Thu, Aug 23, 2018 at 5:42 AM, Hervé Ballans
 wrote:
> Hello all,
>
> I would like to continue a thread that dates back to last May (sorry if this
> is not a good practice ?..)
>
> Thanks David for your usefil tips on this thread.
> In my side, I created my OSDs with ceph-deploy (in place of ceph-volume)
> [1], but this is exactly the same context as this mentioned on this thread
> (hdd  drive for OSDs and wal/db partitions on NVMe device).
>
> The problem I encounter is that the script that fixes block.db partitions by
> their UUID works very well in live but does not resist to the reboot of the
> OSD node. If I restart the server, the symbolic links of block.db
> automatically go up with the device name /dev/nvme...
> The problem gets worse when we have 2 NVMe devices on the same node beacuse
> in this case, it happens that the paths to the block.db partitions are
> reversed and obviously OSDs don't start !

You didn't mention what versions of ceph-deploy and Ceph you are
using. Since you brought up partitions and OSDs that are not coming
up, it seems
that is related to using ceph-disk and ceph-deploy 1.5.X

I would suggest trying out the newer version of ceph-deploy (2.0.X)
and using ceph-volume; the one caveat is that if you need a separate
block.db on the NVMe device, you would need to create the LV yourself.

Some of the manual steps are covered in the bluestore config
reference: 
http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#block-and-block-db
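
In practice that would be something along these lines (a sketch; VG/LV
names and sizes are placeholders):

    vgcreate ceph-db-0 /dev/nvme0n1
    lvcreate -L 78G -n db-sda ceph-db-0
    ceph-deploy osd create --bluestore --data /dev/sda --block-db ceph-db-0/db-sda <osd-node>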
>
> As I'm not yet in production, I can probably recreate all my OSDs by forcing
> the path to the block.db partitions with UUID, but I would like to know if
> there was a way to "freeze" the configuration of block.db paths by their
> UUID ("a posteriori") ?
>
> Or maybe (but this is more a system administration issue) that there is a
> way on Linux system to force an NVMe disk to be mounted with a fixed device
> name ? (I specify here that my NVMe partitions do not have a filesystem).
>
> Thanks for your help,
> Hervé
>
> [1] from admin node
> ceph-deploy osd create --debug --bluestore --data $hdd --block-db $db
> $osdnode
>
> Le 11/05/2018 à 18:46, David Turner a écrit :
>
> # Create the OSD
> echo "/dev/sdb /dev/nvme0n1p2
> /dev/sdc /dev/nvme0n1p3" | while read hdd db; do
>   ceph-volume lvm create --bluestore --data $hdd --block.db $db
> done
>
> # Fix the OSDs to look for the block.db partition by UUID instead of its
> device name.
> for db in /var/lib/ceph/osd/*/block.db; do
>   dev=$(readlink $db | grep -Eo nvme[[:digit:]]+n[[:digit:]]+p[[:digit:]]+
> || echo false)
>   if [[ "$dev" != false ]]; then
> uuid=$(ls -l /dev/disk/by-partuuid/ | awk '/'${dev}'$/ {print $9}')
> ln -sf /dev/disk/by-partuuid/$uuid $db
>   fi
> done
> systemctl restart ceph-osd.target
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore options in ceph.conf not being used

2018-08-22 Thread Alfredo Deza
On Wed, Aug 22, 2018 at 2:48 PM, David Turner  wrote:
> The config settings for DB and WAL size don't do anything.  For journal
> sizes they would be used for creating your journal partition with ceph-disk,
> but ceph-volume does not use them for creating bluestore OSDs.  You need to
> create the partitions for the DB and WAL yourself and supply those
> partitions to the ceph-volume command.  I have heard that they're working on
> this for future releases, but currently those settings don't do anything.

This is accurate: ceph-volume as of the latest release doesn't do
anything with them, because it doesn't create those partitions for the
user.

We are getting close to getting that functionality rolled out, but it
isn't ready unless you are using master (please don't use master :))
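
Concretely, supplying the partitions yourself looks something like this
(placeholder device names, and the partitions need to exist already):

    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2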


>
> On Wed, Aug 22, 2018 at 1:34 PM Robert Stanford 
> wrote:
>>
>>
>>  I have created new OSDs for Ceph Luminous.  In my Ceph.conf I have
>> specified that the db size be 10GB, and the wal size be 1GB.  However when I
>> type ceph daemon osd.0 perf dump I get: bluestore_allocated": 5963776
>>
>>  I think this means that the bluestore db is using the default, and not
>> the value of bluestore block db size in the ceph.conf.  Why is this?
>>
>>  Thanks
>>  R
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mimic osd fails to start.

2018-08-20 Thread Alfredo Deza
On Mon, Aug 20, 2018 at 10:23 AM, Daznis  wrote:
> Hello,
>
> It appears that something is horribly wrong with the cluster itself. I
> can't create or add any new osds to it at all.

Have you added new monitors? Or replaced monitors? I would check that
all your versions match; something seems to be expecting different
versions.

The "Invalid argument" problem is a common thing we see when that happens.

Something that might help a bit here is if you run ceph-medic against
your cluster:

http://docs.ceph.com/ceph-medic/master/
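
The basic invocation is just the following (assuming ceph-medic can
reach the nodes listed in its inventory over SSH, which may need
adjusting first):

    ceph-medic check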



> On Mon, Aug 20, 2018 at 11:04 AM Daznis  wrote:
>>
>> Hello,
>>
>>
>> Zapping the journal didn't help. I tried to create the journal after
>> zapping it. Also failed. I'm not really sure why this happens.
>>
>> Looking at the monitor logs with 20/20 debug I'm seeing these errors:
>>
>> 2018-08-20 08:57:58.753 7f9d85934700  0 mon.mon02@1(peon) e4
>> handle_command mon_command({"prefix": "osd crush set-device-class",
>> "class": "ssd", "ids": ["48"]} v 0) v1
>> 2018-08-20 08:57:58.753 7f9d85934700 20 is_capable service=osd
>> command=osd crush set-device-class read write on cap allow profile osd
>> 2018-08-20 08:57:58.753 7f9d85934700 20  allow so far , doing grant
>> allow profile osd
>> 2018-08-20 08:57:58.753 7f9d85934700 20  match
>> 2018-08-20 08:57:58.753 7f9d85934700 10 mon.mon02@1(peon) e4
>> _allowed_command capable
>> 2018-08-20 08:57:58.753 7f9d85934700  0 log_channel(audit) log [INF] :
>> from='osd.48 10.24.52.17:6800/153683' entity='osd.48' cmd=[{"prefix":
>> "osd crush set-device-class", "class": "ssd", "ids": ["48"]}]:
>> dispatch
>> 2018-08-20 08:57:58.753 7f9d85934700 10 mon.mon02@1(peon).osd e46327
>> preprocess_query mon_command({"prefix": "osd crush set-device-class",
>> "class": "ssd", "ids": ["48"]} v 0) v1 from osd.48
>> 10.24.52.17:6800/153683
>> 2018-08-20 08:57:58.753 7f9d85934700 10 mon.mon02@1(peon) e4
>> forward_request 4 request mon_command({"prefix": "osd crush
>> set-device-class", "class": "ssd", "ids": ["48"]} v 0) v1 features
>> 4611087854031142907
>> 2018-08-20 08:57:58.753 7f9d85934700 20 mon.mon02@1(peon) e4
>> _ms_dispatch existing session 0x55b4ec482a80 for mon.1
>> 10.24.52.11:6789/0
>> 2018-08-20 08:57:58.753 7f9d85934700 20 mon.mon02@1(peon) e4  caps allow *
>> 2018-08-20 08:57:58.753 7f9d85934700 10 mon.mon02@1(peon).log
>> v10758065 preprocess_query log(1 entries from seq 4 at 2018-08-20
>> 08:57:58.755306) v1 from mon.1 10.24.52.11:6789/0
>> 2018-08-20 08:57:58.753 7f9d85934700 10 mon.mon02@1(peon).log
>> v10758065 preprocess_log log(1 entries from seq 4 at 2018-08-20
>> 08:57:58.755306) v1 from mon.1
>> 2018-08-20 08:57:58.753 7f9d85934700 20 is_capable service=log
>> command= write on cap allow *
>> 2018-08-20 08:57:58.753 7f9d85934700 20  allow so far , doing grant allow *
>> 2018-08-20 08:57:58.753 7f9d85934700 20  allow all
>> 2018-08-20 08:57:58.753 7f9d85934700 10 mon.mon02@1(peon) e4
>> forward_request 5 request log(1 entries from seq 4 at 2018-08-20
>> 08:57:58.755306) v1 features 4611087854031142907
>> 2018-08-20 08:57:58.754 7f9d85934700 20 mon.mon02@1(peon) e4
>> _ms_dispatch existing session 0x55b4ec4828c0 for mon.0
>> 10.24.52.10:6789/0
>> 2018-08-20 08:57:58.754 7f9d85934700 20 mon.mon02@1(peon) e4  caps allow *
>> 2018-08-20 08:57:58.754 7f9d85934700 20 is_capable service=mon
>> command= read on cap allow *
>> 2018-08-20 08:57:58.754 7f9d85934700 20  allow so far , doing grant allow *
>> 2018-08-20 08:57:58.754 7f9d85934700 20  allow all
>> 2018-08-20 08:57:58.754 7f9d85934700 20 is_capable service=mon
>> command= exec on cap allow *
>> 2018-08-20 08:57:58.754 7f9d85934700 20  allow so far , doing grant allow *
>> 2018-08-20 08:57:58.754 7f9d85934700 20  allow all
>> 2018-08-20 08:57:58.754 7f9d85934700 10 mon.mon02@1(peon) e4
>> handle_route mon_command_ack([{"prefix": "osd crush set-device-class",
>> "class": "ssd", "ids": ["48"]}]=-22 (22) Invalid argument v46327) v1
>> to unknown.0 -
>> 2018-08-20 08:57:58.785 7f9d85934700 10 mon.mon02@1(peon) e4
>> ms_handle_reset 0x55b4ecf4b200 10.24.52.17:6800/153683
>> 2018-08-20 08:57:58.785 7f9d85934700 10 mon.mon02@1(peon) e4
>> reset/close on session osd.48 10.24.52.17:6800/153683
>> 2018-08-20 08:57:58.785 7f9d85934700 10 mon.mon02@1(peon) e4
>

Re: [ceph-users] Mimic osd fails to start.

2018-08-18 Thread Alfredo Deza
On Fri, Aug 17, 2018 at 7:05 PM, Daznis  wrote:
> Hello,
>
>
> I have replace one of our failed OSD drives and recreated a new osd
> with ceph-deploy and it failes to start.

Is it possible you haven't zapped the journal on nvme0n1p13 ?
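
If not, zapping just that partition first should be enough, assuming no
other OSD is still using it:

ceph-volume lvm zap /dev/nvme0n1p13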



>
> Command: ceph-deploy --overwrite-conf osd create --filestore
> --zap-disk --data /dev/bcache0 --journal /dev/nvme0n1p13 
>
> Output off ceph-deploy:
> [ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
> [ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy
> --overwrite-conf osd create --filestore --zap-disk --data /dev/bcache0
> --journal /dev/nvme0n1p13 
> [ceph_deploy.cli][INFO  ] ceph-deploy options:
> [ceph_deploy.cli][INFO  ]  verbose   : False
> [ceph_deploy.cli][INFO  ]  bluestore : None
> [ceph_deploy.cli][INFO  ]  cd_conf   :
> 
> [ceph_deploy.cli][INFO  ]  cluster   : ceph
> [ceph_deploy.cli][INFO  ]  fs_type   : xfs
> [ceph_deploy.cli][INFO  ]  block_wal : None
> [ceph_deploy.cli][INFO  ]  default_release   : False
> [ceph_deploy.cli][INFO  ]  username  : None
> [ceph_deploy.cli][INFO  ]  journal   : /dev/nvme0n1p13
> [ceph_deploy.cli][INFO  ]  subcommand: create
> [ceph_deploy.cli][INFO  ]  host  : 
> [ceph_deploy.cli][INFO  ]  filestore : True
> [ceph_deploy.cli][INFO  ]  func  : <function osd at 0x7f8622194848>
> [ceph_deploy.cli][INFO  ]  ceph_conf : None
> [ceph_deploy.cli][INFO  ]  zap_disk  : True
> [ceph_deploy.cli][INFO  ]  data  : /dev/bcache0
> [ceph_deploy.cli][INFO  ]  block_db  : None
> [ceph_deploy.cli][INFO  ]  dmcrypt   : False
> [ceph_deploy.cli][INFO  ]  overwrite_conf: True
> [ceph_deploy.cli][INFO  ]  dmcrypt_key_dir   :
> /etc/ceph/dmcrypt-keys
> [ceph_deploy.cli][INFO  ]  quiet : False
> [ceph_deploy.cli][INFO  ]  debug : False
> [ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data
> device /dev/bcache0
> [][DEBUG ] connected to host: 
> [][DEBUG ] detect platform information from remote host
> [][DEBUG ] detect machine type
> [][DEBUG ] find the location of an executable
> [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.5.1804 Core
> [ceph_deploy.osd][DEBUG ] Deploying osd to 
> [][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
> [][DEBUG ] find the location of an executable
> [ceph_deploy.osd][WARNIN] zapping is no longer supported when preparing
> [][INFO  ] Running command: /usr/sbin/ceph-volume --cluster
> ceph lvm create --filestore --data /dev/bcache0 --journal
> /dev/nvme0n1p13
> [][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
> [][DEBUG ] Running command: /bin/ceph --cluster ceph --name
> client.bootstrap-osd --keyring
> /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> a503ae5e-b5b9-40d7-b8b3-194f15e52082
> [][DEBUG ] Running command: /usr/sbin/vgcreate --force --yes
> ceph-a1ffe5bb-6f06-49c6-8aec-e3eb3a311162 /dev/bcache0
> [][DEBUG ]  stdout: Physical volume "/dev/bcache0"
> successfully created.
> [][DEBUG ]  stdout: Volume group
> "ceph-a1ffe5bb-6f06-49c6-8aec-e3eb3a311162" successfully created
> [][DEBUG ] Running command: /usr/sbin/lvcreate --yes -l
> 100%FREE -n osd-data-a503ae5e-b5b9-40d7-b8b3-194f15e52082
> ceph-a1ffe5bb-6f06-49c6-8aec-e3eb3a311162
> [][DEBUG ]  stdout: Logical volume
> "osd-data-a503ae5e-b5b9-40d7-b8b3-194f15e52082" created.
> [][DEBUG ] Running command: /bin/ceph-authtool --gen-print-key
> [][DEBUG ] Running command: /usr/sbin/mkfs -t xfs -f -i
> size=2048 
> /dev/ceph-a1ffe5bb-6f06-49c6-8aec-e3eb3a311162/osd-data-a503ae5e-b5b9-40d7-b8b3-194f15e52082
> [][DEBUG ]  stdout:
> meta-data=/dev/ceph-a1ffe5bb-6f06-49c6-8aec-e3eb3a311162/osd-data-a503ae5e-b5b9-40d7-b8b3-194f15e52082
> isize=2048   agcount=4, agsize=244154112 blks
> [][DEBUG ]  =   sectsz=512
> attr=2, projid32bit=1
> [][DEBUG ]  =   crc=1
> finobt=0, sparse=0
> [][DEBUG ] data =   bsize=4096
> blocks=976616448, imaxpct=5
> [][DEBUG ]  =   sunit=0  swidth=0 
> blks
> [][DEBUG ] naming   =version 2  bsize=4096
> ascii-ci=0 ftype=1
> [][DEBUG ] log  =internal log   bsize=4096
> blocks=476863, version=2
> [][DEBUG ]  =   sectsz=512
> sunit=0 blks, lazy-count=1
> [][DEBUG ] realtime =none   extsz=4096
> blocks=0, rtextents=0
> [][DEBUG ] Running command: /bin/mount -t xfs -o
> rw,noatime,inode64,noquota,nodiratime,logbufs=8,logbsize=256k,attr2
> /dev/ceph-a1ffe5bb-6f06-49c6-8aec-e3eb3a311162/osd-data-a503ae5e-b5b9-40d7-b8b3-194f15e52082
> 

Re: [ceph-users] BlueStore upgrade steps broken

2018-08-17 Thread Alfredo Deza
On Fri, Aug 17, 2018 at 2:55 PM, David Turner  wrote:
> Does the block and/or wal partition need to be an LV?  I just passed
> ceph-volume the raw partition and it seems to be working fine.

A raw device is only allowed for data, but a partition is allowed for
wal/block.db

Not sure if by "raw partition" you mean an actual partition or a raw device
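
In other words, something along these lines should be fine (device
names here are only an example):

ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2

whereas passing the whole /dev/nvme0n1 device for block.db or block.wal
would not work (ceph-volume expects a partition with a PARTUUID or an
LV there).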

>
> On Fri, Aug 17, 2018 at 2:54 PM Alfredo Deza  wrote:
>>
>> On Fri, Aug 17, 2018 at 10:24 AM, Robert Stanford
>>  wrote:
>> >
>> >  I was using the ceph-volume create command, which I understand combines
>> > the
>> > prepare and activate functions.
>> >
>> > ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc --block.db
>> > /dev/sdb --block.wal /dev/sdb
>> >
>> >  That is the command context I've found on the web.  Is it wrong?
>>
>> It is very wrong :(
>>
>> If this was coming from our docs, it needs to be fixed because it will
>> never work.
>>
>> If you really want to place both block.db and block.wal on /dev/sdb,
>> you will need to create one LV for each. ceph-volume will not do this
>> for you.
>>
>> And then you can pass those newly created LVs like:
>>
>> ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc
>> --block.db sdb-vg/block-lv --block.wal sdb-vg/wal-lv
>>
>>
>>
>> >
>> >  Thanks
>> > R
>> >
>> > On Fri, Aug 17, 2018 at 5:55 AM Alfredo Deza  wrote:
>> >>
>> >> On Thu, Aug 16, 2018 at 9:00 PM, Robert Stanford
>> >>  wrote:
>> >> >
>> >  I am following the steps to replace my filestore journal with a bluestore
>> >> > journal
>> >> >
>> >> > (http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/).
>> >> > It
>> >> > is broken at ceph-volume lvm create.  Here is my error:
>> >> >
>> >> > --> Zapping successful for: /dev/sdc
>> >> > Preparing sdc
>> >> > Running command: /bin/ceph-authtool --gen-print-key
>> >> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
>> >> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
>> >> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
>> >> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
>> >> > ff523216-350d-4ca0-9022-0c17662c2c3b 10
>> >> > Running command: vgcreate --force --yes
>> >> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a /dev/sdc
>> >> >  stdout: Physical volume "/dev/sdc" successfully created.
>> >> >  stdout: Volume group "ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a"
>> >> > successfully created
>> >> > Running command: lvcreate --yes -l 100%FREE -n
>> >> > osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b
>> >> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a
>> >> >  stdout: Logical volume
>> >> > "osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b"
>> >> > created.
>> >> > --> blkid could not detect a PARTUUID for device: sdb
>> >> > --> Was unable to complete a new OSD, will rollback changes
>> >> > --> OSD will be destroyed, keeping the ID because it was provided
>> >> > with
>> >> > --osd-id
>> >> > Running command: ceph osd destroy osd.10 --yes-i-really-mean-it
>> >> >  stderr: destroyed osd.10
>> >> > -->  RuntimeError: unable to use device
>> >> >
>> >> >  Note that SDB is the SSD journal.  It has been zapped prior.
>> >>
>> >> I can't see what the actual command you used is, but I am guessing you
>> >> did something like:
>> >>
>> >> ceph-volume lvm prepare --filestore --data /dev/sdb --journal /dev/sdb
>> >>
>> >> Which is not possible. There are a few ways you can do this (see:
>> >> http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#filestore )
>> >>
>> >> With a raw device and a pre-created partition (must have a PARTUUID):
>> >>
>> >> ceph-volume lvm prepare --data /dev/sdb --journal /dev/sdc1
>> >>
>> >> With LVs:
>> >>
>> >> ceph-volume lvm prepare --data vg/my-data --journal vg/my-journal
>> >>
>> >> With an LV for data and a partition:
>> >>
>> >> ceph-volume lvm prepare --data vg/my-data --journal /dev/sdc1
>> >>
>> >> >
>> >> >  What is going wrong, and how can I fix it?
>> >> >
>> >> >  Thank you
>> >> >  R
>> >> >
>> >> >
>> >> > ___
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore upgrade steps broken

2018-08-17 Thread Alfredo Deza
On Fri, Aug 17, 2018 at 11:47 AM, Robert Stanford
 wrote:
>
>  What's more, I was planning on using this single journal device (SSD) for 4
> OSDs.  With filestore I simply told each OSD to use this drive, sdb, on the
> command line, and it would create a new partition on that drive every time I
> created an OSD.  I thought it would be the same for BlueStore.  So that begs
> the question, how does one set up an SSD to hold journals for multiple OSDs,
> both db and wal?  Searching has yielded nothing.

We are working on expanding the tooling to do this for you, but until
then it is up to the user to create the LVs manually.

This section might help out a bit on what you would need (for block.db):

http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#block-and-block-db
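
As a rough sketch, carving one SSD into four block.db LVs could look
like this (device names and the 40G size are only examples, size them
per the link above):

vgcreate ceph-db-ssd /dev/sdb
lvcreate -L 40G -n db-0 ceph-db-ssd
lvcreate -L 40G -n db-1 ceph-db-ssd
lvcreate -L 40G -n db-2 ceph-db-ssd
lvcreate -L 40G -n db-3 ceph-db-ssd

Each OSD then gets created against its own LV, e.g.:

ceph-volume lvm create --bluestore --data /dev/sdc --block.db ceph-db-ssd/db-0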
>
>  R
>
>
> On Fri, Aug 17, 2018 at 9:48 AM David Turner  wrote:
>>
>> > ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc --block.db
>> > /dev/sdb --block.wal /dev/sdb
>>
>> That command can't work... You're telling it to use the entire /dev/sdb
>> device for the db and then again to do it for the wal, but you can only use
>> the entire device once.  There are 2 things wrong with that.  First, if
>> you're putting db and wal on the same device you do not need to specify the
>> wal.  Second if you are actually intending to use a partition on /dev/sdb
>> instead of the entire block device for this single OSD, then you need to
>> manually create a partition for it and supply that partition to the
>> --block.db command.
>>
>> Likely the command you want will end up being this after you create a
>> partition on the SSD for the db/wal.
>> `ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc --block.db
>> /dev/sdb1`
>>
>> On Fri, Aug 17, 2018 at 10:24 AM Robert Stanford 
>> wrote:
>>>
>>>
>>>  I was using the ceph-volume create command, which I understand combines
>>> the prepare and activate functions.
>>>
>>> ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc --block.db
>>> /dev/sdb --block.wal /dev/sdb
>>>
>>>  That is the command context I've found on the web.  Is it wrong?
>>>
>>>  Thanks
>>> R
>>>
>>> On Fri, Aug 17, 2018 at 5:55 AM Alfredo Deza  wrote:
>>>>
>>>> On Thu, Aug 16, 2018 at 9:00 PM, Robert Stanford
>>>>  wrote:
>>>> >
>>>> >  I am following the steps to replace my filestore journal with a bluestore
>>>> > journal
>>>> >
>>>> > (http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/). 
>>>> >  It
>>>> > is broken at ceph-volume lvm create.  Here is my error:
>>>> >
>>>> > --> Zapping successful for: /dev/sdc
>>>> > Preparing sdc
>>>> > Running command: /bin/ceph-authtool --gen-print-key
>>>> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
>>>> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
>>>> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
>>>> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
>>>> > ff523216-350d-4ca0-9022-0c17662c2c3b 10
>>>> > Running command: vgcreate --force --yes
>>>> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a /dev/sdc
>>>> >  stdout: Physical volume "/dev/sdc" successfully created.
>>>> >  stdout: Volume group "ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a"
>>>> > successfully created
>>>> > Running command: lvcreate --yes -l 100%FREE -n
>>>> > osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b
>>>> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a
>>>> >  stdout: Logical volume
>>>> > "osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b"
>>>> > created.
>>>> > --> blkid could not detect a PARTUUID for device: sdb
>>>> > --> Was unable to complete a new OSD, will rollback changes
>>>> > --> OSD will be destroyed, keeping the ID because it was provided with
>>>> > --osd-id
>>>> > Running command: ceph osd destroy osd.10 --yes-i-really-mean-it
>>>> >  stderr: destroyed osd.10
>>>> > -->  RuntimeError: unable to use device
>>>> >
>>>> >  Note that SDB is the SSD journal.  It has been zapped prior.
>>>>
>>>> I can't see what the actual command you used is, but I 

Re: [ceph-users] BlueStore upgrade steps broken

2018-08-17 Thread Alfredo Deza
On Fri, Aug 17, 2018 at 10:24 AM, Robert Stanford
 wrote:
>
>  I was using the ceph-volume create command, which I understand combines the
> prepare and activate functions.
>
> ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc --block.db
> /dev/sdb --block.wal /dev/sdb
>
>  That is the command context I've found on the web.  Is it wrong?

It is very wrong :(

If this was coming from our docs, it needs to be fixed because it will
never work.

If you really want to place both block.db and block.wal on /dev/sdb,
you will need to create one LV for each. ceph-volume will not do this
for you.

And then you can pass those newly created LVs like:

ceph-volume lvm create --osd-id 0 --bluestore --data /dev/sdc
--block.db sdb-vg/block-lv --block.wal sdb-vg/wal-lv
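
A minimal sketch of creating those two LVs beforehand (the sizes here
are just placeholders):

vgcreate sdb-vg /dev/sdb
lvcreate -L 40G -n block-lv sdb-vg
lvcreate -L 2G -n wal-lv sdb-vg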



>
>  Thanks
> R
>
> On Fri, Aug 17, 2018 at 5:55 AM Alfredo Deza  wrote:
>>
>> On Thu, Aug 16, 2018 at 9:00 PM, Robert Stanford
>>  wrote:
>> >
>> >  I am following the steps to replace my filestore journal with a bluestore
>> > journal
>> > (http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/).
>> > It
>> > is broken at ceph-volume lvm create.  Here is my error:
>> >
>> > --> Zapping successful for: /dev/sdc
>> > Preparing sdc
>> > Running command: /bin/ceph-authtool --gen-print-key
>> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
>> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
>> > Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
>> > --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
>> > ff523216-350d-4ca0-9022-0c17662c2c3b 10
>> > Running command: vgcreate --force --yes
>> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a /dev/sdc
>> >  stdout: Physical volume "/dev/sdc" successfully created.
>> >  stdout: Volume group "ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a"
>> > successfully created
>> > Running command: lvcreate --yes -l 100%FREE -n
>> > osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b
>> > ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a
>> >  stdout: Logical volume "osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b"
>> > created.
>> > --> blkid could not detect a PARTUUID for device: sdb
>> > --> Was unable to complete a new OSD, will rollback changes
>> > --> OSD will be destroyed, keeping the ID because it was provided with
>> > --osd-id
>> > Running command: ceph osd destroy osd.10 --yes-i-really-mean-it
>> >  stderr: destroyed osd.10
>> > -->  RuntimeError: unable to use device
>> >
>> >  Note that SDB is the SSD journal.  It has been zapped prior.
>>
>> I can't see what the actual command you used is, but I am guessing you
>> did something like:
>>
>> ceph-volume lvm prepare --filestore --data /dev/sdb --journal /dev/sdb
>>
>> Which is not possible. There are a few ways you can do this (see:
>> http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#filestore )
>>
>> With a raw device and a pre-created partition (must have a PARTUUID):
>>
>> ceph-volume lvm prepare --data /dev/sdb --journal /dev/sdc1
>>
>> With LVs:
>>
>> ceph-volume lvm prepare --data vg/my-data --journal vg/my-journal
>>
>> With an LV for data and a partition:
>>
>> ceph-volume lvm prepare --data vg/my-data --journal /dev/sdc1
>>
>> >
>> >  What is going wrong, and how can I fix it?
>> >
>> >  Thank you
>> >  R
>> >
>> >
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] A few questions about using SSD for bluestore journal

2018-08-17 Thread Alfredo Deza
On Thu, Aug 16, 2018 at 4:44 PM, Cody  wrote:
> Hi everyone,
>
> As a newbie, I have some questions about using SSD as the Bluestore
> journal device.
>
> 1. Is there a formula to calculate the optimal size of partitions on
> the SSD for each OSD, given their capacity and IO performance? Or is
> there a rule of thumb on this?

We recently updated the docs regarding sizing for bluestore:
http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#sizing

You will want no less than 4% of the data device's size for block.db. For
a 1TB data device that means at least 40GB for block.db.
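
As a quick worked example with made-up numbers: a 4TB data device would
want around 4TB * 0.04 = 160GB of block.db, e.g.:

lvcreate -L 160G -n db-osd12 ceph-db-vg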

>
> 2. Is there a formula to find out the max number of OSDs a single SSD
> can serve for journaling? Or any rule of thumb?
>
> 3. What is the procedure to replace an SSD journal device used for
> DB+WAL in a hot cluster?
>
> Thank you all very much!
>
> Cody
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore upgrade steps broken

2018-08-17 Thread Alfredo Deza
On Thu, Aug 16, 2018 at 9:00 PM, Robert Stanford
 wrote:
>
>  I am following the steps to replace my filestore journal with a bluestore journal
> (http://docs.ceph.com/docs/mimic/rados/operations/bluestore-migration/).  It
> is broken at ceph-volume lvm create.  Here is my error:
>
> --> Zapping successful for: /dev/sdc
> Preparing sdc
> Running command: /bin/ceph-authtool --gen-print-key
> Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
> Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
> --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
> ff523216-350d-4ca0-9022-0c17662c2c3b 10
> Running command: vgcreate --force --yes
> ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a /dev/sdc
>  stdout: Physical volume "/dev/sdc" successfully created.
>  stdout: Volume group "ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a"
> successfully created
> Running command: lvcreate --yes -l 100%FREE -n
> osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b
> ceph-459b4fbe-e3c4-4f28-b58e-3496bf3ea95a
>  stdout: Logical volume "osd-block-ff523216-350d-4ca0-9022-0c17662c2c3b"
> created.
> --> blkid could not detect a PARTUUID for device: sdb
> --> Was unable to complete a new OSD, will rollback changes
> --> OSD will be destroyed, keeping the ID because it was provided with
> --osd-id
> Running command: ceph osd destroy osd.10 --yes-i-really-mean-it
>  stderr: destroyed osd.10
> -->  RuntimeError: unable to use device
>
>  Note that SDB is the SSD journal.  It has been zapped prior.

I can't see what the actual command you used is, but I am guessing you
did something like:

ceph-volume lvm prepare --filestore --data /dev/sdb --journal /dev/sdb

Which is not possible. There are a few ways you can do this (see:
http://docs.ceph.com/docs/master/ceph-volume/lvm/prepare/#filestore )

With a raw device and a pre-created partition (must have a PARTUUID):

ceph-volume lvm prepare --data /dev/sdb --journal /dev/sdc1

With LVs:

ceph-volume lvm prepare --data vg/my-data --journal vg/my-journal

With an LV for data and a partition:

ceph-volume lvm prepare --data vg/my-data --journal /dev/sdc1
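
If you go the partition route, the partition needs a PARTUUID, which you
get for free with GPT. A sketch with sgdisk (the 10G size and devices
are only examples):

sgdisk --new=1:0:+10G /dev/sdc
blkid /dev/sdc1    # should now report a PARTUUID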

>
>  What is going wrong, and how can I fix it?
>
>  Thank you
>  R
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Optane 900P device class automatically set to SSD not NVME

2018-08-13 Thread Alfredo Deza
On Wed, Aug 1, 2018 at 4:33 AM, Jake Grimmett  wrote:
> Dear All,
>
> Not sure if this is a bug, but when I add Intel Optane 900P drives,
> their device class is automatically set to SSD rather than NVME.

I am not sure we can reliably tell SSDs apart from NVMe devices, but you
can use the --crush-device-class flag to force it to nvme:

ceph-volume lvm prepare --bluestore --data /dev/nvme0n1
--crush-device-class=nvme


>
> This happens under Mimic 13.2.0 and 13.2.1
>
> [root@ceph2 ~]# ceph-volume lvm prepare --bluestore --data /dev/nvme0n1
>
> (SNIP see http://p.ip.fi/eopR for output)
>
> Check...
> [root@ceph2 ~]# ceph osd tree | grep "osd.1 "
>   1   ssd0.25470 osd.1   up  1.0 1.0
>
> Fix is easy
> [root@ceph2 ~]# ceph osd crush rm-device-class osd.1
> done removing class of osd(s): 1
>
> [root@ceph2 ~]# ceph osd crush set-device-class nvme osd.1
> set osd(s) 1 to class 'nvme'
>
> Check...
> [root@ceph2 ~]# ceph osd tree | grep "osd.1 "
>   1  nvme0.25470 osd.1   up  1.0 1.0
>
>
> Thanks,
>
> Jake
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph lvm question

2018-07-30 Thread Alfredo Deza
Try using master?

I'm not really sure what 3.1 supports.

On Mon, Jul 30, 2018 at 2:03 PM, Satish Patel  wrote:
> Thanks Alfredo,
>
> This is what i am trying to do with ceph-ansible v3.1 and getting
> following error, where i am wrong?
>
> ---
> osd_objectstore: bluestore
> osd_scenario: lvm
> lvm_volumes:
>   - data: /dev/sdb
>
>
>
>
> TASK [ceph-osd : include scenarios/lvm.yml]
> *
> Saturday 28 July 2018  17:17:18 -0400 (0:00:00.082)   0:11:08.249 
> *
> fatal: [osd3]: FAILED! => {"failed": true, "reason": "no action
> detected in task. This often indicates a misspelled module name, or
> incorrect module path.\n\nThe error appears to have been in
> '/etc/ansible/roles/ceph-ansible/roles/ceph-osd/tasks/scenarios/lvm.yml':
> line 3, column 3, but may\nbe elsewhere in the file depending on the
> exact syntax problem.\n\nThe offending line appears to be:\n\n\n-
> name: \"use ceph-volume to create {{ osd_objectstore }} osds\"\n  ^
> here\nWe could be wrong, but this one looks like it might be an issue
> with\nmissing quotes.  Always quote template expression brackets when
> they\nstart a value. For instance:\n\nwith_items:\n  - {{ foo
> }}\n\nShould be written as:\n\nwith_items:\n  - \"{{ foo
> }}\"\n\n\nThe error appears to have been in
> '/etc/ansible/roles/ceph-ansible/roles/ceph-osd/tasks/scenarios/lvm.yml':
> line 3, column 3, but may\nbe elsewhere in the file depending on the
> exact syntax problem.\n\nThe offending line appears to be:\n\n\n-
> name: \"use ceph-volume to create {{ osd_objectstore }} osds\"\n  ^
> here\nWe could be wrong, but this one looks like it might be an issue
> with\nmissing quotes.  Always quote template expression brackets when
> they\nstart a value. For instance:\n\nwith_items:\n  - {{ foo
> }}\n\nShould be written as:\n\nwith_items:\n  - \"{{ foo
> }}\"\n\nexception type:  'ansible.errors.AnsibleParserError'>\nexception: no action detected in
> task. This often indicates a misspelled module name, or incorrect
> module path.\n\nThe error appears to have been in
> '/etc/ansible/roles/ceph-ansible/roles/ceph-osd/tasks/scenarios/lvm.yml':
> line 3, column 3, but may\nbe elsewhere in the file depending on the
> exact syntax problem.\n\nThe offending line appears to be:\n\n\n-
> name: \"use ceph-volume to create {{ osd_objectstore }} osds\"\n  ^
> here\nWe could be wrong, but this one looks like it might be an issue
> with\nmissing quotes.  Always quote template expression brackets when
> they\nstart a value. For instance:\n\nwith_items:\n  - {{ foo
> }}\n\nShould be written as:\n\nwith_items:\n  - \"{{ foo
> }}\"\n"}
>
> On Mon, Jul 30, 2018 at 1:11 PM, Alfredo Deza  wrote:
>> On Sat, Jul 28, 2018 at 12:44 AM, Satish Patel  wrote:
>>> I have simple question i want to use LVM with bluestore (Its
>>> recommended method), If i have only single SSD disk for osd in that
>>> case i want to keep journal + data on same disk so how should i create
>>> lvm to accommodate ?
>>
>> bluestore doesn't have a journal like filestore, but probably you mean
>> block.db ? The naming conventions are different.
>>
>> In the case of bluestore, what you are describing doesn't require to
>> have split LVs for each OSD component. You can simply do this:
>>
>> ceph-volume lvm create --bluestore --data /dev/sdb
>>
>> Behind the scenes, ceph-volume will create the vg/lv for /dev/sdb and
>> get everything working
>>
>>>
>>> Do i need to do following
>>>
>>> pvcreate /dev/sdb
>>> vgcreate vg0 /dev/sdb
>>>
>>> Now i have vg(500GB) so i will create two lv-01 (10G Journal) and
>>> lv-02 (490GB Data)
>>>
>>> I am doing correct?
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph lvm question

2018-07-30 Thread Alfredo Deza
On Sat, Jul 28, 2018 at 12:44 AM, Satish Patel  wrote:
> I have simple question i want to use LVM with bluestore (Its
> recommended method), If i have only single SSD disk for osd in that
> case i want to keep journal + data on same disk so how should i create
> lvm to accommodate ?

bluestore doesn't have a journal like filestore does; you probably mean
block.db? The naming conventions are different.

In the case of bluestore, what you are describing doesn't require
splitting LVs for each OSD component. You can simply do this:

ceph-volume lvm create --bluestore --data /dev/sdb

Behind the scenes, ceph-volume will create the vg/lv for /dev/sdb and
get everything working
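
You can inspect what it created afterwards with:

ceph-volume lvm list /dev/sdb

which shows the VG/LV it made and the OSD metadata stored as LV tags.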

>
> Do i need to do following
>
> pvcreate /dev/sdb
> vgcreate vg0 /dev/sdb
>
> Now i have vg(500GB) so i will create two lv-01 (10G Journal) and
> lv-02 (490GB Data)
>
> I am doing correct?
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

