Norman,

I'm cc-ing this back to ceph-users so that others can reply to it, or find it in future.

On 21/08/2018 12:01, Norman Gray wrote:

Willem Jan, hello.

Thanks for your detailed notes on my list question.

On 20 Aug 2018, at 21:32, Willem Jan Withagen wrote:

     # zpool create -m/var/lib/ceph/osd/osd.0 osd.0 gpt/zd000 gpt/zd001

Over the weekend I updated the Ceph manual for FreeBSD with exactly that. I'm not sure what sort of devices zd000 and zd001 are, but concatenating devices seriously lowers the MTBF of the vdev. As such it is likely better to create 2 OSDs on these 2 devices.
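For example, a sketch of the two-OSD variant, reusing the labels from your command above (adjust names to your layout):

    zpool create -m /var/lib/ceph/osd/osd.0 osd.0 gpt/zd000
    zpool create -m /var/lib/ceph/osd/osd.1 osd.1 gpt/zd001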

My sort-of problem is that the machine I'm doing this on was not specced with Ceph in mind: it has 16 3.5TB disks.  Given that <http://docs.ceph.com/docs/master/start/hardware-recommendations/> suggests that 20 is a 'high' number of OSDs on a host, I thought it might be better to aim for an initial setup of 6 two-disk OSDs rather than 12 one-disk ones (leaving four disks free).

That said, 12 < 20, so I think that, especially bearing in mind your advice here, I should probably stick to 1-disk OSDs with one (default) 5GB SSD journal each, and not complicate things.

Only one way to find out: try both...
But I certainly do not advise putting concatenated disks in an OSD, especially not for production. Break one disk and you break the whole vdev.

And the most important thing for OSDs is 1 GB of RAM per 1 TB of disk.
So with 70 TB of disk you'd need 64 GB or more of RAM, preferably more since ZFS will want its share as well. CPU is not going to be that much of an issue, unless you have really tiny CPUs.

What I still have not figured out is what to do with the SSDs.
There are 3 things you can do, or any combination of them (see the example after this list):
1) Ceph standard: make it a journal. Mount the SSD on a separate directory
        and get ceph-disk to use it as the journal.
2) Attach a ZFS cache device (L2ARC) to the pool, which improves reads.
3) Attach a ZFS log device (SLOG) on the SSD, which improves sync writes.
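A minimal sketch of options 2 and 3, assuming the SSD has been partitioned with GPT labels such as gpt/osd.1.log and gpt/osd.1.cache (as in the zpool output further down):

    zpool add osd_1 log gpt/osd.1.log
    zpool add osd_1 cache gpt/osd.1.cache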

At the moment I'm doing all three:
[~] w...@freetest.digiware.nl> zfs list
NAME            USED  AVAIL  REFER  MOUNTPOINT
osd.0.journal   316K  5.33G    88K  /usr/jails/ceph_0/var/lib/ceph/osd/osd.0/journal-ssd
osd.1.journal   316K  5.33G    88K  /usr/jails/ceph_1/var/lib/ceph/osd/osd.1/journal-ssd
osd.2.journal   316K  5.33G    88K  /usr/jails/ceph_2/var/lib/ceph/osd/osd.2/journal-ssd
osd.3.journal   316K  5.33G    88K  /usr/jails/ceph_3/var/lib/ceph/osd/osd.3/journal-ssd
osd.4.journal   316K  5.33G    88K  /usr/jails/ceph_4/var/lib/ceph/osd/osd.4/journal-ssd
osd.5.journal   316K  5.33G    88K  /usr/jails/ceph_0/var/lib/ceph/osd/osd.5/journal-ssd
osd.6.journal   316K  5.33G    88K  /usr/jails/ceph_1/var/lib/ceph/osd/osd.6/journal-ssd
osd.7.journal   316K  5.33G    88K  /usr/jails/ceph_2/var/lib/ceph/osd/osd.7/journal-ssd
osd_0          5.16G   220G  5.16G  /usr/jails/ceph_0/var/lib/ceph/osd/osd.0
osd_1          5.34G   219G  5.34G  /usr/jails/ceph_1/var/lib/ceph/osd/osd.1
osd_2          5.42G   219G  5.42G  /usr/jails/ceph_2/var/lib/ceph/osd/osd.2
osd_3          6.62G  1.31T  6.62G  /usr/jails/ceph_3/var/lib/ceph/osd/osd.3
osd_4          6.83G  1.75T  6.83G  /usr/jails/ceph_4/var/lib/ceph/osd/osd.4
osd_5          5.92G  1.31T  5.92G  /usr/jails/ceph_0/var/lib/ceph/osd/osd.5
osd_6          6.00G  1.31T  6.00G  /usr/jails/ceph_1/var/lib/ceph/osd/osd.6
osd_7          6.10G  1.31T  6.10G  /usr/jails/ceph_2/var/lib/ceph/osd/osd.7

[~] w...@freetest.digiware.nl> zpool list -v osd_1
NAME               SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
osd_1              232G  5.34G   227G        -         -     0%     2%  1.00x  ONLINE  -
  gpt/osd_1         232G  5.34G   227G        -         -     0%     2%
log                    -      -      -         -      -      -
  gpt/osd.1.log     960M    12K   960M        -         -     0%     0%
cache                  -      -      -         -      -      -
  gpt/osd.1.cache  22.0G  1.01G  21.0G        -         -     0%     4%

So each OSD has an SSD journal (a ZFS volume), and each OSD pool has a cache and a log. At the moment the cluster is idle, hence the log is "empty".

But I would first work out the architecture of how you want the cluster to be, and then start tuning. ZFS log and cache devices are easily added and removed after the fact.
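For example, removing them again (pool and label names as in the output above) is just:

    zpool remove osd_1 gpt/osd.1.cache
    zpool remove osd_1 gpt/osd.1.log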

I found what appear to be a couple of typos in your script which I can report back to you.  I hope to make significant progress with this work this week, so should be able to give you more feedback on the script, on my experiences, and on the FreeBSD page in the manual.

Sure, keep'm coming

--WjW


I'll work through your various notes.  Below are a couple of specific points.

When I attempt to start the service, I get:

# service ceph start
=== mon.pochhammer ===

You're sort of free to pick names, but most of the time the tooling expects these naming conventions:
    mon: mon.[a-z]
    osd: osd.[0-9]+
    mgr: mgr.[x-z]

Using other names should work, but I'm not sure it works for all cases.
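As an illustration (host values here are just examples), a ceph.conf following those conventions would have sections like:

    [mon.a]
        host = pochhammer
    [osd.0]
        host = pochhammer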

Thanks!  I wasn't sure if the restricted naming was just for demo purposes.  It's valuable to know that this is very firm advice.

Could also be a permissions thing. Most daemons used to run as root, but "recently" they started running as the user ceph:ceph.
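If so, something like this (path per the default layout used above) usually fixes it:

    chown -R ceph:ceph /var/lib/ceph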

Yes, I had to change ownership of a couple of files before getting this far.

My mon.a directory looks like:

Aha!

Yup, it is an overwhelming set of tools, with little beginning or end.

I hadn't planned to be particularly brave here.  But onward...

Best wishes,

Norman



_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
