[ceph-users] Benchmarking

2018-06-19 Thread Nino Bosteels
Hi,

Anyone got tips on how best to benchmark a Ceph block device (RBD)?

So far I've found the more traditional tools (dd, iostat, bonnie++, the Phoronix
Test Suite) and fio, which actually supports an rbd engine.

There's not a lot of information about it to be found online, though (in contrast
to, say, benchmarking ZFS).

Currently I'm using the following fio job, which comes closest to the access
pattern of our BackupPC software (small random reads):

[global]
ioengine=rbd
clientname=admin
pool=[poolname]
rbdname=[rbdname]
rw=randread
randrepeat=1
direct=1
ramp_time=4
bs=4k
[rbd_iodepth32]
iodepth=32
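
For reference, I run that as a plain job file (assuming it's saved as rbd-randread.fio) and compare against Ceph's built-in benchmarks. Pool and image names below are placeholders, and the rbd bench options may differ slightly per release:

fio rbd-randread.fio

rados bench -p [poolname] 60 write --no-cleanup
rados bench -p [poolname] 60 rand
rbd bench --io-type read --io-pattern rand --io-size 4K --io-threads 32 [poolname]/[rbdname]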

Any ideas (or questions) welcome!

Nino


[ceph-users] Network cluster / addr

2018-08-21 Thread Nino Bosteels
Dear mailinglist,

I've been struggling to find a working configuration for cluster network /
cluster addr, or even public addr.

* Does Ceph accept multiple values for these settings in ceph.conf? From my tests I would say it doesn't.
* Shouldn't the public network be your internet-facing range and the cluster network the private range? (What I've been trying is sketched below.)
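
For reference, what I've been trying looks roughly like this (the addresses are placeholders, not my real ranges). As far as I understand it, the public network carries client and monitor traffic while the cluster network only carries OSD replication and heartbeat traffic, so neither strictly has to be internet-facing:

[global]
public network = 203.0.113.0/24
cluster network = 192.168.200.0/24

# optional per-daemon overrides
[osd.0]
public addr = 203.0.113.10
cluster addr = 192.168.200.10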

Thanks for your time,
Nino


Re: [ceph-users] [RBD]Replace block device cluster

2018-07-20 Thread Nino Bosteels
In response to my own questions: I've read that with BlueStore you shouldn't separate
the journal / RocksDB from the disks where your data resides. And the general rule of
one core per OSD seems unnecessary, since in our current cluster each node has 4 cores
for 5 disks and CPU usage never goes over 20-30%.


New questions: should I separate the admin / monitor nodes from the data storage
nodes (separate HDD, or a separate machine)? And could I use a separate machine with
an SSD for caching? We can't add SSDs to these dedicated machines, so perhaps the
network would then become the bottleneck and no noticeable speed boost would be seen.
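
For the caching idea, the cache-tiering route would look roughly like this (pool names here are made up; where the SSD OSDs live is then a CRUSH placement question rather than something tied to one machine, and the docs warn that cache tiering is easy to get wrong):

ceph osd pool create rbd-cache 128 128
ceph osd tier add rbd rbd-cache
ceph osd tier cache-mode rbd-cache writeback
ceph osd tier set-overlay rbd rbd-cache
ceph osd pool set rbd-cache hit_set_type bloom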


Back to the interwebz for research 





[ceph-users] [RBD]Replace block device cluster

2018-07-19 Thread Nino Bosteels
We're looking to replace our existing RBD cluster, which makes and stores our
backups. At the moment we've got one machine running BackupPC, where the RBD is
mounted, plus 8 Ceph nodes.

The idea is to gain in speed and/or pay less (or pay equally for moar speed).

We're unsure whether to get SSDs into the mix. Have I understood correctly that
they're useful for setting up a cache pool and/or for separating the journal? Can I
use a different server for this?
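
From what I've read so far, "separating the journal" with BlueStore means putting the RocksDB (block.db) and/or WAL on the SSD when the OSD is created; with ceph-volume that would be something like this (device names are hypothetical):

ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1

As far as I can tell the DB/WAL device has to be a local disk in the same host as the OSD, so a separate server could only really serve as a cache tier, not as a journal/DB target.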


Old specs (8 machines):
CPU: Intel Xeon D-1520, 4c/8t, 2.2 GHz / 2.6 GHz
RAM: 32 GB DDR4 ECC 2133 MHz
Disks: 5x 6 TB SAS2
Public network card: 1x 1 Gbps

40 disks, total of 1159.92 euro

Consideration for new specs:
3 machines:
CPU: Intel Xeon E5-2620 v3, 6c/12t, 2.4 GHz / 3.2 GHz
RAM: 64 GB DDR4 ECC 1866 MHz
Disks: 12x 4 TB SAS2
Public network card: 1x 1 Gbps

36 disks for a total of 990 euro

10 machines:
CPU: Intel Xeon D-1521, 4c/8t, 2.4 GHz / 2.7 GHz
RAM: 16 GB DDR4 ECC 2133 MHz
Disks: 4x 6 TB
Public network card: 1x 1 Gbps

40 disks for a total of 940 euro
Perhaps in combination with SSD, this last option?!

Any advice is greatly appreciated.

How do you make your decisions / comparisons? One disk per OSD, I guess, but then
how many cores per disk, and so on?
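
For what it's worth, my own back-of-the-envelope comparison of the three options (raw capacity before replication, physical cores only):

Old (8 machines): 8 x 5 x 6 TB = 240 TB raw, 4c / 5 OSDs = 0.8 cores per OSD, 32 GB / 5 = ~6.4 GB RAM per OSD, 8x 1 Gbps total
3 machines:       3 x 12 x 4 TB = 144 TB raw, 6c / 12 OSDs = 0.5 cores per OSD, 64 GB / 12 = ~5.3 GB RAM per OSD, 3x 1 Gbps total
10 machines:      10 x 4 x 6 TB = 240 TB raw, 4c / 4 OSDs = 1.0 cores per OSD, 16 GB / 4 = 4 GB RAM per OSD, 10x 1 Gbps total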

Thanks in advance.

Nino Bosteels


[ceph-users] Mimic upgrade 13.2.1 > 13.2.2 monmap changed

2018-10-04 Thread Nino Bosteels
Hello list,

I'm having a serious issue: my Ceph cluster has become unresponsive. I was upgrading
the cluster (3 servers, 3 monitors) from 13.2.1 to 13.2.2, which shouldn't be a
problem.

Though on reboot my first host reported:

starting mon.ceph01 rank -1 at 192.168.200.197:6789/0 mon_data 
/var/lib/ceph/mon/ceph-ceph01 fsid 27dd45f1-28b5-4ac6-81ab-c62bc581130c
mon.cephxx@-1(probing) e5 preinit fsid 27dd45f1-28b5-4ac6-81ab-c62bc581130c
mon.cephxx@-1(probing) e5 not in monmap and have been in a quorum before; must 
have been removed
-1 mon.cephxx@-1(probing) e5 commit suicide!
-1 failed to initialize

I thought that perhaps the monitor didn't want to accept the monmap of the other two
because of the version difference. Sadly, I then upgraded and rebooted the second
server as well.

Since then the cluster has been unresponsive (more than half of the monitors are
offline / out of quorum). The log of my second host keeps spamming:

2018-10-04 14:39:06.802 7fed0058f700 -1 mon.ceph02@1(probing) e6 
get_health_metrics reporting 14 slow ops, oldest is auth(proto 0 27 bytes epoch 
6)
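
The recovery path I'm considering (untested on my side, pieced together from the monitor troubleshooting docs; the mon IDs and IP below are just my hosts as examples) would be to stop the daemons, pull the monmap from a monitor that still has the current map, add the missing mon back if needed, and inject the map on the failed host:

# on a monitor that still has the current map (daemon stopped)
ceph-mon -i ceph02 --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap

# if ceph01 is missing, add it back, copy the map over and inject it (daemon stopped)
monmaptool --add ceph01 192.168.200.197:6789 /tmp/monmap
ceph-mon -i ceph01 --inject-monmap /tmp/monmap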

Any help VERY MUCH appreciated, this sucks.

Thanks