Re: [ceph-users] Unexpected increase in the memory usage of OSDs

2019-10-04 Thread Gregory Farnum
Do you have statistics on the size of the OSDMaps or count of them which were being maintained by the OSDs? I'm not sure why having noout set would change that if all the nodes were alive, but that's my bet. -Greg On Thu, Oct 3, 2019 at 7:04 AM Vladimir Brik wrote: > > And, just as unexpectedly,
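For reference, a couple of ways to pull those numbers from a running OSD (a sketch, assuming you have access to the admin socket on the OSD host; osd.0 is just an example id):

    # range of OSDMap epochs this OSD is keeping (newest_map vs oldest_map)
    ceph daemon osd.0 status

    # per-mempool memory usage; the osdmap-related pools show how much RAM
    # the cached maps are consuming
    ceph daemon osd.0 dump_mempools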

Re: [ceph-users] mon sudden crash loop - pinned map

2019-10-04 Thread Gregory Farnum
Hmm, that assert means the monitor tried to grab an OSDMap it had on disk but it didn't work. (In particular, a "pinned" full map which we kept around after trimming the others to save on disk space.) That *could* be a bug where we didn't have the pinned map and should have (or incorrectly

Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Maged Mokhtar
The 4M throughput numbers you see now (150 MB/s read, 60 MB/s write) are probably limited by your 1G network, and can probably go higher if you increase it (10G or use active bonds). In real life, the applications and workloads determine the block size, io depths, whether it is sequential
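As a rough back-of-envelope for what a single 1 Gbit/s link can carry (approximate, ignoring replication traffic and protocol details):

    1 Gbit/s / 8 bits                    = 125 MB/s raw line rate
    minus TCP/IP + Ceph messenger overhead ~ 110-118 MB/s usable per link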

Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Petr Bena
Thank you guys, I changed the FIO parameters and it looks far better now - reading about 150MB/s, writing over 60MB/s. Now, the question is, what could I change in my setup to make it this fast - the RBD is used as an LVM PV for a VG shared between Xen hypervisors, this is the PV:   --- Physical

Re: [ceph-users] rgw: multisite support

2019-10-04 Thread DHilsbos
Swami; For 12.2.11 (Luminous), the previously linked document would be: https://docs.ceph.com/docs/luminous/radosgw/multisite/#migrating-a-single-site-system-to-multi-site Thank you, Dominic L. Hilsbos, MBA Director – Information Technology Perform Air International Inc.
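For a rough idea of what that migration involves, the commands below are a condensed sketch of the steps in that document (realm/zonegroup/zone names, the endpoint URL and the keys are placeholders; follow the Luminous doc for the exact sequence):

    radosgw-admin realm create --rgw-realm=myrealm --default
    radosgw-admin zonegroup rename --rgw-zonegroup default --zonegroup-new-name=us
    radosgw-admin zone rename --rgw-zone default --zone-new-name us-east-1 --rgw-zonegroup=us
    radosgw-admin zonegroup modify --rgw-realm=myrealm --rgw-zonegroup=us \
        --endpoints http://rgw1:80 --master --default
    radosgw-admin zone modify --rgw-realm=myrealm --rgw-zonegroup=us --rgw-zone=us-east-1 \
        --endpoints http://rgw1:80 --access-key=SYSTEM_ACCESS_KEY --secret=SYSTEM_SECRET_KEY \
        --master --default
    radosgw-admin period update --commit
    # then restart the radosgw instances so they pick up the new period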

Re: [ceph-users] ssd requirements for wal/db

2019-10-04 Thread Stijn De Weirdt
hi all, maybe to clarify a bit, e.g. https://indico.cern.ch/event/755842/contributions/3243386/attachments/1784159/2904041/2019-jcollet-openlab.pdf clearly shows that the db+wal disks are not saturated, but we are wondering what is really needed/acceptable wrt throughput and latency (e.g. is a
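One simple way to see how busy a db+wal device actually is under your own workload (a sketch; /dev/nvme0n1 is just an example device name):

    # extended device statistics every 5s: watch w/s, wMB/s, w_await and %util
    iostat -x 5 /dev/nvme0n1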

Re: [ceph-users] ssd requirements for wal/db

2019-10-04 Thread Vitaliy Filippov
WAL/DB isn't "read intensive". It's more "write intensive" :) Use server SSDs with capacitors to get adequate write performance. Hi all, We are thinking about putting the wal/db of our HDDs on SSDs. If we would put the wal of 4 HDDs on 1 SSD as recommended, what type of SSD would suffice? We
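A common way to check whether a given SSD has the write behaviour WAL/DB needs (fast flushed/sync writes, which is what power-loss-protection capacitors buy you) is a single-threaded direct+sync 4k write test, for example (WARNING: this writes to the raw device, only run it on an empty disk; /dev/sdX is a placeholder):

    fio --name=sync-write-test --filename=/dev/sdX \
        --ioengine=libaio --direct=1 --sync=1 \
        --rw=write --bs=4k --iodepth=1 --numjobs=1 \
        --runtime=60 --time_based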

Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Maged Mokhtar
The tests are measuring different things, and the fio test result of 1.5 MB/s is not bad. The rados write bench uses a 4M block size by default, does 16 concurrent threads, and is random in nature; you can change the block size and thread count. The dd command uses by default a 512-byte block size and 1
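For example, to make the two tests more comparable you can pin the block size and concurrency explicitly (the pool name 'rbd' and /dev/rbd0 are placeholders; oflag=direct keeps dd from just filling the page cache):

    # rados bench: -b sets the write size in bytes, -t the number of concurrent ops
    rados bench -p rbd 60 write -b 4194304 -t 16 --no-cleanup
    rados bench -p rbd 60 rand
    rados -p rbd cleanup

    # dd with a 4M block size and direct IO instead of the 512-byte default
    dd if=/dev/zero of=/dev/rbd0 bs=4M count=1000 oflag=direct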

Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread JC Lopez
Hi, your RBD bench and RADOS bench use a 4MB IO request size by default while your FIO is configured for a 4KB IO request size. If you want to compare apples to apples (bandwidth) you need to change the FIO IO request size to 4194304. Plus, you tested a sequential workload with RADOS bench but
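As an illustration, a fio job matching the 4MB request size of the rados bench could look roughly like this (a sketch assuming the rbd ioengine; the pool, image and client names are placeholders for whatever the original test.fio used):

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=testimg
    bs=4M
    iodepth=32
    runtime=60
    time_based

    [seq-write-4m]
    rw=write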

Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Petr Bena
Hello, I tried to use FIO on RBD device I just created and writing is really terrible (around 1.5MB/s) [root@ceph3 tmp]# fio test.fio rbd_iodepth32: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=32 fio-3.7 Starting 1 process Jobs: 1 (f=1):

Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Alexandre DERUMIER
Hi, >>dd if=/dev/zero of=/dev/rbd0 writes at 5MB/s - you are testing with a single thread/iodepth=1 sequentially here. That means only 1 disk at a time, and you have network latency too. rados bench does 16 concurrent writes. Try to test with fio for example, with bigger iodepth, small block/big
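For instance, something along these lines against the mapped device (destructive to whatever is on /dev/rbd0, so only on a scratch image; the parameters are just illustrative):

    # small block, high iodepth -> IOPS
    fio --name=rand-4k --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
        --rw=randwrite --bs=4k --iodepth=32 --runtime=60 --time_based

    # big block, moderate iodepth -> bandwidth
    fio --name=seq-4m --filename=/dev/rbd0 --ioengine=libaio --direct=1 \
        --rw=write --bs=4M --iodepth=16 --runtime=60 --time_based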

[ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Petr Bena
Hello, If this is too long for you, TL;DR section at the bottom. I created a Ceph cluster made of 3 SuperMicro servers, each with 2 OSDs (WD RED spinning drives) and I would like to optimize the performance of RBD, which I believe is blocked by some wrong Ceph configuration, because from my

[ceph-users] ssd requirements for wal/db

2019-10-04 Thread Kenneth Waegeman
Hi all, We are thinking about putting the wal/db of our HDDs on SSDs. If we would put the wal of 4 HDDs on 1 SSD as recommended, what type of SSD would suffice? We were thinking of using SATA Read Intensive 6Gbps 1DWPD SSDs. Does anyone have experience with this configuration? Would we
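On sizing (separate from the endurance question), the rule of thumb in the BlueStore docs is roughly 1-2 GB of WAL per OSD and a block.db of about 4% of the data device if all metadata should stay on flash. A quick example, assuming 4 x 8 TB HDDs behind one SSD:

    WAL:       4 OSDs x ~2 GB         ~   8 GB
    block.db:  4 OSDs x (8 TB x 4%)   ~ 1.3 TB   (the 4% guideline)
    # many deployments settle for 30-60 GB of db per OSD instead and
    # accept that some metadata may spill over to the HDD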

Re: [ceph-users] rgw: multisite support

2019-10-04 Thread Joachim Kraftmayer
Maybe this will help you: https://docs.ceph.com/docs/master/radosgw/multisite/#migrating-a-single-site-system-to-multi-site ___ Clyso GmbH On 03.10.2019 at 13:32, M Ranga Swami Reddy wrote: Thank you. Do we have a quick document to do this migration? Thanks

Re: [ceph-users] how to set osd_crush_initial_weight 0 without restart any service

2019-10-04 Thread Paul Mezzanini
That would accomplish what you are looking for, yes. Keep in mind that norebalance won't stop NEW data from landing there; it will only keep old data from migrating in. This shouldn't pose too much of an issue for most use cases. -- Paul Mezzanini Sr Systems Administrator /
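For completeness, the relevant knobs (a sketch, assuming a Mimic/Nautilus-style centralized config so no daemon restart is needed; new OSDs pick up osd_crush_initial_weight when they are created):

    # have newly created OSDs come up with crush weight 0
    ceph config set osd osd_crush_initial_weight 0

    # optionally pause rebalancing of existing data while adding OSDs
    ceph osd set norebalance
    # ... add the new OSDs ...
    ceph osd unset norebalance

    # later, bring a new OSD into data placement at its real weight
    ceph osd crush reweight osd.<id> <weight-in-TiB>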

[ceph-users] mon sudden crash loop - pinned map

2019-10-04 Thread Philippe D'Anjou
Hi, our mon is acting up all of a sudden and dying in a crash loop with the following: 2019-10-04 14:00:24.339583 lease_expire=0.00 has v0 lc 4549352 -3> 2019-10-04 14:00:24.335 7f6e5d461700 5 mon.km-fsn-1-dc4-m1-797678@0(leader).paxos(paxos active c 4548623..4549352) is_readable = 1 -

Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-04 Thread Marc Roos
> Try something like the following on each OSD that holds a copy of rbd_data.1f114174b0dc51.0974 and see what output you get. Note that you can drop the bluestore flag if they are not bluestore osds and you will need the osd stopped at the time (set noout). Also note,
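For anyone following along, the kind of invocation being referred to looks roughly like this (a sketch; the OSD id has to be filled in, and the OSD must be stopped with noout set, as noted above):

    ceph osd set noout
    systemctl stop ceph-osd@<id>

    # list objects matching the rbd data object on this OSD
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --type bluestore --op list rbd_data.1f114174b0dc51.0974

    systemctl start ceph-osd@<id>
    ceph osd unset noout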