Re: [ceph-users] optimize bluestore for random write i/o

2019-03-12 Thread vitalif
I bet you'd see better memstore results with my vector based object implementation instead of bufferlists. Where can I find it? Nick Fisk noticed the same thing you did.  One interesting observation he made was that disabling CPU C/P states helped bluestore immensely in the iodepth=1 case.
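
For reference, the usual recipe for ruling out C/P-state latency looks roughly like the lines below; this is a generic sketch, it assumes the cpupower tool is installed, the settings apply only until reboot, and the kernel parameters in the comment are the persistent alternative. Worth validating on one node before rolling it out.

  # pin the frequency governor to performance (P-states)
  cpupower frequency-set -g performance
  # disable every idle state with wakeup latency above 0 us (C-states), until reboot
  cpupower idle-set -D 0
  # persistent alternative via kernel command line:
  #   intel_idle.max_cstate=0 processor.max_cstate=1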

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-12 Thread Mark Nelson
On 3/12/19 8:40 AM, vita...@yourcmc.ru wrote: One way or another we can only have a single thread sending writes to rocksdb.  A lot of the prior optimization work on the write side was to get as much processing out of the kv_sync_thread as possible. That's still a worthwhile goal as it's

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-12 Thread vitalif
One way or another we can only have a single thread sending writes to rocksdb.  A lot of the prior optimization work on the write side was to get as much processing out of the kv_sync_thread as possible.  That's still a worthwhile goal as it's typically what bottlenecks with high amounts of

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-12 Thread Mark Nelson
On 3/12/19 7:31 AM, vita...@yourcmc.ru wrote: Decreasing the min_alloc size isn't always a win, but it can be in some cases. Originally bluestore_min_alloc_size_ssd was set to 4096 but we increased it to 16384 because at the time our metadata path was slow and increasing it resulted in a pretty

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-12 Thread vitalif
Decreasing the min_alloc size isn't always a win, but it can be in some cases. Originally bluestore_min_alloc_size_ssd was set to 4096 but we increased it to 16384 because at the time our metadata path was slow and increasing it resulted in a pretty significant performance win (along with
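
As a reference for anyone who wants to experiment, the ceph.conf sketch below lowers the SSD allocation unit back to 4 KiB. Note that min_alloc_size is baked into an OSD when it is created, so the setting only affects newly deployed OSDs, and whether it is a win depends on how fast the metadata path is, as discussed above.

  [osd]
  # only applies to bluestore OSDs created after the change;
  # existing OSDs keep the value they were formatted with
  bluestore_min_alloc_size_ssd = 4096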

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-06 Thread Stefan Priebe - Profihost AG
On 06.03.19 at 14:08, Mark Nelson wrote: > > On 3/6/19 5:12 AM, Stefan Priebe - Profihost AG wrote: >> Hi Mark, >> On 05.03.19 at 23:12, Mark Nelson wrote: >>> Hi Stefan, >>> >>> >>> Could you try running your random write workload against bluestore and >>> then take a wallclock profile of an

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-06 Thread Mark Nelson
On 3/6/19 5:12 AM, Stefan Priebe - Profihost AG wrote: Hi Mark, On 05.03.19 at 23:12, Mark Nelson wrote: Hi Stefan, Could you try running your random write workload against bluestore and then take a wallclock profile of an OSD using gdbpmp? It's available here:

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-06 Thread Mark Nelson
On 3/5/19 4:23 PM, Vitaliy Filippov wrote: Testing -rw=write without -sync=1 or -fsync=1 (or -fsync=32 for batch IO, or just fio -ioengine=rbd from outside a VM) is rather pointless - you're benchmarking the RBD cache, not Ceph itself. RBD cache is coalescing your writes into big sequential

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-06 Thread Stefan Priebe - Profihost AG
Hi Mark, On 05.03.19 at 23:12, Mark Nelson wrote: > Hi Stefan, > > > Could you try running your random write workload against bluestore and > then take a wallclock profile of an OSD using gdbpmp? It's available here: > > > https://github.com/markhpc/gdbpmp sure but it does not work: #

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-05 Thread Vitaliy Filippov
Testing -rw=write without -sync=1 or -fsync=1 (or -fsync=32 for batch IO, or just fio -ioengine=rbd from outside a VM) is rather pointless - you're benchmarking the RBD cache, not Ceph itself. RBD cache is coalescing your writes into big sequential writes. Of course bluestore is faster in
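
To make that concrete, the two invocations below are a minimal sketch of what this suggests: one run inside the VM with an fsync after every write so the RBD cache cannot coalesce anything, and one run from outside the VM through fio's rbd engine. Device, pool, image and client names (/dev/vdb, rbd, testimg, admin) are placeholders.

  # inside the VM: direct, per-write fsync, queue depth 1 against a scratch device
  fio --name=randwrite-sync --filename=/dev/vdb --ioengine=libaio --direct=1 \
      --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 --fsync=1 \
      --runtime=60 --time_based
  # outside the VM: fio's rbd engine talks to the cluster directly, bypassing any guest cache
  fio --name=randwrite-rbd --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
      --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based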

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-05 Thread Mark Nelson
Hi Stefan, Could you try running your random write workload against bluestore and then take a wallclock profile of an OSD using gdbpmp? It's available here: https://github.com/markhpc/gdbpmp Thanks, Mark On 3/5/19 2:29 AM, Stefan Priebe - Profihost AG wrote: Hello list, while the
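
For anyone following along, a profiling run looks roughly like the lines below; the gdbpmp.py flags are my reading of the project README, so verify them with --help, and the last line is a generic plain-gdb fallback that dumps a single stack sample of every thread.

  # sample a running OSD (assumes a single ceph-osd process on the host)
  ./gdbpmp.py -p $(pidof ceph-osd) -n 1000 -o osd.gdbpmp
  # print the collected call graph
  ./gdbpmp.py -i osd.gdbpmp
  # plain-gdb fallback: a one-shot stack dump of all OSD threads
  gdb -p $(pidof ceph-osd) -batch -ex "thread apply all bt"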

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-05 Thread Stefan Priebe - Profihost AG
On 05.03.19 at 10:05, Paul Emmerich wrote: > This workload is probably bottlenecked by rocksdb (since the small > writes are buffered there), so that's probably what needs tuning here. while reading:

Re: [ceph-users] optimize bluestore for random write i/o

2019-03-05 Thread Paul Emmerich
This workload is probably bottlenecked by rocksdb (since the small writes are buffered there), so that's probably what needs tuning here. Paul -- Paul Emmerich, croit GmbH (https://croit.io)
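
As a starting point for that tuning, the admin-socket commands below (osd.0 is a placeholder) show what an OSD is currently running with; bluestore_rocksdb_options can then be overridden in ceph.conf, but useful values are very workload-dependent and changing them requires an OSD restart.

  # RocksDB option string the OSD was started with
  ceph daemon osd.0 config get bluestore_rocksdb_options
  # RocksDB / BlueStore performance counters
  ceph daemon osd.0 perf dump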