Re: [ceph-users] ssd requirements for wal/db

2019-10-04 Thread Stijn De Weirdt
hi all, maybe to clarify a bit, e.g. https://indico.cern.ch/event/755842/contributions/3243386/attachments/1784159/2904041/2019-jcollet-openlab.pdf clearly shows that the db+wal disks are not saturated, but we are wondering what is really needed/acceptable wrt throughput and latency (eg is a
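for reference (not from the thread itself): a quick way to get a feel for whether an ssd can keep up with wal/db duty is a single-threaded sync 4k write test, since that is roughly the pattern the wal sees. a rough sketch, assuming fio is installed and /dev/sdX is the candidate device (this overwrites data on it):

  # 4k synchronous writes at queue depth 1; the reported latency is what the wal will see
  fio --name=wal-latency --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based

enterprise ssds with power-loss protection typically land in the tens of microseconds here; consumer ssds are often an order of magnitude or more slower.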

Re: [ceph-users] process stuck in D state on cephfs kernel mount

2019-01-21 Thread Stijn De Weirdt
hi marc, > - how to prevent the D state process to accumulate so much load? you can't. in linux, uninterruptible tasks themselves count as "load"; this does not mean you eg ran out of cpu resources. stijn
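to see which tasks are actually stuck (and thus inflating the load average), something like this helps (just a sketch; wchan shows the kernel function the task is sleeping in):

  # list uninterruptible (D state) tasks and where in the kernel they are waiting
  ps -eo state,pid,wchan:32,cmd | awk '$1 == "D"'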

Re: [ceph-users] CephFS very unstable with many small files

2018-02-25 Thread Stijn De Weirdt
hi, can you give some more details on the setup? number and size of osds. are you using EC or not? and if so, what EC parameters? thanks, stijn On 02/26/2018 08:15 AM, Linh Vu wrote: > Sounds like you just need more RAM on your MDS. Ours have 256GB each, and the > OSD nodes have 128GB each.
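for context, on luminous the mds cache is bounded by mds_cache_memory_limit, so it is worth checking how close the mds sits to that limit. a sketch, assuming the admin socket is reachable on the mds host and <id> is the daemon name:

  # current cache usage vs the configured limit on the active mds
  ceph daemon mds.<id> cache status
  ceph daemon mds.<id> config get mds_cache_memory_limit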

Re: [ceph-users] CephFS very unstable with many small files

2018-02-25 Thread Stijn De Weirdt
hi oliver, >>> in preparation for production, we have run very successful tests with large >>> sequential data, >>> and just now a stress-test creating many small files on CephFS. >>> >>> We use a replicated metadata pool (4 SSDs, 4 replicas) and a data pool with >>> 6 hosts with 32 OSDs each,

Re: [ceph-users] Ceph Bluestore performance question

2018-02-18 Thread Stijn De Weirdt
hi oliver, the IPoIB network is not 56gb, it's probably a lot less (20gb or so). the ib_write_bw test is verbs/rdma based. do you have iperf tests between hosts, and if so, can you share those results? stijn > we are just getting started with our first Ceph cluster (Luminous 12.2.2) and >
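an ipoib sanity check could look something like this (sketch, assuming iperf3 is installed on both nodes and you use the ipoib addresses, not the ethernet ones):

  # on host A
  iperf3 -s
  # on host B: 4 parallel streams for 30s against host A's ipoib address
  iperf3 -c <hostA-ipoib-addr> -P 4 -t 30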

Re: [ceph-users] Linux Meltdown (KPTI) fix and how it affects performance?

2018-01-05 Thread Stijn De Weirdt
or do it live https://access.redhat.com/articles/3311301 # echo 0 > /sys/kernel/debug/x86/pti_enabled # echo 0 > /sys/kernel/debug/x86/ibpb_enabled # echo 0 > /sys/kernel/debug/x86/ibrs_enabled stijn On 01/05/2018 12:54 PM, David wrote: > Hi! > > nopti or pti=off in kernel options
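to see what is currently active, the same debugfs knobs can simply be read back (requires debugfs to be mounted, which it normally is on el7):

  # 1 = mitigation enabled, 0 = disabled
  grep . /sys/kernel/debug/x86/pti_enabled /sys/kernel/debug/x86/ibpb_enabled /sys/kernel/debug/x86/ibrs_enabled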

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-20 Thread Stijn De Weirdt
the others. Plus different log sizes? It's not making a ton of sense > at first glance. > -Greg > > On Thu, Oct 19, 2017 at 1:08 AM Stijn De Weirdt <stijn.dewei...@ugent.be> > wrote: > >> hi greg, >> >> i attached the gzip output of the query and som

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-19 Thread Stijn De Weirdt
any other relevant data. You shouldn't need to do manual repair of > erasure-coded pools, since it has checksums and can tell which bits are > bad. Following that article may not have done you any good (though I > wouldn't expect it to hurt, either...)... > -Greg > > On Wed, Oc

[ceph-users] ceph inconsistent pg missing ec object

2017-10-18 Thread Stijn De Weirdt
hi all, we have a ceph 10.2.7 cluster with an 8+3 EC pool. in that pool, there is a pg in an inconsistent state. we followed http://ceph.com/geen-categorie/ceph-manually-repair-object/, however, we are unable to solve our issue. from the primary osd logs, the reported pg had a missing object. we
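for reference, on jewel the scrub findings can be listed before attempting any manual surgery; this is only the generic starting point, not a fix for the missing-shard case described here (replace <pgid> with the inconsistent pg):

  # show which objects/shards scrub flagged as inconsistent
  rados list-inconsistent-obj <pgid> --format=json-pretty
  # ask the primary to repair the pg
  ceph pg repair <pgid>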

Re: [ceph-users] luminous/bluestore osd memory requirements

2017-08-12 Thread Stijn De Weirdt
fast your processor should > be... But making it based on how much GHz per TB is an invitation to > context switch to death. > > On Sat, Aug 12, 2017, 8:40 AM Stijn De Weirdt <stijn.dewei...@ugent.be> > wrote: > >> hi all, >> >> thanks for all the

Re: [ceph-users] luminous/bluestore osd memory requirements

2017-08-12 Thread Stijn De Weirdt
dation will be more realistic than it was in the past (an > effective increase in memory needs), but also that it will be under much > better control than previously. > > On Thu, Aug 10, 2017 at 1:35 AM Stijn De Weirdt <stijn.dewei...@ugent.be> > wrote: > >> hi all

[ceph-users] luminous/bluestore osd memory requirements

2017-08-10 Thread Stijn De Weirdt
hi all, we are planning to purchase new OSD hardware, and we are wondering if for upcoming luminous with bluestore OSDs, anything wrt the hardware recommendations from http://docs.ceph.com/docs/master/start/hardware-recommendations/ will be different, esp the memory/cpu part. i understand from
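for what it's worth, with bluestore the osd cache is an explicit per-osd setting instead of the kernel page cache, so memory planning is roughly "cache size plus a few GB of osd overhead" per osd. a hedged ceph.conf sketch; the values are just the luminous defaults as far as i know:

  [osd]
  # per-osd bluestore cache; resident memory will be this plus osd overhead
  bluestore_cache_size_hdd = 1073741824   # 1 GiB for hdd-backed osds
  bluestore_cache_size_ssd = 3221225472   # 3 GiB for ssd-backed osds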

Re: [ceph-users] Problem with infernalis el7 package

2015-11-11 Thread Stijn De Weirdt
did you recreate new rpms with same version/release? it would be better to make new rpms with different release (e.g. 9.2.0-1). we have snapshotted mirrors and nginx caches between ceph yum repo and the nodes that install the rpms, so cleaning the cache locally will not help. stijn On 11/11/2015

Re: [ceph-users] CephFS and page cache

2015-10-19 Thread Stijn De Weirdt
>>> So: the key thing to realise is that caching behaviour is full of >>> tradeoffs, and this is really something that needs to be tunable, so >>> that it can be adapted to the differing needs of different workloads. >>> Having an optional "hold onto caps for N seconds after file close" >>> sounds

Re: [ceph-users] btrfs w/ centos 7.1

2015-08-08 Thread Stijn De Weirdt
hi jan, The answer to this, as well as life, universe and everything, is simple: ZFS. is it really the case for ceph? i briefly looked in the filestore code a while ago; since zfs is COW, i expected not to need a journal with ZFS, but i couldn't find anything that suggested this was supported

Re: [ceph-users] Check networking first?

2015-08-03 Thread Stijn De Weirdt
Like a lot of system monitoring stuff, this is the kind of thing that in an ideal world we wouldn't have to worry about, but the experience in practice is that people deploy big distributed storage systems without having really good monitoring in place. We (people providing not to become

Re: [ceph-users] dropping old distros: el6, precise 12.04, debian wheezy?

2015-07-30 Thread Stijn De Weirdt
i would certainly like all client libs and/or kernel modules to stay tested and supported on these OSes for future ceph releases. not sure how much work that is, but at least the client side shouldn't be affected by the init move. stijn On 07/30/2015 04:43 PM, Marc wrote: Hi, much like

Re: [ceph-users] Check networking first?

2015-07-30 Thread Stijn De Weirdt
wouldn't it be nice if ceph did something like this in the background (some sort of network-scrub)? debugging the network like this is not that easy (you can't expect admins to install e.g. perfsonar on all nodes and/or clients). something like: every X min, each service X picks a service Y on another
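a very rough sketch of the idea, as a cron job on every node (the peer list file, the tooling and the log tags are made up for illustration; it assumes fping is installed and an iperf3 server runs on each node):

  #!/bin/bash
  # pick one random peer and record latency and bandwidth towards it
  peer=$(shuf -n 1 /etc/ceph/peers.txt)
  fping -c 10 -q "$peer" 2>&1 | logger -t net-scrub
  iperf3 -c "$peer" -t 5 --json | logger -t net-scrub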

Re: [ceph-users] Redundant Power Supplies

2014-10-30 Thread Stijn De Weirdt
if you don't have 2 power feeds, don't spend the money. if you have 2 feeds, well, start with 2 PSUs for your switches ;) if you stick with one PSU for the OSDs, make sure you have your cabling (power and network; don't forget your network switches should be on the same power feeds ;) and crushmap

Re: [ceph-users] use ZFS for OSDs

2014-10-29 Thread Stijn De Weirdt
hi michal, thanks for the info. we will certainly try it and see if we come to the same conclusions ;) one small detail: since you were using centos7, i'm assuming you were using ZoL 0.6.3? stijn On 10/29/2014 08:03 PM, Michal Kozanecki wrote: Forgot to mention, when you create the

Re: [ceph-users] the state of cephfs in giant

2014-10-15 Thread Stijn De Weirdt
We've been doing a lot of work on CephFS over the past few months. This is an update on the current state of things as of Giant. ... * Either the kernel client (kernel 3.17 or later) or userspace (ceph-fuse or libcephfs) clients are in good working order. Thanks for all the work and

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
hi christian, we once were debugging some performance issues, and IRQ balancing was one of the things we looked into, but no real benefit there for us. all interrupts on one cpu is only an issue if the hardware itself is not the bottleneck. we were running some default SAS HBA (Dell H200), and

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
but another issue is the OSD processes: do you pin those as well? and how much data do they actually handle? to checksum, the OSD process needs all data, so that can also cause a lot of NUMA traffic, esp if they are not pinned. That's why all my (production) storage nodes have only a single 6
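if you want to experiment with pinning, restricting a running osd to the cores of the socket that owns its hba/nic is straightforward (sketch; the osd id, pid lookup and core list are illustrative, and memory locality only helps for allocations made after the move):

  # pin the ceph-osd process for osd.3 to cores 0-5 (socket 0 in this example)
  pid=$(pgrep -f 'ceph-osd -i 3')
  taskset -pc 0-5 "$pid"
  # to start an osd with both cpu and memory bound to NUMA node 0 instead:
  # numactl --cpunodebind=0 --membind=0 ceph-osd -i 3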

Re: [ceph-users] IRQ balancing, distribution

2014-09-22 Thread Stijn De Weirdt
for each OSD; can these be HT cores or actual physical cores? stijn Regards, Anand -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Stijn De Weirdt Sent: Monday, September 22, 2014 2:36 PM To: ceph-users@lists.ceph.com Subject: Re: [ceph-users

Re: [ceph-users] ceph issue: rbd vs. qemu-kvm

2014-09-18 Thread Stijn De Weirdt
hi steven, we ran into issues when trying to use a non-default ceph user in opennebula (don't remember what the default was; but it's probably not libvirt2 ), patches are in https://github.com/OpenNebula/one/pull/33, devs sort-of confirmed they will be in 4.8.1. this way you can set
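for completeness, a dedicated cephx user for the hypervisors can be created like this (sketch; the client name and pool are just examples, adjust the caps to your own pools):

  # a non-default user with read access to the mons and rwx on the rbd pool used by opennebula
  ceph auth get-or-create client.libvirt mon 'allow r' osd 'allow rwx pool=one'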

Re: [ceph-users] v0.84 released

2014-08-26 Thread Stijn De Weirdt
hi all, there are a zillion OSD bug fixes. Things are looking pretty good for the Giant release that is coming up in the next month. any chance of having a compilable cephfs kernel module for el7 for the next major release? stijn

Re: [ceph-users] cephfs and EC

2014-07-09 Thread Stijn De Weirdt
pool fails). but probably the read-cache has to be the same as the write cache (eg when people want to modify a file). stijn On 07/08/2014 05:24 PM, Stijn De Weirdt wrote: hi mark, thanks for clarifying it a bit. we'll certainly have a look at the caching tier setup. stijn On 07/08/2014 01

[ceph-users] cephfs and EC

2014-07-08 Thread Stijn De Weirdt
hi all, one of the changes in the 0.82 release (according to the notes) is: mon: prevent EC pools from being used with cephfs can someone clarify this a bit? cephfs with EC pools makes no sense? now? ever? or is it just not recommended (i'm also interested in the technical reasons behind it)

Re: [ceph-users] cephfs and EC

2014-07-08 Thread Stijn De Weirdt
hi mark, thanks for clarifying it a bit. we'll certainly have a look at the caching tier setup. stijn On 07/08/2014 01:53 PM, Mark Nelson wrote: On 07/08/2014 04:28 AM, Stijn De Weirdt wrote: hi all, one of the changes in the 0.82 release (according to the notes) is: mon: prevent EC

Re: [ceph-users] ceph kernel module for centos7

2014-07-08 Thread Stijn De Weirdt
that the kmod-ceph src rpms also provide the ceph module. stijn On 06/27/2014 06:26 PM, Stijn De Weirdt wrote: hi all, does anyone know how to build the ceph.ko module for centos7 (3.10.0-123.el7 kernel) QA release? rebuilding the 0.81 ceph-kmod src rpms gives modules for libceph and rbd, but the one

[ceph-users] ceph kernel module for centos7

2014-06-27 Thread Stijn De Weirdt
hi all, does anyone know how to build the ceph.ko module for centos7 (3.10.0-123.el7 kernel) QA release? rebuilding the 0.81 ceph-kmod src rpms gives modules for libceph and rbd, but the one for ceph fails with an error (same issue as for rhel7rc, see

Re: [ceph-users] qemu packages for el7

2014-06-18 Thread Stijn De Weirdt
hi kenneth, yes there are, if you create them. from the centos git sources wiki, use the qemu-kvm repo; in the spec, set rhev to 1 (and change the release) and build it. update the installed rpms and done. works out of the box (but maybe not so much from your side of our office ;) stijn On

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-16 Thread Stijn De Weirdt
this issue. ok, we'll rebuild and try asap stijn Regards Yan, Zheng Thanks! Regards Yan, Zheng Thanks! Kenneth - Message from Stijn De Weirdt stijn.dewei...@ugent.be - Date: Fri, 04 Apr 2014 20:31:34 +0200 From: Stijn De Weirdt stijn.dewei...@ugent.be Subject: Re: [ceph

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-04 Thread Stijn De Weirdt
hi yan, (taking the list in CC) On 04/04/2014 04:44 PM, Yan, Zheng wrote: On Thu, Apr 3, 2014 at 2:52 PM, Stijn De Weirdt stijn.dewei...@ugent.be wrote: hi, latest pprof output attached. this is no kernel client, this is ceph-fuse on EL6. starting the mds without any ceph-fuse mounts works

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-02 Thread Stijn De Weirdt
hi gregory, (i'm a colleague of kenneth) 1) How big and what shape the filesystem is. Do you have some extremely large directory that the MDS keeps trying to load and then dump? any way to extract this from the mds without having to start it? as it was an rsync operation, i can try to locate

Re: [ceph-users] ceph 0.78 mon and mds crashing (bus error)

2014-04-02 Thread Stijn De Weirdt
hi, 1) How big and what shape the filesystem is. Do you have some extremely large directory that the MDS keeps trying to load and then dump? any way to extract this from the mds without having to start it? as it was an rsync operation, i can try to locate possible candidates on the source

Re: [ceph-users] clock skew

2014-03-13 Thread Stijn De Weirdt
can we retest the clock skew condition? or get the value of the skew? ceph status gives health HEALTH_WARN clock skew detected on mon.ceph003 in a polysh session (ie a parallel ssh sort of thing) ready (3) date +%s.%N ceph002 : 1394713567.184218678 ceph003 : 1394713567.182722045 ceph001 :
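the measured skew per mon also shows up in health detail, and comparing the ntp offsets directly is easier to interpret than raw date output (sketch, assuming ntpd on the mon hosts):

  # per-mon skew as seen by ceph
  ceph health detail | grep -i skew
  # local ntp peer offsets (in ms) on each mon host
  ntpq -pn | awk 'NR > 2 {print $1, $9}'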

Re: [ceph-users] Dell H310

2014-03-07 Thread Stijn De Weirdt
we tried this with a Dell H200 (also LSI2008 based). however, running some basic benchmarks, we saw no immediate difference between IT and IR firmware. so i'd like to know: what kind of performance improvement do you get, and how did you measure it? thanks a lot stijn the howto for

Re: [ceph-users] Dell H310

2014-03-07 Thread Stijn De Weirdt
we tried this with a Dell H200 (also LSI2008 based). however, running some basic benchmarks, we saw no immediate difference between IT and IR firmware. so i'd like to know: what kind of performance improvement do you get, and how did you measure it? IMO, flashing to IT firmware is mainly
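in case it helps others comparing IT vs IR firmware: one way to quantify it is to benchmark a single disk behind the controller before and after flashing (read-only sketch, /dev/sdX is a placeholder for a disk on the hba):

  # sequential throughput
  fio --name=seq --filename=/dev/sdX --direct=1 --rw=read --bs=1M --runtime=30 --time_based
  # 4k random read iops
  fio --name=rand --filename=/dev/sdX --direct=1 --ioengine=libaio --rw=randread --bs=4k --iodepth=32 --runtime=30 --time_based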