Re: [ceph-users] BlueStore bitmap allocator under Luminous and Mimic

2019-07-09 Thread Konstantin Shalygin
On 5/28/19 5:16 PM, Marc Roos wrote: I switched on the first of May, and did not notice too much difference in memory usage. After the restart of the OSDs on the node I see the memory consumption gradually getting back to where it was before. Can't say anything about latency. Anybody else? Wido? I see many

Re: [ceph-users] ceph-volume failed after replacing disk

2019-07-09 Thread ST Wong (ITSC)
Hi all, I’m testing failover behavior of mon/mgr by stopping the active one. Found message at the end of “ceph -s” output: progress: Rebalancing after osd.71 marked out [..] Is this normal? Thanks and Rgds. /st From: ceph-users On Behalf Of ST Wong

Re: [ceph-users] Ceph performance IOPS

2019-07-09 Thread Davis Mendoza Paco
What would be the most appropriate procedure to move blockdb/wal to SSD? 1.- remove the OSD and recreate it (affects the performance) ceph-volume lvm prepare --bluestore --data --block.wal --block.db 2.- Follow the documentation
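
A rough sketch of option 1 (recreate the OSD with its DB/WAL on an SSD); the OSD id 12 and the device paths are placeholders, not taken from the thread:

  # take the OSD out and destroy it, keeping its id for reuse
  ceph osd out 12
  systemctl stop ceph-osd@12
  ceph osd destroy 12 --yes-i-really-mean-it
  ceph-volume lvm zap /dev/sdb --destroy

  # recreate it with block.db and block.wal on the SSD/NVMe
  ceph-volume lvm prepare --bluestore --osd-id 12 --data /dev/sdb \
      --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2
  ceph-volume lvm activate --all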

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Igor Fedotov
This will cap single bluefs space allocation. Currently it attempts to allocate 70 GB, which seems to overflow some 32-bit length fields. With the adjustment such an allocation should be capped at ~700 MB. I doubt there is any relation between this specific failure and the pool. At least at the

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Brett Chancellor
What does bluestore_bluefs_gift_ratio do? I can't find any documentation on it. Also do you think this could be related to the .rgw.meta pool having too many objects per PG? The disks that die always seem to be backfilling a pg from that pool, and they have ~550k objects per PG. -Brett On Tue,

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Igor Fedotov
Please try to set bluestore_bluefs_gift_ratio to 0.0002 On 7/9/2019 7:39 PM, Brett Chancellor wrote: Too large for pastebin.. The problem is continually crashing new OSDs. Here is the latest one. On Tue, Jul 9, 2019 at 11:46 AM Igor Fedotov > wrote: could you
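
For reference, a minimal sketch of applying that on a Nautilus cluster via the centralized config; osd.N is a placeholder for the crashing OSD:

  # cap how much space bluefs requests in a single gift
  ceph config set osd.N bluestore_bluefs_gift_ratio 0.0002
  # verify the stored value
  ceph config get osd.N bluestore_bluefs_gift_ratio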

Re: [ceph-users] DR practice: "uuid != super.uuid" and csum error at blob offset 0x0

2019-07-09 Thread Igor Fedotov
Hi Mark, I doubt read-only mode would help here. Log replay is required to build a consistent store state and one can't bypass it. And it looks like your drive/controller still detects some errors while reading. For the second issue this PR might help (you'll be able to disable csum

Re: [ceph-users] Ceph features and linux kernel version for upmap

2019-07-09 Thread Paul Emmerich
On Tue, Jul 9, 2019 at 4:32 PM Mattia Belluco wrote: > Hi Paul, > > > Should I just go ahead? yes > My guess is that those clients (that are CephFS > clients using the kernel driver) will not be able to mount the > filesystem anymore. > no, that will continue to work because it only blocks

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Igor Fedotov
Could you please set debug bluestore to 20 and collect a startup log for this specific OSD once again? On 7/9/2019 6:29 PM, Brett Chancellor wrote: I restarted most of the OSDs with the stupid allocator (6 of them wouldn't start unless bitmap allocator was set), but I'm still seeing issues
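
One way to capture such a startup log, assuming the affected OSD is osd.12 and logs go to the default location (a sketch, not quoted from the thread):

  ceph config set osd.12 debug_bluestore 20/20
  systemctl restart ceph-osd@12
  # the startup output lands in the usual per-OSD log
  less /var/log/ceph/ceph-osd.12.log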

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Brett Chancellor
I restarted most of the OSDs with the stupid allocator (6 of them wouldn't start unless bitmap allocator was set), but I'm still seeing issues with OSDs crashing. Interestingly it seems that the dying OSDs are always working on a pg from the .rgw.meta pool when they crash. Log :

Re: [ceph-users] Ceph features and linux kernel version for upmap

2019-07-09 Thread Mattia Belluco
Hi Paul, thanks for the quick reply. I am still a bit puzzled as the documentation states: "To allow use of the feature, you must tell the cluster that it only needs to support luminous (and newer) clients with: ceph osd set-require-min-compat-client luminous" That of course in my case fails
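
For context, the command in question; the override flag should only be used once you are confident the reported pre-luminous clients can tolerate it (a sketch, assuming a release that supports the flag):

  ceph features                                   # list connected clients and their feature bits
  ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it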

Re: [ceph-users] Ceph features and linux kernel version for upmap

2019-07-09 Thread Paul Emmerich
Yes, upmap will work with 4.15. The Ceph kernel client is a completely independent implementation from the user space client, so you can't really map it to any specific features. It has some Luminous features but lacks others. Paul -- Paul Emmerich Looking for help with your Ceph cluster?

[ceph-users] Ceph features and linux kernel version for upmap

2019-07-09 Thread Mattia Belluco
Hello ml, I have been looking for an updated table like the one you can see here: https://ceph.com/geen-categorie/feature-set-mismatch-error-on-ceph-kernel-client/ Case in point we would like to use upmap on our ceph cluster (currently used mainly for CephFS) but `ceph feature` returns:
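
For background, the usual sequence for switching to the upmap balancer once the client requirement is satisfied (a sketch of commands as I understand them, not quoted from the thread):

  # prerequisite: ceph osd set-require-min-compat-client luminous (see the rest of the thread)
  ceph balancer mode upmap
  ceph balancer on
  ceph balancer status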

[ceph-users] DR practice: "uuid != super.uuid" and csum error at blob offset 0x0

2019-07-09 Thread Mark Lehrer
My main question is this - is there a way to stop any replay or journaling during OSD startup and bring up the pool/fs in read-only mode? Here is a description of what I'm seeing. I have a Luminous cluster with CephFS and 16 8TB SSDs, using size=3. I had a problem with one of my SAS

[ceph-users] Questions about ceph internals

2019-07-09 Thread Franck Desjeunes
Hi everyone. I have some questions since I'm benchmarking some machines to install a ceph cluster for my personal needs. 1) Is a write operation between a ceph client and an OSD synchronous or asynchronous? As far as I know, the primary OSD notifies the client once all OSDs (primary +

Re: [ceph-users] Missing Ubuntu Packages on Luminous

2019-07-09 Thread Paul Emmerich
Yes, Luminous isn't being built for 18.04 -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, Jul 8, 2019 at 2:45 PM Stolte, Felix wrote: > Hi folks, > > I want to use

Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-09 Thread Matt Benjamin
Hi Harald, Please file a tracker issue, yes. (Deletes do tend to be slower, presumably due to rocksdb compaction.) Matt On Tue, Jul 9, 2019 at 7:12 AM Harald Staub wrote: > > Currently removing a bucket with a lot of objects: > radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc

Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-09 Thread Paul Emmerich
Try to add "--inconsistent-index" (caution: will obviously leave your bucket in a broken state during the deletion, so don't try to use the bucket) You can also speed up the deletion with "--max-concurrent-ios" (default 32). The documentation incorrectly claims that "--max-concurrent-ios" is only

[ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-09 Thread Harald Staub
Currently removing a bucket with a lot of objects: radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects This process was killed by the out-of-memory killer. Then looking at the graphs, we see a continuous increase of memory usage for this process, about +24 GB per day. Removal

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Igor Fedotov
Hi Brett, in Nautilus you can do that via ceph config set osd.N bluestore_allocator stupid ceph config set osd.N bluefs_allocator stupid See https://ceph.com/community/new-mimic-centralized-configuration-management/ for more details on a new way of configuration options setting. A known
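
Spelled out for a single OSD (N is a placeholder id); as far as I understand, the allocator is picked at startup, so a restart is needed:

  ceph config set osd.N bluestore_allocator stupid
  ceph config set osd.N bluefs_allocator stupid
  systemctl restart ceph-osd@N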

Re: [ceph-users] slow requests due to scrubbing of very small pg

2019-07-09 Thread Igor Fedotov
Hi Lukasz, if this is filestore then most probably my comments are irrelevant. The issue I expected is BlueStore-specific. Unfortunately I'm not an expert in filestore, hence unable to help with further investigation. Sorry... Thanks, Igor On 7/9/2019 11:39 AM, Luk wrote: We have

Re: [ceph-users] slow requests due to scrubbing of very small pg

2019-07-09 Thread Luk
We still have filestore on these OSDs. Regards Lukasz > Hi Igor, > Thank you for your input, will try your suggestion with > ceph-objectstore-tool. > But for now it looks like main problem is this: > 2019-07-09 09:29:25.410839 7f5e4b64f700 1 heartbeat_map is_healthy >

Re: [ceph-users] slow requests due to scrubbing of very small pg

2019-07-09 Thread Luk
Hi Igor, Thank you for your input, will try your suggestion with ceph-objectstore-tool. But for now it looks like the main problem is this: 2019-07-09 09:29:25.410839 7f5e4b64f700 1 heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f5e20e87700' had timed out after 15 2019-07-09

Re: [ceph-users] What's the best practice for Erasure Coding

2019-07-09 Thread Frank Schilder
Small addition: This result holds for rbd bench. It seems to imply good performance for large-file IO on cephfs, since cephfs will split large files into many objects of size object_size. Small-file IO is a different story. The formula should be N*alloc_size=object_size/k, where N is some
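
A worked example of that formula with assumed numbers (not from the thread): with a 4 MiB object_size, k=8 and a 64 KiB BlueStore alloc_size,

  chunk size = object_size / k = 4 MiB / 8 = 512 KiB
  N = chunk size / alloc_size = 512 KiB / 64 KiB = 8

so each chunk maps onto a whole number of allocation units; a combination that does not divide this cleanly would presumably leave a partially used allocation unit per chunk.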

Re: [ceph-users] What's the best practice for Erasure Coding

2019-07-09 Thread Frank Schilder
Hi Nathan, it's just a hypothesis. I did not check what the algorithm does. The reasoning is this. Bluestore and modern disks have preferred read/write sizes that are quite large for large drives. These are usually powers of 2. If you use a k+m EC profile, any read/write is split into k

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Konstantin Shalygin
I'll give that a try. Is it something like... ceph tell 'osd.*' bluestore_allocator stupid ceph tell 'osd.*' bluefs_allocator stupid And should I expect any issues doing this? You should place this in ceph.conf and restart your OSDs. Otherwise, this should fix the new bitmap allocator issue via
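
The ceph.conf variant Konstantin describes would look roughly like this (section placement is my assumption), followed by an OSD restart on each host:

  [osd]
  bluestore_allocator = stupid
  bluefs_allocator = stupid

  # then, per host:
  systemctl restart ceph-osd.target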