[ceph-users] Re: Increasing QD=1 performance (lowering latency)

2021-02-08 Thread Paul Emmerich
A few things that you can try on the network side to shave off microseconds: 1) 10G Base-T has quite some latency compared to fiber or DAC. I've measured 2 µs on Base-T vs. 0.3µs on fiber for one link in one direction, so that's 8µs you can save for a round-trip if it's client -> switch -> osd
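Back-of-envelope version of that figure: a client -> switch -> OSD round trip crosses each of the two links twice, i.e. four link traversals; at roughly 2 µs per traversal on 10G Base-T versus roughly 0.3 µs on fiber/DAC that is about 8 µs versus about 1 µs of pure link latency per round trip.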

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-24 Thread Paul Emmerich
Well, what I was saying was "does it hurt to unconditionally run hdparm -W 0 on all disks?" Which disk would suffer from this? I haven't seen any disk where this would be a bad idea Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io
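A minimal sketch for trying that by hand (device name is illustrative; -W without a value only reports the current setting):
    hdparm -W /dev/sda     # show current write-cache state
    hdparm -W 0 /dev/sda   # disable the volatile write cache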

[ceph-users] Re: Feedback of the used configuration

2020-06-24 Thread Paul Emmerich
Have a look at cephfs subvolumes: https://docs.ceph.com/docs/master/cephfs/fs-volumes/#fs-subvolumes They are internally just directories with quota/pool placement layout/namespace with some mgr magic to make it easier than doing that all by hand Paul -- Paul Emmerich Looking for help
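A minimal sketch of the subvolume workflow (volume, group and size values are illustrative; see the fs-volumes docs linked above for the full option list):
    ceph fs subvolumegroup create cephfs mygroup
    ceph fs subvolume create cephfs mysubvol --group_name mygroup --size 10737418240   # quota in bytes
    ceph fs subvolume getpath cephfs mysubvol --group_name mygroup                     # path to mount/authorize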

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2020-06-24 Thread Paul Emmerich
Has anyone ever encountered a drive with a write cache that actually *helped*? I haven't. As in: would it be a good idea for the OSD to just disable the write cache on startup? Worst case it doesn't do anything, best case it improves latency. Paul -- Paul Emmerich Looking for help with your

[ceph-users] Re: Nautilus: Monitors not listening on msgrv1

2020-06-23 Thread Paul Emmerich
Listening on both v1 and v2 is the default. You can simply write it like this if you are running on the default ports: mon_host = 10.144.0.2, 10.144.0.3, 10.144.0.4 This has the advantage of being backwards-compatible with old clients. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact
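If you do want to spell the ports out explicitly, the equivalent long form looks like this (addresses taken from the example above):
    mon_host = [v2:10.144.0.2:3300,v1:10.144.0.2:6789],[v2:10.144.0.3:3300,v1:10.144.0.3:6789],[v2:10.144.0.4:3300,v1:10.144.0.4:6789]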

[ceph-users] Re: Radosgw huge traffic to index bucket compared to incoming requests

2020-06-19 Thread Paul Emmerich
f here is recovery speed/locked objects vs. read amplification) I think the formula shards = bucket_size / 100k shouldn't apply for buckets with >= 100 million objects; shards should become bigger as the bucket size increases. Paul -- Paul Emmerich Looking for help with your Ceph cluster? C
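For scale: the shards = bucket_size / 100k rule gives 500 shards for a 50-million-object bucket and 1000 shards at 100 million objects, which is where the suggestion above to let individual shards grow larger instead comes in.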

[ceph-users] Re: help with failed osds after reboot

2020-06-15 Thread Paul Emmerich
On Mon, Jun 15, 2020 at 7:01 PM wrote: > Ceph version 10.2.7 > > ceph.conf > [global] > fsid = 75d6dba9-2144-47b1-87ef-1fe21d3c58a8 > (...) > mount_activate: Failed to activate > ceph-disk: Error: No cluster conf found in /etc/ceph with fsid > e1d7b4ae-2dcd-40ee-bea

[ceph-users] Re: OSD upgrades

2020-06-02 Thread Paul Emmerich
don't get back (or broken disks that you don't replace quickly) Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Tue, Jun 2, 2020 at 12:32 PM Thomas Byrne -

[ceph-users] Re: OSD upgrades

2020-06-02 Thread Paul Emmerich
"reweight 0" and "out" are the exact same thing Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Tue, Jun 2, 2020 at 9:30 AM Wido den Hollander

[ceph-users] Re: [ceph-users]: Ceph Nautius not working after setting MTU 9000

2020-05-29 Thread Paul Emmerich
-- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Fri, May 29, 2020 at 2:15 AM Dave Hall wrote: > Hello. > > A few days ago I offered to share the notes I've

[ceph-users] Re: No scrubbing during upmap balancing

2020-05-29 Thread Paul Emmerich
Did you disable "osd scrub during recovery"? Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Fri, May 29, 2020 at 12:04 AM Vytenis A wrote: > Forg
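To check or change that centrally on Nautilus or later, a sketch (older releases would use injectargs instead):
    ceph config get osd osd_scrub_during_recovery
    ceph config set osd osd_scrub_during_recovery true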

[ceph-users] Re: The sufficient OSD capabilities to enable write access on cephfs

2020-05-29 Thread Paul Emmerich
There are two bugs that may cause the tag to be missing from the pools, you can somehow manually add these tags with "ceph osd pool application ..."; I think I posted these commands some time ago on tracker.ceph.com Paul -- Paul Emmerich Looking for help with your Ceph cluster?
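A sketch of what that tag fix looks like (pool and filesystem names are illustrative; the tags "ceph fs authorize" checks are the data/metadata keys under the cephfs application):
    ceph osd pool application set cephfs_data cephfs data myfs
    ceph osd pool application set cephfs_metadata cephfs metadata myfs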

[ceph-users] Re: High latency spikes under jewel

2020-05-27 Thread Paul Emmerich
Common problem for FileStore and really no point in debugging this: upgrade everything to a recent version and migrate to BlueStore. 99% of random latency spikes are just fixed by doing that. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit

[ceph-users] Re: 15.2.2 bluestore issue

2020-05-27 Thread Paul Emmerich
Hi, since this bug may lead to data loss when several OSDs crash at the same time (e.g., after a power outage): can we pull the release from the mirrors and docker hub? Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h

[ceph-users] Re: [External Email] Re: Ceph Nautius not working after setting MTU 9000

2020-05-26 Thread Paul Emmerich
Don't optimize stuff without benchmarking *before and after*, don't apply random tuning tips from the Internet without benchmarking them. My experience with Jumbo frames: 3% performance. On an NVMe-only setup with 100 Gbit/s network. Paul -- Paul Emmerich Looking for help with your Ceph

[ceph-users] Re: diskprediction_local prediction granularity

2020-05-20 Thread Paul Emmerich
On Wed, May 20, 2020 at 5:36 PM Vytenis A wrote: > Is it possible to get any finer prediction date? > related question: did anyone actually observe any correlation between the predicted failure time and the actual time until a failure occurs? Paul -- Paul Emmerich Looking fo

[ceph-users] Re: osds dropping out of the cluster w/ "OSD::osd_op_tp thread … had timed out"

2020-05-19 Thread Paul Emmerich
On Tue, May 19, 2020 at 3:11 PM thoralf schulze wrote: > > On 5/19/20 2:13 PM, Paul Emmerich wrote: > > 3) if necessary add more OSDs; common problem is having very > > few dedicated OSDs for the index pool; running the index on > > all OSDs (and having a fast

[ceph-users] Re: osds dropping out of the cluster w/ "OSD::osd_op_tp thread … had timed out"

2020-05-19 Thread Paul Emmerich
On Tue, May 19, 2020 at 2:06 PM Igor Fedotov wrote: > Hi Thoralf, > > given the following indication from your logs: > > May 18 21:12:34 ceph-osd-05 ceph-osd[2356578]: 2020-05-18 21:12:34.211 > 7fb25cc80700 0 bluestore(/var/lib/ceph/osd/ceph-293) log_latency_fn > slow operation observed for

[ceph-users] Re: Dealing with non existing crush-root= after reclassify on ec pools

2020-05-18 Thread Paul Emmerich
that part of erasure profiles are only used when a crush rule is created when creating a pool without explicitly specifying a crush rule Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel

[ceph-users] Re: nfs migrate to rgw

2020-05-18 Thread Paul Emmerich
for erasure coding on HDDs, but that's unrelated to rgw/you'd have the same problem with CephFS Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 > > Wido den

[ceph-users] Re: Disproportionate Metadata Size

2020-05-13 Thread Paul Emmerich
osd df is misleading when using external DB devices, they are always counted as 100% full there Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Wed, May 13, 2020

[ceph-users] Re: OSD corruption and down PGs

2020-05-12 Thread Paul Emmerich
First thing I'd try is to use objectstore-tool to scrape the inactive/broken PGs from the dead OSDs using its PG export feature. Then import these PGs into any other OSD which will automatically recover it. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https
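A sketch of that export/import cycle (OSD and PG IDs are illustrative; both OSDs must be stopped while the tool runs):
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-42 --op export --pgid 7.1a --file /tmp/pg.7.1a.export
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-17 --op import --file /tmp/pg.7.1a.export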

[ceph-users] Re: Zeroing out rbd image or volume

2020-05-12 Thread Paul Emmerich
And many hypervisors will turn writing zeroes into an unmap/trim (qemu detect-zeroes=unmap), so running trim on the entire empty disk is often the same as writing zeroes. So +1 for encryption being the proper way here Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us

[ceph-users] Re: Ceph meltdown, need help

2020-05-05 Thread Paul Emmerich
Check network connectivity on all configured networks between all hosts; OSDs running but being marked as down is usually a network problem Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io

[ceph-users] Re: mount issues with rbd running xfs - Structure needs cleaning

2020-05-04 Thread Paul Emmerich
when encountering a read-only block device Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, May 4, 2020 at 7:05 PM Void Star Nill wrote: > Thanks Janne

[ceph-users] Re: 14.2.9 MDS Failing

2020-05-01 Thread Paul Emmerich
On Fri, May 1, 2020 at 9:27 PM Paul Emmerich wrote: > The OpenFileTable objects are safe to delete while the MDS is offline > anyways, the RADOS object names are mds*_openfiles* > I should clarify this a little bit: you shouldn't touch the CephFS internal state or data structures u

[ceph-users] Re: 14.2.9 MDS Failing

2020-05-01 Thread Paul Emmerich
The OpenFileTable objects are safe to delete while the MDS is offline anyways, the RADOS object names are mds*_openfiles* Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585
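A sketch of what that looks like (metadata pool and object names are illustrative; only do this while the MDS is offline, per the caveat in the follow-up listed above):
    rados -p cephfs_metadata ls | grep openfiles      # e.g. mds0_openfiles.0, mds0_openfiles.1
    rados -p cephfs_metadata rm mds0_openfiles.0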

[ceph-users] Re: 4.14 kernel or greater recommendation for multiple active MDS

2020-05-01 Thread Paul Emmerich
I've seen issues with client reconnects on older kernels, yeah. They sometimes get stuck after a network failure Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90

[ceph-users] Re: Ceph MDS - busy?

2020-04-30 Thread Paul Emmerich
Things to check: * metadata is on SSD? * try multiple active MDS servers * try a larger cache for the MDS * try a recent version of Ceph Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel
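A sketch of the knobs involved (filesystem name and sizes are illustrative):
    ceph config set mds mds_cache_memory_limit 8589934592   # 8 GiB MDS cache
    ceph fs set cephfs max_mds 2                             # add a second active MDS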

[ceph-users] Re: ceph crash hangs forever and recovery stop

2020-04-30 Thread Paul Emmerich
reports) Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Thu, Apr 30, 2020 at 4:09 PM Francois Legrand wrote: > Hi everybody (again), > We recently had

[ceph-users] Re: Upgrade Luminous to Nautilus on a Debian system

2020-04-29 Thread Paul Emmerich
ade assistant; it's just one button that does all the right things in the right order. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Wed, Apr 29, 2020 at 8:58 PM He

[ceph-users] Re: Nautilus cluster damaged + crashing OSDs

2020-04-21 Thread Paul Emmerich
On Tue, Apr 21, 2020 at 12:44 PM Brad Hubbard wrote: > > On Tue, Apr 21, 2020 at 6:35 PM Paul Emmerich wrote: > > > > On Tue, Apr 21, 2020 at 3:20 AM Brad Hubbard wrote: > > > > > > Wait for recovery to finish so you know whether any data from the down

[ceph-users] Re: Nautilus cluster damaged + crashing OSDs

2020-04-21 Thread Paul Emmerich
On Tue, Apr 21, 2020 at 3:20 AM Brad Hubbard wrote: > > Wait for recovery to finish so you know whether any data from the down > OSDs is required. If not just reprovision them. Recovery will not finish from this state as several PGs are down and/or stale. Paul > > If data is required from the

[ceph-users] Re: Check if upmap is supported by client?

2020-04-14 Thread Paul Emmerich
client requirement; I don't know the command to do this off the top of my head Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 > > Many thanks and best rega

[ceph-users] Re: Check if upmap is supported by client?

2020-04-13 Thread Paul Emmerich
bit 21 in the features bitmap is upmap support Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, Apr 13, 2020 at 11:53 AM Frank Schilder wrote: > > De
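The client-requirement command alluded to in the newer reply above is most likely this one; a sketch:
    ceph features                                     # per-client feature bitmaps / release names
    ceph osd set-require-min-compat-client luminous   # refuses if connected clients are too old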

[ceph-users] Re: ceph-df free discrepancy

2020-04-10 Thread Paul Emmerich
On Sat, Apr 11, 2020 at 12:43 AM Reed Dier wrote: > That said, as a straw man argument, ~380GiB free, times 60 OSDs, should be > ~22.8TiB free, if all OSD's grew evenly, which they won't Yes, that's the problem. They won't grow evenly. The fullest one will grow faster than the others. Also,

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-10 Thread Paul Emmerich
as usage patterns change over the lifetime of a cluster. Does anyone have any real-world experience with LVM cache? Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Fri, Apr

[ceph-users] Re: remove S3 bucket with rados CLI

2020-04-10 Thread Paul Emmerich
Quick & dirty solution if only one OSD is full (likely as it looks very unbalanced): take down the full OSD, delete data, take it back online Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croi

[ceph-users] Re: [Octopus] OSD overloading

2020-04-08 Thread Paul Emmerich
What's the CPU busy with while spinning at 100%? Check "perf top" for a quick overview Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Wed, Apr 8, 2020

[ceph-users] Re: Fwd: question on rbd locks

2020-04-07 Thread Paul Emmerich
On Tue, Apr 7, 2020 at 6:49 PM Void Star Nill wrote: > So is there a way to tell ceph to release the lock if the client becomes > unavailable? That's the task of the new client trying to take the lock, it needs to kick out the old client and blacklist the connection to ensure consistency. A
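Done by hand, that kick-out looks roughly like this (pool/image, lock ID and addresses are illustrative placeholders taken from the lock listing):
    rbd lock ls mypool/myimage                          # shows lock id, locker and client address
    rbd lock rm mypool/myimage "<lock-id>" client.4567  # remove the stale lock
    ceph osd blacklist add 192.168.1.20:0/123456789     # blacklist the old client's address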

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-06 Thread Paul Emmerich
The keyword to search for is "deferred writes", there are several parameters that control the size and maximum number of ops that'll be "cached". Increasing to 1 MB is probably a bad idea. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at htt
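The options in question are, as far as I can tell, these (values shown are the HDD defaults to the best of my knowledge, not a recommendation):
    bluestore_prefer_deferred_size_hdd = 32768   # writes at or below this size take the deferred/WAL path
    bluestore_deferred_batch_ops_hdd = 64        # deferred ops batched before flushing to the slow device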

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-04 Thread Paul Emmerich
ta of the > osd- not actual data. Thus a data-commit to the osd til still be dominated > by the writelatency of the underlying - very slow HDD. small writes (<= 32kb, configurable) are written to db first and written back to the slow disk asynchronous to the original request. -- Paul Emmerich Lo

[ceph-users] Re: different RGW Versions on same ceph cluster

2020-04-03 Thread Paul Emmerich
No, this is not supported. You must follow the upgrade order for services. The reason is that many parts of RGW are implemented in the OSDs themselves, so you can't run a new RGW against an old OSD. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io

[ceph-users] Re: LARGE_OMAP_OBJECTS 1 large omap objects

2020-04-02 Thread Paul Emmerich
Safe to ignore/increase the warning threshold. You are seeing this because the warning level was reduced to 200k from 2M recently. The file will be sharded in a newer version which will clean this up Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https
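If you decide to just raise the threshold, a sketch (the value is illustrative; the option is the key-count threshold behind the health warning):
    ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 1000000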

[ceph-users] Re: Rados example: create namespace, user for this namespace, read and write objects with created namespace and user

2020-03-11 Thread Paul Emmerich
to create a namespace) Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Wed, Mar 11, 2020 at 4:22 PM Rodrigo Severo - Fábrica wrote: > > Em ter., 10 de mar.

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-03-11 Thread Paul Emmerich
Encountered this one again today, I've updated the issue with new information: https://tracker.ceph.com/issues/44184 Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90

[ceph-users] Re: Accidentally removed client.admin caps - fix via mon doesn't work

2020-03-11 Thread Paul Emmerich
This indicates that there's something wrong with the config on that mon node. The command should work on any Ceph node that has the keyring. You should check ceph.conf on the monitor node, maybe there's some kind of misconfiguration that might cause other problems in the future. Paul -- Paul

[ceph-users] Re: cephfs snap mkdir strange timestamp

2020-03-10 Thread Paul Emmerich
There's an xattr for this: ceph.snap.btime IIRC Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Tue, Mar 10, 2020 at 11:42 AM Marc Roos wrote: > > > &g
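From a client mount that would be read roughly like this (mount point and snapshot name are illustrative; needs a reasonably recent kernel client or ceph-fuse, plus getfattr from the attr package):
    getfattr -n ceph.snap.btime /mnt/cephfs/mydir/.snap/mysnap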

[ceph-users] Re: Monitors' election failed on VMs : e4 handle_auth_request failed to assign global_id

2020-03-10 Thread Paul Emmerich
top and then immediately remove before stopping the next one"? Otherwise that's the problem. -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 > > The 3 old monitor

[ceph-users] Re: Accidentally removed client.admin caps - fix via mon doesn't work

2020-03-09 Thread Paul Emmerich
There's only one mon keyring that's shared by all mons, the mon user therefore doesn't contain the mon name. Try "-n mon." Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io T
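Concretely, restoring the admin caps with the mon's own key looks roughly like this (keyring path varies with your mon name):
    ceph -n mon. --keyring /var/lib/ceph/mon/ceph-$(hostname -s)/keyring \
        auth caps client.admin mon 'allow *' osd 'allow *' mds 'allow *' mgr 'allow *'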

[ceph-users] Re: ceph df hangs

2020-03-09 Thread Paul Emmerich
"ceph df" is handled by the mgr, check if your mgr is up and running and if the user has the necessary permissions for the mgr. -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49

[ceph-users] Re: Octopus release announcement

2020-03-03 Thread Paul Emmerich
On Mon, Mar 2, 2020 at 7:19 PM Alex Chalkias wrote: > > Thanks for the update. Are you doing a beta-release prior to the official > launch? the first RC was tagged a few weeks ago: https://github.com/ceph/ceph/tree/v15.1.0 Paul > > > On Mon, Mar 2, 2020 at 7:12 PM Sage Weil wrote: > > > It's

[ceph-users] Re: Cache tier OSDs crashing due to unfound hitset object 14.2.7

2020-02-27 Thread Paul Emmerich
Also: make a backup using the PG export feature of objectstore-tool before doing anything else. Sometimes it's enough to export and delete the PG from the broken OSD and import it into a different OSD using objectstore-tool. Paul -- Paul Emmerich Looking for help with your Ceph cluster

[ceph-users] Re: Cache tier OSDs crashing due to unfound hitset object 14.2.7

2020-02-27 Thread Paul Emmerich
(but you should try to understand what exactly is happening before running random ceph-objectstore-tool commands) Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Thu

[ceph-users] Re: Cache tier OSDs crashing due to unfound hitset object 14.2.7

2020-02-27 Thread Paul Emmerich
I've also encountered this issue, but luckily without the crashing OSDs, so marking as lost resolved it for us. See https://tracker.ceph.com/issues/44286 Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München

[ceph-users] Re: Migrating data to a more efficient EC pool

2020-02-25 Thread Paul Emmerich
Possible without downtime: Configure multi-site, create a new zone for the new pool, let the cluster sync to itself, do a failover to the new zone, delete old zone. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h

[ceph-users] Re: ceph nvme 2x replication

2020-02-19 Thread Paul Emmerich
x2 replication is perfectly fine as long as you also keep min_size at 2 ;) (But that means you're offline as soon as something is offline) Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel
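For completeness, that combination is just (pool name illustrative):
    ceph osd pool set mypool size 2
    ceph osd pool set mypool min_size 2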

[ceph-users] Re: [FORGED] Lost all Monitors in Nautilus Upgrade, best way forward?

2020-02-19 Thread Paul Emmerich
On Wed, Feb 19, 2020 at 10:03 AM Wido den Hollander wrote: > > > > On 2/19/20 8:49 AM, Sean Matheny wrote: > > Thanks, > > > >> If the OSDs have a newer epoch of the OSDMap than the MON it won't work. > > > > How can I verify this? (i.e the epoch of the monitor vs the epoch of the > > osd(s)) > >

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-02-19 Thread Paul Emmerich
On Wed, Feb 19, 2020 at 7:26 AM Wido den Hollander wrote: > > > > On 2/18/20 6:54 PM, Paul Emmerich wrote: > > I've also seen this problem on Nautilus with no obvious reason for the > > slowness once. > > Did this resolve itself? Or did you remove the pool? I'

[ceph-users] Re: osd_pg_create causing slow requests in Nautilus

2020-02-18 Thread Paul Emmerich
I've also seen this problem on Nautilus with no obvious reason for the slowness once. In my case it was a rather old cluster that was upgraded all the way from firefly -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247

[ceph-users] Re: Identify slow ops

2020-02-17 Thread Paul Emmerich
that's probably just https://tracker.ceph.com/issues/43893 (a harmless bug) Restart the mons to get rid of the message Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90

[ceph-users] Re: recovery_unfound

2020-02-03 Thread Paul Emmerich
bably also sufficient to just run "ceph osd down" on the primaries on the affected PGs to get them to re-check. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon,

[ceph-users] Re: cephf_metadata: Large omap object found

2020-02-03 Thread Paul Emmerich
The warning threshold recently changed, I'd just increase it in this particular case. It just means you have lots of open files. I think there's some work going on to split the openfiles object into multiple, so that problem will be fixed. Paul -- Paul Emmerich Looking for help with your

[ceph-users] Re: data loss on full file system?

2020-02-03 Thread Paul Emmerich
f a 70k > files linux source tree went from 15 s to 6 minutes on a local filesystem > I have at hand. Don't do it for every file: cp foo bar; sync > > Best regards, > Håkan > > > > > > > > > Paul > > > > -- > > Paul Emmerich >

[ceph-users] Re: Inactive pgs preventing osd from starting

2020-01-31 Thread Paul Emmerich
If you don't care about the data: set osd_find_best_info_ignore_history_les = true on the affected OSDs temporarily. This means losing data. For anyone else reading this: don't ever use this option. It's evil and causes data loss (but gets your PG back and active, yay!) Paul -- Paul Emmerich

[ceph-users] Re: Micron SSD/Basic Config

2020-01-31 Thread Paul Emmerich
On Fri, Jan 31, 2020 at 2:06 PM EDH - Manuel Rios wrote: > > Hmm change 40Gbps to 100Gbps networking. > > 40Gbps technology its just a bond of 4x10 Links with some latency due link > aggregation. > 100 Gbps and 25Gbps got less latency and Good performance. In ceph a 50% of > the latency comes

[ceph-users] Re: data loss on full file system?

2020-01-28 Thread Paul Emmerich
Yes, data that is not synced is not guaranteed to be written to disk, this is consistent with POSIX semantics. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon

[ceph-users] Re: EC pool creation results in incorrect M value?

2020-01-27 Thread Paul Emmerich
min_size in the crush rule and min_size in the pool are completely different things that happen to share the same name. Ignore min_size in the crush rule, it has virtually no meaning in almost all cases (like this one). Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact

[ceph-users] Re: cephfs : write error: Operation not permitted

2020-01-24 Thread Paul Emmerich
application set cephfs To work with "ceph fs authorize" We automatically run this in croit on startup on all cephfs pools to make the permissions work properly for our users. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Fr

[ceph-users] Re: Benchmark results for Seagate Exos2X14 Dual Actuator HDDs

2020-01-16 Thread Paul Emmerich
Sorry, we no longer have these test drives :( Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Thu, Jan 16, 2020 at 1:48 PM wrote: > Hi, > > The res

[ceph-users] Benchmark results for Seagate Exos2X14 Dual Actuator HDDs

2020-01-15 Thread Paul Emmerich
for writes, somewhat faster for reads in some scenarios Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io Looking for Ceph training? We have some free spots available https://croit.io/training/4-days-ceph-in-depth-training croit GmbH Freseniusstr. 31h

[ceph-users] Re: Experience with messenger v2 in Nautilus

2020-01-02 Thread Paul Emmerich
://tracker.ceph.com/issues/42583 ). We also had some problems during upgrades in the earlier Nautilus releases, but that seems to be fixed. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel

[ceph-users] Re: radosgw - Etags suffixed with #x0e

2020-01-02 Thread Paul Emmerich
't mix versions like that. Running nautilus and jewel at the same time is unsupported. Upgrade everything and check if that solves your problem. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel:

[ceph-users] Re: High CPU usage by ceph-mgr in 14.2.5

2019-12-19 Thread Paul Emmerich
We're also seeing unusually high mgr CPU usage on some setups; the only thing they have in common seems to be > 300 OSDs. Threads using the CPU are "mgr-fin" and "ms_dispatch" Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https

[ceph-users] Re: OSD state: transitioning to Stray

2019-12-09 Thread Paul Emmerich
An OSD that is down does not recover or backfill. Faster recovery or backfill will not resolve down OSDs Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, Dec

[ceph-users] Re: OSD state: transitioning to Stray

2019-12-09 Thread Paul Emmerich
This message is expected. But your current situation is a great example of why having a separate cluster network is a bad idea in most situations. First thing I'd do in this scenario is to get rid of the cluster network and see if that helps Paul -- Paul Emmerich Looking for help with your

[ceph-users] Re: Size and capacity calculations questions

2019-12-06 Thread Paul Emmerich
Home directories probably means lots of small objects. Default minimum allocation size of BlueStore on HDD is 64 kiB, so there's a lot of overhead for everything smaller; Details: google bluestore min alloc size, can only be changed during OSD creation Paul -- Paul Emmerich Looking for help
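If you do decide to change it, it only takes effect for newly created OSDs, e.g. (value illustrative; smaller allocation sizes have their own space/performance trade-offs):
    # in ceph.conf on the host *before* creating the OSD
    [osd]
    bluestore_min_alloc_size_hdd = 4096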

[ceph-users] Re: Upgrade from Jewel to Nautilus

2019-12-05 Thread Paul Emmerich
to scrub everything on Luminous first as the first scrub on Luminous performs some data structure migrations that are no longer supported on Nautilus. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io

[ceph-users] Re: Building a petabyte cluster from scratch

2019-12-03 Thread Paul Emmerich
It's pretty pointless to discuss erasure coding vs replicated without knowing how it'll be used. There are setups where erasure coding is faster than replicated. You do need to write less data overall, so if that's your bottleneck then erasure coding will be faster. Paul -- Paul Emmerich

[ceph-users] Re: iSCSI Gateway reboots and permanent loss

2019-12-03 Thread Paul Emmerich
Gateway removal is indeed supported since ceph-iscsi 3.0 (or was it 2.7?) and it works while it is offline :) Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Tue

[ceph-users] Re: Possible data corruption with 14.2.3 and 14.2.4

2019-12-02 Thread Paul Emmerich
On Mon, Dec 2, 2019 at 4:55 PM Simon Ironside wrote: > > Any word on 14.2.5? Nervously waiting here . . . real soon, the release is 99% done (check the corresponding thread on the devel mailing list) Paul > > Thanks, > Simon. > > On 18/11/2019 11:29, Simon Ironside wrote: > > > I will sit

[ceph-users] Re: Can min_read_recency_for_promote be -1

2019-12-02 Thread Paul Emmerich
a specialized cache mode that just acts as a write buffer, there are quite a few applications that would benefit from that. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90

[ceph-users] Re: Questions about the EC pool

2019-11-29 Thread Paul Emmerich
It should take ~25 seconds by default to detect a network failure, the config option that controls this is "osd heartbeat grace" (default 20 seconds, but it takes a little longer for it to really detect the failure). Check ceph -w while performing the test. Paul -- Paul Emmeric

[ceph-users] Re: Changing failure domain

2019-11-28 Thread Paul Emmerich
. This is for disaster recovery only, it'll guarantee durability if you lose a room but not availability. 3+2 erasure coding cannot be split across two rooms in this way because, well, you need 3 out of 5 shards to survive, so you cannot lose half of them. Paul -- Paul Emmerich Looking for help with your Ceph

[ceph-users] Re: EC PGs stuck activating, 2^31-1 as OSD ID, automatic recovery not kicking in

2019-11-22 Thread Paul Emmerich
On Fri, Nov 22, 2019 at 9:33 PM Zoltan Arnold Nagy wrote: > The 2^31-1 in there seems to indicate an overflow somewhere - the way we > were able to figure out where exactly > is to query the PG and compare the "up" and "acting" sets - only _one_ > of them had the 2^31-1 number in place > of the

[ceph-users] Re: msgr2 not used on OSDs in some Nautilus clusters

2019-11-19 Thread Paul Emmerich
There should be a warning that says something like "all OSDs are running nautilus but require-osd-release nautilus is not set" That warning did exist for older releases, pretty sure nautilus also has it? Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact u
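For reference, the flag is set with the following, once every OSD is actually running Nautilus:
    ceph osd require-osd-release nautilus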

[ceph-users] Re: Balancing PGs across OSDs

2019-11-18 Thread Paul Emmerich
You have way too few PGs in one of the roots. Many OSDs have so few PGs that you should see a lot of health warnings because of it. The other root has a factor 5 difference in disk size which isn't ideal either. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us

[ceph-users] Re: add debian buster stable support for ceph-deploy

2019-11-18 Thread Paul Emmerich
We maintain an unofficial mirror for Buster packages: https://croit.io/2019/07/07/2019-07-07-debian-mirror Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, Nov

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-06 Thread Paul Emmerich
On Wed, Nov 6, 2019 at 5:57 PM Hermann Himmelbauer wrote: > > Dear Vitaliy, dear Paul, > > Changing the block size for "dd" makes a huge difference. > > However, still some things are not fully clear to me: > > As recommended, I tried writing / reading directly to the rbd and this > is blazingly

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-05 Thread Paul Emmerich
On Mon, Nov 4, 2019 at 11:44 PM Hermann Himmelbauer wrote: > > Hi, > I recently upgraded my 3-node cluster to proxmox 6 / debian-10 and > recreated my ceph cluster with a new release (14.2.4 bluestore) - > basically hoping to gain some I/O speed. > > The installation went flawlessly, reading is

[ceph-users] Re: Ceph Health error right after starting balancer

2019-11-01 Thread Paul Emmerich
Looks like you didn't tell the whole story, please post the *full* output of ceph -s and ceph osd df tree. Wild guess: you need to increase "mon max pg per osd" Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr.

[ceph-users] Re: Ceph Health error right after starting balancer

2019-10-31 Thread Paul Emmerich
up automatically Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Thu, Oct 31, 2019 at 2:27 PM Thomas Schneider <74cmo...@gmail.com> wrote: > > Hi, > > a

[ceph-users] Re: iSCSI write performance

2019-10-31 Thread Paul Emmerich
On Fri, Oct 25, 2019 at 11:14 PM Maged Mokhtar wrote: > 3. vmotion between Ceph datastore and an external datastore..this will be > bad. This seems the case you are testing. It is bad because between 2 > different storage systems (iqns are served on different targets), vaai xcopy > cannot be

[ceph-users] Re: iSCSI write performance

2019-10-31 Thread Paul Emmerich
On Mon, Oct 28, 2019 at 8:07 PM Mike Christie wrote: > > On 10/25/2019 03:25 PM, Ryan wrote: > > Can you point me to the directions for the kernel mode iscsi backend. I > > was following these directions > > https://docs.ceph.com/docs/master/rbd/iscsi-target-cli/ > > > > If you just wanted to use

[ceph-users] Re: Correct Migration Workflow Replicated -> Erasure Code

2019-10-30 Thread Paul Emmerich
because it doesn't actually change the bucket. I don't think it would be too complicated to add a native bucket migration mechanism that works similar to "bucket rewrite" (which is intended for something similar but different). Paul -- Paul Emmerich Looking for help with your Ceph cluster?

[ceph-users] Re: Compression on existing RGW buckets

2019-10-29 Thread Paul Emmerich
On Tue, Oct 29, 2019 at 7:26 PM Bryan Stillwell wrote: > > Thanks Casey, > > If I'm understanding this correctly the only way to turn on RGW compression > is to do it essentially cluster wide in Luminous since all our existing > buckets use the same placement rule? That's not going to work for

[ceph-users] Re: Choosing suitable SSD for Ceph cluster

2019-10-25 Thread Paul Emmerich
Disabling write cache helps with the 970 Pro, but it still sucks. I've worked on a setup with heavy metadata requirements (gigantic S3 buckets being listed) that unfortunately had all of that stored on 970 Pros and that never really worked out. Just get a proper SSD like the 883, 983, or 1725.

[ceph-users] Re: PG badly corrupted after merging PGs on mixed FileStore/BlueStore setup

2019-10-23 Thread Paul Emmerich
On Wed, Oct 23, 2019 at 11:27 PM Sage Weil wrote: > > On Wed, 23 Oct 2019, Paul Emmerich wrote: > > Hi, > > > > I'm working on a curious case that looks like a bug in PG merging > > maybe related to FileStore. > > > > Setup is 14.2.1 that is half Bl

[ceph-users] PG badly corrupted after merging PGs on mixed FileStore/BlueStore setup

2019-10-23 Thread Paul Emmerich
e same for all 3 OSDs in that PG. Has anyone encountered something similar? I'll probably just nuke the affected bucket indices tomorrow and re-create them. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München w

[ceph-users] Re: RadosGW cant list objects when there are too many of them

2019-10-21 Thread Paul Emmerich
disaster waiting to happen if this continues to grow. rgw setups with large buckets need SSDs (or better NVMe) for metadata if you value availability. Recovering after a node failure will be horrible if you keep this on HDDs. Paul > > > Regards > ____ > Fro
