[ceph-users] _committed_osd_maps shutdown OSD via async signal, bug or feature?

2017-10-05 Thread Stefan Kooman
Hi, During testing (mimicking BGP / port flaps) on our cluster we are able to trigger a "_committed_osd_maps shutdown OSD via async signal" on the affected OSD servers in that datacenter (OSDs in that DC become intermittently isolated from their peers). Result is that all OSD processes stop. Is

[ceph-users] TLS for tracker.ceph.com

2017-10-05 Thread Stefan Kooman
Hi, Can we supply http://tracker.ceph.com with TLS and make it https://tracker.ceph.com? Should be trivial with Let's Encrypt for example. Thanks! Gr. Stefan -- | BIT BV http://www.bit.nl/Kamer van Koophandel 09090351 | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl

[ceph-users] Ceph mirrors

2017-10-05 Thread Stefan Kooman
-- | BIT BV http://www.bit.nl/Kamer van Koophandel 09090351 | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] _committed_osd_maps shutdown OSD via async signal, bug or feature?

2017-10-05 Thread Stefan Kooman
Quoting Gregory Farnum (gfar...@redhat.com): > That's a feature, but invoking it may indicate the presence of another > issue. The OSD shuts down if > 1) it has been deleted from the cluster, or > 2) it has been incorrectly marked down a bunch of times by the cluster, and > gives up, or > 3) it

Re: [ceph-users] Ceph luminous repo not working on Ubuntu xenial

2017-09-28 Thread Stefan Kooman
Quoting Kashif Mumtaz (kashif.mum...@yahoo.com): > > Dear User, > I am striving had to install Ceph luminous version on Ubuntu 16.04.3  (  > xenial ). > Its repo is available at https://download.ceph.com/debian-luminous/  > I added it like sudo apt-add-repository 'deb >

Re: [ceph-users] osd max scrubs not honored?

2017-09-29 Thread Stefan Kooman
Quoting Christian Balzer (ch...@gol.com): > > On Thu, 28 Sep 2017 22:36:22 + Gregory Farnum wrote: > > > Also, realize the deep scrub interval is a per-PG thing and (unfortunately) > > the OSD doesn't use a global view of its PG deep scrub ages to try and > > schedule them intelligently

Re: [ceph-users] Cephfs : security questions?

2017-09-29 Thread Stefan Kooman
Quoting Yoann Moulin (yoann.mou...@epfl.ch): > > Kernels on client is 4.4.0-93 and on ceph node are 4.4.0-96 > > What is exactly an older kernel client ? 4.4 is old ? See http://docs.ceph.com/docs/master/cephfs/best-practices/#which-kernel-version If you're on Ubuntu Xenial I would advise to

Re: [ceph-users] Cephfs : security questions?

2017-09-29 Thread Stefan Kooman
Quoting Yoann Moulin (yoann.mou...@epfl.ch): > > >> Kernels on client is 4.4.0-93 and on ceph node are 4.4.0-96 > >> > >> What is exactly an older kernel client ? 4.4 is old ? > > > > See > > http://docs.ceph.com/docs/master/cephfs/best-practices/#which-kernel-version > > > > If you're on

Re: [ceph-users] Ceph luminous repo not working on Ubuntu xenial

2017-10-01 Thread Stefan Kooman
Quoting Kashif Mumtaz (kashif.mum...@yahoo.com): > Dear, Thanks for help. I am able to install on single node.  Now going > to install on multiple nodes. Just want to clarify one small thing. > Is Ceph key and Ceph repository need to add on every node or it is > required only on admin node  where

[ceph-users] Ceph Luminous release_type "rc"

2017-09-26 Thread Stefan Kooman
Hi, I noticed the ceph version still gives "rc" although we are using the latest Ceph packages: 12.2.0-1xenial (https://download.ceph.com/debian-luminous xenial/main amd64 Packages): ceph daemon mon.mon5 version {"version":"12.2.0","release":"luminous","release_type":"rc"} Why is this important

Re: [ceph-users] ceph-volume: migration and disk partition support

2017-10-10 Thread Stefan Kooman
Hi, Quoting Alfredo Deza (ad...@redhat.com): > Hi, > > Now that ceph-volume is part of the Luminous release, we've been able > to provide filestore support for LVM-based OSDs. We are making use of > LVM's powerful mechanisms to store metadata which allows the process > to no longer rely on UDEV

[ceph-users] Ceph manager documentation missing from network config reference

2017-10-05 Thread Stefan Kooman
Hi, While implementing (stricter) firewall rules I noticed weird behaviour. For the monitors only port 6789 was allowed. We currently co-locate the manager daemon with our monitors. Apparently (at least) port 6800 is also essential. In the Network Configuration Reference [1] there is no mention
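As an illustration of what the stricter rules need to allow (iptables syntax assumed here; ceph-mgr, like the OSDs, binds a port in the ms_bind_port_min..ms_bind_port_max range, 6800-7300 by default):

    # on the monitor/manager hosts
    iptables -A INPUT -p tcp --dport 6789 -j ACCEPT        # ceph-mon
    iptables -A INPUT -p tcp --dport 6800:7300 -j ACCEPT   # ceph-mgr (and any co-located OSDs)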

Re: [ceph-users] Ceph mirrors

2017-10-05 Thread Stefan Kooman
Hi, Sorry for empty mail, that shouldn't have happened. I would like to address the following. Currently the repository list for debian packages contains _only_ the latest package version. In case of an (urgent) need to downgrade you cannot easily select an older version. You then need to resort

Re: [ceph-users] Crush Map for test lab

2017-10-12 Thread Stefan Kooman
Quoting Ashley Merrick (ash...@amerrick.co.uk): > Hello, > > Setting up a new test lab, single server 5 disks/OSD. > > Want to run an EC Pool that has more shards than avaliable OSD's , is > it possible to force crush to 're use an OSD for another shard? > > I know normally this is bad practice

Re: [ceph-users] ceph-disk removal roadmap (was ceph-disk is now deprecated)

2017-12-01 Thread Stefan Kooman
Quoting Fabian Grünbichler (f.gruenbich...@proxmox.com): > I think the above roadmap is a good compromise for all involved parties, > and I hope we can use the remainder of Luminous to prepare for a > seam- and painless transition to ceph-volume in time for the Mimic > release, and then finally

Re: [ceph-users] HELP with some basics please

2017-12-05 Thread Stefan Kooman
Quoting tim taler (robur...@gmail.com): > And I'm still puzzled about the implication of the cluster size on the > amount of OSD failures. > With size=2 min_size=1 one host could die and (if by chance there is > NO read error on any bit on the living host) I could (theoretically) > recover, is

[ceph-users] Deterministic naming of LVM volumes (ceph-volume)

2017-12-13 Thread Stefan Kooman
Hi, The new style "ceph-volume" LVM way of provisioning OSDs introduces a little challenge for us. In order to create the OSDs as logical, consistent and easily recognizable as possible, we try to name the Volume Groups (VG) and Logical Volumes (LV) the same as the OSD. For example: OSD no. 12

Re: [ceph-users] Deterministic naming of LVM volumes (ceph-volume)

2017-12-13 Thread Stefan Kooman
Quoting Webert de Souza Lima (webert.b...@gmail.com): > if I may suggest, "ceph osd create" allocates and returns an OSD ID. So you > could take it by doing: > > ID=$(ceph osd create) > > then remove it with > > ceph osd rm $ID > > Now you have the $ID and you can deploy it with ceph-volume
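A minimal sketch of the flow suggested above (VG/LV names and the ceph-volume invocation are illustrative, not from the thread; it assumes no other OSD is created in between, so the released id is handed out again):

    ID=$(ceph osd create)                         # allocate the next free OSD id
    ceph osd rm $ID                               # release it again; it stays the lowest free id
    vgcreate ceph-osd-$ID /dev/sdX                # name the VG and LV after the expected id
    lvcreate -n osd-$ID -l 100%FREE ceph-osd-$ID
    ceph-volume lvm create --data ceph-osd-$ID/osd-$ID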

Re: [ceph-users] Deterministic naming of LVM volumes (ceph-volume)

2017-12-13 Thread Stefan Kooman
Quoting Willem Jan Withagen (w...@digiware.nl): > LOG.debug('Allocating OSD id...') > secrets = Secrets() > try: > wanttobe = read_one_line(path, 'wanttobe') > if os.path.exists(os.path.join(path, 'wanttobe')): > os.unlink(os.path.join(path, 'wanttobe')) >

Re: [ceph-users] Deterministic naming of LVM volumes (ceph-volume)

2017-12-13 Thread Stefan Kooman
Quoting Burkhard Linke (burkhard.li...@computational.bio.uni-giessen.de): > Just my 2 cents: > > What is happening if ansible runs on multiple hosts in parallel? We won't do that to avoid race conditions that might arise. We don't need parallel deployment of massive amount of OSDs overnight, so

Re: [ceph-users] determining the source of io in the cluster

2017-12-18 Thread Stefan Kooman
Quoting Josef Zelenka (josef.zele...@cloudevelops.com): > Hi everyone, > > we have recently deployed a Luminous(12.2.1) cluster on Ubuntu - three osd > nodes and three monitors, every osd has 3x 2TB SSD + an NVMe drive for a > blockdb. We use it as a backend for our Openstack cluster, so we store

Re: [ceph-users] fail to create bluestore osd with ceph-volume command on ubuntu 14.04 with ceph 12.2.2

2017-12-13 Thread Stefan Kooman
Quoting 姜洵 (jiang...@100tal.com): > Hi folks, > > > I am trying to install create a bluestore osd manually with ceph-volume tool > on > a Ubuntu 14.04 system, but with no luck. The Ceph version I used is Luminous > 12.2.2. > > I do this manually instead of ceph-deploy command, because I want

[ceph-users] Ceph scrub logs: _scan_snaps no head for $object?

2017-12-14 Thread Stefan Kooman
Hi, We see the following in the logs after we start a scrub for some osds: ceph-osd.2.log:2017-12-14 06:50:47.180344 7f0f47db2700 0 log_channel(cluster) log [DBG] : 1.2d8 scrub starts ceph-osd.2.log:2017-12-14 06:50:47.180915 7f0f47db2700 -1 osd.2 pg_epoch: 11897 pg[1.2d8( v 11890'165209

Re: [ceph-users] Deterministic naming of LVM volumes (ceph-volume)

2017-12-13 Thread Stefan Kooman
Quoting Webert de Souza Lima (webert.b...@gmail.com): > Cool > > > On Wed, Dec 13, 2017 at 11:04 AM, Stefan Kooman <ste...@bit.nl> wrote: > > > So, a "ceph osd ls" should give us a list, and we will pick the smallest > > available number as

Re: [ceph-users] Bluestore Compression not inheriting pool option

2017-12-12 Thread Stefan Kooman
Quoting Nick Fisk (n...@fisk.me.uk): > Hi All, > > Has anyone been testing the bluestore pool compression option? > > I have set compression=snappy on a RBD pool. When I add a new bluestore OSD, > data is not being compressed when backfilling, confirmed by looking at the > perf dump results. If

[ceph-users] unable to remove rbd image

2017-11-07 Thread Stefan Kooman
Dear list, Somehow, possibly related to live migrating the virtual machine, an rbd image ends up being undeletable. Trying to remove the image results in *loads* of the same messages over and over again: 2017-11-07 11:30:58.431913 7f9ae2ffd700 -1 JournalPlayer: 0x7f9ae400a130 missing prior

[ceph-users] ceph.conf tuning ... please comment

2017-12-05 Thread Stefan Kooman
Dear list, In a ceph blog post about the new Luminous release there is a paragraph on the need for ceph tuning [1]: "If you are a Ceph power user and believe there is some setting that you need to change for your environment to get the best performance, please tell us, we'd like to either adjust

Re: [ceph-users] ceph-disk removal roadmap (was ceph-disk is now deprecated)

2017-12-03 Thread Stefan Kooman
Quoting Alfredo Deza (ad...@redhat.com): > > Looks like there is a tag in there that broke it. Lets follow up on a > tracker issue so that we don't hijack this thread? > > http://tracker.ceph.com/projects/ceph-volume/issues/new Issue 22305 made for this: http://tracker.ceph.com/issues/22305

[ceph-users] Ceph luminous packages for Ubuntu 18.04 LTS (bionic)?

2018-05-24 Thread Stefan Kooman
Hi List, Will there be, some point in time, ceph luminous packages for Ubuntu 18.04 LTS (bionic)? Or are we supposed to upgrade to "Mimic" / 18.04 LTS in one go? Gr. Stefan -- | BIT BV http://www.bit.nl/Kamer van Koophandel 09090351 | GPG: 0xD14839C6 +31 318 648 688

Re: [ceph-users] Ceph replication factor of 2

2018-05-24 Thread Stefan Kooman
Quoting Anthony Verevkin (anth...@verevkin.ca): > My thoughts on the subject are that even though checksums do allow to > find which replica is corrupt without having to figure which 2 out of > 3 copies are the same, this is not the only reason min_size=2 was > required. Even if you are running

Re: [ceph-users] Recovery priority

2018-05-31 Thread Stefan Kooman
Quoting Dennis Benndorf (dennis.bennd...@googlemail.com): > Hi, > > lets assume we have size=3 min_size=2 and lost some osds and now have some > placement groups with only one copy left. > > Is there a setting to tell ceph to start recovering those pgs first in order > to reach min_size and so

Re: [ceph-users] separate monitoring node

2018-06-22 Thread Stefan Kooman
Quoting Reed Dier (reed.d...@focusvq.com): > > > On Jun 22, 2018, at 2:14 AM, Stefan Kooman wrote: > > > > Just checking here: Are you using the telegraf ceph plugin on the nodes? > > In that case you _are_ duplicating data. But the good news is that you > > d

Re: [ceph-users] Minimal MDS for CephFS on OSD hosts

2018-06-19 Thread Stefan Kooman
Quoting Denny Fuchs (linuxm...@4lin.net): > > We have also a 2nd cluster which holds the VMs with also 128Gb Ram and 2 x > Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz. But with only system disks (ZFS > Raid1). Storage doesn't matter for MDS, as they won't use it to store ceph data (but instead use

Re: [ceph-users] separate monitoring node

2018-06-19 Thread Stefan Kooman
Quoting John Spray (jsp...@redhat.com): > > The general idea with mgr plugins (Telegraf, etc) is that because > there's only one active mgr daemon, you don't have to worry about > duplicate feeds going in. > > I haven't use the icinga2 check_ceph plugin, but it seems like it's > intended to run

Re: [ceph-users] Ceph scrub logs: _scan_snaps no head for $object?

2018-05-02 Thread Stefan Kooman
Hi, Quoting Stefan Kooman (ste...@bit.nl): > Hi, > > We see the following in the logs after we start a scrub for some osds: > > ceph-osd.2.log:2017-12-14 06:50:47.180344 7f0f47db2700 0 > log_channel(cluster) log [DBG] : 1.2d8 scrub starts > ceph-osd.2.log:2017-

[ceph-users] Increase recovery / backfilling speed (with many small objects)

2018-01-05 Thread Stefan Kooman
Hi, I know I'm not the only one with this question as I have seen similar questions on this list: How to speed up recovery / backfilling? Current status: pgs: 155325434/800312109 objects degraded (19.408%) 1395 active+clean 440

[ceph-users] MDS cache size limits

2018-01-04 Thread Stefan Kooman
Hi Ceph fs'ers I have a question about the "mds_cache_memory_limit" parameter and MDS memory usage. We currently have set mds_cache_memory_limit=150G. The MDS server itself (and its active-standby) have 256 GB of RAM. Eventually the MDS process will consume ~ 87.5% of available memory. At that

Re: [ceph-users] MDS cache size limits

2018-01-05 Thread Stefan Kooman
Quoting Patrick Donnelly (pdonn...@redhat.com): > > It's expected but not desired: http://tracker.ceph.com/issues/21402 > > The memory usage tracking is off by a constant factor. I'd suggest > just lowering the limit so it's about where it should be for your > system. Thanks for the info. Yeah,
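A hedged example of lowering the limit at runtime (the value and the daemon names mds1/mds2, taken from other threads here, are illustrative; the option can also be set under [mds] in ceph.conf):

    ceph tell mds.mds1 injectargs '--mds_cache_memory_limit 107374182400'   # ~100 GiB; repeat for mds.mds2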

Re: [ceph-users] Ceph scrub logs: _scan_snaps no head for $object?

2018-01-04 Thread Stefan Kooman
Quoting Konstantin Shalygin (k0...@k0ste.ru): > >This is still a pre-production cluster. Most tests have been done > >using rbd. We did make some rbd clones / snapshots here and there. > > What clients you used? Only luminous clients. Mostly rbd (qemu-kvm) images. Gr. Stefan -- | BIT BV

Re: [ceph-users] Ceph scrub logs: _scan_snaps no head for $object?

2018-01-04 Thread Stefan Kooman
Quoting Konstantin Shalygin (k0...@k0ste.ru): > On 01/04/2018 11:38 PM, Stefan Kooman wrote: > >Only luminous clients. Mostly rbd (qemu-kvm) images. > > Who is managed your images? May be OpenStack Cinder? OpenNebula 5.4.3 (issuing rbd commands to ceph cluster). Gr. Stefan --

Re: [ceph-users] Increase recovery / backfilling speed (with many small objects)

2018-01-08 Thread Stefan Kooman
Quoting Chris Sarginson (csarg...@gmail.com): > You probably want to consider increasing osd max backfills > > You should be able to inject this online > > http://docs.ceph.com/docs/luminous/rados/configuration/osd-config-ref/ > > You might want to drop your osd recovery max active settings
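A hedged example of injecting these online (the values are illustrative; lower osd-recovery-max-active rather than raising it if client I/O suffers):

    ceph tell osd.* injectargs '--osd-max-backfills 2 --osd-recovery-max-active 2'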

Re: [ceph-users] Reduced data availability: 4 pgs inactive, 4 pgs incomplete

2018-01-05 Thread Stefan Kooman
Quoting Brent Kennedy (bkenn...@cfl.rr.com): > Unfortunately, this cluster was setup before the calculator was in > place and when the equation was not well understood. We have the > storage space to move the pools and recreate them, which was > apparently the only way to handle the issue( you

Re: [ceph-users] ceph luminous - cannot assign requested address

2018-01-18 Thread Stefan Kooman
Quoting Steven Vacaroaia (ste...@gmail.com): > Hi, > > I have noticed the below error message when creating a new OSD using > ceph-volume > deleting the OSD and recreating it does not work - same error message > > However, creating a new one OSD works > > Note > No firewall /iptables are

Re: [ceph-users] Luminous: example of a single down osd taking out a cluster

2018-01-23 Thread Stefan Kooman
Quoting Dan van der Ster (d...@vanderster.com): > > So, first question is: why didn't that OSD get detected as failing > much earlier? We have noticed that "mon osd adjust heartbeat grace" made the cluster "realize" OSDs going down _much_ later than the MONs / OSDs themselves. Setting this
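Presumably "Setting this" refers to turning the adjustment off; a hedged ceph.conf sketch of that assumption (the option exists and defaults to true, but the change below is my reading of the truncated sentence):

    [mon]
    mon osd adjust heartbeat grace = false   # fall back to the fixed osd_heartbeat_grace (20s by default)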

[ceph-users] Rocksdb Segmentation fault during compaction (on OSD)

2018-01-12 Thread Stefan Kooman
Hi, While trying to get an OSD back in the test cluster, which had dropped out for an unknown reason, we see a RocksDB Segmentation fault during "compaction". I increased debugging to 20/20 for OSD / RocksDB, see part of the logfile below: ... 49477, 49476, 49475, 49474, 49473, 49472, 49471,

Re: [ceph-users] Ceph scrub logs: _scan_snaps no head for $object?

2018-01-03 Thread Stefan Kooman
Quoting Sage Weil (s...@newdream.net): > Hi Stefan, Mehmet, > > Are these clusters that were upgraded from prior versions, or fresh > luminous installs? Fresh luminous install... The cluster was installed with 12.2.0, and later upgraded to 12.2.1 and 12.2.2. > This message indicates that there

Re: [ceph-users] MDS behind on trimming

2017-12-22 Thread Stefan Kooman
Quoting Stefan Kooman (ste...@bit.nl): > Quoting Dan van der Ster (d...@vanderster.com): > > Hi, > > > > We've used double the defaults for around 6 months now and haven't had any > > behind on trimming errors in that time. > > > >mds log max segment

[ceph-users] MDS behind on trimming

2017-12-21 Thread Stefan Kooman
Hi, We have two MDS servers. One active, one active-standby. While doing a parallel rsync of 10 threads with loads of files, dirs, subdirs we get the following HEALTH_WARN: ceph health detail HEALTH_WARN 2 MDSs behind on trimming MDS_TRIM 2 MDSs behind on trimming mdsmds2(mds.0): Behind on

Re: [ceph-users] ceph-volume lvm deactivate/destroy/zap

2017-12-21 Thread Stefan Kooman
Quoting Dan van der Ster (d...@vanderster.com): > Hi, > > For someone who is not an lvm expert, does anyone have a recipe for > destroying a ceph-volume lvm osd? > (I have a failed disk which I want to deactivate / wipe before > physically removing from the host, and the tooling for this doesn't

Re: [ceph-users] MDS behind on trimming

2017-12-21 Thread Stefan Kooman
Quoting Dan van der Ster (d...@vanderster.com): > Hi, > > We've used double the defaults for around 6 months now and haven't had any > behind on trimming errors in that time. > >mds log max segments = 60 >mds log max expiring = 40 > > Should be simple to try. Yup, and works like a
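Written out as a ceph.conf snippet (placement under [mds] assumed):

    [mds]
    mds log max segments = 60
    mds log max expiring = 40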

Re: [ceph-users] ceph-volume lvm deactivate/destroy/zap

2017-12-21 Thread Stefan Kooman
Quoting Dan van der Ster (d...@vanderster.com): > Thanks Stefan. But isn't there also some vgremove or lvremove magic > that needs to bring down these /dev/dm-... devices I have? Ah, you want to clean up properly before that. Sure: lvremove -f <vg>/<lv>, vgremove <vg>, pvremove /dev/ceph-device (should wipe
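The same teardown as separate commands (device and VG/LV names are placeholders):

    lvremove -f <vg>/<lv>        # remove the logical volume
    vgremove <vg>                # remove the volume group
    pvremove /dev/ceph-device    # wipe the LVM label from the physical device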

Re: [ceph-users] separate monitoring node

2018-06-22 Thread Stefan Kooman
Quoting Denny Fuchs (linuxm...@4lin.net): > hi, > > > Am 19.06.2018 um 17:17 schrieb Kevin Hrpcek : > > > > # ceph auth get client.icinga > > exported keyring for client.icinga > > [client.icinga] > > key = > > caps mgr = "allow r" > > caps mon = "allow r" > > thats the point: It's

Re: [ceph-users] ceph-fuse slow cache?

2018-08-24 Thread Stefan Kooman
Hi Gregory, Quoting Gregory Farnum (gfar...@redhat.com): > This is quite strange. Given that you have a log, I think what you want to > do is find one request in the log, trace it through its lifetime, and see > where the time is elapsed. You may find a bifurcation, where some > categories of

[ceph-users] ceph-fuse slow cache?

2018-08-21 Thread Stefan Kooman
Hi, I'm trying to find out why ceph-fuse client(s) are slow. Luminous 12.2.7 Ceph cluster, Mimic 13.2.1 ceph-fuse client. Ubuntu xenial, 4.13.0-38-generic kernel. Test case: 25 curl requests directed at a single threaded apache process (apache2 -X). When the requests are handled by ceph-kernel

Re: [ceph-users] ceph-fuse slow cache?

2018-08-25 Thread Stefan Kooman
Quoting Gregory Farnum (gfar...@redhat.com): > Hmm, these aren't actually the start and end times to the same operation. > put_inode() is literally adjusting a refcount, which can happen for reasons > ranging from the VFS doing something that drops it to an internal operation > completing to a

Re: [ceph-users] ceph-fuse slow cache?

2018-08-27 Thread Stefan Kooman
Hi, Quoting Yan, Zheng (uker...@gmail.com): > Could you strace apacha process, check which syscall waits for a long time. Yes, that's how I did all the tests (strace -t -T apache2 -X). With debug=20 (ceph-fuse) you see apache waiting for almost 20 seconds before it starts serving data:

Re: [ceph-users] v12.2.7 Luminous released

2018-07-17 Thread Stefan Kooman
Quoting Abhishek Lekshmanan (abhis...@suse.com): > *NOTE* The v12.2.5 release has a potential data corruption issue with > erasure coded pools. If you ran v12.2.5 with erasure coding, please see > below. < snip > > Upgrading from v12.2.5 or v12.2.6 > - > > If

Re: [ceph-users] pool has many more objects per pg than average

2018-07-05 Thread Stefan Kooman
Quoting Brett Chancellor (bchancel...@salesforce.com): > The error will go away once you start storing data in the other pools. Or, > you could simply silence the message with mon_pg_warn_max_object_skew = 0 Ran into this issue myself (again). Note to self: You need to restart the _active_ MGR
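A hedged sketch of that workaround (section placement and the systemd unit name are assumptions; find the active mgr in the output of `ceph -s`):

    # in ceph.conf on the mgr hosts, [global] or [mgr] section:
    #   mon_pg_warn_max_object_skew = 0
    # then restart the *active* manager daemon:
    systemctl restart ceph-mgr@<active-mgr-host>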

Re: [ceph-users] Memory leak in Ceph OSD?

2018-03-01 Thread Stefan Kooman
Quoting Caspar Smit (caspars...@supernas.eu): > Stefan, > > How many OSD's and how much RAM are in each server? Currently 7 OSDs, 128 GB RAM. Max will be 10 OSDs in these servers. 12 cores (at least one core per OSD). > bluestore_cache_size=6G will not mean each OSD is using max 6GB RAM right?

Re: [ceph-users] Memory leak in Ceph OSD?

2018-04-19 Thread Stefan Kooman
Hi, Quoting Stefan Kooman (ste...@bit.nl): > Hi, > > TL;DR: we see "used" memory grows indefinitely on our OSD servers. > Until the point that either 1) an OSD process gets killed by OOMkiller, > or 2) OSD aborts (probably because malloc cannot provide more RAM).

Re: [ceph-users] Using ceph deploy with mon.a instead of mon.hostname?

2018-04-20 Thread Stefan Kooman
Quoting Oliver Schulz (oliver.sch...@tu-dortmund.de): > Dear Ceph Experts, > > I'm try to switch an old Ceph cluster from manual administration to > ceph-deploy, but I'm running into the following error: > > # ceph-deploy gatherkeys HOSTNAME > > [HOSTNAME][INFO ] Running command: /usr/bin/ceph

Re: [ceph-users] Increase recovery / backfilling speed (with many small objects)

2018-02-28 Thread Stefan Kooman
doing some 5K writes. For sure this was not the limit. We would hit max NIC bandwidth pretty soon though. ceph++ Gr. Stefan [1]: https://owncloud.kooman.org/s/mvbMCVLFbWjAyOn#pdfviewer Quoting Stefan Kooman (ste...@bit.nl): > Hi, > > I know I'm not the only one with this question as I have seen similar &

[ceph-users] Memory leak in Ceph OSD?

2018-02-28 Thread Stefan Kooman
Hi, TL;DR: we see "used" memory grows indefinitely on our OSD servers. Until the point that either 1) an OSD process gets killed by OOMkiller, or 2) OSD aborts (probably because malloc cannot provide more RAM). I suspect a memory leak of the OSDs. We were running 12.2.2. We are now running 12.2.3.

Re: [ceph-users] Updating standby mds from 12.2.2 to 12.2.4 caused up:active 12.2.2 mds's to suicide

2018-02-28 Thread Stefan Kooman
Quoting Dan van der Ster (d...@vanderster.com): > Hi all, > > I'm just updating our test cluster from 12.2.2 to 12.2.4. Mon's and > OSD's updated fine. 12.2.4? Did you mean 12.2.3? Or did I miss something? Gr. stefan -- | BIT BV http://www.bit.nl/Kamer van Koophandel 09090351 | GPG:

Re: [ceph-users] Mimic cluster is offline and not healing

2018-09-27 Thread Stefan Kooman
Quoting by morphin (morphinwith...@gmail.com): > After 72 hours I believe we may hit a bug. Any help would be greatly > appreciated. Is it feasible for you to stop all client IO to the Ceph cluster? At least until it stabilizes again. "ceph osd pause" would do the trick (ceph osd unpause would
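For reference, these commands simply set and clear the pauserd/pausewr OSD flags:

    ceph osd pause     # stop all client reads and writes
    # ... let peering / recovery settle ...
    ceph osd unpause   # resume client I/O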

Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2018-10-05 Thread Stefan Kooman
Quoting Gregory Farnum (gfar...@redhat.com): > > Ah, there's a misunderstanding here — the output isn't terribly clear. > "is_healthy" is the name of a *function* in the source code. The line > > heartbeat_map is_healthy 'MDSRank' had timed out after 15 > > is telling you that the

Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2018-10-23 Thread Stefan Kooman
Quoting Patrick Donnelly (pdonn...@redhat.com): > Thanks for the detailed notes. It looks like the MDS is stuck > somewhere it's not even outputting any log messages. If possible, it'd > be helpful to get a coredump (e.g. by sending SIGQUIT to the MDS) or, > if you're comfortable with gdb, a

Re: [ceph-users] Unexplainable high memory usage OSD with BlueStore

2018-11-08 Thread Stefan Kooman
Quoting Wido den Hollander (w...@42on.com): > Hi, > > Recently I've seen a Ceph cluster experience a few outages due to memory > issues. > > The machines: > > - Intel Xeon E3 CPU > - 32GB Memory > - 8x 1.92TB SSD > - Ubuntu 16.04 > - Ceph 12.2.8 What kernel version is running? What network

Re: [ceph-users] CephFS kernel client versions - pg-upmap

2018-11-08 Thread Stefan Kooman
Quoting Ilya Dryomov (idryo...@gmail.com): > On Sat, Nov 3, 2018 at 10:41 AM wrote: > > > > Hi. > > > > I tried to enable the "new smart balancing" - backend are on RH luminous > > clients are Ubuntu 4.15 kernel. [cut] > > ok, so 4.15 kernel connects as a "hammer" (<1.0) client? Is there a > >

Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2018-11-15 Thread Stefan Kooman
Quoting Stefan Kooman (ste...@bit.nl): > Quoting Patrick Donnelly (pdonn...@redhat.com): > > Thanks for the detailed notes. It looks like the MDS is stuck > > somewhere it's not even outputting any log messages. If possible, it'd > > be helpful to get a coredump (e.g. by sendi

Re: [ceph-users] CephFS kernel client versions - pg-upmap

2018-11-08 Thread Stefan Kooman
Quoting Stefan Kooman (ste...@bit.nl): > I'm pretty sure it isn't. I'm trying to do the same (force luminous > clients only) but ran into the same issue. Even when running 4.19 kernel > it's interpreted as a jewel client. Here is the list I made so far: > > Ker
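A hedged sketch of checking what release the cluster thinks its clients are before forcing the requirement:

    ceph features                                     # per-connection feature and release bits
    ceph osd set-require-min-compat-client luminous   # refused while pre-luminous clients are connected
    # --yes-i-really-mean-it overrides that check; only use it if you know the clients support upmap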

[ceph-users] MDS hangs in "heartbeat_map" deadlock

2018-10-04 Thread Stefan Kooman
Dear list, Today we hit our first Ceph MDS issue. Out of the blue the active MDS stopped working: mon.mon1 [WRN] daemon mds.mds1 is not responding, replacing it as rank 0 with standby daemon mds.mds2. Logging of ceph-mds1: 2018-10-04 10:50:08.524745 7fdd516bf700 1 mds.mds1 asok_command:

Re: [ceph-users] Mimic cluster is offline and not healing

2018-09-28 Thread Stefan Kooman
Quoting by morphin (morphinwith...@gmail.com): > Good news... :) > > After I tried everything. I decide to re-create my MONs from OSD's and > I used the script: > https://paste.ubuntu.com/p/rNMPdMPhT5/ > > And it worked!!! Congrats! > I think when 2 server crashed and come back same time some

Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2018-10-08 Thread Stefan Kooman
Quoting Stefan Kooman (ste...@bit.nl): > > From what you've described here, it's most likely that the MDS is trying to > > read something out of RADOS which is taking a long time, and which we > > didn't expect to cause a slow down. You can check via the admin

Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2019-01-16 Thread Stefan Kooman
Hi Patrick, Quoting Stefan Kooman (ste...@bit.nl): > Quoting Stefan Kooman (ste...@bit.nl): > > Quoting Patrick Donnelly (pdonn...@redhat.com): > > > Thanks for the detailed notes. It looks like the MDS is stuck > > > somewhere it's not even outputting any log

Re: [ceph-users] Pool Available Capacity Question

2018-12-08 Thread Stefan Kooman
Jay Munsterman wrote on 7 December 2018 21:55:25 CET: >Hey all, >I hope this is a simple question, but I haven't been able to figure it >out. >On one of our clusters there seems to be a disparity between the global >available space and the space available to pools. > >$ ceph df >GLOBAL: >

Re: [ceph-users] No recovery when "norebalance" flag set

2018-11-26 Thread Stefan Kooman
Quoting Dan van der Ster (d...@vanderster.com): > Haven't seen that exact issue. > > One thing to note though is that if osd_max_backfills is set to 1, > then it can happen that PGs get into backfill state, taking that > single reservation on a given OSD, and therefore the recovery_wait PGs >

Re: [ceph-users] Degraded objects afte: ceph osd in $osd

2018-11-26 Thread Stefan Kooman
Quoting Janne Johansson (icepic...@gmail.com): > Yes, when you add a drive (or 10), some PGs decide they should have one or > more > replicas on the new drives, a new empty PG is created there, and > _then_ that replica > will make that PG get into the "degraded" mode, meaning if it had 3 > fine

Re: [ceph-users] Poor ceph cluster performance

2018-11-26 Thread Stefan Kooman
Quoting Cody (codeology@gmail.com): > The Ceph OSD part of the cluster uses 3 identical servers with the > following specifications: > > CPU: 2 x E5-2603 @1.8GHz > RAM: 16GB > Network: 1G port shared for Ceph public and cluster traffics This will hamper throughput a lot. > Journaling

[ceph-users] No recovery when "norebalance" flag set

2018-11-25 Thread Stefan Kooman
Hi list, During cluster expansion (adding extra disks to existing hosts) some OSDs failed (FAILED assert(0 == "unexpected error", _txc_add_transaction error (39) Directory not empty not handled on operation 21 (op 1, counting from 0), full details: https://8n1.org/14078/c534). We had

[ceph-users] Degraded objects afte: ceph osd in $osd

2018-11-25 Thread Stefan Kooman
Hi List, Another interesting and unexpected thing we observed during cluster expansion is the following. After we added extra disks to the cluster, while "norebalance" flag was set, we put the new OSDs "IN". As soon as we did that a couple of hundred objects would become degraded. During that

Re: [ceph-users] Full L3 Ceph

2018-11-25 Thread Stefan Kooman
Quoting Robin H. Johnson (robb...@gentoo.org): > On Fri, Nov 23, 2018 at 04:03:25AM +0700, Lazuardi Nasution wrote: > > I'm looking example Ceph configuration and topology on full layer 3 > > networking deployment. Maybe all daemons can use loopback alias address in > > this case. But how to set

[ceph-users] Not all pools are equal, but why

2018-09-13 Thread Stefan Kooman
Hi List, TL;DR: what application types are compatible with each other concerning Ceph Pools? I.e. is it safe to mix "RBD" pool with (some) native librados objects? RBD / RGW / Cephfs all have their own pools. Since the Luminous release there is this "application tag" to (somewhere in the future)

Re: [ceph-users] Ceph MDS WRN replayed op client.$id

2018-09-13 Thread Stefan Kooman
Hi John, Quoting John Spray (jsp...@redhat.com): > On Wed, Sep 12, 2018 at 2:59 PM Stefan Kooman wrote: > > When replaying a journal (either on MDS startup or on a standby-replay > MDS), the replayed file creation operations are being checked for > consistency with the state

Re: [ceph-users] Ceph MDS WRN replayed op client.$id

2018-09-14 Thread Stefan Kooman
Quoting John Spray (jsp...@redhat.com): > On Thu, Sep 13, 2018 at 11:01 AM Stefan Kooman wrote: > We implement locking, and it's correct that another client can't gain > the lock until the first client is evicted. Aside from speeding up > eviction by modifying the timeout, if you
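A hedged sketch of evicting a stuck client by hand instead of waiting for the timeout (rank 0 and the session id are placeholders):

    ceph tell mds.0 client ls                      # list sessions and their ids
    ceph tell mds.0 client evict id=<session-id>   # evict (and blacklist) that session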

[ceph-users] Ceph MDS WRN replayed op client.$id

2018-09-12 Thread Stefan Kooman
Hi, Once in a while, today a bit more often, the MDS is logging the following: mds.mds1 [WRN] replayed op client.15327973:15585315,15585103 used ino 0x19918de but session next is 0x1873b8b Nothing of importance is logged in the mds ("debug_mds_log": "1/5"). What does this warning

Re: [ceph-users] ceph-fuse slow cache?

2018-09-12 Thread Stefan Kooman
Quoting Yan, Zheng (uker...@gmail.com): > > > please add '-f' option (trace child processes' syscall) to strace, Good suggestion. We now see all apache child processes doing it's thing. We have been, on and off, been stracing / debugging this issue. Nothing obvious. We are still trying to get

Re: [ceph-users] Performance Problems

2018-12-10 Thread Stefan Kooman
Quoting Robert Sander (r.san...@heinlein-support.de): > On 07.12.18 18:33, Scharfenberg, Buddy wrote: > > > We have 3 nodes set up, 1 with several large drives, 1 with a handful of > > small ssds, and 1 with several nvme drives. > > This is a very unusual setup. Do you really have all your HDDs

Re: [ceph-users] cephday berlin slides

2018-12-10 Thread Stefan Kooman
Quoting Mike Perez (mipe...@redhat.com): > Hi Serkan, > > I'm currently working on collecting the slides to have them posted to > the Ceph Day Berlin page as Lenz mentioned they would show up. I will > notify once the slides are available on mailing list/twitter. Thanks! FYI: The Ceph Day Berlin

Re: [ceph-users] logging of cluster status (Jewel vs Luminous and later)

2019-01-24 Thread Stefan Kooman
Quoting Matthew Vernon (m...@sanger.ac.uk): > Hi, > > On our Jewel clusters, the mons keep a log of the cluster status e.g. > > 2019-01-24 14:00:00.028457 7f7a17bef700 0 log_channel(cluster) log [INF] : > HEALTH_OK > 2019-01-24 14:00:00.646719 7f7a46423700 0 log_channel(cluster) log [INF] : >

Re: [ceph-users] Moving pools between cluster

2019-04-02 Thread Stefan Kooman
Quoting Burkhard Linke (burkhard.li...@computational.bio.uni-giessen.de): > Hi, > Images: > > Straight-forward attempt would be exporting all images with qemu-img from > one cluster, and uploading them again on the second cluster. But this will > break snapshots, protections etc. You can use
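One hedged way to copy an image between clusters without shared storage (conf paths and pool/image names are placeholders; as noted above, snapshots and protections are not preserved this way):

    rbd -c /etc/ceph/src.conf export pool/image - | \
        rbd -c /etc/ceph/dst.conf import - pool/image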

Re: [ceph-users] Ceph nautilus upgrade problem

2019-04-02 Thread Stefan Kooman
Quoting Stadsnet (jwil...@stads.net): > On 26-3-2019 16:39, Ashley Merrick wrote: > >Have you upgraded any OSD's? > > > No didn't go through with the osd's Just checking here: are you sure all PGs have been scrubbed while running Luminous? As the release notes [1] mention this: "If you are
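A hedged way to verify that: per the Nautilus upgrade notes, once every PG has been (deep-)scrubbed under Luminous the OSD map flags include recovery_deletes and purged_snapdirs:

    ceph osd dump | grep ^flags
    # expect: ... recovery_deletes,purged_snapdirs ...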

Re: [ceph-users] Ceph nautilus upgrade problem

2019-04-02 Thread Stefan Kooman
Quoting Paul Emmerich (paul.emmer...@croit.io): > This also happened sometimes during a Luminous -> Mimic upgrade due to > a bug in Luminous; however I thought it was fixed on the ceph-mgr > side. > Maybe the fix was (also) required in the OSDs and you are seeing this > because the running OSDs

[ceph-users] MDS_SLOW_METADATA_IO

2019-02-28 Thread Stefan Kooman
Dear list, After upgrading to 12.2.11 the MDSes are reporting slow metadata IOs (MDS_SLOW_METADATA_IO). The metadata IOs would have been blocked for more than 5 seconds. We have one active, and one active standby MDS. All storage on SSD (Samsung PM863a / Intel DC4500). No other (OSD) slow ops

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Stefan Kooman
Quoting Wido den Hollander (w...@42on.com): > Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe > OSDs as well. Over time their latency increased until we started to > notice I/O-wait inside VMs. On a Luminous 12.2.8 cluster with only SSDs we also hit this issue I guess.

Re: [ceph-users] MDS_SLOW_METADATA_IO

2019-03-03 Thread Stefan Kooman
Quoting Patrick Donnelly (pdonn...@redhat.com): > On Thu, Feb 28, 2019 at 12:49 PM Stefan Kooman wrote: > > > > Dear list, > > > > After upgrading to 12.2.11 the MDSes are reporting slow metadata IOs > > (MDS_SLOW_METADATA_IO). The metadata IOs would have been bl

Re: [ceph-users] How To Scale Ceph for Large Numbers of Clients?

2019-03-14 Thread Stefan Kooman
Quoting Zack Brenton (z...@imposium.com): > On Tue, Mar 12, 2019 at 6:10 AM Stefan Kooman wrote: > > > Hmm, 6 GiB of RAM is not a whole lot. Especially if you are going to > > increase the amount of OSDs (partitions) like Patrick suggested. By > > default it will take 4 G
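A hedged sketch of capping the OSD memory explicitly (osd_memory_target exists from Luminous 12.2.11 / Mimic 13.2.3 onward; the value is illustrative for a 6 GiB pod):

    [osd]
    osd memory target = 3221225472   # ~3 GiB per OSD, leaving headroom for pod overhead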

Re: [ceph-users] How To Scale Ceph for Large Numbers of Clients?

2019-03-12 Thread Stefan Kooman
Quoting Zack Brenton (z...@imposium.com): > Types of devices: > We run our Ceph pods on 3 AWS i3.2xlarge nodes. We're running 3 OSDs, 3 > Mons, and 2 MDS pods (1 active, 1 standby-replay). Currently, each pod runs > with the following resources: > - osds: 2 CPU, 6Gi RAM, 1.7Ti NVMe disk > - mds:

Re: [ceph-users] how to judge the results? - rados bench comparison

2019-04-17 Thread Stefan Kooman
Quoting Lars Täuber (taeu...@bbaw.de): > > > This is something i was told to do, because a reconstruction of failed > > > OSDs/disks would have a heavy impact on the backend network. > > > > Opinions vary on running "public" only versus "public" / "backend". > > Having a separate "backend"

Re: [ceph-users] how to judge the results? - rados bench comparison

2019-04-17 Thread Stefan Kooman
Quoting Lars Täuber (taeu...@bbaw.de): > > I'd probably only use the 25G network for both networks instead of > > using both. Splitting the network usually doesn't help. > > This is something i was told to do, because a reconstruction of failed > OSDs/disks would have a heavy impact on the
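For reference, the split under discussion comes down to two ceph.conf settings (subnets are placeholders); leaving cluster network unset keeps replication and recovery traffic on the public network:

    [global]
    public network  = 192.168.10.0/24
    cluster network = 192.168.20.0/24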
