[ceph-users] Ceph Tech Talk Calendar

2018-06-19 Thread Leonardo Vaz
Hi Cephers, We created the following etherpad to organize the calendar for the future Ceph Tech Talks. For the Ceph Tech Talk of June 28th our fellow George Mihaiescu will tell us how Ceph is being used for cancer research at OICR (Ontario Institute for Cancer Research). If you're interested to

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-19 Thread Brad Hubbard
Can you post the output of a pg query? On Tue, Jun 19, 2018 at 11:44 PM, Andrei Mikhailovsky wrote: > A quick update on my issue. I have noticed that while I was trying to move > the problem object on osds, the file attributes got lost on one of the osds, > which I guess is why the error
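
For reference, the query Brad is asking for looks like this (the PG id below is only a placeholder for the inconsistent PG from the thread):

    ceph pg 2.1f query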

Re: [ceph-users] RGW bucket sharding in Jewel

2018-06-19 Thread David Turner
There have been some bugs in the past with dynamic resharding, but those seem to be the exception, not the norm. I was surprised to find that some of our buckets had resharded themselves from 12 shards in Jewel to over 200 shards in Luminous without me even realizing it. We would have resharded
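
For anyone checking whether dynamic resharding has touched their buckets on Luminous, a minimal sketch (the bucket name is just an example):

    radosgw-admin reshard list
    radosgw-admin reshard status --bucket=mybucket
    radosgw-admin bucket limit check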

Re: [ceph-users] Delete pool nicely

2018-06-19 Thread David Turner
Hrmmm... one gotcha might be that the deleted PGs might try to backfill from the rest of the cluster when you bring an OSD back online. Setting nobackfill/norecover would prevent the other PGs on the OSD from other pools from catching back up... There has to be a way around that. Maybe marking

Re: [ceph-users] Delete pool nicely

2018-06-19 Thread David Turner
I came up with a new theory for how to delete a large pool sanely and without impacting the cluster heavily. I haven't tested this yet, but it just occurred to me as I was planning to remove a large pool of my own, again. First you need to stop all IO to the pool to be deleted. Next you stop an
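
A minimal sketch of the flag juggling being discussed, assuming a Luminous cluster with mon_allow_pool_delete enabled; this is untested and only illustrates the idea, not a recommended procedure:

    ceph osd set nobackfill
    ceph osd set norecover
    ceph osd pool delete mypool mypool --yes-i-really-really-mean-it
    # ...bring the OSDs back / restart as discussed above...
    ceph osd unset nobackfill
    ceph osd unset norecover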

Re: [ceph-users] Minimal MDS for CephFS on OSD hosts

2018-06-19 Thread Denny Fuchs
hi, > On 19.06.2018 at 20:40, Steffen Winther Sørensen wrote: > > > >> On 19 Jun 2018 at 16:50, Webert de Souza Lima wrote: >> >> Keep in mind that the mds server is cpu-bound, so during heavy workloads it >> will eat up CPU usage, so the OSD daemons can affect or be affected by

Re: [ceph-users] separate monitoring node

2018-06-19 Thread Denny Fuchs
hi, > On 19.06.2018 at 17:17, Kevin Hrpcek wrote: > > # ceph auth get client.icinga > exported keyring for client.icinga > [client.icinga] > key = > caps mgr = "allow r" > caps mon = "allow r" that's the point: it's OK to check if all processes are up and running and maybe some
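
A read-only key with the caps shown above can be created with something like (keyring path and client name are just examples):

    ceph auth get-or-create client.icinga mon 'allow r' mgr 'allow r' \
        -o /etc/ceph/ceph.client.icinga.keyring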

[ceph-users] CentOS Dojo at CERN

2018-06-19 Thread Leonardo Vaz
Hey Cephers, We will join our friends from OpenStack and CentOS projects at CERN in Geneva on October 19th for the CentOS Dojo: https://blog.centos.org/2018/05/cern-dojo-october-19th-2018/ The call for papers is currently open and more details about the event are available on the URL above.

Re: [ceph-users] Ceph Mimic on Debian 9 Stretch

2018-06-19 Thread Fabian Grünbichler
On Mon, Jun 18, 2018 at 07:15:49PM +, Sage Weil wrote: > On Mon, 18 Jun 2018, Fabian Grünbichler wrote: > > it's of course within your purview as upstream project (lead) to define > > certain platforms/architectures/distros as fully supported, and others > > as best-effort/community-driven/...

Re: [ceph-users] Frequent slow requests

2018-06-19 Thread Chris Taylor
On 2018-06-19 12:17 pm, Frank de Bot (lists) wrote: Frank (lists) wrote: Hi, On a small cluster (3 nodes) I frequently have slow requests. When dumping the inflight ops from the hanging OSD, it seems it doesn't get a 'response' for one of the subops. The events always look like: I've
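
Dumping the in-flight ops from a hanging OSD is done via its admin socket on the OSD host, e.g. (the OSD id is a placeholder):

    ceph daemon osd.12 dump_ops_in_flight
    ceph daemon osd.12 dump_historic_ops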

Re: [ceph-users] RGW bucket sharding in Jewel

2018-06-19 Thread Matt Benjamin
The increased time to list sharded buckets is currently expected, yes. In turn other operations such as put and delete should be faster in proportion to two factors, the number of shards on independent PGs (serialization by PG), and the spread of shards onto independent OSD devices (speedup from

Re: [ceph-users] Frequent slow requests

2018-06-19 Thread Frank de Bot (lists)
Frank (lists) wrote: > Hi, > > On a small cluster (3 nodes) I frequently have slow requests. When > dumping the inflight ops from the hanging OSD, it seems it doesn't get a > 'response' for one of the subops. The events always look like: > I've done some further testing, all slow requests are

Re: [ceph-users] Minimal MDS for CephFS on OSD hosts

2018-06-19 Thread Steffen Winther Sørensen
> On 19 Jun 2018 at 16:50, Webert de Souza Lima wrote: > > Keep in mind that the mds server is cpu-bound, so during heavy workloads it > will eat up CPU usage, so the OSD daemons can affect or be affected by the > MDS daemon. > But it does work well. We've been running a few clusters

Re: [ceph-users] separate monitoring node

2018-06-19 Thread Kevin Hrpcek
I use icinga2 as well with a check_ceph.py that I wrote a couple years ago. The method I use is that icinga2 runs the check from the icinga2 host itself. ceph-common is installed on the icinga2 host since the check_ceph script is a wrapper and parser for the ceph command output using python's
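
As a rough illustration of that wrapper-and-parser approach (this is a hypothetical sketch, not the check_ceph.py from the thread; it assumes a client.icinga key with read-only mon caps is present on the monitoring host):

    #!/usr/bin/env python
    # Hypothetical Nagios/Icinga-style check: run the ceph CLI via subprocess,
    # parse the JSON health output and map it to a plugin exit code.
    import json
    import subprocess
    import sys

    EXIT = {"HEALTH_OK": 0, "HEALTH_WARN": 1, "HEALTH_ERR": 2}

    def main():
        try:
            out = subprocess.check_output(
                ["ceph", "--id", "icinga", "health", "--format", "json"])
        except (OSError, subprocess.CalledProcessError) as exc:
            print("UNKNOWN: unable to query ceph: %s" % exc)
            return 3
        health = json.loads(out)
        # Luminous reports "status"; older releases used "overall_status".
        status = health.get("status") or health.get("overall_status", "UNKNOWN")
        print(status)
        return EXIT.get(status, 3)

    if __name__ == "__main__":
        sys.exit(main())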

Re: [ceph-users] separate monitoring node

2018-06-19 Thread Stefan Kooman
Quoting John Spray (jsp...@redhat.com): > > The general idea with mgr plugins (Telegraf, etc) is that because > there's only one active mgr daemon, you don't have to worry about > duplicate feeds going in. > > I haven't used the icinga2 check_ceph plugin, but it seems like it's > intended to run

Re: [ceph-users] Minimal MDS for CephFS on OSD hosts

2018-06-19 Thread Webert de Souza Lima
Keep in mind that the mds server is cpu-bound, so during heavy workloads it will eat up CPU usage, so the OSD daemons can affect or be affected by the MDS daemon. But it does work well. We've been running a few clusters with MON, MDS and OSDs sharing the same hosts for a couple of years now.

Re: [ceph-users] separate monitoring node

2018-06-19 Thread John Spray
On Tue, Jun 19, 2018 at 1:17 PM Denny Fuchs wrote: > > Hi, > > at the moment, we use Icinga2, check_ceph* and Telegraf with the Ceph > plugin. I'm asking what I need in order to have a separate host which knows all > about the Ceph cluster health. The reason is that each OSD node has > mostly the exact
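
For context, mgr plugins are toggled on the cluster itself rather than on the monitoring host, along the lines of (module availability depends on the release; telegraf only exists in newer releases):

    ceph mgr module ls
    ceph mgr module enable prometheus
    ceph mgr module enable telegraf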

Re: [ceph-users] upgrading jewel to luminous fails

2018-06-19 Thread Elias Abacioglu
Ok, following Michael's advice I was able to start the mon without deleting /var/lib/ceph. It forced me to use the Ubuntu 16.04 image of ceph-container. The reason I wanted to delete /var/lib/ceph was that I wanted to switch to the CentOS 7 image, which is becoming more standard in ceph-container and

Re: [ceph-users] Minimal MDS for CephFS on OSD hosts

2018-06-19 Thread Paul Emmerich
Just co-locate them with your OSDs. You can control how much RAM the MDSs use with the "mds cache memory limit" option (default 1 GB). Note that the cache should be large enough to keep the active working set, but 1 million files is not really a lot. As a rule of thumb:
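
A sketch of what that looks like in practice (the 4 GB value is only an illustration, not a recommendation):

    # ceph.conf on the MDS hosts
    [mds]
        mds cache memory limit = 4294967296

    # or at runtime:
    ceph tell mds.* injectargs '--mds_cache_memory_limit 4294967296'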

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-19 Thread Andrei Mikhailovsky
A quick update on my issue. I have noticed that while I was trying to move the problem object on the osds, the file attributes got lost on one of the osds, which I guess is why the error messages showed the 'no attribute' bit. I then copied the attributes metadata to the problematic object and

[ceph-users] RGW bucket sharding in Jewel

2018-06-19 Thread Matthew Vernon
Hi, Some of our users have Quite Large buckets (up to 20M objects in a bucket), and AIUI best practice would be to have sharded indexes for those buckets (of the order of 1 shard per 100k objects). On a trivial test case (make a 1M-object bucket, shard index to 10 shards, s3cmd ls s3://bucket
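
On Jewel the usual knob for new buckets is the override in ceph.conf (the rgw instance name below is just an example; the setting only applies to buckets created after the change):

    [client.rgw.gateway1]
        rgw override bucket index max shards = 16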

Re: [ceph-users] Minimal MDS for CephFS on OSD hosts

2018-06-19 Thread Denny Fuchs
Hi, On 19.06.2018 15:14, Stefan Kooman wrote: Storage doesn't matter for the MDS, as it won't use it to store ceph data (but instead uses the (meta)data pool to store metadata). I would not colocate the MDS daemons with the OSDs, but instead create a couple of VMs (active / standby) and give

Re: [ceph-users] Minimal MDS for CephFS on OSD hosts

2018-06-19 Thread Stefan Kooman
Quoting Denny Fuchs (linuxm...@4lin.net): > > We also have a 2nd cluster which holds the VMs, also with 128GB RAM and 2 x > Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, but with only system disks (ZFS > Raid1). Storage doesn't matter for the MDS, as it won't use it to store ceph data (but instead uses

[ceph-users] Minimal MDS for CephFS on OSD hosts

2018-06-19 Thread Denny Fuchs
Hi, for our application we need shared storage for roughly 500,000 (max ~1 million) files (inode count / ~2-4 TB), until our app can talk to Rados directly. Because our rack is full, we can't add any more physical machines, so I'm asking if it is OK to put the MDS on the OSD hosts.

Re: [ceph-users] Benchmarking

2018-06-19 Thread David Byte
I use fio. #1. Be sure to have a job that writes random data to 100% of the RBD device before starting; I use size=100%. #2. Benchmark something that makes sense for your use case, e.g. benchmarking 1M sequential writes makes no sense if your workload is 64k random. #3. A follow on to
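
A hypothetical fio job file along those lines, using the rbd engine (pool, image and run length are placeholders; the prefill job writes the whole image before the actual benchmark runs):

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=benchimg
    direct=1

    [prefill]
    rw=write
    bs=4M
    size=100%

    [randwrite-64k]
    stonewall
    rw=randwrite
    bs=64k
    iodepth=32
    time_based
    runtime=300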

[ceph-users] separate monitoring node

2018-06-19 Thread Denny Fuchs
Hi, at the moment we use Icinga2, check_ceph* and Telegraf with the Ceph plugin. I'm asking what I need in order to have a separate host which knows all about the Ceph cluster health. The reason is that each OSD node has mostly the exact same data, which is transmitted into our database (like

[ceph-users] Benchmarking

2018-06-19 Thread Nino Bosteels
Hi, Anyone got tips on how best to benchmark a Ceph block device (RBD)? I've currently found the more traditional ways (dd, iostat, bonnie++, phoronix test suite) and fio, which actually supports the rbd engine. Though there's not a lot of information about it to be found online (contrary to

[ceph-users] What is the theoretical upper bandwidth of my Ceph cluster?

2018-06-19 Thread Yu Haiyang
Hi All, I have a Ceph cluster that consists of 3 OSDs (each on a different server’s SSD disk partition with 500MB/s maximum read/write speed). The 3 OSDs are connected through a switch which provides a maximum 10 Gbits/sec bandwidth between each pair of servers. My Ceph version is Luminous
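
A rough back-of-envelope bound, assuming the default size=3 replication (so with only 3 OSDs every OSD holds a copy of every object): raw disk bandwidth is 3 x 500 MB/s = 1500 MB/s, but each client write is stored three times, so client-visible writes top out around 1500 / 3 = 500 MB/s (less again if a filestore journal shares the same SSD), while reads are served from a single replica and are bounded by the ~1500 MB/s aggregate disk bandwidth and the ~1.25 GB/s (10 Gbit/s) of each client link.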

Re: [ceph-users] Ceph Mimic on Debian 9 Stretch

2018-06-19 Thread Paul Emmerich
FWIW: we'll be shipping Mimic on Debian with an upgraded libc6. This is of course somewhat of a zombie Debian: our build system initially fell apart quite a few times. But for the build system it's not a problem -- it will be thrown away after the build finishes anyway. These problems will occur

Re: [ceph-users] Install ceph manually with some problem

2018-06-19 Thread Lenz Grimmer
On 06/18/2018 08:38 PM, Michael Kuriger wrote: > Don’t use the installer scripts. Try yum install ceph I'm not sure I agree. While running "make install" is of course of somewhat limited use on a distributed cluster, I would expect that it at least installs all the required components on the

[ceph-users] fixing unrepairable inconsistent PG

2018-06-19 Thread Andrei Mikhailovsky
Hello everyone, I am having trouble repairing one inconsistent and stubborn PG. I get the following error in ceph.log: 2018-06-19 11:00:00.000225 mon.arh-ibstorage1-ib mon.0 192.168.168.201:6789/0 675 : cluster [ERR] overall HEALTH_ERR noout flag(s) set; 4 scrub errors; Possible data
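
For reference, the usual starting points for an inconsistent PG are the following (the PG id is a placeholder; list-inconsistent-obj needs a reasonably recent scrub to have data to report):

    rados list-inconsistent-obj 2.1f --format=json-pretty
    ceph pg repair 2.1f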

Re: [ceph-users] upgrading jewel to luminous fails

2018-06-19 Thread Elias Abacioglu
Hi Mike, On Mon, Jun 18, 2018 at 8:36 PM, Michael Kuriger wrote: > If you delete /var/lib/ceph, all your authentication is gone. I'm running ceph-container with etcd as a K/V store, so it downloads the auth, I believe. Perhaps I'm wrong.

Re: [ceph-users] Ceph Mimic on Debian 9 Stretch

2018-06-19 Thread Eneko Lacunza
Hi Fabian, Hope your arm is doing well :) unless such a backport is created and tested fairly well (and we will spend some more time investigating this internally despite the caveats above), our plan B will probably involve: - building Luminous for Buster to ease the upgrade from

Re: [ceph-users] IO to OSD with librados

2018-06-19 Thread Hervé Ballans
On 19/06/2018 at 09:02, Dan van der Ster wrote: The storage arrays are Nexsan E60 arrays having two active-active redundant controllers and 60 3 TB disk drives. The disk drives are organized into six 8+2 RAID 6 LUNs of 24 TB each. This is not the ideal Ceph hardware. Ceph is designed to use

Re: [ceph-users] performance exporting RBD over NFS

2018-06-19 Thread Frederic BRET
Hi List, If you write to a pool with 3x replication over 10GE, then it will need to ship data 3 times over 10GE to finalize the write, so 350MB/s sounds like a theoretical maximum in terms of a single writer. Sorry, but Janne is wrong: it's the primary OSD's responsibility to write to the
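
To put rough numbers on that, assuming default 3x replication: a single client writing 350 MB/s sends the data once over its own 10GE link to the primary OSDs (~350 MB/s client to primary), and the primaries then send two further copies to the replica OSDs over the cluster network (~700 MB/s aggregate outbound from the primaries), so the 3x amplification lands on the OSD-side links rather than on the client link.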

Re: [ceph-users] IO to OSD with librados

2018-06-19 Thread Jialin Liu
Thanks for the advice, Dan. I'll try to reconfigure the cluster and see if the performance changes. Best, Jialin On Tue, Jun 19, 2018 at 12:02 AM Dan van der Ster wrote: > On Tue, Jun 19, 2018 at 1:04 AM Jialin Liu wrote: > > > > Hi Dan, Thanks for the follow-ups. > > > > I have just tried