Re: [ceph-users] RDMA/RoCE enablement failed with (113) No route to host

2018-12-20 Thread Marc Roos
Thanks for posting this Roman. (replying to Roman Penyaev's message of 20 December 2018, 14:21)

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-20 Thread Hector Martin
On 21/12/2018 03.02, Gregory Farnum wrote: > RBD snapshots are indeed crash-consistent. :) > -Greg Thanks for the confirmation! May I suggest putting this little nugget in the docs somewhere? This might help clarify things for others :) -- Hector Martin (hec...@marcansoft.com) Public Key:

Re: [ceph-users] Multiple OSD crashing on 12.2.0. Bluestore / EC pool / rbd

2018-12-20 Thread Daniel K
I'm hitting this same issue on 12.2.5. Upgraded one node to 12.2.10 and it didn't clear. 6 OSDs flapping with this error. I know this is an older issue but are traces still needed? I don't see a resolution available. Thanks, Dan On Wed, Sep 6, 2017 at 10:30 PM Brad Hubbard wrote: > These

[ceph-users] Package availability for Debian / Ubuntu

2018-12-20 Thread Matthew Vernon
Hi, Since the "where are the bionic packages for Luminous?" question remains outstanding, I thought I'd look at the question a little further. The TL;DR is:
Jewel: built for Ubuntu trusty & xenial; Debian jessie & stretch
Luminous: built for Ubuntu trusty & xenial; Debian jessie & stretch

Re: [ceph-users] RBD snapshot atomicity guarantees?

2018-12-20 Thread Gregory Farnum
On Tue, Dec 18, 2018 at 1:11 AM Hector Martin wrote: > Hi list, > > I'm running libvirt qemu guests on RBD, and currently taking backups by > issuing a domfsfreeze, taking a snapshot, and then issuing a domfsthaw. > This seems to be a common approach. > > This is safe, but it's impactful: the
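
A minimal sketch of the freeze -> snapshot -> thaw flow described above, driving virsh and the rbd CLI from Python; the domain name "guest1" and the image "rbd/guest1-disk" are placeholders, not names from the original post:

import datetime
import subprocess

DOMAIN = "guest1"              # hypothetical libvirt domain
IMAGE = "rbd/guest1-disk"      # hypothetical pool/image backing that domain
SNAP = datetime.datetime.utcnow().strftime("backup-%Y%m%d-%H%M%S")

# Quiesce the guest filesystems, snapshot, then thaw as quickly as possible.
subprocess.run(["virsh", "domfsfreeze", DOMAIN], check=True)
try:
    subprocess.run(["rbd", "snap", "create", f"{IMAGE}@{SNAP}"], check=True)
finally:
    # Always thaw, even if the snapshot failed, to keep the freeze window short.
    subprocess.run(["virsh", "domfsthaw", DOMAIN], check=True)

# The snapshot can then be exported at leisure, e.g.
# rbd export rbd/guest1-disk@<snap-name> /backups/<snap-name>.img

Per Greg's answer, the snapshot alone is already crash-consistent even without the freeze/thaw; the freeze only upgrades that to filesystem/application consistency at the cost of the impact discussed in the thread.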

Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-12-20 Thread Daniel K
Did you ever get anywhere with this? I have 6 OSDs out of 36 continuously flapping with this error in the logs. Thanks, Dan On Fri, Jun 8, 2018 at 11:10 AM Caspar Smit wrote: > Hi all, > > Maybe this will help: > > The issue is with shards 3,4 and 5 of PG 6.3f: > > LOG's of OSD's 16, 17 & 36

Re: [ceph-users] Migration of a Ceph cluster to a new datacenter and new IPs

2018-12-20 Thread Paul Emmerich
I'd do it like this:
* create 2 new mons with the new IPs
* update all clients to the 3 new mon IPs
* delete two old mons
* create 1 new mon
* delete the last old mon
I think it's easier to create/delete mons than to change the IP of an existing mon. This doesn't even incur a downtime for the
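
As a rough illustration of the add-then-remove sequence above, a small helper (a sketch only; the mon names are made up) that waits for quorum between steps so the cluster never drops below a majority of reachable mons:

import json
import subprocess
import time

def wait_for_quorum(expected_mons, timeout=300):
    """Poll `ceph quorum_status` until every expected mon name is in quorum."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        out = subprocess.run(["ceph", "quorum_status", "--format", "json"],
                             capture_output=True, text=True, check=True).stdout
        if set(expected_mons) <= set(json.loads(out).get("quorum_names", [])):
            return
        time.sleep(5)
    raise RuntimeError(f"{expected_mons} not all in quorum after {timeout}s")

# e.g. after deploying the two new mons (names are assumptions):
# wait_for_quorum(["mon-old-a", "mon-new-1", "mon-new-2"])
# subprocess.run(["ceph", "mon", "remove", "mon-old-b"], check=True)

Deploying the new mon daemons themselves (ceph-deploy or a manual mkfs) is left out; the point is only to verify quorum before each removal step.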

Re: [ceph-users] Active mds respawns itself during standby mds reboot

2018-12-20 Thread Paul Emmerich
That can happen if both a mon and an mds fail at the same time; this is a common reason to avoid co-locating the mons with the mds. Or when doing a controlled shutdown: take the mds down first and only take the mon down once the mds state has settled. (I think it shouldn't take 3 minutes for it to
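
A sketch of that shutdown order for a node that co-locates a mon and an mds; the systemd target names assume a stock packaged deployment, and the settle check looks at the failed/damaged ranks in `ceph fs dump`:

import json
import subprocess
import time

# Stop the local MDS first...
subprocess.run(["systemctl", "stop", "ceph-mds.target"], check=True)

# ...and wait until the FS map has settled (no failed or damaged ranks)
# before the co-located mon is taken down as well.
while True:
    out = subprocess.run(["ceph", "fs", "dump", "--format", "json"],
                         capture_output=True, text=True, check=True).stdout
    fss = json.loads(out).get("filesystems", [])
    if all(not f["mdsmap"]["failed"] and not f["mdsmap"]["damaged"] for f in fss):
        break
    time.sleep(5)

subprocess.run(["systemctl", "stop", "ceph-mon.target"], check=True)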

Re: [ceph-users] Ceph health error (was: Prioritize recovery over backfilling)

2018-12-20 Thread Paul Emmerich
Oh, I've seen this bug twice on different clusters with Luminous on EC pools with lots of snapshots in the last few months. Seen it on 12.2.5 and 12.2.10 on CentOS. It's basically a broken object somewhere that kills an OSD and then gets recovered to another OSD, which then also dies. For us

[ceph-users] Bluestore nvme DB/WAL size

2018-12-20 Thread Vladimir Brik
Hello, I am considering using logical volumes of an NVMe drive as DB or WAL devices for OSDs on spinning disks. The documentation recommends against DB devices smaller than 4% of the slow disk size. Our servers have 16x 10TB HDDs and a single 1.5TB NVMe, so dividing it equally will result in
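
Putting numbers on that (the figures are from the post; the 4% rule is the documentation guideline being questioned):

hdd_size_tb = 10
num_hdds = 16
nvme_size_tb = 1.5

recommended_db_tb = 0.04 * hdd_size_tb       # 4% of a 10 TB HDD = 0.4 TB
per_osd_share_tb = nvme_size_tb / num_hdds   # equal split of the single NVMe

print(f"recommended DB per OSD: {recommended_db_tb * 1000:.0f} GB")   # ~400 GB
print(f"available DB per OSD:   {per_osd_share_tb * 1000:.0f} GB")    # ~94 GB

So an equal split gives each OSD roughly 94 GB of DB space, far short of the ~400 GB the 4% guideline would ask for, which is exactly the tension the post is raising.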

Re: [ceph-users] Ceph monitors overloaded on large cluster restart

2018-12-20 Thread Joachim Kraftmayer
Hello Andreas, we had the following experience in recent years: 1 year ago we also completely shut down a 2500+ OSD Ceph cluster and had no problems starting it again (5 mon nodes, each with 4 x 25 Gbit/s). A few years ago, we increased the number of osds to more than 600 in

Re: [ceph-users] why libcephfs API use "struct ceph_statx" instead of "struct stat"

2018-12-20 Thread Gregory Farnum
CephFS is prepared for the statx interface, which doesn't necessarily fill in every member of the stat structure and allows you to request only certain pieces of information. The purpose is so that the client and MDS can take less expensive actions than are required to satisfy a full
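
A minimal sketch of that "ask only for what you need" idea using the Python libcephfs binding (module `cephfs`); it assumes a reachable cluster, a default ceph.conf/keyring, and a path that actually exists:

import cephfs

fs = cephfs.LibCephFS()
fs.conf_read_file()   # read the default /etc/ceph/ceph.conf
fs.mount()
try:
    # Request only size and mtime; unlike a full stat(), fields we did not
    # ask for may be skipped, so the client/MDS can do less work.
    want = cephfs.CEPH_STATX_SIZE | cephfs.CEPH_STATX_MTIME
    stx = fs.statx('/some/file', want, 0)
    print(stx['size'], stx['mtime'])
finally:
    fs.shutdown()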

[ceph-users] Scrub behavior

2018-12-20 Thread Vladimir Brik
Hello, I am experimenting with how Ceph (13.2.2) deals with on-disk data corruption, and I've run into some unexpected behavior. I am wondering if somebody could comment on whether I understand things correctly. In my tests I would dd /dev/urandom onto an OSD's disk and see what would
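
For reference, a sketch of the commands usually involved in this kind of experiment, wrapped in Python; the PG id is a placeholder and the repair step is deliberately left commented out:

import json
import subprocess

PGID = "2.1a"   # hypothetical placement group id

# Force a deep scrub so the corruption written with dd is actually read back.
subprocess.run(["ceph", "pg", "deep-scrub", PGID], check=True)

# Once the scrub has finished, list what it flagged as inconsistent.
out = subprocess.run(["rados", "list-inconsistent-obj", PGID, "--format=json"],
                     capture_output=True, text=True, check=True).stdout
for obj in json.loads(out).get("inconsistents", []):
    errors = {e for shard in obj.get("shards", []) for e in shard.get("errors", [])}
    print(obj["object"]["name"], sorted(errors))

# Only after understanding the damage:
# subprocess.run(["ceph", "pg", "repair", PGID], check=True)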

Re: [ceph-users] Bluestore nvme DB/WAL size

2018-12-20 Thread Stanislav A. Dmitriev
It'll cause problems if your only NVMe drive dies: you'll lose all the DB partitions and all of those OSDs will fail. - Stas

[ceph-users] InvalidObjectName Error when calling the PutObject operation

2018-12-20 Thread Rishabh S
Dear Members, I am trying to upload an object using an SSE-C (customer-provided) key and am getting the following error: botocore.exceptions.ClientError: An error occurred (InvalidObjectName) when calling the PutObject operation: Unknown >>> s3.list_buckets() {u'Owner': {u'DisplayName': 'User for
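
For comparison, a minimal working shape of an SSE-C upload with boto3 against RGW; the endpoint, credentials, bucket and key are placeholders (and note that RGW normally refuses SSE-C over plain HTTP unless `rgw crypt require ssl` is disabled):

import os
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://rgw.example.com",   # hypothetical RGW endpoint
    aws_access_key_id="ACCESS",
    aws_secret_access_key="SECRET",
)

sse_key = os.urandom(32)   # 256-bit customer-provided key; keep it somewhere safe

s3.put_object(
    Bucket="my-bucket",
    Key="reports/2018/secret.bin",    # must itself be a legal object name
    Body=b"payload",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=sse_key,           # boto3 base64-encodes the key and adds the MD5 header
)

InvalidObjectName normally points at the object name rather than at the encryption headers, so the Key value passed alongside the SSE-C parameters is worth double-checking too.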

Re: [ceph-users] Openstack ceph - non bootable volumes

2018-12-20 Thread Eugen Block
Volumes are being created just fine in the "volumes" pool but they are not bootable. Also, ephemeral instances are working fine (disks are being created on the dedicated ceph pool "instances"). That sounds like cinder is missing something regarding glance. So the instance is listed as "ACTIVE"

Re: [ceph-users] Migration of a Ceph cluster to a new datacenter and new IPs

2018-12-20 Thread Burkhard Linke
Hi, On 12/19/18 8:55 PM, Marcus Müller wrote: Hi all, we’re running a ceph hammer cluster with 3 mons and 24 osds (3 same nodes) and need to migrate all servers to a new datacenter and change the IPs of the nodes. I found this tutorial: