Re: [ceph-users] ceph df: Raw used vs. used vs. actual bytes in cephfs

2018-02-19 Thread Flemming Frandsen
I didn't know about ceph df detail, that's quite useful, thanks. I was thinking that the problem had to do with some sort of internal fragmentation, because the filesystem in question does have millions (2.9 M or thereabouts) of files; however, even if 4k is lost for each file, that only
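For scale, a back-of-the-envelope check (not from the original message; the 2.9 M file count is from the preview above, the rest is arithmetic):

    # worst case: one partly-filled 4 KiB allocation unit lost per file
    echo "scale=1; 2900000 * 4096 / 1024 / 1024 / 1024" | bc
    # => ~11 GiB of overhead -- small next to a multi-TB discrepancy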

Re: [ceph-users] mgr[influx] Cannot transmit statistics: influxdb python module not found.

2018-02-19 Thread knawnd
Marc Roos wrote on 13/02/18 00:50: why not use collectd? centos7 rpms should do fine. Marc, sorry I somehow missed your question. One of the reasons could be that collectd is an additional daemon, whereas the influx plugin for ceph is just an additional part of the already running system (ceph).
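If the module really can't import the client library, a likely fix is to install it on the mgr node (a sketch; the package name is an assumption for CentOS 7, where EPEL ships the InfluxDB python client):

    yum install -y python-influxdb              # or: pip install influxdb
    systemctl restart ceph-mgr@$(hostname -s)   # mgr id assumed to match the short hostname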

Re: [ceph-users] rgw bucket inaccessible - appears to be using incorrect index pool?

2018-02-19 Thread Robin H. Johnson
On Mon, Feb 19, 2018 at 07:57:18PM -0600, Graham Allan wrote: > Sorry to send another long followup, but actually... I'm not sure how to > change the placement_rule for a bucket - or at least what I tried does > not seem to work. Using a different (more disposable) bucket, my attempt > went

Re: [ceph-users] Luminous : performance degrade while read operations (ceph-volume)

2018-02-19 Thread nokia ceph
Hi Alfredo Deza, We have a 5-node platform with LVM OSDs created from scratch and another 5-node platform migrated from Kraken, which uses ceph-volume simple. Both have the same issue. Both platforms have only HDDs for OSDs. We also noticed two times more disk IOPS compared to Kraken; this causes less read

Re: [ceph-users] Newbie question: stretch ceph cluster

2018-02-19 Thread ST Wong (ITSC)
Hi, Thanks for your advice. Will try it out. Best Regards, /ST Wong From: Maged Mokhtar [mailto:mmokh...@petasan.org] Sent: Wednesday, February 14, 2018 4:20 PM To: ST Wong (ITSC) Cc: Luis Periquito; Kai Wagner; Ceph Users Subject: Re: [ceph-users] Newbie question: stretch ceph cluster Hi,

Re: [ceph-users] rgw bucket inaccessible - appears to be using incorrect index pool?

2018-02-19 Thread Graham Allan
Sorry to send another long followup, but actually... I'm not sure how to change the placement_rule for a bucket - or at least what I tried does not seem to work. Using a different (more disposable) bucket, my attempt went like this: first created a new placement rule "old-placement" in both
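For reference, the general shape of such an attempt (a hedged sketch built from radosgw-admin's generic metadata commands, not necessarily the exact steps Graham tried; the bucket name and instance id are placeholders):

    radosgw-admin metadata get bucket.instance:mybucket:<instance-id> > bucket.json
    # edit "placement_rule" in bucket.json, then write it back:
    radosgw-admin metadata put bucket.instance:mybucket:<instance-id> < bucket.json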

Re: [ceph-users] Significance of the us-east-1 region when using S3 clients to talk to RGW

2018-02-19 Thread F21
I am using the official ceph/daemon docker image. It starts RGW and creates a zonegroup and zone with their names set to an empty string: https://github.com/ceph/ceph-container/blob/master/ceph-releases/luminous/ubuntu/16.04/daemon/start_rgw.sh#L36:54 $RGW_ZONEGROUP and $RGW_ZONE are both
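To get non-empty names, the entrypoint appears to take them from the environment; a hedged sketch (the env var names come from the start_rgw.sh linked above; the mounts and everything else are placeholders standing in for an otherwise-working invocation):

    docker run -d --net=host \
      -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph \
      -e RGW_ZONEGROUP=default -e RGW_ZONE=default \
      ceph/daemon rgw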

Re: [ceph-users] Significance of the us-east-1 region when using S3 clients to talk to RGW

2018-02-19 Thread Yehuda Sadeh-Weinraub
What is the name of your zonegroup? On Mon, Feb 19, 2018 at 3:29 PM, F21 wrote: > I've done some debugging and the LocationConstraint is not being set by the > SDK by default. > > I do, however, need to set the region on the client to us-east-1 for it to > work. Anything

Re: [ceph-users] Significance of the us-east-1 region when using S3 clients to talk to RGW

2018-02-19 Thread F21
I've done some debugging and the LocationConstraint is not being set by the SDK by default. I do, however, need to set the region on the client to us-east-1 for it to work. Anything else will return an InvalidLocationConstraint error. Francis On 20/02/2018 8:40 AM, Yehuda Sadeh-Weinraub

Re: [ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
Hi Eugen, hmmm, that should be: > rbd -p cpVirtualMachines list | while read LINE; do osdmaptool > --test-map-object $LINE --pool 10 osdmap 2>&1; rbd snap ls > cpVirtualMachines/$LINE | grep -v SNAPID | awk '{ print $2 }' | while read > LINE2; do echo "$LINE"; osdmaptool --test-map-object

Re: [ceph-users] Significance of the us-east-1 region when using S3 clients to talk to RGW

2018-02-19 Thread David Turner
Specifically, my issue was that this was not set in the .s3cfg file: `bucket_location = US` On Mon, Feb 19, 2018 at 5:04 PM David Turner wrote: > I wasn't using the Go SDK. I was using s3cmd when I came across this. > > On Mon, Feb 19, 2018 at 4:42 PM Yehuda
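For the archive, the corresponding config fragment (a sketch; only the bucket_location line itself comes from the message above):

    # tell s3cmd which location to send, so RGW accepts the request
    cat >> ~/.s3cfg <<'EOF'
    bucket_location = US
    EOF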

Re: [ceph-users] Significance of the us-east-1 region when using S3 clients to talk to RGW

2018-02-19 Thread David Turner
I wasn't using the Go SDK. I was using s3cmd when I came across this. On Mon, Feb 19, 2018 at 4:42 PM Yehuda Sadeh-Weinraub wrote: > Sounds like the go sdk adds a location constraint to requests that > don't go to us-east-1. RGW itself is definitely isn't tied to >

Re: [ceph-users] Missing clones

2018-02-19 Thread Mykola Golub
On Mon, Feb 19, 2018 at 10:17:55PM +0100, Karsten Becker wrote: > BTW - how can I find out, which RBDs are affected by this problem. Maybe > a copy/remove of the affected RBDs could help? But how to find out to > which RBDs this PG belongs to? In this case rbd_data.966489238e1f29.250b
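To map an rbd_data.<id> object back to an image, one option is to match the block_name_prefix that rbd info reports (a hedged sketch; the pool name is from this thread, and the prefix is truncated in the archive):

    for img in $(rbd -p cpVirtualMachines ls); do
      rbd info "cpVirtualMachines/$img" \
        | grep -q 'block_name_prefix: rbd_data.966489238e1f29' && echo "$img"
    done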

Re: [ceph-users] rgw bucket inaccessible - appears to be using incorrect index pool?

2018-02-19 Thread Graham Allan
Thanks Robin, Of the two issues, this seems to me like it must be #22928. Since the majority of index entries for this bucket are in the .rgw.buckets pool, but newer entries have been created in .rgw.buckets.index, it's clearly failing to use the explicit placement pool - and with the index

Re: [ceph-users] Missing clones

2018-02-19 Thread Eugen Block
BTW - how can I find out, which RBDs are affected by this problem. Maybe a copy/remove of the affected RBDs could help? But how to find out to which RBDs this PG belongs to? Depending on how many PGs your cluster/pool has, you could dump your osdmap and then run the osdmaptool [1] for every
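A minimal sketch of that approach (pool id 10 is taken from this thread's PG 10.7b9; the object name is a placeholder, since the archive truncates the real one):

    ceph osd getmap -o /tmp/osdmap
    # which PG (and OSD set) does a given object map to?
    osdmaptool /tmp/osdmap --test-map-object rbd_data.<id>.<offset> --pool 10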

Re: [ceph-users] Significance of the us-east-1 region when using S3 clients to talk to RGW

2018-02-19 Thread Yehuda Sadeh-Weinraub
Sounds like the go sdk adds a location constraint to requests that don't go to us-east-1. RGW itself definitely isn't tied to us-east-1, and does not know anything about it (unless you happen to have a zonegroup named us-east-1). Maybe there's a way to configure the sdk to avoid doing that?
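A quick way to answer that question (standard radosgw-admin commands, no assumptions beyond a running RGW; 'get' may need --rgw-zonegroup=<name> if no default is set):

    radosgw-admin zonegroup list    # names of all zonegroups
    radosgw-admin zonegroup get     # details of the default one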

Re: [ceph-users] Understanding/correcting sudden onslaught of unfound objects

2018-02-19 Thread Graham Allan
On 02/17/2018 12:48 PM, David Zafman wrote: The commits below came after v12.2.2 and may impact this issue. When a pg is active+clean+inconsistent, it means that scrub has detected issues with 1 or more replicas of 1 or more objects. An unfound object is a potentially temporary state in
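To see exactly what scrub objects to in an inconsistent PG, Luminous ships a query for it (the pgid here is a placeholder):

    ceph health detail | grep inconsistent
    rados list-inconsistent-obj <pgid> --format=json-pretty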

Re: [ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
BTW - how can I find out, which RBDs are affected by this problem. Maybe a copy/remove of the affected RBDs could help? But how to find out to which RBDs this PG belongs to? Best Karsten On 19.02.2018 19:26, Karsten Becker wrote: > Hi. > > Thank you for the tip. I just tried... but

Re: [ceph-users] Luminous : performance degrade while read operations (ceph-volume)

2018-02-19 Thread Alfredo Deza
On Mon, Feb 19, 2018 at 2:01 PM, nokia ceph wrote: > Hi All, > > We have 5 node clusters with EC 4+1 and use bluestore since last year from > Kraken. > Recently we migrated all our platforms to luminous 12.2.2 and finally all > OSDs migrated to ceph-volume simple type

[ceph-users] Luminous : performance degrade while read operations (ceph-volume)

2018-02-19 Thread nokia ceph
Hi All, We have 5-node clusters with EC 4+1 and have used bluestore since Kraken last year. Recently we migrated all our platforms to Luminous 12.2.2, and finally all OSDs were migrated to the ceph-volume simple type; on a few platforms we installed ceph using ceph-volume. Now we see two times more traffic
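A simple way to put numbers on the "two times more disk IOPS" observation (sysstat's iostat; the device names are placeholders):

    # r/s and rMB/s per data disk, sampled every 5 s while a read
    # workload runs -- compare against the Kraken-era platform
    iostat -xm 5 /dev/sdb /dev/sdc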

Re: [ceph-users] ceph df: Raw used vs. used vs. actual bytes in cephfs

2018-02-19 Thread Pavel Shub
Could you be running into block size (minimum allocation unit) overhead? The default bluestore minimum allocation size is 64k for hdd and 16k for ssd (bluestore_min_alloc_size_hdd/_ssd). This is exacerbated if you have tons of small files. I tend to see this when the "ceph df detail" sum of raw used in pools is less than the global raw bytes used. On
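One can confirm what an OSD is configured with via the admin socket (a sketch; osd.0 is a placeholder, and note the value is baked in at mkfs time, so the running config only reflects what newly created OSDs would get):

    ceph daemon osd.0 config show | grep bluestore_min_alloc_size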

Re: [ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
Hi. Thank you for the tip. I just tried... but unfortunately the import aborts: > Write #10:9de96eca:::rbd_data.f5b8603d1b58ba.1d82:head# > snapset 0=[]:{} > Write #10:9de973fe:::rbd_data.966489238e1f29.250b:18# > Write

Re: [ceph-users] Signature check failures.

2018-02-19 Thread Cary
Gregory, I greatly appreciate your assistance. I recompiled Ceph with -ssl and the nss USE flags set, which is the opposite of what I was using. I am now able to export from our pools without signature check failures. Thank you for pointing me in the right direction. Cary -Dynamic On Fri, Feb 16,
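For other Gentoo users, the change described would look roughly like this (a sketch; the package atom and flag names are assumptions based on Cary's description above):

    echo "sys-cluster/ceph -ssl nss" >> /etc/portage/package.use/ceph
    emerge --ask --oneshot sys-cluster/ceph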

Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

2018-02-19 Thread Daniel Gryniewicz
To my knowledge, no one has done any work on ganesha + ceph and selinux. Fedora (and RHEL) includes config in its selinux package for ganesha + gluster, but I'm sure there are missing bits for ceph. Daniel On 02/17/2018 03:15 PM, Oliver Freyermuth wrote: Hi together, many thanks for the

Re: [ceph-users] Missing clones

2018-02-19 Thread Eugen Block
Could [1] be of interest? Exporting the intact PG and importing it back to the respective OSD sounds promising. [1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-July/019673.html Zitat von Karsten Becker : Hi. We have size=3 min_size=2. But this
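For reference, the shape of that procedure with ceph-objectstore-tool (a hedged sketch -- both OSDs must be stopped while the tool runs; pgid 10.7b9 appears in this thread, the OSD ids are placeholders, and newer versions require --force with --op remove):

    # on a node holding an intact copy of the PG:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-NN \
      --pgid 10.7b9 --op export --file /tmp/pg.10.7b9
    # on the OSD with the damaged copy, remove it and import the good one:
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-MM \
      --pgid 10.7b9 --op remove
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-MM \
      --pgid 10.7b9 --op import --file /tmp/pg.10.7b9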

Re: [ceph-users] Migrating to new pools

2018-02-19 Thread Jason Dillaman
Thanks! On Mon, Feb 19, 2018 at 10:33 AM, Eugen Block wrote: > Hi, > > I created a ticket for the rbd import issue: > > https://tracker.ceph.com/issues/23038 > > Regards, > > Eugen > > > Zitat von Jason Dillaman : > >> On Fri, Feb 16, 2018 at 11:20 AM, Eugen

[ceph-users] radosgw + OpenLDAP = Failed the auth strategy, reason=-13

2018-02-19 Thread Konstantin Shalygin
Hi cephers. I try rgw (Luminous 12.2.2) + OpenLDAP. My settings: "rgw_ldap_binddn": "cn=cisco,ou=people,dc=example,dc=local", "rgw_ldap_dnattr": "uid", "rgw_ldap_searchdn": "ou=s3_users,dc=example,dc=local", "rgw_ldap_searchfilter": "(objectClass=inetOrgPerson)",
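One way to sanity-check those settings outside of rgw (standard OpenLDAP client tools; the server URI is an assumption, -W prompts for the bind password):

    ldapsearch -x -H ldap://localhost \
      -D "cn=cisco,ou=people,dc=example,dc=local" -W \
      -b "ou=s3_users,dc=example,dc=local" \
      "(objectClass=inetOrgPerson)" uid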

Re: [ceph-users] Missing clones

2018-02-19 Thread Eugen Block
When we ran our test cluster with size 2 I experienced a similar issue, but that was in Hammer. There I could find the corresponding PG data in the filesystem and copy it to the damaged PG. But now that we also run Bluestore on Luminous, I don't know yet how to fix this kind of issue; maybe

Re: [ceph-users] Migrating to new pools

2018-02-19 Thread Eugen Block
Hi, I created a ticket for the rbd import issue: https://tracker.ceph.com/issues/23038 Regards, Eugen Zitat von Jason Dillaman : On Fri, Feb 16, 2018 at 11:20 AM, Eugen Block wrote: Hi Jason, ... also forgot to mention "rbd export --export-format 2"

Re: [ceph-users] puppet for the deployment of ceph

2018-02-19 Thread Benjeman Meekhof
We use this one, now heavily modified in our own fork. I'd sooner point you at the original unless it is missing something you need. Ours has diverged a bit and makes no attempt to support anything outside our specific environment (RHEL7). https://github.com/openstack/puppet-ceph

Re: [ceph-users] "Cannot get stat of OSD" in ceph.mgr.log upon enabling influx plugin

2018-02-19 Thread John Spray
On Mon, Feb 19, 2018 at 3:07 PM, Benjeman Meekhof wrote: > The 'cannot stat' messages are normal at startup, we see them also in > our working setup with mgr influx module. Maybe they could be fixed > by delaying the module startup, or having it check for some other > 'all

Re: [ceph-users] "Cannot get stat of OSD" in ceph.mgr.log upon enabling influx plugin

2018-02-19 Thread Benjeman Meekhof
The 'cannot stat' messages are normal at startup, we see them also in our working setup with mgr influx module. Maybe they could be fixed by delaying the module startup, or having it check for some other 'all good' status but I haven't looked into it. You should only be seeing them when the mgr

Re: [ceph-users] "Cannot get stat of OSD" in ceph.mgr.log upon enabling influx plugin

2018-02-19 Thread knawnd
Forgot to mention that influx self-test produces reasonable output too (a long json list with some metrics and timestamps), and there are the following lines in the mgr log: 2018-02-19 17:35:04.208858 7f33a50ec700 1 mgr.server reply handle_command (0) Success 2018-02-19 17:35:04.245285

[ceph-users] "Cannot get stat of OSD" in ceph.mgr.log upon enabling influx plugin

2018-02-19 Thread knawnd
Dear Ceph users, I am trying to enable the influx plugin for ceph following http://docs.ceph.com/docs/master/mgr/influx/ but no data arrives in the InfluxDB database. As soon as the 'ceph mgr module enable influx' command is executed on one of the ceph mgr nodes (running on CentOS 7.4.1708) there are the following
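For reference, the Luminous influx module reads its connection settings from config keys; a sketch (key names as I recall from the docs page linked above -- double-check there; all values here are placeholders):

    ceph mgr module enable influx
    ceph config-key set mgr/influx/hostname influxdb.example.com
    ceph config-key set mgr/influx/database ceph
    ceph config-key set mgr/influx/username admin
    ceph config-key set mgr/influx/password secret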

Re: [ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
Hi. We have size=3 min_size=2. But this "upgrade" was only done over the weekend; we had size=2 min_size=1 before. Best Karsten On 19.02.2018 13:02, Eugen Block wrote: > Hi, > > just to rule out the obvious, which size does the pool have? You aren't > running it with size = 2, are you? >
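For the archive, that kind of change and its verification look like this (the pool name is taken from this thread):

    ceph osd pool set cpVirtualMachines size 3
    ceph osd pool set cpVirtualMachines min_size 2
    ceph osd pool ls detail | grep cpVirtualMachines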

Re: [ceph-users] Missing clones

2018-02-19 Thread Eugen Block
Hi, just to rule out the obvious, which size does the pool have? You aren't running it with size = 2, are you? Zitat von Karsten Becker : Hi, I have one damaged PG in my cluster. All OSDs are BlueStore. How do I fix this? 2018-02-19 11:00:23.183695 osd.29

[ceph-users] Missing clones

2018-02-19 Thread Karsten Becker
Hi, I have one damaged PG in my cluster. All OSDs are BlueStore. How do I fix this? > 2018-02-19 11:00:23.183695 osd.29 [ERR] repair 10.7b9 > 10:9defb021:::rbd_data.2313975238e1f29.0002cbb5:head expected clone > 10:9defb021:::rbd_data.2313975238e1f29.0002cbb5:64e 1 missing >

Re: [ceph-users] mon service failed to start

2018-02-19 Thread Caspar Smit
Hi Behnam, I would first recommend running a filesystem check on the monitor disk to see if there are any inconsistencies. Is the disk the monitor is running on a spinning disk or an SSD? If SSD, you should check the wear level stats through smartctl. Maybe trim (discard) enabled on
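A hedged sketch of those checks (the device node and partition are placeholders; run the fsck only with the filesystem unmounted):

    smartctl -a /dev/sdX     # look at wear-leveling / media error attributes
    fsck -f /dev/sdX1        # filesystem must be unmounted first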

Re: [ceph-users] Ceph Bluestore performance question

2018-02-19 Thread Caspar Smit
"I checked and the OSD-hosts peaked at a load average of about 22 (they have 24+24HT cores) in our dd benchmark, but stayed well below that (only about 20 % per OSD daemon) in the rados bench test." Maybe because your dd test uses bs=1M and rados bench is using 4M as default block size? Caspar

Re: [ceph-users] Upgrade to ceph 12.2.2, libec_jerasure.so: undefined symbol: _ZN4ceph6buffer3ptrC1ERKS1_

2018-02-19 Thread Sebastian Koch - ilexius GmbH
If anyone else faces this problem in the future: I fixed it now by running "apt install --reinstall ceph-osd". Don't know why the normal apt upgrade broke it. On 18.02.2018 22:35, Sebastian Koch - ilexius GmbH wrote: > Hello, > > I ran "apt upgrade" on Ubuntu 16.04 on one node, now the two OSDs

[ceph-users] How to really change public network in ceph

2018-02-19 Thread Mario Giammarco
Hello, I have a test proxmox/ceph cluster with four servers. I need to change the ceph public subnet from 10.1.0.0/24 to 10.1.5.0/24. I have read documentation and tutorials. The most critical part seems to be monitor map editing, but it seems to me that the OSDs need to bind to the new subnet too. I tried to
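The usual shape of the monmap surgery, for reference (a hedged sketch -- the mon id and the host part of the new address are placeholders; each mon must be stopped while its map is injected, and public_network/mon_host in ceph.conf need updating so the OSDs bind to the new subnet too):

    ceph mon getmap -o /tmp/monmap          # grab the current map while the cluster is up
    monmaptool --print /tmp/monmap
    monmaptool --rm mon-a /tmp/monmap                   # drop the old address
    monmaptool --add mon-a 10.1.5.11:6789 /tmp/monmap   # re-add on the new subnet
    ceph-mon -i mon-a --inject-monmap /tmp/monmap       # run with this mon stopped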