Re: [ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Ben Hines
The daily log rotation. -Ben On Wed, Aug 30, 2017 at 3:09 PM, Bryan Banister wrote: > Looking at the systemd service it does show that twice, at roughly the > same time and one day apart, the service did receive a HUP signal: > > Aug 29 16:31:02 carf-ceph-osd02
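The HUP in Bryan's log is consistent with what ceph's packaged logrotate script does after rotating: it signals the daemons so they reopen their log files. A rough sketch of the stanza (exact paths, rotation options, and daemon list vary by distro and ceph version — treat this as illustrative, not the shipped file):

```conf
/var/log/ceph/*.log {
    rotate 7
    daily
    compress
    sharedscripts
    postrotate
        # -1 is SIGHUP: ask each ceph daemon to reopen its logs
        killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd radosgw || true
    endscript
    missingok
    notifempty
}
```

This matches the "task name: killall -q" seen in Bryan's journal entry below.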

Re: [ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Bryan Banister
Looking at the systemd service it does show that twice, at roughly the same time and one day apart, the service did receive a HUP signal: Aug 29 16:31:02 carf-ceph-osd02 radosgw[130050]: 2017-08-29 16:31:02.528559 7fffc641c700 -1 received signal: Hangup from PID: 73176 task name: killall -q

Re: [ceph-users] Ceph re-ip of OSD node

2017-08-30 Thread Jake Young
Hey Ben, Take a look at the osd log for another OSD whose IP you did not change. What errors does it show related to the re-IP'd OSD? Is the other OSD trying to communicate with the re-IP'd OSD's old IP address? Jake On Wed, Aug 30, 2017 at 3:55 PM Jeremy Hanmer

Re: [ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Bryan Banister
We are not sending a HUP signal that we know about. We were not modifying our configuration. However all user accounts in the RGW were lost! -Bryan -Original Message- From: Yehuda Sadeh-Weinraub [mailto:yeh...@redhat.com] Sent: Wednesday, August 30, 2017 3:30 PM To: Bryan Banister

Re: [ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Yehuda Sadeh-Weinraub
On Wed, Aug 30, 2017 at 5:44 PM, Bryan Banister wrote: > Not sure what’s happening but we started to put a decent load on the RGWs we > have set up and we were seeing failures with the following kind of > fingerprint: > > > > 2017-08-29 17:06:22.072361 7ffdc501a700 1

Re: [ceph-users] Ceph re-ip of OSD node

2017-08-30 Thread Jeremy Hanmer
This is simply not true. We run quite a few ceph clusters with rack-level layer2 domains (thus routing between racks) and everything works great. On Wed, Aug 30, 2017 at 10:52 AM, David Turner wrote: > ALL OSDs need to be running the same private network at the same time.

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread Deepak Naidu
Not sure how often http://docs.ceph.com/docs/master/releases/ gets updated; a timeline roadmap would help. -- Deepak -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Abhishek Lekshmanan Sent: Tuesday, August 29, 2017 11:20 AM To:

Re: [ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Bryan Banister
Looks like my RGW users also got deleted/lost?! [root@carf-ceph-osd01 ~]# radosgw-admin user list [] Yikes!! Any thoughts? -Bryan From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bryan Banister Sent: Wednesday, August 30, 2017 9:45 AM To: ceph-users@lists.ceph.com

Re: [ceph-users] Ceph re-ip of OSD node

2017-08-30 Thread Daniel K
Just curious why it wouldn't work as long as the IPs were reachable? Is there something going on in layer 2 with Ceph that wouldn't survive a trip across a router? On Wed, Aug 30, 2017 at 1:52 PM, David Turner wrote: > ALL OSDs need to be running the same private

Re: [ceph-users] Ceph re-ip of OSD node

2017-08-30 Thread David Turner
ALL OSDs need to be running the same private network at the same time. ALL clients, RGW, OSD, MON, MGR, MDS, etc, etc need to be running on the same public network at the same time. You cannot do this as a one at a time migration to the new IP space. Even if all of the servers can still
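David's point is that the public and cluster networks are cluster-wide settings, not per-host ones. A hypothetical ceph.conf fragment showing the two settings (the addresses here are made-up examples, not from the thread):

```ini
[global]
public network  = 10.0.0.0/24
cluster network = 10.0.1.0/24
```

During a re-IP, every daemon must be reachable by every other on these networks at the same time, which is why a one-host-at-a-time move across disjoint subnets fails unless the old and new ranges are mutually routable.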

Re: [ceph-users] Ceph on RDMA

2017-08-30 Thread Haomai Wang
On Wed, Aug 30, 2017 at 7:53 AM, Jeroen Oldenhof wrote: > Hi, > > I used 'https://community.mellanox.com/docs/DOC-2721', and was under the > impression I followed all steps.. but I somehow skipped over the > /usr/lib/systemd/system/ (/lib/systemd/system/ in case of Ubuntu) >

Re: [ceph-users] get error when use prometheus plugin of ceph-mgr

2017-08-30 Thread John Spray
On Wed, Aug 30, 2017 at 5:54 PM, shawn tim wrote: > Hi, John, > Do you mean this error > text format parsing error in line 8: invalid metric name in comment Yes, iirc that's the one you get from the plus/minus signs in MDS metric names. > I install ceph from rpm, but I will

Re: [ceph-users] get error when use prometheus plugin of ceph-mgr

2017-08-30 Thread shawn tim
Hi, John, Do you mean this error: text format parsing error in line 8: invalid metric name in comment. I install ceph from rpm, but I will try to patch it, many thanks. When using prometheus 1.7.1 instead of 2.0.0-beta, the "no token found error" disappeared. Maybe prometheus 2.0.0-beta still has

[ceph-users] Ceph re-ip of OSD node

2017-08-30 Thread Ben Morrice
Hello. We have a small cluster that we need to move to a different network in the same datacentre. My workflow was the following (for a single OSD host), but I failed (further details below):
1) ceph osd set noout
2) stop ceph-osd processes
3) change IP, gateway, domain (short hostname is

Re: [ceph-users] Ceph on RDMA

2017-08-30 Thread Jeroen Oldenhof
Hi, I used  'https://community.mellanox.com/docs/DOC-2721', and was under the impression I followed all steps.. but I somehow skipped over the /usr/lib/systemd/system/ (/lib/systemd/system/ in case of Ubuntu) ceph-xxx@.service files. I did update the /etc/security/limits.conf file for

[ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Bryan Banister
Not sure what's happening, but we started to put a decent load on the RGWs we have set up and we were seeing failures with the following kind of fingerprint:
2017-08-29 17:06:22.072361 7ffdc501a700  1 rgw realm reloader: Frontends paused
2017-08-29 17:06:22.072359 7fffacbe9700  1 civetweb:

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Ronny Aasen
On 30.08.2017 15:32, Steve Taylor wrote: I'm not familiar with dd_rescue, but I've just been reading about it. I'm not seeing any features that would be beneficial in this scenario that aren't also available in dd. What specific features give it "really a far better chance of restoring a copy

Re: [ceph-users] Luminous CephFS on EC - how?

2017-08-30 Thread Martin Millnert
On Wed, Aug 30, 2017 at 02:06:29PM +0100, John Spray wrote: > > As I wrote in my ticket there is room for improvement in docs on how to > > do it and with cli/api rejecting "ceph fs new " with > > pool1 or pool2 being EC. > > The CLI will indeed reject attempts to use an EC pool for metadata, >

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread Sage Weil
On Wed, 30 Aug 2017, Xiaoxi Chen wrote: > Also upgraded our pre-production to Luminous from Jewel. > > Some nits: > > 1. no clear explanation on what happens if a pool is not associated > with an application, and the difference between "rbd init " and > "ceph osd pool application enable

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Steve Taylor
I'm not familiar with dd_rescue, but I've just been reading about it. I'm not seeing any features that would be beneficial in this scenario that aren't also available in dd. What specific features give it "really a far better chance of restoring a copy of your disk" than dd? I'm always

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Steve Taylor
Yes, if I had created the RBD in the same cluster I was trying to repair then I would have used rbd-fuse to "map" the RBD in order to avoid potential deadlock issues with the kernel client. I had another cluster available, so I copied its config file to the osd node, created the RBD in the

Re: [ceph-users] Luminous CephFS on EC - how?

2017-08-30 Thread John Spray
On Wed, Aug 30, 2017 at 1:50 PM, Martin Millnert wrote: > Hi, > > On Wed, Aug 30, 2017 at 12:28:12PM +0100, John Spray wrote: >> On Wed, Aug 30, 2017 at 7:21 AM, Martin Millnert wrote: >> > Hi, >> > >> > what is the proper method to not only setup but also

Re: [ceph-users] Luminous CephFS on EC - how?

2017-08-30 Thread Martin Millnert
On Wed, Aug 30, 2017 at 11:06:02AM +0200, Peter Maloney wrote: > What kind of terrible mail client is this that sends a multipart message where > one part is blank and that's the one Thunderbird chooses to show? (see > blankness below) It's a real email client (mutt) sending text to the mailing

Re: [ceph-users] Luminous CephFS on EC - how?

2017-08-30 Thread Martin Millnert
Hi, On Wed, Aug 30, 2017 at 12:28:12PM +0100, John Spray wrote: > On Wed, Aug 30, 2017 at 7:21 AM, Martin Millnert wrote: > > Hi, > > > > what is the proper method to not only setup but also successfully use > > CephFS on erasure coded data pool? > > The docs[1] very vaguely

Re: [ceph-users] osd heartbeat protocol issue on upgrade v12.1.0 ->v12.2.0

2017-08-30 Thread Marc Roos
I had this also once. If you update all nodes and then systemctl restart 'ceph-osd@*' on all nodes, you should be fine. But first the monitors of course -Original Message- From: Thomas Gebhardt [mailto:gebha...@hrz.uni-marburg.de] Sent: woensdag 30 augustus 2017 14:10 To:

[ceph-users] osd heartbeat protocol issue on upgrade v12.1.0 ->v12.2.0

2017-08-30 Thread Thomas Gebhardt
Hello, when I upgraded (so far just a single osd node) from v12.1.0 -> v12.2.0, its osds started flapping and finally all got marked as down. As far as I can see, this is due to an incompatibility of the osd heartbeat protocol between the two versions: v12.2.0 node: 7f4f7b6e6700 -1 osd.X 3879

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread Nigel Williams
On 30 August 2017 at 20:53, John Spray wrote: > The mgr_initial_modules setting is only applied at the point of > cluster creation, ok. > so I would guess that if it didn't seem to take > effect then this was an upgrade from >=11.x not quite, it was a clean install of

Re: [ceph-users] get error when use prometheus plugin of ceph-mgr

2017-08-30 Thread John Spray
On Wed, Aug 30, 2017 at 3:47 AM, shawn tim wrote: > > > Hello, > I just want to try prometheus plugin of ceph-mgr. > Following this doc(http://docs.ceph.com/docs/master/mgr/prometheus/). I get > output like > > [root@ceph01 ~]# curl localhost:9283/metrics/ | head > % Total

Re: [ceph-users] Luminous CephFS on EC - how?

2017-08-30 Thread John Spray
On Wed, Aug 30, 2017 at 7:21 AM, Martin Millnert wrote: > Hi, > > what is the proper method to not only setup but also successfully use > CephFS on erasure coded data pool? > The docs[1] very vaguely state that erasure coded pools do not support omap > operations hence, "For

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread John Spray
On Wed, Aug 30, 2017 at 9:54 AM, Nigel Williams wrote: > On 30 August 2017 at 17:43, Mark Kirkwood > wrote: >> Yes - you just edit /var/lib/ceph/bootstrap-mgr/ceph.keyring so the key >> matches what 'ceph auth list' shows and re-deploy

Re: [ceph-users] Power outages!!! help!

2017-08-30 Thread Ronny Aasen
[snip] I'm not sure if I am liking what I see on fdisk... it doesn't show sdb1. I hope it shows up when I run dd_rescue to other drive... =P # fdisk /dev/sdb Welcome to fdisk (util-linux 2.25.2). Changes will remain in memory only, until you decide to write them. Be careful before using

Re: [ceph-users] Reaching aio-max-nr on Ubuntu 16.04 with Luminous

2017-08-30 Thread Thomas Bennett
Hi Dan, Great! Thanks for the feedback. Much appreciated. For completion, here is my ansible role:
- name: Increase aio-max-nr for bluestore
  sysctl:
    name: fs.aio-max-nr
    value: 1048576
    sysctl_file: /etc/sysctl.d/ceph-tuning.conf
    sysctl_set: yes
Cheers, Tom On Wed, Aug 30, 2017

[ceph-users] [rgw][s3] Object not in objects list

2017-08-30 Thread Rudenko Aleksandr
Hi, I use ceph 0.94.10 (hammer) with radosgw as an S3-compatible object store. I have a few objects in one bucket with a strange problem. I use awscli as the s3 client. GET/HEAD on the objects works fine, but listing does not: the objects don't appear in the bucket listing. Object metadata: radosgw-admin bi list

Re: [ceph-users] Luminous CephFS on EC - how?

2017-08-30 Thread Peter Maloney
What kind of terrible mail client is this that sends a multipart message where one part is blank and that's the one Thunderbird chooses to show? (see blankness below) Yes you're on the right track. As long as the main fs is on a replicated pool (the one with omap), the ones below it (using file

Re: [ceph-users] Centos7, luminous, cephfs, .snaps

2017-08-30 Thread Nigel Williams
On 30 August 2017 at 18:52, Marc Roos wrote: > I noticed it is .snap not .snaps Yes > mkdir: cannot create directory ‘.snap/snap1’: Operation not permitted > > Is this because my permissions are insufficient on the client id? fairly sure you've forgotten this step:
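The "Operation not permitted" on mkdir in .snap usually means snapshots haven't been enabled on the filesystem; in luminous they are off by default. A sketch of the enabling step Nigel is alluding to (this is the pre-Mimic flag; verify against your version's docs, and note snapshots were still considered experimental in luminous):

```shell
ceph mds set allow_new_snaps true --yes-i-really-mean-it
# then, inside the mounted filesystem:
mkdir .snap/snap1
```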

[ceph-users] Correct osd permissions

2017-08-30 Thread Marc Roos
I have some osds with these permissions, and some without mgr. What are the correct ones to have for luminous?
osd.0
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.14
        caps: [mon] allow profile osd
        caps: [osd] allow *
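For luminous, the upgrade notes recommend that OSD keys carry an mgr cap in addition to the mon and osd caps, i.e. the osd.0 form above. A sketch of bringing an older key like osd.14 in line (check the luminous release notes for the exact recommended caps before running this against a real cluster):

```shell
ceph auth caps osd.14 mon 'allow profile osd' mgr 'allow profile osd' osd 'allow *'
```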

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread Nigel Williams
On 30 August 2017 at 17:43, Mark Kirkwood wrote: > Yes - you just edit /var/lib/ceph/bootstrap-mgr/ceph.keyring so the key > matches what 'ceph auth list' shows and re-deploy the mgr (worked for me in > 12.1.3/4 and 12.2.0). thanks for the tip, what I did to get it

Re: [ceph-users] Centos7, luminous, cephfs, .snaps

2017-08-30 Thread Marc Roos
I noticed it is .snap not .snaps
[]# cd test/
[]# ls -arlt
total 614400
drwxr-xr-x 1 root root        13 Aug 29 23:39 ..
-rw-r--r-- 1 root root 209715200 Aug 29 23:39 test1.img
-rw-r--r-- 1 root root 209715200 Aug 29 23:39 test2.img
-rw-r--r-- 1 root root 209715200 Aug 29 23:40 test3.img

[ceph-users] A question about “lease issued to client” in ceph mds

2017-08-30 Thread Meyers Mark
Hi: I am Mark Meyers, an independent C++ developer, and I enjoy working on ceph. I want to ask a question about "lease issued to client". In the ceph source, src/mds/Locker.cc, Locker::issue_client_lease: if(... &&

Re: [ceph-users] Reaching aio-max-nr on Ubuntu 16.04 with Luminous

2017-08-30 Thread Dan van der Ster
Hi Thomas, Yes we set it to a million. From our puppet manifest:
# need to increase aio-max-nr to allow many bluestore devs
sysctl { 'fs.aio-max-nr':
  val => '1048576'
}
Cheers, Dan On Aug 30, 2017 9:53 AM, "Thomas Bennett" wrote: > > Hi, > > I've

Re: [ceph-users] v12.2.0 Luminous released , collectd json update?

2017-08-30 Thread Marc Roos
Now that 12.2.0 is released, how and through whom should patches for collectd be submitted?
Aug 30 10:40:42 c01 collectd: ceph plugin: JSON handler failed with status -1.
Aug 30 10:40:42 c01 collectd: ceph plugin: cconn_handle_event(name=osd.8,i=4,st=4): error 1
Aug 30 10:40:42 c01

[ceph-users] Reaching aio-max-nr on Ubuntu 16.04 with Luminous

2017-08-30 Thread Thomas Bennett
Hi, I've been testing out Luminous and I've noticed that at some point the number of osds per node was limited by aio-max-nr. By default it's set to 65536 in Ubuntu 16.04. Has anyone else experienced this issue? fs.aio-nr is currently sitting at 196608 with 48 osds. I have 48 osds per node so
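Working from the numbers in the thread: 196608 in-flight aio contexts across 48 OSDs is 4096 per bluestore OSD, so Ubuntu's default fs.aio-max-nr of 65536 would only accommodate about 16 OSDs per node. A quick check of that arithmetic:

```shell
# Observed values from the thread
aio_nr=196608   # fs.aio-nr with 48 OSDs running
osds=48
default_max=65536   # Ubuntu 16.04 default fs.aio-max-nr

per_osd=$((aio_nr / osds))            # aio contexts per bluestore OSD
max_osds=$((default_max / per_osd))   # OSDs that fit under the default limit
echo "per-OSD aio contexts: $per_osd; OSDs under default limit: $max_osds"
```

which explains why dense nodes need the limit raised (Thomas and Dan both settle on 1048576).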

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread Mark Kirkwood
On 30/08/17 18:48, Nigel Williams wrote: On 30 August 2017 at 16:05, Mark Kirkwood wrote: http://tracker.ceph.com/issues/20950 So the mgr creation requires surgery still :-( is there a way out of this error with ceph-mgr? mgr init Authentication failed, did

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread Nigel Williams
> On 30 August 2017 at 16:05, Mark Kirkwood > wrote: >> http://tracker.ceph.com/issues/20950 >> >> So the mgr creation requires surgery still :-( is there a way out of this error with ceph-mgr? mgr init Authentication failed, did you specify a mgr ID with a valid

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread Nigel Williams
On 30 August 2017 at 16:05, Mark Kirkwood wrote: > Very nice! > > I tested an upgrade from Jewel, pretty painless. However we forgot to merge: > > http://tracker.ceph.com/issues/20950 > > So the mgr creation requires surgery still :-( > > regards > > Mark > > > > On

[ceph-users] Luminous CephFS on EC - how?

2017-08-30 Thread Martin Millnert
Hi, what is the proper method to not only setup but also successfully use CephFS on erasure coded data pool? The docs[1] very vaguely state that erasure coded pools do not support omap operations hence, "For Cephfs, using an erasure coded pool means setting that pool in a file layout.". The file
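The thread converges on the docs' file-layout remark meaning roughly: keep the filesystem's default data pool replicated (it holds the omap backtraces EC pools can't), attach the EC pool as an additional data pool, and direct files into it via a layout attribute. A hypothetical sketch (the pool name `ecpool`, fs name `cephfs`, and mount point are assumptions, and allow_ec_overwrites requires bluestore OSDs on luminous):

```shell
ceph osd pool set ecpool allow_ec_overwrites true
ceph fs add_data_pool cephfs ecpool
# route new files under this directory to the EC pool:
setfattr -n ceph.dir.layout.pool -v ecpool /mnt/cephfs/archive
```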

Re: [ceph-users] v12.2.0 Luminous released

2017-08-30 Thread Mark Kirkwood
Very nice! I tested an upgrade from Jewel, pretty painless. However we forgot to merge: http://tracker.ceph.com/issues/20950 So the mgr creation requires surgery still :-( regards Mark On 30/08/17 06:20, Abhishek Lekshmanan wrote: We're glad to announce the first release of Luminous