Re: [ceph-users] Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again

2018-07-17 Thread Bryan Banister
19.1fdf query Thanks, -Bryan From: Tom W [mailto:to...@ukfast.co.uk] Sent: Tuesday, July 17, 2018 5:36 PM To: Bryan Banister ; ceph-users@lists.ceph.com Subject: RE: Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again Note: External
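
For reference, 19.1fdf above is the placement group being queried in this thread; a minimal sketch of pulling up a single PG's state with the stock CLI:

  # dump the full peering/recovery state of the PG named in the thread
  ceph pg 19.1fdf query
  # or just the headline state fields
  ceph pg 19.1fdf query | grep -A2 '"state"'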

Re: [ceph-users] Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again

2018-07-17 Thread Bryan Banister
s team to see if they can also help rule out any networking related issues. Cheers, -Bryan From: Tom W [mailto:to...@ukfast.co.uk] Sent: Tuesday, July 17, 2018 5:06 PM To: Bryan Banister ; ceph-users@lists.ceph.com Subject: RE: Cluster in bad shape, seemingly endless cycle of OSDs failed, then marke

Re: [ceph-users] Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again

2018-07-17 Thread Bryan Banister
that unsetting 'nodown' will just return things to the previous state. Thanks! -Bryan From: Tom W [mailto:to...@ukfast.co.uk] Sent: Tuesday, July 17, 2018 4:19 PM To: Bryan Banister ; ceph-users@lists.ceph.com Subject: RE: Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down

Re: [ceph-users] Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again

2018-07-17 Thread Bryan Banister
thinking of unsetting the 'nodown" now to see what it does, but is there any other recommendations here before I do that? Thanks again! -Bryan From: Tom W [mailto:to...@ukfast.co.uk] Sent: Tuesday, July 17, 2018 1:58 PM To: Bryan Banister ; ceph-users@lists.ceph.com Subject: Re: Cluste
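
The nodown discussion above refers to the cluster-wide OSD flag; a minimal sketch of checking and toggling it with the standard ceph CLI:

  # see whether nodown (or noout, norebalance, ...) is currently set
  ceph osd dump | grep flags
  # clear the flag so failed OSDs can actually be marked down again
  ceph osd unset nodown
  # re-apply it if the cluster starts flapping again
  ceph osd set nodown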

Re: [ceph-users] Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again

2018-07-17 Thread Bryan Banister
-users-boun...@lists.ceph.com] On Behalf Of Bryan Banister Sent: Tuesday, July 17, 2018 12:08 PM To: Tom W ; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again Note: External Email

Re: [ceph-users] Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again

2018-07-17 Thread Bryan Banister
To: Bryan Banister ; ceph-users@lists.ceph.com Subject: Re: Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again Note: External Email Hi Bryan, What version of Ceph are you currently running on, and do you run
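
A quick way to answer the version question raised above on a Luminous-era cluster (the versions subcommand exists from Luminous onward):

  # per-daemon-type version summary (Luminous and later)
  ceph versions
  # or ask every OSD directly
  ceph tell osd.* version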

[ceph-users] Cluster in bad shape, seemingly endless cycle of OSDs failed, then marked down, then booted, then failed again

2018-07-17 Thread Bryan Banister
Hi all, We're still very new to managing Ceph and seem to have a cluster that is in an endless loop of OSDs failing, being marked down, booting, and then failing again. Here are some example logs: 2018-07-17 16:48:28.976673 mon.rook-ceph-mon7 [INF] osd.83 failed
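
A hedged first-pass checklist for this kind of fail/boot loop; the commands are standard ceph CLI, the log path is the default naming, and osd.83 is taken from the log line above:

  # watch failure reports as they arrive
  ceph -w
  # see which OSDs are currently marked down and where they live
  ceph osd tree | grep -i down
  # look for heartbeat or network complaints from a flapping OSD
  grep -i 'heartbeat\|no reply' /var/log/ceph/ceph-osd.83.log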

[ceph-users] Many inconsistent PGs in EC pool, is this normal?

2018-06-28 Thread Bryan Banister
Hi all, We started running an EC pool based object store, set up with a 4+2 configuration, and we seem to be getting almost constant reports of inconsistent PGs during scrub operations. For example: root@rook-tools:/# ceph pg ls inconsistent PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED
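
For the inconsistent PGs reported above, a minimal sketch of digging into one of them (the PG id used here is a placeholder):

  # list every PG flagged inconsistent by scrub
  ceph pg ls inconsistent
  # show exactly which objects/shards scrub disagreed on, for one PG
  rados list-inconsistent-obj 20.3a --format=json-pretty
  # once the cause is understood, ask the primary to repair it
  ceph pg repair 20.3a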

Re: [ceph-users] Ceph tech talk on deploy ceph with rook on kubernetes

2018-05-24 Thread Bryan Banister
Hi Sage, Please provide a link to the youtube video once it's posted, thanks!! -Bryan -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sage Weil Sent: Thursday, May 24, 2018 12:04 PM To: ceph-de...@vger.kernel.org; ceph-users@lists.ceph.com

Re: [ceph-users] Increasing number of PGs by not a factor of two?

2018-05-18 Thread Bryan Banister
+1 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Kai Wagner Sent: Thursday, May 17, 2018 4:20 PM To: David Turner Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Increasing number of PGs by not a factor of two? Great summary David.

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-02-20 Thread Bryan Banister
: David Turner [mailto:drakonst...@gmail.com] Sent: Friday, February 16, 2018 3:21 PM To: Bryan Banister <bbanis...@jumptrading.com> Cc: Bryan Stillwell <bstillw...@godaddy.com>; Janne Johansson <icepic

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-02-16 Thread Bryan Banister
ring, last acting [96,53,70] Some have common OSDs but some OSDs only listed once. Should I try just marking OSDs with stuck requests down to see if that will re-assert them? Thanks!! -Bryan From: David Turner [mailto:drakonst...@gmail.com] Sent: Friday, February 16, 2018 2:51 PM To: Bryan Banist
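
The 'mark it down so it re-asserts' idea maps to the following; a sketch only, with the OSD id taken from the acting set quoted above:

  # mark one OSD down; a healthy daemon will notice, re-assert itself as up,
  # and re-peer its PGs in the process
  ceph osd down 96
  # confirm the stuck/slow requests cleared afterwards
  ceph health detail | grep -i 'slow\|stuck'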

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-02-16 Thread Bryan Banister
t;Started", "enter_time": "2018-02-13 14:33:17.491148" } ], Sorry for all the hand holding, but how do I determine if I need to set an OSD as ‘down’ to fix the issues, and how does it go about re-asserting itself?

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-02-16 Thread Bryan Banister
e, but I would suggest limiting the number of operations going on at the same time. Bryan From: Bryan Banister <bbanis...@jumptrading.com> Date: Tuesday, February 13, 2018 at 1:16 PM To: Bryan Stillwell <bstillw...@godaddy.com
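
The advice about limiting how much happens at once usually comes down to the recovery and backfill throttles; a hedged example with conservative values:

  # throttle backfill/recovery on all OSDs (runtime only, not persisted)
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  # to persist, set the same values in ceph.conf under [osd]:
  #   osd max backfills = 1
  #   osd recovery max active = 1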

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-02-13 Thread Bryan Banister
] Sent: Tuesday, February 13, 2018 12:43 PM To: Bryan Banister <bbanis...@jumptrading.com>; Janne Johansson <icepic...@gmail.com> Cc: Ceph Users <ceph-users@lists.ceph.com> Subject: Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2 No

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-02-12 Thread Bryan Banister
[mailto:icepic...@gmail.com] Sent: Wednesday, January 31, 2018 9:34 AM To: Bryan Banister <bbanis...@jumptrading.com> Cc: Ceph Users <ceph-users@lists.ceph.com> Subject: Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2 Note: External Email 2018-01-31

Re: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-01-31 Thread Bryan Banister
m and pgp_num first and then see how it looks? Thanks, -Bryan From: Janne Johansson [mailto:icepic...@gmail.com] Sent: Wednesday, January 31, 2018 7:53 AM To: Bryan Banister <bbanis...@jumptrading.com> Cc: Ceph Users <ceph-users@lists.ceph.com> Subject: Re: [ceph-users] Help rebalan
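
Raising pg_num and then pgp_num, as discussed above, is done per pool; a minimal sketch in which the pool name and target count are placeholders:

  # create the new PGs first...
  ceph osd pool set default.rgw.buckets.data pg_num 512
  # ...then let placement follow once they exist
  ceph osd pool set default.rgw.buckets.data pgp_num 512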

Re: [ceph-users] Help rebalancing OSD usage, Luminous 12.2.2

2018-01-30 Thread Bryan Banister
this nearly 2yr old thread would still apply? Thanks again, -Bryan From: Bryan Banister Sent: Tuesday, January 30, 2018 10:26 AM To: Bryan Banister <bbanis...@jumptrading.com>; Ceph Users <ceph-users@lists.ceph.com> Subject: RE: Help rebalancing OSD usage, Luminous 12.2.2 Sorry, obvi

Re: [ceph-users] Help rebalancing OSD usage, Luminous 12.2.2

2018-01-30 Thread Bryan Banister
Sorry, obviously should have been Luminous 12.2.2, -B From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bryan Banister Sent: Tuesday, January 30, 2018 10:24 AM To: Ceph Users <ceph-users@lists.ceph.com> Subject: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

[ceph-users] Help rebalancing OSD usage, Luminus 1.2.2

2018-01-30 Thread Bryan Banister
Hi all, We are still very new to running a Ceph cluster and have run an RGW cluster for a while now (6-ish mo); it mainly holds large DB backups (write once, read once, delete after N days). The system is now warning us about an OSD that is near_full, and so we went to look at the usage across
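
A hedged sketch of the usual first steps when one OSD runs near_full while the rest are light, using only stock Luminous tooling:

  # per-OSD utilisation and variance
  ceph osd df tree
  # dry-run what reweight-by-utilization would change
  ceph osd test-reweight-by-utilization 110
  # apply it (110 = only touch OSDs more than 10% over the mean utilisation)
  ceph osd reweight-by-utilization 110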

[ceph-users] Simple RGW Lifecycle processing questions (luminous 12.2.2)

2017-12-19 Thread Bryan Banister
Hi all, How often does the "lc process" run on RGW buckets in a cluster? Also, is it configurable per bucket or anything? Tried searching the man pages and ceph docs with no luck, so any help is appreciated! Thanks! -Bryan Note: This email is for the
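
On the lifecycle questions: the processing window is a global RGW setting rather than per bucket; a sketch of the relevant commands and options (option names as of Luminous, worth double-checking against your version):

  # show per-bucket lifecycle state and trigger a processing pass by hand
  radosgw-admin lc list
  radosgw-admin lc process
  # in ceph.conf on the RGW hosts, the daily processing window is:
  #   rgw lifecycle work time = 00:00-06:00
  # rgw_lc_debug_interval can shrink "days" to seconds, for testing only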

Re: [ceph-users] Any way to get around selinux-policy-base dependency

2017-12-06 Thread Bryan Banister
Thanks Ken, that's understandable, -Bryan -Original Message- From: Ken Dreyer [mailto:kdre...@redhat.com] Sent: Wednesday, December 06, 2017 12:03 PM To: Bryan Banister <bbanis...@jumptrading.com> Cc: Ceph Users <ceph-users@lists.ceph.com>; Rafael Suarez <rsua...@jumptradi

[ceph-users] Any way to get around selinux-policy-base dependency

2017-12-04 Thread Bryan Banister
Hi all, I would like to upgrade to the latest Luminous release but found that it requires the absolute latest selinux-policy-base. We aren't using selinux, so was wondering if there is a way around this dependency requirement? [carf-ceph-osd15][WARNIN] Error: Package:
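
One blunt workaround that gets mentioned for this situation, offered here only as a sketch and with the usual caveat that overriding RPM dependencies is at your own risk:

  # preferred: pull the newer selinux-policy packages from the distro updates repo
  yum update selinux-policy selinux-policy-targeted
  # last resort: install the ceph packages without dependency checks
  rpm -Uvh --nodeps ceph-*.rpm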

[ceph-users] Help with full osd and RGW not responsive

2017-10-17 Thread Bryan Banister
Hi all, Still a real novice here and we didn't set up our initial RGW cluster very well. We have 134 osds and set up our RGW pool with only 64 PGs, thus not all of our OSDs got data and now we have one that is 95% full. This apparently has put the cluster into a HEALTH_ERR condition:
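
A hedged sketch of the usual emergency steps for a 95%-full OSD on Luminous; the ratio, OSD id, and pool name below are placeholders:

  # buy temporary headroom (the default full ratio is 0.95)
  ceph osd set-full-ratio 0.97
  # push data off the full OSD
  ceph osd reweight 42 0.90
  # longer term: raise pg_num/pgp_num on the undersized RGW data pool
  ceph osd pool set default.rgw.buckets.data pg_num 256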

Re: [ceph-users] Ceph mgr dashboard, no socket could be created

2017-09-22 Thread Bryan Banister
- From: John Spray [mailto:jsp...@redhat.com] Sent: Friday, September 22, 2017 2:32 AM To: Bryan Banister <bbanis...@jumptrading.com> Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph mgr dashboard, no socket could be created Note: External

[ceph-users] Ceph mgr dashboard, no socket could be created

2017-09-21 Thread Bryan Banister
I'm not sure what happened but the dashboard module can no longer startup now: 2017-09-21 09:28:34.646369 7fffef2e6700 -1 mgr got_mgr_map mgrmap module list changed to (dashboard), respawn 2017-09-21 09:28:34.646372 7fffef2e6700 1 mgr respawn e: '/usr/bin/ceph-mgr' 2017-09-21 09:28:34.646374
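
A "no socket could be created" error from the Luminous dashboard module usually points at the address/port it is trying to bind; a hedged sketch of pinning them explicitly and bouncing the module (address and port are placeholders):

  ceph config-key set mgr/dashboard/server_addr 0.0.0.0
  ceph config-key set mgr/dashboard/server_port 7000
  # restart the module so it picks up the new settings
  ceph mgr module disable dashboard
  ceph mgr module enable dashboard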

Re: [ceph-users] Possible to change the location of run_dir?

2017-09-20 Thread Bryan Banister
is trying to run)? Thanks, -Bryan From: David Turner [mailto:drakonst...@gmail.com] Sent: Wednesday, September 20, 2017 1:34 PM To: Bryan Banister <bbanis...@jumptrading.com>; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Possible to change the location of run_dir? Note: External

[ceph-users] Possible to change the location of run_dir?

2017-09-20 Thread Bryan Banister
We are running telegraf and would like to have the telegraf user read the admin sockets from ceph, which is required for the ceph telegraf plugin to apply the ceph related tags to the data. The ceph admin sockets are by default stored in /var/run/ceph, but this is recreated at boot time, so
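
Two hedged ways to approach this, depending on whether the goal is to move the sockets or just make them readable by the telegraf user; the paths and the "telegraf" group below are assumptions:

  # option 1: point the admin sockets somewhere persistent via ceph.conf
  #   [global]
  #   admin socket = /opt/ceph/run/$cluster-$name.asok
  # option 2: keep /var/run/ceph but recreate it at boot with group access,
  # via a systemd tmpfiles.d entry:
  echo 'd /var/run/ceph 0770 ceph telegraf -' > /etc/tmpfiles.d/ceph-telegraf.conf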

Re: [ceph-users] Ceph release cadence

2017-09-06 Thread Bryan Banister
Very new to Ceph but a long-time sys admin who is jaded/opinionated. My 2 cents: 1) This sounds like a perfect thing to put in a poll and ask/beg people to vote. Hopefully that will get you more of a response from a larger number of users. 2) Given that the value of the odd

Re: [ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Bryan Banister
what would do this? I'll be updating the version to 12.2.0 shortly, -Bryan -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bryan Banister Sent: Wednesday, August 30, 2017 3:42 PM To: Yehuda Sadeh-Weinraub <yeh...@redhat.com> Cc: ceph

Re: [ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Bryan Banister
We are not sending a HUP signal that we know about. We were not modifying our configuration. However all user accounts in the RGW were lost! -Bryan -Original Message- From: Yehuda Sadeh-Weinraub [mailto:yeh...@redhat.com] Sent: Wednesday, August 30, 2017 3:30 PM To: Bryan Banister

Re: [ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Bryan Banister
Looks like my RGW users also got deleted/lost?! [root@carf-ceph-osd01 ~]# radosgw-admin user list [] Yikes!! Any thoughts? -Bryan From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bryan Banister Sent: Wednesday, August 30, 2017 9:45 AM To: ceph-users@lists.ceph.com
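
Before concluding the users are really gone, it may be worth checking whether radosgw-admin is simply looking at a different realm/zone than the gateways; a sketch:

  # which realm/zonegroup/zone is the admin command operating on?
  radosgw-admin realm list
  radosgw-admin zonegroup list
  radosgw-admin zone get
  # user records live in RGW metadata; list them directly
  radosgw-admin metadata list user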

[ceph-users] Repeated failures in RGW in Ceph 12.1.4

2017-08-30 Thread Bryan Banister
Not sure what's happening, but we started to put a decent load on the RGWs we have set up and we were seeing failures with the following kind of fingerprint: 2017-08-29 17:06:22.072361 7ffdc501a700 1 rgw realm reloader: Frontends paused 2017-08-29 17:06:22.072359 7fffacbe9700 1 civetweb:

Re: [ceph-users] Help with down OSD with Ceph 12.1.4 on Bluestore back

2017-08-29 Thread Bryan Banister
Found some bad stuff in the messages file about SCSI block device fails... I think I found my smoking gun... -B From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bryan Banister Sent: Tuesday, August 29, 2017 5:02 PM To: ceph-users@lists.ceph.com Subject: [ceph-users] Help
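
For completeness, a hedged sketch of confirming a failing drive like the one found above; the device name is a placeholder and smartmontools is assumed to be installed:

  # kernel-level I/O errors for the block device behind the OSD
  dmesg | grep -i 'i/o error\|medium error' | tail -50
  # SMART health summary for the suspect disk
  smartctl -a /dev/sdk | grep -iE 'health|reallocated|pending'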

[ceph-users] Help with down OSD with Ceph 12.1.4 on Bluestore back

2017-08-29 Thread Bryan Banister
Hi all, Not sure what to do with this down OSD: -2> 2017-08-29 16:55:34.588339 72d58700 1 -- 7.128.13.57:6979/18818 --> 7.128.13.55:0/52877 -- osd_ping(ping_reply e935 stamp 2017-08-29 16:55:34.587991) v4 -- 0x67397000 con 0 -1> 2017-08-29 16:55:34.588351 72557700 1 --

Re: [ceph-users] Anybody gotten boto3 and ceph RGW working?

2017-08-23 Thread Bryan Banister
That was the problem, thanks again, -Bryan From: Bryan Banister Sent: Wednesday, August 23, 2017 9:06 AM To: Bryan Banister <bbanis...@jumptrading.com>; Abhishek Lekshmanan <abhis...@suse.com>; ceph-users@lists.ceph.com Subject: RE: [ceph-users] Anybody gotten boto3 and ceph RGW wor

Re: [ceph-users] Anybody gotten boto3 and ceph RGW working?

2017-08-23 Thread Bryan Banister
Looks like I found the problem: https://github.com/snowflakedb/snowflake-connector-python/issues/1 I’ll try the fixed version of botocore 1.4.87+, -Bryan From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bryan Banister Sent: Wednesday, August 23, 2017 9:01 AM

Re: [ceph-users] Anybody gotten boto3 and ceph RGW working?

2017-08-23 Thread Bryan Banister
sl=False ) # config=boto3.session.Config(signature_version='s3v2') for bucket in s3.list_buckets(): for key in bucket.objects.all(): print(key.key) Thanks in advance for any help!! -Bryan -Original Message- From: Abhishek Lekshmanan [mailto:a

[ceph-users] Anybody gotten boto3 and ceph RGW working?

2017-08-22 Thread Bryan Banister
Hello, I have the boto python API working with our ceph cluster but haven't figured out a way to get boto3 to communicate with our RGWs yet. Anybody have a simple example? Cheers for any help! -Bryan Note: This email is for the confidential use of the named

Re: [ceph-users] Help with file system with failed mds daemon

2017-08-22 Thread Bryan Banister
...@redhat.com] Sent: Tuesday, August 22, 2017 2:56 PM To: Bryan Banister <bbanis...@jumptrading.com> Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Help with file system with failed mds daemon Note: External Email - On Tue, Aug 22, 201

Re: [ceph-users] Help with file system with failed mds daemon

2017-08-22 Thread Bryan Banister
mds_namespace=carf_ceph02,name=cephfs.k8test,secretfile=k8test.secret mount error 22 = Invalid argument Thanks, -Bryan -Original Message- From: John Spray [mailto:jsp...@redhat.com] Sent: Tuesday, August 22, 2017 11:18 AM To: Bryan Banister <bbanis...@jumptrading.com> Cc: ceph
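
mount error 22 with the mds_namespace option often means the kernel client is too old to understand it; a hedged alternative is the FUSE client, which takes the filesystem name as a config option (mount point is a placeholder, keyring assumed to be in /etc/ceph):

  ceph-fuse -n client.cephfs.k8test --client_mds_namespace=carf_ceph02 /mnt/k8test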

[ceph-users] Help with file system with failed mds daemon

2017-08-22 Thread Bryan Banister
Hi all, I'm still new to ceph and cephfs. Trying out the multi-fs configuration on a Luminous test cluster. I shut down the cluster to do an upgrade, and when I brought the cluster back up I now have a warning that one of the file systems has a failed mds daemon: 2017-08-21 17:00:00.81
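
A sketch of the usual first checks for a "filesystem has a failed mds daemon" warning after a restart; the systemd unit name is an assumption (it is normally the MDS id):

  # which filesystems exist and what state their ranks are in
  ceph fs ls
  ceph fs status
  # which MDS daemons are up or standby
  ceph mds stat
  # if the daemon simply never came back after the reboot, restart it
  systemctl restart ceph-mds@$(hostname -s)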

Re: [ceph-users] Any experience with multiple cephfs instances in one ceph cluster? How experimental is this?

2017-08-21 Thread Bryan Banister
...@redhat.com] Sent: Monday, August 21, 2017 8:48 AM To: Bryan Banister <bbanis...@jumptrading.com> Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Any experience with multiple cephfs instances in one ceph cluster? How experimental is this? Note: External

[ceph-users] Any experience with multiple cephfs instances in one ceph cluster? How experimental is this?

2017-08-21 Thread Bryan Banister
Hi all, I'm very new to ceph and cephfs, so I'm just starting to play around with the Luminous release. There are some very concerning warnings about deploying multiple cephfs instances in the same cluster: "There are no known bugs, but any failures which do result from having multiple active
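
For reference, multiple filesystems have to be enabled explicitly, which is part of why the docs flag the feature; a minimal sketch (filesystem name taken from the thread, pool names are placeholders):

  # acknowledge the experimental status and allow more than one filesystem
  ceph fs flag set enable_multiple true --yes-i-really-mean-it
  # create the second filesystem on its own pools
  ceph fs new carf_ceph02 carf_ceph02_metadata carf_ceph02_data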