[ceph-users] Re: ceph osd continously fails

2021-08-12 Thread Wesley Dillingham
Can you send the results of "ceph daemon osd.0 status" and maybe do that for a couple of osd ids? You may need to target ones which are currently running. Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn On Wed, Aug 11, 2021 at 9:51
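
A minimal sketch of gathering that output for several OSDs on one host (the osd ids here are illustrative; the daemon must be running for its admin socket to respond):

    for id in 0 1 2; do
        echo "== osd.$id =="
        ceph daemon osd.$id status   # queries the OSD admin socket on the local host
    done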

[ceph-users] Re: bug ceph auth

2021-07-14 Thread Wesley Dillingham
Do you get the same error if you just do "ceph auth get client.bootstrap-osd", i.e. does client.bootstrap-osd exist as a user? Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn <http://www.linkedin.com/in/wesleydillingham> On Wed, Jul 14, 2021 at 1:56 PM Wesley D
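
A quick sketch of the check being suggested (the output path is illustrative):

    # confirm the bootstrap-osd user exists at all
    ceph auth get client.bootstrap-osd

    # if it does, write its keyring out the way the failing command attempted
    ceph auth get client.bootstrap-osd -o /var/lib/ceph/bootstrap-osd/ceph.keyring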

[ceph-users] Re: bug ceph auth

2021-07-14 Thread Wesley Dillingham
Is /var/lib/ceph/bootstrap-osd/ in existence and writeable? Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn On Wed, Jul 14, 2021 at 8:35 AM Marc wrote: > > > > [@t01 ~]# ceph auth get client.bootstrap-osd -o >
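
A minimal sketch of verifying the directory before retrying the keyring export:

    ls -ld /var/lib/ceph/bootstrap-osd/    # does it exist, and who owns it?
    test -w /var/lib/ceph/bootstrap-osd/ && echo writable || echo "not writable as $(whoami)"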

[ceph-users] Re: Monitors not starting, getting "e3 handle_auth_request failed to assign global_id"

2020-12-14 Thread Wesley Dillingham
/ packet inspection security technology being run on the servers. Perhaps you've made similar updates. Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn <http://www.linkedin.com/in/wesleydillingham> On Tue, Dec 8, 2020 at 7:46 PM Wesley Dillingham wrote: > We have

[ceph-users] Re: Monitors not starting, getting "e3 handle_auth_request failed to assign global_id"

2020-12-08 Thread Wesley Dillingham
We have also had this issue multiple times on 14.2.11. On Tue, Dec 8, 2020, 5:11 PM wrote: > I have the same issue. My cluster is running 14.2.11. What is your ceph version?

[ceph-users] Running Mons on msgrv2/3300 only.

2020-12-08 Thread Wesley Dillingham
We rebuilt all of our mons in one cluster such that they bind only to port 3300 with msgrv2. Prior to this we were binding to both 6789 and 3300. All of our server and client components are sufficiently new (14.2.x) and we haven’t observed any disruption, but I am inquiring if this may be
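
For context, a sketch of the sort of v2-only setup being described; the addresses are illustrative, and a dual-stack mon would advertise both a v2: and a v1: endpoint instead:

    # ceph.conf on clients and servers -- v2 (port 3300) only
    mon_host = [v2:10.0.0.1:3300],[v2:10.0.0.2:3300],[v2:10.0.0.3:3300]

    # verify what the monitors are actually advertising
    ceph mon dump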

[ceph-users] Mon's falling out of quorum, require rebuilding. Rebuilt with only V2 address.

2020-11-19 Thread Wesley Dillingham
We have had multiple clusters experiencing the following situation over the past few months on both 14.2.6 and 14.2.11. In a few instances it seemed random; in a second situation we had a temporary networking disruption; in a third situation we accidentally made some osd changes which caused

[ceph-users] Re: radosgw beast access logs

2020-08-19 Thread Wesley Dillingham
We would very much appreciate having this backported to nautilus. Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn On Wed, Aug 19, 2020 at 9:02 AM Casey Bodley wrote: > On Tue, Aug 18, 2020 at 1:33 PM Graham Allan wrote: > > > >

[ceph-users] Meaning of the "tag" key in bucket metadata

2020-08-12 Thread Wesley Dillingham
Recently we encountered an instance of bucket corruption of two varieties. One in which the bucket metadata was missing and another in which the bucket.instance metadata was missing for various buckets. We have seemingly been successful in restoring the metadata by reconstructing it from the
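
For reference, a hedged sketch of the kind of metadata round-trip involved; the bucket name, instance id placeholder, and file names are illustrative, and the reconstruction of the missing JSON itself is the hard part and is not shown:

    # dump whichever of the two records still exists
    radosgw-admin metadata get bucket:mybucket > bucket.json
    radosgw-admin metadata get bucket.instance:mybucket:<instance-id> > instance.json   # <instance-id> is a placeholder

    # after reconstructing the missing record, write it back
    radosgw-admin metadata put bucket:mybucket < bucket.json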

[ceph-users] Apparent bucket corruption error: get_bucket_instance_from_oid failed

2020-08-04 Thread Wesley Dillingham
Long-running cluster, currently running 14.2.6. I have a certain user whose buckets have become corrupted, in that the following commands: "radosgw-admin bucket check --bucket=" and "radosgw-admin bucket list --bucket=" return with the following: ERROR: could not init bucket: (2) No such file or
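
A sketch of how one might check which metadata records are still present for an affected bucket (my assumption, not from the thread; bucket name is illustrative):

    # list the bucket entrypoints and instances rgw knows about
    radosgw-admin metadata list bucket | grep mybucket
    radosgw-admin metadata list bucket.instance | grep mybucket

    # stats will also fail if the instance record is missing
    radosgw-admin bucket stats --bucket=mybucket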

[ceph-users] Re: Q release name

2020-03-23 Thread Wesley Dillingham
Checking the word "Octopus" in different languages, the only one starting with a "Q" is in Maltese: "Qarnit". For good measure, here is a Maltese qarnit stew recipe: http://littlerock.com.mt/food/maltese-traditional-recipe-stuffat-tal-qarnit-octopus-stew/ Respectfully, *Wes Dillingham*

[ceph-users] Re: Unable to increase PG numbers

2020-02-25 Thread Wesley Dillingham
I believe you are encountering https://tracker.ceph.com/issues/39570. You should do a "ceph versions" on a mon and ensure all your OSDs are nautilus; if so, set "ceph osd require-osd-release nautilus", then try to increase pg num. Upgrading to a more recent nautilus release is also probably a
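
A sketch of that sequence (the pool name and pg count are illustrative):

    ceph versions                            # confirm every OSD reports a nautilus version
    ceph osd require-osd-release nautilus    # only once all OSDs are nautilus
    ceph osd pool set mypool pg_num 256      # then the increase should take effect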

[ceph-users] Re: All pgs peering indefinetely

2020-02-04 Thread Wesley Dillingham
I would guess that you have something preventing osd-to-osd communication on ports 6800-7300 or osd-to-mon communication on ports 6789 and/or 3300. Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn On Tue, Feb 4, 2020 at 12:44 PM
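
A minimal sketch of checking reachability from an OSD host (hostnames are illustrative):

    nc -zv mon1 3300            # msgr v2 to a monitor
    nc -zv mon1 6789            # msgr v1 to a monitor
    nc -zv osd-host2 6800       # osd-to-osd traffic uses the 6800-7300 range
    ss -tlnp | grep ceph-osd    # confirm local OSDs are actually listening in that range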

[ceph-users] ceph-iscsi create RBDs on erasure coded data pools

2020-01-30 Thread Wesley Dillingham
Is it possible to create an EC backed RBD via ceph-iscsi tools (gwcli, rbd-target-api)? It appears that a pre-existing RBD created with the rbd command can be imported, but there is no means to directly create an EC backed RBD. The API seems to expect a single pool field in the body to work with.
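
For what it's worth, a sketch of pre-creating such an image with the plain rbd tooling (pool names, pg counts, and size are illustrative) before handing it to gwcli/rbd-target-api as an existing image:

    ceph osd pool create rbd-ec 64 64 erasure           # EC data pool
    ceph osd pool set rbd-ec allow_ec_overwrites true   # required for RBD on EC
    ceph osd pool create rbd-meta 64 64                 # replicated pool for image metadata
    rbd create rbd-meta/iscsi-img --size 1T --data-pool rbd-ec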

[ceph-users] Re: ceph-volume lvm filestore OSDs fail to start on reboot. Permission denied on journal partition

2020-01-23 Thread Wesley Dillingham
red activation for: 219-529ea347-b129-4b53-81cb-bb5f2d91f8ae Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn <http://www.linkedin.com/in/wesleydillingham> On Thu, Jan 23, 2020 at 4:31 AM Jan Fajerski wrote: > On Wed, Jan 22, 2020 at 12:00:28PM -0500, Wesley Dill

[ceph-users] ceph-volume lvm filestore OSDs fail to start on reboot. Permission denied on journal partition

2020-01-22 Thread Wesley Dillingham
After upgrading to Nautilus 14.2.6 from Luminous 12.2.12 we are seeing the following behavior on OSDs which were created with "ceph-volume lvm create --filestore --osd-id --data --journal " Upon restart of the server containing these OSDs they fail to start with the following error in the logs:
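
A hedged sketch of the usual workaround for that symptom, assuming the journal lives on a raw partition (the device node below is illustrative); note that a chown alone does not survive a reboot, so a udev rule or re-running ceph-volume activation is typically what makes it stick:

    ls -l /dev/disk/by-partuuid/    # find the journal partition the OSD points at
    chown ceph:ceph /dev/sdb2       # temporary fix so the OSD can open its journal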

[ceph-users] acting_primary is an osd with primary-affinity of 0, which seems wrong

2020-01-03 Thread Wesley Dillingham
In an exploration of trying to speed up the long tail of backfills resulting from marking a failing OSD out, I began looking at my PGs to see if I could tune some settings and noticed the following: Scenario: on a 12.2.12 cluster, I am alerted of an inconsistent PG and of SMART failures
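
A sketch of the checks involved (the osd id and pg id are illustrative):

    ceph osd primary-affinity osd.12 0    # ask CRUSH not to pick this OSD as primary
    ceph pg map 1.2f                      # up/acting sets; the first acting entry is the acting primary
    ceph pg 1.2f query | grep -i primary  # explicit up_primary / acting_primary fields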

[ceph-users] Re: iSCSI Gateway reboots and permanent loss

2019-12-05 Thread Wesley Dillingham
On Thu, Dec 5, 2019 at 4:14 PM Mike Christie wrote: > On 12/04/2019 02:34 PM, Wesley Dillingham wrote: > > I have never had a permanent loss of a gateway but I'm a believer in > > Murphy's law and want to have a plan. Glad to hear that there is a > > solution

[ceph-users] Re: iSCSI Gateway reboots and permanent loss

2019-12-04 Thread Wesley Dillingham
On 12/04/2019 08:26 AM, Gesiel Galvão Bernardes wrote: > > Hi, > > > > On Wed, Dec 4, 2019 at 00:31, Mike Christie > <mailto:mchri...@redhat.com>> wrote: > > > > On 12/03/2019 04:19 PM, Wesley Dillingham wrote: > > > Thanks. If

[ceph-users] Re: iSCSI Gateway reboots and permanent loss

2019-12-03 Thread Wesley Dillingham
l of a gateway via the "gwcli". I think the Ceph dashboard can > > do that as well. > > > > On Tue, Dec 3, 2019 at 1:59 PM Wesley Dillingham > wrote: > >> > >> We utilize 4 iSCSI gateways in a cluster and have noticed the following > during p

[ceph-users] iSCSI Gateway reboots and permanent loss

2019-12-03 Thread Wesley Dillingham
We utilize 4 iSCSI gateways in a cluster and have noticed the following during patching cycles when we sequentially reboot single iSCSI gateways: "gwcli" often hangs on the still-up iSCSI gateways but sometimes still functions and gives the message: "1 gateway is inaccessible - updates will be

[ceph-users] OSD's addrvec, not getting msgr v2 address, PGs stuck unknown or peering

2019-11-11 Thread Wesley Dillingham
Running 14.2.4 (but the same issue was observed on 14.2.2), we have a problem with, thankfully, a testing cluster, where all pgs are failing to peer and are stuck in peering, unknown, stale, etc. states. My working theory is that this is because the OSDs don't seem to be utilizing msgr v2, as "ceph osd find
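
A sketch of the commands useful for confirming that suspicion (the osd id is illustrative):

    ceph osd find 0 | grep -A3 addr    # addrs should list both a v2: and a v1: endpoint
    ceph mon dump                      # monitors should likewise advertise v2 on 3300
    ceph mon enable-msgr2              # if the mons were never switched over after the upgrade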

[ceph-users] Re: using non client.admin user for ceph-iscsi gateways

2019-09-06 Thread Wesley Dillingham
From: Jason Dillaman Sent: Friday, September 6, 2019 12:37 PM To: Wesley Dillingham Cc: ceph-users@ceph.io Subject: Re: [ceph-users] using non client.admin user for ceph-iscsi gateways On Fri, Sep 6, 2019 at 12:00

[ceph-users] using non client.admin user for ceph-iscsi gateways

2019-09-06 Thread Wesley Dillingham
The iscsi-gateway.cfg seemingly allows for an alternative cephx user other than client.admin to be used; however, the comments in the documentation say specifically to use client.admin. Other than having the cfg file point to the appropriate key/user with "gateway_keyring" and giving that
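
A hedged sketch of what that might look like; the user name, cap profile, and keyring path below are my assumptions, not something confirmed in the thread or the docs:

    # create a dedicated cephx user (caps shown are illustrative and possibly broader than needed)
    ceph auth get-or-create client.iscsi mon 'allow *' osd 'allow *' mgr 'allow *' \
        -o /etc/ceph/ceph.client.iscsi.keyring

    # /etc/ceph/iscsi-gateway.cfg
    gateway_keyring = ceph.client.iscsi.keyring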
