stale state. You should
then be able to force re-create it.
This worked for me with a replicated pool, never tried this with EC.
Afterwards you can re-create these OSDs again.
Wido
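For what it's worth, a sketch of what that force re-create could look like on Jewel, using the PG and OSD ids from this thread. Both commands are destructive (any data still in the PG is written off), so treat this as an illustration of the idea, not a recommendation:

```shell
# Last-resort sketch (Jewel-era syntax); data in the PG is abandoned.
# Declare the problem OSD's copy of the data permanently gone:
ceph osd lost 307 --yes-i-really-mean-it
# Ask the monitors to re-create the stuck PG as empty:
ceph pg force_create_pg 1.323
# Watch it go from 'creating' to 'active+clean':
ceph -s
```

These only make sense against a live cluster, and only once you have accepted the data loss.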
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of
george.vasilaka...@stfc.ac.uk [george.vasilaka...@stfc.ac.uk]
Sent: 22 February 2017 14:35
To: w...@42on.com; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] PG stuck peering after host reboot
So what I see there is this for osd.307:
PGs are reporting being undersized and having ITEM_NONE in their
acting sets as well.
> From: Wido den Hollander [w...@42on.com]
> Sent: 22 February 2017 12:18
> To: Vasilakakos, George (STFC,RAL,SC); ceph-users@lists.ceph.com
> Subject: RE: [ceph-users] PG stuck peering after host reboot
> On 21 February 2017 at 15:35, george.vasilaka...@stfc.ac.uk wrote:
>
>
> I have noticed something odd with the ceph-objectstore-tool command:
>
> It always reports PG X not found even on healthy OSDs/PGs. The 'list' o
From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of
george.vasilaka...@stfc.ac.uk [george.vasilaka...@stfc.ac.uk]
Sent: 21 February 2017 10:17
To: w...@42on.com; ceph-users@lists.ceph.com; bhubb...@redhat.com
Subject: Re: [ceph-users] PG stuck peering after host reboot
> Can you for the sake of redundancy post your sequence of commands you
> executed and their output?
[root@ceph-sn852 ~]# systemctl stop ceph-osd@307
[root@ceph-sn852 ~]# ceph-objectstore-tool --data-path
/var/lib/ceph/osd/ceph-307 --op info --pgid 1.323
PG '1.323' not found
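For reference, the shape of that check, with two caveats: ceph-objectstore-tool only works against a stopped OSD, and on EC pools the on-disk PG ids carry a shard suffix (e.g. 1.323s3 rather than 1.323), which could be why a bare pgid comes back as not found. The shard number below is illustrative; `--op list-pgs` shows the real spelling:

```shell
# Sketch: inspect a PG on a stopped OSD (ids and paths from this thread).
systemctl stop ceph-osd@307
# List the PG ids as the objectstore actually names them; on EC pools
# they include the shard, e.g. 1.323s3:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 --op list-pgs
# Query using the exact id from that listing (shard number illustrative):
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-307 \
  --op info --pgid 1.323s3
systemctl start ceph-osd@307
```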
> On 20 February 2017 at 17:52, george.vasilaka...@stfc.ac.uk wrote:
Hi Wido,
Just to make sure I have everything straight,
> If the PG still doesn't recover do the same on osd.307 as I think that 'ceph
> pg X query' still hangs?
> The info from ceph-objectstore-tool might shed some more light on this PG.
You mean run the objectstore command on 307, not remove
>3. ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-595 --op info
>--pgid 1.323
>
>What does osd.595 think about that PG?
>
>You could even try 'rm-past-intervals' with the object-store tool, but that
>might be a bit dangerous. Wouldn't do that immediately.
>
>Wido
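Should anyone go down that road despite the warning, a hedged sketch of the sequence on Jewel follows. The export step is an added precaution, not something suggested above:

```shell
# DANGEROUS sketch, per the caveat above; keep a copy of the PG first.
systemctl stop ceph-osd@595
# Added precaution: export the PG before touching its metadata:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-595 \
  --op export --pgid 1.323 --file /root/pg-1.323.export
# Jewel-era op: drop the recorded past intervals so peering recomputes them:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-595 \
  --op rm-past-intervals --pgid 1.323
systemctl start ceph-osd@595
```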
>
>>
>> Best regards,
>>
>> George
>>
>> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of
>> george.vasilaka...@stfc.ac.uk [george.vasilaka...@stfc.ac.uk]
> From: ceph-users [ceph-users-boun...@lists.ceph.com] on behalf of
> george.vasilaka...@stfc.ac.uk [george.vasilaka...@stfc.ac.uk]
> Sent: 14 February 2017 10:27
> To: bhubb...@redhat.com; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] PG stuck peering after host reboot
Hi Brad,
I'll be doing so later in the day.
Thanks,
George
From: Brad Hubbard [bhubb...@redhat.com]
Sent: 13 February 2017 22:03
To: Vasilakakos, George (STFC,RAL,SC); Ceph Users
Subject: Re: [ceph-users] PG stuck peering after host reboot
I'd suggest
From: george.vasilaka...@stfc.ac.uk [george.vasilaka...@stfc.ac.uk]
Sent: 08 February 2017 18:32
To: gfar...@redhat.com
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] PG stuck peering after host reboot
Hey Greg,
Thanks for your quick responses. I have to leave the office now but I'll look
deeper into it tomorrow to try and understand
To: Vasilakakos, George (STFC,RAL,SC)
Cc: Ceph Users
Subject: Re: [ceph-users] PG stuck peering after host reboot
On Wed, Feb 8, 2017 at 10:25 AM, <george.vasilaka...@stfc.ac.uk> wrote:
> Hi Greg,
>
>> Yes, "bad crc" indicates that the checksums on an incoming message did
>> not match what was provided — ie, the message got corrupted. You
>> shouldn't try and fix that by playing around with the peering settings
>> as it's not a peering bug.
>
> Unless there's a bug in the messaging layer
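As a toy illustration of the checksum logic described above (plain POSIX tools, nothing Ceph-specific): the sender checksums the payload, the receiver recomputes it over what actually arrived, and corruption in transit shows up as a mismatch, which is what a "bad crc" log line reports.

```shell
# Toy model of a "bad crc": checksum on send, recompute on receive.
payload="some message bytes"
sent_crc=$(printf '%s' "$payload" | cksum | cut -d' ' -f1)
# One flipped character stands in for corruption on the wire:
received="some messagE bytes"
recv_crc=$(printf '%s' "$received" | cksum | cut -d' ' -f1)
if [ "$sent_crc" != "$recv_crc" ]; then
  echo "bad crc: got $recv_crc, expected $sent_crc"
fi
```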
On Wed, Feb 8, 2017 at 8:17 AM, <george.vasilaka...@stfc.ac.uk> wrote:
> Hi Ceph folks,
>
> I have a cluster running Jewel 10.2.5 using a mix of EC and replicated pools.
>
> After rebooting a host last night, one PG refuses to complete peering
>
> pg 1.323 is stuck inactive for 73352.498493, current state peering, last
> acting [595,1391,240,127,937,362,267,320,7,634,716]
Hello,

I have had this case before; I applied the parameter
(osd_find_best_info_ignore_history_les) to all the OSDs that had reported
blocked queries.

--
Regards,
CEO FEELB | Corentin BONNETON
cont...@feelb.io
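For anyone considering that workaround, one way it can be applied (OSD id illustrative). Note the flag makes the OSD ignore last_epoch_started history when choosing the authoritative copy, so it can promote stale data: set it only on the affected OSDs and revert it once the PG peers.

```shell
# Sketch: apply the flag at runtime to an affected OSD, then revert.
ceph tell osd.595 injectargs '--osd_find_best_info_ignore_history_les=true'
# If injectargs reports the change is unobserved, set it in ceph.conf
# under [osd.595] and restart the daemon instead:
#   osd find best info ignore history les = true
systemctl restart ceph-osd@595
# After the PG peers, set it back to false the same way.
```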
> On 8 Feb 2017 at 17:17, george.vasilaka...@stfc.ac.uk wrote:
>
Hi Ceph folks,
I have a cluster running Jewel 10.2.5 using a mix of EC and replicated pools.
After rebooting a host last night, one PG refuses to complete peering
pg 1.323 is stuck inactive for 73352.498493, current state peering, last acting
[595,1391,240,127,937,362,267,320,7,634,716]
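For readers hitting the same symptom, the usual first-pass diagnostics look something like this. Note that `pg query` goes to the acting primary and can itself hang when peering is wedged, which is part of the symptom in this thread:

```shell
# First-pass diagnostics for a PG stuck peering (id from this thread).
ceph health detail | grep 1.323   # why the cluster thinks it is stuck
ceph pg map 1.323                 # up/acting sets straight from the monitors
ceph pg 1.323 query               # full peering state; may hang here
ceph osd tree                     # confirm the acting OSDs are up and in
```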