Hi Cory,

Thanks for identifying the bug and creating a PR to fix it. We'll hold a
retrospective on this issue to catch and avoid such regressions in the
future. In the meantime, we will go ahead with a minimal 16.2.9 release
containing the fix.

Thanks,
Neha


On Tue, May 17, 2022 at 5:03 AM Cory Snyder <csny...@iland.com> wrote:
>
> Yep, sorry about that. Thanks for the correction, Dan!
>
> On Tue, May 17, 2022 at 7:44 AM Dan van der Ster <dvand...@gmail.com> wrote:
>
> > On Tue, May 17, 2022 at 1:14 PM Cory Snyder <csny...@iland.com> wrote:
> > >
> > > Hi all,
> > >
> > > Unfortunately, we experienced some issues with the upgrade to 16.2.8
> > > on one of our larger clusters. Within a few hours of the upgrade, all
> > > 5 of our managers had become unavailable. We found that they were all
> > > deadlocked due to (what appears to be) a regression with GIL and mutex
> > > handling. See https://tracker.ceph.com/issues/39264 and
> > > https://github.com/ceph/ceph/pull/38677 for context on previous
> > > manifestations of the issue.
> > >
> > > I discovered some mistakes within a recent Pacific backport that seem
> > > to be responsible. Here is the tracker for the regression:
> > > https://tracker.ceph.com/issues/55687. Here is an open PR that should
> > > resolve the problem: https://github.com/ceph/ceph/pull/38677.
> >
> > I guess you mean https://github.com/ceph/ceph/pull/46302 ?
> >
> > Thanks
> >
> > .. dan
> >
> > >
> > > Note that this is a sort of race condition, and the issue tends to
> > > manifest itself more frequently in larger clusters. Enabling certain
> > > modules may also make it more likely to occur. On our cluster, MGRs
> > > are consistently deadlocking within about an hour.
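> > >
> > > To make the failure mode concrete, here is a minimal sketch of the
> > > lock-ordering inversion involved, written in pure Python with ordinary
> > > locks standing in for the GIL and the mgr's C++ mutex (the names here
> > > are illustrative, not the real identifiers):
> > >
> > >     import threading
> > >
> > >     gil = threading.Lock()    # stands in for the Python GIL
> > >     mutex = threading.Lock()  # stands in for the mgr's C++ mutex
> > >
> > >     def finisher_thread():
> > >         with gil:             # holds the "GIL" ...
> > >             with mutex:       # ... then blocks waiting on the mutex
> > >                 pass
> > >
> > >     def notify_thread():
> > >         with mutex:           # holds the mutex ...
> > >             with gil:         # ... then blocks waiting on the "GIL"
> > >                 pass
> > >
> > >     # If each thread takes its first lock before either takes its
> > >     # second, neither can ever proceed:
> > >     # threading.Thread(target=finisher_thread).start()
> > >     # threading.Thread(target=notify_thread).start()
> > >
> > > The fix is to make sure the GIL is not held while waiting on the
> > > mutex, which is what the GIL handling in the PRs above is meant to
> > > guarantee.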
> > >
> > > Hopefully this is useful to others who are considering an upgrade!
> > >
> > > Thanks,
> > >
> > > Cory Snyder
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Mon, May 16, 2022 at 3:46 PM David Galloway <dgall...@redhat.com> wrote:
> > > >
> > > > We're happy to announce the 8th backport release in the Pacific series.
> > > > We recommend that users update to this release. For detailed release
> > > > notes with links and a changelog, please refer to the official blog
> > > > entry at https://ceph.io/en/news/blog/2022/v16-2-8-pacific-released
> > > >
> > > > Notable Changes
> > > > ---------------
> > > >
> > > > * MON/MGR: Pools can now be created with the `--bulk` flag. Any pools
> > > > created with `--bulk` will use a profile of the `pg_autoscaler` that
> > > > provides more performance from the start. However, any pools created
> > > > without the `--bulk` flag will retain the old behavior by default.
> > > > For more details, see:
> > > > https://docs.ceph.com/en/latest/rados/operations/placement-groups/
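> > > >
> > > > As an example (the pool name is just a placeholder), a bulk pool can
> > > > be created with:
> > > >
> > > >     ceph osd pool create mypool --bulk
> > > >
> > > > and an existing pool can be switched over with:
> > > >
> > > >     ceph osd pool set mypool bulk true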
> > > >
> > > > * MGR: The pg_autoscaler can now be turned `on` and `off` globally with
> > > > the `noautoscale` flag. By default this flag is unset and the default
> > > > pg_autoscale mode remains the same. For more details, see:
> > > > https://docs.ceph.com/en/latest/rados/operations/placement-groups/
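> > > >
> > > > For example:
> > > >
> > > >     ceph osd pool set noautoscale     # turn autoscaling off globally
> > > >     ceph osd pool unset noautoscale   # turn it back on
> > > >     ceph osd pool get noautoscale     # report the current state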
> > > >
> > > > * A health warning will now be reported if the ``require-osd-release``
> > > > flag is not set to the appropriate release after a cluster upgrade.
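> > > >
> > > > For example, after completing an upgrade to Pacific:
> > > >
> > > >     ceph osd require-osd-release pacific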
> > > >
> > > > * CephFS: Upgrading Ceph Metadata Servers when using multiple active
> > > > MDSs requires ensuring that no active rank other than rank 0 has
> > > > pending stray entries that are directories. See
> > > > https://docs.ceph.com/en/latest/releases/pacific/#upgrading-from-octopus-or-nautilus.
> > > >
> > > > Getting Ceph
> > > > ------------
> > > > * Git at git://github.com/ceph/ceph.git
> > > > * Tarball at https://download.ceph.com/tarballs/ceph-16.2.8.tar.gz
> > > > * Containers at https://quay.io/repository/ceph/ceph
> > > > * For packages, see
> > > > https://docs.ceph.com/docs/master/install/get-packages/
> > > > * Release git sha1: 209e51b856505df4f2f16e54c0d7a9e070973185
> > > >

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
