[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2021-01-07 Thread David Orman
https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-2021-e204b1c101 Ken - thanks for driving this forward. Everyone else involved - if you can test and give a +1 (assuming it works) that'd be great; otherwise it'll be the normal 14-day period before it's promoted to stable. I've already tested
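
For anyone wanting to leave karma on that Bodhi update, a minimal test sketch on an EPEL 8 host (assuming the binary package is named python3-cheroot and the update is still in epel-testing):

    # Pull the candidate build from the testing repo and confirm the version
    dnf --enablerepo=epel-testing upgrade python3-cheroot
    rpm -q python3-cheroot

Feedback and +1s can then be left directly on the Bodhi page linked above.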

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-23 Thread David Orman
Thank you for the update, this is excellent news. Hopefully we'll see the fixed package in the next point-release container build for Ceph; this has been a big stumbling block for our deployments and for countless others we've seen reporting it. We greatly appreciate your diligent work and urgency

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-23 Thread Ken Dreyer
On Tue, Dec 22, 2020 at 12:03 PM Ken Dreyer wrote: > There are a few more small cleanups I need to land in order to > reconcile the epel8 and master branches. The maintainers merged the cleanups. Here's the next PR to sync the remaining epel8 diff into master:

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-22 Thread Ken Dreyer
Thanks David! A couple of things have happened since the last update. The primary Fedora cheroot package maintainer updated cheroot from 8.5.0 to 8.5.1 in Rawhide. I've rebuilt this for el8 and put it into a new repository here: https://fedorapeople.org/~ktdreyer/bz1907005/ There are a few more
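
A minimal sketch of pointing a test host at that scratch repository (the repo file name and gpgcheck=0 are assumptions; if the directory carries no repodata, the RPM can instead be installed directly by URL with dnf):

    # /etc/yum.repos.d/ktdreyer-cheroot-test.repo (hypothetical file name)
    [ktdreyer-cheroot-test]
    name=cheroot 8.5.1 rebuilds for el8 (test only)
    baseurl=https://fedorapeople.org/~ktdreyer/bz1907005/
    enabled=1
    gpgcheck=0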

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-11 Thread David Orman
Hi Ken, This seems to have fixed that issue. It exposed another: https://tracker.ceph.com/issues/39264, which is causing ceph-mgr to become entirely unresponsive across the cluster, but cheroot seems to be ok. David On Wed, Dec 9, 2020 at 12:25 PM David Orman wrote: > Ken, > > We have rebuilt

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-09 Thread David Orman
Ken, We have rebuilt the 15.2.7 container images with this RPM applied, and will be deploying them to a larger (504 OSD) cluster to test - this cluster had the issue previously until we disabled polling via Prometheus. We will update as soon as it's run for a day or two and we've been able to
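
A minimal sketch of that kind of rebuild, assuming a locally downloaded patched cheroot RPM (the registry, tag, and RPM file name here are illustrative, not the exact build used):

    # Containerfile: layer the patched cheroot on top of the upstream image
    FROM docker.io/ceph/ceph:v15.2.7
    COPY python3-cheroot-8.5.1-1.el8.noarch.rpm /tmp/
    RUN dnf install -y /tmp/python3-cheroot-8.5.1-1.el8.noarch.rpm && \
        dnf clean all && rm -f /tmp/python3-cheroot-8.5.1-1.el8.noarch.rpm

Built with something like "podman build -t <registry>/ceph:v15.2.7-cheroot-fix ." and pushed wherever the orchestrator pulls images from.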

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-08 Thread Dimitri Savineau
As far as I know, the issue isn't specific to container deployments; deployments using packages (rpm or deb) are also affected (at least CentOS 8 and Ubuntu 20.04 Focal).

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-08 Thread David Orman
Hi Ken, Thank you for the update! As per https://github.com/ceph/ceph-container/issues/1748, we implemented the suggested change (dropping the ulimit to 1024:4096 for the mgr) last night, and on our test cluster of 504 OSDs, being polled by the internal Prometheus and our external instance, the mgrs
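
For anyone following along, the change referenced in that issue amounts to lowering the open-files ulimit on the mgr container so cheroot no longer walks an enormous file-descriptor range. An illustrative fragment only (not the exact cephadm invocation; for cephadm the flag goes on the podman command in the mgr daemon's unit.run):

    # Key part of the workaround: cap nofile at 1024 soft / 4096 hard
    podman run --ulimit nofile=1024:4096 ... docker.io/ceph/ceph:v15.2.7 ...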

[ceph-users] Re: Larger number of OSDs, cheroot, cherrypy, limits + containers == broken

2020-12-07 Thread Ken Dreyer
Thanks for bringing this up. We need to update Cheroot in Fedora and EPEL 8. I've opened https://src.fedoraproject.org/rpms/python-cheroot/pull-request/3 to get this into Fedora first. I've published an el8 RPM at https://fedorapeople.org/~ktdreyer/bz1868629/ for early testing. I can bring up a