Re: [ceph-users] dashboard hangs

2019-11-20 Thread thoralf schulze
hi, we were able to track this down to the auto balancer: disabling the auto balancer and cleaning out old (and probably not very meaningful) upmap-entries via ceph osd rm-pg-upmap-items brought back stable mgr daemons and a usable dashboard. the not-so-sensible upmap-entries might or might not
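A minimal sketch of the cleanup described above, assuming the plain-text output of ceph osd dump lists each entry as a line of the form "pg_upmap_items <pgid> [...]". The helper name upmap_cleanup_commands is hypothetical and not part of Ceph; it only prints the ceph osd rm-pg-upmap-items commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Sketch: read a plain-text `ceph osd dump` on stdin and print one
# `ceph osd rm-pg-upmap-items <pgid>` command per pg_upmap_items entry.
# Assumption: entries look like "pg_upmap_items 1.7 [3,5]".
# Nothing is executed here; the commands are only printed.
upmap_cleanup_commands() {
    awk '$1 == "pg_upmap_items" { print "ceph osd rm-pg-upmap-items " $2 }'
}
```

With the balancer switched off first (ceph balancer off), the output of "ceph osd dump | upmap_cleanup_commands" can be inspected and, once it looks right, piped to sh.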

Re: [ceph-users] dashboard hangs

2019-11-14 Thread thoralf schulze
hi Lenz,
On 11/13/19 6:38 PM, Lenz Grimmer wrote:
> there have been several reports about Ceph mgr modules (not just the
> dashboard) experiencing hangs and freezes recently. The thread "mgr
> daemons becoming unresponsive" might give you some additional insight.
>
> Is the "device health

[ceph-users] dashboard hangs

2019-11-13 Thread thoralf schulze
hi there, the dashboard of our moderately used cluster with 3 mon/mgr-nodes gets stuck about 30 seconds after a mgr becomes active. the dashboard is not usable anymore (i.e. the mgr daemon does not respond to http requests anymore), although it comes back from the dead occasionally for a few

[ceph-users] Fwd: Netzteilausfälle BARZ

2019-09-27 Thread thoralf schulze
hi, fyi … cheers, t.
Forwarded Message
Subject: Netzteilausfälle BARZ [power supply failures, BARZ]
Date: Fri, 27 Sep 2019 12:03:48 +0200
From: Berliner, Sascha
To: Plato, Michael; Schulze, Thoralf
CC: Obst, Bernd; Wötzel, Dirk
Hello. Here is the current status. The BARZ will

[ceph-users] mds directory pinning, status display

2019-09-13 Thread thoralf schulze
hi there, while debugging metadata servers reporting slow requests, we took a stab at pinning directories of a cephfs like so:
setfattr -n ceph.dir.pin -v 1 /tubfs/kubernetes/
setfattr -n ceph.dir.pin -v 0 /tubfs/profiles/
setfattr -n ceph.dir.pin -v 0 /tubfs/homes
on the active mds for rank 0,
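The pinning commands quoted above can be generated from a small rank-to-directory list. This is only a sketch: the helper name pin_commands is hypothetical, and it prints the setfattr calls rather than running them. A rank of -1 is the CephFS default and means the directory is not pinned.

```shell
#!/bin/sh
# Sketch: read "<rank> <directory>" pairs from stdin and print the
# matching `setfattr -n ceph.dir.pin` command for each pair.
# Lines without a directory are skipped. Paths are printed unquoted,
# so directories containing spaces would need extra quoting.
pin_commands() {
    while read -r rank dir; do
        [ -n "$dir" ] || continue
        printf 'setfattr -n ceph.dir.pin -v %s %s\n' "$rank" "$dir"
    done
}
```

Feeding it the three pairs from the message (1 /tubfs/kubernetes/, 0 /tubfs/profiles/, 0 /tubfs/homes) prints the same three commands, which can then be run on a client that has the file system mounted.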

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-27 Thread thoralf schulze
hi Zheng,
On 8/26/19 3:31 PM, Yan, Zheng wrote:
[…]
> change code to :
[…]
we can happily confirm that this resolves the issue. thank you _very_ much & with kind regards, t.

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread thoralf schulze
hi Zheng -
On 8/26/19 2:55 PM, Yan, Zheng wrote:
> I tracked down the bug
> https://tracker.ceph.com/issues/41434
wow, that was quick - thank you for investigating. we are looking forward to the fix :-) in the meantime, is there anything we can do to prevent q == p->second.end() from

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-26 Thread thoralf schulze
hi Zheng,
On 8/21/19 4:32 AM, Yan, Zheng wrote:
> Please enable debug mds (debug_mds=10), and try reproducing it again.
please find the logs at https://www.user.tu-berlin.de/thoralf.schulze/ceph-debug.tar.xz . we managed to reproduce the issue as a worst case scenario: before snapshotting,

Re: [ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-21 Thread thoralf schulze
hi zheng,
On 8/21/19 4:32 AM, Yan, Zheng wrote:
> Please enable debug mds (debug_mds=10), and try reproducing it again.
we will get back with the logs on monday. thank you & with kind regards, t.

[ceph-users] cephfs-snapshots causing mds failover, hangs

2019-08-20 Thread thoralf schulze
hi there, we are struggling with the creation of cephfs-snapshots: doing so reproducibly causes a failover of our metadata servers. afterwards, the demoted mds servers won't be available as standby servers and the mds daemons on these machines have to be manually restarted. more often than we