Hi,

Do you have more nodes than deployed mgrs, and are you using cephadm?

If so, it might be that the node you are connecting to no longer has an instance of the mgr running, and you are only getting some leftovers from the browser cache.

At least this was happening in my test cluster, but I was always able to find a node with the mgr running by simply trying each of them in turn.
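Instead of trying each node by hand, the cluster can be asked which mgr is active and which URLs it publishes. A minimal sketch, assuming the ceph CLI and an admin keyring are available on the host; the function name show_active_mgr is hypothetical and only wraps two existing commands:

```shell
# Hypothetical helper: report the active mgr and the endpoints it serves.
show_active_mgr() {
    # Summary of the mgr map (JSON), which names the currently active mgr.
    ceph mgr stat
    # JSON map of enabled mgr modules to their URLs, dashboard included.
    ceph mgr services
}
```

If the dashboard URL reported there names a different host than the one open in the browser, the page is probably being loaded from a node without a running mgr.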

Greetings,

Kai

On 11/19/21 00:03, Zach Heise (SSCC) wrote:

Hello!

Our test cluster is a few months old. It was initially set up from scratch with Pacific and has since had two small patch releases applied: 16.2.5 and then, a couple of weeks ago, 16.2.6. The issue I'm describing has been present since the beginning.

We have an active and a standby mgr daemon, and the dashboard module is installed with SSL turned on. The certificates are self-signed only, not trusted by browsers, but I always just click "okay" through Chrome's and Firefox's warnings about that.

I have noticed that every 2-3 days, in the morning when I start work, our ceph dashboard page does not respond in the browser. It works fine throughout the day, but it seems that after some unknown number of hours without anyone accessing it (I'm the only one using the dashboard now, since it's just a test), something goes wrong with the dashboard module or the mgr daemon. When I try to load the ceph dashboard site (or refresh it when it's already loaded), the browser just shows the "throbber" <https://en.wikipedia.org/wiki/Throbber>: no content on the page ever appears, and there are no errors or anything. None of the buttons on the page load content, nor do they time out and show a 404. For example, Block\Images or Cluster\Hosts in the left sidebar will load, but show empty. And the throbber never stops.

Confirmed that this happens in all browsers too.

I can easily fix it by running ceph mgr module disable dashboard, waiting 10 seconds, and then running ceph mgr module enable dashboard. This makes it start working again until the next time I go a few days without using the dashboard, at which point I need to repeat the same process.
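The workaround above can be wrapped in a small shell function so it is a single call. This is only a sketch of the steps just described, not a fix for the underlying problem; the function name restart_dashboard and its optional delay argument are hypothetical:

```shell
# Hypothetical helper: disable the dashboard module, wait, then re-enable it.
# Usage: restart_dashboard [delay_seconds]   (default: 10)
restart_dashboard() {
    # Stop here if disabling fails, so the module is not enabled blindly.
    ceph mgr module disable dashboard || return 1
    # Give the mgr time to tear the module down.
    sleep "${1:-10}"
    ceph mgr module enable dashboard
}
```

Calling restart_dashboard with no argument waits the same 10 seconds as the manual procedure.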

Any ideas as to what could be causing this? I have already turned on debug mode. When the dashboard is in this hanging state, I check the cephadm logs with cephadm logs --name mgr.ceph01.fblojp -- -f but there's nothing obvious (to my untrained eyes at least). When the dashboard is functional, I can see my own navigation around the dashboard in the logs, so I know that logging is working:

Nov 01 15:46:32 ceph01.domain conmon[5814]: debug 2021-11-01T20:46:32.601+0000 7f7cbb42e700  0 [dashboard INFO request] [10.130.50.252:52267] [GET] [200] [0.013s] [admin] [1.0K] /api/summary

I already confirmed that the same thing happens regardless of whether I'm using the default ports of http://ceph01.domain:8080 or https://ceph01.domain:8443 (although, as mentioned, I usually use self-signed SSL).

The dashboard is currently in this hanging state, so I am happy to try to get logs.

Thanks,

-Zach


_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io