Updates on both problems:

Problem 1
--------------

The bookworm/reef cephadm package needs updating to accommodate the latest change in /usr/share/doc/adduser/NEWS.Debian.gz:

  System user home defaults to /nonexistent if --home is not specified.
  Packages that call adduser to create system accounts should explicitly
  specify a location for /home (see Lintian check
  maintainer-script-lacks-home-in-adduser).

i.e. when creating the cephadm user as a system user, the package needs to specify the expected home directory of /home/cephadm explicitly.

A workaround is to manually create the user+directory before installing ceph.
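For example (a sketch only; the adduser options are my choice, not taken from the package scripts, and adduser --system exits quietly if the user already exists):

  # create the user with the expected home directory, then install
  sudo adduser --system --group --shell /bin/bash --home /home/cephadm cephadm
  sudo apt install ceph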


Problem 2
--------------

This is a complex set of interactions that prevents many mgr modules (including the dashboard) from running. It is NOT Debian-specific and will eventually bite other distributions as well. At the moment Ceph PR 54710 looks like the most promising fix (full or partial). Details are spread across the following:

https://github.com/pyca/cryptography/issues/9016
https://github.com/ceph/ceph/pull/54710
https://tracker.ceph.com/issues/63529
https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371/page-5
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055212
https://github.com/pyca/bcrypt/issues/694



On 12/01/2024 14:29, Chris Palmer wrote:
More info on problem 2:

When starting the dashboard, the mgr seems to try to initialise the cephadm module, which in turn imports Python crypto libraries, leading to the Python error below (a sketch reproducing the failure mode follows the crash dump):

$ ceph crash info 2024-01-12T11:10:03.938478Z_2263d2c8-8120-417e-84bc-bb01f5d81e52
{
    "backtrace": [
        "  File \"/usr/share/ceph/mgr/cephadm/__init__.py\", line 1, in <module>\n    from .module import CephadmOrchestrator",         "  File \"/usr/share/ceph/mgr/cephadm/module.py\", line 15, in <module>\n    from cephadm.service_discovery import ServiceDiscovery",         "  File \"/usr/share/ceph/mgr/cephadm/service_discovery.py\", line 20, in <module>\n    from cephadm.ssl_cert_utils import SSLCerts",         "  File \"/usr/share/ceph/mgr/cephadm/ssl_cert_utils.py\", line 8, in <module>\n    from cryptography import x509",         "  File \"/lib/python3/dist-packages/cryptography/x509/__init__.py\", line 6, in <module>\n    from cryptography.x509 import certificate_transparency",         "  File \"/lib/python3/dist-packages/cryptography/x509/certificate_transparency.py\", line 10, in <module>\n    from cryptography.hazmat.bindings._rust import x509 as rust_x509",         "ImportError: PyO3 modules may only be initialized once per interpreter process"
    ],
    "ceph_version": "18.2.1",
    "crash_id": "2024-01-12T11:10:03.938478Z_2263d2c8-8120-417e-84bc-bb01f5d81e52",
    "entity_name": "mgr.xxxxx01",
    "mgr_module": "cephadm",
    "mgr_module_caller": "PyModule::load_subclass_of",
    "mgr_python_exception": "ImportError",
    "os_id": "12",
    "os_name": "Debian GNU/Linux 12 (bookworm)",
    "os_version": "12 (bookworm)",
    "os_version_id": "12",
    "process_name": "ceph-mgr",
    "stack_sig": "7815ad73ced094695056319d1241bf7847da19b4b0dfee7a216407b59a7e3d84",
    "timestamp": "2024-01-12T11:10:03.938478Z",
    "utsname_hostname": "xxxxx01.xxx.xxx",
    "utsname_machine": "x86_64",
    "utsname_release": "6.1.0-17-amd64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30)"
}
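For illustration (my own sketch, not from the crash report): ceph-mgr loads each mgr module into its own CPython subinterpreter, and a PyO3-based extension module such as cryptography's _rust bindings currently refuses to initialize more than once per process. Something like the following, using the CPython-internal _xxsubinterpreters module that ships with bookworm's Python 3.11 (not a public API), shows the same failure mode:

python3 - <<'EOF'
# Import a PyO3-backed module in two subinterpreters: the first import
# succeeds, the second raises the same ImportError seen in the crash,
# because the PyO3 module initialization is counted per process.
import _xxsubinterpreters as interpreters

code = "from cryptography.hazmat.bindings import _rust"
for attempt in (1, 2):
    interp = interpreters.create()
    try:
        interpreters.run_string(interp, code)
        print(f"attempt {attempt}: import OK")
    except interpreters.RunFailedError as exc:
        print(f"attempt {attempt}: {exc}")
    finally:
        interpreters.destroy(interp)
EOF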


On 12/01/2024 12:39, Chris Palmer wrote:
I was delighted to see the native Debian 12 (bookworm) packages turn up in Reef 18.2.1.

We currently run a number of ceph clusters on Debian 11 (bullseye) / Quincy 17.2.7. These are not cephadm-managed.

I have attempted to upgrade a test cluster, and it is not going well. Since Quincy only supports bullseye and Reef only supports bookworm, we are reinstalling from bare metal. However, I don't think either of these two problems is related to that.

Problem 1
--------------

A simple "apt install ceph" goes most of the way, then errors with

Setting up cephadm (18.2.1-1~bpo12+1) ...
usermod: unlocking the user's password would result in a passwordless account. You should set a password with usermod -p to unlock this user's password.
mkdir: cannot create directory ‘/home/cephadm/.ssh’: No such file or directory
dpkg: error processing package cephadm (--configure):
 installed cephadm package post-installation script subprocess returned error exit status 1
dpkg: dependency problems prevent configuration of ceph-mgr-cephadm:
 ceph-mgr-cephadm depends on cephadm; however:
  Package cephadm is not configured yet.

dpkg: error processing package ceph-mgr-cephadm (--configure):
 dependency problems - leaving unconfigured


The two cephadm-related packages are then left in an error state, which apt attempts to reconfigure each time it is run.

The cephadm user has a home directory of /nonexistent; however, the cephadm package's postinst (--configure) script is trying to use /home/cephadm (as it was on Quincy/bullseye).

Since we aren't using cephadm, we decided to keep going, as the other packages were actually installed, and to deal with the package state later.
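(For the record, one way to clear the half-configured packages afterwards; an untested sketch assuming the postinst only needs the user's home directory to exist:)

  # point the existing cephadm user at the expected home and create it
  sudo usermod -d /home/cephadm cephadm
  sudo mkdir -p /home/cephadm
  sudo chown cephadm: /home/cephadm  # adjust the group if the package expects another
  sudo dpkg --configure -a           # re-runs the failed postinst scripts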

Problem 2
---------------

I upgraded 2 of the 3 monitor nodes without any other problems, and (for the moment) removed the other Quincy monitor prior to rebuild.

I then shut down the remaining Quincy manager and attempted to start the Reef manager. Although the manager is running, "ceph mgr services" shows it is only providing the restful service, not the dashboard. The log file contains many instances of the following error:

ImportError: PyO3 modules may only be initialized once per interpreter process

and "ceph -s" reports "Module 'dashboard' has failed dependency: PyO3 modules may only be initialized once per interpreter process".


Questions
---------------

1. Have the Reef/bookworm packages ever been tested in a non-cephadm environment?

2. I want to revert this cluster back to a fully functional state. I cannot bring the remaining Quincy monitor back up, though ("require release 18 > 17"). Would I have to go through the procedure of starting over and rescuing the monmap from the OSDs (roughly the sketch below)? The OSDs and an active MDS are still up and running Quincy. I'm aware that process exists but have never had to delve into it.
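(For reference, the process I have in mind is roughly the "recovery using OSDs" procedure from the Ceph troubleshooting docs; an untested sketch, with mon ids, keyring path, and store path illustrative:)

# with the OSDs stopped, collect cluster map info from each OSD store
ms=/tmp/mon-store
mkdir $ms
for osd in /var/lib/ceph/osd/ceph-*; do
    ceph-objectstore-tool --data-path "$osd" --no-mon-config \
        --op update-mon-db --mon-store-path "$ms"
done
# then rebuild the mon store using a keyring that has admin and mon caps
ceph-monstore-tool "$ms" rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring --mon-ids a b c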


Thanks, Chris
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io