On 17/01/2024 16:11, kefu chai wrote:


On Tue, Jan 16, 2024 at 12:11 AM Chris Palmer <chris.pal...@idnet.com> wrote:

    Updates on both problems:

    Problem 1
    --------------

    The bookworm/reef cephadm package needs updating to accommodate the
    last change in /usr/share/doc/adduser/NEWS.Debian.gz:

       System user home defaults to /nonexistent if --home is not
       specified.
       Packages that call adduser to create system accounts should
       explicitly specify a location for /home (see Lintian check
       maintainer-script-lacks-home-in-adduser).

    i.e. when creating the cephadm user as a system user it needs to
    explicitly specify the expected home directory of /home/cephadm.


Hi Chris, thank you for the bug report and the suggestion. Could you please file a tracker ticket so we can track and backport the related fixes? I just
created https://github.com/ceph/ceph/pull/55218 in the hope of alleviating the
problem.

I've created issue https://tracker.ceph.com/issues/64069 for this.


    A workaround is to manually create the user+directory before
    installing ceph.
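
A minimal sketch of that workaround, in Python, which only builds the adduser command line with the home directory named explicitly. The user name and /home/cephadm come from the report above; the helper name `cephadm_adduser_argv` is hypothetical, and the `--shell /bin/bash` choice is an assumption for illustration, not necessarily what the package's maintainer script does.

```python
def cephadm_adduser_argv(user="cephadm", home="/home/cephadm"):
    """Build an adduser command line that sets the home directory explicitly.

    Hypothetical helper: the exact flag set is an assumption for illustration.
    """
    return [
        "adduser",
        "--system",              # create a system account
        "--home", home,          # explicit home, so the /nonexistent default is not used
        "--shell", "/bin/bash",  # assumption: the account needs a usable login shell
        user,
    ]
```

Run as root before installing the package, the list could be passed to `subprocess.run(cephadm_adduser_argv(), check=True)`.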


    Problem 2
    --------------

    This is a complex set of interactions that prevent many mgr modules
    (including dashboard) from running. It is NOT debian-specific and will
    eventually bite other distributions as well. At the moment Ceph PR54710
    looks the most promising fix (full or partial). Detail is spread across
    the following:

    https://github.com/pyca/cryptography/issues/9016
    https://github.com/ceph/ceph/pull/54710
    https://tracker.ceph.com/issues/63529
    https://forum.proxmox.com/threads/ceph-warning-post-upgrade-to-v8.129371/page-5
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055212
    https://github.com/pyca/bcrypt/issues/694


IIUC, a backport of https://github.com/ceph/ceph/pull/54710 to reef would address this issue, am I right?



Unfortunately I think this may be part of a much bigger MGR problem. My understanding of the relevant background is:

 * MGR modules use python subinterpreters for isolation between modules.
 * Several modules (including but not limited to dashboard & restful)
   use python3-cryptography for hashing and TLS (and possibly other
   things).
 * python3-cryptography delegates some crypto functions to Rust
   functions. These include bcrypt and TLS-related functions.
 * python3-cryptography uses PyO3 to invoke Rust functions.
 * PyO3 does not support being used by subinterpreters. In the past
   this has been allowed but was actually unsafe. Now PyO3 throws an
   exception when it detects multiple initialisations.

So it appears that the MGR use of these functions has always been unsafe, and is now forbidden.

PR54710 identified that the bcrypt hashing used during authentication could easily be written in a small amount of pure Python, avoiding PyO3 altogether. However, a note in the discussion said that TLS also had to be disabled, and the change only applied to the dashboard. My stack trace below shows the exception during TLS initialisation.

As PyO3 updates are adopted in other Linux distributions this is likely to break a number of MGR modules. As there does not seem to be any subinterpreter support in PyO3 coming soon, the only option may be to completely eliminate use of python3-cryptography from all MGR modules. (It is possible that MGR modules also use other python3 modules that use PyO3 to invoke Rust.)
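
To get a rough idea of which MGR modules are exposed, something like the following Python sketch could list source files that import the packages named in this thread. The function name `find_suspect_imports` and the package set are assumptions for illustration; the scan only sees direct, top-level imports and will miss anything pulled in indirectly through another library.

```python
import ast
from pathlib import Path

# Assumption: packages worth flagging. cryptography and bcrypt are the ones
# named in this thread; extend the set as needed.
SUSPECT_PACKAGES = frozenset({"cryptography", "bcrypt"})


def find_suspect_imports(root, packages=SUSPECT_PACKAGES):
    """Return {file path: sorted package names} for files importing `packages`."""
    hits = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except (SyntaxError, ValueError, OSError):
            continue  # skip files that cannot be read or parsed
        found = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                names = [node.module]  # absolute "from X import ..." only
            else:
                continue
            for name in names:
                top = name.split(".")[0]  # top-level package of a dotted name
                if top in packages:
                    found.add(top)
        if found:
            hits[str(path)] = sorted(found)
    return hits
```

For example, `find_suspect_imports("/usr/share/ceph/mgr")` would scan the module directory that appears in the stack trace below.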

Unfortunately for us, we didn't find this until we had upgraded all MONs in a cluster to reef, at which point we can't downgrade them to quincy. And we can't upgrade the MGR. As a temporary measure (this cluster had MON/MGR/MDS/RGW colocated on 2 hosts) we have added another bookworm host running a reef MON to ensure we can maintain quorum. We are not sure whether it is safe to upgrade the other components (OSD, MDS, RGW) while the MGR remains at quincy.

🙁




    On 12/01/2024 14:29, Chris Palmer wrote:
    > More info on problem 2:
    >
    > When starting the dashboard, the mgr seems to try to initialise
    > cephadm, which in turn uses python crypto libraries that lead to
    > the python error:
    >
    > $ ceph crash info 2024-01-12T11:10:03.938478Z_2263d2c8-8120-417e-84bc-bb01f5d81e52
    > {
    >     "backtrace": [
    >         "  File \"/usr/share/ceph/mgr/cephadm/__init__.py\", line 1, in <module>\n    from .module import CephadmOrchestrator",
    >         "  File \"/usr/share/ceph/mgr/cephadm/module.py\", line 15, in <module>\n    from cephadm.service_discovery import ServiceDiscovery",
    >         "  File \"/usr/share/ceph/mgr/cephadm/service_discovery.py\", line 20, in <module>\n    from cephadm.ssl_cert_utils import SSLCerts",
    >         "  File \"/usr/share/ceph/mgr/cephadm/ssl_cert_utils.py\", line 8, in <module>\n    from cryptography import x509",
    >         "  File \"/lib/python3/dist-packages/cryptography/x509/__init__.py\", line 6, in <module>\n    from cryptography.x509 import certificate_transparency",
    >         "  File \"/lib/python3/dist-packages/cryptography/x509/certificate_transparency.py\", line 10, in <module>\n    from cryptography.hazmat.bindings._rust import x509 as rust_x509",
    >         "ImportError: PyO3 modules may only be initialized once per interpreter process"
    >     ],
    >     "ceph_version": "18.2.1",
    >     "crash_id": "2024-01-12T11:10:03.938478Z_2263d2c8-8120-417e-84bc-bb01f5d81e52",
    >     "entity_name": "mgr.xxxxx01",
    >     "mgr_module": "cephadm",
    >     "mgr_module_caller": "PyModule::load_subclass_of",
    >     "mgr_python_exception": "ImportError",
    >     "os_id": "12",
    >     "os_name": "Debian GNU/Linux 12 (bookworm)",
    >     "os_version": "12 (bookworm)",
    >     "os_version_id": "12",
    >     "process_name": "ceph-mgr",
    >     "stack_sig": "7815ad73ced094695056319d1241bf7847da19b4b0dfee7a216407b59a7e3d84",
    >     "timestamp": "2024-01-12T11:10:03.938478Z",
    >     "utsname_hostname": "xxxxx01.xxx.xxx",
    >     "utsname_machine": "x86_64",
    >     "utsname_release": "6.1.0-17-amd64",
    >     "utsname_sysname": "Linux",
    >     "utsname_version": "#1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30)"
    > }
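
For anyone checking whether their own mgr crashes match this one, a crash report like the one above could be tested with a short Python sketch along these lines. The function names are hypothetical; the field names (`backtrace`, `mgr_module`, `ceph_version`) and the error text are taken from the report itself, and the input is assumed to be the JSON printed by `ceph crash info`.

```python
import json

# Error text exactly as it appears in the backtrace above.
PYO3_MARKER = "PyO3 modules may only be initialized once per interpreter process"


def is_pyo3_reinit_crash(crash_json):
    """Return True if the crash report's backtrace contains the PyO3 error."""
    report = json.loads(crash_json)
    return any(PYO3_MARKER in line for line in report.get("backtrace", []))


def summarize(crash_json):
    """Return (mgr_module, ceph_version, pyo3_related) for one crash report."""
    report = json.loads(crash_json)
    return (
        report.get("mgr_module"),
        report.get("ceph_version"),
        is_pyo3_reinit_crash(crash_json),
    )
```

Passing the JSON text of the report above to `summarize` should identify the cephadm module on 18.2.1 as PyO3-related; reports without a `backtrace` field are simply treated as unrelated.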

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
