** Description changed:
[ Impact ]
* This is a common problem with security scanners: malformed requests
kill all worker threads and the server becomes unresponsive.
- - The ceph dashboard on reef is susceptible to this issue.
+ - The ceph dashboard on quincy and squid is susceptible to this issue.
* An attacker could use the same technique to DoS the server.
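The failure mode can be illustrated with a minimal sketch (my own toy
analogy, NOT cheroot's actual code): a fixed pool of worker threads in
which an uncaught exception in the request handler permanently kills a
worker, so a handful of malformed requests exhausts the pool.

```python
import queue
import threading
import time

# A toy fixed-size worker pool (an analogy, not cheroot's code).
tasks = queue.Queue()

def handle(conn):
    # Stand-in for TLS processing: a malformed request raises an
    # exception that nothing above the handler catches.
    if conn == "malformed-tls":
        raise RuntimeError("handshake failed")  # stand-in for ssl.SSLError

def worker():
    while True:
        conn = tasks.get()
        handle(conn)  # an uncaught exception here ends this thread

workers = [threading.Thread(target=worker, daemon=True) for _ in range(2)]
for w in workers:
    w.start()

# Two malformed requests are enough to kill the whole pool...
tasks.put("malformed-tls")
tasks.put("malformed-tls")
time.sleep(0.5)
print(sum(w.is_alive() for w in workers))  # 0 -- no workers left
# ...after which legitimate requests queue up forever, unserved.
```

This is why the server stays up (the main thread is alive) but stops
answering requests.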
[ Test Plan ]
I reproduced the issue against both a minimal cheroot server and the
ceph dashboard.
In both cases, I used tlsfuzzer [1] to reproduce the bug, by running
`scripts/test-tls13-ccs.py -h <IP> -p <PORT>`.
For cheroot
===========
I did the following in an lxd container.
1. Create a minimal cheroot server file
server.py
---------
from cheroot.wsgi import Server as WSGIServer
from cheroot.ssl.builtin import BuiltinSSLAdapter


def app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)
    return [b"Ok."]


server = WSGIServer(('0.0.0.0', 8443), app)
server.ssl_adapter = BuiltinSSLAdapter(
    certificate='cert.pem',
    private_key='key.pem'
)

if __name__ == '__main__':
    try:
        server.start()
    except KeyboardInterrupt:
        server.stop()
-----------
2. Create a self-signed certificate and key
`openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days
365 -nodes`
3. Start the server: `sudo python3 server.py`
4. Verify that the server responds correctly
`curl -k -I https://<IP>:8443`
Output
---------------
HTTP/1.1 200 OK
Content-type: text/plain
Connection: close
Date: Tue, 10 Mar 2026 15:44:09 GMT
Server: Cheroot/8.5.2
----------------
and check the number of worker threads
`grep -i threads /proc/$(pgrep -f server.py)/status`
Expected Output
---------------
Threads: 11
---------------
5. Run the tls13-ccs script of tlsfuzzer repeatedly until it times out.
`scripts/test-tls13-ccs.py -h <IP> -p 8443`
After several runs you will see:
> AssertionError: Timeout when waiting for peer message
6. Observe that connections to the server now time out (or hang if no
timeout is specified)
`curl -k -I https://<IP>:8443 --max-time 5`
Expected Output
---------------
HTTP/1.1 200 OK
Content-type: text/plain
Connection: close
Date: Tue, 10 Mar 2026 15:44:09 GMT
Server: Cheroot/8.5.2
----------------
Actual Output
------------
curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
-------------
and check the number of threads for server process
`grep -i threads /proc/$(pgrep -f server.py)/status`
Expected Output
---------------
Threads: 11
---------------
Actual Output
---------------
Threads: 1
---------------
Note that all of the worker threads have died.
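The `grep` check above can also be scripted. Here is a small helper (the
function name is my own) that parses the `Threads:` field out of
`/proc/<pid>/status` content:

```python
def thread_count(status_text):
    """Return the value of the Threads field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    raise ValueError("no Threads field found")

# Against a live process it would be used roughly as:
# with open(f"/proc/{pid}/status") as f:
#     print(thread_count(f.read()))

sample = "Name:\tserver.py\nState:\tS (sleeping)\nThreads:\t11\n"
print(thread_count(sample))  # 11
```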
For ceph-dashboard
==================
1. Deploy a minimal ceph lab on lxd [2]
2. Add ceph-dashboard to the model [3]
3. Note down the <IP> of one of the ceph-mon nodes which also hosts the
dashboard.
4. Verify that the ceph dashboard is reachable at https://<IP>:8443
either in the browser or with curl
`curl -k -I https://<IP>:8443 --max-time 5`
Output
---------------
HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Server: Ceph-Dashboard
Date: Tue, 10 Mar 2026 15:55:47 GMT
Content-Security-Policy: frame-ancestors 'self';
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
Content-Language: en-US
Vary: Accept-Language, Accept-Encoding
Cache-Control: no-cache
Last-Modified: Fri, 12 Jul 2024 14:10:44 GMT
Accept-Ranges: bytes
Content-Length: 6466
---------------
5. Run the tls13-ccs script of tlsfuzzer repeatedly until it times out.
`scripts/test-tls13-ccs.py -h <IP> -p 8443`
After several runs you will see:
> AssertionError: Timeout when waiting for peer message
6. The ceph-dashboard is now unreachable from both the browser and curl
`curl -k -I https://<IP>:8443 --max-time 5`
Expected Output
---------------
HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Server: Ceph-Dashboard
Date: Tue, 10 Mar 2026 15:55:47 GMT
Content-Security-Policy: frame-ancestors 'self';
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
Content-Language: en-US
Vary: Accept-Language, Accept-Encoding
Cache-Control: no-cache
Last-Modified: Fri, 12 Jul 2024 14:10:44 GMT
Accept-Ranges: bytes
Content-Length: 6466
---------------
Actual Output
-------------
curl: (28) Operation timed out after 5002 milliseconds with 0 bytes received
-------------
7. Read the syslog on the mon unit and observe uncaught exceptions in
the cheroot server threads, e.g., `sudo grep "Thread" -A10 /var/log/syslog`
Expected Output: <No Thread Errors>
Actual Output
--------
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]: Exception in thread ('CP Server Thread-11',):
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]: Traceback (most recent call last):
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:   File "/lib/python3/dist-packages/cheroot/server.py", line 1277, in communicate
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:     req.parse_request()
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:   File "/lib/python3/dist-packages/cheroot/server.py", line 706, in parse_request
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:     success = self.read_request_line()
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:   File "/lib/python3/dist-packages/cheroot/server.py", line 747, in read_request_line
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:     request_line = self.rfile.readline()
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:   File "/lib/python3/dist-packages/cheroot/server.py", line 304, in readline
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:     data = self.rfile.readline(256)
Mar 10 15:57:31 juju-73b987-2 ceph-mgr[65287]:   File "/lib/python3.10/_pyio.py", line 582, in readline
---------
Fix Verification
================
As extra verification of the fix in this SRU, we can run the whole
tlsfuzzer suite against the patched server and verify that the number of
worker threads remains the same. This is a good indicator that there are
no obvious additional vulnerabilities that could crash the worker
threads.
1. Perform steps 1-4 of the "For cheroot" test plan above.
2. From the tlsfuzzer directory, run all scripts targeting the server (some
of these will error out because they are missing required arguments, but this
does no harm)
`for s in scripts/test-*.py; do $s -h <IP> -p 8443; done`
NOTE: the server will log many errors, but none should crash the worker
threads
3. Verify that the number of threads is the same as before running the
suite
`grep -i threads /proc/$(pgrep -f server.py)/status`
Expected Output
---------------
Threads: 11
---------------
[ Where problems could occur ]
* Because we are now swallowing errors, there is the potential for
threads to be left in a bad state when they would have previously
crashed.
* The smaller patch does not catch all exception types, so it would
leave some errors of this kind unhandled, although it does fix the
specific incarnation reported here.
[ Other Info ]
* This issue was fixed in upstream version 10.0.1, which is in questing and
above.
* Upstream ceph has fixed this issue in reef and later [4] by bumping the
cheroot dependency to 10.0.1.
* There are two variants of an upstream patch
- One simply catches and logs SSL errors in the threadpool [5]. This
patch was proposed but not merged.
- The other is a more holistic revisiting of the error handling in the
threadpool [6], and is the patch that landed in 10.0.1.
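To illustrate the difference, the minimal variant [5] is roughly shaped
like the following sketch (hypothetical names, a paraphrase rather than
the actual diff): the per-connection work inside a worker's run loop is
wrapped so that an SSL failure is logged instead of propagating and
killing the thread.

```python
import logging
import ssl

log = logging.getLogger("threadpool-sketch")

def worker_loop(get_connection, serve):
    """Hypothetical sketch of a threadpool worker's run loop."""
    while True:
        conn = get_connection()
        if conn is None:  # shutdown sentinel
            return
        try:
            serve(conn)
        except ssl.SSLError:
            # Minimal-patch idea: log the error and keep the worker alive.
            log.exception("SSL error while serving connection")

# One connection whose handshake fails, then shutdown; the worker survives.
conns = iter(["bad-conn", None])

def failing_serve(conn):
    raise ssl.SSLError("handshake failed")

worker_loop(lambda: next(conns), failing_serve)
print("worker survived")
```

The holistic patch [6] that landed in 10.0.1 broadens this handling
beyond SSL errors and restructures error handling across the threadpool.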
* This leaves a few options for the SRU
1. Apply the minimal patch to jammy and noble
2. Apply upstream patch to jammy and noble
3. Apply the minimal patch to jammy and the upstream patch to noble
4. Apply the minimal patch to jammy and bump the noble package to 10.0.1
* Of these, option 3 seems likely to carry the smallest risk of regression
and so is what I have proposed, but I welcome input on the other options.
[1]: https://github.com/tlsfuzzer/tlsfuzzer
[2]: https://ubuntu.com/ceph/docs/tutorial
[3]: https://ubuntu.com/ceph/docs/install-dashboard
[4]: https://github.com/ceph/ceph/pull/57001
[5]: https://github.com/cherrypy/cheroot/pull/365
[6]: https://github.com/cherrypy/cheroot/pull/649
Related Upstream Issues
-----------------------
https://github.com/cherrypy/cherrypy/issues/1989
https://github.com/cherrypy/cheroot/issues/358
https://bugs.launchpad.net/bugs/2143920
Title:
[SRU] Uncaught SSL errors can crash worker threads