aminghadersohi commented on code in PR #40139:
URL: https://github.com/apache/superset/pull/40139#discussion_r3276892222
##########
superset/extensions/ssh.py:
##########
@@ -30,6 +30,32 @@
if TYPE_CHECKING:
from superset.databases.ssh_tunnel.models import SSHTunnel
+# Order matters: paramiko's per-class loaders raise SSHException with vague
+# "unpack requires 4 bytes" messages on type mismatches, so we try the more
+# modern key types first (ed25519, ECDSA) and fall back to RSA, which is the
+# most permissive parser and the historical default in this codebase.
+_SSH_KEY_TYPES: tuple[type[PKey], ...] = (Ed25519Key, ECDSAKey, RSAKey)
+
+
+def _load_private_key(pem: str, password: str | None) -> PKey:
+ """Load a private key PEM regardless of algorithm (ed25519, ECDSA, RSA).
+
+ paramiko does not expose a polymorphic ``PKey.from_private_key``; each
+ key class only accepts its own format. Iterate over the supported types
+ and return the first that parses cleanly.
+ """
+ last_exc: Exception | None = None
+ for key_class in _SSH_KEY_TYPES:
+ try:
+ return key_class.from_private_key(StringIO(pem), password=password)
+ except SSHException as exc:
+ last_exc = exc
+ continue
+ raise SSHException(
+ "Unable to parse SSH private key as any of "
+ f"{', '.join(k.__name__ for k in _SSH_KEY_TYPES)}: {last_exc}"
+ )
Review Comment:
**HIGH: Missing exception chaining**
This `raise` should use `from last_exc`:
```python
raise SSHException(
"Unable to parse SSH private key as any of "
f"{', '.join(k.__name__ for k in _SSH_KEY_TYPES)}: {last_exc}"
) from last_exc
```
Without it Python discards the `__context__` chain, so Sentry (and any
`raise`-from-`raise` traceback) shows only this outer `SSHException` — the
stack frame inside paramiko where the parse actually failed is gone. One word
fixes it.
##########
superset/extensions/ssh.py:
##########
@@ -30,6 +30,32 @@
if TYPE_CHECKING:
from superset.databases.ssh_tunnel.models import SSHTunnel
+# Order matters: paramiko's per-class loaders raise SSHException with vague
+# "unpack requires 4 bytes" messages on type mismatches, so we try the more
+# modern key types first (ed25519, ECDSA) and fall back to RSA, which is the
+# most permissive parser and the historical default in this codebase.
+_SSH_KEY_TYPES: tuple[type[PKey], ...] = (Ed25519Key, ECDSAKey, RSAKey)
+
+
+def _load_private_key(pem: str, password: str | None) -> PKey:
+ """Load a private key PEM regardless of algorithm (ed25519, ECDSA, RSA).
+
+ paramiko does not expose a polymorphic ``PKey.from_private_key``; each
+ key class only accepts its own format. Iterate over the supported types
+ and return the first that parses cleanly.
Review Comment:
**NIT: Docstring slightly inaccurate about paramiko's API**
> "paramiko does not expose a polymorphic `PKey.from_private_key`"
Paramiko 3.2+ does have `PKey.from_path()` — a static method that uses
`cryptography`'s `load_ssh_private_key()` internally and dispatches by type.
It's not usable here because it requires a filesystem path; writing
`ssh_tunnel.private_key` to disk first would expose key material. The comment
is worth clarifying:
```python
# paramiko 3.2+ has PKey.from_path() for polymorphic loading, but it requires
# a file path; writing private key material to disk would be a security
regression.
# Use per-class loaders on the in-memory StringIO instead.
```
##########
superset/extensions/ssh.py:
##########
@@ -30,6 +30,32 @@
if TYPE_CHECKING:
from superset.databases.ssh_tunnel.models import SSHTunnel
+# Order matters: paramiko's per-class loaders raise SSHException with vague
+# "unpack requires 4 bytes" messages on type mismatches, so we try the more
+# modern key types first (ed25519, ECDSA) and fall back to RSA, which is the
+# most permissive parser and the historical default in this codebase.
+_SSH_KEY_TYPES: tuple[type[PKey], ...] = (Ed25519Key, ECDSAKey, RSAKey)
+
+
+def _load_private_key(pem: str, password: str | None) -> PKey:
+ """Load a private key PEM regardless of algorithm (ed25519, ECDSA, RSA).
+
+ paramiko does not expose a polymorphic ``PKey.from_private_key``; each
+ key class only accepts its own format. Iterate over the supported types
+ and return the first that parses cleanly.
+ """
+ last_exc: Exception | None = None
+ for key_class in _SSH_KEY_TYPES:
+ try:
+ return key_class.from_private_key(StringIO(pem), password=password)
+ except SSHException as exc:
+ last_exc = exc
+ continue
+ raise SSHException(
+ "Unable to parse SSH private key as any of "
+ f"{', '.join(k.__name__ for k in _SSH_KEY_TYPES)}: {last_exc}"
Review Comment:
**MEDIUM: `last_exc` always reflects RSAKey, not the closest-matching type**
Because `RSAKey` is last in `_SSH_KEY_TYPES`, `last_exc` always carries the
RSAKey parse error — even when the key is actually ed25519. If a user's ed25519
key is corrupted (not a passphrase issue), they see a RSAKey-specific message
rather than the Ed25519Key one. The most useful error comes from the type that
matched the PEM header, but that's not tracked here.
At minimum a `# NOTE: last_exc reflects the final RSAKey attempt` comment
would prevent future surprise. A fuller fix would collect per-type errors and
surface them all.
##########
superset/extensions/ssh.py:
##########
@@ -30,6 +30,32 @@
if TYPE_CHECKING:
from superset.databases.ssh_tunnel.models import SSHTunnel
+# Order matters: paramiko's per-class loaders raise SSHException with vague
+# "unpack requires 4 bytes" messages on type mismatches, so we try the more
+# modern key types first (ed25519, ECDSA) and fall back to RSA, which is the
+# most permissive parser and the historical default in this codebase.
+_SSH_KEY_TYPES: tuple[type[PKey], ...] = (Ed25519Key, ECDSAKey, RSAKey)
+
+
+def _load_private_key(pem: str, password: str | None) -> PKey:
+ """Load a private key PEM regardless of algorithm (ed25519, ECDSA, RSA).
+
+ paramiko does not expose a polymorphic ``PKey.from_private_key``; each
+ key class only accepts its own format. Iterate over the supported types
+ and return the first that parses cleanly.
+ """
+ last_exc: Exception | None = None
+ for key_class in _SSH_KEY_TYPES:
+ try:
+ return key_class.from_private_key(StringIO(pem), password=password)
+ except SSHException as exc:
+ last_exc = exc
Review Comment:
**MEDIUM: `PasswordRequiredException` is silently swallowed**
`PasswordRequiredException` inherits `AuthenticationException →
SSHException`, so it's caught here. When a user pastes an encrypted ed25519 key
but leaves the passphrase field blank:
1. `Ed25519Key.from_private_key(..., password=None)` →
`PasswordRequiredException("Private key file is encrypted")` — caught,
`last_exc` set
2. `ECDSAKey.from_private_key(...)` → `SSHException` (type mismatch) —
caught, **overwrites** `last_exc`
3. `RSAKey.from_private_key(..., password=None)` →
`PasswordRequiredException` again — caught, re-set
Final error: `"Unable to parse SSH private key as any of ...: Private key
file is encrypted"` — which is misleading. The key *was* parsed; what's missing
is the passphrase. And whether that message survives depends on what ECDSAKey
raises for the given key, which is fragile.
```python
from paramiko import ECDSAKey, Ed25519Key, PasswordRequiredException, PKey,
RSAKey, SSHException
# in _load_private_key:
except PasswordRequiredException:
raise # don't absorb — it's a distinct, actionable signal
except SSHException as exc:
last_exc = exc
continue
```
##########
tests/unit_tests/extensions/ssh_test.py:
##########
@@ -34,3 +40,100 @@ def test_ssh_tunnel_timeout_setting() -> None:
factory.init_app(app)
assert sshtunnel.TUNNEL_TIMEOUT == 123.0
assert sshtunnel.SSH_TIMEOUT == 321.0
+
+
+def _make_ed25519_pem() -> str:
+ """Generate a fresh OpenSSH-format ed25519 private key PEM."""
+ key = Ed25519PrivateKey.generate()
+ return key.private_bytes(
+ encoding=Encoding.PEM,
+ format=PrivateFormat.OpenSSH,
+ encryption_algorithm=NoEncryption(),
+ ).decode()
+
+
+def test_create_tunnel_accepts_ed25519_private_key() -> None:
+ """
+ Regression for #24180: ed25519 SSH keys must be loadable for tunnel
+ setup. The bug surfaces as ``unpack requires 4 bytes`` because the
+ code unconditionally calls ``RSAKey.from_private_key`` regardless of
+ key type, which mis-parses the ed25519 byte stream.
+
+ The Superset UI accepts any key the user pastes, so the fix is to
+ detect the key type (or use ``paramiko.PKey.from_private_key`` once
+ paramiko exposes a polymorphic loader) rather than hard-coding RSA.
+
+ This test exercises ``create_tunnel`` end-to-end with a freshly
+ generated ed25519 key. It mocks ``sshtunnel.open_tunnel`` so the
+ test does not actually open a network connection — only the key
+ parsing path is exercised.
+ """
+ app = Mock()
+ app.config = {
+ "SSH_TUNNEL_LOCAL_BIND_ADDRESS": "127.0.0.1",
+ "SSH_TUNNEL_TIMEOUT_SEC": 10.0,
+ "SSH_TUNNEL_PACKET_TIMEOUT_SEC": 10.0,
+ }
+ manager = SSHManager(app)
+
+ ssh_tunnel = Mock()
+ ssh_tunnel.server_address = "ssh.example.com"
+ ssh_tunnel.server_port = 22
+ ssh_tunnel.username = "tunneluser"
+ ssh_tunnel.password = None
Review Comment:
**NIT: Inline import inside test function**
```python
from cryptography.hazmat.primitives.asymmetric.rsa import
generate_private_key
```
This is inside the function body while `_make_ed25519_pem` uses a top-level
import. Consider moving to module level (or extracting a `_make_rsa_pem()`
helper to mirror `_make_ed25519_pem()`).
##########
tests/unit_tests/extensions/ssh_test.py:
##########
@@ -34,3 +40,100 @@ def test_ssh_tunnel_timeout_setting() -> None:
factory.init_app(app)
assert sshtunnel.TUNNEL_TIMEOUT == 123.0
assert sshtunnel.SSH_TIMEOUT == 321.0
+
+
+def _make_ed25519_pem() -> str:
+ """Generate a fresh OpenSSH-format ed25519 private key PEM."""
+ key = Ed25519PrivateKey.generate()
+ return key.private_bytes(
+ encoding=Encoding.PEM,
+ format=PrivateFormat.OpenSSH,
+ encryption_algorithm=NoEncryption(),
+ ).decode()
+
+
+def test_create_tunnel_accepts_ed25519_private_key() -> None:
+ """
+ Regression for #24180: ed25519 SSH keys must be loadable for tunnel
+ setup. The bug surfaces as ``unpack requires 4 bytes`` because the
+ code unconditionally calls ``RSAKey.from_private_key`` regardless of
+ key type, which mis-parses the ed25519 byte stream.
+
+ The Superset UI accepts any key the user pastes, so the fix is to
+ detect the key type (or use ``paramiko.PKey.from_private_key`` once
+ paramiko exposes a polymorphic loader) rather than hard-coding RSA.
+
+ This test exercises ``create_tunnel`` end-to-end with a freshly
+ generated ed25519 key. It mocks ``sshtunnel.open_tunnel`` so the
+ test does not actually open a network connection — only the key
+ parsing path is exercised.
+ """
+ app = Mock()
+ app.config = {
+ "SSH_TUNNEL_LOCAL_BIND_ADDRESS": "127.0.0.1",
+ "SSH_TUNNEL_TIMEOUT_SEC": 10.0,
+ "SSH_TUNNEL_PACKET_TIMEOUT_SEC": 10.0,
+ }
+ manager = SSHManager(app)
+
+ ssh_tunnel = Mock()
+ ssh_tunnel.server_address = "ssh.example.com"
+ ssh_tunnel.server_port = 22
+ ssh_tunnel.username = "tunneluser"
+ ssh_tunnel.password = None
+ ssh_tunnel.private_key = _make_ed25519_pem()
+ ssh_tunnel.private_key_password = None
+
+ with patch("superset.extensions.ssh.sshtunnel.open_tunnel") as mock_open:
+ manager.create_tunnel(
+ ssh_tunnel, "postgresql://user:[email protected]:5432/x"
+ )
+
+ # Key-type-agnostic loader must produce a paramiko PKey usable as ssh_pkey.
+ assert mock_open.called, "open_tunnel was never invoked — key parsing
aborted"
+ forwarded_pkey = mock_open.call_args.kwargs["ssh_pkey"]
+ assert forwarded_pkey is not None
+
+
+def test_create_tunnel_accepts_rsa_private_key_unchanged() -> None:
+ """
+ Companion to test_create_tunnel_accepts_ed25519_private_key: pin the
+ backward-compatible RSA path so a fix for ed25519 doesn't regress the
+ historically-supported RSA case. Uses a freshly generated RSA key in
+ OpenSSH format.
+ """
+ from cryptography.hazmat.primitives.asymmetric.rsa import
generate_private_key
+
+ rsa_pem = (
+ generate_private_key(public_exponent=65537, key_size=2048)
+ .private_bytes(
+ encoding=Encoding.PEM,
+ format=PrivateFormat.OpenSSH,
+ encryption_algorithm=NoEncryption(),
+ )
+ .decode()
+ )
+
+ app = Mock()
+ app.config = {
+ "SSH_TUNNEL_LOCAL_BIND_ADDRESS": "127.0.0.1",
+ "SSH_TUNNEL_TIMEOUT_SEC": 10.0,
+ "SSH_TUNNEL_PACKET_TIMEOUT_SEC": 10.0,
+ }
+ manager = SSHManager(app)
+
+ ssh_tunnel = Mock()
+ ssh_tunnel.server_address = "ssh.example.com"
+ ssh_tunnel.server_port = 22
+ ssh_tunnel.username = "tunneluser"
+ ssh_tunnel.password = None
+ ssh_tunnel.private_key = rsa_pem
+ ssh_tunnel.private_key_password = None
+
+ with patch("superset.extensions.ssh.sshtunnel.open_tunnel") as mock_open:
+ manager.create_tunnel(
+ ssh_tunnel, "postgresql://user:[email protected]:5432/x"
+ )
+
+ assert mock_open.called
+ assert mock_open.call_args.kwargs["ssh_pkey"] is not None
Review Comment:
**MEDIUM: No ECDSA test**
`_SSH_KEY_TYPES` declares three types; tests cover two. ECDSAKey has a
distinct code path in paramiko (curve-aware parsing) — a regression there
wouldn't be caught. A third companion test using
`ec.generate_private_key(ec.SECP256R1())` would complete the triangle:
```python
from cryptography.hazmat.primitives.asymmetric import ec
def test_create_tunnel_accepts_ecdsa_private_key() -> None:
ecdsa_pem = (
ec.generate_private_key(ec.SECP256R1())
.private_bytes(Encoding.PEM, PrivateFormat.OpenSSH, NoEncryption())
.decode()
)
# ... same mock pattern as the ed25519 test
```
`ec` is already a transitive dependency via paramiko → cryptography.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]