aminghadersohi commented on code in PR #40139:
URL: https://github.com/apache/superset/pull/40139#discussion_r3276892222


##########
superset/extensions/ssh.py:
##########
@@ -30,6 +30,32 @@
 if TYPE_CHECKING:
     from superset.databases.ssh_tunnel.models import SSHTunnel
 
+# Order matters: paramiko's per-class loaders raise SSHException with vague
+# "unpack requires 4 bytes" messages on type mismatches, so we try the more
+# modern key types first (ed25519, ECDSA) and fall back to RSA, which is the
+# most permissive parser and the historical default in this codebase.
+_SSH_KEY_TYPES: tuple[type[PKey], ...] = (Ed25519Key, ECDSAKey, RSAKey)
+
+
+def _load_private_key(pem: str, password: str | None) -> PKey:
+    """Load a private key PEM regardless of algorithm (ed25519, ECDSA, RSA).
+
+    paramiko does not expose a polymorphic ``PKey.from_private_key``; each
+    key class only accepts its own format. Iterate over the supported types
+    and return the first that parses cleanly.
+    """
+    last_exc: Exception | None = None
+    for key_class in _SSH_KEY_TYPES:
+        try:
+            return key_class.from_private_key(StringIO(pem), password=password)
+        except SSHException as exc:
+            last_exc = exc
+            continue
+    raise SSHException(
+        "Unable to parse SSH private key as any of "
+        f"{', '.join(k.__name__ for k in _SSH_KEY_TYPES)}: {last_exc}"
+    )

Review Comment:
   **HIGH: Missing exception chaining**
   
   This `raise` should use `from last_exc`:
   ```python
   raise SSHException(
       "Unable to parse SSH private key as any of "
       f"{', '.join(k.__name__ for k in _SSH_KEY_TYPES)}: {last_exc}"
   ) from last_exc
   ```
   Without it Python discards the `__context__` chain, so Sentry (and any 
`raise`-from-`raise` traceback) shows only this outer `SSHException` — the 
stack frame inside paramiko where the parse actually failed is gone. One word 
fixes it.



##########
superset/extensions/ssh.py:
##########
@@ -30,6 +30,32 @@
 if TYPE_CHECKING:
     from superset.databases.ssh_tunnel.models import SSHTunnel
 
+# Order matters: paramiko's per-class loaders raise SSHException with vague
+# "unpack requires 4 bytes" messages on type mismatches, so we try the more
+# modern key types first (ed25519, ECDSA) and fall back to RSA, which is the
+# most permissive parser and the historical default in this codebase.
+_SSH_KEY_TYPES: tuple[type[PKey], ...] = (Ed25519Key, ECDSAKey, RSAKey)
+
+
+def _load_private_key(pem: str, password: str | None) -> PKey:
+    """Load a private key PEM regardless of algorithm (ed25519, ECDSA, RSA).
+
+    paramiko does not expose a polymorphic ``PKey.from_private_key``; each
+    key class only accepts its own format. Iterate over the supported types
+    and return the first that parses cleanly.

Review Comment:
   **NIT: Docstring slightly inaccurate about paramiko's API**
   
   > "paramiko does not expose a polymorphic `PKey.from_private_key`"
   
   Paramiko 3.2+ does have `PKey.from_path()` — a static method that uses 
`cryptography`'s `load_ssh_private_key()` internally and dispatches by type. 
It's not usable here because it requires a filesystem path; writing 
`ssh_tunnel.private_key` to disk first would expose key material. The comment 
is worth clarifying:
   
   ```python
   # paramiko 3.2+ has PKey.from_path() for polymorphic loading, but it requires
   # a file path; writing private key material to disk would be a security 
regression.
   # Use per-class loaders on the in-memory StringIO instead.
   ```



##########
superset/extensions/ssh.py:
##########
@@ -30,6 +30,32 @@
 if TYPE_CHECKING:
     from superset.databases.ssh_tunnel.models import SSHTunnel
 
+# Order matters: paramiko's per-class loaders raise SSHException with vague
+# "unpack requires 4 bytes" messages on type mismatches, so we try the more
+# modern key types first (ed25519, ECDSA) and fall back to RSA, which is the
+# most permissive parser and the historical default in this codebase.
+_SSH_KEY_TYPES: tuple[type[PKey], ...] = (Ed25519Key, ECDSAKey, RSAKey)
+
+
+def _load_private_key(pem: str, password: str | None) -> PKey:
+    """Load a private key PEM regardless of algorithm (ed25519, ECDSA, RSA).
+
+    paramiko does not expose a polymorphic ``PKey.from_private_key``; each
+    key class only accepts its own format. Iterate over the supported types
+    and return the first that parses cleanly.
+    """
+    last_exc: Exception | None = None
+    for key_class in _SSH_KEY_TYPES:
+        try:
+            return key_class.from_private_key(StringIO(pem), password=password)
+        except SSHException as exc:
+            last_exc = exc
+            continue
+    raise SSHException(
+        "Unable to parse SSH private key as any of "
+        f"{', '.join(k.__name__ for k in _SSH_KEY_TYPES)}: {last_exc}"

Review Comment:
   **MEDIUM: `last_exc` always reflects RSAKey, not the closest-matching type**
   
   Because `RSAKey` is last in `_SSH_KEY_TYPES`, `last_exc` always carries the 
RSAKey parse error — even when the key is actually ed25519. If a user's ed25519 
key is corrupted (not a passphrase issue), they see a RSAKey-specific message 
rather than the Ed25519Key one. The most useful error comes from the type that 
matched the PEM header, but that's not tracked here.
   
   At minimum a `# NOTE: last_exc reflects the final RSAKey attempt` comment 
would prevent future surprise. A fuller fix would collect per-type errors and 
surface them all.



##########
superset/extensions/ssh.py:
##########
@@ -30,6 +30,32 @@
 if TYPE_CHECKING:
     from superset.databases.ssh_tunnel.models import SSHTunnel
 
+# Order matters: paramiko's per-class loaders raise SSHException with vague
+# "unpack requires 4 bytes" messages on type mismatches, so we try the more
+# modern key types first (ed25519, ECDSA) and fall back to RSA, which is the
+# most permissive parser and the historical default in this codebase.
+_SSH_KEY_TYPES: tuple[type[PKey], ...] = (Ed25519Key, ECDSAKey, RSAKey)
+
+
+def _load_private_key(pem: str, password: str | None) -> PKey:
+    """Load a private key PEM regardless of algorithm (ed25519, ECDSA, RSA).
+
+    paramiko does not expose a polymorphic ``PKey.from_private_key``; each
+    key class only accepts its own format. Iterate over the supported types
+    and return the first that parses cleanly.
+    """
+    last_exc: Exception | None = None
+    for key_class in _SSH_KEY_TYPES:
+        try:
+            return key_class.from_private_key(StringIO(pem), password=password)
+        except SSHException as exc:
+            last_exc = exc

Review Comment:
   **MEDIUM: `PasswordRequiredException` is silently swallowed**
   
   `PasswordRequiredException` inherits `AuthenticationException → 
SSHException`, so it's caught here. When a user pastes an encrypted ed25519 key 
but leaves the passphrase field blank:
   
   1. `Ed25519Key.from_private_key(..., password=None)` → 
`PasswordRequiredException("Private key file is encrypted")` — caught, 
`last_exc` set
   2. `ECDSAKey.from_private_key(...)` → `SSHException` (type mismatch) — 
caught, **overwrites** `last_exc`
   3. `RSAKey.from_private_key(..., password=None)` → 
`PasswordRequiredException` again — caught, re-set
   
   Final error: `"Unable to parse SSH private key as any of ...: Private key 
file is encrypted"` — which is misleading. The key *was* parsed; what's missing 
is the passphrase. And whether that message survives depends on what ECDSAKey 
raises for the given key, which is fragile.
   
   ```python
   from paramiko import ECDSAKey, Ed25519Key, PasswordRequiredException, PKey, 
RSAKey, SSHException
   
   # in _load_private_key:
           except PasswordRequiredException:
               raise   # don't absorb — it's a distinct, actionable signal
           except SSHException as exc:
               last_exc = exc
               continue
   ```



##########
tests/unit_tests/extensions/ssh_test.py:
##########
@@ -34,3 +40,100 @@ def test_ssh_tunnel_timeout_setting() -> None:
     factory.init_app(app)
     assert sshtunnel.TUNNEL_TIMEOUT == 123.0
     assert sshtunnel.SSH_TIMEOUT == 321.0
+
+
+def _make_ed25519_pem() -> str:
+    """Generate a fresh OpenSSH-format ed25519 private key PEM."""
+    key = Ed25519PrivateKey.generate()
+    return key.private_bytes(
+        encoding=Encoding.PEM,
+        format=PrivateFormat.OpenSSH,
+        encryption_algorithm=NoEncryption(),
+    ).decode()
+
+
+def test_create_tunnel_accepts_ed25519_private_key() -> None:
+    """
+    Regression for #24180: ed25519 SSH keys must be loadable for tunnel
+    setup. The bug surfaces as ``unpack requires 4 bytes`` because the
+    code unconditionally calls ``RSAKey.from_private_key`` regardless of
+    key type, which mis-parses the ed25519 byte stream.
+
+    The Superset UI accepts any key the user pastes, so the fix is to
+    detect the key type (or use ``paramiko.PKey.from_private_key`` once
+    paramiko exposes a polymorphic loader) rather than hard-coding RSA.
+
+    This test exercises ``create_tunnel`` end-to-end with a freshly
+    generated ed25519 key. It mocks ``sshtunnel.open_tunnel`` so the
+    test does not actually open a network connection — only the key
+    parsing path is exercised.
+    """
+    app = Mock()
+    app.config = {
+        "SSH_TUNNEL_LOCAL_BIND_ADDRESS": "127.0.0.1",
+        "SSH_TUNNEL_TIMEOUT_SEC": 10.0,
+        "SSH_TUNNEL_PACKET_TIMEOUT_SEC": 10.0,
+    }
+    manager = SSHManager(app)
+
+    ssh_tunnel = Mock()
+    ssh_tunnel.server_address = "ssh.example.com"
+    ssh_tunnel.server_port = 22
+    ssh_tunnel.username = "tunneluser"
+    ssh_tunnel.password = None

Review Comment:
   **NIT: Inline import inside test function**
   
   ```python
   from cryptography.hazmat.primitives.asymmetric.rsa import 
generate_private_key
   ```
   
   This is inside the function body while `_make_ed25519_pem` uses a top-level 
import. Consider moving to module level (or extracting a `_make_rsa_pem()` 
helper to mirror `_make_ed25519_pem()`).



##########
tests/unit_tests/extensions/ssh_test.py:
##########
@@ -34,3 +40,100 @@ def test_ssh_tunnel_timeout_setting() -> None:
     factory.init_app(app)
     assert sshtunnel.TUNNEL_TIMEOUT == 123.0
     assert sshtunnel.SSH_TIMEOUT == 321.0
+
+
+def _make_ed25519_pem() -> str:
+    """Generate a fresh OpenSSH-format ed25519 private key PEM."""
+    key = Ed25519PrivateKey.generate()
+    return key.private_bytes(
+        encoding=Encoding.PEM,
+        format=PrivateFormat.OpenSSH,
+        encryption_algorithm=NoEncryption(),
+    ).decode()
+
+
+def test_create_tunnel_accepts_ed25519_private_key() -> None:
+    """
+    Regression for #24180: ed25519 SSH keys must be loadable for tunnel
+    setup. The bug surfaces as ``unpack requires 4 bytes`` because the
+    code unconditionally calls ``RSAKey.from_private_key`` regardless of
+    key type, which mis-parses the ed25519 byte stream.
+
+    The Superset UI accepts any key the user pastes, so the fix is to
+    detect the key type (or use ``paramiko.PKey.from_private_key`` once
+    paramiko exposes a polymorphic loader) rather than hard-coding RSA.
+
+    This test exercises ``create_tunnel`` end-to-end with a freshly
+    generated ed25519 key. It mocks ``sshtunnel.open_tunnel`` so the
+    test does not actually open a network connection — only the key
+    parsing path is exercised.
+    """
+    app = Mock()
+    app.config = {
+        "SSH_TUNNEL_LOCAL_BIND_ADDRESS": "127.0.0.1",
+        "SSH_TUNNEL_TIMEOUT_SEC": 10.0,
+        "SSH_TUNNEL_PACKET_TIMEOUT_SEC": 10.0,
+    }
+    manager = SSHManager(app)
+
+    ssh_tunnel = Mock()
+    ssh_tunnel.server_address = "ssh.example.com"
+    ssh_tunnel.server_port = 22
+    ssh_tunnel.username = "tunneluser"
+    ssh_tunnel.password = None
+    ssh_tunnel.private_key = _make_ed25519_pem()
+    ssh_tunnel.private_key_password = None
+
+    with patch("superset.extensions.ssh.sshtunnel.open_tunnel") as mock_open:
+        manager.create_tunnel(
+            ssh_tunnel, "postgresql://user:[email protected]:5432/x"
+        )
+
+    # Key-type-agnostic loader must produce a paramiko PKey usable as ssh_pkey.
+    assert mock_open.called, "open_tunnel was never invoked — key parsing 
aborted"
+    forwarded_pkey = mock_open.call_args.kwargs["ssh_pkey"]
+    assert forwarded_pkey is not None
+
+
+def test_create_tunnel_accepts_rsa_private_key_unchanged() -> None:
+    """
+    Companion to test_create_tunnel_accepts_ed25519_private_key: pin the
+    backward-compatible RSA path so a fix for ed25519 doesn't regress the
+    historically-supported RSA case. Uses a freshly generated RSA key in
+    OpenSSH format.
+    """
+    from cryptography.hazmat.primitives.asymmetric.rsa import 
generate_private_key
+
+    rsa_pem = (
+        generate_private_key(public_exponent=65537, key_size=2048)
+        .private_bytes(
+            encoding=Encoding.PEM,
+            format=PrivateFormat.OpenSSH,
+            encryption_algorithm=NoEncryption(),
+        )
+        .decode()
+    )
+
+    app = Mock()
+    app.config = {
+        "SSH_TUNNEL_LOCAL_BIND_ADDRESS": "127.0.0.1",
+        "SSH_TUNNEL_TIMEOUT_SEC": 10.0,
+        "SSH_TUNNEL_PACKET_TIMEOUT_SEC": 10.0,
+    }
+    manager = SSHManager(app)
+
+    ssh_tunnel = Mock()
+    ssh_tunnel.server_address = "ssh.example.com"
+    ssh_tunnel.server_port = 22
+    ssh_tunnel.username = "tunneluser"
+    ssh_tunnel.password = None
+    ssh_tunnel.private_key = rsa_pem
+    ssh_tunnel.private_key_password = None
+
+    with patch("superset.extensions.ssh.sshtunnel.open_tunnel") as mock_open:
+        manager.create_tunnel(
+            ssh_tunnel, "postgresql://user:[email protected]:5432/x"
+        )
+
+    assert mock_open.called
+    assert mock_open.call_args.kwargs["ssh_pkey"] is not None

Review Comment:
   **MEDIUM: No ECDSA test**
   
   `_SSH_KEY_TYPES` declares three types; tests cover two. ECDSAKey has a 
distinct code path in paramiko (curve-aware parsing) — a regression there 
wouldn't be caught. A third companion test using 
`ec.generate_private_key(ec.SECP256R1())` would complete the triangle:
   
   ```python
   from cryptography.hazmat.primitives.asymmetric import ec
   
   def test_create_tunnel_accepts_ecdsa_private_key() -> None:
       ecdsa_pem = (
           ec.generate_private_key(ec.SECP256R1())
           .private_bytes(Encoding.PEM, PrivateFormat.OpenSSH, NoEncryption())
           .decode()
       )
       # ... same mock pattern as the ed25519 test
   ```
   
   `ec` is already a transitive dependency via paramiko → cryptography.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to