cloud-fan commented on PR #56630: URL: https://github.com/apache/spark/pull/56630#issuecomment-4761241943
Heads-up that the latest commit is a **design pivot**, not just a fix, so it's worth a fresh look rather than a line-diff review. The original commit tried to *find and strip* each place a credential could hide (userinfo authority + `?`/`;` properties). As @MaxGekk's review showed, that's a denylist that has to chase every driver's URL grammar (Oracle Thin being the first miss). The rework inverts it to an **allowlist**: keep only the `jdbc:<subprotocol>:` prefix — which can't contain credentials — and redact the whole subname. This resolves the review thread wholesale: - **Oracle Thin (blocking):** redacted, no `//`-authority assumption anywhere. - **`?`/`#`-in-password partial leak:** gone; no userinfo parsing. - **`regex`-on-top coverage:** added a `Some(regex)` test. - **Doc accuracy:** scaladoc now states only the driver type survives. Trade-off, called out in the PR description: messages now show only the engine type (e.g. `jdbc:mysql:*********(redacted)`), not host/database. That's intentional, on the "worst case is showing no URL at all" principle — `FAILED_JDBC.*` still carries the driver exception/SQLState for triage. Happy to dial this back to also surface the host if reviewers feel the lost signal outweighs the simplicity. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
