Anupam Yadav created SPARK-57480:
------------------------------------
Summary: Spark Thrift Server allowSettingSystemProperties flag is
bypassable from same JDBC overlay
Key: SPARK-57480
URL: https://issues.apache.org/jira/browse/SPARK-57480
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 4.3.0
Reporter: Anupam Yadav
*Follow-up to SPARK-57441 /
[#56501|https://github.com/apache/spark/pull/56501].*
h2. Summary
The {{system:*}} setVariable gate added in SPARK-57441 reads its toggle
({{spark.sql.legacy.hive.thriftServer.allowSettingSystemProperties}}) from the
same per-session {{HiveConf}} that the JDBC client can mutate from the same
{{set:}} overlay via {{set:hiveconf:}}. As a result, the default-deny is
bypassable in a single JDBC session by an unprivileged client.
h2. Repro
A JDBC URL of the form
{noformat}
jdbc:hive2://host:port/db?set:hiveconf:spark.sql.legacy.hive.thriftServer.allowSettingSystemProperties=true;set:system:foo=bar
{noformat}
causes:
# {{configureSession}} ->
{{setVariable("hiveconf:spark.sql.legacy.hive.thriftServer.allowSettingSystemProperties",
"true")}} -> {{setConf}} -> {{verifyAndSet}} -- writes {{true}} into the
per-session {{HiveConf}}. Validation does not reject this because (a)
{{HiveConf.getConfVars(key)}} returns {{null}} for {{spark.*}} keys, (b) the
{{key.startsWith("hive.")}} validation fallback does not fire, and (c)
{{verifyAndSet}} has no default {{restrictList}} / {{modWhiteListPattern}}
entry for this key.
# {{configureSession}} -> {{setVariable("system:foo", "bar")}} -- the
{{SYSTEM_PREFIX}} branch reads the toggle from the same {{HiveConf}}
({{ss.getConf().getBoolean(...)}}), now {{true}}, and proceeds with
{{System.getProperties().setProperty("foo", "bar")}}.
This negates the SPARK-57441 default-deny.
The caveat is iteration order: this works if {{configureSession}} happens to
process the {{hiveconf:}} directive before the {{system:}} one.
{{sessionConfMap}} is {{Map<String, String>}} and iteration order is
caller-controlled (commonly {{HashMap}} -- non-deterministic, retryable -- or
{{LinkedHashMap}} -- attacker-controlled).
h2. Proof
Added the following test to {{HiveSessionImplSuite}} and ran it on master at
SHA {{070d46c5f9e383f8d7a9386abf451be330b1ee49}} (post-SPARK-57441):
{code:scala}
test("BYPASS REPRO: set:hiveconf can flip the legacy flag mid-session") {
val key = "spark.test.HiveSessionImplSuite.bypassRepro"
val flagKey =
StaticSQLConf.LEGACY_HIVE_THRIFT_SERVER_ALLOW_SETTING_SYSTEM_PROPERTIES.key
try {
withSystemPropSession(allowSettingSystemProperties = false) {
// This is what configureSession does for set:hiveconf:<key>=true in a
JDBC URL.
val rcHiveconf = HiveSessionImpl.setVariable("hiveconf:" + flagKey,
"true")
assert(rcHiveconf == 0, s"hiveconf write rejected: rc=$rcHiveconf")
// Now the same flag the system: branch reads from has been flipped.
val rcSystem = HiveSessionImpl.setVariable("system:" + key, "value")
assert(rcSystem == 0,
s"BYPASS NOT REPRODUCED: system: write returned rc=$rcSystem (expected
0 if bypassed)")
assert(System.getProperty(key) == "value",
s"BYPASS NOT REPRODUCED: System.getProperty($key) =
${System.getProperty(key)}")
}
} finally {
System.clearProperty(key)
}
}
{code}
Test passes -- both {{setVariable}} returns 0 (acceptance), and
{{System.getProperty(key) == "value"}} -- proving the bypass.
The session was opened with the legacy flag explicitly seeded to {{false}}
(default-deny posture). The {{set:hiveconf:}} write then flipped it within the
same session.
Run output:
{noformat}
- BYPASS REPRO: set:hiveconf can flip the legacy flag mid-session
Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
{noformat}
h2. Impact
A remote, low-privilege JDBC client can mutate process-wide JVM system
properties on the shared Spark Thrift Server. This is the exact threat model
SPARK-57441 was filed to mitigate. Possible downstream impacts:
- Override {{javax.net.ssl.trustStore}} and {{trustStorePassword}} for outbound
connections from the server JVM.
- Override {{https.proxyHost}} / {{socksProxyHost}} to redirect outbound
traffic.
- Mutate {{java.security.policy}}, {{java.security.auth.login.config}}, or any
other process-global property other sessions or background threads read.
h2. Suggested fixes (in rough order of localness)
# Stash the resolved boolean on a static field at
{{SparkSQLSessionManager.init}} time, and read from there in {{setVariable}}
rather than re-reading per-session {{HiveConf}}. Detaches the runtime check
from per-session state entirely.
# Add the legacy key to a {{HiveConf}} restrict-list entry in
{{SparkSQLSessionManager.init}} so that per-session {{set:hiveconf:}} writes to
it throw.
# Special-case-reject
{{set:hiveconf:spark.sql.legacy.hive.thriftServer.allowSettingSystemProperties}}
in {{setVariable}}'s {{HIVECONF_PREFIX}} branch (or in {{configureSession}}).
h2. Test gap
The {{HiveSessionImplSuite}} tests added in SPARK-57441 seed {{HiveConf}}
directly and call {{setVariable}} directly, not via {{configureSession}}. They
do not cover the {{set:hiveconf:}} write-through path. Recommend adding a
regression test that exercises a multi-{{set:}} directive sequence.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]