Anupam Yadav created SPARK-57480:
------------------------------------

             Summary: Spark Thrift Server allowSettingSystemProperties flag is 
bypassable from same JDBC overlay
                 Key: SPARK-57480
                 URL: https://issues.apache.org/jira/browse/SPARK-57480
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.3.0
            Reporter: Anupam Yadav


*Follow-up to SPARK-57441 / 
[#56501|https://github.com/apache/spark/pull/56501].*

h2. Summary

The {{system:*}} setVariable gate added in SPARK-57441 reads its toggle 
({{spark.sql.legacy.hive.thriftServer.allowSettingSystemProperties}}) from the 
same per-session {{HiveConf}} that the JDBC client can mutate from the same 
{{set:}} overlay via {{set:hiveconf:}}. As a result, the default-deny is 
bypassable in a single JDBC session by an unprivileged client.

h2. Repro

A JDBC URL of the form

{noformat}
jdbc:hive2://host:port/db?set:hiveconf:spark.sql.legacy.hive.thriftServer.allowSettingSystemProperties=true;set:system:foo=bar
{noformat}

causes:

# {{configureSession}} -> 
{{setVariable("hiveconf:spark.sql.legacy.hive.thriftServer.allowSettingSystemProperties",
 "true")}} -> {{setConf}} -> {{verifyAndSet}} -- writes {{true}} into the 
per-session {{HiveConf}}. Validation does not reject this because (a) 
{{HiveConf.getConfVars(key)}} returns {{null}} for {{spark.*}} keys, (b) the 
{{key.startsWith("hive.")}} validation fallback does not fire, and (c) 
{{verifyAndSet}} has no default {{restrictList}} / {{modWhiteListPattern}} 
entry for this key.
# {{configureSession}} -> {{setVariable("system:foo", "bar")}} -- the 
{{SYSTEM_PREFIX}} branch reads the toggle from the same {{HiveConf}} 
({{ss.getConf().getBoolean(...)}}), now {{true}}, and proceeds with 
{{System.getProperties().setProperty("foo", "bar")}}.

This negates the SPARK-57441 default-deny.

The caveat is iteration order: this works if {{configureSession}} happens to 
process the {{hiveconf:}} directive before the {{system:}} one. 
{{sessionConfMap}} is {{Map<String, String>}} and iteration order is 
caller-controlled (commonly {{HashMap}} -- non-deterministic, retryable -- or 
{{LinkedHashMap}} -- attacker-controlled).

h2. Proof

Added the following test to {{HiveSessionImplSuite}} and ran it on master at 
SHA {{070d46c5f9e383f8d7a9386abf451be330b1ee49}} (post-SPARK-57441):

{code:scala}
test("BYPASS REPRO: set:hiveconf can flip the legacy flag mid-session") {
  val key = "spark.test.HiveSessionImplSuite.bypassRepro"
  val flagKey = 
StaticSQLConf.LEGACY_HIVE_THRIFT_SERVER_ALLOW_SETTING_SYSTEM_PROPERTIES.key
  try {
    withSystemPropSession(allowSettingSystemProperties = false) {
      // This is what configureSession does for set:hiveconf:<key>=true in a 
JDBC URL.
      val rcHiveconf = HiveSessionImpl.setVariable("hiveconf:" + flagKey, 
"true")
      assert(rcHiveconf == 0, s"hiveconf write rejected: rc=$rcHiveconf")
      // Now the same flag the system: branch reads from has been flipped.
      val rcSystem = HiveSessionImpl.setVariable("system:" + key, "value")
      assert(rcSystem == 0,
        s"BYPASS NOT REPRODUCED: system: write returned rc=$rcSystem (expected 
0 if bypassed)")
      assert(System.getProperty(key) == "value",
        s"BYPASS NOT REPRODUCED: System.getProperty($key) = 
${System.getProperty(key)}")
    }
  } finally {
    System.clearProperty(key)
  }
}
{code}

Test passes -- both {{setVariable}} returns 0 (acceptance), and 
{{System.getProperty(key) == "value"}} -- proving the bypass.

The session was opened with the legacy flag explicitly seeded to {{false}} 
(default-deny posture). The {{set:hiveconf:}} write then flipped it within the 
same session.

Run output:

{noformat}
- BYPASS REPRO: set:hiveconf can flip the legacy flag mid-session
Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
{noformat}

h2. Impact

A remote, low-privilege JDBC client can mutate process-wide JVM system 
properties on the shared Spark Thrift Server. This is the exact threat model 
SPARK-57441 was filed to mitigate. Possible downstream impacts:

- Override {{javax.net.ssl.trustStore}} and {{trustStorePassword}} for outbound 
connections from the server JVM.
- Override {{https.proxyHost}} / {{socksProxyHost}} to redirect outbound 
traffic.
- Mutate {{java.security.policy}}, {{java.security.auth.login.config}}, or any 
other process-global property other sessions or background threads read.

h2. Suggested fixes (in rough order of localness)

# Stash the resolved boolean on a static field at 
{{SparkSQLSessionManager.init}} time, and read from there in {{setVariable}} 
rather than re-reading per-session {{HiveConf}}. Detaches the runtime check 
from per-session state entirely.
# Add the legacy key to a {{HiveConf}} restrict-list entry in 
{{SparkSQLSessionManager.init}} so that per-session {{set:hiveconf:}} writes to 
it throw.
# Special-case-reject 
{{set:hiveconf:spark.sql.legacy.hive.thriftServer.allowSettingSystemProperties}}
 in {{setVariable}}'s {{HIVECONF_PREFIX}} branch (or in {{configureSession}}).

h2. Test gap

The {{HiveSessionImplSuite}} tests added in SPARK-57441 seed {{HiveConf}} 
directly and call {{setVariable}} directly, not via {{configureSession}}. They 
do not cover the {{set:hiveconf:}} write-through path. Recommend adding a 
regression test that exercises a multi-{{set:}} directive sequence.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to