[
https://issues.apache.org/jira/browse/SPARK-49910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ángel Álvarez Pascua updated SPARK-49910:
-----------------------------------------
Attachment: HiveMetaStoreClient.java
> spark TLS connection (+ kerberos) to hive metastore
> ---------------------------------------------------
>
> Key: SPARK-49910
> URL: https://issues.apache.org/jira/browse/SPARK-49910
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.5.2
> Environment: spark: 3.5.2_scala2.12
> hadoop: 3.3.6
> iceberg: 1.6.0
> hive: 4.0.0
>
> spark and HMS java version:
>
> openjdk version "11.0.24" 2024-07-16
> OpenJDK Runtime Environment Temurin-11.0.24+8 (build 11.0.24+8)
> OpenJDK 64-Bit Server VM Temurin-11.0.24+8 (build 11.0.24+8, mixed mode,
> sharing)
> Reporter: Stefano Bovina
> Priority: Major
> Attachments: HiveMetaStoreClient.java
>
>
> Hi,
> we are trying to configure an integration between trino, spark and hive
> metastore (HMS) in a secure way.
>
> Hive metastore has already been configured in order to use kerberos and TLS.
> Trino has already been configured in order to connect to HMS using TLS and
> kerberos.
>
> Trying to do the same for spark (connecting it to HMS using TLS and kerberos),
> we hit a problem with the TLS connection: if we configure spark with kerberos
> and a plain connection to HMS (reconfiguring HMS too) it works, but if we
> enable TLS on both sides, spark is not able to connect.
>
> The error on HMS is the following: "Caused by: javax.net.ssl.SSLException:
> Unsupported or unrecognized SSL message" and indeed connections initiated by
> spark are always plain.
>
> The test matrix is the following:
> # hive (kerberos + ssl) + spark (kerberos + ssl) --> does not work
> # hive (kerberos + plain) + spark (kerberos + plain) --> works
> # hive (ssl) + spark (ssl) --> works
>
> While doing "test 1", I also used tcpdump to figure out whether spark was
> trying to start an ssl or a plain connection to hive, and from what I'm seeing
> spark completely ignores the following parameters and keeps trying to open a
> plain connection:
> {code:java}
> spark.hive.metastore.use.SSL true
> spark.hive.metastore.truststore.path /opt/spark/ssl/cert.jks
> spark.hive.metastore.truststore.password mypassword{code}
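For reference, as far as I know Spark copies every `spark.hive.*` property into the Hadoop/Hive configuration with the `spark.` prefix removed, so the three settings above should reach the metastore client as `hive.metastore.*` keys. A minimal Python sketch of that renaming follows. The helper name `to_hive_keys` is hypothetical, and this only illustrates the mapping; it is not Spark's implementation.

```python
# Hypothetical helper: mimics the assumed "spark.hive.foo" -> "hive.foo" renaming.
def to_hive_keys(spark_conf):
    """Return only the spark.hive.* entries, with the leading 'spark.' stripped."""
    prefix = "spark."
    return {
        key[len(prefix):]: value
        for key, value in spark_conf.items()
        if key.startswith("spark.hive.")
    }


# The three properties quoted in the report, plus one unrelated key.
SPARK_DEFAULTS = {
    "spark.hive.metastore.use.SSL": "true",
    "spark.hive.metastore.truststore.path": "/opt/spark/ssl/cert.jks",
    "spark.hive.metastore.truststore.password": "mypassword",
    "spark.app.name": "example",  # not a spark.hive.* key, so it is dropped
}

HIVE_KEYS = to_hive_keys(SPARK_DEFAULTS)
```

Under that assumption, `HIVE_KEYS` holds `hive.metastore.use.SSL`, `hive.metastore.truststore.path` and `hive.metastore.truststore.password`, which are the names to look for in the effective configuration.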
> If I enable both kerberos and ssl (test 1), the hive ssl related
> configurations in spark-defaults seem to be ignored and spark always tries to
> open a plain connection. For example, if I set
> "spark.hive.metastore.truststore.password" to "wrongpassword", the error
> "Password verification failed" should be raised, but nothing happens.
>
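One way to double-check a capture like the tcpdump one described above is to look at the first bytes the client sends: a TLS connection opens with a handshake record (content type 0x16, major version 0x03) that carries a ClientHello (handshake type 0x01). A rough Python sketch follows. `looks_like_tls_client_hello` is a hypothetical helper, and both sample payloads are made up for illustration rather than taken from the capture.

```python
def looks_like_tls_client_hello(payload: bytes) -> bool:
    """Heuristic check on the first bytes a client sends on a new connection."""
    if len(payload) < 6:
        return False
    is_handshake_record = payload[0] == 0x16                    # TLS record type: handshake
    is_tls_version = payload[1] == 0x03 and payload[2] <= 0x04  # record-layer version 3.x
    is_client_hello = payload[5] == 0x01                        # handshake type: ClientHello
    return is_handshake_record and is_tls_version and is_client_hello


# Illustrative start of a TLS ClientHello record (header bytes only).
TLS_SAMPLE = bytes.fromhex("16030100c8010000c40303")

# Assumed shape of a plain Thrift SASL start frame (status byte, length, mechanism name).
PLAIN_SAMPLE = b"\x01\x00\x00\x00\x06GSSAPI"
```

If the first client payload in the capture fails this check, the connection was opened in plain text, which would match the "Unsupported or unrecognized SSL message" error seen on the HMS side.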
> spark conf: [https://gist.github.com/bovy89/83cbe3b9cd7a318fa9fd35355d5801fc]
> pyspark logs:
> [https://gist.github.com/bovy89/a06f0aa4a54f454fea9e0d6ff148cfc5#file-pyspark-log]
> pyspark debug logs:
> [https://gist.github.com/bovy89/e6a9eeca389f05ff7bea78f807ce5714]
> hive metastore logs:
> [https://gist.github.com/bovy89/a06f0aa4a54f454fea9e0d6ff148cfc5#file-hive-metastore-log]