(spark) branch branch-4.0 updated: [SPARK-51029][BUILD] Remove `hive-llap-common` compile dependency

dongjoon Wed, 29 Jan 2025 09:35:23 -0800

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-4.0 by this push:
     new 2004df751223 [SPARK-51029][BUILD] Remove `hive-llap-common` compile dependency
2004df751223 is described below

commit 2004df751223b421d9612321f9c6facace25c764
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Wed Jan 29 09:33:19 2025 -0800

    [SPARK-51029][BUILD] Remove `hive-llap-common` compile dependency
    
    ### What changes were proposed in this pull request?
    
    This PR aims to remove the `hive-llap-common` compile dependency from Apache Spark.
    
    ### Why are the changes needed?
    
    Technically, Apache Spark is not using this jar. We had better exclude it from the Apache Spark distribution in order to mitigate security concerns.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, this is the removal of a dependency, which may affect existing Hive UDF jars. Users can add the `hive-llap-common` library to their class path at their own risk, as with other third-party libraries.
    
    The migration guide is updated.
    
    ### How was this patch tested?
    
    Pass the CIs.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #49725 from dongjoon-hyun/SPARK-51029.
    
    Authored-by: Dongjoon Hyun <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
    (cherry picked from commit 339b036594d9fe87346a25c0a7d87173d7fc632d)
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 1 -
 docs/sql-migration-guide.md           | 1 +
 pom.xml                               | 2 +-
 3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 96d5f9d47714..ca52760a3368 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -86,7 +86,6 @@ hive-cli/2.3.10//hive-cli-2.3.10.jar
 hive-common/2.3.10//hive-common-2.3.10.jar
 hive-exec/2.3.10/core/hive-exec-2.3.10-core.jar
 hive-jdbc/2.3.10//hive-jdbc-2.3.10.jar
-hive-llap-common/2.3.10//hive-llap-common-2.3.10.jar
 hive-metastore/2.3.10//hive-metastore-2.3.10.jar
 hive-serde/2.3.10//hive-serde-2.3.10.jar
 hive-service-rpc/4.0.0//hive-service-rpc-4.0.0.jar
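
The manifest hunk above drops a single `artifact/version/classifier/jar` line. As a minimal sketch, assuming every manifest line has exactly those four slash-separated fields (as the lines in this hunk do), the following Python checks whether an artifact is still listed. `DepEntry`, `parse_dep_line`, `bundled_artifacts`, and `SAMPLE_AFTER` are hypothetical names for illustration only and are not part of Spark's build tooling.

```python
from typing import NamedTuple, Optional, Set


class DepEntry(NamedTuple):
    """One line of a deps manifest (hypothetical helper type)."""
    artifact: str
    version: str
    classifier: str  # empty string when the jar has no classifier
    jar: str


def parse_dep_line(line: str) -> Optional[DepEntry]:
    """Parse one `artifact/version/classifier/jar` line; return None otherwise."""
    parts = line.strip().split("/")
    if len(parts) != 4:
        return None  # blank line or a shape this sketch does not handle
    return DepEntry(*parts)


def bundled_artifacts(manifest_text: str) -> Set[str]:
    """Collect the artifact names listed in a manifest."""
    entries = (parse_dep_line(line) for line in manifest_text.splitlines())
    return {entry.artifact for entry in entries if entry is not None}


# A few lines copied from the post-change side of the hunk above.
SAMPLE_AFTER = """\
hive-common/2.3.10//hive-common-2.3.10.jar
hive-exec/2.3.10/core/hive-exec-2.3.10-core.jar
hive-jdbc/2.3.10//hive-jdbc-2.3.10.jar
hive-metastore/2.3.10//hive-metastore-2.3.10.jar
"""

if __name__ == "__main__":
    print("hive-llap-common" in bundled_artifacts(SAMPLE_AFTER))
```

A check like this could be run against a copy of `dev/deps/spark-deps-hadoop-3-hive-2.3` to confirm that a given distribution no longer lists the jar.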
diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 254c54a414a7..f459a88d8e14 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -31,6 +31,7 @@ license: |
 - Since Spark 4.0, any read of SQL tables takes into consideration the SQL configs `spark.sql.files.ignoreCorruptFiles`/`spark.sql.files.ignoreMissingFiles` instead of the core config `spark.files.ignoreCorruptFiles`/`spark.files.ignoreMissingFiles`.
 - Since Spark 4.0, when reading SQL tables hits `org.apache.hadoop.security.AccessControlException` and `org.apache.hadoop.hdfs.BlockMissingException`, the exception will be thrown and fail the task, even if `spark.sql.files.ignoreCorruptFiles` is set to `true`.
 - Since Spark 4.0, `spark.sql.hive.metastore` drops the support of Hive prior to 2.0.0 as they require JDK 8 that Spark does not support anymore. Users should migrate to higher versions.
+- Since Spark 4.0, Spark removes `hive-llap-common` dependency. To restore the previous behavior, add `hive-llap-common` jar to the class path.
 - Since Spark 4.0, `spark.sql.parquet.compression.codec` drops the support of codec name `lz4raw`, please use `lz4_raw` instead.
 - Since Spark 4.0, when overflowing during casting timestamp to byte/short/int under non-ansi mode, Spark will return null instead a wrapping value.
 - Since Spark 4.0, the `encode()` and `decode()` functions support only the following charsets 'US-ASCII', 'ISO-8859-1', 'UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-16', 'UTF-32'. To restore the previous behavior when the function accepts charsets of the current JDK used by Spark, set `spark.sql.legacy.javaCharsets` to `true`.
diff --git a/pom.xml b/pom.xml
index 0f4c1d5bd5cb..59c473f4e69a 100644
--- a/pom.xml
+++ b/pom.xml
@@ -274,7 +274,7 @@
     <hive.storage.scope>compile</hive.storage.scope>
     <hive.jackson.scope>compile</hive.jackson.scope>
     <hive.common.scope>compile</hive.common.scope>
-    <hive.llap.scope>compile</hive.llap.scope>
+    <hive.llap.scope>test</hive.llap.scope>
     <hive.serde.scope>compile</hive.serde.scope>
     <hive.shims.scope>compile</hive.shims.scope>
     <orc.deps.scope>compile</orc.deps.scope>
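
With `hive.llap.scope` set to `test`, the jar is no longer shipped in the distribution, so jobs that depend on it must supply it themselves. The following is a minimal sketch of one way to do that, by assembling a `spark-submit` command line that uses the standard `--packages` and `--jars` options. The helper `spark_submit_args` and the application path `my_hive_udf_job.py` are hypothetical. The version in the Maven coordinate is taken from the manifest entry removed by this commit, and the group ID `org.apache.hive` is an assumption that should be checked against the repository you resolve from.

```python
import shlex
from typing import List, Sequence

# Assumed Maven coordinate for the removed jar. The version mirrors the 2.3.10
# entry dropped from dev/deps/spark-deps-hadoop-3-hive-2.3; the group ID is an
# assumption to verify before use.
HIVE_LLAP_COMMON = "org.apache.hive:hive-llap-common:2.3.10"


def spark_submit_args(
    app: str,
    packages: Sequence[str] = (HIVE_LLAP_COMMON,),
    local_jars: Sequence[str] = (),
) -> List[str]:
    """Build a spark-submit argument list that adds extra jars to the class path.

    `app` and `local_jars` are caller-supplied paths (hypothetical here).
    """
    args = ["spark-submit"]
    if packages:
        # --packages takes a comma-separated list of Maven coordinates.
        args += ["--packages", ",".join(packages)]
    if local_jars:
        # --jars takes a comma-separated list of jar paths.
        args += ["--jars", ",".join(local_jars)]
    args.append(app)
    return args


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(spark_submit_args("my_hive_udf_job.py")))
```

Passing a locally vetted copy of the jar through `local_jars` (with `packages=()`) avoids resolving it from a remote repository, which may be preferable given that the commit message cites security concerns as the reason for the removal.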

