Teddy Choi created HIVE-29507:
---------------------------------

             Summary: Create a slim hive-iceberg-handler core JAR
                 Key: HIVE-29507
                 URL: https://issues.apache.org/jira/browse/HIVE-29507
             Project: Hive
          Issue Type: New Feature
          Components: Iceberg integration
            Reporter: Teddy Choi
            Assignee: Teddy Choi


h1. Issue
{quote}Starting from 1.8.0 Iceberg doesn't release Hive runtime connector. For 
Hive query engine integration (specifically with Hive 2.x and 3.x) use Hive 
runtime connector coming with Iceberg 1.6.1, or use Hive 4.0.0 or later which 
is released with embedded Iceberg integration.

[https://iceberg.apache.org/docs/latest/hive/#feature-support]
{quote}
For Hive 3.x with Iceberg 1.8+, a slim {{hive-iceberg-handler-core.jar}} file 
that contains no shaded classes is required. Apache Spark can load 
{{hive-iceberg-handler.jar}} together with {{iceberg-spark-runtime.jar}}, but 
some classes are present in both JAR files. When the two copies of a class 
differ, deserialization can fail with an {{InvalidClassException}}. For example:
{code:java}
java.io.InvalidClassException: org.apache.iceberg.BaseFile; local class 
incompatible: stream classdesc serialVersionUID = 8569836863676564712, local 
class serialVersionUID = -8072381884098305524{code}
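
The mismatch in the trace above can be inspected with a small diagnostic. The following is only a sketch, not part of the proposed change: the class name {{SerialUidCheck}} and the nested {{Sample}} class are hypothetical and exist purely for illustration. It prints the {{serialVersionUID}} of a class and the location it was loaded from, which helps to tell which JAR supplied a class such as {{org.apache.iceberg.BaseFile}} on a given classpath.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.security.CodeSource;

public class SerialUidCheck {

    /** Hypothetical stand-in used when no class name is passed on the command line. */
    public static class Sample implements Serializable {
        private static final long serialVersionUID = 42L;
    }

    /** Returns the serialVersionUID that Java serialization uses for the class. */
    public static long uidOf(Class<?> cls) {
        // lookup() returns null when the class is not serializable.
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            throw new IllegalArgumentException(cls.getName() + " is not serializable");
        }
        return desc.getSerialVersionUID();
    }

    /** Returns the JAR or directory the class was loaded from, if known. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws Exception {
        // Pass a fully qualified class name, e.g. org.apache.iceberg.BaseFile,
        // with the JARs under investigation on the classpath.
        Class<?> cls = args.length > 0 ? Class.forName(args[0]) : Sample.class;
        System.out.println(cls.getName());
        System.out.println("  serialVersionUID = " + uidOf(cls));
        System.out.println("  loaded from      = " + locationOf(cls));
    }
}
```

Running it once on the writer's classpath and once on the reader's classpath should show whether the two sides resolve the class from different JARs.
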
h1. Fix

Create a slim hive-iceberg-handler core JAR file to avoid 
{{InvalidClassException}}.

Before:
 * iceberg-shading
 ** {{maven-shade-plugin}} shades Iceberg and other dependencies.
 * iceberg-handler
 ** {{maven-dependency-plugin}} unpacks iceberg-shading and iceberg-catalog 
then packs them together.

After:
 * iceberg-shading
 ** {{maven-shade-plugin}} shades Iceberg and other dependencies.
 * iceberg-handler
 ** {{maven-shade-plugin}} shades iceberg-shading and iceberg-catalog without 
relocation, which results in the same JAR file that {{maven-dependency-plugin}} 
produced.
 ** {{maven-jar-plugin}} creates a new slim JAR without shaded classes.
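
One way to check that a slim JAR no longer overlaps with another JAR is to compare their class entries. The following is a minimal sketch under stated assumptions: the class name {{JarOverlap}} is hypothetical, and the JAR paths are supplied by the caller. It only compares entry names and does not inspect class contents.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarOverlap {

    /** Collects the names of all .class entries in the given JAR. */
    public static Set<String> classEntries(Path jar) throws IOException {
        Set<String> names = new TreeSet<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    /** Returns the class entries that appear in both JARs, sorted by name. */
    public static Set<String> overlap(Path first, Path second) throws IOException {
        Set<String> common = classEntries(first);
        common.retainAll(classEntries(second));
        return common;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarOverlap <first.jar> <second.jar>");
            System.exit(2);
        }
        Set<String> common = overlap(Paths.get(args[0]), Paths.get(args[1]));
        common.forEach(System.out::println);
        System.out.println(common.size() + " overlapping class entries");
    }
}
```

For example, it could be run against the slim handler JAR and {{iceberg-spark-runtime.jar}}; an empty result would suggest that the two can be loaded together without duplicate classes.
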

{{maven-dependency-plugin}} in {{iceberg-handler}} overwrites the class 
directory, so the output of {{maven-jar-plugin}} is affected as well. One 
workaround is to restrict {{maven-jar-plugin}} with 
{{<configuration><includes></includes></configuration>}}, but because many Java 
packages are shared across artifacts, almost 100 individual class names would 
have to be listed explicitly. A list of that size is hard to maintain whenever 
a class in those packages changes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
