Teddy Choi created HIVE-29507:
---------------------------------
Summary: Create a slim hive-iceberg-handler core JAR
Key: HIVE-29507
URL: https://issues.apache.org/jira/browse/HIVE-29507
Project: Hive
Issue Type: New Feature
Components: Iceberg integration
Reporter: Teddy Choi
Assignee: Teddy Choi
h1. Issue
{quote}Starting from 1.8.0 Iceberg doesn't release Hive runtime connector. For
Hive query engine integration (specifically with Hive 2.x and 3.x) use Hive
runtime connector coming with Iceberg 1.6.1, or use Hive 4.0.0 or later which
is released with embedded Iceberg integration.
[https://iceberg.apache.org/docs/latest/hive/#feature-support]
{quote}
For Hive 3.x and Iceberg 1.8+, a slim {{hive-iceberg-handler-core.jar}} file
without shading is required. Apache Spark can load
{{hive-iceberg-handler.jar}} together with {{iceberg-spark-runtime.jar}}, but
some classes are present in both JAR files, which causes an
{{InvalidClassException}}. For example:
{code:java}
java.io.InvalidClassException: org.apache.iceberg.BaseFile; local class
incompatible: stream classdesc serialVersionUID = 8569836863676564712, local
class serialVersionUID = -8072381884098305524{code}
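The error above is what Java serialization reports when the writer and the reader resolve the same class name to two copies with different {{serialVersionUID}} values. The following is a minimal sketch of how that value can be inspected for any class on the classpath. {{SerialVersionUidDemo}}, {{ImplicitUid}} and {{PinnedUid}} are hypothetical names used only for illustration; they are not Iceberg or Hive classes, and nothing here asserts how {{BaseFile}} itself declares its UID.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionUidDemo {

    // Hypothetical class without an explicit serialVersionUID:
    // the JVM derives one from the class name, fields and methods.
    static class ImplicitUid implements Serializable {
        int recordCount;
    }

    // Hypothetical class that pins its serialVersionUID explicitly.
    static class PinnedUid implements Serializable {
        private static final long serialVersionUID = 1L;
        int recordCount;
    }

    // Returns the serialVersionUID that Java serialization uses for cls.
    static long uidOf(Class<?> cls) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            throw new IllegalArgumentException(cls.getName() + " is not serializable");
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("implicit: " + uidOf(ImplicitUid.class));
        System.out.println("pinned:   " + uidOf(PinnedUid.class));
    }
}
```

Running the same lookup against {{org.apache.iceberg.BaseFile}} with each JAR alone on the classpath would show which JAR carries which of the two UIDs from the stack trace.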
h1. Fix
Create a slim hive-iceberg-handler core JAR file to avoid
{{InvalidClassException}}.
Before:
* iceberg-shading
** {{maven-shade-plugin}} shades Iceberg and other dependencies.
* iceberg-handler
** {{maven-dependency-plugin}} unpacks iceberg-shading and iceberg-catalog
then packs them together.
After:
* iceberg-shading
** {{maven-shade-plugin}} shades Iceberg and other dependencies.
* iceberg-handler
** {{maven-shade-plugin}} shades iceberg-shading and iceberg-catalog without
relocation, which results in the same JAR file that {{maven-dependency-plugin}} produced.
** {{maven-jar-plugin}} creates a new slim JAR without shaded classes.
{{maven-dependency-plugin}} in {{iceberg-handler}} overwrites the class
directory, so the output of {{maven-jar-plugin}} is affected. One workaround is
to use {{<configuration><includes></includes></configuration>}}, but because
many Java packages are shared across artifacts, almost 100 individual class
names would have to be configured explicitly. A list of that size would be hard
to maintain whenever a class in those packages changes.
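To check that a slim JAR no longer shares classes with another JAR, the class entries of the two files can be compared. The following is a minimal sketch using only the JDK; {{JarOverlapCheck}} is a hypothetical helper, not part of the Hive build, and the JAR paths are passed in by the caller.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class JarOverlapCheck {

    // Collects the names of all .class entries in the given JAR, sorted.
    static Set<String> classEntries(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.endsWith(".class"))
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    // Returns the class entries that appear in both JARs.
    static Set<String> overlap(Path first, Path second) throws IOException {
        Set<String> common = classEntries(first);
        common.retainAll(classEntries(second));
        return common;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarOverlapCheck <first.jar> <second.jar>");
            return;
        }
        Set<String> common = overlap(Paths.get(args[0]), Paths.get(args[1]));
        common.forEach(System.out::println);
        System.out.println(common.size() + " class(es) present in both JARs");
    }
}
```

This compares entry names only, so it does not tell whether the two copies of a class are byte-identical; it only flags candidates for the kind of conflict described above.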