[
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491547#comment-17491547
]
Yuan Zhu commented on FLINK-25529:
----------------------------------
[~luoyuxia] You are right that there are no orc-core classes in
hive-exec-2.2.0.jar.
But if I add only hive-exec-2.2.0.jar and orc-core-1.5.6, I get
ClassNotFoundException: org.apache.orc.impl.HadoopShims.
After I also add orc-shims-1.5.6.jar, the exception changes to
{code:java}
Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException:
org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector{code}
Decimal64ColumnVector seems to be in hive-storage-api-2.6.0.jar.
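One way to confirm which jars actually provide a class is to scan the lib directory for the matching entry. The following is a minimal sketch, not part of Flink or Hive: the class name FindClassInJars and the helper jarsContaining are hypothetical, and it assumes all jars sit directly in one directory (defaulting to "lib").

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper, for illustration only.
class FindClassInJars {

    /** Returns the file names of the jars in libDir that contain the given class. */
    static List<String> jarsContaining(Path libDir, String className) throws IOException {
        // A class a.b.C is stored in a jar as the entry a/b/C.class.
        String entryName = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    if (jarFile.getEntry(entryName) != null) {
                        hits.add(jar.getFileName().toString());
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults: a "lib" directory and the class from the error above.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        String className = args.length > 1
                ? args[1]
                : "org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector";
        for (String jar : jarsContaining(libDir, className)) {
            System.out.println(jar);
        }
    }
}
```

If more than one jar is printed for the same class, those jars ship competing copies of it, and which copy wins depends on classpath order.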
Then I also add hive-storage-api-2.6.0.jar, which leads to
{code:java}
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector.isUTC()Z
    at org.apache.orc.impl.writer.TimestampTreeWriter.writeBatch(TimestampTreeWriter.java:134)
    at org.apache.orc.impl.writer.StructTreeWriter.writeRootBatch(StructTreeWriter.java:56)
    at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:557)
    at org.apache.flink.orc.writer.OrcBulkWriter.addElement(OrcBulkWriter.java:58)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory$1.addElement(FileSystemTableSink.java:598)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory$1.addElement(FileSystemTableSink.java:594)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.write(BulkPartWriter.java:48)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.write(Bucket.java:222)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onElement(Buckets.java:305)
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.onElement(StreamingFileSinkHelper.java:103)
    at org.apache.flink.table.filesystem.stream.AbstractStreamingWriter.processElement(AbstractStreamingWriter.java:140)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
{code}
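It seems that hive-storage-api-2.6.0 conflicts with classes bundled in
hive-exec-2.2.0: the TimestampColumnVector that gets loaded apparently has no
isUTC() method, which orc-core-1.5.6 calls.
I'm not familiar with the Hive dependencies. Do you have any ideas?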
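A quick way to see which copy of a class wins on the classpath, and whether it has the method ORC calls, is a small reflection check. This is only a sketch: ClassOriginCheck and its helpers are hypothetical names, and it assumes the program is run with the same jars, in the same order, as the Flink lib directory. Flink's own classloading may still resolve classes differently.

```java
import java.security.CodeSource;

// Hypothetical diagnostic, for illustration only.
class ClassOriginCheck {

    /** Describes where a class was loaded from, or "bootstrap/unknown" if the JVM does not report it. */
    static String originOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        return src == null ? "bootstrap/unknown" : String.valueOf(src.getLocation());
    }

    /** True if the class declares or inherits a public no-argument method with this name. */
    static boolean hasPublicNoArgMethod(Class<?> cls, String methodName) {
        try {
            cls.getMethod(methodName);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Assumed defaults: the class and method named in the NoSuchMethodError above.
        String className = args.length > 0
                ? args[0]
                : "org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector";
        String methodName = args.length > 1 ? args[1] : "isUTC";

        Class<?> cls = Class.forName(className);
        System.out.println(className + " loaded from " + originOf(cls));
        System.out.println("has " + methodName + "(): " + hasPublicNoArgMethod(cls, methodName));
    }
}
```

If the reported location is hive-exec-2.2.0.jar and the method is missing, that would be consistent with hive-exec shadowing the newer class from hive-storage-api-2.6.0.jar.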
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write
> bulkly into hive-2.1.1 orc table
> -----------------------------------------------------------------------------------------------------------
>
> Key: FLINK-25529
> URL: https://issues.apache.org/jira/browse/FLINK-25529
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Hive
> Environment: hive 2.1.1
> flink 1.12.4
> Reporter: Yuan Zhu
> Priority: Major
> Labels: pull-request-available
> Attachments: lib.jpg
>
>
> I tried to write data in bulk into hive-2.1.1 with ORC format, and encountered
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter
>
> I enabled the bulk writer by setting table.exec.hive.fallback-mapred-writer = false;
>
> {code:java}
> SET 'table.sql-dialect'='hive';
> create table orders(
> order_id int,
> order_date timestamp,
> customer_name string,
> price decimal(10,3),
> product_id int,
> order_status boolean
> )partitioned by (dt string)
> stored as orc;
>
> SET 'table.sql-dialect'='default';
> create table datagen_source (
> order_id int,
> order_date timestamp(9),
> customer_name varchar,
> price decimal(10,3),
> product_id int,
> order_status boolean
> )with('connector' = 'datagen');
> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/mnt/conf');
> set table.exec.hive.fallback-mapred-writer = false;
> insert into myhive.`default`.orders
> /*+ OPTIONS(
> 'sink.partition-commit.trigger'='process-time',
> 'sink.partition-commit.policy.kind'='metastore,success-file',
> 'sink.rolling-policy.file-size'='128MB',
> 'sink.rolling-policy.rollover-interval'='10s',
> 'sink.rolling-policy.check-interval'='10s',
> 'auto-compaction'='true',
> 'compaction.file-size'='1MB' ) */
> select * , date_format(now(),'yyyy-MM-dd') as dt from datagen_source; {code}
> [ERROR] Could not execute SQL statement. Reason:
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter
>
> The jars in my lib dir are listed in the attachment.
> In HiveTableSink#createStreamSink (line 270), createBulkWriterFactory is called if
> table.exec.hive.fallback-mapred-writer is false.
> If the table is ORC, HiveShimV200#createOrcBulkWriterFactory will be invoked.
> OrcBulkWriterFactory depends on org.apache.orc.PhysicalWriter in orc-core,
> but flink-connector-hive excludes orc-core because it conflicts with hive-exec.
>