[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489208#comment-17489208 ]
luoyuxia edited comment on FLINK-25529 at 2/9/22, 2:23 AM:
-----------------------------------------------------------

[~straw] I checked hive-exec-2.2.0.jar, and there is no orc-core class in it. The documentation seems misleading; you can use orc-core-1.5.6.jar instead. Hope it helps. The reason is that flink-connector-hive requires flink-orc to write the ORC format, and flink-orc requires orc-core at version 1.5.6.

> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when writing
> in bulk into a hive-2.1.1 ORC table
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-25529
>                 URL: https://issues.apache.org/jira/browse/FLINK-25529
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive
>     Environment: hive 2.1.1
>                  flink 1.12.4
>            Reporter: Yuan Zhu
>            Priority: Major
>              Labels: pull-request-available
>     Attachments: lib.jpg
>
> I tried to write data in bulk into hive-2.1.1 with the ORC format and encountered:
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter
>
> I used the bulk writer by setting table.exec.hive.fallback-mapred-writer = false:
>
> {code:java}
> SET 'table.sql-dialect'='hive';
> create table orders(
>     order_id int,
>     order_date timestamp,
>     customer_name string,
>     price decimal(10,3),
>     product_id int,
>     order_status boolean
> ) partitioned by (dt string)
> stored as orc;
>
> SET 'table.sql-dialect'='default';
> create table datagen_source (
>     order_id int,
>     order_date timestamp(9),
>     customer_name varchar,
>     price decimal(10,3),
>     product_id int,
>     order_status boolean
> ) with ('connector' = 'datagen');
>
> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/mnt/conf');
>
> set table.exec.hive.fallback-mapred-writer = false;
>
> insert into myhive.`default`.orders
> /*+ OPTIONS(
>     'sink.partition-commit.trigger'='process-time',
>     'sink.partition-commit.policy.kind'='metastore,success-file',
>     'sink.rolling-policy.file-size'='128MB',
>     'sink.rolling-policy.rollover-interval'='10s',
>     'sink.rolling-policy.check-interval'='10s',
>     'auto-compaction'='true',
>     'compaction.file-size'='1MB' ) */
> select *, date_format(now(),'yyyy-MM-dd') as dt from datagen_source;
> {code}
>
> [ERROR] Could not execute SQL statement. Reason:
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter
>
> The jars in my lib dir are listed in the attachment.
>
> In HiveTableSink#createStreamSink (line 270), createBulkWriterFactory is
> called when table.exec.hive.fallback-mapred-writer is false. If the table is
> stored as ORC, HiveShimV200#createOrcBulkWriterFactory is invoked.
> OrcBulkWriterFactory depends on org.apache.orc.PhysicalWriter from orc-core,
> but flink-connector-hive excludes orc-core because it conflicts with
> hive-exec.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
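As a follow-up to the suggested fix, here is a sketch of how one might confirm the class is missing and supply orc-core-1.5.6.jar to the Flink classpath. The paths below (a Flink `lib/` directory and the location of hive-exec-2.2.0.jar) are illustrative assumptions; adjust them to your installation, and restart the cluster afterwards so the new jar is picked up.

```shell
# Verify that hive-exec does NOT bundle the ORC writer class
# (the path to the jar is an assumption; point it at your copy).
jar tf hive-exec-2.2.0.jar | grep 'org/apache/orc/PhysicalWriter' \
  || echo "PhysicalWriter not found in hive-exec"

# Supply the class by dropping orc-core 1.5.6 into Flink's lib/ directory.
curl -fLo lib/orc-core-1.5.6.jar \
  https://repo1.maven.org/maven2/org/apache/orc/orc-core/1.5.6/orc-core-1.5.6.jar
```

This avoids the conflict described in the issue: orc-core stays excluded from flink-connector-hive's own packaging, while the standalone jar still makes org.apache.orc.PhysicalWriter available to flink-orc at runtime.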