[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491959#comment-17491959 ] Yuan Zhu commented on FLINK-25529: -- Replacing orc-core-1.5.6 with orc-core-1.5.6-nohive throws an exception too:
{code:java}
Caused by: java.lang.NoSuchMethodError: org.apache.orc.TypeDescription.createRowBatch()Lorg/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch;
    at org.apache.flink.orc.writer.OrcBulkWriter.<init>(OrcBulkWriter.java:47)
    at org.apache.flink.orc.writer.OrcBulkWriterFactory.create(OrcBulkWriterFactory.java:106)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory.create(FileSystemTableSink.java:593)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNew(BulkBucketWriter.java:75)
    at org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter$OutputStreamBasedBucketWriter.openNewInProgressFile(OutputStreamBasedPartFileWriter.java:90)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNewInProgressFile(BulkBucketWriter.java:36)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.rollPartFile(Bucket.java:243)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.write(Bucket.java:220)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onElement(Buckets.java:305)
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.onElement(StreamingFileSinkHelper.java:103)
    at org.apache.flink.table.filesystem.stream.AbstractStreamingWriter.processElement(AbstractStreamingWriter.java:140)
{code}
orc-core-1.5.6-nohive.jar only contains org.apache.orc.TypeDescription.createRowBatch()Lorg/apache/orc/storage/ql/exec/vector/VectorizedRowBatch; (the nohive variant relocates the hive-storage-api classes under org.apache.orc.storage). It seems the only option is a workaround.
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write
> bulkly into hive-2.1.1 orc table
> ---
>
> Key: FLINK-25529
> URL: https://issues.apache.org/jira/browse/FLINK-25529
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Hive
> Environment: hive 2.1.1, flink 1.12.4
> Reporter: Yuan Zhu
> Priority: Major
> Labels: pull-request-available
> Attachments: lib.jpg
>
> I tried to write data in bulk into hive-2.1.1 with the orc format, and encountered
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter
> when using the bulk writer by setting table.exec.hive.fallback-mapred-writer = false:
>
> {code:java}
> SET 'table.sql-dialect'='hive';
> create table orders(
>     order_id int,
>     order_date timestamp,
>     customer_name string,
>     price decimal(10,3),
>     product_id int,
>     order_status boolean
> ) partitioned by (dt string)
> stored as orc;
>
> SET 'table.sql-dialect'='default';
> create table datagen_source (
>     order_id int,
>     order_date timestamp(9),
>     customer_name varchar,
>     price decimal(10,3),
>     product_id int,
>     order_status boolean
> ) with ('connector' = 'datagen');
>
> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/mnt/conf');
> set table.exec.hive.fallback-mapred-writer = false;
>
> insert into myhive.`default`.orders
> /*+ OPTIONS(
>     'sink.partition-commit.trigger'='process-time',
>     'sink.partition-commit.policy.kind'='metastore,success-file',
>     'sink.rolling-policy.file-size'='128MB',
>     'sink.rolling-policy.rollover-interval'='10s',
>     'sink.rolling-policy.check-interval'='10s',
>     'auto-compaction'='true',
>     'compaction.file-size'='1MB' ) */
> select *, date_format(now(),'yyyy-MM-dd') as dt from datagen_source;
> {code}
>
> [ERROR] Could not execute SQL statement. Reason:
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter
>
> The jars in my lib dir are listed in the attachment.
> In HiveTableSink#createStreamSink (line 270), createBulkWriterFactory is called if
> table.exec.hive.fallback-mapred-writer is false.
> If the table is orc, HiveShimV200#createOrcBulkWriterFactory will be invoked.
> OrcBulkWriterFactory depends on org.apache.orc.PhysicalWriter in orc-core,
> but flink-connector-hive excludes orc-core because it conflicts with hive-exec.
-- This message was sent by Atlassian Jira (v8.20.1#820001)
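Much of the thread below is about working out which jar on Flink's classpath provides (or fails to provide) a given class. As an illustration only (not part of the original discussion; the function name and lib path are made up), a small Python helper that scans a directory of jars for a class:

```python
import zipfile
from pathlib import Path

def jars_containing(lib_dir, class_name):
    """Return the names of jars under lib_dir that contain the given class.

    class_name is fully qualified, e.g. 'org.apache.orc.PhysicalWriter';
    inside a jar it is stored as 'org/apache/orc/PhysicalWriter.class'.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            if entry in zf.namelist():
                hits.append(jar.name)
    return hits

# Hypothetical usage:
#   jars_containing("/opt/flink/lib", "org.apache.orc.PhysicalWriter")
# An empty result would explain a ClassNotFoundException for that class.
```
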
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491785#comment-17491785 ] luoyuxia commented on FLINK-25529: -- [~straw] Yes, it's a class conflict. The ORC writer in orc-1.5.6 calls TimestampColumnVector.isUTC(), but the JVM loads the TimestampColumnVector class bundled in hive-exec, which doesn't have that method. Two quick ways may help fix the issue:
1. Replace orc-core-1.5.6 with [orc-core-1.5.6-nohive.jar|https://repo1.maven.org/maven2/org/apache/orc/orc-core/1.5.6/orc-core-1.5.6-nohive.jar]; then you won't need hive-storage-api-2.6.0.jar.
2. Set table.exec.hive.fallback-mapred-writer = true as a workaround, to avoid using the flink-orc writer. It seems there are some Hive compatibility issues in the flink-orc writer.
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491547#comment-17491547 ] Yuan Zhu commented on FLINK-25529: -- [~luoyuxia] You are right that there are no orc-core classes in hive-exec-2.2.0.jar. But if I just add hive-exec-2.2.0.jar and orc-core-1.5.6, it leads to ClassNotFoundException: org.apache.orc.impl.HadoopShims. So I added orc-shims-1.5.6.jar as well, and the exception turned into:
{code:java}
Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector
{code}
Decimal64ColumnVector seems to be in hive-storage-api-2.6.0.jar, so I also added hive-storage-api-2.6.0.jar, which leads to:
{code:java}
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector.isUTC()Z
    at org.apache.orc.impl.writer.TimestampTreeWriter.writeBatch(TimestampTreeWriter.java:134)
    at org.apache.orc.impl.writer.StructTreeWriter.writeRootBatch(StructTreeWriter.java:56)
    at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:557)
    at org.apache.flink.orc.writer.OrcBulkWriter.addElement(OrcBulkWriter.java:58)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory$1.addElement(FileSystemTableSink.java:598)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory$1.addElement(FileSystemTableSink.java:594)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.write(BulkPartWriter.java:48)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.write(Bucket.java:222)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onElement(Buckets.java:305)
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.onElement(StreamingFileSinkHelper.java:103)
    at org.apache.flink.table.filesystem.stream.AbstractStreamingWriter.processElement(AbstractStreamingWriter.java:140)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
{code}
It seems that hive-storage-api-2.6.0 conflicts with the class in hive-exec-2.2.0. I'm not familiar with Hive dependencies. Do you have any ideas?
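A NoSuchMethodError on a class that is clearly present is the typical symptom of two jars shipping different copies of the same class (here TimestampColumnVector in both hive-exec and hive-storage-api), with the JVM linking against whichever copy it loads first. As a sketch, not part of the thread, a Python helper that lists classes appearing in more than one jar:

```python
import zipfile
from collections import defaultdict
from pathlib import Path

def duplicated_classes(lib_dir):
    """Map each .class entry that occurs in more than one jar under lib_dir
    to the list of jars providing it."""
    owners = defaultdict(list)
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            for name in zf.namelist():
                if name.endswith(".class"):
                    owners[name].append(jar.name)
    return {cls: jars for cls, jars in owners.items() if len(jars) > 1}
```

Any class this reports with two owners is a candidate for exactly the kind of conflict described above.
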
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489208#comment-17489208 ] luoyuxia commented on FLINK-25529: -- [~straw] I checked hive-exec-2.2.0.jar; there are no orc-core classes in it. It seems the document is misleading; I think you can use orc-core-1.5.6.jar. Hope it helps. The reason is that flink-connector-hive depends on flink-orc to write the ORC format, and flink-orc requires orc-core at version 1.5.6.
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488779#comment-17488779 ] Yuan Zhu commented on FLINK-25529: -- [~luoyuxia] I tried hive-exec-2.2.0.jar. It causes this exception:
{code:java}
Caused by: java.lang.NoSuchMethodError: org.apache.orc.OrcFile$WriterOptions.getHadoopShims()Lorg/apache/orc/impl/HadoopShims;
    at org.apache.flink.orc.writer.PhysicalWriterImpl.<init>(PhysicalWriterImpl.java:103)
    at org.apache.flink.orc.writer.OrcBulkWriterFactory.create(OrcBulkWriterFactory.java:99)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory.create(FileSystemTableSink.java:593)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNew(BulkBucketWriter.java:75)
    at org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter$OutputStreamBasedBucketWriter.openNewInProgressFile(OutputStreamBasedPartFileWriter.java:90)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNewInProgressFile(BulkBucketWriter.java:36)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.rollPartFile(Bucket.java:243)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.write(Bucket.java:220)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onElement(Buckets.java:305)
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.onElement(StreamingFileSinkHelper.java:103)
    at org.apache.flink.table.filesystem.stream.AbstractStreamingWriter.processElement(AbstractStreamingWriter.java:140)
{code}
I find that the orc-core shaded into hive-exec 2.1, 2.2, and 2.3 is always 1.3.4.
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488558#comment-17488558 ] luoyuxia commented on FLINK-25529: -- [~straw] I think you can use hive-exec-2.2.0.jar instead; it's fine in terms of backward compatibility.
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486799#comment-17486799 ] Yuan Zhu commented on FLINK-25529: -- [~luoyuxia] If I add hive-exec-2.1.1.jar and orc-core.jar (1.4.3), I encounter an error like:
{code:java}
[ERROR] Could not execute SQL statement. Reason:
java.lang.NoSuchMethodError: org.apache.orc.TypeDescription.fromString(Ljava/lang/String;)Lorg/apache/orc/TypeDescription;
{code}
This is because hive-exec.jar is a shaded jar bundling orc-core 1.3.4, so its TypeDescription class conflicts with the one in orc-core.jar 1.4.3.
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479812#comment-17479812 ] luoyuxia commented on FLINK-25529: -- [~straw] Thanks for reporting; you're right. There's no org.apache.orc.PhysicalWriter in hive-exec-2.1.0.jar, so you may need to add an extra ORC jar. Anyway, the document is misleading; I'll try to fix it.
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478638#comment-17478638 ] Yuan Zhu commented on FLINK-25529: -- Hi [~luoyuxia], thanks for your attention. My Hive version is 2.1.1. According to the documented [dependencies|https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/hive/#user-defined-dependencies], there is no class org.apache.orc.PhysicalWriter in hive-exec or the Hive connector jar. The jar you provided is for Hive 2.3.6; it cannot be used with 2.1.1.
[ https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477521#comment-17477521 ] luoyuxia commented on FLINK-25529: -- [~straw] I downloaded the [flink-sql-connector-hive-2.3.6 jar|https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-hive-2.3.6_2.11/1.14.2/flink-sql-connector-hive-2.3.6_2.11-1.14.2.jar].