[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-02-14 Thread Yuan Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491959#comment-17491959
 ] 

Yuan Zhu commented on FLINK-25529:
--

Replacing orc-core-1.5.6 with orc-core-1.5.6-nohive throws an exception too:
{code:java}
Caused by: java.lang.NoSuchMethodError: org.apache.orc.TypeDescription.createRowBatch()Lorg/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch;
    at org.apache.flink.orc.writer.OrcBulkWriter.<init>(OrcBulkWriter.java:47)
    at org.apache.flink.orc.writer.OrcBulkWriterFactory.create(OrcBulkWriterFactory.java:106)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory.create(FileSystemTableSink.java:593)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNew(BulkBucketWriter.java:75)
    at org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter$OutputStreamBasedBucketWriter.openNewInProgressFile(OutputStreamBasedPartFileWriter.java:90)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNewInProgressFile(BulkBucketWriter.java:36)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.rollPartFile(Bucket.java:243)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.write(Bucket.java:220)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onElement(Buckets.java:305)
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.onElement(StreamingFileSinkHelper.java:103)
    at org.apache.flink.table.filesystem.stream.AbstractStreamingWriter.processElement(AbstractStreamingWriter.java:140)
 {code}
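orc-core-1.5.6-nohive.jar only contains
org.apache.orc.TypeDescription.createRowBatch()Lorg/apache/orc/storage/ql/exec/vector/VectorizedRowBatch;
so the method that OrcBulkWriter was compiled against, which returns the org.apache.hadoop.hive VectorizedRowBatch, does not exist in that jar.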
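
The JVM resolves a method by name plus full descriptor, and the descriptor includes the return type. That is why a method with the same name but a relocated return type counts as missing. The sketch below illustrates this with JDK classes only; StringBuilder and StringBuffer are stand-ins (an assumption for illustration) for the two VectorizedRowBatch classes, and the class name DescriptorDemo is hypothetical.

```java
import java.lang.invoke.MethodType;

public final class DescriptorDemo {

    /** Returns the JVM descriptor of a no-argument method with the given return type. */
    public static String noArgDescriptor(Class<?> returnType) {
        return MethodType.methodType(returnType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Stand-ins: "what the caller was compiled against" vs. "what the jar provides".
        String expected = noArgDescriptor(StringBuilder.class);
        String provided = noArgDescriptor(StringBuffer.class);

        System.out.println(expected); // ()Ljava/lang/StringBuilder;
        System.out.println(provided); // ()Ljava/lang/StringBuffer;

        // Same method name and arguments, different return type: no match at link time.
        System.out.println(expected.equals(provided)); // false
    }
}
```

The same comparison applied to the two createRowBatch() descriptors quoted above shows why swapping in the nohive jar alone cannot satisfy OrcBulkWriter.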

 

It seems the only option left is to use a workaround.

> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write 
> bulkly into hive-2.1.1 orc table
> ---
>
> Key: FLINK-25529
> URL: https://issues.apache.org/jira/browse/FLINK-25529
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Hive
> Environment: hive 2.1.1, flink 1.12.4
> Reporter: Yuan Zhu
> Priority: Major
> Labels: pull-request-available
> Attachments: lib.jpg
>
>
> I tried to write data in bulk into hive-2.1.1 in ORC format and encountered
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter
>
> I enabled the bulk writer by setting table.exec.hive.fallback-mapred-writer = false;
>  
> {code:java}
> SET 'table.sql-dialect'='hive';
> create table orders(
>     order_id int,
>     order_date timestamp,
>     customer_name string,
>     price decimal(10,3),
>     product_id int,
>     order_status boolean
> )partitioned by (dt string)
> stored as orc;
>  
> SET 'table.sql-dialect'='default';
> create table datagen_source (
> order_id int,
> order_date timestamp(9),
> customer_name varchar,
> price decimal(10,3),
> product_id int,
> order_status boolean
> )with('connector' = 'datagen');
> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/mnt/conf');
> set table.exec.hive.fallback-mapred-writer = false;
> insert into myhive.`default`.orders
> /*+ OPTIONS(
>     'sink.partition-commit.trigger'='process-time',
>     'sink.partition-commit.policy.kind'='metastore,success-file',
>     'sink.rolling-policy.file-size'='128MB',
>     'sink.rolling-policy.rollover-interval'='10s',
>     'sink.rolling-policy.check-interval'='10s',
>     'auto-compaction'='true',
>     'compaction.file-size'='1MB'    ) */
> select * , date_format(now(),'-MM-dd') as dt from datagen_source;  {code}
> [ERROR] Could not execute SQL statement. Reason:
> java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter
>  
> The jars in my lib directory are listed in the attachment.
> In HiveTableSink#createStreamSink (line 270), createBulkWriterFactory is called if
> table.exec.hive.fallback-mapred-writer is false.
> If the table is ORC, HiveShimV200#createOrcBulkWriterFactory is invoked.
> OrcBulkWriterFactory depends on org.apache.orc.PhysicalWriter in orc-core,
> but flink-connector-hive excludes orc-core because it conflicts with hive-exec.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-02-13 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491785#comment-17491785
 ] 

luoyuxia commented on FLINK-25529:
--

[~straw] Yes, it is a class conflict. The OrcWriter in orc-1.5.6 calls the method
TimestampColumnVector.isUTC(), but the JVM loads the TimestampColumnVector class
bundled in hive-exec, which does not have that method. Two quick ways may help to
fix the issue:
1. Replace orc-core-1.5.6 with [orc-core-1.5.6-nohive.jar|https://repo1.maven.org/maven2/org/apache/orc/orc-core/1.5.6/orc-core-1.5.6-nohive.jar];
 then you won't need hive-storage-api-2.6.0.jar.
2. As a workaround, set table.exec.hive.fallback-mapred-writer = true to avoid using
flink-orc-writer. There seem to be some Hive compatibility issues
in flink-orc-writer.
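
To confirm which jar actually supplies a conflicting class, a small reflection check can help. This is a minimal sketch, not part of Flink or Hive: the class name ClassOrigin and its helper methods are hypothetical, and it assumes it is run with the same classpath as the failing job (for example, all jars from Flink's lib directory).

```java
import java.security.CodeSource;
import java.util.Arrays;

public final class ClassOrigin {

    /** Returns the jar or directory a class was loaded from, or a marker if unknown. */
    public static String locationOf(String className) {
        try {
            // initialize=false: we only want to locate the class, not run its static init.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return (src == null || src.getLocation() == null)
                    ? "<bootstrap or unknown>"
                    : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not loadable: " + e + ">";
        }
    }

    /** Returns true if the loaded class exposes a public method with the given name. */
    public static boolean hasPublicMethod(String className, String methodName) {
        try {
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            return Arrays.stream(c.getMethods()).anyMatch(m -> m.getName().equals(methodName));
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector";
        System.out.println(name + " loaded from " + locationOf(name));
        System.out.println("has isUTC(): " + hasPublicMethod(name, "isUTC"));
    }
}
```

If the reported location is the hive-exec jar and isUTC() is absent, that matches the conflict described above.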




[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-02-13 Thread Yuan Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17491547#comment-17491547
 ] 

Yuan Zhu commented on FLINK-25529:
--

[~luoyuxia] You are right that there are no orc-core classes in
hive-exec-2.2.0.jar.

But if I add just hive-exec-2.2.0.jar and orc-core-1.5.6, I get
ClassNotFoundException: org.apache.orc.impl.HadoopShims.

Then I added orc-shims-1.5.6.jar as well, and the exception changed to

 
{code:java}
Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector
{code}
Decimal64ColumnVector seems to be in hive-storage-api-2.6.0.jar.

Then I also added hive-storage-api-2.6.0.jar, which leads to
{code:java}
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.exec.vector.TimestampColumnVector.isUTC()Z
    at org.apache.orc.impl.writer.TimestampTreeWriter.writeBatch(TimestampTreeWriter.java:134)
    at org.apache.orc.impl.writer.StructTreeWriter.writeRootBatch(StructTreeWriter.java:56)
    at org.apache.orc.impl.WriterImpl.addRowBatch(WriterImpl.java:557)
    at org.apache.flink.orc.writer.OrcBulkWriter.addElement(OrcBulkWriter.java:58)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory$1.addElement(FileSystemTableSink.java:598)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory$1.addElement(FileSystemTableSink.java:594)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.write(BulkPartWriter.java:48)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.write(Bucket.java:222)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onElement(Buckets.java:305)
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.onElement(StreamingFileSinkHelper.java:103)
    at org.apache.flink.table.filesystem.stream.AbstractStreamingWriter.processElement(AbstractStreamingWriter.java:140)
    at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
 {code}
It seems that hive-storage-api-2.6.0 conflicts with the classes in
hive-exec-2.2.0.

I am not familiar with Hive dependencies. Do you have any ideas?

 



[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-02-08 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489208#comment-17489208
 ] 

luoyuxia commented on FLINK-25529:
--

[~straw] I checked hive-exec-2.2.0.jar; it contains no orc-core classes. The
documentation seems to be misleading. I think you can use orc-core-1.5.6.jar.
I hope that helps.
The reason is that flink-connector-hive depends on flink-orc to write the ORC
format, and flink-orc requires orc-core version 1.5.6.



[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-02-08 Thread Yuan Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488779#comment-17488779
 ] 

Yuan Zhu commented on FLINK-25529:
--

[~luoyuxia] I tried hive-exec-2.2.0.jar. It causes this exception:
{code:java}
Caused by: java.lang.NoSuchMethodError: org.apache.orc.OrcFile$WriterOptions.getHadoopShims()Lorg/apache/orc/impl/HadoopShims;
    at org.apache.flink.orc.writer.PhysicalWriterImpl.<init>(PhysicalWriterImpl.java:103)
    at org.apache.flink.orc.writer.OrcBulkWriterFactory.create(OrcBulkWriterFactory.java:99)
    at org.apache.flink.table.filesystem.FileSystemTableSink$ProjectionBulkFactory.create(FileSystemTableSink.java:593)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNew(BulkBucketWriter.java:75)
    at org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter$OutputStreamBasedBucketWriter.openNewInProgressFile(OutputStreamBasedPartFileWriter.java:90)
    at org.apache.flink.streaming.api.functions.sink.filesystem.BulkBucketWriter.openNewInProgressFile(BulkBucketWriter.java:36)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.rollPartFile(Bucket.java:243)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.write(Bucket.java:220)
    at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onElement(Buckets.java:305)
    at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSinkHelper.onElement(StreamingFileSinkHelper.java:103)
    at org.apache.flink.table.filesystem.stream.AbstractStreamingWriter.processElement(AbstractStreamingWriter.java:140)
 {code}
I found that orc-core is always 1.3.4 in hive-exec 2.1, 2.2, and 2.3.



[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-02-07 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17488558#comment-17488558
 ] 

luoyuxia commented on FLINK-25529:
--

[~straw] I think you can use hive-exec-2.2.0.jar instead. It should be fine
because it is backward compatible.



[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-02-03 Thread Yuan Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486799#comment-17486799
 ] 

Yuan Zhu commented on FLINK-25529:
--

[~luoyuxia] If I add hive-exec-2.1.1.jar and orc-core.jar (1.4.3), I get an
error like:
{code:java}
[ERROR] Could not execute SQL statement. Reason:
java.lang.NoSuchMethodError: org.apache.orc.TypeDescription.fromString(Ljava/lang/String;)Lorg/apache/orc/TypeDescription;
 {code}
This happens because hive-exec.jar is a shaded jar that bundles orc-core 1.3.4, and its
TypeDescription class conflicts with the one in orc-core.jar 1.4.3.
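
One way to spot this kind of duplicate is to list every jar in the lib directory that contains the class in question. The following is a rough sketch using only the JDK; JarClassFinder and its methods are hypothetical names, and the default lib path is an assumption. It only finds classes stored under their original package name, so relocated (renamed) copies inside a shaded jar will not show up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

public final class JarClassFinder {

    /** Converts a fully qualified class name to its entry path inside a jar. */
    public static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns the file names of all jars directly under libDir that contain the class. */
    public static List<String> findJarsContaining(Path libDir, String className) {
        List<String> hits = new ArrayList<>();
        if (!Files.isDirectory(libDir)) {
            return hits; // nothing to scan
        }
        String entry = toEntryName(className);
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jf = new JarFile(jar.toFile())) {
                    if (jf.getJarEntry(entry) != null) {
                        hits.add(jar.getFileName().toString());
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hits;
    }

    public static void main(String[] args) {
        Path lib = Paths.get(args.length > 0 ? args[0] : "lib");
        String cls = args.length > 1 ? args[1] : "org.apache.orc.TypeDescription";
        // More than one hit means the classpath order decides which copy wins.
        System.out.println(cls + " found in: " + findJarsContaining(lib, cls));
    }
}
```

More than one jar in the result for org.apache.orc.TypeDescription would be consistent with the conflict described above.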



[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-01-20 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479812#comment-17479812
 ] 

luoyuxia commented on FLINK-25529:
--

[~straw] Thanks for reporting; you're right. There is no
org.apache.orc.PhysicalWriter in hive-exec-2.1.0.jar. You may need to add an extra
ORC jar.
Anyway, the documentation is misleading; I'll try to fix it.



[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-01-19 Thread Yuan Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478638#comment-17478638
 ] 

Yuan Zhu commented on FLINK-25529:
--

Hi [~luoyuxia], thanks for your attention.

My Hive version is 2.1.1. With the jars listed under
[dependencies|https://nightlies.apache.org/flink/flink-docs-release-1.12/dev/table/connectors/hive/#user-defined-dependencies],
there is no class org.apache.orc.PhysicalWriter in hive-exec or
hive-connector.

The jar you provided is for Hive 2.3.6. It cannot be used with 2.1.1.



[jira] [Commented] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

2022-01-17 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477521#comment-17477521
 ] 

luoyuxia commented on FLINK-25529:
--

[~straw] I downloaded the flink-hive-connector:
[Download|https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-hive-2.3.6_2.11/1.14.2/flink-sql-connector-hive-2.3.6_2.11-1.14.2.jar]
