[jira] [Commented] (FLINK-22271) FlinkSQL Read Hive(parquet file) field does not exist

Rui Li (Jira) Wed, 14 Apr 2021 05:24:08 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-22271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17320958#comment-17320958
 ]


Rui Li commented on FLINK-22271:
--------------------------------

Thanks [~moran] for reporting the issue. As a workaround, you can set 
{{table.exec.hive.fallback-mapred-reader}} to true to use Hive's own reader, 
which is usually more tolerant of schema evolution.
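
A minimal sketch of applying this option from the Flink SQL client, assuming 
the Hive catalog is already registered and in use. The {{SET key=value}} form 
is the Flink 1.12 syntax; newer releases quote the key and value. The query is 
only an illustration based on the table in the report.

```sql
-- Assumption: Flink 1.12 SQL client syntax. Newer versions use:
--   SET 'table.exec.hive.fallback-mapred-reader' = 'true';
SET table.exec.hive.fallback-mapred-reader=true;

-- Re-run the query that failed after the column was added.
SELECT name, age, update_time FROM tmp.student;
```

With the option enabled, Flink reads the table through Hive's mapred reader 
instead of its own vectorized Parquet reader, which may be slower.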

> FlinkSQL Read Hive(parquet file) field does not exist 
> ------------------------------------------------------
>
>                 Key: FLINK-22271
>                 URL: https://issues.apache.org/jira/browse/FLINK-22271
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API
>    Affects Versions: 1.12.2
>            Reporter: moran
>            Priority: Major
>
> Create a Parquet table named student and insert a row in which every field 
> is non-empty. FlinkSQL can query the table, but after a column is added the 
> query fails.
> Step 1:
> CREATE TABLE tmp.student ( 
>     name STRING, 
>     age INT
> )
> STORED AS PARQUET;
> insert into table tmp.student  values ("java", 12);
> FlinkSQL can read the table at this point.
> Step 2:
> alter table tmp.student add columns(update_time timestamp);
> The query fails after the column update_time is added.
> error:
> java.lang.IllegalArgumentException: update_time does not exist
>       at org.apache.flink.hive.shaded.formats.parquet.ParquetVectorizedInputFormat.clipParquetSchema(ParquetVectorizedInputFormat.java:193)
>       at org.apache.flink.hive.shaded.formats.parquet.ParquetVectorizedInputFormat.createReader(ParquetVectorizedInputFormat.java:120)
>       at org.apache.flink.hive.shaded.formats.parquet.ParquetVectorizedInputFormat.createReader(ParquetVectorizedInputFormat.java:73)
>       at org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter.createReader(HiveBulkFormatAdapter.java:108)
>       at org.apache.flink.connectors.hive.read.HiveBulkFormatAdapter.createReader(HiveBulkFormatAdapter.java:63)
>       at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.checkSplitOrStartNext(FileSourceSplitReader.java:112)
>       at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:65)
>       at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:56)
>       at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:138)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
