[jira] [Comment Edited] (FLINK-27100) Support parquet format in FileStore

Yubin Li (Jira) Thu, 07 Apr 2022 02:55:06 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-27100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518713#comment-17518713
 ]


Yubin Li edited comment on FLINK-27100 at 4/7/22 9:54 AM:
----------------------------------------------------------

[~lzljs3620320] Thanks for your attention. I have configured file.format and 
encountered the following exception:
{code:java}
Could not find any factories that implement 'org.apache.flink.table.store.shaded.org.apache.flink.connector.file.table.factories.BulkReaderFormatFactory' in the classpath.
    at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:526)
{code}
When I looked into the classpath, the shaded SPI entry indeed does not exist. 
I think we could proceed as follows:
 # Modify the maven-shade-plugin configuration in flink-table-store-dist so 
that formats such as json/csv from the Flink project take effect.
 # Wrap the Parquet format the same way as ORC, which solves the problem you 
pointed out.
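
To double-check the missing-SPI diagnosis, a small sketch along these lines may help. It assumes that factory discovery goes through java.util.ServiceLoader, as the FactoryUtil frame in the stack trace suggests. The class name FactoryDiscoveryCheck and its helper are hypothetical and not part of Flink. It lists the service implementations visible on the classpath that implement a required type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical diagnostic helper; not part of Flink or flink-table-store.
final class FactoryDiscoveryCheck {

    /**
     * Loads every provider registered for the given service interface through
     * ServiceLoader and returns the class names of those that also implement
     * (or extend) the required type.
     */
    static <S> List<String> implementationsAssignableTo(
            Class<S> service, Class<?> required, ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (S impl : ServiceLoader.load(service, loader)) {
            if (required.isAssignableFrom(impl.getClass())) {
                names.add(impl.getClass().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // args[0]: service interface name, args[1]: type the factories must implement.
        if (args.length != 2) {
            System.err.println("usage: FactoryDiscoveryCheck <serviceInterface> <requiredType>");
            return;
        }
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        Class<?> service = Class.forName(args[0], false, loader);
        Class<?> required = Class.forName(args[1], false, loader);
        System.out.println(implementationsAssignableTo(service, required, loader));
    }
}
```

Run with the dist jar on the classpath, passing the factory service interface as the first argument (which interface name applies after shading is an assumption to verify) and the shaded BulkReaderFormatFactory name from the exception as the second. An empty list would be consistent with the exception above.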


was (Author: liyubin117):
[~lzljs3620320] Thanks for your attention. I have configured file.format and 
encountered the following exception:
{code:java}
Could not find any factories that implement 'org.apache.flink.table.store.shaded.org.apache.flink.connector.file.table.factories.BulkReaderFormatFactory' in the classpath.
    at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:526)
{code}
When I looked into the classpath, the shaded SPI entry indeed does not exist. 
Maybe we can wrap the Parquet format the same way as ORC, which solves the 
first problem you pointed out.

> Support parquet format in FileStore
> -----------------------------------
>
>                 Key: FLINK-27100
>                 URL: https://issues.apache.org/jira/browse/FLINK-27100
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table Store
>            Reporter: Yubin Li
>            Priority: Major
>
> Apache Parquet is a very popular columnar file format, used in many data 
> analysis engines such as Hive/Impala/Spark/Flink. Simple command-line tools 
> like parquet-tools can be used to view metadata and data easily, without 
> writing complex Java code.
> Currently flink-table-store only supports ORC, but a massive amount of 
> business data is stored in Parquet format, developers and analysts are very 
> familiar with it, and Parquet is better supported by the Impala engine.
> Making Parquet usable would be a good addition.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
