[ https://issues.apache.org/jira/browse/FLINK-27100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yubin Li updated FLINK-27100:
-----------------------------
Description:
Apache Parquet is a very popular columnar file format, used in many data
analysis engines like Hive/Impala/Spark/Flink. We can use simple command-line
tools like parquet-tools to view metadata and data easily, instead of writing
complex Java code.
Currently flink-table-store only supports ORC, but a massive amount of business
data is stored in Parquet format, developers and analysts are very familiar
with it, and Parquet is better supported by the Impala engine.
It would be a good addition to make Parquet usable.
was:
Apache Parquet is a very popular columnar file format, used in many data
analysis engines like Hive/Impala/Spark/Flink. We can use simple command-line
tools like parquet-tools to view metadata and data easily, instead of writing
complex Java code.
Currently flink-table-store only supports ORC, but a massive amount of business
data is stored in parquet format, and developers and analysts are very familiar
with it.
It would be a good addition to make Parquet usable.
> Support parquet format in FileStore
> -----------------------------------
>
> Key: FLINK-27100
> URL: https://issues.apache.org/jira/browse/FLINK-27100
> Project: Flink
> Issue Type: New Feature
> Components: Table Store
> Reporter: Yubin Li
> Priority: Major
>
> Apache Parquet is a very popular columnar file format, used in many data
> analysis engines like Hive/Impala/Spark/Flink. We can use simple command-line
> tools like parquet-tools to view metadata and data easily, instead of writing
> complex Java code.
> Currently flink-table-store only supports ORC, but a massive amount of business
> data is stored in Parquet format, developers and analysts are very familiar
> with it, and Parquet is better supported by the Impala engine.
> It would be a good addition to make Parquet usable.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)