Hi Etienne,

Thanks for the report.
There are indeed many problems, and there is no doubt that we need to improve our current format implementation. However, ParquetTableSource and ParquetInputFormat are legacy implementations built on legacy interfaces. We have introduced new interfaces for execution and SQL. See:

- ParquetColumnarRowInputFormat, which uses the BulkFormat interface. It only handles columnar row reading and does not support complex types; we need to migrate ParquetInputFormat to the new BulkFormat interface.
- FileSystemTableSource, which uses the DynamicTableSource interface. It is a generic FileSystem source for all formats, so we can use it for Parquet too.

Considering that ParquetTableSource and ParquetInputFormat are legacy interfaces, I think we can finish the migration work first. What do you think?

Best,
Jingsong

On Wed, Feb 24, 2021 at 12:46 AM Etienne Chauchot <echauc...@apache.org> wrote:

> Hi all,
>
> I've been playing with Parquet with SQL and Avro lately. I've found some
> bugs:
>
> 1. https://issues.apache.org/jira/browse/FLINK-21388 : I already
> submitted a PR on this one (https://github.com/apache/flink/pull/14961)
>
> 2. https://issues.apache.org/jira/browse/FLINK-21389
>
> 3. https://issues.apache.org/jira/browse/FLINK-21468
>
> I've already started to work on this ticket:
> https://issues.apache.org/jira/browse/FLINK-21393
>
> I'd be happy to receive your comments on these tickets
>
> Best
>
> Etienne Chauchot

--
Best, Jingsong Lee