[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15733296#comment-15733296
 ] 

Fabian Hueske commented on FLINK-5280:
--------------------------------------

Sure, you can work on this [~ivan.mushketyk].
I think it makes sense to first design the interfaces before we start with the 
actual implementation.

Our goal should be to change the current interfaces as little as possible. 
It should still be easy to define TableSources for tables with a flat 
schema.

I would propose the following:
- create a {{FlatTableSource}} interface and move the 
{{TableSource.getFieldNames()}} and {{TableSource.getFieldTypes()}} methods 
there. The {{TableSource.getNumberOfFields()}} method can be dropped.
- create a {{NestedTableSource}} interface that provides methods to derive a 
nested schema (field names and types). We still need to decide what this 
interface should look like.
- Change all classes that currently implement {{TableSource}} to also implement 
{{FlatTableSource}}. {{NestedTableSource}} will be used, for instance, for 
the Avro table source.
- We need to modify / extend the way that table sources are currently 
registered. First, we need to distinguish flat and nested sources. For the 
nested sources we need an implementation that converts the information of the 
{{NestedTableSource}} interface into the {{RelDataType}} required by Calcite's 
{{Table}} interface (see {{FlinkTable}}).
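
To make the proposal more concrete, here is a rough sketch in plain Java. 
It is only an illustration under assumptions: {{TableSource}} is reduced to a 
marker interface, strings stand in for Flink's type information, and the 
{{Field}} class, the {{getSchema()}} method, and {{SchemaDescriber}} are 
hypothetical names. The rendered string is a stand-in for the {{RelDataType}} 
that the real registration code would have to build.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Stand-in for the existing TableSource, reduced to a marker for this sketch.
interface TableSource {}

// Flat schema: parallel arrays of field names and field types.
interface FlatTableSource extends TableSource {
    String[] getFieldNames();
    String[] getFieldTypes(); // strings stand in for real type information
}

// Hypothetical schema node: a leaf has a type, a nested row has children.
final class Field {
    final String name;
    final String type;          // null for nested rows
    final List<Field> children; // empty for leaves

    private Field(String name, String type, List<Field> children) {
        this.name = name;
        this.type = type;
        this.children = children;
    }

    static Field leaf(String name, String type) {
        return new Field(name, type, Arrays.<Field>asList());
    }

    static Field row(String name, Field... children) {
        return new Field(name, null, Arrays.asList(children));
    }
}

// Hypothetical shape; the actual methods are still to be decided.
interface NestedTableSource extends TableSource {
    List<Field> getSchema();
}

final class SchemaDescriber {
    // Registration-time dispatch: distinguish flat and nested sources.
    static String describe(TableSource source) {
        if (source instanceof FlatTableSource) {
            FlatTableSource flat = (FlatTableSource) source;
            String[] names = flat.getFieldNames();
            String[] types = flat.getFieldTypes();
            StringJoiner row = new StringJoiner(", ", "ROW(", ")");
            for (int i = 0; i < names.length; i++) {
                row.add(names[i] + " " + types[i]);
            }
            return row.toString();
        }
        if (source instanceof NestedTableSource) {
            return describeRow(((NestedTableSource) source).getSchema());
        }
        throw new IllegalArgumentException("Unsupported table source");
    }

    // Recursively renders nested rows; leaves print their type directly.
    private static String describeRow(List<Field> fields) {
        StringJoiner row = new StringJoiner(", ", "ROW(", ")");
        for (Field f : fields) {
            String type = f.children.isEmpty() ? f.type : describeRow(f.children);
            row.add(f.name + " " + type);
        }
        return row.toString();
    }
}
```

With this shape, existing flat sources keep their two methods unchanged, and 
only the registration path needs to know about the recursive case.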

What do you think?

> Extend TableSource to support nested data
> -----------------------------------------
>
>                 Key: FLINK-5280
>                 URL: https://issues.apache.org/jira/browse/FLINK-5280
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>    Affects Versions: 1.2.0
>            Reporter: Fabian Hueske
>
> The {{TableSource}} interface currently supports only the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
