[
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765160#comment-15765160
]
Ivan Mushketyk commented on FLINK-5280:
---------------------------------------
Hi Fabian,
Thank you for your reply.
First, a question about your comment.
{quote}
In case of a Specific Avro record, we would need an additional step to copy the
first-level Pojo fields into a Row
{quote}
Does "Specific Avro" mean a regular POJO?
Regarding the *TableSource* interface, I think I've lost track of what problem
we are trying to solve here :)
I see the following problems with the current interface:
* There is no explicit relationship between field positions in a *Row* and
the order of fields in a POJO type. As you mentioned, we can get the field
order via *PojoTypeInfo.getFieldIndex()*. Since *TableSource* has a
*getReturnType* method that returns *TypeInformation*, nothing needs to
change in the *TableSource* interface to support this.
* The Row type does not have field names, which makes it problematic to access
nested fields in nested Rows, but I believe this should be fixed in FLINK-5348.
Therefore it seems that the only thing left to do (apart from waiting for
FLINK-5348 to be implemented) is to update *TableSourceTable* to use POJO
fields in the correct order. Currently, it just generates indexes 0 to n-1:
{code}
class TableSourceTable(val tableSource: TableSource[_])
  extends FlinkTable[Row](
    typeInfo = new RowTypeInfo(tableSource.getFieldTypes),
    fieldIndexes = 0.until(tableSource.getNumberOfFields).toArray,
    fieldNames = tableSource.getFieldsNames)
{code}
whereas it should use the *PojoTypeInfo.getFieldIndex()* method to build the
proper list of field indexes.
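To make the idea concrete, here is a minimal sketch of such an index lookup. It deliberately does not depend on Flink: *PojoLayout* is a hypothetical stand-in for *PojoTypeInfo*, and it assumes that POJO fields are indexed in name-sorted order and that *getFieldIndex* returns -1 for an unknown field. *FieldIndexSketch* and *fieldIndexes* are likewise illustrative names, not existing Flink code.

```scala
// Hypothetical stand-in for PojoTypeInfo (an assumption, not the Flink class):
// fields are indexed by their position in name-sorted order.
final case class PojoLayout(fieldNames: Seq[String]) {
  private val sorted: Vector[String] = fieldNames.sorted.toVector

  // Assumed contract: position of the field in sorted order, or -1 if unknown.
  def getFieldIndex(name: String): Int = sorted.indexOf(name)
}

object FieldIndexSketch {
  // Builds the index array by looking up each table field by name,
  // instead of assuming the positions 0 until n.
  def fieldIndexes(layout: PojoLayout, tableFieldNames: Seq[String]): Array[Int] =
    tableFieldNames.map { name =>
      val idx = layout.getFieldIndex(name)
      require(idx >= 0, s"Field '$name' is not part of the POJO type")
      idx
    }.toArray
}
```

For a POJO whose fields are exposed as name, age, city, this sketch yields the indexes 2, 0, 1 rather than 0, 1, 2, which is exactly the mismatch the current *TableSourceTable* code ignores.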
Am I missing something? Are there some *TableSource* limitations that I am
overlooking?
> Extend TableSource to support nested data
> -----------------------------------------
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
> Affects Versions: 1.2.0
> Reporter: Fabian Hueske
> Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface does currently only support the definition of
> flat rows.
> However, there are several storage formats for nested data that should be
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in
> Calcite's schema need to be extended to support nested data.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)