[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769629#comment-15769629
 ] 

Fabian Hueske commented on FLINK-5280:
--------------------------------------

Hi [~ivan.mushketyk], 

That's an interesting idea! 

I think {{getFieldTypes()}} and {{getNumberOfFields()}} are truly redundant and 
might even cause problems if they are not consistent with {{getReturnType()}}. 
We could make them final but that would change the API as well, so we can also 
remove them. IMO, it makes sense to break the API here. It's not declared stable 
and I don't think it is widely used.

The benefit of keeping {{getFieldNames()}} would be that users could still 
overwrite the names of the TypeInformation by overriding the method. However, 
if we do that we would need to add a {{getFieldIndicies()}} method as well to 
map names to positions for proper POJO support. The question is whether it is 
worth keeping {{getFieldNames()}} and adding {{getFieldIndicies()}}. I think it 
makes sense to have these methods. It would be aligned with the 
{{BatchTableEnvironment.fromDataSet()}} methods.

We could have default implementations for {{getFieldNames()}} and 
{{getFieldIndicies()}} that return {{null}}. If they return {{null}}, we would 
use {{TableEnvironment.getFieldInfo(TypeInformation)}}; if the methods are 
overridden, we would use the explicitly provided information. That would allow 
us to reuse existing code instead of duplicating it.
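
A minimal sketch of this fallback, to make the proposal concrete. It does not use the actual Flink classes: {{SketchRowType}}, {{SketchTableSource}} and {{SketchFieldInfo}} are hypothetical stand-ins, and the "derive from the type" branch only plays the role that {{TableEnvironment.getFieldInfo(TypeInformation)}} would have. How to treat a source that overrides only one of the two methods is also an assumption of the sketch.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a composite TypeInformation: an ordered list of field names. */
final class SketchRowType {
    final String[] fieldNames;

    SketchRowType(String... fieldNames) {
        this.fieldNames = fieldNames;
    }
}

/** Hypothetical sketch of the proposed interface; not the actual Flink TableSource. */
interface SketchTableSource {
    SketchRowType getReturnType();

    /** Default: null means "derive the names from getReturnType()". */
    default String[] getFieldNames() {
        return null;
    }

    /** Default: null means "use positions 0..n-1". */
    default int[] getFieldIndicies() {
        return null;
    }
}

/** Resolved field names and positions for a table source. */
final class SketchFieldInfo {
    final String[] names;
    final int[] indices;

    private SketchFieldInfo(String[] names, int[] indices) {
        this.names = names;
        this.indices = indices;
    }

    static SketchFieldInfo resolve(SketchTableSource source) {
        SketchRowType type = source.getReturnType();
        String[] names = source.getFieldNames();
        int[] indices = source.getFieldIndicies();

        if (indices == null) {
            // Not overridden: fall back to positions 0..n-1 (assumption of this sketch).
            int n = (names != null) ? names.length : type.fieldNames.length;
            indices = new int[n];
            for (int i = 0; i < n; i++) {
                indices[i] = i;
            }
        }
        if (names == null) {
            // Not overridden: take the names from the return type at the given positions.
            names = new String[indices.length];
            for (int i = 0; i < indices.length; i++) {
                names[i] = type.fieldNames[indices[i]];
            }
        }
        if (names.length != indices.length) {
            throw new IllegalArgumentException(
                "Got " + names.length + " names for " + indices.length + " indices");
        }
        return new SketchFieldInfo(names, indices);
    }

    @Override
    public String toString() {
        return Arrays.toString(names) + " @ " + Arrays.toString(indices);
    }
}
```

With this shape, a source that only implements {{getReturnType()}} gets names and positions from its type, while a source that overrides both methods can rename and reorder fields without duplicating the derivation logic.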

What do you think [~ivan.mushketyk] and [~jark]?

> Extend TableSource to support nested data
> -----------------------------------------
>
>                 Key: FLINK-5280
>                 URL: https://issues.apache.org/jira/browse/FLINK-5280
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>    Affects Versions: 1.2.0
>            Reporter: Fabian Hueske
>            Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface currently only supports the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported, such as Avro, JSON, Parquet, and ORC. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
