[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751212#comment-15751212
 ] 

Ivan Mushketyk commented on FLINK-5280:
---------------------------------------

Hi [~jark], thank you for your reply.

I see that *fieldIndexes* play a similar role and that they are used in 
*CodeGenerator*, but it is not clear what the order of these mapping 
indexes should be. If you take a look at the test POJO *TableSource* that I've created:
https://gist.github.com/mushketyk/acffb701a1f71a6e9bd661c781d7b18c

you can see that the order of fields returned by *getFieldsNames* is pretty 
much random:

{code:java}
@Override
public String[] getFieldsNames() {
        return new String[]{"amount", "childPojo", "id", "name"};
}
{code}

It matches the field order neither in the POJO class definition nor in the POJO 
type information. I had to pretty much come up with it by trial and error, 
because (if I understand it correctly) there is no explicit convention on what 
this order should be. That's why I am proposing to add a new method that will 
help to establish a clear relationship between the field indexes in the result 
*Row* and the fields in the original POJO type.
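
To make the proposal more concrete, here is a rough sketch of what I mean. It does not use any Flink classes: *FieldMappingSource*, *getFieldMapping*, *Person* and *toRow* are hypothetical names made up for illustration, and a plain *Object[]* stands in for *Row*.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldMappingSketch {

    // Hypothetical stand-in for the test POJO; only two fields for brevity.
    public static class Person {
        public int id;
        public String name;

        public Person() {}

        public Person(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Hypothetical extension point (not an existing Flink method):
    // maps a POJO field name to its position in the result row.
    public interface FieldMappingSource {
        Map<String, Integer> getFieldMapping();
    }

    // Copies public POJO fields into an array that stands in for Row,
    // placing each value at the position given by the explicit mapping.
    public static Object[] toRow(Object pojo, Map<String, Integer> mapping) {
        Object[] row = new Object[mapping.size()];
        for (Map.Entry<String, Integer> entry : mapping.entrySet()) {
            try {
                Field field = pojo.getClass().getField(entry.getKey());
                row[entry.getValue()] = field.get(pojo);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot read field " + entry.getKey(), e);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Integer> mapping = new LinkedHashMap<>();
        mapping.put("name", 0); // "name" goes to position 0 of the row
        mapping.put("id", 1);   // "id" goes to position 1 of the row
        FieldMappingSource source = () -> mapping;

        Object[] row = toRow(new Person(42, "Alice"), source.getFieldMapping());
        System.out.println(Arrays.toString(row)); // prints [Alice, 42]
    }
}
```

With an explicit mapping like this, the order of the fields in the result no longer depends on any implicit convention.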

I also think that using *fieldIndexes* will be an issue if we want to support 
conversion from types like *GenericRecord* that provide a *Map*-like interface 
with *get* and *put* methods that do not have explicit ordering.
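
As an illustration of the *Map*-like case, here is a minimal sketch where a plain *java.util.Map* stands in for such a record (it is not the Avro API). The class *MapRecordSketch* and the method *toRow* are made-up names, and *Object[]* again stands in for *Row*. Since a *HashMap* has no defined field order, positions can only come from a name-based mapping:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class MapRecordSketch {

    // Fills a positional row from a map-like record, using a mapping
    // from field name to row position (hypothetical helper).
    public static Object[] toRow(Map<String, Object> record, Map<String, Integer> mapping) {
        Object[] row = new Object[mapping.size()];
        for (Map.Entry<String, Integer> entry : mapping.entrySet()) {
            row[entry.getValue()] = record.get(entry.getKey());
        }
        return row;
    }

    public static void main(String[] args) {
        // The record itself carries no ordering, only names and values.
        Map<String, Object> record = new HashMap<>();
        record.put("id", 1L);
        record.put("amount", 9.5);

        Map<String, Integer> mapping = new HashMap<>();
        mapping.put("amount", 0);
        mapping.put("id", 1);

        System.out.println(Arrays.toString(toRow(record, mapping))); // prints [9.5, 1]
    }
}
```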

Does this make sense? Or am I trying to solve the wrong problem?

[~jark], [~fhueske] could you please describe your idea regarding the *RowTypeInfo* 
approach in more detail? I don't think I understand what you are proposing.


> Extend TableSource to support nested data
> -----------------------------------------
>
>                 Key: FLINK-5280
>                 URL: https://issues.apache.org/jira/browse/FLINK-5280
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>    Affects Versions: 1.2.0
>            Reporter: Fabian Hueske
>            Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface does currently only support the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
