[
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751212#comment-15751212
]
Ivan Mushketyk commented on FLINK-5280:
---------------------------------------
Hi [~jark], thank you for your reply.
I see that *fieldIndexes* play a similar role and that they are used in
*CodeGenerator*, but it is not clear what the order of these mapping ids
should be. If you take a look at the test POJO *TableSource* that I've created:
https://gist.github.com/mushketyk/acffb701a1f71a6e9bd661c781d7b18c
you can see that the order of fields returned by *getFieldsNames* is pretty
much random:
{code:java}
@Override
public String[] getFieldsNames() {
    return new String[]{"amount", "childPojo", "id", "name"};
}
{code}
It does not match the field order in either the POJO class definition or the
POJO type information. I had to pretty much come up with it by trial and error,
because (if I understand it correctly) there is no explicit convention on what
this order should be. That's why I am proposing to add a new method that will
help to establish a clear relationship between the field ids in the result
*Row* and the fields in the original POJO type.
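To illustrate what I mean by an explicit relationship, here is a minimal sketch that does not use any Flink classes. The class *ExplicitFieldMapping*, the method *toRow* and the *Pojo* type are hypothetical names made up for this example, the field types are assumed, and *childPojo* is left out for brevity. The only point is that position i of the result is defined by the i-th entry of a caller-supplied array of field names, so the order is stated rather than guessed:
{code:java}
import java.lang.reflect.Field;
import java.util.Arrays;

public class ExplicitFieldMapping {

    // Hypothetical POJO: field names follow the example above, types are assumed.
    public static class Pojo {
        public long id;
        public String name;
        public double amount;

        public Pojo(long id, String name, double amount) {
            this.id = id;
            this.name = name;
            this.amount = amount;
        }
    }

    // Hypothetical helper: position i of the result is read from the public
    // POJO field named rowFields[i], so the mapping is explicit.
    static Object[] toRow(Object pojo, String[] rowFields) throws ReflectiveOperationException {
        Object[] row = new Object[rowFields.length];
        for (int i = 0; i < rowFields.length; i++) {
            Field field = pojo.getClass().getField(rowFields[i]);
            row[i] = field.get(pojo);
        }
        return row;
    }

    public static void main(String[] args) throws Exception {
        // The caller picks the order; it does not depend on declaration order.
        String[] rowFields = {"name", "id", "amount"};
        Object[] row = toRow(new Pojo(1L, "a", 2.5), rowFields);
        System.out.println(Arrays.toString(row));
    }
}
{code}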
I also think that using *fieldIndexes* will be an issue if we want to support
conversion from types like *GenericRecord* that provide a *Map*-like interface
with *get* and *put* methods that do not have explicit ordering.
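A rough sketch of the *Map*-like case, again without Flink or Avro classes: a plain *java.util.Map* stands in for a *GenericRecord*-style type, and *NameBasedAccess* and *toRow* are hypothetical names used only for illustration. Since such a record has no inherent field order, the row layout has to come from an explicit list of field names:
{code:java}
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class NameBasedAccess {

    // Hypothetical helper: the record is only accessed by name, so the
    // row order is taken entirely from rowFields.
    static Object[] toRow(Map<String, Object> record, String[] rowFields) {
        Object[] row = new Object[rowFields.length];
        for (int i = 0; i < rowFields.length; i++) {
            row[i] = record.get(rowFields[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        // HashMap makes no ordering guarantee, like the Map-like types above.
        Map<String, Object> record = new HashMap<>();
        record.put("id", 1L);
        record.put("name", "a");
        record.put("amount", 2.5);

        Object[] row = toRow(record, new String[]{"amount", "id", "name"});
        System.out.println(Arrays.toString(row));
    }
}
{code}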
Does it make sense? Or am I trying to solve the wrong problem?
[~jark], [~fhueske] could you please describe your idea regarding the
*RowTypeInfo* approach in more detail? I don't think I understand what you
propose to do.
> Extend TableSource to support nested data
> -----------------------------------------
>
> Key: FLINK-5280
> URL: https://issues.apache.org/jira/browse/FLINK-5280
> Project: Flink
> Issue Type: Improvement
> Components: Table API & SQL
> Affects Versions: 1.2.0
> Reporter: Fabian Hueske
> Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface does currently only support the definition of
> flat rows.
> However, there are several storage formats for nested data that should be
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in
> Calcite's schema need to be extended to support nested data.