GitHub user twalthr opened a pull request:
https://github.com/apache/flink/pull/5132
[FLINK-8203] [FLINK-7681] [table] Make schema definition of
DataStream/DataSet to Table conversion more flexible
## What is the purpose of the change
This PR makes the schema definition more flexible. It adds two ways of
defining schema information:
Reference input fields by name:
All fields in the schema definition are referenced by name (and possibly
renamed using an alias (as)). In this mode, fields can be reordered and
projected out. Moreover, proctime and rowtime attributes can be defined at
arbitrary positions using arbitrary names (except those that already exist
in the result schema). This mode can be used for any input type, including
POJOs.
Reference input fields by position:
Field references must refer to existing fields in the input type (except for
renaming with alias (as)). In this mode, fields are simply renamed.
Event-time attributes can replace the field at their position in the input
data (if it is of the correct type) or be appended at the end. Proctime
attributes must be appended at the end. This mode can only be used if the
input type has a defined field order (tuple, case class, Row) and none of
the fields references a field of the input type.
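To make the two modes easier to compare, here is a minimal, self-contained sketch of the resolution rules described above. It is a simplified model in plain Python, not the Flink API: the function `resolve_schema`, the tuple encoding of field expressions, and the `ordered` flag are all hypothetical names introduced for illustration, and type checks (for example, whether a replaced field has a valid rowtime type) are omitted.

```python
# Hypothetical expression encoding (NOT Flink's API):
#   ("ref", name)          plain field reference
#   ("as", name, alias)    input field renamed with an alias
#   ("rowtime", name)      event-time attribute
#   ("proctime", name)     processing-time attribute


def resolve_schema(input_fields, exprs, ordered=True):
    """Return (mode, [(output_name, source)]).

    source is the index of the input field, or the string
    "rowtime"/"proctime" for a time attribute.
    """
    # By-position mode requires a defined field order and that no
    # expression references an existing input field by name.
    by_position = ordered and not any(
        e[0] in ("ref", "as") and e[1] in input_fields for e in exprs
    )
    result = []
    for i, e in enumerate(exprs):
        kind = e[0]
        if kind in ("rowtime", "proctime"):
            # By position, proctime may only be appended after all input
            # fields; rowtime may replace a field or be appended.
            if by_position and kind == "proctime" and i < len(input_fields):
                raise ValueError("proctime must be appended at the end")
            result.append((e[1], kind))
        elif by_position:
            # The i-th expression simply renames the i-th input field.
            if kind != "ref" or i >= len(input_fields):
                raise ValueError(f"cannot rename by position: {e}")
            result.append((e[1], i))
        else:
            # By name: fields may be reordered, projected out, or aliased.
            if e[1] not in input_fields:
                raise ValueError(f"unknown input field: {e[1]}")
            out_name = e[2] if kind == "as" else e[1]
            result.append((out_name, input_fields.index(e[1])))
    names = [name for name, _ in result]
    if len(names) != len(set(names)):
        raise ValueError("duplicate field name in result schema")
    return ("position" if by_position else "name"), result


tuple_fields = ["f0", "f1", "f2"]

# By name: reorder, project out f1, alias f0, add a proctime attribute.
print(resolve_schema(tuple_fields,
                     [("ref", "f2"), ("as", "f0", "id"), ("proctime", "pt")]))
# ('name', [('f2', 2), ('id', 0), ('pt', 'proctime')])

# By position: rename every field, replace f1 with a rowtime attribute,
# and append a proctime attribute.
print(resolve_schema(tuple_fields,
                     [("ref", "a"), ("rowtime", "ts"), ("ref", "c"),
                      ("proctime", "pt")]))
# ('position', [('a', 0), ('ts', 'rowtime'), ('c', 2), ('pt', 'proctime')])
```

In this model, an input without a defined field order (`ordered=False`, as for a POJO) always falls back to the by-name mode, so a reference to a name that does not exist in the input is rejected rather than treated as a rename.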
The conversion now also accepts any `TypeInformation`. In the past, this
behavior was not consistent.
I will add some paragraphs to the documentation once we have agreed on this
new behavior.
## Brief change log
Various changes in `TableEnvironment`, `Stream/BatchTableEnvironment`, and
pattern matches that referenced `AtomicType` instead of `TypeInformation`.
## Verifying this change
See TableEnvironment tests.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? will document it later
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/twalthr/flink FLINK-8203
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/5132.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5132
----
commit 38562e1dcc5416996ad5531b901f89e4b868e5eb
Author: twalthr <[email protected]>
Date: 2017-12-07T10:52:28Z
[FLINK-8203] [FLINK-7681] [table] Make schema definition of
DataStream/DataSet to Table conversion more flexible
----
---