GitHub user zentol opened a pull request:
https://github.com/apache/flink/pull/1266
[FLINK-2692] Untangle CsvInputFormat
This PR splits the CsvInputFormat into a Tuple and a POJO version. To this
end, the (Common)CsvInputFormat classes were merged, and the type-specific
portions refactored into separate classes.
Additionally, the ScalaCsvInputFormat has been removed; Java and Scala API
now use the same InputFormats. Previously, the formats differed in the way they
created the output tuples; this is now realized in a newly introduced abstract
method "createOrReuseInstance(Object[] fieldValues, T reuse)" within the
TupleSerializerBase.
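A minimal sketch of how such a hook could unify the two tuple-creation
strategies. Everything here is hypothetical and only illustrates the idea:
the classes below are stand-ins, not Flink's TupleSerializerBase or its
subclasses, and the split "Java side mutates the reuse object, Scala side
builds a fresh immutable tuple" is an assumption, not something stated in
this PR description.

```java
// Illustrative stand-ins only; none of these classes exist in Flink.
final class CreateOrReuseSketch {

    // Stand-in for the serializer base class that declares the abstract hook.
    abstract static class SerializerBaseSketch<T> {
        abstract T createOrReuseInstance(Object[] fieldValues, T reuse);
    }

    // Mutable two-field record, loosely modeled on a Java API tuple (assumption).
    static final class MutablePair {
        Object f0;
        Object f1;
    }

    // Immutable two-field record, loosely modeled on a Scala tuple (assumption).
    static final class ImmutablePair {
        final Object first;
        final Object second;

        ImmutablePair(Object first, Object second) {
            this.first = first;
            this.second = second;
        }
    }

    // Writes the parsed values into the reuse object when one is supplied.
    static final class ReusingSerializer extends SerializerBaseSketch<MutablePair> {
        @Override
        MutablePair createOrReuseInstance(Object[] fieldValues, MutablePair reuse) {
            MutablePair target = (reuse != null) ? reuse : new MutablePair();
            target.f0 = fieldValues[0];
            target.f1 = fieldValues[1];
            return target;
        }
    }

    // Ignores the reuse object and always builds a new instance.
    static final class CreatingSerializer extends SerializerBaseSketch<ImmutablePair> {
        @Override
        ImmutablePair createOrReuseInstance(Object[] fieldValues, ImmutablePair reuse) {
            return new ImmutablePair(fieldValues[0], fieldValues[1]);
        }
    }
}
```

With a hook of this shape, the input format only has to hand over the parsed
field values; whether an object is reused or created is decided by the
serializer, which is what would allow one format to serve both APIs.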
Fields to include and field names are no longer passed via setters, but
instead via the constructor. Several new constructors were added to accommodate
different use cases, along with 2 new static methods to generate a default
include mask, or convert an int[] list of indices to a boolean include mask.
Classes no longer have to be passed separately, as they are extracted from
the TypeInformation object.
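For illustration, the two mask helpers could look roughly like the sketch
below. The class and method names are hypothetical, since this message does
not name the actual methods, and the validation shown is an assumption. Note
that the conversion only records which positions are set, so the order of the
input indices does not affect the resulting mask.

```java
// Hypothetical helper names; the PR's actual method names are not given here.
final class IncludeMaskSketch {

    // Default mask: include each of the first `size` fields.
    static boolean[] createDefaultMask(int size) {
        if (size < 0) {
            throw new IllegalArgumentException("Size must not be negative: " + size);
        }
        boolean[] mask = new boolean[size];
        java.util.Arrays.fill(mask, true);
        return mask;
    }

    // Converts field indices into a boolean mask whose length is max index + 1.
    static boolean[] toBooleanMask(int[] sourceFieldIndices) {
        int max = -1;
        for (int index : sourceFieldIndices) {
            if (index < 0) {
                throw new IllegalArgumentException("Field indices must not be negative: " + index);
            }
            max = Math.max(max, index);
        }
        boolean[] mask = new boolean[max + 1];
        for (int index : sourceFieldIndices) {
            mask[index] = true;
        }
        return mask;
    }

    public static void main(String[] args) {
        // Both calls print [true, false, true, true].
        System.out.println(java.util.Arrays.toString(toBooleanMask(new int[] {0, 2, 3})));
        System.out.println(java.util.Arrays.toString(toBooleanMask(new int[] {3, 0, 2})));
    }
}
```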
A few sanity checks were moved from the ExecutionEnvironment to the InputFormat.
The testReadSparseWithShuffledPositions test was removed since monotonic
order of field indices is, and afaik was, not actually necessary due to the way
the indices are converted to a boolean[].
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zentol/flink 2692_csv
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1266.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1266
----
commit d497415adc2e58b4e9912ae89a53444825416366
Author: zentol <[email protected]>
Date: 2015-10-18T18:23:23Z
[FLINK-2692] Untangle CsvInputFormat
----