I usually use Apache Commons CSV for that, as you can see here (inside the
*parseWithApacheCommonsCsv* branch of the if):

https://github.com/okkam-it/flink-examples/blob/master/src/main/java/it/okkam/datalinks/batch/flink/datasourcemanager/importers/Csv2RowExample.java
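
In case it's useful, here is a rough, dependency-free sketch of the same per-column parsing idea. The class name, column layout, and parser list are all illustrative, and the naive split stands in for the real parsing; the linked example relies on Commons CSV, which also handles quoting and escaping properly:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class WideRowSketch {

    // Hypothetical per-column parsers standing in for Flink's type
    // serializers; the column layout (int, string, double) is made up.
    static final List<Function<String, Object>> PARSERS = List.of(
            Integer::valueOf,  // column 0
            s -> s,            // column 1
            Double::valueOf);  // column 2

    // Parse one CSV line into an Object[] "row". The naive split is only
    // for illustration; a real reader would use Apache Commons CSV.
    static Object[] parseLine(String line) {
        String[] fields = line.split(",", -1);
        Object[] row = new Object[fields.length];
        for (int i = 0; i < fields.length; i++) {
            row[i] = PARSERS.get(i).apply(fields[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseLine("42,hello,3.5")));
        // prints [42, hello, 3.5]
    }
}
```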

I hope this helps!
Flavio

On Wed, Nov 23, 2016 at 2:48 PM, Anton Solovev <anton_solo...@epam.com>
wrote:

> Hi,
>
> I'm working on https://issues.apache.org/jira/browse/FLINK-2186
>
> As I understand it, Flink cannot read wide-column files into a Tuple, only
> into a POJO. So far we must create that POJO manually, which is fine when
> the number of columns is small; when it's over a thousand, that hardly
> seems feasible.
>
> To solve this issue I see these ways:
>
> -          Create an InputFormat that reads each column with the proper
> type serializer and keeps the values in common storage such as Object[],
> along with meta-information about the field types. Some chunks of code
> from an attempt:
> https://github.com/apache/flink/compare/master...tonycox:FLINK-2186
>
> -          Use a complex combination of Tuples and/or POJOs
>
> -          Somehow use code generation to create a POJO with a huge field
> count
>
> What do you think?
>
> Best regards,
> Anton
>
