Re: Flink CSV parsing

2017-03-11 Thread Alexander Alexandrov
FYI, I recently revisited state-of-the-art CSV parsing libraries for Emma. I think this blog post might be useful: https://github.com/uniVocity/csv-parsers-comparison. The uniVocity parsers library seems to dominate the benchmarks and is feature-complete. As far as I can tell at the moment

Re: Flink CSV parsing

2017-03-10 Thread Flavio Pompermaier
If you already have an idea of how to proceed, maybe I can take care of issuing a PR using commons-csv or whatever library you prefer. On 10 Mar 2017 22:07, "Fabian Hueske" wrote: Hi Flavio, Flink's CsvInputFormat was originally meant to be an efficient way to parse

Re: Flink CSV parsing

2017-03-10 Thread Fabian Hueske
Hi Flavio, Flink's CsvInputFormat was originally meant to be an efficient way to parse structured text files and dates back to the very early days of the project (probably 2011 or so). It was never meant to be compliant with the RFC specification and initially didn't support many features like