[ https://issues.apache.org/jira/browse/FLINK-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-10684:
-----------------------------------
Labels: CSV auto-deprioritized-major (was: CSV stale-major)
Priority: Minor (was: Major)
This issue was labeled "stale-major" 7 days ago and has not received any updates so
it is being deprioritized. If this ticket is actually Major, please raise the
priority and ask a committer to assign you the issue or revive the public
discussion.
> Improve the CSV reading process
> -------------------------------
>
> Key: FLINK-10684
> URL: https://issues.apache.org/jira/browse/FLINK-10684
> Project: Flink
> Issue Type: Improvement
> Components: API / DataSet
> Reporter: Xingcan Cui
> Priority: Minor
> Labels: CSV, auto-deprioritized-major
>
> CSV is one of the most commonly used file formats in data wrangling. To load
> records from CSV files, Flink has provided the basic {{CsvInputFormat}}, as
> well as some variants (e.g., {{RowCsvInputFormat}} and
> {{PojoCsvInputFormat}}). However, it seems that the reading process can be
> improved. For example, we could add a built-in util to automatically infer
> schemas from CSV headers and samples of data. Also, the current bad record
> handling method can be improved by somehow keeping the invalid lines (and
> even the reasons for failed parsing), instead of logging the total number
> only.
> This is an umbrella issue for all the improvements and bug fixes for the CSV
> reading process.
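
The two improvements suggested in the description (schema inference from
sample data, and keeping invalid lines together with the reason they failed)
could look roughly like the sketch below. It is plain Java and does not use
any Flink API: the class CsvReadingSketch, its methods, and the BadRecord type
are hypothetical names invented for illustration. It also assumes a naive
comma split with no quoting or escaping, which a real reader would need.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; none of these names exist in Flink. */
final class CsvReadingSketch {

    /** A rejected line, kept together with the reason parsing failed. */
    static final class BadRecord {
        final int lineNumber;   // 1-based position within the input list
        final String line;      // the raw, unparsed line
        final String reason;    // human-readable explanation

        BadRecord(int lineNumber, String line, String reason) {
            this.lineNumber = lineNumber;
            this.line = line;
            this.reason = reason;
        }
    }

    /**
     * Splits each line on commas (assumption: no quoted fields). Lines with
     * the wrong number of fields are not dropped silently; they are appended
     * to badRecords with a reason.
     */
    static List<String[]> parse(List<String> lines, int columnCount,
                                List<BadRecord> badRecords) {
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
            if (fields.length == columnCount) {
                rows.add(fields);
            } else {
                badRecords.add(new BadRecord(i + 1, line,
                        "expected " + columnCount + " fields but found " + fields.length));
            }
        }
        return rows;
    }

    /**
     * Infers one of "INT", "DOUBLE" or "STRING" per column from sample rows,
     * widening the type whenever a sample value does not fit. Every sample
     * row is assumed to have exactly columnCount fields.
     */
    static String[] inferTypes(int columnCount, List<String[]> samples) {
        String[] types = new String[columnCount];
        for (int c = 0; c < columnCount; c++) {
            // With no samples there is no evidence, so fall back to STRING.
            String type = samples.isEmpty() ? "STRING" : "INT";
            for (String[] row : samples) {
                type = widen(type, row[c].trim());
            }
            types[c] = type;
        }
        return types;
    }

    /** Returns the narrowest type that fits both the current type and the value. */
    private static String widen(String current, String value) {
        if (current.equals("STRING")) {
            return "STRING";
        }
        if (current.equals("INT") && value.matches("[+-]?\\d+")) {
            return "INT";
        }
        if (value.matches("[+-]?\\d+(\\.\\d+)?")) {
            return "DOUBLE";
        }
        return "STRING";
    }
}
```

In this sketch the caller would pass the rows returned by parse as the samples
for inferTypes, so malformed lines never reach inference, and could then
inspect badRecords instead of only seeing a count in the log.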