[ https://issues.apache.org/jira/browse/FLINK-29689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vrinda Palod updated FLINK-29689:
---------------------------------
Description: Currently , (was: CSV is one of the most commonly used file
formats in data wrangling. To load records from CSV files, Flink has provided
the basic {{CsvInputFormat}}, as well as some variants (e.g.,
{{RowCsvInputFormat}} and {{PojoCsvInputFormat}}). However, it seems that the
reading process can be improved. For example, we could add a built-in util to
automatically infer schemas from CSV headers and samples of data. Also, the
current bad record handling method can be improved by somehow keeping the
invalid lines (and even the reasons for failed parsing), instead of logging the
total number only.
This is an umbrella issue for all the improvements and bug fixes for the CSV
reading process.)
> ConvertExcelToCSVProcessor to handle more data
> ----------------------------------------------
>
> Key: FLINK-29689
> URL: https://issues.apache.org/jira/browse/FLINK-29689
> Project: Flink
> Issue Type: Improvement
> Components: API / DataSet
> Reporter: Vrinda Palod
> Priority: Major
> Labels: CSV, excel, nifi, poi
>
> Currently ,
--
This message was sent by Atlassian Jira
(v8.20.10#820010)