[jira] [Updated] (FLINK-29689) ConvertExcelToCSVProcessor to handle more data

Vrinda Palod (Jira) Wed, 19 Oct 2022 05:28:18 -0700


     [ https://issues.apache.org/jira/browse/FLINK-29689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Vrinda Palod updated FLINK-29689:
---------------------------------
    Description: Currently, (was: CSV is one of the most commonly used file 
formats in data wrangling. To load records from CSV files, Flink has provided 
the basic {{CsvInputFormat}}, as well as some variants (e.g., 
{{RowCsvInputFormat}} and {{PojoCsvInputFormat}}). However, it seems that the 
reading process can be improved. For example, we could add a built-in util to 
automatically infer schemas from CSV headers and samples of data. Also, the 
current bad record handling method can be improved by somehow keeping the 
invalid lines (and even the reasons for failed parsing), instead of logging the 
total number only.

This is an umbrella issue for all the improvements and bug fixes for the CSV 
reading process.)

> ConvertExcelToCSVProcessor to handle more data
> ----------------------------------------------
>
>                 Key: FLINK-29689
>                 URL: https://issues.apache.org/jira/browse/FLINK-29689
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / DataSet
>            Reporter: Vrinda Palod
>            Priority: Major
>              Labels: CSV, excel, nifi, poi
>
> Currently,



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
