[ https://issues.apache.org/jira/browse/FLINK-9814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16549067#comment-16549067 ]
Fabian Hueske commented on FLINK-9814:
--------------------------------------
There is already an option to skip the first line of a CSV file when that line
is a schema header.
So it should not be a problem to add another option that checks the header
before skipping it. We would need to enforce a specific header format, though:
the header would have to use the configured field delimiter and contain only
the column names (possibly with surrounding whitespace trimmed).
The check would run on the first line of every processed CSV file and should not
add much overhead. However, as I said before, the check may happen quite late in
a job, e.g., when the first split of a file is the last one that is assigned to
a worker.
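
To make the proposed check concrete, here is a minimal sketch of what validating a header line could look like. It is only an illustration under assumptions: the class CsvHeaderCheck and its methods are hypothetical names and are not part of Flink, and the sketch assumes the header must list exactly the expected column names, in order, separated by the configured field delimiter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Flink's API. */
public class CsvHeaderCheck {

    /** Splits the header line on the field delimiter and trims each column name. */
    public static List<String> parseHeader(String headerLine, String fieldDelimiter) {
        // Pattern.quote treats the delimiter literally (e.g. "|" is not a regex here).
        // The limit -1 keeps trailing empty fields so they count as columns.
        String[] parts = headerLine.split(Pattern.quote(fieldDelimiter), -1);
        List<String> names = new ArrayList<>(parts.length);
        for (String part : parts) {
            names.add(part.trim());
        }
        return names;
    }

    /**
     * Throws if the header does not contain exactly the expected column names,
     * in the expected order. Assumed semantics; a real option might be laxer.
     */
    public static void checkHeader(
            String headerLine, String fieldDelimiter, String[] expectedFieldNames) {
        List<String> actual = parseHeader(headerLine, fieldDelimiter);
        List<String> expected = Arrays.asList(expectedFieldNames);
        if (!actual.equals(expected)) {
            throw new IllegalArgumentException(
                "CSV header " + actual + " does not match expected columns " + expected);
        }
    }
}
```

In this sketch the check would be called once with the first line of each file, before that line is skipped. It only looks at one line per file, which is why the overhead should stay small.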
> CsvTableSource "lack of column" warning
> ---------------------------------------
>
> Key: FLINK-9814
> URL: https://issues.apache.org/jira/browse/FLINK-9814
> Project: Flink
> Issue Type: Wish
> Components: Table API & SQL
> Affects Versions: 1.5.0
> Reporter: François Lacombe
> Assignee: vinoyang
> Priority: Minor
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> The CsvTableSource class is built by defining the columns expected to be found in
> the corresponding CSV file.
>
> It would be great to throw an Exception when the CSV file doesn't have the
> same structure as defined in the source. For the sake of backward compatibility,
> developers should explicitly configure the builder to define columns strictly and
> to expect an Exception to be thrown in case of a structure difference.
> This can easily be checked against the file header, if one exists.
> Is this possible?