[ https://issues.apache.org/jira/browse/FLINK-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487168#comment-16487168 ]
Fabian Hueske commented on FLINK-6016:
--------------------------------------
Flink's {{FileInputFormat}} was designed to work well with distributed file
systems such as HDFS that stripe and replicate data across machines and disks.
After all, the primary use case for these systems is to crunch a lot of data
and this data will most likely not reside on a single disk.
> Newlines should be valid in quoted strings in CSV
> -------------------------------------------------
>
> Key: FLINK-6016
> URL: https://issues.apache.org/jira/browse/FLINK-6016
> Project: Flink
> Issue Type: Bug
> Components: Batch Connectors and Input/Output Formats
> Affects Versions: 1.2.0
> Reporter: Luke Hutchison
> Priority: Major
>
> The RFC for the CSV format specifies that newlines are valid in quoted
> strings in CSV:
> https://tools.ietf.org/html/rfc4180
> However, when Flink parses a CSV file in which a quoted field contains a
> newline, such as:
> {noformat}
> "3
> 4",5
> {noformat}
> you get this exception:
> {noformat}
> Line could not be parsed: '"3'
> ParserError UNTERMINATED_QUOTED_STRING
> Expect field types: class java.lang.String, class java.lang.String
> {noformat}
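
For illustration, the behavior the issue asks for can be sketched as a small quote-aware record splitter. This is a minimal standalone sketch, not Flink's CSV input format or its API; the class name QuotedCsvSketch and the method parse are hypothetical, and it assumes the whole input is available as one string (so it sidesteps the question of how parallel file splits would find record boundaries).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: splits CSV text into records while treating
// newlines inside double-quoted fields as data, as RFC 4180 allows.
public class QuotedCsvSketch {

    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        boolean recordHasContent = false;

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    // A doubled quote inside a quoted field is an escaped quote.
                    if (i + 1 < text.length() && text.charAt(i + 1) == '"') {
                        field.append('"');
                        i++;
                    } else {
                        inQuotes = false;
                    }
                } else {
                    // Includes '\n' and '\r': they belong to the field here.
                    field.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
                recordHasContent = true;
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
                recordHasContent = true;
            } else if (c == '\n' || c == '\r') {
                // Treat "\r\n" as a single record delimiter.
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;
                }
                // Assumption: blank lines are skipped rather than emitted.
                if (recordHasContent) {
                    fields.add(field.toString());
                    field.setLength(0);
                    records.add(fields);
                    fields = new ArrayList<>();
                    recordHasContent = false;
                }
            } else {
                field.append(c);
                recordHasContent = true;
            }
        }
        // Flush a final record that has no trailing newline. An unterminated
        // quote is accepted leniently here; a real parser would report it.
        if (recordHasContent) {
            fields.add(field.toString());
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) {
        // The example from the issue description: one record, two fields.
        List<List<String>> rows = parse("\"3\n4\",5\n");
        System.out.println(rows.size()); // prints 1
    }
}
```

With the input from the description, the first field of the single record is the two-line string "3", newline, "4", and the second field is "5".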