Rick Moritz created SPARK-20155:
-----------------------------------
Summary: CSV files with quoted quotes can't be parsed if a
delimiter follows the quoted quote
Key: SPARK-20155
URL: https://issues.apache.org/jira/browse/SPARK-20155
Project: Spark
Issue Type: Bug
Components: Input/Output
Affects Versions: 2.0.0
Reporter: Rick Moritz
According to:
https://tools.ietf.org/html/rfc4180#section-2
7. If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"
This currently works as is, but the following does not:
"aaa","b""b,b","ccc"
while "aaa","b\"b,b","ccc" does get parsed.
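For reference, here is what an RFC 4180 style reader is expected to return for the two quoted-quote lines above. This is only an illustration using Python's standard csv module (not Spark's parser), whose default dialect treats a doubled quote inside a quoted field as one literal quote:

```python
import csv

# The two sample lines from this report, written as raw strings.
works_today = '"aaa","b""bb","ccc"'
fails_today = '"aaa","b""b,b","ccc"'

# csv.reader accepts any iterable of lines; doublequote=True is the default
# and makes "" inside a quoted field stand for a single literal quote.
row_ok = next(csv.reader([works_today], delimiter=",", quotechar='"', doublequote=True))
row_bug = next(csv.reader([fails_today], delimiter=",", quotechar='"', doublequote=True))

print(row_ok)   # ['aaa', 'b"bb', 'ccc']
print(row_bug)  # ['aaa', 'b"b,b', 'ccc']
```

In both cases the result has three fields, and the comma after the doubled quote stays inside the second field.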
I assume the failure happens because quotes are currently being parsed in pairs,
and that somehow ends up unquoting the delimiter.
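To make the expected handling concrete, here is a minimal sketch of a single-line splitter that tracks whether it is inside a quoted field, so a doubled quote never changes that state. The function name split_rfc4180_line is hypothetical, and this is not how Spark's CSV reader is implemented; it ignores embedded newlines and malformed input.

```python
def split_rfc4180_line(line, delimiter=",", quote='"'):
    """Split one CSV line, treating "" inside a quoted field as a literal quote."""
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < len(line) and line[i + 1] == quote:
                    # Doubled quote: emit one literal quote and stay in the quoted field.
                    current.append(quote)
                    i += 1
                else:
                    # Lone quote: the quoted field ends here.
                    in_quotes = False
            else:
                # Anything else, including the delimiter, is field content.
                current.append(ch)
        elif ch == quote:
            in_quotes = True
        elif ch == delimiter:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


print(split_rfc4180_line('"aaa","b""b,b","ccc"'))  # ['aaa', 'b"b,b', 'ccc']
```

The point of the sketch is that the delimiter following the doubled quote is consumed while in_quotes is still true, so it cannot split the field.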