Kasonnara created NIFI-8377:
-------------------------------
Summary: CSVReader: quoting and trimming with value separator
inconsistency
Key: NIFI-8377
URL: https://issues.apache.org/jira/browse/NIFI-8377
Project: Apache NiFi
Issue Type: Bug
Components: Extensions
Affects Versions: 1.13.2, 1.12.1
Reporter: Kasonnara
Attachments: template-test-CSVReader-for-bug-report.xml
There is a small inconsistency between quoting and trimming when the value
separator appears in the data and the Apache Commons CSV parser is used.
Example:
{noformat}
case, A, B
quoted value,"aa",
quoted and trimmed value, "aa" ,
quoted value with comma,"a,a",
trimmed but wrongly unquoted value with comma, "a,a" ,{noformat}
In the first three cases, the value is parsed correctly:
{noformat}
A : "aa", B : null{noformat}
{noformat}
A : "aa", B : null{noformat}
{noformat}
A : "a,a", B : null{noformat}
So quoting a value that contains the value separator, and surrounding a quoted
value with spaces to be trimmed, each work well on their own.
However, in the last example, which combines a quoted value separator with
outer spaces to trim, quoting fails:
{noformat}
A : "\"a", B : "a\""{noformat}
I think setting
org.apache.commons.csv.CSVFormat.withIgnoreSurroundingSpaces(true) on the CSV
parser would solve the issue, but I don't see the whole picture well enough to
tell whether this would have other side effects.
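
The mechanism can be illustrated outside NiFi. The sketch below is only an analogy: it uses Python's standard csv module, not Apache Commons CSV, and it assumes that the module's skipinitialspace option behaves roughly like ignoring the space before an opening quote. The helper name parse and the explicit strip() call (standing in for the reader's trimming) are hypothetical. The point is that when the space after the separator is treated as data, the quote is no longer the first character of the field, so the quoted comma is taken as a separator.

```python
import csv

# Sample rows from the report, header included.
LINES = [
    'case, A, B',
    'quoted value,"aa",',
    'quoted and trimmed value, "aa" ,',
    'quoted value with comma,"a,a",',
    'trimmed but wrongly unquoted value with comma, "a,a" ,',
]


def parse(lines, skip_initial_space):
    """Parse the lines, then trim every field (stand-in for the reader's trimming)."""
    reader = csv.reader(lines, skipinitialspace=skip_initial_space)
    return [[field.strip() for field in row] for row in reader]


# Leading space kept as data: the quote is not recognised, the row splits on
# the comma inside the quotes.
print(parse(LINES, skip_initial_space=False)[-1])

# Leading space skipped before quote detection: the quoted comma stays inside
# one field.
print(parse(LINES, skip_initial_space=True)[-1])
```

In this analogy the last row has four fields, including the fragments "a and a", when the leading space is kept, and three fields with a,a intact when it is skipped. Whether withIgnoreSurroundingSpaces(true) gives the same result in the NiFi CSVReader still needs to be checked against Commons CSV itself.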
--
This message was sent by Atlassian Jira
(v8.3.4#803005)