[ https://issues.apache.org/jira/browse/DRILL-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15458600#comment-15458600 ]
F Méthot commented on DRILL-3178:
---------------------------------

With 1.7 build, for this file:

> cat data/3428.csv
1,"line1"
2,"line2
"
3,"line3"

I get:

> select * from my_dfs.`/root/data/3428.csv`;
Error: DATA_READ ERROR: Error processing input: Cannot use newline character within quoted string, line=3, char=22. Content parsed: [ ]

Failure while reading file file:///root/data/3428.csv. Happened at or shortly before byte position 22.
Fragment 0:0

[Error Id: 49a05427-e763-4cca-9f97-e4b4308ecb75 on perfnode206.perf.lab:31010] (state=,code=0)

> csv reader should allow newlines inside quotes
> -----------------------------------------------
>
>                 Key: DRILL-3178
>                 URL: https://issues.apache.org/jira/browse/DRILL-3178
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Text & CSV
>    Affects Versions: 1.0.0
>         Environment: Ubuntu Trusty 14.04.2 LTS
>            Reporter: Neal McBurnett
>             Fix For: Future
>
>
> When reading a csv file which contains newlines within quoted strings, e.g.
> via
> select * from dfs.`/tmp/q.csv`;
> Drill 1.0 says:
> Error: SYSTEM ERROR: com.univocity.parsers.common.TextParsingException:
> Error processing input: Cannot use newline character within quoted string
> But many tools produce csv files with newlines in quoted strings. Drill
> should be able to handle them.
> Workaround: the csvquote program (https://github.com/dbro/csvquote) can
> encode embedded commas and newlines, and even decode them later if desired.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
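
For illustration only: the workaround idea in the issue description (rewriting the file so that no quoted field contains a raw newline before Drill reads it) can be sketched in Python. This is not how csvquote is implemented and it is not a Drill feature. The function name `flatten_embedded_newlines` and the choice of a space as the placeholder are assumptions made for this sketch, and unlike csvquote the replacement here is not reversible.

```python
import csv
import io


def flatten_embedded_newlines(text, placeholder=" "):
    """Return CSV text in which newlines inside fields are replaced.

    Hypothetical helper for illustration; `placeholder` is an assumption.
    """
    # newline="" keeps embedded line breaks intact for the csv module
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        cleaned = [
            field.replace("\r\n", placeholder).replace("\n", placeholder)
            for field in row
        ]
        writer.writerow(cleaned)
    return out.getvalue()


# Same shape as the file in the comment: the second record spans two lines.
sample = '1,"line1"\n2,"line2\n"\n3,"line3"\n'
flat = flatten_embedded_newlines(sample)
print(flat)
```

After this preprocessing every record occupies exactly one physical line, so a reader that rejects newlines inside quoted strings should no longer hit that specific error. The trade-off is that the original line breaks inside the fields are lost.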