[
https://issues.apache.org/jira/browse/DRILL-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603113#comment-14603113
]
Chi Lang commented on DRILL-3393:
---------------------------------
I also came across this issue: a tsv file (with quotes) that ends with an
empty line fails to parse with
{quote}
Error: SYSTEM ERROR: com.univocity.parsers.common.TextParsingException: Error
processing input: Cannot use newline character within quoted string, line=2,
char=22. Content parsed: [ ]
Fragment 0:0
{quote}
Example file attached (fail.tsv)
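
For comparison, a minimal sketch of the expected behaviour, using Python's standard csv module rather than Drill's univocity-based reader. The sample string is a hypothetical stand-in for fail.tsv (the attachment's contents are not reproduced here), and parse_tsv is only an illustrative helper name. It assumes the file is tab-delimited, uses double quotes for quoting, and ends with one empty line.

```python
import csv
import io


def parse_tsv(text):
    """Parse tab-delimited text, treating double quotes as field quoting."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t", quotechar='"')
    # csv.reader yields an empty list for a blank line; dropping those rows
    # makes a trailing empty line harmless.
    return [row for row in reader if row]


# Hypothetical stand-in for fail.tsv: quoted fields, then an empty last line.
sample = 'foobar\tbar\n"aa"\t"bc"\n\n'
rows = parse_tsv(sample)
print(rows)
```

With this input the quotes are stripped and the trailing empty line produces no row, which is the result one would hope to get from Drill for the same file.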
> Quotes not being recognized in tab delimited (tsv) files
> --------------------------------------------------------
>
> Key: DRILL-3393
> URL: https://issues.apache.org/jira/browse/DRILL-3393
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Text & CSV
> Affects Versions: 1.0.0
> Reporter: Chi Lang
> Assignee: Steven Phillips
> Priority: Critical
> Fix For: 1.2.0
>
> Attachments: fail.tsv
>
>
> Drill doesn't seem to recognise quotes in tsv files, although it handles
> them correctly in csv files.
> For example, given the following files:
> test.tsv
> -------
> foobar bar
> "aa" "bc"
> -------
> test.csv
> ----------
> foobar,bar
> "aa","bc"
> ----------
> I get these results:
> 0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`test.csv`;
> +---------+---------+
> | EXPR$0  | EXPR$1  |
> +---------+---------+
> | foobar  | bar     |
> | aa      | bc      |
> +---------+---------+
> 2 rows selected (0.259 seconds)
> 0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`test.tsv`;
> +----------+---------+
> | EXPR$0   | EXPR$1  |
> +----------+---------+
> | foobar   | bar     |
> | aa" "bc  | null    |
> +----------+---------+
> 2 rows selected (0.122 seconds)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)