[ 
https://issues.apache.org/jira/browse/DRILL-3808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14877383#comment-14877383
 ] 

Aman Sinha commented on DRILL-3808:
-----------------------------------

OK, got it. I had modified the quote assignment in TextFormatConfig to test my 
example earlier, but I now realize that the quote character is configurable 
(I should have realized that sooner!). So no code change is needed. The 
following setting in the storage plugin configuration in the web UI works: 
{code}
     "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "quote": "\u0000",   <-- I set this to null and the update converted it (incidentally, setting it to "\0" doesn't work)
      "delimiter": "\t"
    },
{code}
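
One possible reason "\0" is rejected while "\u0000" is accepted: the JSON grammar has no "\0" escape, so a NUL character has to be written as "\u0000". The sketch below checks this with Python's json module, used here only as a stand-in; that Drill's own JSON parser rejects "\0" for the same reason is an assumption, and the helper name parses_as_json is hypothetical.

```python
import json


def parses_as_json(text: str) -> bool:
    """Return True if text is valid JSON according to Python's json module."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Raw strings keep the backslashes so the JSON parser sees the escapes itself.
unicode_escape = r'{"quote": "\u0000"}'  # NUL written as a \uXXXX escape
short_escape = r'{"quote": "\0"}'        # C-style escape, not part of JSON

# The \u0000 form decodes to a single NUL character.
quote = json.loads(unicode_escape)["quote"]

print(parses_as_json(unicode_escape))
print(parses_as_json(short_escape))
print(repr(quote))
```

With the quote character set to NUL, which should not occur in ordinary text data, a double quote in the input is no longer treated specially.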

> Let TextReader have the option to treat double quote as a literal
> -----------------------------------------------------------------
>
>                 Key: DRILL-3808
>                 URL: https://issues.apache.org/jira/browse/DRILL-3808
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>            Reporter: Sean Hsuan-Yi Chu
>            Assignee: Sean Hsuan-Yi Chu
>            Priority: Critical
>
> According to references [1] and [2]:
> In .csv, the double quote is a special character because it can optionally 
> enclose a text field. In .tsv, however, it is not a special character: it can 
> appear anywhere, and when it does, it should be treated as a literal. The tsv 
> format specification also does not allow tab or CR/LF characters to appear 
> anywhere in text fields. However, Drill treats tsv the same as csv.
> For example, given the data:
> {code}
> "test"\t"test"
> {code}
> For the query: select columns[0], columns[1] from `t.tsv`; Drill returns
> {code}
> test      test
> {code}
> However, according to reference [2], the result should be
> {code}
> "test"      "test"
> {code}
> Ideally, Drill should follow the standard; see [2].
> [1] CSV - https://tools.ietf.org/html/rfc4180
> [2] TSV - 
> http://www.iana.org/assignments/media-types/text/tab-separated-values
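
The difference described in the issue can be reproduced outside Drill. The following is a minimal sketch using Python's csv module as a stand-in for a text reader; it is not Drill's TextReader, and the helper name read_first_row is hypothetical. It parses the same tab-separated line once with the double quote as a quote character (csv-style) and once with quote handling disabled (tsv-style).

```python
import csv
import io

# The sample line from the issue: two quoted values separated by a tab.
data = '"test"\t"test"\n'


def read_first_row(text, **reader_options):
    """Parse tab-delimited text and return the first row as a list of fields."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", **reader_options)
    return next(reader)


# csv-style: the double quote encloses a field and is stripped from the value.
csv_style = read_first_row(data, quotechar='"')

# tsv-style: no quote processing, so the double quotes stay in the value.
tsv_style = read_first_row(data, quoting=csv.QUOTE_NONE)

print(csv_style)
print(tsv_style)
```

The first call drops the double quotes, which matches what Drill currently returns; the second keeps them as literals, which matches the behavior reference [2] calls for.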



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
