Thanks for the workaround. I didn't expect to have to set the quote character to a single quote to get it to handle double quotes :-)
When I tried to specify a double quote (literally, as well as escaped via \" or "") as "quote", the storage config UI didn't like that.

On 26 June 2015 at 19:16, Hao Zhu <[email protected]> wrote:

> I can reproduce the issue but I also have a workaround for it:
>
> *1. When storage plugin for "tsv" is default:*
>
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>
> > select columns[0],columns[1] from `test.tsv`;
> +----------+---------+
> |  EXPR$0  | EXPR$1  |
> +----------+---------+
> | foobar   | bar     |
> | aa" "bc  | null    |
> +----------+---------+
> 2 rows selected (0.114 seconds)
>
> *2. If we add "quote" property to use single quote:*
>
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "quote": "'",
>       "delimiter": "\t"
>     },
>
> Then it works fine:
>
> > select columns[0],columns[1] from `test.tsv`;
> +---------+---------+
> | EXPR$0  | EXPR$1  |
> +---------+---------+
> | foobar  | bar     |
> | "aa"    | "bc"    |
> +---------+---------+
> 2 rows selected (0.11 seconds)
>
> Thanks,
> Hao
>
> On Fri, Jun 26, 2015 at 11:03 AM, Kristine Hahn <[email protected]> wrote:
>
> > I think you might have a problem with your tsv file using spaces
> > instead of tabs.
> >
> > CSV file contents:
> > hello,1,2,3
> > hello,1,2,3
> > hello,1,2,3
> >
> > TSV file contents (actual tab character, not spaces):
> > hello 1 2 3
> > hello 1 2 3
> > hello 1 2 3
> >
> > 0: jdbc:drill:zk=local> select * from `/Users/khahn/Downloads/csv_test.csv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > +------------------------+
> > 3 rows selected (0.114 seconds)
> >
> > TSV using tabs
> >
> > 0: jdbc:drill:zk=local> select * from `/Users/khahn/Downloads/tsv_test.tsv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > +------------------------+
> > 3 rows selected (0.122 seconds)
> >
> > TSV using spaces
> >
> > 0: jdbc:drill:zk=local> select * from `/Users/khahn/Downloads/tsv_test.tsv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello 1 2 3"]        |
> > | ["hello 1 2 3"]        |
> > | ["hello 1 2 3"]        |
> > +------------------------+
> > 3 rows selected (0.117 seconds)
> >
> > Kristine Hahn
> > Sr. Technical Writer
> > 415-497-8107 @krishahn
> >
> > On Fri, Jun 26, 2015 at 10:02 AM, Kristine Hahn <[email protected]> wrote:
> >
> > > There are some attributes that were introduced in Drill 1.0 that are
> > > partly documented (sorry, no example):
> > >
> > > http://drill.apache.org/docs/plugin-configuration-basics/#list-of-attributes-and-definitions
> > > (see "formats" . . . "quote")
> > >
> > > http://drill.apache.org/docs/plugin-configuration-basics/#using-the-formats
> > >
> > > Kristine Hahn
> > > Sr. Technical Writer
> > > 415-497-8107 @krishahn
> > >
> > > On Fri, Jun 26, 2015 at 7:27 AM, Chi-Lang Ngo <[email protected]> wrote:
> > >>
> > >> Hi,
> > >>
> > >> I'm having a problem querying tab-delimited (tsv) files that contain
> > >> quotes.
> > >>
> > >> Drill doesn't seem to recognise quotes in tsv files, while it works
> > >> fine for csv files.
> > >> For example, given the following files
> > >>
> > >> test.tsv
> > >> -------
> > >> foobar bar
> > >> "aa" "bc"
> > >> -------
> > >>
> > >> test.csv
> > >> ----------
> > >> foobar,bar
> > >> "aa","bc"
> > >> ----------
> > >>
> > >> I get these results
> > >>
> > >> 0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`/test.csv`;
> > >> +---------+---------+
> > >> | EXPR$0  | EXPR$1  |
> > >> +---------+---------+
> > >> | foobar  | bar     |
> > >> | aa      | bc      |
> > >> +---------+---------+
> > >> 2 rows selected (0.259 seconds)
> > >>
> > >> 0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`/test.tsv`;
> > >> +----------+---------+
> > >> |  EXPR$0  | EXPR$1  |
> > >> +----------+---------+
> > >> | foobar   | bar     |
> > >> | aa" "bc  | null    |
> > >> +----------+---------+
> > >> 2 rows selected (0.122 seconds)
> > >>
> > >> Any ideas?
> > >> CL
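
For anyone who wants to see the same quote-character effect outside Drill, here is a minimal sketch using Python's standard csv module. It is not Drill's text reader and says nothing about how Drill parses internally; it only illustrates, under that assumption, why switching the quote character to a single quote leaves the double quotes in the values as ordinary characters while the line is still split on the tab. The sample rows mirror test.tsv from the original message, assuming a real tab between the fields.

```python
import csv
import io

# Contents of test.tsv from the original message, assuming real tab separators.
TSV_TEXT = 'foobar\tbar\n"aa"\t"bc"\n'


def parse_tsv(text, quotechar):
    """Parse tab-delimited text with Python's csv module and the given quote character."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar=quotechar)
    return [row for row in reader]


# Double quote as the quote character: the quotes are consumed by the parser.
rows_double = parse_tsv(TSV_TEXT, '"')

# Single quote as the quote character: double quotes are just data.
rows_single = parse_tsv(TSV_TEXT, "'")

print(rows_double)
print(rows_single)
```

With the single-quote setting, the second row comes back as two fields that still carry their double quotes, which has the same shape as the output of Hao's workaround above.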
