Thanks for the work-around.

I didn't expect that you have to set the quote character to a single quote
to get it to handle double quotes :-)

When I tried to specify a double quote as the "quote" value (literally, as
well as escaped via \" or ""), the storage config UI didn't accept it.
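
For what it's worth, here is a minimal sketch of what the workaround appears
to do. It uses Python's csv module purely as a stand-in (an assumption: this
is not Drill's text reader, and parse_tsv is a made-up helper name). Once the
quote character is a single quote, double quotes stop being special and come
back as ordinary data, which matches the second result in Hao's mail below.

```python
import csv
import io

# Same content as test.tsv from the original report; fields are separated
# by a real tab character.
TSV_DATA = 'foobar\tbar\n"aa"\t"bc"\n'


def parse_tsv(text, quotechar):
    """Parse tab-delimited text with the given quote character.

    Hypothetical helper for illustration only; it mimics the idea of the
    "quote" setting, not Drill's actual parser.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar=quotechar)
    return [row for row in reader]


# Double quote as the quote character: the quotes are consumed by the parser.
print(parse_tsv(TSV_DATA, '"'))

# Single quote as the quote character: double quotes are kept as literal data.
print(parse_tsv(TSV_DATA, "'"))
```

So with this setting the double quotes are not really "processed"; they are
just no longer treated as quote characters.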


On 26 June 2015 at 19:16, Hao Zhu <[email protected]> wrote:

> I can reproduce the issue but I also have a workaround for it:
> *1. When the storage plugin config for "tsv" is the default:*
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>
> > select columns[0],columns[1] from `test.tsv`;
> +----------+---------+
> |  EXPR$0  | EXPR$1  |
> +----------+---------+
> | foobar   | bar     |
> | aa" "bc  | null    |
> +----------+---------+
> 2 rows selected (0.114 seconds)
>
> *2. If we add the "quote" property and set it to a single quote:*
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "quote": "'",
>       "delimiter": "\t"
>     },
>
> Then it works fine:
> > select columns[0],columns[1] from `test.tsv`;
> +---------+---------+
> | EXPR$0  | EXPR$1  |
> +---------+---------+
> | foobar  | bar     |
> | "aa"    | "bc"    |
> +---------+---------+
> 2 rows selected (0.11 seconds)
>
>
>
> Thanks,
> Hao
>
>
> On Fri, Jun 26, 2015 at 11:03 AM, Kristine Hahn <[email protected]>
> wrote:
>
> > I think you might have a problem with your tsv file using spaces
> > instead of tabs.
> > CSV file contents:
> > hello,1,2,3
> > hello,1,2,3
> > hello,1,2,3
> >
> > TSV file contents (actual tab character, not spaces):
> > hello 1 2 3
> > hello 1 2 3
> > hello 1 2 3
> >
> > 0: jdbc:drill:zk=local> select * from
> > `/Users/khahn/Downloads/csv_test.csv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > +------------------------+
> > 3 rows selected (0.114 seconds)
> >
> > TSV using tabs
> > 0: jdbc:drill:zk=local> select * from
> > `/Users/khahn/Downloads/tsv_test.tsv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > | ["hello","1","2","3"]  |
> > +------------------------+
> > 3 rows selected (0.122 seconds)
> >
> > TSV using spaces
> >
> > 0: jdbc:drill:zk=local> select * from
> > `/Users/khahn/Downloads/tsv_test.tsv`;
> > +------------------------+
> > |        columns         |
> > +------------------------+
> > | ["hello   1   2   3"]  |
> > | ["hello   1   2   3"]  |
> > | ["hello   1   2   3"]  |
> > +------------------------+
> > 3 rows selected (0.117 seconds)
> > Kristine Hahn
> > Sr. Technical Writer
> > 415-497-8107 @krishahn
> >
> >
> >
> > On Fri, Jun 26, 2015 at 10:02 AM, Kristine Hahn <[email protected]>
> > wrote:
> > > There are some attributes that were introduced in Drill 1.0 that are
> > > partly documented (sorry, no example):
> > >
> > >
> > > http://drill.apache.org/docs/plugin-configuration-basics/#list-of-attributes-and-definitions
> > > (see "formats" . . . "quote")
> > >
> > > http://drill.apache.org/docs/plugin-configuration-basics/#using-the-formats
> > >
> > >
> > >
> > >
> > >
> > >
> > > Kristine Hahn
> > > Sr. Technical Writer
> > > 415-497-8107 @krishahn
> > >
> > >
> > > On Fri, Jun 26, 2015 at 7:27 AM, Chi-Lang Ngo <[email protected]>
> > > wrote:
> > >>
> > >> Hi,
> > >>
> > >> I'm having a problem querying tab-delimited (tsv) files that contain
> > >> quotes.
> > >>
> > >> Drill doesn't seem to recognise quotes in tsv files, while it works
> > >> fine for csv files.
> > >> For example, given the following files
> > >>
> > >> test.tsv
> > >> -------
> > >> foobar bar
> > >> "aa" "bc"
> > >> -------
> > >>
> > >> test.csv
> > >> ----------
> > >> foobar,bar
> > >> "aa","bc"
> > >> ----------
> > >>
> > >> I get these results
> > >>
> > >> 0: jdbc:drill:zk=local> select columns[0], columns[1] from
> > >> dfs.`/test.csv`;
> > >> +---------+---------+
> > >> | EXPR$0  | EXPR$1  |
> > >> +---------+---------+
> > >> | foobar  | bar     |
> > >> | aa      | bc      |
> > >> +---------+---------+
> > >> 2 rows selected (0.259 seconds)
> > >>
> > >> 0: jdbc:drill:zk=local> select columns[0], columns[1] from
> > >> dfs.`/test.tsv`;
> > >> +----------+---------+
> > >> |  EXPR$0  | EXPR$1  |
> > >> +----------+---------+
> > >> | foobar   | bar     |
> > >> | aa" "bc  | null    |
> > >> +----------+---------+
> > >> 2 rows selected (0.122 seconds)
> > >>
> > >> Any ideas?
> > >> CL
> > >
> > >
> >
>
