Re: Several questions...

2015-07-22 Thread Ted Dunning
Cool. On Wed, Jul 22, 2015 at 6:54 PM, Jacques Nadeau wrote: > I'm sorry I wasn't clearer. The fact that the error is incomprehensible > has already been fixed by Parth and will be part of 1.2 > > On Wed, Jul 22, 2015 at 6:42 PM, Ted Dunning > wrote: > > > On Wed, Jul 22, 2015 at 5:35 PM, Jacq

Re: Querying Apache Spark Generated Parquet

2015-07-22 Thread Neeraja Rentachintala
Hi, do you still see this issue? Can you share a sample Parquet file where you see the problem? On Thursday, July 16, 2015, Jacques Nadeau wrote: > Can you create a JIRA and post a small sample file that illustrates the > problem? > > On Thu, Jul 16, 2015 at 2:12 AM, Usman Ali > > wrote: > > > Hi, >

Re: Several questions...

2015-07-22 Thread Jacques Nadeau
I'm sorry I wasn't clearer. The fact that the error is incomprehensible has already been fixed by Parth and will be part of 1.2 On Wed, Jul 22, 2015 at 6:42 PM, Ted Dunning wrote: > On Wed, Jul 22, 2015 at 5:35 PM, Jacques Nadeau > wrote: > > > So this works: > > > > SELECT CONVERT_TO('[ [1, 2

Re: Several questions...

2015-07-22 Thread Ted Dunning
On Wed, Jul 22, 2015 at 5:35 PM, Jacques Nadeau wrote: > So this works: > > SELECT CONVERT_TO('[ [1, 2], [3, 4], [5]]' ,'UTF8') AS MYCOL1 FROM > sys.version; > +--+ > |MYCOL1| > +--+ > | [B@7e308c04 | > +--+ > OK. So the difference here is that I had

Re: Several questions...

2015-07-22 Thread Ted Dunning
On Wed, Jul 22, 2015 at 5:44 PM, Jacques Nadeau wrote: > Good point. It is because in one case the expression is evaluated after the > data is materialized (yours) and in the other the expression is evaluated at > the same time the data is materialized (mine). In the case that they are > evaluated sim

Re: Several questions...

2015-07-22 Thread Jacques Nadeau
Good point. It is because in one case the expression is evaluated after the data is materialized (yours) and in the other the expression is evaluated at the same time the data is materialized (mine). In the case that they are evaluated simultaneously, we're treating it as an INT until the data gets too big.

Re: Several questions...

2015-07-22 Thread Jacques Nadeau
It should return a VARBINARY value encoded in UTF8 that matches the binary encoding one would expect. But I just realized there is actually an error in what you wrote. The correct encoding name is UTF8, not UTF-8. A recent fix makes the error message much better here and will be included in 1.2. So th

Re: Several questions...

2015-07-22 Thread Ted Dunning
On Wed, Jul 22, 2015 at 4:51 PM, Jacques Nadeau wrote: > SELECT > STRING_BINARY(CONVERT_TO(1, 'INT')) as i, > STRING_BINARY(CONVERT_TO(1, 'INT_BE')) as i_be, > STRING_BINARY(CONVERT_TO(1, 'BIGINT')) as l, > STRING_BINARY(CONVERT_TO(1, 'BIGINT_BE')) as l_be, > STRING_BINARY(CONVERT_TO(1, 'I

Re: Several questions...

2015-07-22 Thread Ted Dunning
On Wed, Jul 22, 2015 at 4:51 PM, Jacques Nadeau wrote: > > SELECT CONVERT_TO('[ [1, 2], [3, 4], [5]]','UTF-8') AS MYCOL1 FROM > sys.version; > > File a bug. This should work. I would love to but I don't know what it should do. Note that the error messages in all of these cases were essentiall

Re: Several questions...

2015-07-22 Thread Jacques Nadeau
It is easier to understand using the BINARY_STRING and STRING_BINARY functions that Aditya so kindly added. In general, CONVERT_TO and CONVERT_FROM are converting to binary and from binary. The encoding defines the translation. SELECT CONVERT_FROM(BINARY_STRING('\x00\x00\x00\xC8'), 'INT_BE') as
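
To make the `\x00\x00\x00\xC8` notation in the message above concrete, here is a minimal Python sketch of the same round trip. The helpers `binary_string` and `string_binary` are hypothetical stand-ins named after the Drill functions; they are not Drill's implementation, and they assume the input uses only `\xNN` escapes plus plain ASCII characters.

```python
import re

def binary_string(text: str) -> bytes:
    """Hypothetical stand-in: turn '\\x00\\x00\\x00\\xC8'-style text into raw bytes."""
    out = bytearray()
    pos = 0
    for match in re.finditer(r"\\x([0-9A-Fa-f]{2})", text):
        # Copy any literal characters that precede this escape unchanged.
        out += text[pos:match.start()].encode("latin-1")
        out.append(int(match.group(1), 16))
        pos = match.end()
    out += text[pos:].encode("latin-1")
    return bytes(out)

def string_binary(data: bytes) -> str:
    """Hypothetical stand-in: render bytes as printable text with \\xNN escapes."""
    return "".join(
        chr(b) if 32 <= b < 127 and b != 0x5C else f"\\x{b:02X}"
        for b in data
    )

raw = binary_string(r"\x00\x00\x00\xC8")
# Reading the four bytes as a big-endian integer mirrors what 'INT_BE' describes.
value = int.from_bytes(raw, byteorder="big", signed=True)
print(value, string_binary(raw))
```

The point of the sketch is only that the escaped string and the four raw bytes are two views of the same value, and that the encoding name tells the reader how to interpret those bytes.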

Re: Several questions...

2015-07-22 Thread Ted Dunning
Jacques, I just spent an hour or more trying to read the docs on convert_from/to. I had no success. There are plenty of examples of converting to or from UTF-8, but none describing conversions to do with integers. In doing (lots of) experiments, I have failed to 1) create a constant of binary

Re: Inaccurate data representation when selecting from json sub structures and loss of data creating Parquet files from it

2015-07-22 Thread Stefán Baxter
In addition to this, selecting: select some, t.others, t.others.additional from dfs.tmp.`/test.json` as t; - returns this: "yes", {"additional":"last entries only"}, "last entries only" finding the previously missing value but then ignoring all the other values of the sub structure. - Stefan On

Re: Inaccurate data representation when selecting from json sub structures and loss of data creating Parquet files from it

2015-07-22 Thread Stefán Baxter
- never returns this: "yes", {"other":"true","all":" false","sometimes":"yes"} should have been: - never returns this: "yes", {"other":"true","all":" false","sometimes":"yes", "additional":"last entries only"} Regards, -Stefan On Wed, Jul 22, 2015 at 10:52 PM, Stefán Baxter wrote: > Hi, > >

Inaccurate data representation when selecting from json sub structures and loss of data creating Parquet files from it

2015-07-22 Thread Stefán Baxter
Hi, I keep coming across *quirks* in Drill that are quite time consuming to deal with and are now causing mounting concerns. This last one, though, is far more serious than the previous ones because it deals with loss of data. I'm working with a small(ish) dataset of around 1m records (which I'm m

Re: Several questions...

2015-07-22 Thread Jacques Nadeau
Let me clarify this a bit. If the data is encoded as text (UTF8), then cast is what you want to use. If the data is encoded in a binary representation (such as 4 byte little or big endian integer), then you want to use CONVERT_FROM. CONVERT_FROM is about converting from a binary representation to
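
The text-versus-binary distinction above can be sketched outside Drill. The following Python example is an illustration only: `cast_like` and `convert_from_like` are hypothetical helpers, not Drill APIs, and the mapping of `INT` to little-endian and `INT_BE` to big-endian is an assumption based on the description in this thread.

```python
import struct

# The four-byte cell value discussed in this thread, plus a text-encoded equivalent.
raw = b"\x00\x00\x00\xc8"
text = b"200"

def cast_like(data: bytes) -> int:
    """Hypothetical analogue of CAST(... AS INT): parse the bytes as decimal text."""
    return int(data.decode("latin-1"))

def convert_from_like(data: bytes, encoding: str) -> int:
    """Hypothetical analogue of CONVERT_FROM for 4-byte signed integers."""
    # Assumption: 'INT' is little-endian and 'INT_BE' is big-endian.
    formats = {"INT": "<i", "INT_BE": ">i"}
    return struct.unpack(formats[encoding], data)[0]

print(cast_like(text))                   # text digits parse fine
print(convert_from_like(raw, "INT_BE"))  # binary value read big-endian
print(convert_from_like(raw, "INT"))     # same bytes read little-endian

try:
    cast_like(raw)
except ValueError as err:
    # Parsing raw binary as text fails, much like a NumberFormatException.
    print("cast failed:", type(err).__name__)
```

Reading the same four bytes with the wrong byte order gives a very different number rather than an error, which is why the exact storage format of the column matters.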

Re: Several questions...

2015-07-22 Thread Ted Dunning
Alex, I am sure that there is a better answer in the large sense, but as a quick answer, I wrote a UDF that you can use (I think) to do this conversion. I haven't tested it yet, however, and would be interested if it just solves your issue before pushing forward with making it all nice. You can

Re: Error: Field References Must be Singular Names

2015-07-22 Thread Jinfeng Ni
If possible, could you please share some sample data for your error case? I tried to reproduce the problem by copying the tpch sample nation.parquet into two different directories (nation1, nation2), and the query joining the two directories seems to work fine. ls nation1 nation2 nation1: nation.parquet

Re: Several questions...

2015-07-22 Thread Alex Ott
Here is what hbase-shell returns for this field: hbase(main):004:0> get 'urls', 'AZ.OC.ICR', {COLUMN => 'u'} COLUMN CELL u:status timestamp=1411476725886, value=\x00\x00\x00\xC8 The database is populated via Java/Clojure (using the clojure-hbase-schemas) applications that

Re: Several questions...

2015-07-22 Thread Ted Dunning
Yes. Just right on that. Regarding the integer conversion, can you say what format your data is in? Is it exactly 4 bytes, big endian? On Wed, Jul 22, 2015 at 5:34 AM, Alex Ott wrote: > Ok, answering my first question - I need to put only the column name > into backquotes, instead o

Re: Set Drill Response Format to CSV Through Rest APIs

2015-07-22 Thread Usman Ali
Thank you so much for your kind reply. On Thu, Jul 16, 2015 at 9:24 PM, Sudheesh Katkam wrote: > Currently we support only JSON through REST API. > > Thank you, > Sudheesh > > > On Jul 15, 2015, at 9:26 PM, Usman Ali > wrote: > > > > Hi, > > Is there any way to set response format of drill

Error: Field References Must be Singular Names

2015-07-22 Thread Usman Ali
Hi, I am trying to perform a join on two directories which contain parquet files. My query reads: select * from hdfs.root.`parquet1` as t1 join hdfs.root.`parquet2` as t2 on t1.field1 = t2.field1; (the parquet1 and parquet2 directories contain parquet files) It gives an error saying Field

Re: Several questions...

2015-07-22 Thread Alex Ott
Ok, answering my first question - I need to put only the column name into backquotes, instead of quoting the complete coordinates: 0: jdbc:drill:zk=local> select CONVERT_FROM(row_key, 'UTF8') as key, CONVERT_FROM(urls.u.`raw-url`, 'UTF8') AS url FROM hbase.urls WHERE row_key = 'AZ.OC.ICR';

Re: Several questions...

2015-07-22 Thread Alex Ott
Hmmm, this is what I get when using CAST: 0: jdbc:drill:zk=local> select CONVERT_FROM(row_key, 'UTF8') as key, CAST(urls.u.status AS INT) AS status FROM hbase.urls WHERE row_key = 'AZ.OC.ICR'; java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: NumberFormatException: � In the documen

Re: Several questions...

2015-07-22 Thread Nathaniel Auvil
To convert data, use the CAST function, as in: Select CAST(hbase.urls as VARCHAR(64)) as url from ... On Wed, Jul 22, 2015 at 7:22 AM, Alex Ott wrote: > Hello > > I'm starting to play with Apache Drill & try to use it with HBase. > > I have the following questions: > - I have an HBase table, where some

Several questions...

2015-07-22 Thread Alex Ott
Hello, I'm starting to play with Apache Drill & trying to use it with HBase. I have the following questions: - I have an HBase table, where some columns have a minus sign ('-') in the name, like 'raw-url', etc. How can I query this table & do conversion of the corresponding columns? I tried to use singl