Thank you for pointing me to this section - somehow I missed it. How do you maintain this documentation? Maybe I have time to add more examples, so it will be easier for other people to start to work with HBase/Drill combo.
On Thu, Jul 23, 2015 at 3:38 PM, Kristine Hahn <[email protected]> wrote: > These data types are listed > > http://drill.apache.org/docs/supported-data-types/#convert_to-and-convert_from > , > but need to be easier to find and include useful examples as Ted pointed > out. Sorry you had a problem. We'll add links to the types from strategic > places. > > On Thursday, July 23, 2015, Alex Ott <[email protected]> wrote: > > > Thank you Jacques > > > > The INT_BE made the trick - now I'm getting status 200 instead of the > > negative number. The problem is that I haven't seen any mention of this > > type anywhere in the documentation - maybe the corresponding section of > the > > conversions should be expanded, because it refers only to standard types > > that mention only INT, LONG, etc,, without endian-specific variants > > > > Another point is ease to use - if you need to make relatively complex > query > > against HBase, your code is cluttered with all these CONVERT_FROMs - > maybe > > there could be a way to "pre-register" HBase schema with data types, > etc., > > and after that - use provided information. Otherwise, it's completely > > unusable, especially if the project targets analysts, not developers - > they > > all will be confused by conversions, big endian types, etc. > > > > On Thu, Jul 23, 2015 at 1:51 AM, Jacques Nadeau <[email protected] > > <javascript:;>> wrote: > > > > > It is easier to understand using the BINARY_STRING and STRING_BINARY > > > functions that Aditya so kindly added. In general, CONVERT_TO and > > > CONVERT_FROM are converting to binary and from binary. The encoding > > > defines the translation. > > > > > > SELECT CONVERT_FROM(BINARY_STRING('\x00\x00\x00\xC8'), 'INT_BE') as > cnvrt > > > from (VALUES (1)); > > > +--------+ > > > | cnvrt | > > > +--------+ > > > | 200 | > > > +--------+ > > > > > > You can read this expression as > > > 1. Start with string literal \x00\x00\x00\xC8 > > > 2. Decode that string literal into a VARBINARY with those four octets. > > > 3. Decode that VARBINARY understanding the data is encoded using big > > endian > > > four byte integer encoding. > > > 4. Return that value. > > > > > > So 000000C8 = 200 if we're dealing with a big endian integer. > > > > > > Other examples: > > > SELECT CONVERT_FROM(BINARY_STRING('\x17\x04\x00\x00'), 'INT') as cnvrt > > from > > > (VALUES (1)); > > > +--------+ > > > | cnvrt | > > > +--------+ > > > | 1047 | > > > +--------+ > > > > > > SELECT > > > STRING_BINARY(CONVERT_TO(1, 'INT')) as i, > > > STRING_BINARY(CONVERT_TO(1, 'INT_BE')) as i_be, > > > STRING_BINARY(CONVERT_TO(1, 'BIGINT')) as l, > > > STRING_BINARY(CONVERT_TO(1, 'BIGINT')) as l_be, > > > STRING_BINARY(CONVERT_TO(1, 'INT_HADOOPV')) as l_be > > > from (VALUES (1)); > > > > > > > > > -------------------+-------------------+-----------------------------------+-----------------------------------+--------+ > > > | i | i_be | l > > > | l_be | l_be0 | > > > > > > > > > +-------------------+-------------------+-----------------------------------+-----------------------------------+--------+ > > > | \x01\x00\x00\x00 | \x00\x00\x00\x01 | > > \x01\x00\x00\x00\x00\x00\x00\x00 > > > | \x01\x00\x00\x00\x00\x00\x00\x00 | \x01 | > > > > > > > > > +-------------------+-------------------+-----------------------------------+-----------------------------------+--------+ > > > > > > SELECT > > > STRING_BINARY(CONVERT_TO('hello', 'UTF8')) u8, > > > STRING_BINARY(CONVERT_TO('hello', 'UTF16')) u16 > > > from (VALUES (1)); > > > +--------+------------------------------------+ > > > | u8 | u16 | > > > +--------+------------------------------------+ > > > | hello | \xFE\xFF\x00h\x00e\x00l\x00l\x00o | > > > +--------+------------------------------------+ > > > > > > You can see a bunch of examples in the tests here [1]. > > > > > > [1] > > > > > > > > > https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/TestConvertFunctions.java > > > > > > Now specifically to your three examples: > > > > > > > SELECT CONVERT_TO('[ [1, 2], [3, 4], [5]]','UTF-8') AS MYCOL1 FROM > > > sys.version; > > > > > > File a bug. This should work. > > > > > > > select cast(x as BINARY(10)) foo from (values 1000)tbl(x); > > > > > > I'm pretty sure that SQL doesn't allow a cast from integer to > varbinary, > > > thus a correct failure message. > > > > > > >select convert_to(x, 'INT') from (values 1000) tbl(x); > > > > > > The problem here is that Drill treats all number literals as BIGINT. > > > You're trying to convert to a four byte encoding. This doesn't work > > since > > > your input type requires eight bytes of precision. You either need to > > cast > > > to lower precision or change your encoding to BIGINT or some other > large > > > precision encoding such as BIGINT_HADOOPV or BIGINT_BE. > > > > > > > > > > > -- > > With best wishes, Alex Ott > > http://alexott.net/ > > Twitter: alexott_en (English), alexott (Russian) > > Skype: alex.ott > > > > > -- > Kristine Hahn > Sr. Technical Writer > 415-497-8107 @krishahn > -- With best wishes, Alex Ott http://alexott.net/ Twitter: alexott_en (English), alexott (Russian) Skype: alex.ott
