What if the HBase primary key is a composite key composed of multiple types,
for example a string followed by a reverse timestamp (long) like
AMZN_9223370655563575807? Are there parameters to specify the length in the
function convert_from(string bytea, src_encoding name)?

On Thu, Dec 18, 2014 at 12:22 AM, Jacques Nadeau <[email protected]> wrote:

> String keys work but aren't the most performant or appropriate encoding to
> use in many cases. Drill provides CONVERT_TO and CONVERT_FROM with a large
> number of encodings (including those used by many Hadoop applications as
> well as the Apache Phoenix project). This improves performance of data use
> in HBase. You can use strings but you should use an encoding appropriate to
> your actual data. Drill will then do projection pushdown, filter pushdown
> and range pruning based on your query.
>
> On Wed, Dec 17, 2014 at 8:33 AM, Carol Bourgade <[email protected]> wrote:
>
> > The Impala documentation says that for best performance you should use
> > the string data type for HBase row keys. I know that you do not have to
> > define the data types for Drill queries, but do string bytes work better
> > for Drill queries on HBase row keys?
> >
> > http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_hbase.html
> >
> > For best performance of Impala queries against HBase tables, most queries
> > will perform comparisons in the WHERE against the column that corresponds
> > to the HBase row key. When creating the table through the Hive shell, use
> > the STRING data type for the column that corresponds to the HBase row key.
> > Impala can translate conditional tests (through operators such as =, <,
> > BETWEEN, and IN) against this column into fast lookups in HBase, but this
> > optimization ("predicate pushdown") only works when that column is defined
> > as STRING.
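
To make the composite-key question at the top of this thread concrete, here is a minimal Python sketch of how such a row key could be split into its two parts outside of Drill. It is not Drill syntax and says nothing about which parameters convert_from accepts. It assumes the reverse timestamp is Java's Long.MAX_VALUE minus epoch milliseconds, that the text form uses "_" as the separator, and that a binary form would store the long as the trailing 8 big-endian bytes. The function names are hypothetical.

```python
import struct
from datetime import datetime, timezone

LONG_MAX = 2**63 - 1  # Java Long.MAX_VALUE (assumed base of the reverse timestamp)


def split_text_key(row_key: bytes, sep: bytes = b"_"):
    """Split a key such as b"AMZN_9223370655563575807" (digits stored as text)."""
    prefix, _, reversed_ts = row_key.rpartition(sep)
    millis = LONG_MAX - int(reversed_ts.decode("ascii"))
    return prefix.decode("utf-8"), millis


def split_binary_key(row_key: bytes):
    """Split a key whose last 8 bytes are a big-endian signed long (assumed layout)."""
    prefix, raw = row_key[:-8], row_key[-8:]
    (reversed_ts,) = struct.unpack(">q", raw)
    return prefix.decode("utf-8"), LONG_MAX - reversed_ts


def to_utc(millis: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


symbol, millis = split_text_key(b"AMZN_9223370655563575807")
print(symbol, millis, to_utc(millis).isoformat())
```

The point of the sketch is that a fixed-width binary suffix can be cut off by position, while a text suffix needs a separator or a known length. Either way the key has to be split before each part is decoded with its own encoding.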
