Re: String for HBase row key

Carol McDonald Wed, 17 Dec 2014 08:39:00 -0800

Can Drill push down filtering to within HBase?

The Impala documentation says this about Impala:
"because that column is a STRING, Impala can let HBase perform that test,
indicated by the hbase filters: line in the EXPLAIN output. Doing the
filtering within HBase is more efficient than transmitting all the data to
Impala and doing the filtering on the Impala side."
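
To make the difference concrete, here is a minimal sketch in plain Python. It is not Drill, Impala, or HBase client code. ROWS, KEYS, full_scan, and range_scan are hypothetical names, and a sorted in-memory list stands in for an HBase table, which keeps rows ordered by row-key bytes. It only illustrates why a test on the row key is cheaper when the store evaluates it.

```python
import bisect

# Hypothetical stand-in for an HBase table: rows kept sorted by row-key bytes.
ROWS = [(f"user{n:04d}".encode("utf-8"), {"n": n}) for n in range(1000)]
ROWS.sort(key=lambda row: row[0])
KEYS = [key for key, _ in ROWS]


def full_scan(low, high):
    """No pushdown: every row reaches the query engine, which then filters."""
    examined = len(ROWS)
    hits = [key for key, _ in ROWS if low <= key < high]
    return hits, examined


def range_scan(low, high):
    """Pushdown: the predicate becomes start/stop row keys for the scan."""
    start = bisect.bisect_left(KEYS, low)
    stop = bisect.bisect_left(KEYS, high)
    return KEYS[start:stop], stop - start


if __name__ == "__main__":
    for scan in (full_scan, range_scan):
        hits, examined = scan(b"user0100", b"user0200")
        print(scan.__name__, "matched", len(hits), "examined", examined)
```

Both functions return the same rows, but the range scan only touches the rows between the two keys. This works because the string keys sort in the same order as the comparison being tested.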


On Wed, Dec 17, 2014 at 11:33 AM, Carol Bourgade <[email protected]>
wrote:

> The Impala documentation says to use the STRING data type for HBase row
> keys for best performance. I know that you do not have to define the data
> types for Drill queries, but do string bytes work better for Drill queries
> on HBase row keys?
>
>
> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_hbase.html
> For best performance of Impala queries against HBase tables, most queries
> will perform comparisons in the WHERE against the column that corresponds
> to the HBase row key. When creating the table through the Hive shell, use
> the STRING data type for the column that corresponds to the HBase row key.
> Impala can translate conditional tests (through operators such as =, <,
> BETWEEN, and IN) against this column into fast lookups in HBase, but this
> optimization ("predicate pushdown") only works when that column is defined
> as STRING.
>
