What about filter pushdown in these cases? I know that some filter ops push down through convert calls. What about through byte_substr?
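
To make the composite-key case concrete, here is a minimal Python sketch of the byte slicing that a byte_substr followed by a convert_from would have to perform. It is not Drill code. The layout is an assumption (a UTF-8 symbol, a one-byte underscore separator, then an 8-byte big-endian reverse timestamp computed as Long.MAX_VALUE minus epoch milliseconds), since the thread does not say how the long is encoded. The names `make_key` and `split_key` are hypothetical.

```python
import struct

# Java's Long.MAX_VALUE, the usual base for a "reverse timestamp".
LONG_MAX = 2**63 - 1


def make_key(symbol: str, ts_millis: int) -> bytes:
    """Build a composite row key: symbol + '_' + 8-byte big-endian reverse timestamp.

    Assumed layout, for illustration only.
    """
    reverse_ts = LONG_MAX - ts_millis
    return symbol.encode("utf-8") + b"_" + struct.pack(">q", reverse_ts)


def split_key(key: bytes, symbol_len: int):
    """Slice the key by byte offsets, then decode each part separately."""
    # First part: fixed-width string prefix.
    symbol = key[:symbol_len].decode("utf-8")
    # Second part: skip the separator byte, read the 8-byte long.
    start = symbol_len + 1
    (reverse_ts,) = struct.unpack(">q", key[start:start + 8])
    # Undo the reversal to recover the original timestamp in milliseconds.
    return symbol, LONG_MAX - reverse_ts


key = make_key("AMZN", 1381291200000)
symbol, ts_millis = split_key(key, 4)
```

The slice only works when the prefix width is known. A variable-length symbol would need the separator located first, which is also what makes pushing a filter down through such an expression harder to reason about.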

On Tue, Jan 20, 2015 at 12:39 PM, Jacques Nadeau <[email protected]> wrote:

> I believe there is byte_substr (or similar) which you could use before
> handing the value to convert_from
>
> On Tue, Jan 20, 2015 at 7:56 AM, Carol McDonald <[email protected]>
> wrote:
>
> > What if the HBase primary key is a composite key composed of multiple
> > types, for example a string followed by a reverse timestamp (long) like
> > AMZN_9223370655563575807?
> >
> > Are there parameters to specify the length in the function
> > convert_from(string bytea, src_encoding name)?
> >
> > On Thu, Dec 18, 2014 at 12:22 AM, Jacques Nadeau <[email protected]>
> > wrote:
> >
> > > String keys work but aren't the most performant or appropriate encoding
> > > to use in many cases. Drill provides CONVERT_TO and CONVERT_FROM with a
> > > large number of encodings (including those used by many Hadoop
> > > applications as well as the Apache Phoenix project). This improves
> > > performance of data use in HBase. You can use strings but you should
> > > use an encoding appropriate to your actual data. Drill will then do
> > > projection pushdown, filter pushdown and range pruning based on your
> > > query.
> > >
> > > On Wed, Dec 17, 2014 at 8:33 AM, Carol Bourgade <[email protected]>
> > > wrote:
> > > >
> > > > Impala documentation says for best performance use the string data
> > > > type for HBase row keys. I know that you do not have to define the
> > > > data types for Drill queries, but do string bytes work better for
> > > > Drill queries on HBase row keys?
> > > >
> > > > http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_hbase.html
> > > >
> > > > For best performance of Impala queries against HBase tables, most
> > > > queries will perform comparisons in the WHERE against the column that
> > > > corresponds to the HBase row key. When creating the table through the
> > > > Hive shell, use the STRING data type for the column that corresponds
> > > > to the HBase row key. Impala can translate conditional tests (through
> > > > operators such as =, <, BETWEEN, and IN) against this column into
> > > > fast lookups in HBase, but this optimization ("predicate pushdown")
> > > > only works when that column is defined as STRING.
