What about filter pushdown in these cases?  I know that some filter ops
push down through convert calls.  What about through byte_substr?
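
To make the question concrete, here is a rough Python sketch (not Drill code) of what "slice the bytes first, then decode each piece" amounts to for a composite key. It assumes the key is a UTF-8 symbol, an underscore, and then the reverse timestamp stored as an 8-byte big-endian long; the thread does not say how the long is actually stored, and make_key / split_key are hypothetical helper names used only for illustration.

```python
import struct
from typing import Tuple

LONG_MAX = 2**63 - 1  # same value as Java's Long.MAX_VALUE


def make_key(symbol: str, ts_millis: int) -> bytes:
    """Hypothetical layout: UTF-8 symbol + b'_' + 8-byte big-endian reverse timestamp."""
    reverse_ts = LONG_MAX - ts_millis
    return symbol.encode("utf-8") + b"_" + struct.pack(">q", reverse_ts)


def split_key(key: bytes) -> Tuple[str, int]:
    """Slice the key by byte position, then decode each part with its own encoding."""
    prefix = key[:-9]   # everything before the '_' separator
    tail = key[-8:]     # the last 8 bytes hold the long
    symbol = prefix.decode("utf-8")
    reverse_ts = struct.unpack(">q", tail)[0]
    return symbol, LONG_MAX - reverse_ts


key = make_key("AMZN", 1381291200000)
symbol, ts_millis = split_key(key)
```

With this assumed layout the symbol part has variable length, so the slice has to be taken relative to the end of the key (or the separator has to be located first). Whether a filter on the sliced-out part can still be pushed down is exactly the open question.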



On Tue, Jan 20, 2015 at 12:39 PM, Jacques Nadeau <[email protected]> wrote:

> I believe there is byte_substr (or similar), which you could use before
> handing the value to convert_from.
>
> On Tue, Jan 20, 2015 at 7:56 AM, Carol McDonald <[email protected]>
> wrote:
>
> > What if the HBase primary key is a composite key composed of multiple
> > types, for example a string followed by a reverse timestamp (long),
> > like AMZN_9223370655563575807?
> >
> > Are there parameters to specify the length in the function
> > convert_from(string bytea, src_encoding name)?
> >
> >
> >
> > On Thu, Dec 18, 2014 at 12:22 AM, Jacques Nadeau <[email protected]>
> > wrote:
> >
> > > String keys work but aren't the most performant or appropriate
> > > encoding to use in many cases.  Drill provides CONVERT_TO and
> > > CONVERT_FROM with a large number of encodings (including those used
> > > by many Hadoop applications as well as the Apache Phoenix project).
> > > This improves performance of data use in HBase.  You can use strings
> > > but you should use an encoding appropriate to your actual data.
> > > Drill will then do projection pushdown, filter pushdown and range
> > > pruning based on your query.
> > >
> > > On Wed, Dec 17, 2014 at 8:33 AM, Carol Bourgade <[email protected]>
> > > wrote:
> > > >
> > > > Impala documentation says for best performance use the string data
> > > > type for HBase row keys.  I know that you do not have to define the
> > > > data types for Drill queries, but do string bytes work better for
> > > > Drill queries on HBase row keys?
> > > >
> > > > http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/latest/topics/impala_hbase.html
> > > > For best performance of Impala queries against HBase tables, most
> > > > queries will perform comparisons in the WHERE against the column
> > > > that corresponds to the HBase row key. When creating the table
> > > > through the Hive shell, use the STRING data type for the column
> > > > that corresponds to the HBase row key. Impala can translate
> > > > conditional tests (through operators such as =, <, BETWEEN, and IN)
> > > > against this column into fast lookups in HBase, but this
> > > > optimization ("predicate pushdown") only works when that column is
> > > > defined as STRING.
> > > >
> > >
> >
>
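
On the earlier point in the thread about using an encoding appropriate to the data: a small Python sketch (illustrative only, not Drill or HBase code) of why that matters for range scans. HBase orders row keys bytewise, so a range on the key is only useful if byte order matches value order. The sample values are made up, and the comparison only holds for non-negative numbers.

```python
import struct

values = [5, 40, 300, 2000]  # made-up sample values, all non-negative

# Encode each value two ways, then sort the raw bytes as a bytewise-ordered store would.
as_text = sorted(str(v).encode("utf-8") for v in values)
as_big_endian = sorted(struct.pack(">q", v) for v in values)

# Decode back to integers to see the resulting order.
text_order = [int(b.decode("utf-8")) for b in as_text]
binary_order = [struct.unpack(">q", b)[0] for b in as_big_endian]

print(text_order)    # [2000, 300, 40, 5]
print(binary_order)  # [5, 40, 300, 2000]
```

Unpadded decimal strings sort by their leading characters, so a numeric range does not map to one contiguous key range. Fixed-width big-endian bytes keep numeric order for non-negative values.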
