Can you post the plan including the HBaseScan part?  I would like to check
whether the filter condition is pushed into HBaseScan or not.

As I understand it, for HBaseScan, when part of the filter is pushed into
the Scan operator, that part currently also remains in the Filter operator,
even though the filter in the Scan should already have pruned the rows that
do not satisfy the filter condition.
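
A rough way to picture this, as a toy model in Python rather than actual
Drill code: the scan applies the pushed-down condition to every row, and the
Filter operator then applies the same condition again to the rows that
survive. The row shape, the `scan` and `filter_op` helpers, and the
`CountingPredicate` wrapper are all made up for illustration.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, str]  # hypothetical row shape: (year, dx1)


class CountingPredicate:
    """Wraps a condition and counts how many times it is evaluated."""

    def __init__(self, fn: Callable[[Row], bool]) -> None:
        self.fn = fn
        self.calls = 0

    def __call__(self, row: Row) -> bool:
        self.calls += 1
        return self.fn(row)


def scan(rows: List[Row], pushed: Callable[[Row], bool]) -> List[Row]:
    # Stand-in for a scan with a pushed-down filter: prunes rows at the source.
    return [r for r in rows if pushed(r)]


def filter_op(rows: List[Row], condition: Callable[[Row], bool]) -> List[Row]:
    # Stand-in for the Filter operator that still carries the same condition.
    return [r for r in rows if condition(r)]


rows = [(2008, "4280"), (2010, "4280"), (2011, "99999"), (2014, "4281")]
condition = CountingPredicate(
    lambda r: 2009 <= r[0] <= 2013 and r[1] in {"4280", "4281"}
)

scanned = scan(rows, condition)          # condition evaluated on every row
scan_calls = condition.calls
result = filter_op(scanned, condition)   # evaluated again on the survivors
filter_calls = condition.calls - scan_calls
```

In this sketch the Filter removes nothing further, so its evaluations are
redundant work on rows the scan has already qualified.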





On Mon, Aug 24, 2015 at 2:10 PM, Aman Sinha <[email protected]> wrote:

> Indeed, it is not efficient. We are doing 16 invocations of
> CONVERT_FROMUTF8($1) and 16 invocations of CONVERT_FROMUTF8($2).
> Can you please file a JIRA?  Ideally, we should be doing projection
> pushdown in conjunction with the filter pushdown into the HBase scan and
> computing these functions only once.
>
> Aman
>
>
> On Mon, Aug 24, 2015 at 1:34 PM, Sungwook Yoon <[email protected]> wrote:
>
> > Hi,
> >
> > I have a query, doing something like
> >
> > a in (v1, v2, v3, .... v15)
> >
> > The physical query plan looks like the following.
> >
> > Filter(condition=[AND(>=(CAST($0):INTEGER, 2009), <=(CAST($0):INTEGER,
> > 2013), OR(=(CONVERT_FROMUTF8($1), '39891'), =(CONVERT_FROMUTF8($1),
> > '4280'), =(CONVERT_FROMUTF8($1), '4281'), =(CONVERT_FROMUTF8($1), '42820'),
> > =(CONVERT_FROMUTF8($1), '42821'), =(CONVERT_FROMUTF8($1), '42822'),
> > =(CONVERT_FROMUTF8($1), '42823'), =(CONVERT_FROMUTF8($1), '42830'),
> > =(CONVERT_FROMUTF8($1), '42831'), =(CONVERT_FROMUTF8($1), '42832'),
> > =(CONVERT_FROMUTF8($1), '42833'), =(CONVERT_FROMUTF8($1), '42840'),
> > =(CONVERT_FROMUTF8($1), '42841'), =(CONVERT_FROMUTF8($1), '42842'),
> > =(CONVERT_FROMUTF8($1), '42843'), =(CONVERT_FROMUTF8($1), '4289'),
> > =(CONVERT_FROMUTF8($2), '39891'), =(CONVERT_FROMUTF8($2), '4280'),
> > =(CONVERT_FROMUTF8($2), '4281'), =(CONVERT_FROMUTF8($2), '42820'),
> > =(CONVERT_FROMUTF8($2), '42821'), =(CONVERT_FROMUTF8($2), '42822'),
> > =(CONVERT_FROMUTF8($2), '42823'), =(CONVERT_FROMUTF8($2), '42830'),
> > =(CONVERT_FROMUTF8($2), '42831'), =(CONVERT_FROMUTF8($2), '42832'),
> > =(CONVERT_FROMUTF8($2), '42833'), =(CONVERT_FROMUTF8($2), '42840'),
> > =(CONVERT_FROMUTF8($2), '42841'), =(CONVERT_FROMUTF8($2), '42842'),
> > =(CONVERT_FROMUTF8($2), '42843'), =(CONVERT_FROMUTF8($2), '4289')))]) :
> > rowType = RecordType(ANY year, ANY DX1, ANY DX2): rowcount =
> > 3.300738791875E8, cumulative cost = {1.0562364134E10 rows,
> > 5.413211618675E10 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 7136
> >
> >
> > In this plan, does Drill convert the same column to a string once for
> > each value it is compared against?
> >
> > From the performance, it looks like it is doing that ...
> >
> > Sungwook
> >
>
