Re: query plan ....

Jinfeng Ni Tue, 25 Aug 2015 11:30:40 -0700

Hi Sungwook,

Can you upload the profile file to a public place? The user mailing list
does not allow to attach files.




On Tue, Aug 25, 2015 at 10:58 AM, Sungwook Yoon <[email protected]> wrote:

>
> Jacques,
>
> Here is the physical profile for this query,
> I attached the 1000 lines of the tail of the drillbit.log from the foreman.
> Also, attached is the picture of the filter minorfragment information.
>
> We have plenty of memory and Drill is not using them all.
>
> Let me know your thoughts.
>
> Thanks,
>
> Sungwook
>
>
>
> 00-00    Screen : rowType = RecordType(BIGINT EXPR$0): rowcount = 1.0,
> cumulative cost = {1.452325068635E10 rows, 7.525684447485E10 cpu, 0.0 io,
> 4096.0 network, 0.0 memory}, id = 228
> 00-01      StreamAgg(group=[{}], EXPR$0=[$SUM0($0)]) : rowType =
> RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost =
> {1.452325068625E10 rows, 7.525684447475E10 cpu, 0.0 io, 4096.0 network, 0.0
> memory}, id = 227
> 00-02        UnionExchange : rowType = RecordType(BIGINT EXPR$0): rowcount
> = 1.0, cumulative cost = {1.452325068525E10 rows, 7.525684446275E10 cpu,
> 0.0 io, 4096.0 network, 0.0 memory}, id = 226
> 01-01          StreamAgg(group=[{}], EXPR$0=[COUNT()]) : rowType =
> RecordType(BIGINT EXPR$0): rowcount = 1.0, cumulative cost =
> {1.452325068425E10 rows, 7.525684445475E10 cpu, 0.0 io, 0.0 network, 0.0
> memory}, id = 225
> 01-02            Project($f0=[1]) : rowType = RecordType(INTEGER $f0):
> rowcount = 1.32029551675E9, cumulative cost = {1.32029551675E10 rows,
> 5.941329825375E10 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 224
> 01-03              SelectionVectorRemover : rowType = RecordType(ANY DX1,
> ANY DX2, ANY DX3, ANY DX4, ANY DX5): rowcount = 1.32029551675E9, cumulative
> cost = {1.188265965075E10 rows, 5.413211618675E10 cpu, 0.0 io, 0.0 network,
> 0.0 memory}, id = 223
> 01-04                Filter(condition=[OR(=($0, '39891'), =($0, '4280'),
> =($0, '4281'), =($0, '42820'), =($0, '42821'), =($0, '42822'), =($0,
> '42823'), =($0, '42830'), =($0, '42831'), =($0, '42832'), =($0, '42833'),
> =($0, '42840'), =($0, '42841'), =($0, '42842'), =($0, '42843'), =($0,
> '4289'), =($1, '39891'), =($1, '4280'), =($1, '4281'), =($1, '42820'),
> =($1, '42821'), =($1, '42822'), =($1, '42823'), =($1, '42830'), =($1,
> '42831'), =($1, '42832'), =($1, '42833'), =($1, '42840'), =($1, '42841'),
> =($1, '42842'), =($1, '42843'), =($1, '4289'), =($2, '39891'), =($2,
> '4280'), =($2, '4281'), =($2, '42820'), =($2, '42821'), =($2, '42822'),
> =($2, '42823'), =($2, '42830'), =($2, '42831'), =($2, '42832'), =($2,
> '42833'), =($2, '42840'), =($2, '42841'), =($2, '42842'), =($2, '42843'),
> =($2, '4289'), =($3, '39891'), =($3, '4280'), =($3, '4281'), =($3,
> '42820'), =($3, '42821'), =($3, '42822'), =($3, '42823'), =($3, '42830'),
> =($3, '42831'), =($3, '42832'), =($3, '42833'), =($3, '42840'), =($3,
> '42841'), =($3, '42842'), =($3, '42843'), =($3, '4289'), =($4, '39891'),
> =($4, '4280'), =($4, '4281'), =($4, '42820'), =($4, '42821'), =($4,
> '42822'), =($4, '42823'), =($4, '42830'), =($4, '42831'), =($4, '42832'),
> =($4, '42833'), =($4, '42840'), =($4, '42841'), =($4, '42842'), =($4,
> '42843'), =($4, '4289'))]) : rowType = RecordType(ANY DX1, ANY DX2, ANY
> DX3, ANY DX4, ANY DX5): rowcount = 1.32029551675E9, cumulative cost =
> {1.0562364134E10 rows, 5.281182067E10 cpu, 0.0 io, 0.0 network, 0.0
> memory}, id = 222
> 01-05                  Scan(groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000114_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000570_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000117_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000792_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000357_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000545_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000452_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000189_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000709_0],
> ReadEntryWithPath
> [path=maprfs:/user/hive/warehouse/mktscan_o/year=2009/000761_0], Re
>
>
> On Tue, Aug 25, 2015 at 10:50 AM, Jacques Nadeau <[email protected]>
> wrote:
>
>> I still think a profile analysis would be good.  Can you post the
>> following
>> on gist:
>>
>> - profile for this query
>> - source code for the slow filter operation (turn on debug logging to get
>> this)
>>
>> thanks,
>> Jacques
>>
>> --
>> Jacques Nadeau
>> CTO and Co-Founder, Dremio
>>
>> On Tue, Aug 25, 2015 at 8:16 AM, Sungwook Yoon <[email protected]>
>> wrote:
>>
>> > Understood,..
>> > I can see Jacques and Jinfeng's point of view.
>> >
>> > I would not bother think about this query .... but some claim that this
>> > query ran orders of magnitude faster on some other sql engine...
>> >
>> > I don't know how.. they got that performance from this query..
>> >
>> > Sungwook
>> >
>> >
>> >
>> > On Tue, Aug 25, 2015 at 10:04 AM, Jacques Nadeau <[email protected]>
>> > wrote:
>> >
>> > > CONVERT_FROMUTF8 is a trivial operation.  I would be surprised if that
>> > was
>> > > the bottleneck.  See here [1] for the implementation.  I'm guessing
>> that
>> > > the problem is simply that the best alternative on this structure
>> would
>> > be
>> > > a non-equality join.  The way I look at this query is you have data
>> set
>> > > like the following:
>> > >
>> > > dx1  dx2  dx3  dx4  dx5
>> > > #    #    #    #    #
>> > > #    #    #    #    #
>> > > #    #    #    #    #
>> > > #    #    #    #    #
>> > > #    #    #    #    #
>> > >
>> > > And you are doing a join:
>> > >
>> > > select * from baseTable t
>> > > join dxTable dx on (dx1 = t.c1 OR dx2 = t.c2 OR dx3 = t.c3 OR dx4 =
>> t.c4
>> > OR
>> > > dx5 = t.c5)
>> > >
>> > > That isn't a pretty algorithm in general.  I'm guessing that the best
>> > > opportunity is to convert the data to something where you can start
>> > taking
>> > > advantage of hashing.  For example, put all the values in a single
>> list
>> > > with each base table row key.  For each value, prefix by column id.
>> Then
>> > > do a join on the new compound key and then get to distinct rows per
>> base
>> > > table row key.  It isn't pretty but it would be much better than the
>> > > current approach.
>> > >
>> > >
>> > > [1]
>> > >
>> > >
>> >
>> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/UTF8ConvertFrom.java
>> > >
>> > >
>> > > --
>> > > Jacques Nadeau
>> > > CTO and Co-Founder, Dremio
>> > >
>> > > On Tue, Aug 25, 2015 at 7:48 AM, Jinfeng Ni <[email protected]>
>> > wrote:
>> > >
>> > > > I looked at the two physical plans you posted. Seems they are
>> > different :
>> > > > the first one has multiple CONVERT_FROMUTF8() call, while the second
>> > does
>> > > > not have.
>> > > >
>> > > > 1.
>> > > >
>> > > > Filter(condition=[AND(>=(CAST($0):INTEGER, 2009),
>> <=(CAST($0):INTEGER,
>> > > > 2013), OR(=(CONVERT_FROMUTF8($1), '39891'), =(CONVERT_FROMUTF8($1),
>> > > > '4280'), =(CONVERT_FROMUTF8($1), '4281'), =(CONVERT_FROMUTF8($1),
>> > > '42820'),
>> > > > =(CONVERT_FROMUTF8($1), '42821'), =(CONVERT_FROMUTF8($1), '42822'),
>> > > > =(CONVERT_FROMUTF8($1), '42823'), =(CONVERT_FROMUTF8($1), '42830'),
>> > > > =(CONVERT_FROMUTF8($1), '42831'), =(CONVERT_FROMUTF8($1), '42832'),
>> > > > =(CONVERT_FROMUTF8($1), '42833'), =(CONVERT_FROMUTF8($1), '42840'),
>> > > > =(CONVERT_FROMUTF8($1), '42841'), =(CONVERT_FROMUTF8($1), '42842'),
>> > > > =(CONVERT_FROMUTF8($1), '42843'), =(CONVERT_FROMUTF8($1), '4289'),
>> > > > =(CONVERT_FROMUTF8($2), '39891'), =(CONVERT_FROMUTF8($2), '4280'),
>> > > > =(CONVERT_FROMUTF8($2), '4281'), =(CONVERT_FROMUTF8($2), '42820'),
>> > > > =(CONVERT_FROMUTF8($2), '42821'), =(CONVERT_FROMUTF8($2), '42822'),
>> > > > =(CONVERT_FROMUTF8($2), '42823'), =(CONVERT_FROMUTF8($2), '42830'),
>> > > > =(CONVERT_FROMUTF8($2), '42831'), =(CONVERT_FROMUTF8($2), '42832'),
>> > > > =(CONVERT_FROMUTF8($2), '42833'), =(CONVERT_FROMUTF8($2), '42840'),
>> > > > =(CONVERT_FROMUTF8($2), '42841'), =(CONVERT_FROMUTF8($2), '42842'),
>> > > > =(CONVERT_FROMUTF8($2), '42843'), =(CONVERT_FROMUTF8($2),
>> '4289')))]) :
>> > > >
>> > > > 2.
>> > > > Filter(condition=[AND(>=($0, 2009), <=($0, 2013),
>> > > > OR(=($1, '39891'), =($1, '4280'), =($1, '4281'), =($1, '42820'),
>> =($1,
>> > > > '42821'), =($1, '42822'), =($1, '42823'), =($1, '42830'), =($1,
>> > '42831'),
>> > > > =($1, '42832'), =($1, '42833'), =($1, '42840'), =($1, '42841'),
>> =($1,
>> > > > '42842'), =($1, '42843'), =($1, '4289'), =($2, '39891'), =($2,
>> '4280'),
>> > > > =($2, '4281'), =($2, '42820'), =($2, '42821'), =($2, '42822'), =($2,
>> > > > '42823'), =($2, '42830'), =($2, '42831'), =($2, '42832'), =($2,
>> > '42833'),
>> > > > =($2, '42840'), =($2, '42841'), =($2, '42842'), =($2, '42843'),
>> =($2,
>> > > > '4289'), =($3, '39891'), =($3, '4280'), =($3, '4281'), =($3,
>> '42820'),
>> > > > =($3, '42821'), =($3, '42822'), =($3, '42823'), =($3, '42830'),
>> =($3,
>> > > > '42831'), =($3, '42832'), =($3, '42833'), =($3, '42840'), =($3,
>> > '42841'),
>> > > > =($3, '42842'), =($3, '42843'), =($3, '4289'), =($4, '39891'), =($4,
>> > > > '4280'), =($4, '4281'), =($4, '42820'), =($4, '42821'), =($4,
>> '42822'),
>> > > > =($4, '42823'), =($4, '42830'), =($4, '42831'), =($4, '42832'),
>> =($4,
>> > > > '42833'), =($4, '42840'), =($4, '42841'), =($4, '42842'), =($4,
>> > '42843'),
>> > > > =($4, '4289'), =($5, '39891'), =($5, '4280'), =($5, '4281'), =($5,
>> > > > '42820'), =($5, '42821'), =($5, '42822'), =($5, '42823'), =($5,
>> > '42830'),
>> > > > =($5, '42831'), =($5, '42832'), =($5, '42833'), =($5, '42840'),
>> =($5,
>> > > > '42841'), =($5, '42842'), =($5, '42843'), =($5, '4289')))])
>> > > >
>> > > > Can you please post the query PLUS the corresponding plan together?
>> > > >
>> > > > For the first plan, there are room for improvement, since Drill
>> should
>> > > push
>> > > > the common expression : CONVERT_FROMUTF8($2) down into a project,
>> and
>> > > refer
>> > > > the result of expression directly in Filter, which will avoid the
>> > > repeated
>> > > > evaluation in Filter.
>> > > >
>> > > > For the second plan, I do not see room for improvement.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Tue, Aug 25, 2015 at 2:38 AM, Sungwook Yoon <[email protected]>
>> > > wrote:
>> > > >
>> > > > > Aman,
>> > > > >
>> > > > > Sorry for slow reply.
>> > > > >
>> > > > > Here is the SQL query.
>> > > > >
>> > > > > SELECT count(1) FROM `o` a WHERE (a.DX1 IN ('39891', '4280',
>> '4281',
>> > > > > '42820', '42821', '42822', '42823', '42830', '42831', '42832',
>> > '42833',
>> > > > > '42840', '42841', '42842', '42843', '4289') OR a.DX2 IN ('39891',
>> > > '4280',
>> > > > > '4281', '42820', '42821', '42822', '42823', '42830', '42831',
>> > '42832',
>> > > > > '42833', '42840', '42841', '42842', '42843', '4289') OR a.DX3 IN
>> > > > ('39891',
>> > > > > '4280', '4281', '42820', '42821', '42822', '42823', '42830',
>> '42831',
>> > > > > '42832', '42833', '42840', '42841', '42842', '42843', '4289') OR
>> > a.DX4
>> > > IN
>> > > > > ('39891', '4280', '4281', '42820', '42821', '42822', '42823',
>> > '42830',
>> > > > > '42831', '42832', '42833', '42840', '42841', '42842', '42843',
>> > '4289')
>> > > OR
>> > > > > a.DX5 IN ('39891', '4280', '4281', '42820', '42821', '42822',
>> > '42823',
>> > > > > '42830', '42831', '42832', '42833', '42840', '42841', '42842',
>> > '42843',
>> > > > > '4289')) AND a.`year` BETWEEN 2009 AND 2013;
>> > > > >
>> > > > > For the view definition, I tried to convert_from to UTF and cast
>> as
>> > > > > Varchar.
>> > > > > For integer type, I casted.
>> > > > > All resulted in similar slow performance.
>> > > > >
>> > > > > Let me know if you have any insights.
>> > > > >
>> > > > > Thanks,
>> > > > >
>> > > > > Sungwook
>> > > > >
>> > > > >
>> > > > > On Mon, Aug 24, 2015 at 7:20 PM, Aman Sinha <[email protected]
>> >
>> > > > wrote:
>> > > > >
>> > > > > > I was about to say that for IN lists of size 20 or more, Drill
>> > uses a
>> > > > > more
>> > > > > > efficient Values operator instead of OR conditions but then
>> > realized
>> > > > the
>> > > > > OR
>> > > > > > filter is referencing 4 different columns : $1..$4 and each of
>> > those
>> > > > > > individual lists is less than 20.  Sungwook,  can you please
>> > provide
>> > > > the
>> > > > > > SQL query and any view definitions or anything that goes with
>> it ?
>> > > It
>> > > > is
>> > > > > > difficult to figure out things without the full picture.
>> > > > > > thanks,
>> > > > > > Aman
>> > > > > >
>> > > > > > On Mon, Aug 24, 2015 at 5:10 PM, Ted Dunning <
>> > [email protected]>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > On Mon, Aug 24, 2015 at 4:50 PM, Sungwook Yoon <
>> > [email protected]
>> > > >
>> > > > > > wrote:
>> > > > > > >
>> > > > > > > > Still, the performance drop down due to OR filtering is just
>> > > > > > > astounding...
>> > > > > > > >
>> > > > > > >
>> > > > > > > That is what query optimizers are for and why getting them to
>> > work
>> > > > well
>> > > > > > is
>> > > > > > > important.
>> > > > > > >
>> > > > > > > The difference in performance that you are observing is not
>> > > > surprising
>> > > > > > > given the redundant work that you are seeing. Using the OR
>> > operator
>> > > > > > > prevents any significant short-circuiting and the repeated
>> > > conversion
>> > > > > > > operations that are happening make the evaluation much more
>> > > expensive
>> > > > > > than
>> > > > > > > it would otherwise be (a dozen extra copies where only one is
>> > > > needed).
>> > > > > > >
>> > > > > > > Other queries that can be subject to similar problems include
>> > > common
>> > > > > > table
>> > > > > > > expressions that read the same (large) input file many
>> times.  So
>> > > > far,
>> > > > > > > Drill doesn't optimize all such expressions well.
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: query plan ....

Reply via email to