did you increase the memory setting for Drill from the default?

https://drill.apache.org/docs/configuring-drill-memory/ 
<https://drill.apache.org/docs/configuring-drill-memory/>


> On 10 May 2016, at 02:25, [email protected] wrote:
> 
> 
> hi:
> i run join operation in the drill, i use broadcast and put the small table in 
> the right. The small table has 32000000 rows. I have set the 
> planner.broadcast_threshold to 100000000. The cluster has three nodes and 
> every node has 64G memory. when join is running, the memory is increasing 
> untill the driilbit process exit. But the same query is run successful in the 
> impala and they are in the same cluster.
> here is the query plan.
> 
> 00-00    Screen : rowType = RecordType(VARCHAR(65535) sourceIP, DOUBLE 
> totalRevenue, ANY avgPageRank): rowcount = 1.0, cumulative cost = 
> {5.294005617599999E8 rows, 6.356723058846001E10 cpu, 0.0 io, 
> 1.4803953770495996E11 network, 9.1066982688E8 memory}, id = 5015
> 00-01      Project(sourceIP=[$0], totalRevenue=[$1], avgPageRank=[$2]) : 
> rowType = RecordType(VARCHAR(65535) sourceIP, DOUBLE totalRevenue, ANY 
> avgPageRank): rowcount = 1.0, cumulative cost = {5.2940056165999985E8 rows, 
> 6.356723058836001E10 cpu, 0.0 io, 1.4803953770495996E11 network, 
> 9.1066982688E8 memory}, id = 5014
> 00-02        SelectionVectorRemover : rowType = RecordType(VARCHAR(65535) 
> sourceIP, DOUBLE totalRevenue, ANY avgPageRank): rowcount = 1.0, cumulative 
> cost = {5.2940056165999985E8 rows, 6.356723058836001E10 cpu, 0.0 io, 
> 1.4803953770495996E11 network, 9.1066982688E8 memory}, id = 5013
> 00-03          Limit(fetch=[1]) : rowType = RecordType(VARCHAR(65535) 
> sourceIP, DOUBLE totalRevenue, ANY avgPageRank): rowcount = 1.0, cumulative 
> cost = {5.2940056065999985E8 rows, 6.356723058736001E10 cpu, 0.0 io, 
> 1.4803953770495996E11 network, 9.1066982688E8 memory}, id = 5012
> 00-04            SingleMergeExchange(sort0=[1 DESC]) : rowType = 
> RecordType(VARCHAR(65535) sourceIP, DOUBLE totalRevenue, ANY avgPageRank): 
> rowcount = 457983.77999999997, cumulative cost = {5.2940055965999985E8 rows, 
> 6.356723058336001E10 cpu, 0.0 io, 1.4803953770495996E11 network, 
> 9.1066982688E8 memory}, id = 5011
> 01-01              SelectionVectorRemover : rowType = 
> RecordType(VARCHAR(65535) sourceIP, DOUBLE totalRevenue, ANY avgPageRank): 
> rowcount = 457983.77999999997, cumulative cost = {5.289425758799999E8 rows, 
> 6.356173477800001E10 cpu, 0.0 io, 1.4241183301631998E11 network, 
> 9.1066982688E8 memory}, id = 5010
> 01-02                TopN(limit=[1]) : rowType = RecordType(VARCHAR(65535) 
> sourceIP, DOUBLE totalRevenue, ANY avgPageRank): rowcount = 
> 457983.77999999997, cumulative cost = {5.284845920999999E8 rows, 
> 6.356127679422001E10 cpu, 0.0 io, 1.4241183301631998E11 network, 
> 9.1066982688E8 memory}, id = 5009
> 01-03                  Project(sourceIP=[$0], totalRevenue=[$1], 
> avgPageRank=[$2]) : rowType = RecordType(VARCHAR(65535) sourceIP, DOUBLE 
> totalRevenue, ANY avgPageRank): rowcount = 457983.77999999997, cumulative 
> cost = {5.2802660831999993E8 rows, 6.356127679422001E10 cpu, 0.0 io, 
> 1.4241183301631998E11 network, 9.1066982688E8 memory}, id = 5008
> 01-04                    HashToRandomExchange(dist0=[[$1]]) : rowType = 
> RecordType(VARCHAR(65535) sourceIP, DOUBLE totalRevenue, ANY avgPageRank, ANY 
> E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 457983.77999999997, cumulative cost = 
> {5.2802660831999993E8 rows, 6.356127679422001E10 cpu, 0.0 io, 
> 1.4241183301631998E11 network, 9.1066982688E8 memory}, id = 5007
> 02-01                      UnorderedMuxExchange : rowType = 
> RecordType(VARCHAR(65535) sourceIP, DOUBLE totalRevenue, ANY avgPageRank, ANY 
> E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 457983.77999999997, cumulative cost = 
> {5.2756862453999996E8 rows, 6.3553949053740005E10 cpu, 0.0 io, 
> 1.3490822676479999E11 network, 9.1066982688E8 memory}, id = 5006
> 03-01                        Project(sourceIP=[$0], totalRevenue=[$1], 
> avgPageRank=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1)]) : rowType = 
> RecordType(VARCHAR(65535) sourceIP, DOUBLE totalRevenue, ANY avgPageRank, ANY 
> E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 457983.77999999997, cumulative cost = 
> {5.2711064076E8 rows, 6.355349106996001E10 cpu, 0.0 io, 1.3490822676479999E11 
> network, 9.1066982688E8 memory}, id = 5005
> 03-02                          Project(sourceIP=[$0], 
> totalRevenue=[CASE(=($4, 0), null, $3)], 
> avgPageRank=[CAST(/(CastHigh(CASE(=($2, 0), null, $1)), $2)):ANY NOT NULL]) : 
> rowType = RecordType(VARCHAR(65535) sourceIP, DOUBLE totalRevenue, ANY 
> avgPageRank): rowcount = 457983.77999999997, cumulative cost = 
> {5.2665265698E8 rows, 6.3551659134840004E10 cpu, 0.0 io, 
> 1.3490822676479999E11 network, 9.1066982688E8 memory}, id = 5004
> 03-03                            HashAgg(group=[{0}], agg#0=[$SUM0($1)], 
> agg#1=[$SUM0($2)], agg#2=[$SUM0($3)], agg#3=[$SUM0($4)]) : rowType = 
> RecordType(VARCHAR(65535) sourceIP, INTEGER $f1, BIGINT $f2, DOUBLE $f3, 
> BIGINT $f4): rowcount = 457983.77999999997, cumulative cost = 
> {5.2619467320000005E8 rows, 6.3547995264600006E10 cpu, 0.0 io, 
> 1.3490822676479999E11 network, 9.1066982688E8 memory}, id = 5003
> 03-04                              Project(sourceIP=[$0], $f1=[$1], $f2=[$2], 
> $f3=[$3], $f4=[$4]) : rowType = RecordType(VARCHAR(65535) sourceIP, INTEGER 
> $f1, BIGINT $f2, DOUBLE $f3, BIGINT $f4): rowcount = 4579837.8, cumulative 
> cost = {5.2161483540000004E8 rows, 6.32915243478E10 cpu, 0.0 io, 
> 1.3490822676479999E11 network, 8.300646816E8 memory}, id = 5002
> 03-05                                HashToRandomExchange(dist0=[[$0]]) : 
> rowType = RecordType(VARCHAR(65535) sourceIP, INTEGER $f1, BIGINT $f2, DOUBLE 
> $f3, BIGINT $f4, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 4579837.8, 
> cumulative cost = {5.2161483540000004E8 rows, 6.32915243478E10 cpu, 0.0 io, 
> 1.3490822676479999E11 network, 8.300646816E8 memory}, id = 5001
> 04-01                                  UnorderedMuxExchange : rowType = 
> RecordType(VARCHAR(65535) sourceIP, INTEGER $f1, BIGINT $f2, DOUBLE $f3, 
> BIGINT $f4, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 4579837.8, cumulative 
> cost = {5.170349976E8 rows, 6.3218246943E10 cpu, 0.0 io, 2.2354132992E10 
> network, 8.300646816E8 memory}, id = 5000
> 05-01                                    Project(sourceIP=[$0], $f1=[$1], 
> $f2=[$2], $f3=[$3], $f4=[$4], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0)]) 
> : rowType = RecordType(VARCHAR(65535) sourceIP, INTEGER $f1, BIGINT $f2, 
> DOUBLE $f3, BIGINT $f4, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 4579837.8, 
> cumulative cost = {5.124551598E8 rows, 6.32136671052E10 cpu, 0.0 io, 
> 2.2354132992E10 network, 8.300646816E8 memory}, id = 4999
> 05-02                                      HashAgg(group=[{0}], 
> agg#0=[$SUM0($1)], agg#1=[COUNT($1)], agg#2=[$SUM0($2)], agg#3=[COUNT($2)]) : 
> rowType = RecordType(VARCHAR(65535) sourceIP, INTEGER $f1, BIGINT $f2, DOUBLE 
> $f3, BIGINT $f4): rowcount = 4579837.8, cumulative cost = {5.07875322E8 rows, 
> 6.3195347754E10 cpu, 0.0 io, 2.2354132992E10 network, 8.300646816E8 memory}, 
> id = 4998
> 05-03                                        Project(sourceIP=[$0], 
> pagerank=[$5], adrevenue=[$3]) : rowType = RecordType(VARCHAR(65535) 
> sourceIP, INTEGER pagerank, DOUBLE adrevenue): rowcount = 4.5798378E7, 
> cumulative cost = {4.62076944E8 rows, 6.0630638586E10 cpu, 0.0 io, 
> 2.2354132992E10 network, 2.40132288E7 memory}, id = 4997
> 05-04                                          HashJoin(condition=[=($4, 
> $1)], joinType=[inner]) : rowType = RecordType(VARCHAR(65535) sourceip, 
> VARCHAR(65535) desturl, VARCHAR(65535) visitdate, DOUBLE adrevenue, 
> VARCHAR(65535) pageurl, INTEGER pagerank): rowcount = 4.5798378E7, cumulative 
> cost = {4.62076944E8 rows, 6.0630638586E10 cpu, 0.0 io, 2.2354132992E10 
> network, 2.40132288E7 memory}, id = 4996
> 05-06                                            SelectionVectorRemover : 
> rowType = RecordType(VARCHAR(65535) sourceip, VARCHAR(65535) desturl, 
> VARCHAR(65535) visitdate, DOUBLE adrevenue): rowcount = 4.5798378E7, 
> cumulative cost = {4.12185402E8 rows, 5.9950076802E10 cpu, 0.0 io, 0.0 
> network, 0.0 memory}, id = 4993
> 05-07                                              
> Filter(condition=[AND(>=(DATEDIFF($2, '1980-01-01'), 0), <=(DATEDIFF($2, 
> '1980-04-01'), 0))]) : rowType = RecordType(VARCHAR(65535) sourceip, 
> VARCHAR(65535) desturl, VARCHAR(65535) visitdate, DOUBLE adrevenue): rowcount 
> = 4.5798378E7, cumulative cost = {3.66387024E8 rows, 5.9904278424E10 cpu, 0.0 
> io, 0.0 network, 0.0 memory}, id = 4992
> 05-08                                                Scan(groupscan=[HiveScan 
> [table=Table(dbName:default, tableName:uservisits_copy_huge), 
> columns=[`sourceip`, `desturl`, `visitdate`, `adrevenue`], numPartitions=0, 
> partitions= null, 
> inputDirectories=[hdfs://nameservice1/HiBench/huge/Join/Input/uservisits]]]) 
> : rowType = RecordType(VARCHAR(65535) sourceip, VARCHAR(65535) desturl, 
> VARCHAR(65535) visitdate, DOUBLE adrevenue): rowcount = 1.83193512E8, 
> cumulative cost = {1.83193512E8 rows, 5.862192384E10 cpu, 0.0 io, 0.0 
> network, 0.0 memory}, id = 4991
> 05-05                                            BroadcastExchange : rowType 
> = RecordType(VARCHAR(65535) pageurl, INTEGER pagerank): rowcount = 1364388.0, 
> cumulative cost = {2728776.0 rows, 1.20066144E8 cpu, 0.0 io, 2.2354132992E10 
> network, 0.0 memory}, id = 4995
> 06-01                                              Scan(groupscan=[HiveScan 
> [table=Table(dbName:default, tableName:rankings_huge), columns=[`pageurl`, 
> `pagerank`], numPartitions=0, partitions= null, 
> inputDirectories=[hdfs://nameservice1/HiBench/huge/Join/Input/rankings]]]) : 
> rowType = RecordType(VARCHAR(65535) pageurl, INTEGER pagerank): rowcount = 
> 1364388.0, cumulative cost = {1364388.0 rows, 1.0915104E8 cpu, 0.0 io, 0.0 
> network, 0.0 memory}, id = 4994
> --
> 
> thanks for your regards.
> 
> 
> 

Reply via email to