[ 
https://issues.apache.org/jira/browse/DRILL-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16121442#comment-16121442
 ] 

Volodymyr Vysotskyi commented on DRILL-1162:
--------------------------------------------

I was not able to receive out of heap memory for the query, so I could not 
determine its reasons.
As for the direct memory, as I wrote in the earlier comments, that the main 
reason is wrong row count estimation when joining by PR. 
Let's consider memory using for hash join operator. Information about direct 
memory using was taken from the query profiles. 
For example for queries similar to this query 
{code:sql}
select count(*) from k
inner join l on k.l_partkey=l.l_partkey
inner join m on m.l_partkey=l.l_partkey
inner join n on m.l_partkey=n.l_partkey
inner join o on n.l_partkey=o.l_partkey
inner join p on p.l_partkey=o.l_partkey;
{code}
Hash join operator uses such memory size:
||join||build side row count||Max Peak Memory for HASH_JOIN||
|k and l|10000|1MB|
|m and (k and l)|109,920|2MB|
|n and (m and (k and l))|1,867,062|16MB|
|o and (n and (m and (k and l)))|43,037,076|338MB|
|o and (n and (m1 and (k and l)))|49,446,827|388MB|
|o and (n1 and (m1 and (k and l)))|54,949,346|431MB|
|q and (p and (m and (k and l)))|61,771,300|484MB|
|p and (o and (n and (k and m)))|99,483,263|778MB|
|s and (q and (p and (k and o)))|330,512,646|2.52GB|
m1, n1, q and s tables are created by queries
{code:sql}
create table n1 as select * from `lineitem1.parquet` limit 45020;
create table m1 as select * from `lineitem1.parquet` limit 35010;
create table q as (select l_partkey from `lineitem1.parquet` union all select 
l_partkey from `lineitem1.parquet` limit 70050);
create table s as (select l_partkey from `lineitem1.parquet` union all select 
l_partkey from `lineitem1.parquet` limit 90050);
{code}
As you can see from the table, hash join operator uses direct memory in 
proportion to the rows count of the build side. 
For the query, similar to the query from Jira description, the ratio of memory 
used by the hash join operator to build side row count almost the same for the 
simplified query (it is executed without OOM)
{code:sql}
select count(*) from `lineitem1.parquet` a
inner join `part.parquet` j on a.l_partkey = j.p_partkey 
inner join `orders.parquet` k on a.l_orderkey = k.o_orderkey 
inner join `supplier.parquet` l on a.l_suppkey = l.s_suppkey 
inner join `partsupp.parquet` m on j.p_partkey = m.ps_partkey and l.s_suppkey = 
m.ps_suppkey 
inner join `customer.parquet` n on k.o_custkey = n.c_custkey 
inner join `lineitem2.parquet` b on a.l_orderkey = b.l_orderkey 
inner join `lineitem2.parquet` d on a.l_suppkey = d.l_suppkey 
inner join `lineitem2.parquet` e on a.l_extendedprice = e.l_extendedprice;
{code}
and for query that fails:
{code:sql}
select count(*) from `lineitem1.parquet` a
inner join `part.parquet` j on a.l_partkey = j.p_partkey 
inner join `orders.parquet` k on a.l_orderkey = k.o_orderkey 
inner join `supplier.parquet` l on a.l_suppkey = l.s_suppkey 
inner join `partsupp.parquet` m on j.p_partkey = m.ps_partkey and l.s_suppkey = 
m.ps_suppkey 
inner join `customer.parquet` n on k.o_custkey = n.c_custkey 
inner join `lineitem2.parquet` c on a.l_partkey = c.l_partkey 
inner join `lineitem2.parquet` d on a.l_suppkey = d.l_suppkey 
inner join `lineitem2.parquet` e on a.l_extendedprice = e.l_extendedprice;
{code}
2.08GB / 181,675,567 and 9.02GB / 787,992,000 respectively.

So the size of memory that is used by hash join operator is proportional to the 
row count of build side. That's why the actual issue is the row count 
estimation that causes plan in which right input of hash join operator actually 
has larger rows number than probe side.

Creating or modifying existing planning rules may change plan more 
significantly, so I don't think that we should do this for the current bug. But 
swapping the join inputs only for hash join when the best physical plan was 
chosen should not cause regressions. Besides that this swapping which I am 
proposed will be happening very rare since all conditions that I described in 
my previous comment should be satisfied. 

There is not enough information to make a strict decision about swapping, so we 
only check that the worst case may happen. I was looking for a better decision 
but taking into account the shortage of information about rows count I chose 
this fix.

> 25 way join ended up with OOM
> -----------------------------
>
>                 Key: DRILL-1162
>                 URL: https://issues.apache.org/jira/browse/DRILL-1162
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow, Query Planning & Optimization
>            Reporter: Rahul Challapalli
>            Assignee: Volodymyr Vysotskyi
>            Priority: Critical
>             Fix For: Future
>
>         Attachments: error.log, oom_error.log
>
>
> git.commit.id.abbrev=e5c2da0
> The below query results in 0 results being returned 
> {code:sql}
> select count(*) from `lineitem1.parquet` a 
> inner join `part.parquet` j on a.l_partkey = j.p_partkey 
> inner join `orders.parquet` k on a.l_orderkey = k.o_orderkey 
> inner join `supplier.parquet` l on a.l_suppkey = l.s_suppkey 
> inner join `partsupp.parquet` m on j.p_partkey = m.ps_partkey and l.s_suppkey 
> = m.ps_suppkey 
> inner join `customer.parquet` n on k.o_custkey = n.c_custkey 
> inner join `lineitem2.parquet` b on a.l_orderkey = b.l_orderkey 
> inner join `lineitem2.parquet` c on a.l_partkey = c.l_partkey 
> inner join `lineitem2.parquet` d on a.l_suppkey = d.l_suppkey 
> inner join `lineitem2.parquet` e on a.l_extendedprice = e.l_extendedprice 
> inner join `lineitem2.parquet` f on a.l_comment = f.l_comment 
> inner join `lineitem2.parquet` g on a.l_shipdate = g.l_shipdate 
> inner join `lineitem2.parquet` h on a.l_commitdate = h.l_commitdate 
> inner join `lineitem2.parquet` i on a.l_receiptdate = i.l_receiptdate 
> inner join `lineitem2.parquet` o on a.l_receiptdate = o.l_receiptdate 
> inner join `lineitem2.parquet` p on a.l_receiptdate = p.l_receiptdate 
> inner join `lineitem2.parquet` q on a.l_receiptdate = q.l_receiptdate 
> inner join `lineitem2.parquet` r on a.l_receiptdate = r.l_receiptdate 
> inner join `lineitem2.parquet` s on a.l_receiptdate = s.l_receiptdate 
> inner join `lineitem2.parquet` t on a.l_receiptdate = t.l_receiptdate 
> inner join `lineitem2.parquet` u on a.l_receiptdate = u.l_receiptdate 
> inner join `lineitem2.parquet` v on a.l_receiptdate = v.l_receiptdate 
> inner join `lineitem2.parquet` w on a.l_receiptdate = w.l_receiptdate 
> inner join `lineitem2.parquet` x on a.l_receiptdate = x.l_receiptdate;
> {code}
> However when we remove the last 'inner join' and run the query it returns 
> '716372534'. Since the last inner join is similar to the one's before it, it 
> should match some records and return the data appropriately.
> The logs indicated that it actually returned 0 results. Attached the log file.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to