[
https://issues.apache.org/jira/browse/HIVE-23880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Bodor updated HIVE-23880:
--------------------------------
Description:
Merging bloom filters in semijoin reduction can become the main bottleneck in
case of a large number of source mapper tasks (~1000, Map 1 in the example below)
and a large number of expected entries (50M) in the bloom filters.
For example, in TPCDS Q93:
{code}
select /*+ semi(store_returns, sr_item_sk, store_sales, 70000000)*/
       ss_customer_sk,
       sum(act_sales) sumsales
from (select ss_item_sk,
             ss_ticket_number,
             ss_customer_sk,
             case when sr_return_quantity is not null
                  then (ss_quantity - sr_return_quantity) * ss_sales_price
                  else (ss_quantity * ss_sales_price)
             end act_sales
      from store_sales
      left outer join store_returns
        on (sr_item_sk = ss_item_sk and sr_ticket_number = ss_ticket_number),
           reason
      where sr_reason_sk = r_reason_sk
        and r_reason_desc = 'reason 66') t
group by ss_customer_sk
order by sumsales, ss_customer_sk
limit 100;
{code}
At 10TB-30TB scale there is a chance that, out of 3-4 minutes of query runtime,
1-2 minutes are spent merging bloom filters (Reducer 2), as in:
[^lipwig-output3605036885489193068.svg]
{code}
----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 3 ..........      llap     SUCCEEDED      1          1        0        0       0       0
Map 1 ..........      llap     SUCCEEDED   1263       1263        0        0       0       0
Reducer 2             llap       RUNNING      1          0        1        0       0       0
Map 4                 llap       RUNNING   6154          0      207     5947       0       0
Reducer 5             llap        INITED     43          0        0       43       0       0
Reducer 6             llap        INITED      1          0        0        1       0       0
----------------------------------------------------------------------------------------------
VERTICES: 02/06 [====>>----------------------] 16% ELAPSED TIME: 149.98 s
----------------------------------------------------------------------------------------------
{code}
For example, 70M expected entries lead to a bloom filter of 436,465,696 bits, so
merging 1263 bloom filters means running roughly 1263 * 436,465,696 bitwise OR
operations. This is a very hot codepath, but it can be parallelized.
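A minimal sketch of the parallelization idea (hypothetical class and method names,
not the actual VectorUDAFBloomFilterMerge code): the merge is a word-by-word OR over
the backing long[] arrays, so disjoint word ranges can be OR-ed by independent
threads without any locking.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch only: Hive's real bloom filters also carry the hash count
// and a serialized header; this ignores all of that and merges raw bit words.
public class ParallelBloomMergeSketch {

  // 436,465,696 bits is what -n*ln(p)/ln(2)^2 gives for n = 70M at an assumed
  // 5% false-positive rate, i.e. ~6.8M longs per filter.
  static void mergeAll(long[] target, List<long[]> sources, int numThreads)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    int wordsPerThread = (target.length + numThreads - 1) / numThreads;
    List<Future<?>> results = new ArrayList<>();
    for (int t = 0; t < numThreads; t++) {
      final int start = t * wordsPerThread;
      final int end = Math.min(start + wordsPerThread, target.length);
      results.add(pool.submit(() -> {
        // Each worker ORs its own [start, end) slice of every source filter,
        // so no two threads ever write the same word of the target.
        for (long[] src : sources) {
          for (int i = start; i < end; i++) {
            target[i] |= src[i];
          }
        }
      }));
    }
    for (Future<?> r : results) {
      r.get(); // wait for completion and propagate worker exceptions
    }
    pool.shutdown();
  }
}
{code}
Splitting the work by word range rather than by source filter keeps the writes to the
target lock-free; with 1263 sources of ~6.8M longs each, the ~1263 * 436M OR operations
from above are spread over however many threads the reducer can spare.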
was:
Merging bloom filters in semijoin reduction can become the main bottleneck in
case of a large number of source mapper tasks (~1000) and a large number of
expected entries (50M) in the bloom filters.
For example, in TPCDS Q93:
{code}
select /*+ semi(store_returns, sr_item_sk, store_sales, 70000000)*/
       ss_customer_sk,
       sum(act_sales) sumsales
from (select ss_item_sk,
             ss_ticket_number,
             ss_customer_sk,
             case when sr_return_quantity is not null
                  then (ss_quantity - sr_return_quantity) * ss_sales_price
                  else (ss_quantity * ss_sales_price)
             end act_sales
      from store_sales
      left outer join store_returns
        on (sr_item_sk = ss_item_sk and sr_ticket_number = ss_ticket_number),
           reason
      where sr_reason_sk = r_reason_sk
        and r_reason_desc = 'reason 66') t
group by ss_customer_sk
order by sumsales, ss_customer_sk
limit 100;
{code}
At 10TB-30TB scale there is a chance that, out of 3-4 minutes of query runtime,
1-2 minutes are spent merging bloom filters, as in:
> Bloom filters can be merged in a parallel way in VectorUDAFBloomFilterMerge
> ---------------------------------------------------------------------------
>
> Key: HIVE-23880
> URL: https://issues.apache.org/jira/browse/HIVE-23880
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: lipwig-output3605036885489193068.svg
--
This message was sent by Atlassian Jira
(v8.3.4#803005)