Song Jun created SPARK-27280:
--------------------------------
Summary: infer filters from Join's OR condition
Key: SPARK-27280
URL: https://issues.apache.org/jira/browse/SPARK-27280
Project: Spark
Issue Type: Improvement
Components: Optimizer, SQL
Affects Versions: 3.0.0
Reporter: Song Jun
In some cases, we can infer filters from a join condition that contains OR expressions.
For example, TPC-DS query 48:
{code:java}
select sum (ss_quantity)
from store_sales, store, customer_demographics, customer_address, date_dim
where s_store_sk = ss_store_sk
and ss_sold_date_sk = d_date_sk and d_year = 2000
and
(
(
cd_demo_sk = ss_cdemo_sk
and
cd_marital_status = 'S'
and
cd_education_status = 'Secondary'
and
ss_sales_price between 100.00 and 150.00
)
or
(
cd_demo_sk = ss_cdemo_sk
and
cd_marital_status = 'M'
and
cd_education_status = 'College'
and
ss_sales_price between 50.00 and 100.00
)
or
(
cd_demo_sk = ss_cdemo_sk
and
cd_marital_status = 'U'
and
cd_education_status = '2 yr Degree'
and
ss_sales_price between 150.00 and 200.00
)
)
and
(
(
ss_addr_sk = ca_address_sk
and
ca_country = 'United States'
and
ca_state in ('AL', 'OH', 'MD')
and ss_net_profit between 0 and 2000
)
or
(ss_addr_sk = ca_address_sk
and
ca_country = 'United States'
and
ca_state in ('VA', 'TX', 'IA')
and ss_net_profit between 150 and 3000
)
or
(ss_addr_sk = ca_address_sk
and
ca_country = 'United States'
and
ca_state in ('RI', 'WI', 'KY')
and ss_net_profit between 50 and 25000
)
)
;
{code}
We can infer two filters from the OR conditions in the join:
{code:java}
for customer_demographics:
cd_marital_status in ('S', 'M', 'U') and cd_education_status in ('Secondary', 'College', '2 yr Degree')
for store_sales:
(ss_sales_price between 100.00 and 150.00 or ss_sales_price between 50.00 and 100.00 or ss_sales_price between 150.00 and 200.00)
{code}
We can then push these two filters down to customer_demographics and store_sales, so that both tables are filtered before the join.
A PR will be submitted soon.
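
The inference described above can be sketched roughly as follows. This is only an illustration in Python, not the Catalyst rule from the planned PR: `Pred`, `infer_filter`, and `branch` are hypothetical names, and each predicate is assumed to already know which tables it references. For every OR branch, the sketch keeps the conjuncts that touch only the target table; if any branch has none, no filter can be inferred for that table.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Pred(NamedTuple):
    """One conjunct of an OR branch (hypothetical helper type)."""
    text: str                # SQL text of the predicate
    tables: FrozenSet[str]   # tables whose columns the predicate references


def infer_filter(disjuncts: List[List[Pred]], table: str) -> Optional[str]:
    """Return a filter on `table` implied by an OR of ANDs, or None."""
    branches = []
    for conjuncts in disjuncts:
        # Keep only predicates that reference the target table alone.
        local = [p.text for p in conjuncts if p.tables == frozenset({table})]
        if not local:
            # This branch does not restrict the table, so nothing is implied.
            return None
        branches.append("(" + " and ".join(local) + ")")
    return " or ".join(branches)


CD, SS = "customer_demographics", "store_sales"


def branch(marital: str, education: str, low: str, high: str) -> List[Pred]:
    """Build one OR branch of the demographics condition in query 48."""
    return [
        Pred("cd_demo_sk = ss_cdemo_sk", frozenset({CD, SS})),
        Pred(f"cd_marital_status = '{marital}'", frozenset({CD})),
        Pred(f"cd_education_status = '{education}'", frozenset({CD})),
        Pred(f"ss_sales_price between {low} and {high}", frozenset({SS})),
    ]


query48 = [
    branch("S", "Secondary", "100.00", "150.00"),
    branch("M", "College", "50.00", "100.00"),
    branch("U", "2 yr Degree", "150.00", "200.00"),
]

cd_filter = infer_filter(query48, CD)
ss_filter = infer_filter(query48, SS)
print(cd_filter)
print(ss_filter)
```

The filter this produces for customer_demographics keeps each marital status paired with its education status, which is slightly stricter than a pair of independent IN lists; either form is implied by the original condition, so either could be pushed down. The join predicate cd_demo_sk = ss_cdemo_sk references both tables and is therefore left out of both inferred filters.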