[ 
https://issues.apache.org/jira/browse/SPARK-44493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang updated SPARK-44493:
--------------------------------
    Description: 
Example:
{code:sql}
select count(*)
from
  db.very_large_table
where
  session_start_dt between date_sub('2023-07-15', 1) and date_add('2023-07-16', 1)
  and type = 'event'
  and date(event_timestamp) between '2023-07-15' and '2023-07-16'
  and (
    (
      page_id in (2627, 2835, 2402999)
      and -- other predicates
      and rdt = 0
    ) or (
      page_id in (2616, 3411350)
      and rdt = 0
    ) or (
      page_id = 2403006
    ) or (
      page_id in (2208336, 2356359)
      and -- other predicates
      and rdt = 0
    )
  )
{code}

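Every branch of the OR constrains {{page_id}}, so the disjunction as a whole implies {{page_id in (2627, 2835, 2402999, 2616, 3411350, 2403006, 2208336, 2356359)}}, and that predicate can be pushed down to the data source.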
Before:
 !before.png! 
After:
 !after.png! 
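
The extraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Spark's optimizer code: the tuple-based predicate representation and the name `extract_in_values` are hypothetical, and the "other predicates" elided in the query are modelled as opaque conjuncts. A filter on a column is derivable only when every OR branch constrains that column; the pushed-down value list is then the union over the branches.

```python
from typing import Optional, Sequence

# Hypothetical mini-representation (not Catalyst): a conjunct is
# (column, values) for `column IN (values)` / `column = value`,
# or (label, None) for a predicate we cannot reason about.
Conjunct = tuple[str, Optional[tuple[int, ...]]]
Branch = Sequence[Conjunct]  # conjuncts ANDed together


def extract_in_values(branches: Sequence[Branch], column: str) -> Optional[list[int]]:
    """Return values v such that (branch1 OR branch2 OR ...) implies
    `column IN (v...)`, or None if some branch leaves `column` unconstrained."""
    merged: list[int] = []
    for branch in branches:
        constraints = [vals for col, vals in branch if col == column and vals is not None]
        if not constraints:
            # This branch can match any value of `column`, so nothing can be extracted.
            return None
        # Several constraints inside one AND branch must all hold: intersect them.
        allowed = set(constraints[0]).intersection(*constraints[1:])
        for value in constraints[0]:
            if value in allowed and value not in merged:
                merged.append(value)  # union across branches, first-seen order
    return merged


# The four OR branches from the example query.
branches = [
    [("page_id", (2627, 2835, 2402999)), ("other predicates", None), ("rdt", (0,))],
    [("page_id", (2616, 3411350)), ("rdt", (0,))],
    [("page_id", (2403006,))],
    [("page_id", (2208336, 2356359)), ("other predicates", None), ("rdt", (0,))],
]

pushed = extract_in_values(branches, "page_id")
not_pushable = extract_in_values(branches, "rdt")  # third branch has no rdt constraint
```

The extracted predicate is only implied by the disjunction, not equivalent to it, so in this sketch it would be added alongside the original condition rather than replacing it.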



  was:
Example:
{code:sql}
select count(*)
from
  db.very_large_table
where
  session_start_dt between date_sub('2023-07-15', 1) and date_add('2023-07-16', 1)
  and type = 'event'
  and date(event_timestamp) between '2023-07-15' and '2023-07-16'
  and (
    (
      page_id in (2627, 2835, 2402999)
      and -- other predicates
      and rdt = 0
    ) or (
      page_id in (2616, 3411350)
      and rdt = 0
    ) or (
      page_id = 2403006
    ) or (
      page_id in (2208336, 2356359)
      and -- other predicates
      and rdt = 0
    )
  )
{code}

We can push down {{page_id in (2627, 2835, 2402999, 2616, 3411350, 2403006, 2208336, 2356359)}} to datasource.
Before:

After:




> Extract pushable predicates from disjunctive predicates
> -------------------------------------------------------
>
>                 Key: SPARK-44493
>                 URL: https://issues.apache.org/jira/browse/SPARK-44493
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Yuming Wang
>            Priority: Major
>         Attachments: after.png, before.png
>
>
> Example:
> {code:sql}
> select count(*)
> from
>   db.very_large_table
> where
>   session_start_dt between date_sub('2023-07-15', 1) and date_add('2023-07-16', 1)
>   and type = 'event'
>   and date(event_timestamp) between '2023-07-15' and '2023-07-16'
>   and (
>     (
>       page_id in (2627, 2835, 2402999)
>       and -- other predicates
>       and rdt = 0
>     ) or (
>       page_id in (2616, 3411350)
>       and rdt = 0
>     ) or (
>       page_id = 2403006
>     ) or (
>       page_id in (2208336, 2356359)
>       and -- other predicates
>       and rdt = 0
>     )
>   )
> {code}
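> Every branch of the OR constrains {{page_id}}, so the disjunction as a whole implies {{page_id in (2627, 2835, 2402999, 2616, 3411350, 2403006, 2208336, 2356359)}}, and that predicate can be pushed down to the data source.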
> Before:
>  !before.png! 
> After:
>  !after.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
