Dandandan opened a new issue #799:
URL: https://github.com/apache/arrow-datafusion/issues/799


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   For TPC-H query 12, according to the profiler, about 16% of the time 
(including IO) is spent in the code in `InListExpr`.
   
   The full query is:
   
   ```
   select
       l_shipmode,
       sum(case
               when o_orderpriority = '1-URGENT'
                   or o_orderpriority = '2-HIGH'
                   then 1
               else 0
           end) as high_line_count,
       sum(case
               when o_orderpriority <> '1-URGENT'
                   and o_orderpriority <> '2-HIGH'
                   then 1
               else 0
           end) as low_line_count
   from
       lineitem
           join
       orders
       on
               l_orderkey = o_orderkey
   where
           l_shipmode in ('MAIL', 'SHIP')
     and l_commitdate < l_receiptdate
     and l_shipdate < l_commitdate
     and l_receiptdate >= date '1994-01-01'
     and l_receiptdate < date '1995-01-01'
   group by
       l_shipmode
   order by
       l_shipmode;
   ```
   
   We can replace:
   
   ` x in (a, b)` 
   with the equivalent of
   
   `(x==a or x==b)`
   
   for "small enough" lists for some performance improvement.
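
The rewrite described above could look roughly like the following. This is a minimal sketch on a toy expression tree, not DataFusion's actual `Expr` or `InListExpr` types. The `Expr` enum, the `rewrite_in_list` function, and the `MAX_INLIST_REWRITE` cutoff of 3 are all hypothetical names and values chosen for illustration; the issue does not say what counts as "small enough".

```rust
/// Toy expression tree (hypothetical; not DataFusion's real `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    Eq(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    InList { expr: Box<Expr>, list: Vec<Expr> },
}

/// Assumed cutoff for a "small enough" list; the right value would
/// have to come from benchmarking.
const MAX_INLIST_REWRITE: usize = 3;

/// Rewrites `x IN (a, b, ...)` into `x = a OR x = b OR ...` when the
/// list is non-empty and short. Every other expression is returned
/// unchanged. Only the top-level node is inspected.
fn rewrite_in_list(e: Expr) -> Expr {
    match e {
        Expr::InList { expr, list }
            if !list.is_empty() && list.len() <= MAX_INLIST_REWRITE =>
        {
            list.into_iter()
                // Build one equality comparison per list element.
                .map(|item| Expr::Eq(expr.clone(), Box::new(item)))
                // Fold the comparisons into a left-nested chain of ORs.
                .reduce(|acc, eq| Expr::Or(Box::new(acc), Box::new(eq)))
                .expect("guard ensures the list is non-empty")
        }
        other => other,
    }
}
```

For the predicate in query 12, `l_shipmode in ('MAIL', 'SHIP')` would become `l_shipmode = 'MAIL' OR l_shipmode = 'SHIP'`, while a list longer than the cutoff is left as an `InList` node. The sketch does not handle `NOT IN`, and it does not recurse into child expressions.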
   
   **Describe the solution you'd like**
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


