jhorstmann commented on pull request #9038:
URL: https://github.com/apache/arrow/pull/9038#issuecomment-753397568


   Semantics for the `IN` and `OR` versions should be the same *in SQL*, for 
example [try the following query in postgres][1]:
   
   ```
   SELECT 'abc' in ('abc', 'def')
        , 'abc' in ('abc', 'def', null)
        , 'abc' in ('def', null)
        , null in ('abc', 'def')
        -- same expressions rewritten using OR
        , (('abc' = 'abc') OR ('abc' = 'def'))
        , (('abc' = 'abc') OR ('abc' = 'def') or ('abc' = null))
        , (('abc' = 'def') or ('abc' = null))
        , ((null = 'abc') OR (null = 'def'))
   ```
   
    1. result is `true`, no nulls involved
    2. result is `true`, additional `null` on the rhs does not change this 
since `(true OR null) = true`
    3. result is `null` since `(false OR null) IS NULL`
    4. result is `null` since lhs of each comparison is null
   
   But you are right, it's not that simple since the arrow `or` / `and` kernels 
currently do not follow this sql behaviour. We might want to change that or 
rather introduce separate boolean kernels with the sql behaviour regarding 
nulls.
   
   
    [1]: http://sqlfiddle.com/#!17/9eecb/67863


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to