bert-beyondloops opened a new issue, #16474:
URL: https://github.com/apache/datafusion/issues/16474

   ### Describe the bug
   
   
   When using the array_has function with an empty list, the result is <null>
   
   Currently : 
   
   array_has([], 1) => **null**.
   array_has(null, 1) => null.
   
   In my opinion, it should **_only_** return null when the list itself is null.
   
   In duckDb:
   array_has([], 1) => **false**.
   array_has(null, 1) => null.
   
   
   
   
   ### To Reproduce
   
   
   ```
   CREATE TABLE foo ( bar INTEGER[] );
   
   INSERT INTO foo VALUES 
   ( [1, 3, 5]),
   (  []),
   ( null)
   ;
   
   SELECT bar, IFNULL(CAST(list_contains(bar, 1) AS VARCHAR), 'null') FROM foo;
   ```
   
   will result in :
   ```
   
   +-----------+-----------------------------------------------+
   | bar       | nvl(array_has(foo.bar,Int64(1)),Utf8("null")) |
   +-----------+-----------------------------------------------+
   | [1, 3, 5] | true                                          |
   | []        | null                                          |
   |           | null                                          |
   +-----------+-----------------------------------------------+
   ```
   
   whereas for example in duckDb you get : 
   ```
   
   ┌───────────┬──────────────────────────────────────────────────────────┐
   │    bar    │ COALESCE(CAST(list_contains(bar, 1) AS VARCHAR), 'null') │
   │  int32[]  │                         varchar                          │
   ├───────────┼──────────────────────────────────────────────────────────┤
   │ [1, 3, 5] │ true                                                     │
   │ []        │ false                                                    │
   │           │ null                                                     │
   └───────────┴──────────────────────────────────────────────────────────┘
   ```
   
   ### Expected behavior
   
   array_has([], 1) => false.
   array_has(null, 1) => null.
   
   ### Additional context
   
   According to me, the issue is situated in the 
`array_has_dispatch_for_scalar` method : 
   
   
   ```
   ...
   for (i, offset) in offsets.windows(2).enumerate() {
           let start = offset[0].to_usize().unwrap();
           let end = offset[1].to_usize().unwrap();
           let length = end - start;
           // For non-nested list, length is 0 for null
           if length == 0 {
               continue;
           }
           let sliced_array = eq_array.slice(start, length);
           final_contained[i] = Some(sliced_array.true_count() > 0);
       }
   ...
   ```
   
   There is no distinction between null or length = 0.
   You cannot only look at the offset difference. You have to look at the 
validity buffer as well.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to