alamb commented on code in PR #13966:
URL: https://github.com/apache/datafusion/pull/13966#discussion_r1900398739


##########
datafusion/functions-nested/src/set_ops.rs:
##########
@@ -516,11 +516,18 @@ fn general_array_distinct<OffsetSize: OffsetSizeTrait>(
     let mut new_arrays = Vec::with_capacity(array.len());
     let converter = RowConverter::new(vec![SortField::new(dt)])?;
     // distinct for each list in ListArray
-    for arr in array.iter().flatten() {
+    for arr in array.iter() {
+        let last_offset: OffsetSize = offsets.last().copied().unwrap();
+        if arr.is_none() {
+            // Add same offset for null
+            offsets.push(last_offset);
+            continue;
+        }
+
+        let arr = arr.unwrap();

Review Comment:
   I think another way to express this pattern without having to do `unwrap` is:
   
   ```suggestion
           let Some(arr) = arr else {
               // Add same offset for null
               offsets.push(last_offset);
               continue;
           };
   ```
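
   For illustration, here is a minimal, self-contained sketch of the same `let ... else` shape. It uses plain `Vec<Option<Vec<i32>>>` instead of Arrow arrays, and `distinct_offsets` is a hypothetical name, not code from this PR. It also deduplicates in first-seen order, which is a simplification and may differ from what `general_array_distinct` actually does.

   ```rust
   /// Hypothetical stand-in for the loop in `general_array_distinct`.
   /// Returns the list offsets and the flattened distinct values;
   /// a null list repeats the previous offset and contributes no values.
   fn distinct_offsets(lists: &[Option<Vec<i32>>]) -> (Vec<usize>, Vec<i32>) {
       // Offsets always start at 0, so `last()` below is never `None`.
       let mut offsets = vec![0usize];
       let mut values = Vec::new();
       for arr in lists {
           let last_offset = *offsets.last().unwrap();
           let Some(arr) = arr else {
               // Add same offset for null
               offsets.push(last_offset);
               continue;
           };
           // Keep the first occurrence of each element (assumed ordering).
           let mut seen: Vec<i32> = Vec::new();
           for v in arr {
               if !seen.contains(v) {
                   seen.push(*v);
               }
           }
           offsets.push(last_offset + seen.len());
           values.extend(seen);
       }
       (offsets, values)
   }
   ```

   Note that `let ... else` is a statement, so the closing brace of the `else` block must be followed by `;`.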



##########
datafusion/sqllogictest/test_files/array.slt:
##########
@@ -5674,6 +5674,13 @@ select array_distinct([sum(a)]) from t1 where a > 100 group by b;
 statement ok
 drop table t1;
 
+query ?
+select array_distinct(a) from values ([1, 2, 3]), (null), ([1, 3, 1]) as X(a);
+----
+[1, 2, 3]
+NULL
+[1, 3]

Review Comment:
   > Does this mean that `datafusion-cli -c select array_distinct(null);` should also succeed? It seems that `array_distinct` only accepts arguments of array type.
   > 
   
   I would expect `array_distinct(null)` to return `null` as well. A few lines up there seems to be a reference to https://github.com/apache/datafusion/issues/7142:
   
   ```
   #TODO: https://github.com/apache/datafusion/issues/7142
   #query ?
   #select array_distinct(null);
   #----
   #NULL
   ```
   
   I tried it with this PR and found that the query still doesn't work.
   
   Thus I think this PR makes the behavior neither better nor worse.



##########
datafusion/functions-nested/src/set_ops.rs:
##########
@@ -538,6 +545,7 @@ fn general_array_distinct<OffsetSize: OffsetSizeTrait>(
         Arc::clone(field),
         offsets,
         values,
-        None,
+        // Keep the list nulls

Review Comment:
   👍 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

