2010YOUY01 commented on code in PR #19587:
URL: https://github.com/apache/datafusion/pull/19587#discussion_r2656071853
##########
datafusion/common/src/utils/mod.rs:
##########
@@ -1026,22 +1026,26 @@ mod tests {
ScalarValue::Int32(Some(2)),
Null,
ScalarValue::Int32(Some(0)),
- ] < vec![
+ ]
+ .partial_cmp(&vec![
ScalarValue::Int32(Some(2)),
Null,
ScalarValue::Int32(Some(1)),
- ]
+ ])
+ .is_none()
Review Comment:
TL;DR: I suggest we follow PostgreSQL's behavior and return true here.
Strictly by SQL semantics, the comparison should return Null.
SQL Null behavior reference:
https://github.com/apache/datafusion/blob/b818f93416d18d06374a0707f5ef571f8a384070/datafusion/pruning/src/pruning_predicate.rs#L113
However, Postgres and DuckDB both have 'Null equals Null' behavior if the Null
is inside a composite type:
```sh
postgres=# SELECT ARRAY[2, NULL, 0] < ARRAY[2, NULL, 1];
?column?
----------
t
(1 row)
D SELECT [2, NULL, 0] < [2, NULL, 1] AS result;
┌─────────┐
│ result │
│ boolean │
├─────────┤
│ true │
└─────────┘
```
Postgres explains the rationale here:
https://www.postgresql.org/docs/current/functions-comparisons.html#COMPOSITE-TYPE-COMPARISON
I’ve read that section three times now, and I’ll be honest — I still have no
idea what they’re talking about 😅
DuckDB says it follows the Postgres behavior:
https://duckdb.org/docs/stable/sql/data_types/list#comparison-and-ordering
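To make the two behaviors concrete, here is a minimal sketch. It is not DataFusion code: `Option<i32>` stands in for a nullable `Int32`, and `cmp_strict` and `cmp_null_equal` are hypothetical names. The first follows strict SQL semantics (the result is unknown once a Null is hit before the order is decided). The second treats Null as equal to Null inside the list; sorting Null after non-null values is an assumption of this sketch.

```rust
use std::cmp::Ordering;

/// Strict SQL-style comparison: the result is unknown (`None`) as soon as a
/// Null is reached before the ordering has been decided.
fn cmp_strict(a: &[Option<i32>], b: &[Option<i32>]) -> Option<Ordering> {
    for (x, y) in a.iter().zip(b.iter()) {
        match (x, y) {
            (Some(x), Some(y)) => match x.cmp(y) {
                Ordering::Equal => continue,
                decided => return Some(decided),
            },
            // At least one side is Null: the comparison is unknown.
            _ => return None,
        }
    }
    // All shared positions are equal: the shorter list sorts first.
    Some(a.len().cmp(&b.len()))
}

/// "Null equals Null" comparison for elements nested in a list.
/// Assumption: Null sorts after every non-null value.
fn cmp_null_equal(a: &[Option<i32>], b: &[Option<i32>]) -> Ordering {
    for (x, y) in a.iter().zip(b.iter()) {
        let ord = match (x, y) {
            (None, None) => Ordering::Equal,
            (None, Some(_)) => Ordering::Greater,
            (Some(_), None) => Ordering::Less,
            (Some(x), Some(y)) => x.cmp(y),
        };
        if ord != Ordering::Equal {
            return ord;
        }
    }
    a.len().cmp(&b.len())
}
```

For `[2, NULL, 0]` and `[2, NULL, 1]`, the strict version stops at the second element and yields no ordering, while the Null-equals-Null version goes on to compare `0` with `1`.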
##########
datafusion/common/src/scalar/mod.rs:
##########
@@ -723,7 +727,7 @@ impl PartialOrd for ScalarValue {
if k1 == k2 { v1.partial_cmp(v2) } else { None }
}
(Dictionary(_, _), _) => None,
- (Null, Null) => Some(Ordering::Equal),
+ // Null is handled by the early return above, but we need this for exhaustiveness
Review Comment:
Should we do something like
```rust
(Null, Null) | (Null, _) | (_, Null) => unreachable!(
    "Nulls are already handled before entering this match arm"
),
```
to be more defensive?
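A self-contained sketch of that pattern, using a hypothetical toy `Value` enum and a free function `compare` rather than the real `ScalarValue` and its `PartialOrd` impl: Nulls return early, and the Null arms in the `match` exist only for exhaustiveness and panic if they are ever reached.

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for a scalar type; not the DataFusion `ScalarValue`.
#[derive(Debug)]
enum Value {
    Null,
    Int(i32),
    Text(String),
}

fn compare(a: &Value, b: &Value) -> Option<Ordering> {
    use Value::*;

    // Early return: any comparison involving Null has no defined ordering.
    if matches!(a, Null) || matches!(b, Null) {
        return None;
    }

    match (a, b) {
        (Int(x), Int(y)) => x.partial_cmp(y),
        (Text(x), Text(y)) => x.partial_cmp(y),
        // Needed for exhaustiveness only; the early return above makes
        // these arms impossible to reach.
        (Null, _) | (_, Null) => {
            unreachable!("Nulls are already handled before entering this match arm")
        }
        // Mismatched types are not comparable.
        (Int(_), Text(_)) | (Text(_), Int(_)) => None,
    }
}
```

The advantage over a silent fallback arm is that a future refactor that removes or weakens the early return fails loudly in tests instead of quietly changing the comparison result.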
##########
datafusion/common/src/scalar/mod.rs:
##########
@@ -5760,10 +5764,9 @@ mod tests {
.unwrap(),
Ordering::Less
);
- assert_eq!(
+ assert!(
ScalarValue::try_cmp(&ScalarValue::Int32(None), &ScalarValue::Int32(Some(2)))
Review Comment:
It would be great to update the doc comments for `try_cmp`. They currently say
only that it errors for incompatible types, but after this change it also
returns an error for null inputs.
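As a rough illustration of the kind of doc comment meant here, the sketch below documents both error cases on a hypothetical `try_compare` function over a toy `Scalar` enum. The names, the `String` error type, and the wording are all assumptions, not the actual `try_cmp` signature or documentation.

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for a nullable scalar; not the DataFusion type.
#[derive(Debug)]
enum Scalar {
    Int(Option<i32>),
    Text(Option<String>),
}

/// Compares two scalars of the same type.
///
/// # Errors
///
/// Returns an error if:
/// - the two values have incompatible types, or
/// - either value is null, because the ordering is then undefined.
fn try_compare(a: &Scalar, b: &Scalar) -> Result<Ordering, String> {
    match (a, b) {
        (Scalar::Int(Some(x)), Scalar::Int(Some(y))) => Ok(x.cmp(y)),
        (Scalar::Text(Some(x)), Scalar::Text(Some(y))) => Ok(x.cmp(y)),
        // Same type, but at least one side is null.
        (Scalar::Int(_), Scalar::Int(_)) | (Scalar::Text(_), Scalar::Text(_)) => {
            Err(format!("cannot compare null values: {a:?} and {b:?}"))
        }
        // Different types.
        _ => Err(format!("incompatible types: {a:?} and {b:?}")),
    }
}
```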
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]