Jefffrey commented on code in PR #18518:
URL: https://github.com/apache/datafusion/pull/18518#discussion_r2501541689
##########
datafusion/spark/src/function/array/shuffle.rs:
##########
@@ -66,32 +68,108 @@ impl ScalarUDFImpl for SparkShuffle {
}
fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
+ if arg_types.is_empty() {
+ return plan_err!("shuffle expects at least 1 argument");
+ }
Review Comment:
```suggestion
```
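We don't need this check; the argument count is already validated against
the function signature before `return_type` is called.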
##########
datafusion/spark/src/function/array/shuffle.rs:
##########
@@ -47,7 +49,7 @@ impl Default for SparkShuffle {
impl SparkShuffle {
pub fn new() -> Self {
Self {
- signature: Signature::arrays(1, None, Volatility::Volatile),
+ signature: Signature::user_defined(Volatility::Volatile),
Review Comment:
We should avoid using `Signature::user_defined` and use
`TypeSignature::ArraySignature` instead, for example:
https://github.com/apache/datafusion/blob/7591919be7e6582ed7f6a8d0b033f7a0a8ad60f7/datafusion/expr-common/src/signature.rs#L1247-L1262
- The `Index` argument can be used for the seed, since it is an Int64 type
- We can't use `Signature::array_and_index` directly, since that would coerce
FixedSizeLists to List arrays, which we don't want because we have a native
implementation for FixedSizeLists
- Ensure both forms are supported: array only, and array + seed
##########
datafusion/sqllogictest/test_files/spark/array/shuffle.slt:
##########
@@ -22,7 +22,7 @@ SELECT array_sort(shuffle([1, 2, 3, 4, 5, NULL])) = [NULL,1, 2, 3, 4, 5];
true
query B
-SELECT shuffle([1, 2, 3, 4, 5, NULL]) != [1, 2, 3, 4, 5, NULL];
+SELECT shuffle([1, 2, 3, 4, 5, NULL], 1) != [1, 2, 3, 4, 5, NULL];
----
true
Review Comment:
Now that a seed is passed, the output is deterministic, so we can assert the
output list directly instead of checking that it differs from the input
array, e.g.
```sql
query ?
SELECT shuffle([1, 2, 3, 4, 5, NULL], 1);
----
[2, 5, NULL, 3, 4, 1]
```
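To illustrate why a fixed seed makes an exact-output assertion possible, here is a minimal std-only sketch. It is not the DataFusion or Spark implementation: `shuffle_with_seed` and `next_u64` are hypothetical names, and the SplitMix64 generator is only a stand-in for the real RNG, so the order it produces will not match the expected row above. `seed: None` models the array-only call and `Some(seed)` models the array + seed call.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// One SplitMix64 step. Assumption: a stand-in generator for illustration,
/// not the RNG that Spark or DataFusion actually use.
fn next_u64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Fisher-Yates shuffle over a copy of `values` (hypothetical helper).
/// `None` falls back to a time-based seed, so results vary between calls;
/// `Some(seed)` always yields the same permutation for the same input.
fn shuffle_with_seed<T: Clone>(values: &[T], seed: Option<i64>) -> Vec<T> {
    let mut state = match seed {
        Some(s) => s as u64,
        None => SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_nanos() as u64)
            .unwrap_or(0),
    };
    let mut out = values.to_vec();
    // Walk from the last index down, swapping with a random earlier slot.
    for i in (1..out.len()).rev() {
        let j = (next_u64(&mut state) % (i as u64 + 1)) as usize;
        out.swap(i, j);
    }
    out
}

fn main() {
    // `None` plays the role of the SQL NULL element.
    let input = vec![Some(1), Some(2), Some(3), Some(4), Some(5), None];
    let first = shuffle_with_seed(&input, Some(1));
    let second = shuffle_with_seed(&input, Some(1));
    // Same seed, same input: the two results are identical.
    assert_eq!(first, second);
    println!("{:?}", first);
}
```

Because the seeded call is reproducible, a test can pin the exact output row, whereas the unseeded call can only be checked indirectly (for example by sorting the result).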