Jefffrey commented on code in PR #20116:
URL: https://github.com/apache/datafusion/pull/20116#discussion_r2761901006
##########
datafusion/spark/src/function/hash/sha2.rs:
##########
@@ -112,10 +203,31 @@ fn sha2_binary_impl<'a, BinaryArrType>(
) -> ArrayRef
where
BinaryArrType: BinaryArrayType<'a>,
+{
+ sha2_binary_bitlen_iter(values, bit_lengths.iter())
+}
+
+fn sha2_binary_scalar_bitlen<'a, BinaryArrType>(
+ values: &BinaryArrType,
+ bit_length: i32,
+) -> ArrayRef
+where
+ BinaryArrType: BinaryArrayType<'a>,
+{
+ sha2_binary_bitlen_iter(values, std::iter::repeat(Some(bit_length)))
Review Comment:
I was thinking along the lines of removing the match logic on the hot loop
below, if we know the bit length for all values; I think it'll result in more
verbose code but could be worth performance. Can look into this in a followup
##########
datafusion/spark/src/function/hash/sha2.rs:
##########
@@ -87,7 +88,97 @@ impl ScalarUDFImpl for SparkSha2 {
}
fn invoke_with_args(&self, args: ScalarFunctionArgs) ->
Result<ColumnarValue> {
- make_scalar_function(sha2_impl, vec![])(&args.args)
+ let [values, bit_lengths] = take_function_args(self.name(),
args.args.iter())?;
+
+ match (values, bit_lengths) {
+ (
+ ColumnarValue::Scalar(value_scalar),
+ ColumnarValue::Scalar(ScalarValue::Int32(Some(bit_length))),
+ ) => {
+ if value_scalar.is_null() {
Review Comment:
We should pull all null checks into a single branch at the top, e.g.
```rust
match (values, bit_lengths) {
(ColumnarValue::Scalar(s), _) | (_, ColumnarValue::Scalar(s)) if
s.is_null() => { // return scalar null }
```
This means we'd only need 4 arms:
- One arm checking if either is null
- One arm for scalar + scalar
- One arm for array + scalar
- Catch all (array + array, scalar + array)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]