Jefffrey commented on code in PR #21315:
URL: https://github.com/apache/datafusion/pull/21315#discussion_r3092337613
##########
datafusion/functions-aggregate-common/src/min_max.rs:
##########
@@ -423,6 +439,53 @@ macro_rules! min_max {
}};
}
+fn scalar_batch_extreme(values: &ArrayRef, ordering: Ordering) ->
Result<ScalarValue> {
+ let mut index = 0;
+ let mut extreme = loop {
+ if index == values.len() {
+ return ScalarValue::try_from(values.data_type());
+ }
+
+ let current = ScalarValue::try_from_array(values, index)?;
+ index += 1;
+
+ if !current.is_null() {
+ break current;
+ }
+ };
+
+ while index < values.len() {
+ let current = ScalarValue::try_from_array(values, index)?;
+ index += 1;
+
+ if !current.is_null() && extreme.try_cmp(¤t)? == ordering {
+ extreme = current;
+ }
+ }
+
+ Ok(extreme)
+}
+
+fn dictionary_scalar_parts(value: &ScalarValue) -> (&ScalarValue,
Option<&DataType>) {
+ match value {
+ ScalarValue::Dictionary(key_type, inner) => {
+ (inner.as_ref(), Some(key_type.as_ref()))
+ }
+ other => (other, None),
+ }
+}
+
+fn is_row_wise_batch_type(data_type: &DataType) -> bool {
Review Comment:
What does row_wise_batch_type mean? Semantically this doesn't make sense to
me, how a type like Int8 isn't row wise compare to any of these types
##########
datafusion/functions-aggregate-common/src/min_max.rs:
##########
@@ -423,6 +439,53 @@ macro_rules! min_max {
}};
}
+fn scalar_batch_extreme(values: &ArrayRef, ordering: Ordering) ->
Result<ScalarValue> {
Review Comment:
I'm curious why this was renamed and refactored from `min_max_batch_generic`
when the functionality seems equivalent? And to me this new version is less
readable than the original
##########
datafusion/functions-aggregate-common/src/min_max.rs:
##########
@@ -413,6 +413,22 @@ macro_rules! min_max {
min_max_generic!(lhs, rhs, $OP)
}
+ (lhs, rhs)
+ if matches!(lhs, ScalarValue::Dictionary(_, _))
+ || matches!(rhs, ScalarValue::Dictionary(_, _)) =>
+ {
+ let (lhs, lhs_key_type) = dictionary_scalar_parts(lhs);
+ let (rhs, rhs_key_type) = dictionary_scalar_parts(rhs);
+ let result = min_max_generic!(lhs, rhs, $OP);
+
+ match lhs_key_type.zip(rhs_key_type) {
+ Some((key_type, _)) => {
+ ScalarValue::Dictionary(Box::new(key_type.clone()),
Box::new(result))
Review Comment:
Is there significance here to `rhs` key type always being ignored?
##########
datafusion/functions-aggregate-common/src/min_max.rs:
##########
@@ -413,6 +413,22 @@ macro_rules! min_max {
min_max_generic!(lhs, rhs, $OP)
}
+ (lhs, rhs)
+ if matches!(lhs, ScalarValue::Dictionary(_, _))
+ || matches!(rhs, ScalarValue::Dictionary(_, _)) =>
+ {
Review Comment:
Something to keep in mind here is that the docstring of this macro and the
catch-all error arm below indicates the `lhs` & `rhs` are meant to be the same
type; if we're expanding this functionality to allow disparate types (even if
logically equivalent) we should ensure documentation stays up to date
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]