kosiew commented on code in PR #21315:
URL: https://github.com/apache/datafusion/pull/21315#discussion_r3099670789
##########
datafusion/functions-aggregate-common/src/min_max.rs:
##########
@@ -423,6 +439,53 @@ macro_rules! min_max {
}};
}
+fn scalar_batch_extreme(values: &ArrayRef, ordering: Ordering) -> Result<ScalarValue> {
+ let mut index = 0;
+ let mut extreme = loop {
+ if index == values.len() {
+ return ScalarValue::try_from(values.data_type());
+ }
+
+ let current = ScalarValue::try_from_array(values, index)?;
+ index += 1;
+
+ if !current.is_null() {
+ break current;
+ }
+ };
+
+ while index < values.len() {
+ let current = ScalarValue::try_from_array(values, index)?;
+ index += 1;
+
+ if !current.is_null() && extreme.try_cmp(&current)? == ordering {
+ extreme = current;
+ }
+ }
+
+ Ok(extreme)
+}
+
+fn dictionary_scalar_parts(value: &ScalarValue) -> (&ScalarValue, Option<&DataType>) {
+ match value {
+ ScalarValue::Dictionary(key_type, inner) => {
+ (inner.as_ref(), Some(key_type.as_ref()))
+ }
+ other => (other, None),
+ }
+}
+
+fn is_row_wise_batch_type(data_type: &DataType) -> bool {
Review Comment:
I agree the current name is confusing. The intent was not “these types are
row-wise by nature,” but “these are the types for which min/max currently falls
back to logical row-by-row scalar comparison instead of a specialized Arrow
kernel.” In other words this helper is describing the implementation strategy,
not an inherent property of the datatype.
I'll rename it to `requires_logical_row_scan` to better communicate that
intent.
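For readers following along, here is a minimal, standalone sketch of what "logical row-by-row scalar comparison" means in the hunk above. It is not the PR's code: `batch_extreme` is a hypothetical name, `Option<T>` stands in for nullable `ScalarValue` rows, and `Ord::cmp` stands in for `ScalarValue::try_cmp`. It only mirrors the control flow of `scalar_batch_extreme` as quoted.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the row scan in `scalar_batch_extreme`.
/// `None` models a null row; `T: Ord` replaces `ScalarValue::try_cmp`.
fn batch_extreme<T: Ord + Clone>(values: &[Option<T>], ordering: Ordering) -> Option<T> {
    // Skip null rows entirely, as the quoted code does with `is_null()`.
    let mut rows = values.iter().flatten();
    // The first non-null row seeds the running extreme;
    // an empty or all-null batch yields `None` (a null result).
    let mut extreme = rows.next()?;
    for current in rows {
        // Replace when `extreme.cmp(current)` equals `ordering`:
        // `Greater` keeps the smaller value (min), `Less` keeps the larger (max).
        if extreme.cmp(current) == ordering {
            extreme = current;
        }
    }
    Some(extreme.clone())
}
```

Under these assumptions, passing `Ordering::Greater` yields the minimum and `Ordering::Less` the maximum, which is why the helper takes an `Ordering` rather than a min/max flag.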
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]