seddonm1 commented on a change in pull request #9233:
URL: https://github.com/apache/arrow/pull/9233#discussion_r575535374
##########
File path: rust/datafusion/src/physical_plan/hash_aggregate.rs
##########
@@ -398,97 +405,165 @@ fn group_aggregate_batch(
Ok(accumulators)
}
-/// Create a key `Vec<u8>` that is used as key for the hashmap
-pub(crate) fn create_key(
- group_by_keys: &[ArrayRef],
+/// Appends a sequence of [u8] bytes for the value in `col[row]` to
+/// `vec` to be used as a key into the hash map for a dictionary type
+///
+/// Note that ideally, for dictionary encoded columns, we would be
+/// able to simply use the dictionary idicies themselves (no need to
+/// look up values) or possibly simply build the hash table entirely
+/// on the dictionary indexes.
+///
+/// This aproach would likely work (very) well for the common case,
Review comment:
`aproach` -> `approach`