westonpace commented on code in PR #40817:
URL: https://github.com/apache/arrow/pull/40817#discussion_r1547783579


##########
cpp/src/arrow/acero/partition_util.h:
##########
@@ -62,7 +62,7 @@ class PartitionSort {
   template <class INPUT_PRTN_ID_FN, class OUTPUT_POS_FN>
   static void Eval(int64_t num_rows, int num_prtns, uint16_t* prtn_ranges,
                    INPUT_PRTN_ID_FN prtn_id_impl, OUTPUT_POS_FN output_pos_impl) {
-    ARROW_DCHECK(num_rows > 0 && num_rows <= (1 << 15));
+    ARROW_DCHECK(num_rows > 0 && num_rows <= ((1 << 16) - 1));
     ARROW_DCHECK(num_prtns >= 1 && num_prtns <= (1 << 15));

Review Comment:
   I believe the goal here is to use up to `dop_` partitions, but only when 
there are enough rows to justify that many.  We only want to add partitions 
while each partition would still have at least `min_num_rows_per_prtn` rows, 
so if there are not very many rows we use fewer partitions.
   
   Also, we require `num_prtns_` to be a power of 2 because the partition id 
is calculated by masking, and `log_num_prtns_` tells us how many bits to use 
for the mask.
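
   To make the intent concrete, here is a rough sketch of how I read that 
logic. It is not the actual Acero code: `PartitionPlan`, `ChoosePartitioning`, 
and `PartitionId` are hypothetical names, and rounding `dop` down to a power 
of 2 is my assumption about the intended behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not Acero's actual code: pick a power-of-two partition
// count that never exceeds `dop` and leaves at least `min_rows_per_prtn` rows
// per partition on average (a single partition is always allowed).
struct PartitionPlan {
  int log_num_prtns;  // number of hash bits used for the partition id
  int num_prtns;      // always 1 << log_num_prtns
};

inline PartitionPlan ChoosePartitioning(int64_t num_rows, int dop,
                                        int64_t min_rows_per_prtn) {
  assert(num_rows >= 0 && dop >= 1 && min_rows_per_prtn >= 1);
  int log_num_prtns = 0;
  // Double the partition count while the doubled count still fits within dop
  // and every partition would still average at least min_rows_per_prtn rows.
  while ((int64_t{2} << log_num_prtns) <= dop &&
         num_rows / (int64_t{2} << log_num_prtns) >= min_rows_per_prtn) {
    ++log_num_prtns;
  }
  return {log_num_prtns, 1 << log_num_prtns};
}

// With a power-of-two count, the partition id is the low log_num_prtns bits
// of the hash, so a single mask replaces a modulo.
inline int PartitionId(uint64_t hash, int log_num_prtns) {
  return static_cast<int>(hash & ((uint64_t{1} << log_num_prtns) - 1));
}
```

   With `dop = 8` and a minimum of 1024 rows per partition, this sketch picks 
8 partitions for 1,000,000 rows, 2 partitions for 3,000 rows, and a single 
partition for 100 rows.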



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.