Copilot commented on code in PR #7246:
URL: https://github.com/apache/kyuubi/pull/7246#discussion_r2645790522
##########
externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/sql/execution/arrow/KyuubiArrowConverters.scala:
##########
@@ -274,19 +274,28 @@ object KyuubiArrowConverters extends SQLConfHelper with
Logging {
var estimatedBatchSize = 0L
Utils.tryWithSafeFinally {
+ def isBatchSizeLimitExceeded: Boolean = {
+ // If `maxEstimatedBatchSize` is zero or negative, it implies
unlimited.
+ maxEstimatedBatchSize > 0 && estimatedBatchSize >=
maxEstimatedBatchSize
+ }
+ def isRecordLimitExceeded: Boolean = {
+ // If `maxRecordsPerBatch` is zero or negative, it implies unlimited.
+ maxRecordsPerBatch > 0 && rowCountInLastBatch >= maxRecordsPerBatch
+ }
+ def isGlobalLimitNotReached: Boolean = {
+ // If the limit is negative, it means no restriction
+ // or the current number of rows has not reached the limit.
+ rowCount < limit || limit < 0
+ }
+
// Always write the first row.
- while (rowIter.hasNext && (
- // For maxBatchSize and maxRecordsPerBatch, respect whatever
smaller.
+ while (rowIter.hasNext && isGlobalLimitNotReached && (
// If the size in bytes is positive (set properly), always write
the first row.
- rowCountInLastBatch == 0 && maxEstimatedBatchSize > 0 ||
- // If the size in bytes of rows are 0 or negative, unlimit it.
- estimatedBatchSize <= 0 ||
- estimatedBatchSize < maxEstimatedBatchSize ||
- // If the size of rows are 0 or negative, unlimit it.
- maxRecordsPerBatch <= 0 ||
- rowCountInLastBatch < maxRecordsPerBatch ||
- rowCount < limit ||
- limit < 0)) {
+ rowCountInLastBatch == 0 ||
+ // If either limit is hit, create a batch. This implies that the
limit that is hit
+ // first triggers the creation of a batch even if the other
limit is not yet hit
+ // hence preferring the more restrictive limit.
Review Comment:
The comment states "If either limit is hit, create a batch" but the logic
actually continues the loop when neither limit is exceeded. The comment should
be clarified to say "Continue adding rows to the batch until either limit is
exceeded" or similar wording to accurately reflect the loop continuation
condition rather than batch creation trigger.
```suggestion
// Continue adding rows to the current batch until either
limit is exceeded.
// The limit that is reached first determines the batch
boundary, even if the
// other limit has not yet been reached, thus preferring the
more restrictive limit.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]