vinothchandar commented on a change in pull request #3213:
URL: https://github.com/apache/hudi/pull/3213#discussion_r663061755



##########
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java
##########
@@ -185,21 +185,26 @@ private void assignInserts(WorkloadProfile profile, 
HoodieEngineContext context)
 
         // first try packing this into one of the smallFiles
         for (SmallFile smallFile : smallFiles) {
-          long recordsToAppend = Math.min((config.getParquetMaxFileSize() - 
smallFile.sizeBytes) / averageRecordSize,
-              totalUnassignedInserts);
-          if (recordsToAppend > 0 && totalUnassignedInserts > 0) {
-            // create a new bucket or re-use an existing bucket
-            int bucket;
-            if 
(updateLocationToBucket.containsKey(smallFile.location.getFileId())) {
-              bucket = 
updateLocationToBucket.get(smallFile.location.getFileId());
-              LOG.info("Assigning " + recordsToAppend + " inserts to existing 
update bucket " + bucket);
-            } else {
-              bucket = addUpdateBucket(partitionPath, 
smallFile.location.getFileId());
-              LOG.info("Assigning " + recordsToAppend + " inserts to new 
update bucket " + bucket);
+          if (totalUnassignedInserts > 0) {
+            long recordsToAppend = Math.min((config.getParquetMaxFileSize() - 
smallFile.sizeBytes) / averageRecordSize,
+                totalUnassignedInserts);
+            if (recordsToAppend > 0) {
+              // create a new bucket or re-use an existing bucket
+              int bucket;
+              if 
(updateLocationToBucket.containsKey(smallFile.location.getFileId())) {
+                bucket = 
updateLocationToBucket.get(smallFile.location.getFileId());
+                LOG.info("Assigning " + recordsToAppend + " inserts to 
existing update bucket " + bucket);
+              } else {
+                bucket = addUpdateBucket(partitionPath, 
smallFile.location.getFileId());
+                LOG.info("Assigning " + recordsToAppend + " inserts to new 
update bucket " + bucket);
+              }
+              bucketNumbers.add(bucket);
+              recordsPerBucket.add(recordsToAppend);
+              totalUnassignedInserts -= recordsToAppend;
             }
-            bucketNumbers.add(bucket);
-            recordsPerBucket.add(recordsToAppend);
-            totalUnassignedInserts -= recordsToAppend;
+          } else {

Review comment:
       if the main thing you want to fix is to break out early. Can we do it 
after L203, with a if check, without nesting this 1 level under again in if 
block?
   
   ```Java 
   totalUnassignedInserts -= recordsToAppend;
   if (totalUnassignedInserts <= 0) {
     break;
   }
   ```

##########
File path: 
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/commit/UpsertPartitioner.java
##########
@@ -185,21 +185,26 @@ private void assignInserts(WorkloadProfile profile, 
HoodieEngineContext context)
 
         // first try packing this into one of the smallFiles
         for (SmallFile smallFile : smallFiles) {
-          long recordsToAppend = Math.min((config.getParquetMaxFileSize() - 
smallFile.sizeBytes) / averageRecordSize,

Review comment:
       IIUC `recordsToAppend` will become negative in the current code when 
`totalUnassignedInserts` is negative?  and we won't get into any assignment, 
even if we loop thru all small files?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to