nsivabalan edited a comment on pull request #2168:
URL: https://github.com/apache/hudi/pull/2168#issuecomment-742791285


   > @nsivabalan The use-case you described seems to be intentional, but the behavior is not correct. If the number of records to update is explicitly requested by the dag, then `Option<Long> numRecordsToUpdate` should be set. In this case, the code path assumes it's not set. Maybe there's a bug in setting that variable?
   > In general, I left 2 comments; once they are addressed, this can be merged.
   
   I don't think so. This code path assumes `numFiles` is not set, not `numRecordsToUpdate`. See my comment in the code snippet below.
   
   ```
   if (!numFiles.isPresent() || numFiles.get() == 0) {
     numFilesToUpdate = (int) Math.ceil((double) numRecordsToUpdate.get() / recordsInSingleFile);
     int totalExistingFilesCount = partitionToFileIdCountMap.values().stream().reduce((a, b) -> a + b).get();
     numFilesToUpdate = Math.min(numFilesToUpdate, totalExistingFilesCount);
     // This line ignores the numRecordsToUpdate passed in and sets the per-file count
     // to the total number of records in a single file slice.
     numRecordsToUpdatePerFile = recordsInSingleFile;
   }
   ```
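
To make the arithmetic in the snippet above concrete, here is a minimal, self-contained sketch. The class `UpdatePlanSketch` and its method names are hypothetical and are not part of Hudi. `recordsPerFileHonoringRequest` is only one possible way to respect `numRecordsToUpdate`, and it is not necessarily what this PR does.

```java
import java.util.Map;

// Hypothetical helper for illustration only; none of these names exist in Hudi.
final class UpdatePlanSketch {

    // Mirrors the quoted branch: ceil(records / recordsPerFile), capped at the
    // number of existing files. Uses sum() instead of reduce(...).get() so an
    // empty map yields 0 rather than throwing.
    static int filesToUpdate(long numRecordsToUpdate, long recordsInSingleFile,
                             Map<String, Integer> partitionToFileIdCount) {
        int needed = (int) Math.ceil((double) numRecordsToUpdate / recordsInSingleFile);
        int existing = partitionToFileIdCount.values().stream()
                .mapToInt(Integer::intValue).sum();
        return Math.min(needed, existing);
    }

    // Behaviour of the quoted line: every selected file is updated in full.
    static long recordsPerFileAsQuoted(long recordsInSingleFile) {
        return recordsInSingleFile;
    }

    // Assumed alternative: spread the requested count over the selected files,
    // never exceeding what a single file holds.
    static long recordsPerFileHonoringRequest(long numRecordsToUpdate, int filesToUpdate,
                                              long recordsInSingleFile) {
        if (filesToUpdate == 0) {
            return 0;
        }
        long perFile = (long) Math.ceil((double) numRecordsToUpdate / filesToUpdate);
        return Math.min(perFile, recordsInSingleFile);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = Map.of("p1", 2, "p2", 3);
        int files = filesToUpdate(150, 100, counts);
        System.out.println("files to update: " + files);
        System.out.println("as quoted, total updated: " + files * recordsPerFileAsQuoted(100));
        System.out.println("honoring request, total updated: "
                + files * recordsPerFileHonoringRequest(150, files, 100));
    }
}
```

With 150 requested records, 100 records per file and 5 existing files, both variants pick 2 files. The quoted logic then updates 100 records in each file (200 in total), while the assumed alternative updates 75 per file (150 in total).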


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

