nsivabalan edited a comment on pull request #2168:
URL: https://github.com/apache/hudi/pull/2168#issuecomment-742791285
> @nsivabalan The use-case you described seems to be intentional but the
> behavior is not correct. If the number of records to update is explicitly asked
> by the dag, then `Option<Long> numRecordsToUpdate` should be set. In this case,
> the code path assumes it's not set. Maybe there's a bug in setting that
> variable?
> In general, left 2 comments, after they are addressed, can merge this.
I don't think so. This code path assumes `numFiles` is not set, not
`numRecordsToUpdate`. See my comment in the code snippet below.
```
if (!numFiles.isPresent() || numFiles.get() == 0) {
  numFilesToUpdate = (int) Math.ceil((double) numRecordsToUpdate.get() / recordsInSingleFile);
  int totalExistingFilesCount =
      partitionToFileIdCountMap.values().stream().reduce((a, b) -> a + b).get();
  numFilesToUpdate = Math.min(numFilesToUpdate, totalExistingFilesCount);
  // This line ignores the numRecordsToUpdate passed in and sets the per-file
  // count to the total number of records in a single file slice.
  numRecordsToUpdatePerFile = recordsInSingleFile;
}
```
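
To make the point concrete, here is a minimal standalone sketch of the arithmetic in that branch. It is not the Hudi code: the class name `UpdatePlanSketch` and the helper `perFileHonoringRequest` are hypothetical, and the second helper is only one possible way to respect the requested count, not necessarily what this PR should do.

```java
import java.util.HashMap;
import java.util.Map;

public class UpdatePlanSketch {

  // Mirrors the quoted branch: number of files to touch when numFiles is unset.
  static int filesToUpdate(long numRecordsToUpdate, long recordsInSingleFile,
                           Map<String, Integer> partitionToFileIdCountMap) {
    int numFilesToUpdate = (int) Math.ceil((double) numRecordsToUpdate / recordsInSingleFile);
    int totalExistingFilesCount =
        partitionToFileIdCountMap.values().stream().mapToInt(Integer::intValue).sum();
    return Math.min(numFilesToUpdate, totalExistingFilesCount);
  }

  // Behavior of the quoted line: every chosen file gets a full file slice of updates.
  static long perFileAsQuoted(long recordsInSingleFile) {
    return recordsInSingleFile;
  }

  // Hypothetical alternative (an assumption, not the PR's fix): spread the
  // requested count across the chosen files, capped at the file slice size.
  static long perFileHonoringRequest(long numRecordsToUpdate, int numFilesToUpdate,
                                     long recordsInSingleFile) {
    if (numFilesToUpdate == 0) {
      return 0;
    }
    long share = (numRecordsToUpdate + numFilesToUpdate - 1) / numFilesToUpdate; // ceiling division
    return Math.min(share, recordsInSingleFile);
  }

  public static void main(String[] args) {
    Map<String, Integer> fileCounts = new HashMap<>();
    fileCounts.put("p1", 2);
    fileCounts.put("p2", 3);

    long requested = 150;
    long recordsInSingleFile = 100;
    int files = filesToUpdate(requested, recordsInSingleFile, fileCounts);

    System.out.println("files to update: " + files);
    System.out.println("total updates as quoted: "
        + files * perFileAsQuoted(recordsInSingleFile));
    System.out.println("total updates honoring request: "
        + files * perFileHonoringRequest(requested, files, recordsInSingleFile));
  }
}
```

With 150 requested updates and 100 records per file slice, the quoted logic picks 2 files and updates 100 records in each, so 200 records are updated instead of 150. The hypothetical variant updates 75 per file and lands on the requested 150.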