SteNicholas opened a new pull request #2111: URL: https://github.com/apache/hudi/pull/2111
## *Tips*

- *Thank you very much for contributing to Apache Hudi.*
- *Please review https://hudi.apache.org/contributing.html before opening a pull request.*

## What is the purpose of the pull request

*The behavior of new inserts currently depends on the small file limit. If the limit is set to 0, new inserts are written to a new file and do not update old records. If the limit is set to another value, such as 128M, new inserts can update old records that lie in a small file picked up by the `UpsertPartitioner`. `UpsertPartitioner` should insert new records regardless of small files when the insert operation is used.*

## Brief change log

- *`assignInserts` of `UpsertPartitioner` now checks whether the write operation type changes records, so that the insert operation inserts new records regardless of small files.*

## Verify this pull request

- *Modify the creation of `WorkloadProfile` in `testUpsertPartitioner` and `testUpsertPartitionerWithSmallInsertHandling` of `TestUpsertPartitioner` so that the tests run with the UPSERT write operation.*

## Committer checklist

- [x] Has a corresponding JIRA in PR title & commit
- [x] Commit message is descriptive of the change
- [x] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
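
## Illustrative sketch

The change described above can be sketched roughly as follows. This is a simplified, self-contained illustration and not the actual Hudi code: `InsertAssignmentSketch`, `OperationType`, `SmallFile`, the `isChangingRecords` helper, and the string bucket labels are all hypothetical names introduced here, and the size-based capacity estimate is an assumption. The sketch only shows the idea that inserts are packed into existing small files when the operation changes records, and otherwise always go to a new file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the insert assignment described in this PR.
class InsertAssignmentSketch {

    // Hypothetical stand-in for the write operation type.
    enum OperationType { INSERT, UPSERT, DELETE }

    // Assumption: only UPSERT and DELETE are treated as "changing records".
    static boolean isChangingRecords(OperationType op) {
        return op == OperationType.UPSERT || op == OperationType.DELETE;
    }

    // Hypothetical description of an existing file below the small file limit.
    static final class SmallFile {
        final String fileId;
        final long sizeBytes;

        SmallFile(String fileId, long sizeBytes) {
            this.fileId = fileId;
            this.sizeBytes = sizeBytes;
        }
    }

    // Returns bucket labels of the form "<kind>:<file>:<recordCount>".
    static List<String> assignInserts(OperationType op, long totalInserts,
                                      List<SmallFile> smallFiles,
                                      long maxFileSizeBytes, long avgRecordSizeBytes) {
        List<String> buckets = new ArrayList<>();
        long unassigned = totalInserts;

        // Only pack inserts into small files when the operation changes records.
        if (isChangingRecords(op)) {
            for (SmallFile file : smallFiles) {
                if (unassigned <= 0) {
                    break;
                }
                // Estimate how many more records fit into this small file.
                long capacity = (maxFileSizeBytes - file.sizeBytes) / avgRecordSizeBytes;
                long toFile = Math.min(capacity, unassigned);
                if (toFile > 0) {
                    buckets.add("UPDATE:" + file.fileId + ":" + toFile);
                    unassigned -= toFile;
                }
            }
        }

        // Whatever is left (everything, for a plain INSERT) goes to a new file.
        if (unassigned > 0) {
            buckets.add("INSERT:new-file:" + unassigned);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<SmallFile> smallFiles = Arrays.asList(new SmallFile("f1", 400L));
        // prints [UPDATE:f1:60, INSERT:new-file:40]
        System.out.println(assignInserts(OperationType.UPSERT, 100L, smallFiles, 1000L, 10L));
        // prints [INSERT:new-file:100]
        System.out.println(assignInserts(OperationType.INSERT, 100L, smallFiles, 1000L, 10L));
    }
}
```

With the same small file present, the UPSERT call routes 60 records into `f1` and the remaining 40 into a new file, while the INSERT call ignores `f1` and writes all 100 records to a new file, which is the behavior this pull request aims for.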
