mikedias commented on PR #7865: URL: https://github.com/apache/paimon/pull/7865#issuecomment-4538083982
Thanks for the awesome review @JingsongLi, here is the follow-up:

### 1. Silent exception swallowing in PartitionBucketMapping.loadFromScan

Good point, I agree that failing fast is safer. I removed the try-catch block and let the exception propagate.

### 2. PartitionBucketMapping staleness in long-running streaming jobs

The rescale process is already expected to run only when no job is writing to the table, since it uses strict commit mode. There are also additional checks that fail the commit if the bucket counts don't match. In any case, I've added a note about that to the rescale doc.

### 3. PartitionEntry.merge() tie-breaking semantics

Good point, no one wants non-deterministic behavior in their codebase 😄. Fixed that by making `merge` commutative.

### 4. TableWriteCoordinator scan reuse concern

Sounds good, added a small note in the code.

### 5. Serialization size for large partition maps

I suspect this scenario is rare, and the workaround would be to set the default bucket count to the most common bucket count in the table. That would hit the optimization condition and reduce the memory pressure.

### 6. SchemaBucketFileStoreTable missing newWrite(commitUser, writeId, rowKeyExtractor) override

Fixed!

### 7. Test coverage

Covered the `loadFromScan` scenario.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
