mikedias commented on PR #7865:
URL: https://github.com/apache/paimon/pull/7865#issuecomment-4538083982

   Thanks for the awesome review @JingsongLi, here is the follow-up:
   
   ### 1. Silent exception swallowing in PartitionBucketMapping.loadFromScan
   
   Good point, I agree that failing fast is safer. I removed the try-catch block so the exception now propagates.
   
   
   ### 2. PartitionBucketMapping staleness in long-running streaming jobs
   
   The rescale process is already expected to run only when no job is writing to the table, since it uses strict commit mode. There are also additional checks that fail the commit if the bucket counts don't match. In any case, I've added a note about this to the rescale doc.
   
   
   ### 3. PartitionEntry.merge() tie-breaking semantics
   
   Good point, no one wants non-deterministic behavior in their codebases 😄. 
Fixed that, making `merge` commutative. 
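
For illustration only, here is a minimal sketch of what a commutative merge with a deterministic tie-break can look like. The `Entry` class, its fields, and the "larger bucket count wins on a tie" rule are all assumptions made for this example; they are not the actual `PartitionEntry` code from the PR.

```java
// Hypothetical stand-in for a partition entry; field names are assumptions.
final class MergeSketch {

    static final class Entry {
        final long recordCount;
        final long lastFileCreationTime;
        final int totalBuckets;

        Entry(long recordCount, long lastFileCreationTime, int totalBuckets) {
            this.recordCount = recordCount;
            this.lastFileCreationTime = lastFileCreationTime;
            this.totalBuckets = totalBuckets;
        }
    }

    // merge(a, b) and merge(b, a) produce the same field values.
    static Entry merge(Entry a, Entry b) {
        long count = a.recordCount + b.recordCount;
        long time = Math.max(a.lastFileCreationTime, b.lastFileCreationTime);
        int buckets;
        if (a.lastFileCreationTime != b.lastFileCreationTime) {
            // The entry with the newer file decides the bucket count.
            buckets = a.lastFileCreationTime > b.lastFileCreationTime
                    ? a.totalBuckets
                    : b.totalBuckets;
        } else {
            // Tie: pick by value, not by argument order (assumed rule).
            buckets = Math.max(a.totalBuckets, b.totalBuckets);
        }
        return new Entry(count, time, buckets);
    }
}
```

The point of the sketch is that every branch depends only on the values of the two entries, never on which one happens to be the receiver, so the result does not change with the order in which entries are merged.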
   
   
   ### 4. TableWriteCoordinator scan reuse concern
   
   Sounds good, added a small note in the code.
   
   ### 5. Serialization size for large partition maps
   
   I suspect this scenario is rare, and the workaround would be to set the default bucket count to the most common one in the table. That would hit the optimization condition and reduce the memory pressure.
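
To make the workaround concrete, here is a rough sketch under one assumption: that the optimization keeps only the partitions whose bucket count differs from the table default and resolves every other partition to the default. The class and method names below are hypothetical and are not taken from the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the actual PartitionBucketMapping implementation.
final class SparseBucketMapSketch {

    // Keep only partitions whose bucket count differs from the default.
    static Map<String, Integer> compact(Map<String, Integer> all, int defaultBuckets) {
        Map<String, Integer> overrides = new HashMap<>();
        for (Map.Entry<String, Integer> e : all.entrySet()) {
            if (e.getValue() != defaultBuckets) {
                overrides.put(e.getKey(), e.getValue());
            }
        }
        return overrides;
    }

    // Partitions missing from the override map fall back to the default.
    static int resolve(Map<String, Integer> overrides, String partition, int defaultBuckets) {
        return overrides.getOrDefault(partition, defaultBuckets);
    }
}
```

Under that assumption, choosing the most common bucket count as the default minimizes the number of entries that have to be kept and serialized.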
   
   ### 6. SchemaBucketFileStoreTable missing newWrite(commitUser, writeId, rowKeyExtractor) override
   
   Fixed!
   
   ### 7. Test coverage
   
   Covered the `loadFromScan` scenario. 

