xloya commented on pull request #3977:
URL: https://github.com/apache/iceberg/pull/3977#issuecomment-1032135907


   > My concern with this approach is that it can potentially create more small 
files (closing and reopening a file for the same partition after cache eviction). @xloya 
can you share some of the results you got with this approach? I am sure 
it helps with memory usage, but does it create more small files?
   > 
   > I shared a design doc on shuffling support in the Flink sink with the 
community a few months ago. That was a different approach. 
https://docs.google.com/document/d/13N8cMqPi-ZPSKbkXGOBMPOzbv2Fua59j8bIjjtxLWqo/edit#heading=h.o4q8a61sahkq
   
   Yes, this will increase the number of small files, but with a reasonable 
configuration I think memory usage and file count can be kept in relative balance.  
   The approach you mentioned is feasible, I think. We will apply a `keyBy` 
operation on the Flink write path to solve this problem.
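
   To make the trade-off concrete, here is a rough sketch, not the code in this PR. It assumes the writer cache behaves like an LRU map that closes a file whenever a partition's writer is evicted, and it uses a plain hash modulo as a stand-in for Flink's `keyBy` routing. The class and method names (`WriterCacheSketch`, `countFiles`, `countFilesKeyed`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WriterCacheSketch {

    // Counts the files one writer task would produce for a stream of partition keys
    // when at most maxOpenWriters partition writers stay open (assumed LRU eviction).
    static int countFiles(String[] partitionKeys, int maxOpenWriters) {
        int[] closedFiles = {0};
        // An access-ordered LinkedHashMap serves as the LRU cache of open writers.
        Map<String, Integer> openWriters =
                new LinkedHashMap<String, Integer>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                        if (size() > maxOpenWriters) {
                            closedFiles[0]++; // evicting a writer closes its file
                            return true;
                        }
                        return false;
                    }
                };
        for (String key : partitionKeys) {
            Integer rows = openWriters.get(key); // a hit also refreshes the LRU order
            openWriters.put(key, rows == null ? 1 : rows + 1);
        }
        // Writers still open at the end each contribute one more file.
        return closedFiles[0] + openWriters.size();
    }

    // Same count, but records are first routed to a subtask by partition key.
    // The hash modulo is a simplification, not Flink's actual key-group assignment.
    static int countFilesKeyed(String[] partitionKeys, int parallelism, int maxOpenWriters) {
        List<List<String>> perSubtask = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            perSubtask.add(new ArrayList<>());
        }
        for (String key : partitionKeys) {
            perSubtask.get(Math.floorMod(key.hashCode(), parallelism)).add(key);
        }
        int total = 0;
        for (List<String> keys : perSubtask) {
            total += countFiles(keys.toArray(new String[0]), maxOpenWriters);
        }
        return total;
    }

    public static void main(String[] args) {
        String[] keys = {"A", "B", "C", "A", "B", "C"};
        System.out.println("cache only: " + countFiles(keys, 2));
        System.out.println("keyed:      " + countFilesKeyed(keys, 3, 2));
    }
}
```

   With three interleaved partitions and a cache of two writers, every record evicts a writer that is needed again shortly after, so the unkeyed run reports six files for six records. Once records are grouped by partition key, each simulated subtask sees a single partition and the total drops to three.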


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


