stevenzwu edited a comment on pull request #3181:
URL: https://github.com/apache/iceberg/pull/3181#issuecomment-930764904


   @jackye1995 Please take another look.
   
   Regarding your comment that the new configs are more specific to the 
engine-side use case: I don't think this is engine-specific. Sure, it probably 
matters a little more for streaming ingestion (Flink or Spark streaming), but 
it can matter for batch writes too. 
   
   E.g., we may want a smaller row group size (like 16 MB) so that files can 
be divided into more splits for higher parallelism. If the average row size is 
big (like MBs), then we need to tune down these configs to get more accurate 
control over the target row group size. That tighter control over row group 
size (and memory consumption) is useful irrespective of streaming or batch 
write.
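
   A rough sketch of the arithmetic, assuming the configs in question bound 
how many records are written between row group size checks (that is my reading 
of the discussion, not something stated above). The function name and the 
fixed-interval check are hypothetical simplifications; this is not the actual 
Parquet or Iceberg writer logic.

```python
MB = 1024 * 1024


def flushed_row_group_bytes(target_bytes, row_bytes, check_every_n_records):
    """Simulate a writer that only compares the buffered size against the
    target every `check_every_n_records` records (hypothetical model with a
    fixed row size and a fixed check interval)."""
    buffered = 0
    records = 0
    while True:
        buffered += row_bytes
        records += 1
        # The size check only happens on interval boundaries, so the buffer
        # can overshoot the target by up to one interval's worth of rows.
        if records % check_every_n_records == 0 and buffered >= target_bytes:
            return buffered


if __name__ == "__main__":
    target = 16 * MB   # desired row group size
    row = 2 * MB       # large average row size
    for interval in (100, 10, 1):
        size = flushed_row_group_bytes(target, row, interval)
        print(f"check every {interval:>3} records: row group = {size // MB} MB")
```

   With these assumed numbers (16 MB target, 2 MB rows), the simulated row 
group comes out at 200 MB, 20 MB, and 16 MB for check intervals of 100, 10, 
and 1 records. The overshoot scales with interval times row size, which is why 
large rows call for a smaller interval.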


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


