bvaradar commented on issue #1902:
URL: https://github.com/apache/hudi/issues/1902#issuecomment-672406725


   With bulk insert, the parallelism configuration determines the lower bound 
on the number of files. Since you started with bulk insert, you are seeing 
that many files. Hudi upsert/insert will route "new records" (those with 
new record keys) to these small files. So if new records arrive in the same 
partition, you will see those small files grow.
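
For illustration, here is a minimal sketch of the write options involved, assuming a PySpark writer. The table name, column names, and helper functions are hypothetical; the `hoodie.*` keys are the standard Hudi write configs, but please check the defaults for your Hudi version.

```python
# Hypothetical table and column names, for illustration only.
BASE_OPTIONS = {
    "hoodie.table.name": "example_table",
    "hoodie.datasource.write.recordkey.field": "id",
    "hoodie.datasource.write.partitionpath.field": "partition",
    "hoodie.datasource.write.precombine.field": "ts",
}


def bulk_insert_options(parallelism):
    """Options for the initial load; parallelism sets the floor on file count."""
    opts = dict(BASE_OPTIONS)
    opts["hoodie.datasource.write.operation"] = "bulk_insert"
    opts["hoodie.bulkinsert.shuffle.parallelism"] = str(parallelism)
    return opts


def upsert_options(small_file_limit_bytes, max_file_size_bytes):
    """Options for later upserts; files under the limit can receive new records."""
    opts = dict(BASE_OPTIONS)
    opts["hoodie.datasource.write.operation"] = "upsert"
    opts["hoodie.parquet.small.file.limit"] = str(small_file_limit_bytes)
    opts["hoodie.parquet.max.file.size"] = str(max_file_size_bytes)
    return opts


initial_load = bulk_insert_options(parallelism=10)
later_writes = upsert_options(
    small_file_limit_bytes=100 * 1024 * 1024,  # treat files under ~100 MB as small
    max_file_size_bytes=120 * 1024 * 1024,     # target size for base files
)

# With a Spark DataFrame `df` and a table path `base_path` (both assumed):
#   df.write.format("hudi").options(**initial_load).mode("overwrite").save(base_path)
#   df.write.format("hudi").options(**later_writes).mode("append").save(base_path)
```

Lowering the bulk insert parallelism reduces the number of files written by the initial load, while the small file limit controls which existing files later upserts/inserts will try to fill.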
   



