[GitHub] [hudi] bvaradar commented on issue #2393: [SUPPORT] excessive small files created

GitBox Thu, 31 Dec 2020 07:22:27 -0800


bvaradar commented on issue #2393:
URL: https://github.com/apache/hudi/issues/2393#issuecomment-752987079



   @tooptoop4 : Hudi does not have access to the size of the incoming files 
being loaded, because reading them is managed by Spark. The input dataframe is 
the interaction point for Hudi. With custom merge logic, and with the payload 
size in memory differing from the size on file, it is not possible to get an 
accurate measure. I have opened an exploratory Jira to look into this: 
https://issues.apache.org/jira/browse/HUDI-1501
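
As a rough illustration of the gap between in-memory and on-file size described 
above, here is a minimal sketch. It is not Hudi code: the records are 
hypothetical, JSON serialization stands in for the in-memory payload, and zlib 
stands in for whatever encoding and compression the file format applies.

```python
import json
import zlib

# Hypothetical records standing in for rows of an input dataframe.
records = [
    {"id": i, "status": "active", "region": "us-east-1"}
    for i in range(10_000)
]

# Proxy for the in-memory payload size: uncompressed serialized bytes.
serialized = json.dumps(records).encode("utf-8")
raw_bytes = len(serialized)

# Proxy for the on-file size: the same bytes after compression.
# zlib is only a stand-in, not the codec or layout a columnar file would use.
stored_bytes = len(zlib.compress(serialized))

ratio = raw_bytes / stored_bytes
print(f"in-memory proxy: {raw_bytes} bytes")
print(f"on-file proxy:   {stored_bytes} bytes")
print(f"ratio:           {ratio:.1f}x")
```

The ratio depends heavily on how repetitive the data is, so a size observed on 
the dataframe side does not translate directly into a size on file.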


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
