bvaradar commented on issue #2393:
URL: https://github.com/apache/hudi/issues/2393#issuecomment-752987079

@tooptoop4: Hudi does not have access to the size of the incoming files being loaded, because the read is managed by Spark. The input DataFrame is Hudi's only interaction point. With custom merge logic, and with a payload's size in memory differing from its size on file, it is not possible to get an accurate measure. I have opened an exploratory JIRA to look into this: https://issues.apache.org/jira/browse/HUDI-1501

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [email protected]
