mridulm commented on PR #38064: URL: https://github.com/apache/spark/pull/38064#issuecomment-1278026577
@liuzqt Most task results are very small. By moving to `ChunkedByteBufferOutputStream`, we will now over-provision the buffer for those results by a few orders of magnitude, while only a vanishingly small set of cases actually hits the large-buffer path. This can affect memory utilization at the executor, so if possible, please look at ways to mitigate it, particularly when we have a good estimate of the output size. This is not to say I have serious concerns (we do use `ChunkedByteBufferOutputStream` with precisely that size everywhere else!), but it is not without tradeoff.
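
To make the tradeoff concrete, here is a rough sketch of the arithmetic, not Spark code. It assumes a simplified model in which a chunked stream allocates whole fixed-size chunks as needed. The helper names `pick_chunk_size` and `overprovision_factor`, the 1 MiB default chunk size, and the 1 KiB result size are all illustrative assumptions rather than values taken from the PR.

```python
# Simplified model of a chunked output buffer: memory is allocated in
# whole chunks, so a tiny result still occupies one full chunk.
# All names and sizes below are hypothetical, chosen for illustration.

DEFAULT_CHUNK_SIZE = 1024 * 1024  # assumed 1 MiB chunk size
MIN_CHUNK_SIZE = 64               # assumed lower bound to avoid tiny chunks


def overprovision_factor(actual_size, chunk_size):
    """Return allocated bytes divided by bytes actually written."""
    if actual_size <= 0 or chunk_size <= 0:
        raise ValueError("sizes must be positive")
    chunks = -(-actual_size // chunk_size)  # ceiling division
    return (chunks * chunk_size) / actual_size


def pick_chunk_size(estimated_size, default=DEFAULT_CHUNK_SIZE):
    """Cap the chunk size at the estimated output size, when one is known."""
    if estimated_size is None or estimated_size <= 0:
        return default  # no usable estimate: fall back to the default
    return max(MIN_CHUNK_SIZE, min(estimated_size, default))


small_result = 1024  # an assumed 1 KiB task result

# With the fixed default chunk, the small result wastes most of the chunk.
fixed = overprovision_factor(small_result, DEFAULT_CHUNK_SIZE)

# With a size estimate, the chunk can be shrunk to fit.
sized = overprovision_factor(small_result, pick_chunk_size(small_result))
```

Under these assumptions, the fixed chunk allocates 1024 times what the result needs, while the estimate-driven chunk allocates exactly what is written. Large results are unaffected, because the chunk size is still capped at the default.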
