GitHub user countmdm commented on the issue:

    https://github.com/apache/spark/pull/21811

@kiszk the situation "before" is well understood. In the corresponding SPARK-24801 ticket I present a fragment from the analysis of this heap dump by jxray (www.jxray.com). It shows that ~2.5GB of memory, or 64% of the used heap size, is wasted by the ~40.5 thousand empty byte[] arrays in question:

    2,597,946K (64.1%): byte[]: 40583 / 100% of empty 2,597,946K (64.1%)
      ↖org.apache.spark.network.util.ByteArrayWritableChannel.data
      ↖org.apache.spark.network.sasl.SaslEncryption$EncryptedMessage.byteChannel
      ↖io.netty.channel.ChannelOutboundBuffer$Entry.msg
      ...

However, we don't have, and probably cannot get, real "after" evidence. That's because, as I said, I don't know how to reproduce the situation in-house. I also think it's very unlikely that the customer can easily reproduce it, let alone accept our patched code and collect the necessary data before and after the fix.

Still, I believe this fix is simple and obvious enough that we can be pretty sure that with it, in the above situation, there would simply be no problematic byte[] arrays anymore, and memory consumption would be about 64% smaller.
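To make the numbers above concrete, here is a minimal Java sketch. It does two things: it divides the reported total by the reported array count to get the average size per array (roughly 64K each), and it shows how a buffer that is allocated only on first use takes up no heap while its owner merely sits in a queue. `LazyBufferSketch`, `LazyMessage` and `averageKbPerArray` are hypothetical names for illustration, not taken from the Spark code base, and lazy allocation is assumed here only as one way such empty arrays could be avoided.

```java
public class LazyBufferSketch {

    // Hypothetical stand-in for a queued message that needs a scratch
    // buffer only once it is actually being written out.
    public static final class LazyMessage {
        private final int capacity;
        private byte[] data; // stays null until the buffer is first requested

        public LazyMessage(int capacity) {
            this.capacity = capacity;
        }

        public boolean isAllocated() {
            return data != null;
        }

        // Allocates the backing array on first call and reuses it afterwards.
        public byte[] buffer() {
            if (data == null) {
                data = new byte[capacity];
            }
            return data;
        }
    }

    // Average size per array in K, given a total in K and an array count.
    public static double averageKbPerArray(long totalKb, long arrayCount) {
        return (double) totalKb / arrayCount;
    }

    public static void main(String[] args) {
        // Figures copied from the heap dump fragment: 2,597,946K in 40583 arrays.
        double avg = averageKbPerArray(2_597_946L, 40_583L);
        System.out.printf("average per array: %.2fK%n", avg);

        // The 64K capacity is an assumption chosen to match the average above.
        LazyMessage queued = new LazyMessage(64 * 1024);
        System.out.println("allocated while queued: " + queued.isAllocated());
        queued.buffer();
        System.out.println("allocated after first use: " + queued.isAllocated());
    }
}
```

With eager allocation, every one of the ~40.5 thousand queued entries would hold its own array from construction onward; with the lazy variant sketched here, only entries that have started writing would.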