Github user countmdm commented on the issue:

    https://github.com/apache/spark/pull/21811
  
    @kiszk the "before" situation is well understood. In the corresponding 
SPARK-24801 ticket I present a fragment of the analysis of this heap dump by 
jxray (www.jxray.com). It shows that ~2.5GB of memory, or 64% of the used heap 
size, is wasted by the ~40.5 thousand empty byte[] arrays in question:
    
    2,597,946K (64.1%): byte[]: 40583 / 100% of empty 2,597,946K (64.1%)
    ↖org.apache.spark.network.util.ByteArrayWritableChannel.data
    ↖org.apache.spark.network.sasl.SaslEncryption$EncryptedMessage.byteChannel
    ↖io.netty.channel.ChannelOutboundBuffer$Entry.msg
    ...
    
    However, we don't have, and probably cannot get, real "after" evidence. 
That's because, as I said, I don't know how to reproduce the situation 
in-house. I also think it's very unlikely that the customer can easily 
reproduce it, let alone accept our patched code and collect the necessary data 
before and after the fix. Still, I believe this fix is simple and obvious 
enough that we can be pretty sure that with it, in the above situation, there 
would simply be no problematic byte[] arrays anymore, and heap usage would be 
about 64% smaller.
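
    To illustrate why no empty arrays would remain, here is a minimal sketch 
of deferred buffer allocation. It assumes the fix works by allocating the 
buffer on first use rather than at construction time; that is an assumption 
for illustration, not the actual patch. The names LazyBufferMessage, 
LazyBufferDemo and BUFFER_SIZE are hypothetical and do not appear in Spark, 
and the 64 KiB size is likewise assumed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a queued message that owns a scratch buffer.
final class LazyBufferMessage {
    // Assumed buffer size, for illustration only.
    static final int BUFFER_SIZE = 64 * 1024;

    // Stays null until the message is actually written out.
    private byte[] data;

    boolean isAllocated() {
        return data != null;
    }

    // Allocates the buffer on first use and reuses it afterwards.
    byte[] buffer() {
        if (data == null) {
            data = new byte[BUFFER_SIZE];
        }
        return data;
    }
}

final class LazyBufferDemo {
    // Counts how many queued messages currently hold a buffer.
    static int allocatedCount(List<LazyBufferMessage> queue) {
        int count = 0;
        for (LazyBufferMessage message : queue) {
            if (message.isAllocated()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Many messages wait in the queue, but only one is being written.
        List<LazyBufferMessage> queue = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            queue.add(new LazyBufferMessage());
        }
        queue.get(0).buffer();
        System.out.println(allocatedCount(queue)); // prints 1
    }
}
```

    With eager allocation, every queued message in this sketch would hold its 
own zero-filled array while waiting; with deferred allocation, only messages 
that are actually being written do.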


---

