[ https://issues.apache.org/jira/browse/SPARK-24801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545820#comment-16545820 ]

Misha Dmitriev commented on SPARK-24801:
----------------------------------------

Correct, there are indeed 40,583 instances of {{EncryptedMessage}} in memory. 
From another section of the jxray report, which shows reference chains starting 
from GC roots along with the number of objects at each level, I see the 
following:
{code:java}
2,929,966K (72.3%) Object tree for GC root(s) Java Static 
org.apache.spark.network.yarn.YarnShuffleService.instance

org.apache.spark.network.yarn.YarnShuffleService.blockHandler ↘ 2,753,031K 
(67.9%), 1 reference(s) 
org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.streamManager ↘ 
2,753,019K (67.9%), 1 reference(s) 
org.apache.spark.network.server.OneForOneStreamManager.streams ↘ 2,753,019K 
(67.9%), 1 reference(s) 
{java.util.concurrent.ConcurrentHashMap}.values ↘ 2,753,008K (67.9%), 169 
reference(s) 
org.apache.spark.network.server.OneForOneStreamManager$StreamState.associatedChannel
 ↘ 2,640,203K (65.1%), 32 reference(s) 
io.netty.channel.socket.nio.NioSocketChannel.unsafe ↘ 2,640,039K (65.1%), 32 
reference(s) 
io.netty.channel.socket.nio.NioSocketChannel$NioSocketChannelUnsafe.outboundBuffer
 ↘ 2,640,037K (65.1%), 30 reference(s) 
io.netty.channel.ChannelOutboundBuffer.flushedEntry ↘ 2,639,382K (65.1%), 15 
reference(s) 
io.netty.channel.ChannelOutboundBuffer$Entry.{next} ↘ 2,637,973K (65.1%), 
40,583 reference(s) 
io.netty.channel.ChannelOutboundBuffer$Entry.msg ↘ 2,622,966K (64.7%), 40,583 
reference(s) 
org.apache.spark.network.sasl.SaslEncryption$EncryptedMessage.byteChannel ↘ 
2,598,897K (64.1%), 40,583 reference(s) 
org.apache.spark.network.util.ByteArrayWritableChannel.data ↘ 2,597,946K 
(64.1%), 40,583 reference(s) 
org.apache.spark.network.util.ByteArrayWritableChannel self 951K (< 0.1%), 
40,583 object(s){code}
So basically we have 15 netty {{ChannelOutboundBuffer}} objects, and 
collectively, via the linked lists starting from their {{flushedEntry}} data 
fields, they end up referencing 40,583 {{ChannelOutboundBuffer$Entry}} objects, 
which ultimately reference all these {{EncryptedMessage}} objects.

So it looks like netty has, for some reason, accumulated (not sent) a very 
large number of messages, and thus netty is likely the main culprit. But then I 
wonder why all these messages are empty.
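As a sanity check on the numbers above, here is a back-of-the-envelope calculation. The 64 KiB per-buffer size is my assumption, inferred from 2,597,946K / 40,583 ≈ 64K per buffer in the report (it would correspond to the {{maxOutboundBlockSize}} passed to the {{ByteArrayWritableChannel}} constructor):

```java
// Back-of-the-envelope check: 40,583 eagerly allocated 64 KiB buffers
// account for essentially all of the reported 2,597,946K.
public class WastedBufferMath {

    // Total bytes held by `messages` buffers of `perBufferBytes` each.
    static long wastedBytes(long messages, long perBufferBytes) {
        return messages * perBufferBytes;
    }

    public static void main(String[] args) {
        long total = wastedBytes(40_583L, 64L * 1024);  // assumed 64 KiB block size
        // Prints 2597312K -- within object-header overhead of the 2,597,946K
        // reported by jxray for ByteArrayWritableChannel.data.
        System.out.println((total / 1024) + "K");
    }
}
```

So the empty byte[] arrays alone fully explain the ~2.6 GB of waste; no other large data structure is needed to account for it.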

 

> Empty byte[] arrays in spark.network.sasl.SaslEncryption$EncryptedMessage can 
> waste a lot of memory
> ---------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24801
>                 URL: https://issues.apache.org/jira/browse/SPARK-24801
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 2.3.0
>            Reporter: Misha Dmitriev
>            Priority: Major
>
> I recently analyzed another Yarn NM heap dump with jxray 
> ([www.jxray.com|http://www.jxray.com]), and found that 81% of memory is 
> wasted by empty (all zeroes) byte[] arrays. Most of these arrays are 
> referenced by 
> {{org.apache.spark.network.util.ByteArrayWritableChannel.data}}, and these in 
> turn come from 
> {{spark.network.sasl.SaslEncryption$EncryptedMessage.byteChannel}}. Here is 
> the full reference chain that leads to the problematic arrays:
> {code:java}
> 2,597,946K (64.1%): byte[]: 40583 / 100% of empty 2,597,946K (64.1%)
> ↖org.apache.spark.network.util.ByteArrayWritableChannel.data
> ↖org.apache.spark.network.sasl.SaslEncryption$EncryptedMessage.byteChannel
> ↖io.netty.channel.ChannelOutboundBuffer$Entry.msg
> ↖io.netty.channel.ChannelOutboundBuffer$Entry.{next}
> ↖io.netty.channel.ChannelOutboundBuffer.flushedEntry
> ↖io.netty.channel.socket.nio.NioSocketChannel$NioSocketChannelUnsafe.outboundBuffer
> ↖io.netty.channel.socket.nio.NioSocketChannel.unsafe
> ↖org.apache.spark.network.server.OneForOneStreamManager$StreamState.associatedChannel
> ↖{java.util.concurrent.ConcurrentHashMap}.values
> ↖org.apache.spark.network.server.OneForOneStreamManager.streams
> ↖org.apache.spark.network.shuffle.ExternalShuffleBlockHandler.streamManager
> ↖org.apache.spark.network.yarn.YarnShuffleService.blockHandler
> ↖Java Static org.apache.spark.network.yarn.YarnShuffleService.instance{code}
>  
> Checking the code of {{SaslEncryption$EncryptedMessage}}, I see that 
> {{byteChannel}} is always initialized eagerly in the constructor:
> {code:java}
> this.byteChannel = new ByteArrayWritableChannel(maxOutboundBlockSize);{code}
> So I think to address the problem of empty byte[] arrays flooding the memory, 
> we should initialize {{byteChannel}} lazily, upon the first use. As far as I 
> can see, it's used only in one method, {{private void nextChunk()}}.
>  
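The lazy-initialization fix proposed in the description could look roughly like the sketch below. This is simplified and illustrative, not the actual Spark code: the real {{EncryptedMessage}} implements netty's {{FileRegion}} and carries more state, and the {{allocated()}} helper exists only for demonstration:

```java
// Simplified sketch of ByteArrayWritableChannel: the field that holds the
// large (maxOutboundBlockSize-sized) byte[] array.
class ByteArrayWritableChannelSketch {
    final byte[] data;

    ByteArrayWritableChannelSketch(int size) {
        data = new byte[size];
    }

    void reset() {
        // reset the write position (elided in this sketch)
    }
}

public class EncryptedMessageSketch {
    private final int maxOutboundBlockSize;
    private ByteArrayWritableChannelSketch byteChannel;  // no longer allocated in the constructor

    public EncryptedMessageSketch(int maxOutboundBlockSize) {
        this.maxOutboundBlockSize = maxOutboundBlockSize;
        // Before the fix, the constructor did the equivalent of:
        //   this.byteChannel = new ByteArrayWritableChannel(maxOutboundBlockSize);
        // which wastes ~64 KiB per message that never reaches nextChunk().
    }

    // byteChannel is used only from nextChunk(), so allocate it on first use.
    private ByteArrayWritableChannelSketch byteChannel() {
        if (byteChannel == null) {
            byteChannel = new ByteArrayWritableChannelSketch(maxOutboundBlockSize);
        }
        return byteChannel;
    }

    public void nextChunk() {
        byteChannel().reset();
        // ... encrypt the next block of data into the channel (elided) ...
    }

    // Illustrative helper, not in the real class: lets callers observe
    // whether the buffer has been allocated yet.
    public boolean allocated() {
        return byteChannel != null;
    }
}
```

With this change, the 40,583 queued-but-unsent messages in the dump would each hold a null {{byteChannel}} instead of an empty 64 KiB array.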



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)