[jira] [Resolved] (SPARK-24920) Spark should allow sharing netty's memory pools across all uses

Sean Owen (JIRA) Tue, 08 Jan 2019 11:12:33 -0800


     [ https://issues.apache.org/jira/browse/SPARK-24920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sean Owen resolved SPARK-24920.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 3.0.0

Issue resolved by pull request 23278
[https://github.com/apache/spark/pull/23278]

> Spark should allow sharing netty's memory pools across all uses
> ---------------------------------------------------------------
>
>                 Key: SPARK-24920
>                 URL: https://issues.apache.org/jira/browse/SPARK-24920
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.4.0
>            Reporter: Imran Rashid
>            Assignee: Attila Zsolt Piros
>            Priority: Major
>              Labels: memory-analysis
>             Fix For: 3.0.0
>
>
> Spark currently creates separate netty memory pools for each of the following 
> "services":
> 1) RPC Client
> 2) RPC Server
> 3) BlockTransfer Client
> 4) BlockTransfer Server
> 5) ExternalShuffle Client
> Depending on configuration and whether it's an executor or driver JVM, 
> a different subset of these is active, but it's always either 3 or 4.
> Having them independent somewhat defeats the purpose of using pools at all.  
> In my experiments I've found each pool will grow due to a burst of activity 
> in the related service (e.g. task start / end msgs), followed by another burst 
> in a different service (e.g. sending torrent broadcast blocks).  Because of the 
> way these pools work, they allocate memory in large chunks (16 MB by default) 
> for each netty thread, so there is often a surge of 128 MB of allocated 
> memory, even for really tiny messages.  Also a lot of this memory is offheap 
> by default, which makes it even tougher for users to manage.
> I think it would make more sense to combine all of these into a single pool.  
> In some experiments I tried, this noticeably decreased memory usage, both 
> onheap and offheap (no significant performance effect in my small 
> experiments).
> As this is a pretty core change, as a first step I'd propose just exposing 
> this as a conf, to let users experiment more broadly across a wider range of 
> workloads.
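
The 16 MB and 128 MB figures in the description can be reproduced with simple arithmetic. The sketch below is only an illustration: it assumes a pooled allocator whose chunk size is pageSize << maxOrder, with 8 KiB pages, maxOrder 11, and 8 netty threads with one arena each. None of these values are read from Spark or netty at runtime, and the class and method names are hypothetical.

```java
// Back-of-the-envelope arithmetic for pooled-allocator chunk usage.
// Every default used in main() is an assumption for illustration only.
public class PoolChunkMath {

    // A pooled arena reserves memory in chunks of (pageSize << maxOrder) bytes.
    public static long chunkSizeBytes(int pageSize, int maxOrder) {
        return ((long) pageSize) << maxOrder;
    }

    // Memory reserved if every arena in every pool has allocated exactly one chunk.
    public static long reservedBytes(int pools, int arenasPerPool, long chunkSize) {
        return (long) pools * arenasPerPool * chunkSize;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long chunk = chunkSizeBytes(8192, 11); // assumed: 8 KiB pages, maxOrder 11
        int threads = 8;                       // assumed: 8 netty threads, one arena each

        System.out.println("chunk size:          " + chunk / mib + " MiB");
        System.out.println("one pool:            " + reservedBytes(1, threads, chunk) / mib + " MiB");
        System.out.println("four separate pools: " + reservedBytes(4, threads, chunk) / mib + " MiB");
    }
}
```

Under these assumptions a single pool reserves 128 MiB once each arena has touched one chunk, and four independent pools that each see a burst reserve four times that, which is the growth pattern the description reports.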

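The proposal to gate pool sharing behind a conf could look roughly like the sketch below. It is not the code from the pull request: PoolRegistry, Pool, and the boolean flag are hypothetical stand-ins, and Pool only tracks identity rather than wrapping a real netty allocator.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry that hands out either one shared pool or one pool per service,
// depending on a boolean flag that would come from a user-facing conf.
public class PoolRegistry {

    // Stand-in for a real pooled allocator; it only carries a name here.
    public static final class Pool {
        public final String name;
        Pool(String name) { this.name = name; }
    }

    private final boolean shared;
    private final Pool sharedPool = new Pool("shared");
    private final Map<String, Pool> perService = new ConcurrentHashMap<>();

    public PoolRegistry(boolean shared) {
        this.shared = shared;
    }

    // With sharing on, every service gets the same pool; otherwise each service
    // lazily gets its own pool, created once and reused afterwards.
    public Pool poolFor(String service) {
        if (shared) {
            return sharedPool;
        }
        return perService.computeIfAbsent(service, Pool::new);
    }

    public static void main(String[] args) {
        PoolRegistry separate = new PoolRegistry(false);
        PoolRegistry combined = new PoolRegistry(true);
        System.out.println(separate.poolFor("rpc-client") == separate.poolFor("rpc-server"));
        System.out.println(combined.poolFor("rpc-client") == combined.poolFor("rpc-server"));
    }
}
```

Keeping the per-service path behind the flag matches the "first step" described above: users can switch between the two behaviours and compare memory usage on their own workloads.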


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
