[ https://issues.apache.org/jira/browse/SPARK-21175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan updated SPARK-21175:
--------------------------------
Summary: shuffle service should reject fetch requests if there are already
many requests in progress (was: Slow down "open blocks" on shuffle service
when memory shortage to avoid OOM.)
> shuffle service should reject fetch requests if there are already many
> requests in progress
> -------------------------------------------------------------------------------------------
>
> Key: SPARK-21175
> URL: https://issues.apache.org/jira/browse/SPARK-21175
> Project: Spark
> Issue Type: Improvement
> Components: Shuffle
> Affects Versions: 2.1.1
> Reporter: jin xing
> Assignee: jin xing
> Fix For: 2.3.0
>
>
> A shuffle service can serve blocks from multiple apps/tasks, so it can
> suffer high memory usage when many {{shuffle-read}} operations happen at the
> same time. In my cluster, OOM always happens on the shuffle service. Analysis
> of a heap dump shows that memory held by Netty (chunks) can reach 2~3 GB.
> It might make sense to reject "open blocks" requests when memory usage on the
> shuffle service is high.
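
To make the idea in the description concrete, here is a minimal sketch of an admission gate that rejects a fetch request once too many chunks are already in flight. It is not the Spark implementation: the class FetchAdmissionGate, its methods, and the threshold are hypothetical names chosen for illustration, and it assumes the in-flight chunk count is a usable proxy for Netty memory pressure.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical admission gate for a shuffle service; illustration only. */
final class FetchAdmissionGate {
    // Assumed limit on chunks being served at once (not a real Spark setting).
    private final long maxInFlight;
    private final AtomicLong inFlight = new AtomicLong();

    FetchAdmissionGate(long maxInFlight) {
        if (maxInFlight <= 0) {
            throw new IllegalArgumentException("maxInFlight must be positive");
        }
        this.maxInFlight = maxInFlight;
    }

    /** Reserves capacity for a request, or returns false to reject it. */
    boolean tryAdmit(int chunks) {
        if (chunks <= 0) {
            throw new IllegalArgumentException("chunks must be positive");
        }
        while (true) {
            long current = inFlight.get();
            if (current + chunks > maxInFlight) {
                return false; // too much work in progress: reject
            }
            // Retry if another thread changed the counter in the meantime.
            if (inFlight.compareAndSet(current, current + chunks)) {
                return true;
            }
        }
    }

    /** Returns capacity once the chunks have been sent or the request failed. */
    void release(int chunks) {
        inFlight.addAndGet(-chunks);
    }

    long inFlightCount() {
        return inFlight.get();
    }
}
```

A handler would call tryAdmit before opening the blocks, reply with an error (so the client can retry later) when it returns false, and call release when the transfer finishes. Rejecting early keeps the server from buffering chunks for requests it cannot serve promptly.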