[
https://issues.apache.org/jira/browse/SPARK-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109591#comment-14109591
]
Apache Spark commented on SPARK-3211:
-------------------------------------
User 'ash211' has created a pull request for this issue:
https://github.com/apache/spark/pull/2117
> .take() is OOM-prone when there are empty partitions
> ----------------------------------------------------
>
> Key: SPARK-3211
> URL: https://issues.apache.org/jira/browse/SPARK-3211
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.0.2
> Reporter: Andrew Ash
>
> Filed on dev@ on 22 August by [~pnepywoda]:
> {quote}
> On line 777
> https://github.com/apache/spark/commit/42571d30d0d518e69eecf468075e4c5a823a2ae8#diff-1d55e54678eff2076263f2fe36150c17R771
> the logic for take() reads ALL partitions if the first one (or first k) are
> empty. This has actually led to OOMs when we had many partitions
> (thousands) and unfortunately the first one was empty.
> Wouldn't a better implementation strategy be
> numPartsToTry = partsScanned * 2
> instead of
> numPartsToTry = totalParts - 1
> (this doubling is similar to most memory allocation strategies)
> Thanks!
> - Paul
> {quote}
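A rough, self-contained Scala sketch of the doubling strategy quoted above (partition contents are stubbed as a plain Seq[Seq[Int]] rather than a real RDD, and the object/method names here are illustrative only, not taken from the linked PR):
{code:scala}
// Minimal sketch of a geometrically growing partition scan for take().
// Assumes partitions are available locally as Seq[Seq[Int]]; a real RDD
// would run a job over the selected partition range instead.
object TakeSketch {
  def take(partitions: Seq[Seq[Int]], num: Int): Seq[Int] = {
    val totalParts = partitions.length
    val buf = scala.collection.mutable.ArrayBuffer.empty[Int]
    var partsScanned = 0
    while (buf.size < num && partsScanned < totalParts) {
      // First pass looks at one partition; afterwards grow the batch
      // geometrically (partsScanned * 2) rather than jumping straight
      // to all remaining partitions, so a run of empty partitions
      // cannot force a scan of thousands of partitions at once.
      val numPartsToTry =
        if (partsScanned == 0) 1
        else math.min(partsScanned * 2, totalParts - partsScanned)
      val batch = partitions.slice(partsScanned, partsScanned + numPartsToTry)
      buf ++= batch.flatten.take(num - buf.size)
      partsScanned += numPartsToTry
    }
    buf.toList
  }

  def main(args: Array[String]): Unit = {
    // Thousands of empty partitions followed by one that has data:
    val parts = Seq.fill(5000)(Seq.empty[Int]) :+ Seq(1, 2, 3)
    println(take(parts, 2)) // List(1, 2)
  }
}
{code}
With this growth rule, a take(2) over thousands of empty partitions scans batches of 1, 2, 6, 18, 54, ... partitions instead of falling back to all remaining partitions after the first empty one.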