[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-07 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15722 LGTM. Merging to master/2.1. Thanks!

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15722 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69793/

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15722 Merged build finished. Test PASSed.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #69793 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69793/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #69793 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69793/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-07 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15722 retest this please

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-06 Thread jiexiong
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 @hvanhovell, thanks for your suggestion. I will change the PR description as suggested.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-06 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15722 @jiexiong PR descriptions are used in git commit messages, and should be clear and concise. The fix LGTM, but the description should be improved for future reference. How about we change it into

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15722 Merged build finished. Test FAILed.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15722 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69689/

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #69689 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69689/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #69689 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69689/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-12-05 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15722 retest this please

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-30 Thread jiexiong
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 Please retest!

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-30 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/15722 jenkins retest this please.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-29 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15722 Retest this please.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-28 Thread jiexiong
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 @hvanhovell, I have already updated the description and explained how the PR fixes it. Could you please take another look?

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-28 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15722 ping @jiexiong

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15722 Merged build finished. Test PASSed.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-08 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15722 @jiexiong could you improve the PR description and add a description of what is actually causing this bug (a summary of the discussions in the PR would probably suffice), and how this PR fixes

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15722 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68338/

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #68338 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68338/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #68338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68338/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-08 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/15722 retest this please

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/15722 The test case failure does not seem to be related to this change. Maybe a flaky test?

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #3399 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3399/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #3399 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3399/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #3396 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3396/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/15722 @sitalkedia I see your point: even if the longArray only takes a small fraction of the executor's memory, it could still make other concurrent tasks fail to acquire enough initial memory, that make
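The point about concurrent tasks can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyExecutorMemory`, the task names, and the unit sizes are all hypothetical and do not correspond to Spark's memory manager API. It shows how a small retained array can be enough to make another task's initial request fail.

```python
class ToyExecutorMemory:
    """Toy fixed-size budget shared by tasks (hypothetical, not Spark's API)."""

    def __init__(self, total_units):
        self.total_units = total_units
        self.held = {}  # task name -> units currently held

    def free_units(self):
        return self.total_units - sum(self.held.values())

    def try_acquire(self, task, units):
        """Grant the request only if it fits in the remaining budget."""
        if units > self.free_units():
            return False
        self.held[task] = self.held.get(task, 0) + units
        return True

    def release(self, task, units):
        self.held[task] -= units


memory = ToyExecutorMemory(total_units=100)

# Task A holds a pointer array (10 units) plus data pages (85 units).
memory.try_acquire("task_a", 10)
memory.try_acquire("task_a", 85)

# Task A spills and frees its data pages, but keeps the pointer array.
memory.release("task_a", 85)

# Task B asks for 95 units; only 90 are free because of the retained array.
task_b_started = memory.try_acquire("task_b", 95)

# If task A also gives back the array, the same request fits.
memory.release("task_a", 10)
task_b_started_after_free = memory.try_acquire("task_b", 95)
```

In this model the retained array is only a tenth of the budget, yet it is what decides whether the second task can start.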

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15722 **[Test build #3396 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3396/consoleFull)** for PR 15722 at commit

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/15722 @davies - Yes, we dumped the logging and confirmed that the OOM is because we are not freeing the `LongArray` while resetting the `BytesToBytesMap`. The job which used to fail because of OOM runs

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/15722 @jiexiong The longArray will not grow indefinitely; it only grows when the number of keys reaches 50% of its size. Another assumption is that the memory used by longArray should be much smaller than
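The growth rule described in this comment can be sketched as a toy model. The class below is hypothetical and is not Spark's implementation; the doubling factor and the initial capacity are assumptions, and only the 50% threshold comes from the comment.

```python
class ToyPointerArray:
    """Toy stand-in for the longArray; not Spark's implementation."""

    LOAD_FACTOR = 0.5  # from the comment: grow when keys reach 50% of the size

    def __init__(self, capacity):
        self.capacity = capacity
        self.num_keys = 0
        self.growths = 0

    def insert_key(self):
        self.num_keys += 1
        if self.num_keys >= self.capacity * self.LOAD_FACTOR:
            self.capacity *= 2  # assumption: capacity doubles on growth
            self.growths += 1


array = ToyPointerArray(capacity=8)
for _ in range(100):
    array.insert_key()

final_capacity = array.capacity
growth_count = array.growths
```

Under these assumptions the capacity stays within a constant factor of the number of keys, which matches the claim that the array does not grow indefinitely. It says nothing about whether the array is ever given back, which is the separate question raised in the thread.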

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-02 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/15722 @davies - We fixed a similar issue with `UnsafeExternalSorter` in SPARK-14363. Basically, the following scenario leads to OOM: let's say we have a total of 4G of memory available that is shared

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-01 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15722 @jiexiong can you update the PR description to include more information about the OOM and remove the Facebook internal links? Thanks.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-01 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/15722 I agree with @jiexiong's analysis: @davies, to support the scenario you mentioned, we should not be freeing the pages.

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-01 Thread jiexiong
Github user jiexiong commented on the issue: https://github.com/apache/spark/pull/15722 Here is my understanding: after spilling, it calls reset() to release the memory. The reset() function frees all the memory pages, but it does not release any of the memory held by longArray.
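A minimal sketch of the behaviour described here, using a toy class rather than Spark's `BytesToBytesMap`. The class, its fields, and the unit sizes are hypothetical, and the `free_array_on_reset` flag models an assumed fix (releasing the array during reset) rather than the exact change made in the PR.

```python
class ToyBytesToBytesMap:
    """Toy model of a map with data pages and a pointer array (not Spark's class)."""

    def __init__(self, array_units, free_array_on_reset):
        self.array_units = array_units  # stand-in for longArray
        self.page_units = []            # stand-in for the data pages
        self.free_array_on_reset = free_array_on_reset

    def add_page(self, units):
        self.page_units.append(units)

    def held_units(self):
        return self.array_units + sum(self.page_units)

    def reset(self):
        # Behaviour described in the comment: all pages are freed.
        self.page_units.clear()
        # Assumed fix: also give back the pointer array.
        if self.free_array_on_reset:
            self.array_units = 0


def units_held_after_spill(free_array_on_reset):
    toy_map = ToyBytesToBytesMap(array_units=64,
                                 free_array_on_reset=free_array_on_reset)
    for _ in range(3):
        toy_map.add_page(100)
    toy_map.reset()  # called after spilling
    return toy_map.held_units()


held_before_fix = units_held_after_spill(free_array_on_reset=False)
held_after_fix = units_held_after_spill(free_array_on_reset=True)
```

In the first case the map still holds the array's units after a spill, even though it stores no data; in the second it holds nothing.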

[GitHub] spark issue #15722: [SPARK-18208] [Shuffle] Executor OOM due to a growing Lo...

2016-11-01 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/15722 @jiexiong Do you have a theory for why this would cause OOM? As I see it, the current code uses more memory than needed but performs fewer allocations, so why would it cause OOM?