[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-09 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77862 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77862/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-09 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18231 > There isn't a reference here anymore; there could be elsewhere. Only if there was a bug in the RPC layer, since this is an RPC handler and the message should not be referenced by the RPC

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-09 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18231 There isn't a reference _here_ anymore; there could be elsewhere. It sounds like there's good reason to believe there is not another reference hanging around though. --- If your project is set up

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-09 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/18231 @srowen I don't see any references to the original `OpenBlocks` message nor to the block id array in the updated code, not sure why you think there's still a reference somewhere?

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18231 Merged build finished. Test PASSed.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18231 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77831/ Test PASSed.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77831 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77831/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77831 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77831/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18231 Merged build finished. Test PASSed.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18231 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77811/ Test PASSed.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77811 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77811/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @srowen I did a test to verify this patch. I wrap a number of blocks inside `OpenBlocks` and send it to `ExternalShuffleBlockHandler`. With this change: it cost about 133M in the

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 Yes, I think it's great to do some tests and give good evidence.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18231 I'm not clear that's true, no. Not, at least, in the lifetime of the iterator. That's what has to be true for this to help anything. Do you have evidence this is true? For example if you have tests

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 Nothing references `msg` anywhere, right? I guess `msg` will be garbage collected without trouble.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77811 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77811/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18231 That's not the question though. The question is whether they could be freed even after this change. msg still references it. That's what you need to establish, if only by some empirical testing.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 The blockIds cannot be freed because they are referenced in the iterator. With the current change they are not. We reference the mapIdAndReduceIds instead. Thus the blockIds in OpenBlocks can be
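The idea in this comment can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name `CompactBlockIds` is hypothetical, and only the field name `mapIdAndReduceIds` comes from the thread. It assumes every block id has the form `shuffle_<shuffleId>_<mapId>_<reduceId>`. The constructor copies the two numeric ids of each block into a flat `int[]`, and the iterator walks that array, so nothing in the object keeps the original `String[]` reachable.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch, not Spark's actual handler code.
final class CompactBlockIds implements Iterable<int[]> {
  // Flattened pairs: [mapId0, reduceId0, mapId1, reduceId1, ...]
  private final int[] mapIdAndReduceIds;

  CompactBlockIds(String[] blockIds) {
    mapIdAndReduceIds = new int[2 * blockIds.length];
    for (int i = 0; i < blockIds.length; i++) {
      // Assumed format: shuffle_<shuffleId>_<mapId>_<reduceId>
      String[] parts = blockIds[i].split("_");
      if (parts.length != 4 || !parts[0].equals("shuffle")) {
        throw new IllegalArgumentException("Unexpected block id format: " + blockIds[i]);
      }
      mapIdAndReduceIds[2 * i] = Integer.parseInt(parts[2]);
      mapIdAndReduceIds[2 * i + 1] = Integer.parseInt(parts[3]);
    }
    // No field stores blockIds, so this object does not keep the strings alive.
  }

  @Override
  public Iterator<int[]> iterator() {
    return new Iterator<int[]>() {
      private int index = 0;

      @Override
      public boolean hasNext() {
        return index < mapIdAndReduceIds.length;
      }

      @Override
      public int[] next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        // Return one {mapId, reduceId} pair and advance past it.
        int[] pair = { mapIdAndReduceIds[index], mapIdAndReduceIds[index + 1] };
        index += 2;
        return pair;
      }
    };
  }
}
```

Whether this saves memory in practice still depends on the point raised elsewhere in the thread: the original message and its block id array must themselves become unreachable while the iterator is alive.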

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18231 I get it. But that doesn't make the reference in OpenBlocks go away. This only helps anything if msg/msgObj can be garbage collected earlier. Is that the case? Right now this is allocating

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 I mean the blockIds in `OpenBlocks`; they are referenced by the iterator.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18231 The current iterator doesn't have any state except for an int. What are you referring to?

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @srowen Sorry, I didn't make it clear. 1. In the current code, all blockIds are stored in the iterator. They are released only when the iterator is traversed. 2. Now I change the `String` to

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 Actually it's more than 12 bytes. Yes, there are millions of these. In my heap dump, it's 1.5 G

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18231 That's 12 bytes. Are there millions of these?

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @vanzin Thanks a lot for reviewing this. I refined it according to your comments. Please take another look at this when you have time :)

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 @srowen Thanks a lot for looking into this :) For example: blockId="shuffle_20_1000_2000", it is stored as a `String`, which costs more than 20 bytes. In this change, it will cost only 8
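To make the size argument concrete, here is a minimal Java sketch. `BlockIdParser`, `parseMapAndReduceId`, and `PAYLOAD_BYTES` are hypothetical names, not taken from the PR, and the sketch assumes the id always follows the `shuffle_<shuffleId>_<mapId>_<reduceId>` pattern of the example above. It keeps only the map id and reduce id as two `int` values, which is 8 bytes of payload, compared with a 20-character string plus its JVM object overhead.

```java
// Hypothetical helper, not Spark's actual code.
final class BlockIdParser {
  // Two ints (mapId and reduceId) are kept per block.
  static final int PAYLOAD_BYTES = 2 * Integer.BYTES;

  private BlockIdParser() {}

  /**
   * Parses "shuffle_<shuffleId>_<mapId>_<reduceId>" and returns {mapId, reduceId}.
   * Throws IllegalArgumentException if the id does not match that shape.
   */
  static int[] parseMapAndReduceId(String blockId) {
    String[] parts = blockId.split("_");
    if (parts.length != 4 || !parts[0].equals("shuffle")) {
      throw new IllegalArgumentException("Unexpected block id format: " + blockId);
    }
    // Integer.parseInt throws NumberFormatException, a subclass of
    // IllegalArgumentException, when a field is not a valid int.
    return new int[] { Integer.parseInt(parts[2]), Integer.parseInt(parts[3]) };
  }
}
```

For the example id, `BlockIdParser.parseMapAndReduceId("shuffle_20_1000_2000")` yields the pair 1000 and 2000. The shuffle id is dropped here on the assumption that it is the same for every block in one request and can be stored once.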

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18231 Merged build finished. Test PASSed.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18231 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77806/ Test PASSed.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-08 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77806 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77806/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77806 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77806/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18231 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77795/ Test FAILed.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18231 Merged build finished. Test FAILed.

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77795 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77795/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-07 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18231 **[Test build #77795 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77795/testReport)** for PR 18231 at commit

[GitHub] spark issue #18231: [WIP][SPARK-20994] Remove reduant characters in OpenBloc...

2017-06-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/18231 In my cluster, we are suffering from OOM in the shuffle service. We found that a lot of executors are fetching blocks from a single shuffle service. Analyzing the memory, we found that the