[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-26 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
merged to master, thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-26 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/21811
  
+1, LGTM, too.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-24 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
lgtm

will wait a bit for any more comments before merging


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-24 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/21811
  
SGTM






---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21811
  
@countmdm Sorry for overlooking the JIRA description. I got the situation.
While the memory pool could be, it is too complex.

LGTM




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread countmdm
Github user countmdm commented on the issue:

https://github.com/apache/spark/pull/21811
  
@kiszk the situation "before" is well understood. In the respective 
SPARK-24801 ticket I present a fragment from the analysis of this heap dump by 
jxray (www.jxray.com). It shows that ~2.5GB of memory, or 64% of the used heap 
size, is wasted by ~40.5 thousand emtpty byte[] arrays in question:

2,597,946K (64.1%): byte[]: 40583 / 100% of empty 2,597,946K (64.1%)
↖org.apache.spark.network.util.ByteArrayWritableChannel.data
↖org.apache.spark.network.sasl.SaslEncryption$EncryptedMessage.byteChannel
↖io.netty.channel.ChannelOutboundBuffer$Entry.msg
...

However, we don't, and probably cannot, get the real "after" evidence. 
That's because, as I said, I don't know how to reproduce the situation in 
house. And I think it's very unlikely that the customer can easily reproduce it 
either, let alone accept our patched code and collect the necessary data before 
and after the fix. However, I believe this fix is simple and obvious enough, 
and thus we can be pretty sure that with it, in the above situation there would 
simply be no problematic byte[] arrays anymore, and memory consumption will be 
64% smaller.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21811
  
@countmdm I see. We may be interested in ratio regarding `byte[] / all 
allocated memory` before and after. Not interested in other objects (e.g. an 
object including customer's name).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread countmdm
Github user countmdm commented on the issue:

https://github.com/apache/spark/pull/21811
  
Thank you very much for your responses, @squito. I agree with all you said.

@kiszk the heap dump that prompted me to make this change was obtained from 
a customer, who probably ran into some unusual situation. So, not being a spark 
expert, I am not sure how to reproduce this problem. Of course, normally I am 
all for measuring the app's memory before and after the change to make sure it 
worked.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21811
  
I see. SGTM.
Would it be possible to attach heap profiling (allocated size for each 
type) before and after this PR to record the difference?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21811
  
**[Test build #4221 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4221/testReport)**
 for PR 21811 at commit 
[`8b46534`](https://github.com/apache/spark/commit/8b465341314b1cbbef726950d589d5fb49db341b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
> Does it make sense to release byteChannel at deallocate()?

you could, just to let GC kick in a *bit* earlier, but I don't think its 
going to make a big difference.  (Netty's ByteBufs must be since they may be 
owned by netty's internal pools and never gc'ed)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
> I like it, but this will still create the byte channel right? is there a 
way to reuse it?

we could create a pool, though management becomes a bit more complex.  
would you ever shrink the pool, or would it always stay the same size?  I guess 
it shouldn't grow more than 1 byte array per io thread.

I'd rather get this simple fix in first before doing anything more 
complicated ...


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21811
  
**[Test build #4221 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4221/testReport)**
 for PR 21811 at commit 
[`8b46534`](https://github.com/apache/spark/commit/8b465341314b1cbbef726950d589d5fb49db341b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-22 Thread kiszk
Github user kiszk commented on the issue:

https://github.com/apache/spark/pull/21811
  
Does it make sense to release `byteChannel` at `deallocate()`?
```
if (byteChannel != null) {
  byteChannel = null;
}
```


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-19 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
@zsxwing @jerryshao @Victsm you might be interested in this


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-19 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
can you add something in the PR description about how this is important 
because sometimes many of these messages queue up in netty's 
ChannelOutboundBuffer before transferTo() is called?  its discussed in the 
jira, but good to have here too


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-19 Thread squito
Github user squito commented on the issue:

https://github.com/apache/spark/pull/21811
  
Ok to test


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-18 Thread countmdm
Github user countmdm commented on the issue:

https://github.com/apache/spark/pull/21811
  
Yes.

On Wed, Jul 18, 2018 at 4:43 PM, UCB AMPLab 
wrote:

> Can one of the admins verify this patch?
>
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub
> , or 
mute
> the thread
> 

> .
>



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21811
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21811
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21811: [SPARK-24801][CORE] Avoid memory waste by empty byte[] a...

2018-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21811
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org