Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21811
@kiszk the situation "before" is well understood. In the respective
SPARK-24801 ticket I present a fragment from the analysis of this heap dump by
jxray (www.jxray.com). It shows t
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21811
Thank you very much for your responses, @squito. I agree with all you said.
@kiszk the heap dump that prompted me to make this change was obtained from
a customer, who probably ran
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21811
Yes.
On Wed, Jul 18, 2018 at 4:43 PM, UCB AMPLab wrote:
> Can one of the admins verify this patch?
GitHub user countmdm opened a pull request:
https://github.com/apache/spark/pull/21811
[SPARK-24801][CORE] Avoid memory waste by empty byte[] arrays in
SaslEncryption$EncryptedMessage
## What changes were proposed in this pull request?
Initialize SaslEncryption
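The proposed-change description is truncated above. As a rough illustration of the general idea only (the class and field names below are hypothetical, not the actual SaslEncryption$EncryptedMessage code), allocating a per-message buffer lazily avoids creating an empty byte[] for every instance:

```java
// Hypothetical sketch of lazy buffer allocation; illustrative names only,
// not the actual Spark patch.
public class LazyBufferSketch {
    private byte[] buffer;  // stays null until first use instead of an eager new byte[0]

    // Allocate (or grow) the backing array only when data actually needs buffering.
    byte[] buffer(int minCapacity) {
        if (buffer == null || buffer.length < minCapacity) {
            buffer = new byte[minCapacity];
        }
        return buffer;
    }

    boolean isAllocated() {
        return buffer != null;
    }

    public static void main(String[] args) {
        LazyBufferSketch msg = new LazyBufferSketch();
        System.out.println(msg.isAllocated());  // prints "false": nothing allocated yet
        msg.buffer(1024);
        System.out.println(msg.isAllocated());  // prints "true"
    }
}
```

With many short-lived message objects in flight, even tiny eagerly allocated arrays add up, which is the kind of waste a heap-dump analysis surfaces.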
Github user countmdm commented on a diff in the pull request:
https://github.com/apache/spark/pull/21456#discussion_r192576365
--- Diff:
common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolverSuite.java
---
@@ -135,4 +136,23 @@ public
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21456
Just modified the code to use regexp and pushed the updates.
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21456
Ok, if you believe this is not a performance problem here, then it's fine
with me. To save us further back-and-forth on this review, can you
please share your pattern/regex code here? I
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21456
@squito wrt. the added code for path normalization: exactly as you say.
This is just a precaution in case Spark (or even some code that sits above
Spark) ends up generating pathnames that contain
Github user countmdm commented on a diff in the pull request:
https://github.com/apache/spark/pull/21456#discussion_r192217237
--- Diff:
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java
---
@@ -272,6 +273,57 @@ void close
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21456
If we don't do normalization ourselves, we may run into the
following:
path = ... // Produces "foo//bar"
path = path.intern(); // Ok, no separate copies of
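To make the concern above concrete, here is a minimal sketch (the helper is hypothetical; it relies on java.io.File collapsing repeated separators when a pathname is normalized) showing that interning alone does not merge "foo//bar" with "foo/bar", while normalizing first does:

```java
import java.io.File;

public class PathInternSketch {
    // Normalize the pathname first (java.io.File collapses repeated separators),
    // then intern so all equal paths share a single String object.
    static String normalizeAndIntern(String path) {
        return new File(path).getPath().intern();
    }

    public static void main(String[] args) {
        // Without normalization, the two spellings intern to different strings.
        System.out.println("foo//bar".intern() == "foo/bar".intern());  // prints "false"
        // With normalization, they collapse to one canonical copy.
        System.out.println(
            normalizeAndIntern("foo//bar") == normalizeAndIntern("foo/bar"));  // prints "true"
    }
}
```

So even interned paths can remain duplicated byte-for-different-spelling unless normalization happens before the intern() call.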
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21456
@srowen yes, I am pretty sure that this code generates all these duplicate
objects. I've analyzed a heap dump from a real customer, so I cannot publish
the entire jxray report, since it may
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21456
I confirm that the -XX:+UseStringDeduplication option is available only
with G1 GC, and it is off by default. So if we decide to use it, I guess we
won't be able to enforce it reliably, especially
Github user countmdm commented on the issue:
https://github.com/apache/spark/pull/21456
Yes.
On Tue, May 29, 2018 at 1:18 PM, UCB AMPLab wrote:
> Can one of the admins verify this patch?
GitHub user countmdm opened a pull request:
https://github.com/apache/spark/pull/21456
[SPARK-24356] [CORE] Duplicate strings in File.path managed by
FileSegmentManagedBuffer
This patch eliminates duplicate strings that come from the 'path' field of
java.io.File objects created
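A minimal sketch of the deduplication idea (illustrative path strings, not the patch itself): equal strings constructed independently are distinct heap objects until intern() maps them to one pooled copy:

```java
public class StringDedupSketch {
    public static void main(String[] args) {
        // Two equal path strings constructed independently occupy separate heap objects.
        String p1 = new String("/tmp/spark/shuffle/0.data");
        String p2 = new String("/tmp/spark/shuffle/0.data");
        System.out.println(p1 == p2);                    // prints "false"
        // intern() returns the single canonical copy from the JVM string pool,
        // so repeated paths no longer waste memory on duplicates.
        System.out.println(p1.intern() == p2.intern());  // prints "true"
    }
}
```

When thousands of FileSegmentManagedBuffer instances reference the same shuffle files, keeping one canonical copy per distinct path is what eliminates the waste a heap dump reveals.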