GitHub user bkrieger opened a pull request:
https://github.com/apache/spark/pull/22995
[SPARK-25998] [CORE] Change TorrentBroadcast to hold weak reference of
broadcast object
## What changes were proposed in this pull request?
This PR changes the reference that TorrentBroadcast holds to its broadcast
object from a strong reference to a weak reference. The broadcast object can
then be garbage collected even while the Dataset that uses it is still held in
memory. This is safe because the broadcast object can always be re-read.
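To illustrate the idea outside of Spark, here is a minimal JVM-level sketch in Java. It is not the code in this PR (TorrentBroadcast is written in Scala), and the class `WeaklyCachedValue`, its `reader` supplier, and the `clearForTest` and `reads` helpers are hypothetical names introduced only for this example. It assumes the value can be rebuilt on demand and that the supplier never returns null.

```java
import java.lang.ref.WeakReference;
import java.util.function.Supplier;

/**
 * Hypothetical holder (not Spark's TorrentBroadcast): caches a value behind
 * a weak reference and re-reads it when the cached copy has been collected.
 */
final class WeaklyCachedValue<T> {
    // Assumed to rebuild the value from its backing store; must not return null.
    private final Supplier<T> reader;
    // Starts out empty, so the first get() triggers a read.
    private WeakReference<T> ref = new WeakReference<>(null);
    // Counts how often the value had to be (re-)read.
    private int reads = 0;

    WeaklyCachedValue(Supplier<T> reader) {
        this.reader = reader;
    }

    synchronized T get() {
        // Copy into a strong local so the value cannot vanish between
        // the null check and the return.
        T value = ref.get();
        if (value == null) {
            value = reader.get();
            reads++;
            ref = new WeakReference<>(value);
        }
        return value;
    }

    // Clears the reference by hand to simulate what the garbage collector
    // may do once no strong reference to the value remains.
    synchronized void clearForTest() {
        ref.clear();
    }

    synchronized int reads() {
        return reads;
    }
}
```

While a caller keeps a strong reference to the returned value, `get()` keeps returning that same instance. Once the reference has been cleared, the next `get()` pays the cost of another read, which is the trade-off this change accepts in exchange for letting the memory be reclaimed.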
## How was this patch tested?
Tested in the Spark shell by taking a heap dump; full repro steps are listed in
https://issues.apache.org/jira/browse/SPARK-25998.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/bkrieger/spark bk/torrent-broadcast-weak
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22995.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22995
----
commit a2683b62985fc9c7d15fb92f3bb170a4b5225058
Author: Brandon Krieger <bkrieger@...>
Date: 2018-11-08T23:04:06Z
use weak reference for torrent broadcast
commit 99fbeecf43a289648a56d178fa55e188ce75bdb7
Author: Brandon Krieger <bkrieger@...>
Date: 2018-11-09T21:04:51Z
fix compile
commit 5e0a179c168a70b0166abe4bb51a1d26a2f1d666
Author: Brandon Krieger <bkrieger@...>
Date: 2018-11-09T21:33:22Z
fix
commit 1908b5b8dfa6c0b55db3bd9a90e21ca713e5bf25
Author: Brandon Krieger <bkrieger@...>
Date: 2018-11-09T21:48:44Z
no npe
commit 24183e5b8b63e0b4e117856ab4de7eb1b0ea6c9a
Author: Brandon Krieger <bkrieger@...>
Date: 2018-11-09T21:52:21Z
no option
commit f212da322242386ce3b71e9961a964e60b587287
Author: Brandon Krieger <bkrieger@...>
Date: 2018-11-09T22:08:23Z
typo
----