Github user squito commented on the issue:
https://github.com/apache/spark/pull/22511
> The analysis makes sense to me. The thing I'm not sure about is: how can we hit it? The "fetch block to temp file" code path is only enabled for big blocks (> 2GB).
The failing test cases "with replication as stream" turned on fetch to disk for all data:
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/DistributedSuite.scala#L166-L167
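For illustration, here is a minimal sketch (not the exact DistributedSuite code; the master URL, app name, and dataset are made up) of how lowering `spark.maxRemoteBlockSizeFetchToMem` to 1 byte forces every remote block, not just >2GB ones, through the fetch-to-temp-file path:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val conf = new SparkConf()
  .setMaster("local-cluster[2, 1, 1024]") // two executors, so replication goes remote
  .setAppName("replication-as-stream-sketch")
  // 1-byte threshold: every remote block is streamed to a temp file on disk,
  // so the "fetch block to temp file" path runs even for small blocks.
  .set("spark.maxRemoteBlockSizeFetchToMem", "1")

val sc = new SparkContext(conf)
// A replicated storage level makes the block manager stream each block to a
// second executor, which then exercises the fetch-to-disk code path.
val rdd = sc.parallelize(1 to 1000, 4).persist(StorageLevel.MEMORY_ONLY_2)
rdd.count()
sc.stop()
```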
> a possible approach: can we just not dispose the data in TorrentBroadcast? The memory store will dispose them anyway when it's removed.
Yes, I considered this, but I don't feel confident about making that change for 2.4. I need to spend some more time understanding that code (it seems to have come from SPARK-19556 / https://github.com/apache/spark/commit/b56ad2b1ec19fd60fa9d4926d12244fd3f56aca4). I think this change is the right one for the moment.
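To make the ownership question concrete, here is a hedged sketch (my own illustration, not Spark's internal API) of why only the last holder of a shared buffer should free it, which is the intuition behind "let the memory store dispose it":

```scala
import java.nio.ByteBuffer
import java.util.concurrent.atomic.AtomicInteger

// Two components (here standing in for the memory store and TorrentBroadcast)
// share one buffer; only dropping the last reference may actually free it.
final class SharedBuffer(buf: ByteBuffer) {
  private val refs = new AtomicInteger(1)
  def get: ByteBuffer = buf.duplicate()
  def retain(): SharedBuffer = { refs.incrementAndGet(); this }
  def release(): Unit = {
    if (refs.decrementAndGet() == 0) {
      // Only here would the off-heap / memory-mapped pages actually be freed.
      // An eager dispose() by one owner while the other still reads the same
      // pages is exactly the hazard under discussion.
    }
  }
}

// Usage: the broadcast side releases its reference when done; the memory
// store, holding the last reference, frees the buffer on block removal.
val shared = new SharedBuffer(ByteBuffer.allocate(16)).retain()
shared.release() // TorrentBroadcast finished reading
shared.release() // memory store removes the block; buffer freed here
```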