Oops, I forgot to add: for 2, maybe we can add a flag to use DISK_ONLY for
TorrentBroadcast, or to use it when the broadcasts are bigger than some size.
Matei
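
To make the proposal above concrete, here is a minimal sketch in Python, for illustration only. The function name, the `force_disk_only` flag and the size threshold are all hypothetical and are not existing Spark settings; only the storage level names MEMORY_AND_DISK and DISK_ONLY come from Spark.

```python
# Hypothetical sketch: pick a storage level name for a broadcast block.
# Neither the flag nor the threshold exists in Spark; they only illustrate
# the idea of falling back to DISK_ONLY for large broadcasts.

DEFAULT_THRESHOLD_BYTES = 64 * 1024 * 1024  # assumed cutoff, chosen arbitrarily


def choose_broadcast_storage_level(size_bytes, force_disk_only=False,
                                   threshold_bytes=DEFAULT_THRESHOLD_BYTES):
    """Return the name of the storage level to use for a broadcast."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # Use disk only when explicitly requested or when the broadcast is large.
    if force_disk_only or size_bytes > threshold_bytes:
        return "DISK_ONLY"
    # Otherwise keep the current behaviour: memory first, spill to disk.
    return "MEMORY_AND_DISK"
```

With these assumed defaults, a 4 MB piece keeps MEMORY_AND_DISK unless the flag is set, while anything above the threshold goes straight to disk.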
On Oct 9, 2014, at 3:04 PM, Matei Zaharia wrote:
Thanks for the feedback. For 1, there is an open patch:
https://github.com/apache/spark/pull/2659. For 2, broadcast blocks actually use
MEMORY_AND_DISK storage, so they will spill to disk if you have low memory, but
they're faster to access otherwise.
Matei
On Oct 9, 2014, at 12:11 PM, Guillau
Hi,
Thanks to your answer, we've found the problem. The cause was reverse IP
resolution on the drivers we used (a misconfigured local bind9). Apparently,
the failure to reverse-resolve the nodes' IP addresses was what caused the
10s delay.
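
For anyone who wants to check for the same problem, here is a rough sketch of a reverse-lookup timing check in Python. It is not part of Spark; the function names and the one-second limit are assumptions made for illustration. The only library call it relies on is the standard `socket.gethostbyaddr`.

```python
import socket
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Reverse-resolve ip and return (hostname or None, elapsed seconds)."""
    start = time.perf_counter()
    try:
        # gethostbyaddr returns (hostname, aliases, addresses).
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        hostname = None
    elapsed = time.perf_counter() - start
    return hostname, elapsed


def slow_or_failed(ips, max_seconds=1.0, resolver=socket.gethostbyaddr):
    """Return the addresses whose reverse lookup failed or took too long."""
    suspects = []
    for ip in ips:
        hostname, elapsed = time_reverse_lookup(ip, resolver)
        if hostname is None or elapsed > max_seconds:
            suspects.append(ip)
    return suspects
```

Running `slow_or_failed` on the driver host with the list of node IP addresses should return an empty list when reverse DNS is healthy; any address it returns is worth looking up by hand.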
We've hit two other secondary probl
Maybe there is a firewall issue that makes it slow for your nodes to connect
through the IP addresses they're configured with. I see there's this 10 second
pause between "Updated info of block broadcast_84_piece1" and
"ensureFreeSpace(4194304) called" (where it actually receives the block). HTTP
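
A pause like the one described above can be spotted mechanically. The sketch below is only an illustration, not an existing tool: it assumes every log line begins with a `yy/MM/dd HH:mm:ss` timestamp, and the function name and the five-second default are made up for the example.

```python
import re
from datetime import datetime

# Assumption: log lines begin with a "yy/MM/dd HH:mm:ss" timestamp.
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s")


def find_gaps(lines, min_gap_seconds=5.0):
    """Yield (gap_seconds, previous_line, current_line) for long pauses."""
    previous_time = None
    previous_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a timestamp, e.g. stack traces
        current_time = datetime.strptime(match.group(1), "%y/%m/%d %H:%M:%S")
        current_line = line.rstrip("\n")
        if previous_time is not None:
            gap = (current_time - previous_time).total_seconds()
            if gap >= min_gap_seconds:
                yield gap, previous_line, current_line
        previous_time = current_time
        previous_line = current_line
```

Passing an open log file to `find_gaps` yields each pair of consecutive timestamped lines that are at least `min_gap_seconds` apart, which makes it easy to see which two messages bracket the delay.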
Could you create a JIRA for it? Maybe it's a regression after
https://issues.apache.org/jira/browse/SPARK-3119.
We would appreciate it if you could tell us how to reproduce it.
On Mon, Oct 6, 2014 at 1:27 AM, Guillaume Pitel
wrote:
Hi,
I've had no answer to this on u...@spark.apache.org, so I'm posting it on dev
before filing a JIRA (in case the problem or solution is already identified).
We've had some performance issues since switching to 1.1.0, and we finally found
the cause: TorrentBroadcast seems to be very slow in our