Re: Is executor computing time affected by network latency?

Peter Figliozzi Fri, 23 Sep 2016 13:24:43 -0700

See the reference on shuffles
<http://people.apache.org/~pwendell/spark-nightly/spark-master-docs/latest/programming-guide.html#shuffle-operations>,
which describes the shuffle as "Spark’s mechanism for re-distributing data so that it’s grouped
differently across partitions. This typically involves copying data across
executors and machines, making the shuffle a complex and costly operation."
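
To make the quoted description concrete, here is a minimal sketch in plain Python (not Spark code) of what regrouping by key does to partitioned data. The names shuffle_by_key and map_values and the modulo partitioner are hypothetical stand-ins chosen for illustration; they are not Spark APIs, and a real shuffle also involves serialization, disk spills, and network fetches. The count of moved records stands in for data that would have to be copied between executors, which is the part that is sensitive to network latency.

```python
from collections import defaultdict


def shuffle_by_key(partitions, num_partitions):
    """Regroup (key, value) records so each key lands in exactly one partition.

    Returns the regrouped partitions and the number of records that changed
    partition, a stand-in for data copied between executors.
    """
    new_partitions = [defaultdict(list) for _ in range(num_partitions)]
    moved = 0
    for source_index, records in enumerate(partitions):
        for key, value in records:
            # Toy partitioner for integer keys (assumption, not Spark's own).
            target_index = key % num_partitions
            if target_index != source_index:
                moved += 1
            new_partitions[target_index][key].append(value)
    return [dict(p) for p in new_partitions], moved


def map_values(partitions, fn):
    """A per-record transformation: every record stays in its partition."""
    return [[(k, fn(v)) for k, v in records] for records in partitions]


# Two input partitions, as if held by two different executors.
partitions = [
    [(0, "a"), (1, "b"), (2, "c")],
    [(0, "d"), (1, "e"), (3, "f")],
]

grouped, moved = shuffle_by_key(partitions, num_partitions=2)
print(grouped)
print("records moved:", moved)
```

In this toy model, map_values never moves a record, while grouping by key moves every record whose key belongs to another partition. That is consistent with the point made further down the thread that the effect of latency depends on which operations the job performs.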

On Thu, Sep 22, 2016 at 4:14 PM, Soumitra Johri <
[email protected]> wrote:

> If your job involves a shuffle, then the compute time for the entire batch will
> increase with network latency. It would be interesting to see how much
> time each task/job/stage takes.
>
> On Thu, Sep 22, 2016 at 5:11 PM Peter Figliozzi <[email protected]>
> wrote:
>
>> It seems to me they must communicate for joins, sorts, grouping, and so
>> forth, where the original data partitioning needs to change.  You could
>> repeat your experiment for different code snippets.  I'll bet it depends on
>> what you do.
>>
>> On Thu, Sep 22, 2016 at 8:54 AM, gusiri <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> When I increase the network latency among Spark nodes, I see that compute
>>> time (= executor computing time in the Spark Web UI) also increases.
>>>
>>> In the attached graph, left = 1 ms latency and right = 500 ms latency.
>>>
>>> Is there any communication between the workers and the driver/master even
>>> 'during' executor computing? Or does anyone have an idea what explains
>>> this result?
>>>
>>>
>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n27779/Screen_Shot_2016-09-21_at_5.png>
>>>
>>>
>>>
>>>
>>>
>>> Thank you very much in advance.
>>>
>>> //gusiri
>>
