[ 
https://issues.apache.org/jira/browse/SPARK-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230733#comment-15230733
 ] 

Kevin Hogeland commented on SPARK-14437:
----------------------------------------

Yes, {{BlockManager#doGetRemote}} requests the executor block manager address 
from the driver and attempts to fetch from it:

{code}
16/04/06 18:18:30 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 6778.0 (TID 7351, ip-172-16-15-0.us-west-2.compute.internal): org.apache.spark.storage.BlockFetchException: Failed to fetch block from 1 locations. Most recent failure cause:
        at org.apache.spark.storage.BlockManager$$anonfun$doGetRemote$2.apply(BlockManager.scala:595)
        at org.apache.spark.storage.BlockManager$$anonfun$doGetRemote$2.apply(BlockManager.scala:585)
...
Caused by: java.io.IOException: Failed to connect to ip-172-16-14-0.us-west-2.compute.internal/172.16.14.0:33203
{code}

> Spark using Netty RPC gets wrong address in some setups
> -------------------------------------------------------
>
>                 Key: SPARK-14437
>                 URL: https://issues.apache.org/jira/browse/SPARK-14437
>             Project: Spark
>          Issue Type: Bug
>          Components: Block Manager, Spark Core
>    Affects Versions: 1.6.0, 1.6.1
>         Environment: AWS, Docker, Flannel
>            Reporter: Kevin Hogeland
>
> Netty can't get the correct origin address in certain network setups. Spark 
> should handle this, as relying on Netty correctly reporting all addresses 
> leads to incompatible and unpredictable network states. We're currently using 
> Docker with Flannel on AWS. Container communication looks something like: 
> {{Container 1 (1.2.3.1) -> Docker host A (1.2.3.0) -> Docker host B (4.5.6.0) 
> -> Container 2 (4.5.6.1)}}
> If the client in that setup is Container 1 (1.2.3.1), Netty channels from 
> there to Container 2 will have a client address of 1.2.3.0.
> The {{RequestMessage}} object that is sent over the wire already contains a 
> {{senderAddress}} field that the sender can use to specify their address. In 
> {{NettyRpcEnv#internalReceive}}, this is replaced with the Netty client 
> socket address when null. {{senderAddress}} in the messages sent from the 
> executors is currently always null, meaning all messages will have these 
> incorrect addresses (we've switched back to Akka as a temporary workaround 
> for this). The executor should send its address explicitly so that the driver 
> doesn't attempt to infer addresses based on possibly incorrect information 
> from Netty.
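
The fallback described above can be sketched roughly as follows. This is an illustrative model in Python, not Spark's actual code: the names {{RpcAddress}}, {{RequestMessage}} and {{resolve_sender}} and the port numbers are hypothetical, and only the behaviour (use the declared sender address, otherwise fall back to the socket's peer address) is taken from the description. The addresses follow the example topology.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RpcAddress:
    host: str
    port: int


@dataclass(frozen=True)
class RequestMessage:
    # None models the null senderAddress that executors currently send.
    sender_address: Optional[RpcAddress]
    content: str


def resolve_sender(message: RequestMessage, socket_peer: RpcAddress) -> RpcAddress:
    """Prefer the address the sender declared; otherwise use the socket peer address."""
    if message.sender_address is not None:
        return message.sender_address
    return socket_peer


# Hypothetical ports; hosts are taken from the example topology.
container_1 = RpcAddress("1.2.3.1", 33203)    # the real sender
docker_host_a = RpcAddress("1.2.3.0", 45678)  # what the receiving socket reports

# With no declared address, the receiver records the Docker host.
implicit = resolve_sender(RequestMessage(None, "register"), docker_host_a)

# With an explicit address, the receiver records the container itself.
explicit = resolve_sender(RequestMessage(container_1, "register"), docker_host_a)

print(implicit.host)
print(explicit.host)
```

In this model the first call yields the Docker host address and the second yields the container address, which is the difference the proposed fix relies on.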



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
