Sasi created SPARK-17607:
----------------------------

             Summary: --driver-url doesn't point to my master_ip.
                 Key: SPARK-17607
                 URL: https://issues.apache.org/jira/browse/SPARK-17607
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.5.2
            Reporter: Sasi
            Priority: Critical


Hi,

I have a master machine and a slave machine.
The master machine has two network interfaces: the first has the IP 10.5.5.2
and the second has the IP 10.0.42.230.
I configured MASTER_IP to be 10.5.5.2, so once the master and its worker come
up I see the following INFO lines:
{code}
16/09/20 12:32:32 INFO Worker: Successfully registered with master spark://10.5.5.2:7077
16/09/20 12:39:15 INFO Worker: Asked to launch executor app-20160920123915-0000/0 for Spark-DataAccessor-JBoss
{code}

I set SPARK_LOCAL_IP on each worker to its own IP, e.g. 10.5.5.5.

Both variables were configured in spark-env.sh.
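
For reference, a minimal sketch of the spark-env.sh entries described above. It assumes that MASTER_IP refers to the standard SPARK_MASTER_IP variable; the exact variable name used in the original files is not shown in this report, and the master and worker entries live in separate files on separate hosts.

```shell
# Sketch only: these entries live in conf/spark-env.sh on different hosts.

# On the master host. SPARK_MASTER_IP is an assumption about what
# "MASTER_IP" means in this report.
export SPARK_MASTER_IP=10.5.5.2

# On a worker host, here the worker whose own address is 10.5.5.5.
export SPARK_LOCAL_IP=10.5.5.5
```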

The problem started when I tried to get data from my workers.
Each worker log then contained the following INFO line:
{code}
"--driver-url" "akka.tcp://sparkDriver@10.0.42.230:43683/user/CoarseGrainedScheduler"
{code}

As you can see, the master IP differs from the driver-url IP: the master IP is
10.5.5.2 but the driver-url points to 10.0.42.230. As a result, I'm getting the
following errors:

{code}
16/09/20 12:17:57 INFO Slf4jLogger: Slf4jLogger started
16/09/20 12:17:57 INFO Remoting: Starting remoting
16/09/20 12:17:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@10.5.5.5:34961]
16/09/20 12:17:57 INFO Utils: Successfully started service 'driverPropsFetcher' on port 34961.
16/09/20 12:19:00 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@10.0.42.230:36711] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://sparkDriver@10.0.42.230:36711]] Caused by: [Connection timed out: /10.0.42.230:36711]
Exception in thread "main" akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@10.0.42.230:36711/), Path(/user/CoarseGrainedScheduler)]
        at
{code}
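
To make the mismatch concrete, here is a small shell sketch that extracts the host from the driver URL seen in the worker log and compares it with the master IP. The variable names are illustrative only; the values are copied from this report.

```shell
#!/bin/sh
# Values copied from the report; variable names are illustrative only.
driver_url='akka.tcp://sparkDriver@10.0.42.230:43683/user/CoarseGrainedScheduler'
master_ip='10.5.5.2'

# Drop everything up to and including '@', then everything from the first ':'.
driver_host=${driver_url#*@}
driver_host=${driver_host%%:*}

if [ "$driver_host" = "$master_ip" ]; then
    result="match"
else
    result="mismatch"
fi

echo "driver host: $driver_host, master: $master_ip -> $result"
```

With the values above, the check reports a mismatch, which is consistent with the connection timeouts to 10.0.42.230 in the worker log.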

The master is listening and accepting connections on 10.5.5.2, not on
10.0.42.230.

It looks like the driver-url ignores the configured MASTER_IP.
Thanks,
Sasi


