Just as a general note, especially when dealing with VMs, NTP is *required*; 
otherwise we've found you can get out-of-sync updates, which can result in 
some pretty strange behavior. 
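
A minimal sketch of what that usually looks like on Ubuntu guests (the package 
and service names here are assumptions for a stock install; adjust for your 
distro): 

    sudo apt-get install ntp    # install the NTP daemon on every VM
    sudo service ntp restart    # make sure the daemon is actually running
    ntpq -p                     # confirm the guest is syncing against peers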

----- Original Message -----

> From: "Brian Devins" <[email protected]>
> To: [email protected]
> Sent: Wednesday, October 15, 2014 11:57:19 AM
> Subject: Re: Connecting spark from a different Machine to mesos cluster

> Also Johannes, is there a network segment between Spark and the Mesos master?
> This looks like behavior I have seen before when the master cannot connect
> back to the framework. The master also needs to be able to reach the Spark
> machine by IP.
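> In practice that means starting the shell so the driver binds to, and
> advertises, an address the master can route back to. A rough sketch (the
> SPARK_LOCAL_IP hint is the one Spark itself logs below; LIBPROCESS_IP for the
> Mesos scheduler side and the exact launch command are assumptions):
> 
> export SPARK_LOCAL_IP=CLIENT_IP   # address the Spark driver binds to and advertises
> export LIBPROCESS_IP=CLIENT_IP    # address the Mesos scheduler driver binds to
> ./bin/spark-shell --master mesos://MESOS_MASTER_IP:5050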

> From: Tim Chen < [email protected] >
> Reply-To: " [email protected] " < [email protected] >
> Date: Wednesday, October 15, 2014 at 12:52 PM
> To: " [email protected] " < [email protected] >
> Subject: Re: Connecting spark from a different Machine to mesos cluster

> Hi Johannes,

> When you started your 2nd shell, what log output from the slave do you see
> for that framework?

> Master seems to think it's already terminated.

> Tim

> On Wed, Oct 15, 2014 at 6:31 AM, Johannes Schillinger (Intern) <
> [email protected] > wrote:

> > Hi Tim,
> 

> > We are running Spark 1.1.0 with Hadoop 2.4; Mesos is at version 0.20.1. All
> > are binary releases.
> 

> > The Spark console is running in default mode, which is fine-grained.
> 

> > The Spark process is started from a physical machine running Ubuntu; the
> > Mesos nodes are running in VMs, also on Ubuntu.
> 

> > This is the output of the Spark Shell:
> 

> > --------------------------------------------------------------------------------------------------------------------------------
> 

> > Spark assembly has been built with Hive, including Datanucleus jars on
> > classpath
> 

> > Using Spark's default log4j profile:
> > org/apache/spark/log4j-defaults.properties
> 

> > 14/10/15 15:18:24 INFO SecurityManager: Changing view acls to: USERNAME,
> 

> > 14/10/15 15:18:24 INFO SecurityManager: Changing modify acls to: USERNAME,
> 

> > 14/10/15 15:18:24 INFO SecurityManager: SecurityManager: authentication
> > disabled; ui acls disabled; users with view permissions: Set(USERNAME, );
> > users with modify permissions: Set(USERNAME, )
> 

> > 14/10/15 15:18:24 INFO HttpServer: Starting HTTP Server
> 

> > 14/10/15 15:18:24 INFO Utils: Successfully started service 'HTTP class
> > server' on port 42469.
> 

> > Welcome to
> > 
> >       ____              __
> >      / __/__  ___ _____/ /__
> >     _\ \/ _ \/ _ `/ __/ '_/
> >    /___/ .__/\_,_/_/ /_/\_\   version 1.1.0
> >       /_/
> 

> > Using Scala version 2.10.4 (OpenJDK 64-Bit Server VM, Java 1.7.0_65)
> 

> > Type in expressions to have them evaluated.
> 

> > Type :help for more information.
> 

> > 14/10/15 15:18:26 WARN Utils: Your hostname, karwjohannes01 resolves to a
> > loopback address: 127.0.1.1; using CLIENT_IP instead (on interface eth0)
> 

> > 14/10/15 15:18:26 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to
> > another address
> 

> > 14/10/15 15:18:27 INFO SecurityManager: Changing view acls to: USERNAME,
> 

> > 14/10/15 15:18:27 INFO SecurityManager: Changing modify acls to: USERNAME,
> 

> > 14/10/15 15:18:27 INFO SecurityManager: SecurityManager: authentication
> > disabled; ui acls disabled; users with view permissions: Set(USERNAME, );
> > users with modify permissions: Set(USERNAME, )
> 

> > 14/10/15 15:18:27 INFO Slf4jLogger: Slf4jLogger started
> 

> > 14/10/15 15:18:27 INFO Remoting: Starting remoting
> 

> > 14/10/15 15:18:27 INFO Remoting: Remoting started; listening on addresses
> > :[akka.tcp://sparkDriver@CLIENT_IP:51879]
> 

> > 14/10/15 15:18:27 INFO Remoting: Remoting now listens on addresses:
> > [akka.tcp://sparkDriver@CLIENT_IP:51879]
> 

> > 14/10/15 15:18:27 INFO Utils: Successfully started service 'sparkDriver' on
> > port 51879.
> 

> > 14/10/15 15:18:27 INFO SparkEnv: Registering MapOutputTracker
> 

> > 14/10/15 15:18:27 INFO SparkEnv: Registering BlockManagerMaster
> 

> > 14/10/15 15:18:27 INFO DiskBlockManager: Created local directory at
> > /tmp/spark-local-20141015151827-1a2e
> 

> > 14/10/15 15:18:27 INFO Utils: Successfully started service 'Connection
> > manager for block manager' on port 60963.
> 

> > 14/10/15 15:18:27 INFO ConnectionManager: Bound socket to port 60963 with
> > id
> > = ConnectionManagerId(CLIENT_IP,60963)
> 

> > 14/10/15 15:18:27 INFO MemoryStore: MemoryStore started with capacity 265.4
> > MB
> 

> > 14/10/15 15:18:27 INFO BlockManagerMaster: Trying to register BlockManager
> 

> > 14/10/15 15:18:27 INFO BlockManagerMasterActor: Registering block manager
> > CLIENT_IP:60963 with 265.4 MB RAM
> 

> > 14/10/15 15:18:27 INFO BlockManagerMaster: Registered BlockManager
> 

> > 14/10/15 15:18:27 INFO HttpFileServer: HTTP File server directory is
> > /tmp/spark-b032c76c-93e1-473e-802c-c55e12e85d41
> 

> > 14/10/15 15:18:27 INFO HttpServer: Starting HTTP Server
> 

> > 14/10/15 15:18:27 INFO Utils: Successfully started service 'HTTP file
> > server'
> > on port 47989.
> 

> > 14/10/15 15:18:27 INFO Utils: Successfully started service 'SparkUI' on
> > port
> > 4040.
> 

> > 14/10/15 15:18:27 INFO SparkUI: Started SparkUI at http://CLIENT_IP:4040
> 

> > 14/10/15 15:18:27 WARN NativeCodeLoader: Unable to load native-hadoop
> > library
> > for your platform... using builtin-java classes where applicable
> 

> > I1015 15:18:28.524736 4748 sched.cpp:139] Version: 0.20.1
> 

> > I1015 15:18:28.527180 4750 sched.cpp:235] New master detected at
> > master@MESOS_MASTER_IP:5050
> 

> > I1015 15:18:28.527300 4750 sched.cpp:243] No credentials provided.
> > Attempting
> > to register without authentication
> 

> > --------------------------------------------------------------------------------------------------------------------------------
> 

> > Mesos master WARNING log:
> 

> > W1015 14:13:00.235213 1118 master.cpp:3452] Master returning resources
> > offered to framework 20141007-102213-343139338-5050-1037-3490 because the
> > framework has terminated or is inactive
> 

> > W1015 14:13:35.244055 1121 master.cpp:3452] Master returning resources
> > offered to framework 20141007-102213-343139338-5050-1037-3525 because the
> > framework has terminated or is inactive
> 

> > W1015 14:13:50.252436 1121 master.cpp:3452] Master returning resources
> > offered to framework 20141007-102213-343139338-5050-1037-3540 because the
> > framework has terminated or is inactive
> 

> > W1015 14:14:05.252708 1117 master.cpp:3452] Master returning resources
> > offered to framework 20141007-102213-343139338-5050-1037-3555 because the
> > framework has terminated or is inactive
> 

> > Mesos slave WARNING log:
> 

> > W1015 13:58:19.103196 1211 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3116
> 

> > W1015 13:58:20.104650 1210 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3117
> 

> > W1015 13:58:21.119839 1211 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3118
> 

> > W1015 13:58:22.115965 1210 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3119
> 

> > W1015 13:58:23.104925 1211 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3120
> 

> > W1015 13:58:24.104652 1210 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3121
> 

> > W1015 13:58:59.853744 1212 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3122
> 

> > W1015 13:59:00.853086 1214 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3123
> 

> > W1015 13:59:01.853137 1212 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3124
> 

> > W1015 13:59:03.318259 1214 slave.cpp:1421] Cannot shut down unknown
> > framework
> > 20141007-102213-343139338-5050-1037-3029
> 

> > I hope this information helps; please ask if you have any more questions,
> > and thank you for your help!
> 

> > Johannes
> 

> > From: Tim St Clair [mailto: [email protected] ]
> 
> > Sent: Wednesday, October 15, 2014 15:11
> 
> > To: [email protected]
> 
> > Subject: Re: Connecting spark from a different Machine to mesos cluster
> 

> > Details?
> 

> > 1. What versions are you running?
> 

> > 2. Fine-grained mode or coarse-grained?
> 

> > 3. Are you running in VMs?
> 

> > Logs always help too.
> 

> > Cheers,
> 

> > Tim
> 

> > > From: "Johannes Schillinger (Intern)" < [email protected] >
> > 
> 
> > > To: [email protected]
> > 
> 
> > > Sent: Wednesday, October 15, 2014 7:42:36 AM
> > 
> 
> > > Subject: Connecting spark from a different Machine to mesos cluster
> > 
> 

> > > Hi,
> > 
> 

> > > we are currently trying to get a mesos cluster running as a base for
> > > Spark.
> > 
> 

> > > The Mesos cluster itself runs, and connecting a Spark shell from the
> > > machine the master runs on works perfectly.
> > 
> 

> > > We can see the Framework being started and the slaves working.
> > 
> 

> > > If we try to connect the exact same shell from a different machine to the
> > > exact same cluster, the screen stops at
> > 
> 

> > > … 4013 sched.cpp:243] No credentials provided. Attempting to register
> > > without
> > > authentication
> > 
> 

> > > The cluster spins up a framework every two seconds with a new ID and
> > > stops
> > > it
> > > immediately. This continues (we stopped it after a few dozen starts).
> > 
> 

> > > We can see the frameworks being started in the master and slave logs, as
> > > well as the master's command to terminate them.
> > 
> 

> > > Has anyone ever encountered a similar problem, or does anyone have advice
> > > on solving it?
> > 
> 

> > > Thanks!
> > 
> 

> > > Johannes
> > 
> 
> > --
> 

> > Cheers,
> 
> > Timothy St. Clair
> 
> > Red Hat Inc.
> 

> Brian Devins | Java Developer
> [email protected]

-- 
Cheers, 
Timothy St. Clair 
Red Hat Inc. 
