Hi Johannes,

When you started your 2nd shell, what log output from the slave do you see for that framework?
Master seems to think it's already terminated.
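One thing that jumps out of your shell output is this pair of warnings:

    14/10/15 15:18:26 WARN Utils: Your hostname, karwjohannes01 resolves to a loopback address: 127.0.1.1; using CLIENT_IP instead (on interface eth0)
    14/10/15 15:18:26 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address

If the address the driver advertises isn't reachable from the master and slaves, the scheduler can still register, but the framework never gets anywhere, which would look a lot like what you describe. Purely as a sketch, not a confirmed fix (CLIENT_IP and MESOS_MASTER_IP are the placeholders from your logs, the object name is just illustrative, and this is a standalone app rather than the shell), something like the following pins the driver address explicitly and runs a trivial job to see whether executors ever come up:

    import org.apache.spark.{SparkConf, SparkContext}

    object MesosConnectivityCheck {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setMaster("mesos://MESOS_MASTER_IP:5050")  // same master URL the shell uses
          .setAppName("mesos-connectivity-check")
          // Advertise an address the Mesos master and slaves can actually reach;
          // otherwise the driver may fall back to the loopback-ish hostname above.
          .set("spark.driver.host", "CLIENT_IP")

        val sc = new SparkContext(conf)
        // Trivial job: if this prints a count, executors registered and the framework stayed alive.
        println(sc.parallelize(1 to 1000).count())
        sc.stop()
      }
    }

For the shell itself, exporting SPARK_LOCAL_IP=CLIENT_IP before launching (as the warning suggests) should have the same effect, and I believe libprocess honours LIBPROCESS_IP the same way on the Mesos side.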
Tim

On Wed, Oct 15, 2014 at 6:31 AM, Johannes Schillinger (Intern) <[email protected]> wrote:

> Hi Tim,
>
> We are running Spark 1.1.0 with Hadoop 2.4. Mesos is version 0.20.1, all in binary releases.
>
> The Spark console is running in default mode, which is fine grained.
>
> The Spark process is started from a physical machine running Ubuntu; the Mesos nodes are running in VMs, also on Ubuntu.
>
> This is the output of the Spark shell:
>
> --------------------------------------------------------------------------------------------------------------------------------
> Spark assembly has been built with Hive, including Datanucleus jars on classpath
> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> 14/10/15 15:18:24 INFO SecurityManager: Changing view acls to: USERNAME,
> 14/10/15 15:18:24 INFO SecurityManager: Changing modify acls to: USERNAME,
> 14/10/15 15:18:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(USERNAME, ); users with modify permissions: Set(USERNAME, )
> 14/10/15 15:18:24 INFO HttpServer: Starting HTTP Server
> 14/10/15 15:18:24 INFO Utils: Successfully started service 'HTTP class server' on port 42469.
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/ '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 1.1.0
>       /_/
>
> Using Scala version 2.10.4 (OpenJDK 64-Bit Server VM, Java 1.7.0_65)
> Type in expressions to have them evaluated.
> Type :help for more information.
> 14/10/15 15:18:26 WARN Utils: Your hostname, karwjohannes01 resolves to a loopback address: 127.0.1.1; using CLIENT_IP instead (on interface eth0)
> 14/10/15 15:18:26 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
> 14/10/15 15:18:27 INFO SecurityManager: Changing view acls to: USERNAME,
> 14/10/15 15:18:27 INFO SecurityManager: Changing modify acls to: USERNAME,
> 14/10/15 15:18:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(USERNAME, ); users with modify permissions: Set(USERNAME, )
> 14/10/15 15:18:27 INFO Slf4jLogger: Slf4jLogger started
> 14/10/15 15:18:27 INFO Remoting: Starting remoting
> 14/10/15 15:18:27 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@CLIENT_IP:51879]
> 14/10/15 15:18:27 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@CLIENT_IP:51879]
> 14/10/15 15:18:27 INFO Utils: Successfully started service 'sparkDriver' on port 51879.
> 14/10/15 15:18:27 INFO SparkEnv: Registering MapOutputTracker
> 14/10/15 15:18:27 INFO SparkEnv: Registering BlockManagerMaster
> 14/10/15 15:18:27 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20141015151827-1a2e
> 14/10/15 15:18:27 INFO Utils: Successfully started service 'Connection manager for block manager' on port 60963.
> 14/10/15 15:18:27 INFO ConnectionManager: Bound socket to port 60963 with id = ConnectionManagerId(CLIENT_IP,60963)
> 14/10/15 15:18:27 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
> 14/10/15 15:18:27 INFO BlockManagerMaster: Trying to register BlockManager
> 14/10/15 15:18:27 INFO BlockManagerMasterActor: Registering block manager CLIENT_IP:60963 with 265.4 MB RAM
> 14/10/15 15:18:27 INFO BlockManagerMaster: Registered BlockManager
> 14/10/15 15:18:27 INFO HttpFileServer: HTTP File server directory is /tmp/spark-b032c76c-93e1-473e-802c-c55e12e85d41
> 14/10/15 15:18:27 INFO HttpServer: Starting HTTP Server
> 14/10/15 15:18:27 INFO Utils: Successfully started service 'HTTP file server' on port 47989.
> 14/10/15 15:18:27 INFO Utils: Successfully started service 'SparkUI' on port 4040.
> 14/10/15 15:18:27 INFO SparkUI: Started SparkUI at http://CLIENT_IP:4040
> 14/10/15 15:18:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> I1015 15:18:28.524736 4748 sched.cpp:139] Version: 0.20.1
> I1015 15:18:28.527180 4750 sched.cpp:235] New master detected at master@MESOS_MASTER_IP:5050
> I1015 15:18:28.527300 4750 sched.cpp:243] No credentials provided. Attempting to register without authentication
> --------------------------------------------------------------------------------------------------------------------------------
>
> Mesos master WARNING log:
> W1015 14:13:00.235213 1118 master.cpp:3452] Master returning resources offered to framework 20141007-102213-343139338-5050-1037-3490 because the framework has terminated or is inactive
> W1015 14:13:35.244055 1121 master.cpp:3452] Master returning resources offered to framework 20141007-102213-343139338-5050-1037-3525 because the framework has terminated or is inactive
> W1015 14:13:50.252436 1121 master.cpp:3452] Master returning resources offered to framework 20141007-102213-343139338-5050-1037-3540 because the framework has terminated or is inactive
> W1015 14:14:05.252708 1117 master.cpp:3452] Master returning resources offered to framework 20141007-102213-343139338-5050-1037-3555 because the framework has terminated or is inactive
>
> Mesos slave WARNING log:
> W1015 13:58:19.103196 1211 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3116
> W1015 13:58:20.104650 1210 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3117
> W1015 13:58:21.119839 1211 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3118
> W1015 13:58:22.115965 1210 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3119
> W1015 13:58:23.104925 1211 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3120
> W1015 13:58:24.104652 1210 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3121
> W1015 13:58:59.853744 1212 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3122
> W1015 13:59:00.853086 1214 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3123
> W1015 13:59:01.853137 1212 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3124
> W1015 13:59:03.318259 1214 slave.cpp:1421] Cannot shut down unknown framework 20141007-102213-343139338-5050-1037-3029
>
> I hope this information helps, please ask if you have any more questions, and thank you for your help!
>
> Johannes
>
> From: Tim St Clair [mailto:[email protected]]
> Sent: Wednesday, 15 October 2014 15:11
> To: [email protected]
> Subject: Re: Connecting spark from a different Machine to mesos cluster
>
> Details?
>
> 1. What versions are you running?
> 2. Fine grained mode or Coarse Grained?
> 3. Are you running in VMs?
>
> Logs always help too.
>
> Cheers,
> Tim
>
> ------------------------------
>
> From: "Johannes Schillinger (Intern)" <[email protected]>
> To: [email protected]
> Sent: Wednesday, October 15, 2014 7:42:36 AM
> Subject: Connecting spark from a different Machine to mesos cluster
>
> Hi,
>
> we are currently trying to get a Mesos cluster running as a base for Spark.
>
> The Mesos cluster itself runs, and connecting a Spark shell from the machine the master runs on works perfectly.
> We can see the framework being started and the slaves working.
>
> If we try to connect the exact same shell from a different machine to the exact same cluster, the screen stops at
>
> … 4013 sched.cpp:243] No credentials provided. Attempting to register without authentication
>
> The cluster spins up a framework every two seconds with a new ID and stops it immediately. This continues (we stopped it after a few dozen starts).
>
> We can see the frameworks being started in the master and slave logs, as well as the command of the master to terminate it.
>
> Has anyone ever encountered a similar problem, or has any advice on solving it?
>
> Thanks!
> Johannes
>
> --
> Cheers,
> Timothy St. Clair
> Red Hat Inc.

