Re: Configuration Problem? (need help to get Spark job executed)

2015-02-17 Thread Arush Kharbanda
Hi

It could be due to a connectivity issue between the master and the slaves.

I have seen this issue occur for the following reasons. Are the slaves
visible in the Spark UI? And how much memory is allocated to the executors?
(A small sketch of the relevant settings follows the list below.)

1. Syncing of configuration between Spark Master and Slaves.
2. Network connectivity issues between the master and slave.
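
For the memory point, a minimal sketch of sizing the executors explicitly from
the driver side - the values are illustrative assumptions, not tuned for this
cluster, and the same keys can also go into conf/spark-defaults.conf:

import org.apache.spark.SparkConf

val conf = new SparkConf(true)
  // Per-executor heap to request from the workers ("2g" is only an example).
  .set("spark.executor.memory", "2g")
  // Optional cap on the total number of cores the application asks for.
  .set("spark.cores.max", "4")

One common cause of the "Initial job has not accepted any resources" warning
is an application requesting more memory or cores per executor than any
registered worker can offer.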

Thanks
Arush

On Sat, Feb 14, 2015 at 3:07 PM, NORD SC  wrote:

> Hi all,
>
> I am new to Spark and seem to have hit a common newbie obstacle.
>
> I have a pretty simple setup and job but I am unable to get past this
> error when executing a job:
>
> "TaskSchedulerImpl: Initial job has not accepted any resources; check your
> cluster UI to ensure that workers are registered and have sufficient memory"
>
> I have so far gained a basic understanding of worker/executor/driver
> memory, but have run out of ideas about what to try next - maybe someone
> has a clue.
>
>
> My setup:
>
> Three-node standalone cluster with C* and Spark on each node, and the
> Datastax C*/Spark connector JAR placed on each node.
>
> On the master I have the slaves configured in conf/slaves and I am using
> sbin/start-all.sh to start the whole cluster.
>
> On each node I have this in conf/spark-defaults.conf:
>
> spark.master             spark://devpeng-db-cassandra-1:7077
> spark.eventLog.enabled   true
> spark.serializer org.apache.spark.serializer.KryoSerializer
>
> spark.executor.extraClassPath /opt/spark-cassandra-connector-assembly-1.2.0-alpha1.jar
>
> and this in conf/spark-env.sh:
>
> SPARK_WORKER_MEMORY=6g
>
>
>
> My app looks like this:
>
> import org.apache.spark.{SparkConf, SparkContext}
> import com.datastax.spark.connector._  // provides sc.cassandraTable
>
> object TestApp extends App {
>   val conf = new SparkConf(true).set("spark.cassandra.connection.host",
>     "devpeng-db-cassandra-1.")
>   val sc = new SparkContext("spark://devpeng-db-cassandra-1:7077",
>     "testApp", conf)
>   val rdd = sc.cassandraTable("test", "kv")
>   println("Count: " + String.valueOf(rdd.count))
>   println(rdd.first)
> }
>
> Any idea of what to check next would help me at this point, I think.
>
> Jan
>
> Log of the application start:
>
> [info] Loading project definition from
> /Users/jan/projects/gkh/jump/workspace/gkh-spark-example/project
> [info] Set current project to csconnect (in build
> file:/Users/jan/projects/gkh/jump/workspace/gkh-spark-example/)
> [info] Compiling 1 Scala source to
> /Users/jan/projects/gkh/jump/workspace/gkh-spark-example/target/scala-2.10/classes...
> [info] Running jump.TestApp
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
> 15/02/14 10:30:11 INFO SecurityManager: Changing view acls to: jan
> 15/02/14 10:30:11 INFO SecurityManager: Changing modify acls to: jan
> 15/02/14 10:30:11 INFO SecurityManager: SecurityManager: authentication
> disabled; ui acls disabled; users with view permissions: Set(jan); users
> with modify permissions: Set(jan)
> 15/02/14 10:30:11 INFO Slf4jLogger: Slf4jLogger started
> 15/02/14 10:30:11 INFO Remoting: Starting remoting
> 15/02/14 10:30:12 INFO Remoting: Remoting started; listening on addresses
> :[akka.tcp://sparkDriver@xx:58197]
> 15/02/14 10:30:12 INFO Utils: Successfully started service 'sparkDriver'
> on port 58197.
> 15/02/14 10:30:12 INFO SparkEnv: Registering MapOutputTracker
> 15/02/14 10:30:12 INFO SparkEnv: Registering BlockManagerMaster
> 15/02/14 10:30:12 INFO DiskBlockManager: Created local directory at
> /var/folders/vr/w3whx92d0356g5nj1p6s59grgn/T/spark-local-20150214103012-5b53
> 15/02/14 10:30:12 INFO MemoryStore: MemoryStore started with capacity
> 530.3 MB
> 2015-02-14 10:30:12.304 java[24999:3b07] Unable to load realm info from
> SCDynamicStore
> 15/02/14 10:30:12 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 15/02/14 10:30:12 INFO HttpFileServer: HTTP File server directory is
> /var/folders/vr/w3whx92d0356g5nj1p6s59grgn/T/spark-48459a22-c1ff-42d5-8b8e-cc89fe84933d
> 15/02/14 10:30:12 INFO HttpServer: Starting HTTP Server
> 15/02/14 10:30:12 INFO Utils: Successfully started service 'HTTP file
> server' on port 58198.
> 15/02/14 10:30:12 INFO Utils: Successfully started service 'SparkUI' on
> port 4040.
> 15/02/14 10:30:12 INFO SparkUI: Started SparkUI at http://xx:4040
> 15/02/14 10:30:12 INFO AppClient$ClientActor: Connecting to master
> spark://devpeng-db-cassandra-1:7077...
> 15/02/14 10:30:13 INFO SparkDeploySchedulerBackend: Connected to Spark
> cluster with app ID app-20150214103013-0001
> 15/02/14 10:30:13 INFO AppClient$ClientActor: Executor added:
> app-20150214103013-0001/0 on
> worker-20150214102534-devpeng-db-cassandra-2.devpeng
> (devpeng-db-cassandra-2.devpeng.x:57563) with 8 cores
> 15/02/14 10:30:13 INFO SparkDeploySchedulerBackend: Granted executor ID
> app-20150214103013-0001/0 on hostPort
> devpeng-db-cassandra-2.devpeng.:57563 with 8 cores, 512.0 MB RAM
> 15/02/14 10:30:13 INFO AppClient$ClientActor: Executor added:
> app-20150214103013-0001/1 on

Configuration Problem? (need help to get Spark job executed)

2015-02-14 Thread NORD SC
Hi all,

I am new to Spark and seem to have hit a common newbie obstacle.

I have a pretty simple setup and job but I am unable to get past this error 
when executing a job:

"TaskSchedulerImpl: Initial job has not accepted any resources; check your 
cluster UI to ensure that workers are registered and have sufficient memory"

I have so far gained a basic understanding of worker/executor/driver memory,
but have run out of ideas about what to try next - maybe someone has a clue.


My setup:

Three-node standalone cluster with C* and Spark on each node, and the Datastax
C*/Spark connector JAR placed on each node.

On the master I have the slaves configured in conf/slaves and I am using 
sbin/start-all.sh to start the whole cluster.
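
(For reference, conf/slaves just lists the worker hostnames, one per line -
judging from the executor log further down, something like the two entries
below; whether the master host is also listed there cannot be told from the
log:

devpeng-db-cassandra-2
devpeng-db-cassandra-3
)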

On each node I have this in conf/spark-defaults.conf:

spark.master             spark://devpeng-db-cassandra-1:7077
spark.eventLog.enabled   true
spark.serializer org.apache.spark.serializer.KryoSerializer

spark.executor.extraClassPath /opt/spark-cassandra-connector-assembly-1.2.0-alpha1.jar

and this in conf/spark-env.sh:

SPARK_WORKER_MEMORY=6g
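
Note: SPARK_WORKER_MEMORY only caps how much memory a worker may hand out to
executors in total; what each application's executors actually get is
controlled by spark.executor.memory, which defaults to 512 MB - matching the
"512.0 MB RAM" visible in the executor log below. An illustrative
spark-defaults.conf line (the value is an assumption, not a recommendation):

spark.executor.memory    2g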



My app looks like this:

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._  // provides sc.cassandraTable

object TestApp extends App {
  val conf = new SparkConf(true).set("spark.cassandra.connection.host",
    "devpeng-db-cassandra-1.")
  val sc = new SparkContext("spark://devpeng-db-cassandra-1:7077", "testApp",
    conf)
  val rdd = sc.cassandraTable("test", "kv")
  println("Count: " + String.valueOf(rdd.count))
  println(rdd.first)
}
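
Since the driver is launched from sbt on a laptop (see the log below), the
executors on the cluster also need to be able to fetch the application's
classes. A minimal sketch of shipping the application jar via
SparkConf.setJars - the jar path is a hypothetical sbt-assembly artifact, not
something taken from this setup:

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object TestAppWithJar extends App {
  val conf = new SparkConf(true)
    .setMaster("spark://devpeng-db-cassandra-1:7077")
    .setAppName("testApp")
    .set("spark.cassandra.connection.host", "devpeng-db-cassandra-1.")
    // Hypothetical path to the assembled application jar.
    .setJars(Seq("target/scala-2.10/csconnect-assembly-0.1.jar"))
  val sc = new SparkContext(conf)
  println("Count: " + sc.cassandraTable("test", "kv").count)
  sc.stop()
}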

Any idea of what to check next would help me at this point, I think.

Jan

Log of the application start:

[info] Loading project definition from 
/Users/jan/projects/gkh/jump/workspace/gkh-spark-example/project
[info] Set current project to csconnect (in build 
file:/Users/jan/projects/gkh/jump/workspace/gkh-spark-example/)
[info] Compiling 1 Scala source to 
/Users/jan/projects/gkh/jump/workspace/gkh-spark-example/target/scala-2.10/classes...
[info] Running jump.TestApp 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/02/14 10:30:11 INFO SecurityManager: Changing view acls to: jan
15/02/14 10:30:11 INFO SecurityManager: Changing modify acls to: jan
15/02/14 10:30:11 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(jan); users with 
modify permissions: Set(jan)
15/02/14 10:30:11 INFO Slf4jLogger: Slf4jLogger started
15/02/14 10:30:11 INFO Remoting: Starting remoting
15/02/14 10:30:12 INFO Remoting: Remoting started; listening on addresses 
:[akka.tcp://sparkDriver@xx:58197]
15/02/14 10:30:12 INFO Utils: Successfully started service 'sparkDriver' on 
port 58197.
15/02/14 10:30:12 INFO SparkEnv: Registering MapOutputTracker
15/02/14 10:30:12 INFO SparkEnv: Registering BlockManagerMaster
15/02/14 10:30:12 INFO DiskBlockManager: Created local directory at 
/var/folders/vr/w3whx92d0356g5nj1p6s59grgn/T/spark-local-20150214103012-5b53
15/02/14 10:30:12 INFO MemoryStore: MemoryStore started with capacity 530.3 MB
2015-02-14 10:30:12.304 java[24999:3b07] Unable to load realm info from 
SCDynamicStore
15/02/14 10:30:12 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
15/02/14 10:30:12 INFO HttpFileServer: HTTP File server directory is 
/var/folders/vr/w3whx92d0356g5nj1p6s59grgn/T/spark-48459a22-c1ff-42d5-8b8e-cc89fe84933d
15/02/14 10:30:12 INFO HttpServer: Starting HTTP Server
15/02/14 10:30:12 INFO Utils: Successfully started service 'HTTP file server' 
on port 58198.
15/02/14 10:30:12 INFO Utils: Successfully started service 'SparkUI' on port 
4040.
15/02/14 10:30:12 INFO SparkUI: Started SparkUI at http://xx:4040
15/02/14 10:30:12 INFO AppClient$ClientActor: Connecting to master 
spark://devpeng-db-cassandra-1:7077...
15/02/14 10:30:13 INFO SparkDeploySchedulerBackend: Connected to Spark cluster 
with app ID app-20150214103013-0001
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor added: 
app-20150214103013-0001/0 on 
worker-20150214102534-devpeng-db-cassandra-2.devpeng 
(devpeng-db-cassandra-2.devpeng.x:57563) with 8 cores
15/02/14 10:30:13 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150214103013-0001/0 on hostPort devpeng-db-cassandra-2.devpeng.:57563 
with 8 cores, 512.0 MB RAM
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor added: 
app-20150214103013-0001/1 on 
worker-20150214102534-devpeng-db-cassandra-3.devpeng.-38773 
(devpeng-db-cassandra-3.devpeng.xx:38773) with 8 cores
15/02/14 10:30:13 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20150214103013-0001/1 on hostPort 
devpeng-db-cassandra-3.devpeng.xe:38773 with 8 cores, 512.0 MB RAM
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor updated: 
app-20150214103013-0001/0 is now LOADING
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor updated: 
app-20150214103013-0001/1 is now LOADING
15/02/14 10:30:13 INFO AppClient$ClientActor: Executor up