Figured out the root cause. The master was randomly assigning the ports the worker had to use to communicate back. Because of the firewall on the master, the worker couldn't get its messages through to the master (resource details, most likely). Strangely, the worker didn't even bother to throw an error.
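For the record, the fix was to pin those ports to fixed values instead of letting Spark pick random ones, and to open them on the master's firewall. A minimal sketch, assuming a standalone deployment and iptables; the port numbers below are arbitrary picks, not Spark defaults:

    #!/usr/bin/env bash
    # Driver-side ports to pin (set these as Spark properties, e.g. in
    # conf/spark-defaults.conf or via conf.set(...) on the SparkConf):
    #
    #   spark.driver.port        50000   (driver RPC; was random 54611 in the logs)
    #   spark.fileserver.port    50001   (HTTP file server; was random 44884)
    #   spark.broadcast.port     50002   (HTTP broadcast server)
    #   spark.blockManager.port  50003   (block manager; was random 49150)
    #
    # Then allow those ports through the master's firewall:
    iptables -A INPUT -p tcp --dport 50000:50003 -j ACCEPT

On the missing error: when an executor dies like this, whatever it managed to log usually ends up in $SPARK_HOME/work/<app-id>/<executor-id>/stderr on the worker machine, not in the driver's output.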

On 8/6/2015 3:24 PM, Kushal Chokhani wrote:
Any inputs?

In case of the following message, is there a way to check through the logs which resource is insufficient?

    [Timer-0] WARN org.apache.spark.scheduler.TaskSchedulerImpl  -
    Initial job has not accepted any resources; check your cluster UI
    to ensure that workers are registered and have sufficient resources

Regards.

On 8/6/2015 11:40 AM, Kushal Chokhani wrote:
Hi

I have a Spark/Cassandra setup where I am using the Spark Cassandra Java connector to query a table. So far, I have 1 Spark master node (2 cores) and 1 worker node (4 cores). Both of them have the following spark-env.sh under conf/:

    #!/usr/bin/env bash
    export SPARK_LOCAL_IP=127.0.0.1
    export SPARK_MASTER_IP="192.168.4.134"
    export SPARK_WORKER_MEMORY=1G
    export SPARK_EXECUTOR_MEMORY=2G


I am using Spark 1.4.1 along with Cassandra 2.2.0. I have started my Cassandra/Spark setup, created a keyspace and table under Cassandra, and added some rows to the table. Now I try to run the following Spark job using the Spark Cassandra Java connector:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;
    import org.apache.commons.lang3.StringUtils; // or commons-lang, per the pom
    import com.datastax.spark.connector.japi.CassandraJavaUtil;
    import com.datastax.spark.connector.japi.CassandraRow;

    SparkConf conf = new SparkConf();
    conf.setAppName("Testing");
    conf.setMaster("spark://192.168.4.134:7077");
    conf.set("spark.cassandra.connection.host", "192.168.4.129");
    conf.set("spark.logConf", "true");
    conf.set("spark.driver.maxResultSize", "50m");
    conf.set("spark.executor.memory", "200m");
    conf.set("spark.eventLog.enabled", "true");
    conf.set("spark.eventLog.dir", "/tmp/");
    conf.set("spark.executor.extraClassPath", "/home/enlighted/ebd.jar");
    conf.set("spark.cores.max", "1");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // Read every row of testing.ec and render each one as a string.
    JavaRDD<String> cassandraRowsRDD =
        CassandraJavaUtil.javaFunctions(sc).cassandraTable("testing", "ec")
            .map(new Function<CassandraRow, String>() {
                private static final long serialVersionUID = -6263533266898869895L;
                @Override
                public String call(CassandraRow cassandraRow) throws Exception {
                    return cassandraRow.toString();
                }
            });
    System.out.println("Data as CassandraRows: \n"
        + StringUtils.join(cassandraRowsRDD.toArray(), "\n"));
    sc.close();


This job gets stuck with the insufficient-resources warning. Here are the logs:

    1107 [main] INFO org.apache.spark.SparkContext  - Spark
    configuration:
    spark.app.name=Testing
    spark.cassandra.connection.host=192.168.4.129
    spark.cores.max=1
    spark.driver.maxResultSize=50m
    spark.eventLog.dir=/tmp/
    spark.eventLog.enabled=true
    spark.executor.extraClassPath=/home/enlighted/ebd.jar
    spark.executor.memory=200m
    spark.logConf=true
    spark.master=spark://192.168.4.134:7077
    1121 [main] INFO org.apache.spark.SecurityManager  - Changing
    view acls to: enlighted
    1122 [main] INFO org.apache.spark.SecurityManager  - Changing
    modify acls to: enlighted
    1123 [main] INFO org.apache.spark.SecurityManager  -
    SecurityManager: authentication disabled; ui acls disabled; users
    with view permissions: Set(enlighted); users with modify
    permissions: Set(enlighted)
    1767 [sparkDriver-akka.actor.default-dispatcher-4] INFO
    akka.event.slf4j.Slf4jLogger  - Slf4jLogger started
    1805 [sparkDriver-akka.actor.default-dispatcher-4] INFO
    Remoting - Starting remoting
    1957 [main] INFO org.apache.spark.util.Utils  - Successfully
    started service 'sparkDriver' on port 54611.
    1958 [sparkDriver-akka.actor.default-dispatcher-4] INFO
    Remoting - Remoting started; listening on addresses
    :[akka.tcp://sparkDriver@192.168.4.134:54611]
    1977 [main] INFO org.apache.spark.SparkEnv  - Registering
    MapOutputTracker
    1989 [main] INFO org.apache.spark.SparkEnv  - Registering
    BlockManagerMaster
    2007 [main] INFO org.apache.spark.storage.DiskBlockManager  -
    Created local directory at
    /tmp/spark-f21125fd-ae9d-460e-884d-563fa8720f09/blockmgr-3e3d54e7-16df-4e97-be48-b0c0fa0389e7
    2012 [main] INFO org.apache.spark.storage.MemoryStore  -
    MemoryStore started with capacity 456.0 MB
    2044 [main] INFO org.apache.spark.HttpFileServer  - HTTP File
    server directory is
    /tmp/spark-f21125fd-ae9d-460e-884d-563fa8720f09/httpd-64b4d92e-cde9-45fb-8b38-edc3cca3933c
    2046 [main] INFO org.apache.spark.HttpServer  - Starting HTTP Server
    2086 [main] INFO org.spark-project.jetty.server.Server  -
    jetty-8.y.z-SNAPSHOT
    2098 [main] INFO
    org.spark-project.jetty.server.AbstractConnector  - Started
    SocketConnector@0.0.0.0:44884
    2099 [main] INFO org.apache.spark.util.Utils  - Successfully
    started service 'HTTP file server' on port 44884.
    2108 [main] INFO org.apache.spark.SparkEnv  - Registering
    OutputCommitCoordinator
    2297 [main] INFO org.spark-project.jetty.server.Server  -
    jetty-8.y.z-SNAPSHOT
    2317 [main] INFO
    org.spark-project.jetty.server.AbstractConnector  - Started
    SelectChannelConnector@0.0.0.0:4040
    2318 [main] INFO org.apache.spark.util.Utils  - Successfully
    started service 'SparkUI' on port 4040.
    2320 [main] INFO org.apache.spark.ui.SparkUI  - Started SparkUI
    at http://192.168.4.134:4040
    2387 [sparkDriver-akka.actor.default-dispatcher-3] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  -
    Connecting to master
    akka.tcp://sparkMaster@192.168.4.134:7077/user/Master...
    2662 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
    Connected to Spark cluster with app ID app-20150806054450-0001
    2680 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    added: app-20150806054450-0001/0 on
    worker-20150806053100-192.168.4.129-45566 (192.168.4.129:45566)
    with 1 cores
    2682 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
    Granted executor ID app-20150806054450-0001/0 on hostPort
    192.168.4.129:45566 with 1 cores, 200.0 MB RAM
    2696 [main] INFO org.apache.spark.util.Utils  - Successfully
    started service
    'org.apache.spark.network.netty.NettyBlockTransferService' on
    port 49150.
    2696 [main] INFO
    org.apache.spark.network.netty.NettyBlockTransferService  -
    Server created on 49150
    2700 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/0 is now LOADING
    2706 [main] INFO org.apache.spark.storage.BlockManagerMaster -
    Trying to register BlockManager
    2708 [sparkDriver-akka.actor.default-dispatcher-17] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/0 is now RUNNING
    2710 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.storage.BlockManagerMasterEndpoint  -
    Registering block manager 192.168.4.134:49150 with 456.0 MB RAM,
    BlockManagerId(driver, 192.168.4.134, 49150)
    2713 [main] INFO org.apache.spark.storage.BlockManagerMaster -
    Registered BlockManager
    2922 [main] INFO org.apache.spark.scheduler.EventLoggingListener -
    Logging events to file:/tmp/app-20150806054450-0001
    2939 [main] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
    SchedulerBackend is ready for scheduling beginning after reached
    minRegisteredResourcesRatio: 0.0
    3321 [main] INFO com.datastax.driver.core.Cluster  - New
    Cassandra host /192.168.4.129:9042 added
    3321 [main] INFO com.datastax.driver.core.Cluster  - New
    Cassandra host /192.168.4.130:9042 added
    3322 [main] INFO
    com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
    Added host 192.168.4.130 (DC1)
    3322 [main] INFO com.datastax.driver.core.Cluster  - New
    Cassandra host /192.168.4.131:9042 added
    3323 [main] INFO
    com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
    Added host 192.168.4.131 (DC1)
    3323 [main] INFO com.datastax.driver.core.Cluster  - New
    Cassandra host /192.168.4.132:9042 added
    3323 [main] INFO
    com.datastax.spark.connector.cql.LocalNodeFirstLoadBalancingPolicy -
    Added host 192.168.4.132 (DC1)
    3325 [main] INFO
    com.datastax.spark.connector.cql.CassandraConnector  - Connected
    to Cassandra cluster: enldbcluster
    3881 [main] INFO org.apache.spark.SparkContext  - Starting job:
    toArray at Start.java:85
    3898 [pool-18-thread-1] INFO
    com.datastax.spark.connector.cql.CassandraConnector  -
    Disconnected from Cassandra cluster: enldbcluster
    3901 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Got job 0 (toArray at
    Start.java:85) with 6 output partitions (allowLocal=false)
    3902 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Final stage:
    ResultStage 0(toArray at Start.java:85)
    3902 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Parents of final
    stage: List()
    3908 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Missing parents: List()
    3925 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Submitting ResultStage
    0 (MapPartitionsRDD[1] at map at Start.java:77), which has no
    missing parents
    4002 [dag-scheduler-event-loop] INFO
    org.apache.spark.storage.MemoryStore  - ensureFreeSpace(7488)
    called with curMem=0, maxMem=478182113
    4004 [dag-scheduler-event-loop] INFO
    org.apache.spark.storage.MemoryStore  - Block broadcast_0 stored
    as values in memory (estimated size 7.3 KB, free 456.0 MB)
    4013 [dag-scheduler-event-loop] INFO
    org.apache.spark.storage.MemoryStore  - ensureFreeSpace(4015)
    called with curMem=7488, maxMem=478182113
    4013 [dag-scheduler-event-loop] INFO
    org.apache.spark.storage.MemoryStore  - Block broadcast_0_piece0
    stored as bytes in memory (estimated size 3.9 KB, free 456.0 MB)
    4015 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.storage.BlockManagerInfo  - Added
    broadcast_0_piece0 in memory on 192.168.4.134:49150 (size: 3.9
    KB, free: 456.0 MB)
    4017 [dag-scheduler-event-loop] INFO
    org.apache.spark.SparkContext  - Created broadcast 0 from
    broadcast at DAGScheduler.scala:874
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.driver.core.Cluster  - New Cassandra host
    /192.168.4.129:9042 added
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.driver.core.Cluster  - New Cassandra host
    /192.168.4.130:9042 added
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.driver.core.Cluster  - New Cassandra host
    /192.168.4.131:9042 added
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.driver.core.Cluster  - New Cassandra host
    /192.168.4.132:9042 added
    4089 [dag-scheduler-event-loop] INFO
    com.datastax.spark.connector.cql.CassandraConnector  - Connected
    to Cassandra cluster: enldbcluster
    4394 [pool-18-thread-1] INFO
    com.datastax.spark.connector.cql.CassandraConnector  -
    Disconnected from Cassandra cluster: enldbcluster
    4806 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.DAGScheduler  - Submitting 6 missing
    tasks from ResultStage 0 (MapPartitionsRDD[1] at map at
    Start.java:77)
    4807 [dag-scheduler-event-loop] INFO
    org.apache.spark.scheduler.TaskSchedulerImpl  - Adding task set
    0.0 with 6 tasks
    19822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    34822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    49822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    64822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    79822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    94822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    109822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    124822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    124963 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/0 is now EXITED (Command exited
    with code 1)
    124964 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
    Executor app-20150806054450-0001/0 removed: Command exited with
    code 1
    124968 [sparkDriver-akka.actor.default-dispatcher-17] ERROR
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
    Asked to remove non-existent executor 0
    124969 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    added: app-20150806054450-0001/1 on
    worker-20150806053100-192.168.4.129-45566 (192.168.4.129:45566)
    with 1 cores
    124969 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend -
    Granted executor ID app-20150806054450-0001/1 on hostPort
    192.168.4.129:45566 with 1 cores, 200.0 MB RAM
    124975 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/1 is now RUNNING
    125012 [sparkDriver-akka.actor.default-dispatcher-14] INFO
    org.apache.spark.deploy.client.AppClient$ClientActor  - Executor
    updated: app-20150806054450-0001/1 is now LOADING
    139822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    154822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    169823 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    184822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources
    199822 [Timer-0] WARN
    org.apache.spark.scheduler.TaskSchedulerImpl  - Initial job has
    not accepted any resources; check your cluster UI to ensure that
    workers are registered and have sufficient resources

Please find attached the Spark master UI screenshot and the pom.xml with the dependencies.

Can anyone please point out what the issue could be here?




