Yana,

Thanks for your advice.
The Spark UI shows everything, and at sparkmaster:4040 I can see the details of the running app. I've also looked through the three logs you mentioned; there are no errors or warnings. After parallelize(), the first thing I call is rdd.count(). Even with this simple action, the whole process just hangs and no further info is reported.

Best Regards,
Min

On 5/27/2014 4:29 PM, Yana Kadiyska wrote:
Does the Spark UI show your program running? (http://spark-masterIP:8118). If the program is listed as running, you should be able to see details via the UI. In my experience there are 3 sets of logs: the log where you're running your program (the driver), the log on the master node, and the log on each executor. The master log often has very useful details when one of your slave executors has an issue. Then you can go and read the logs on that machine. Of course, if you have a small number of workers in your cluster, you can just read all the logs. That's just general debugging advice... (I also find it useful to check rdd.partitions.size before anything else, to see how many partitions the RDD actually has...)
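
A rough way to sanity-check the partition count and the "Serialized task ... bytes" lines in the log quoted below is to slice the input list locally and measure how large each slice is once serialized. The sketch below is plain Java with no Spark dependency; the class name SliceSizeCheck, both helper methods, and the synthetic input are hypothetical and only for illustration. It assumes contiguous, roughly equal slices and plain Java serialization, which may differ from what a given Spark version and serializer actually do.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SliceSizeCheck {

    // Split a list into numSlices contiguous chunks of roughly equal length.
    // Assumption: this only approximates how a local collection is partitioned.
    static <T> List<List<T>> slice(List<T> data, int numSlices) {
        List<List<T>> out = new ArrayList<>();
        int n = data.size();
        for (int i = 0; i < numSlices; i++) {
            int start = (int) ((long) i * n / numSlices);
            int end = (int) ((long) (i + 1) * n / numSlices);
            // Copy the view so each slice is independently serializable.
            out.add(new ArrayList<>(data.subList(start, end)));
        }
        return out;
    }

    // Size in bytes of an object under standard Java serialization.
    static long serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        // Synthetic stand-in for the lines read from the local file.
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            lines.add("line number " + i);
        }

        // Two slices, matching the "2 output partitions" reported in the log.
        List<List<String>> slices = slice(lines, 2);
        for (int i = 0; i < slices.size(); i++) {
            System.out.println("slice " + i + ": " + slices.get(i).size()
                    + " elements, " + serializedSize(slices.get(i)) + " bytes serialized");
        }
    }
}
```

If the per-slice size for the real input lands in the same range as the roughly 13 MB tasks in the log, it may be worth comparing that against the spark.akka.frameSize setting (reportedly 10 MB by default in Spark releases of this period; treat that figure as an assumption to verify), or trying more slices or sc.textFile() so the data does not travel inside the task itself.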


On Tue, May 27, 2014 at 2:48 PM, Min Li <[email protected]> wrote:

    Hi all,

    I've a single machine with 8 cores and 8g mem. I've deployed the
    standalone spark on the machine and successfully run the examples.

    Now I'm trying to write some simple Java code. I just read a
    local file (23M) into a list of strings and call
    JavaRDD<String> rdds = sparkContext.parallelize() to get the
    corresponding RDD. Then I run rdds.count(), but the program
    just stops at the count(). The last log info is:

        14/05/27 14:13:16 INFO SparkContext: Starting job: count at
        RDDTest.java:40
        14/05/27 14:13:16 INFO DAGScheduler: Got job 0 (count at
        RDDTest.java:40) with 2 output partitions (allowLocal=false)
        14/05/27 14:13:16 INFO DAGScheduler: Final stage: Stage 0
        (count at RDDTest.java:40)
        14/05/27 14:13:16 INFO DAGScheduler: Parents of final stage:
        List()
        14/05/27 14:13:16 INFO DAGScheduler: Missing parents: List()
        14/05/27 14:13:16 INFO DAGScheduler: Submitting Stage 0
        (ParallelCollectionRDD[0] at parallelize at RDDTest.java:37),
        which has no missing parents
        14/05/27 14:13:16 INFO SparkDeploySchedulerBackend: Connected
        to Spark cluster with app ID app-20140527141316-0003
        14/05/27 14:13:16 INFO AppClient$ClientActor: Executor added:
        app-20140527141316-0003/0 on worker-20140526221107-spark-35303
        (spark:35303) with 8 cores
        14/05/27 14:13:16 INFO SparkDeploySchedulerBackend: Granted
        executor ID app-20140527141316-0003/0 on hostPort spark:35303
        with 8 cores, 1024.0 MB RAM
        14/05/27 14:13:16 INFO AppClient$ClientActor: Executor
        updated: app-20140527141316-0003/0 is now RUNNING
        14/05/27 14:13:16 INFO DAGScheduler: Submitting 2 missing
        tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at
        RDDTest.java:37)
        14/05/27 14:13:16 INFO TaskSchedulerImpl: Adding task set 0.0
        with 2 tasks
        14/05/27 14:13:17 INFO SparkDeploySchedulerBackend: Registered
        executor:
        Actor[akka.tcp://sparkExecutor@spark:34279/user/Executor#196489168]
        with ID 0
        14/05/27 14:13:17 INFO TaskSetManager: Starting task 0.0:0 as
        TID 0 on executor 0: spark (PROCESS_LOCAL)
        14/05/27 14:13:17 INFO TaskSetManager: Serialized task 0.0:0
        as 12993529 bytes in 127 ms
        14/05/27 14:13:17 INFO TaskSetManager: Starting task 0.0:1 as
        TID 1 on executor 0: spark (PROCESS_LOCAL)
        14/05/27 14:13:17 INFO TaskSetManager: Serialized task 0.0:1
        as 13006417 bytes in 74 ms
        14/05/27 14:13:17 INFO
        BlockManagerMasterActor$BlockManagerInfo: Registering block
        manager spark:37617 with 589.2 MB RAM

    I tried to figure out what's going on, but just can't. Could
    anyone please give me some suggestions or point out possible issues?

    Best Regards,
    Min


