Here are the log messages I'm getting when I run this command:

/usr/local/spark/bin/spark-submit --verbose --driver-memory 14G --num-executors 1 --driver-cores 1 --executor-memory 10G --class org.apache.spark.mllib.feature.myclass --driver-java-options -Djava.library.path=/usr/local/lib --master local /Users/z001jvm/Downloads/spark_tmp/spark-1.4.0/mllib/out/artifacts/spark_mllib_2_10_jar/spark-mllib_2.10.jar /user/text8_lines.ag
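As a side note on the command itself (an illustrative sketch, not something from the logs above): in local mode the number of worker threads is taken from the master URL, while --num-executors, --executor-memory and --driver-cores apply to cluster deployments. A variant of the same command that makes the single-thread intent explicit would look like:

```shell
# Same submit command, with an explicit single-thread local master.
# "local" and "local[1]" schedule tasks on one thread; "local[4]" would
# use four threads, and "local[*]" one thread per available core.
# Executor-related flags are dropped since the executor runs inside the
# driver JVM in local mode.
/usr/local/spark/bin/spark-submit \
  --verbose \
  --driver-memory 14G \
  --class org.apache.spark.mllib.feature.myclass \
  --driver-java-options -Djava.library.path=/usr/local/lib \
  --master "local[1]" \
  /Users/z001jvm/Downloads/spark_tmp/spark-1.4.0/mllib/out/artifacts/spark_mllib_2_10_jar/spark-mllib_2.10.jar \
  /user/text8_lines.ag
```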
Logs:

Spark assembly has been built with Hive, including Datanucleus jars on classpath
Using properties file: /usr/local/spark/conf/spark-defaults.conf
Adding default property: spark.default.parallelism=1
Adding default property: spark.driver.memory=2g
Adding default property: spark.executor.cores=1
Adding default property: spark.driver.cores=1
Parsed arguments:
  master                  local
  deployMode              null
  executorMemory          10G
  executorCores           1
  totalExecutorCores      null
  propertiesFile          /usr/local/spark/conf/spark-defaults.conf
  driverMemory            14G
  driverCores             1
  driverExtraClassPath    null
  driverExtraLibraryPath  null
  driverExtraJavaOptions  -Djava.library.path=/usr/local/lib
  supervise               false
  queue                   null
  numExecutors            1
  files                   null
  pyFiles                 null
  archives                null
  mainClass               org.apache.spark.mllib.feature.myclass
  primaryResource         file:/Users/z001jvm/Downloads/spark_tmp/spark-1.4.0/mllib/out/artifacts/spark_mllib_2_10_jar/spark-mllib_2.10.jar
  name                    org.apache.spark.mllib.feature.myclass
  childArgs               [/user/z001jvm/text8_lines.ag]
  jars                    null
  packages                null
  repositories            null
  verbose                 true

Spark properties used, including those specified through --conf and those from the properties file /usr/local/spark/conf/spark-defaults.conf:
  spark.default.parallelism -> 1
  spark.driver.memory -> 2g
  spark.driver.cores -> 1
  spark.executor.cores -> 1

Main class:
org.apache.spark.mllib.feature.myclass
Arguments:
/user/z001jvm/text8_lines.ag
System properties:
spark.default.parallelism -> 1
spark.driver.memory -> 14G
SPARK_SUBMIT -> true
spark.driver.cores -> 1
spark.app.name -> org.apache.spark.mllib.feature.myclass
spark.driver.extraJavaOptions -> -Djava.library.path=/usr/local/lib
spark.jars -> file:/Users/z001jvm/Downloads/spark_tmp/spark-1.4.0/mllib/out/artifacts/spark_mllib_2_10_jar/spark-mllib_2.10.jar
spark.master -> local
spark.executor.cores -> 1
Classpath elements:
file:/Users/z001jvm/Downloads/spark_tmp/spark-1.4.0/mllib/out/artifacts/spark_mllib_2_10_jar/spark-mllib_2.10.jar

The issue shows up while processing stage 3; the logs for that stage are as follows:

16/02/24 17:06:27 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
16/02/24 17:06:27 INFO scheduler.DAGScheduler: running: Set()
16/02/24 17:06:27 INFO scheduler.DAGScheduler: waiting: Set(Stage 3, Stage 4)
16/02/24 17:06:27 INFO scheduler.DAGScheduler: failed: Set()
16/02/24 17:06:27 INFO scheduler.DAGScheduler: Missing parents for Stage 3: List()
16/02/24 17:06:27 INFO scheduler.DAGScheduler: Missing parents for Stage 4: List(Stage 3)
16/02/24 17:06:27 INFO scheduler.DAGScheduler: Submitting Stage 3 (MapPartitionsRDD[13] at mapPartitionsWithIndex at Word2Vec.scala:675), which is now runnable
16/02/24 17:06:28 INFO storage.MemoryStore: ensureFreeSpace(16305432) called with curMem=15970039, maxMem=7779731374
16/02/24 17:06:28 INFO storage.MemoryStore: Block broadcast_7 stored as values in memory (estimated size 15.6 MB, free 7.2 GB)
16/02/24 17:06:28 INFO storage.MemoryStore: ensureFreeSpace(4194304) called with curMem=32275471, maxMem=7779731374
16/02/24 17:06:28 INFO storage.MemoryStore: Block broadcast_7_piece0 stored as bytes in memory (estimated size 4.0 MB, free 7.2 GB)
16/02/24 17:06:28 INFO storage.BlockManagerInfo: Added broadcast_7_piece0 in memory on localhost:59581 (size: 4.0 MB, free: 7.2 GB)
16/02/24 17:06:28 INFO storage.BlockManagerMaster: Updated info of block broadcast_7_piece0
16/02/24 17:06:28 INFO storage.MemoryStore: ensureFreeSpace(2956688) called with curMem=36469775, maxMem=7779731374
16/02/24 17:06:28 INFO storage.MemoryStore: Block broadcast_7_piece1 stored as bytes in memory (estimated size 2.8 MB, free 7.2 GB)
16/02/24 17:06:28 INFO storage.BlockManagerInfo: Added broadcast_7_piece1 in memory on localhost:59581 (size: 2.8 MB, free: 7.2 GB)
16/02/24 17:06:28 INFO storage.BlockManagerMaster: Updated info of block broadcast_7_piece1
16/02/24 17:06:28 INFO spark.SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:839
16/02/24 17:06:28 INFO scheduler.DAGScheduler: Submitting 6 missing tasks from Stage 3 (MapPartitionsRDD[13] at mapPartitionsWithIndex at Word2Vec.scala:675)
16/02/24 17:06:28 INFO scheduler.TaskSchedulerImpl: Adding task set 3.0 with 6 tasks
16/02/24 17:06:28 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3, localhost, PROCESS_LOCAL, 1396 bytes)
16/02/24 17:06:28 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 3.0 (TID 4, localhost, PROCESS_LOCAL, 1396 bytes)
16/02/24 17:06:28 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 3.0 (TID 5, localhost, PROCESS_LOCAL, 1396 bytes)
16/02/24 17:06:28 INFO scheduler.TaskSetManager: Starting task 3.0 in stage 3.0 (TID 6, localhost, PROCESS_LOCAL, 1396 bytes)
16/02/24 17:06:28 INFO executor.Executor: Running task 0.0 in stage 3.0 (TID 3)
16/02/24 17:06:28 INFO executor.Executor: Running task 1.0 in stage 3.0 (TID 4)
16/02/24 17:06:28 INFO executor.Executor: Running task 2.0 in stage 3.0 (TID 5)
16/02/24 17:06:28 INFO executor.Executor: Running task 3.0 in stage 3.0 (TID 6)
16/02/24 17:06:28 INFO spark.CacheManager: Partition rdd_12_2 not found, computing it
16/02/24 17:06:28 INFO spark.CacheManager: Partition rdd_12_0 not found, computing it
16/02/24 17:06:28 INFO spark.CacheManager: Partition rdd_12_3 not found, computing it
16/02/24 17:06:28 INFO spark.CacheManager: Partition rdd_12_1 not found, computing it
16/02/24 17:06:28 INFO storage.ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/02/24 17:06:28 INFO storage.ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/02/24 17:06:28 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
16/02/24 17:06:28 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
16/02/24 17:06:28 INFO storage.ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/02/24 17:06:28 INFO storage.ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
16/02/24 17:06:28 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
16/02/24 17:06:28 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
16/02/24 17:06:28 INFO storage.MemoryStore: ensureFreeSpace(16) called with curMem=39426463, maxMem=7779731374
16/02/24 17:06:28 INFO storage.MemoryStore: Block rdd_12_2 stored as values in memory (estimated size 16.0 B, free 7.2 GB)
16/02/24 17:06:28 INFO storage.MemoryStore: ensureFreeSpace(16) called with curMem=39426479, maxMem=7779731374
16/02/24 17:06:28 INFO storage.BlockManagerInfo: Added rdd_12_2 in memory on localhost:59581 (size: 16.0 B, free: 7.2 GB)
16/02/24 17:06:28 INFO storage.MemoryStore: Block rdd_12_3 stored as values in memory (estimated size 16.0 B, free 7.2 GB)
16/02/24 17:06:28 INFO storage.MemoryStore: ensureFreeSpace(16) called with curMem=39426495, maxMem=7779731374
16/02/24 17:06:28 INFO storage.BlockManagerMaster: Updated info of block rdd_12_2
16/02/24 17:06:28 INFO storage.MemoryStore: Block rdd_12_0 stored as values in memory (estimated size 16.0 B, free 7.2 GB)
16/02/24 17:06:28 INFO storage.BlockManagerInfo: Added rdd_12_3 in memory on localhost:59581 (size: 16.0 B, free: 7.2 GB)
16/02/24 17:06:28 INFO storage.BlockManagerMaster: Updated info of block rdd_12_3
16/02/24 17:06:28 INFO storage.BlockManagerInfo: Added rdd_12_0 in memory on localhost:59581 (size: 16.0 B, free: 7.2 GB)
16/02/24 17:06:28 INFO storage.BlockManagerMaster: Updated info of block rdd_12_0

--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Restricting-number-of-cores-not-resulting-in-reduction-in-parallelism-tp26319p26333.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.