Re: failure to parallelize an RDD
Which release of Spark are you using? Can you turn on DEBUG logging to see if there is more of a clue? Thanks

On Tue, Jan 12, 2016 at 6:37 PM, AlexG <swift...@gmail.com> wrote:

> I transpose a matrix (colChunkOfA), stored as a 200-by-54843210 array of
> rows in Array[Array[Float]] format, into another matrix (rowChunk), also
> stored row-wise as a 54843210-by-200 Array[Array[Float]], using the
> following code:
>
> val rowChunk = new Array[Tuple2[Int,Array[Float]]](numCols)
> val colIndices = (0 until colChunkOfA.length).toArray
>
> (0 until numCols).foreach( rowIdx => {
>   rowChunk(rowIdx) = Tuple2(rowIdx, colIndices.map(colChunkOfA(_)(rowIdx)))
> })
>
> This succeeds, but the following code, which attempts to turn rowChunk into
> an RDD, fails silently: spark-submit just ends, and none of the executor
> logs indicate any errors occurring.
>
> val parallelRowChunkRDD = sc.parallelize(rowChunk).cache
> parallelRowChunkRDD.count
>
> What is the culprit here?
>
> Here is the log output starting from the count instruction:
>
> 16/01/13 02:23:38 INFO SparkContext: Starting job: count at transposeAvroToAvroChunks.scala:129
> 16/01/13 02:23:38 INFO DAGScheduler: Got job 3 (count at transposeAvroToAvroChunks.scala:129) with 928 output partitions
> 16/01/13 02:23:38 INFO DAGScheduler: Final stage: ResultStage 3(count at transposeAvroToAvroChunks.scala:129)
> 16/01/13 02:23:38 INFO DAGScheduler: Parents of final stage: List()
> 16/01/13 02:23:38 INFO DAGScheduler: Missing parents: List()
> 16/01/13 02:23:38 INFO DAGScheduler: Submitting ResultStage 3 (ParallelCollectionRDD[2448] at parallelize at transposeAvroToAvroChunks.scala:128), which has no missing parents
> 16/01/13 02:23:38 INFO MemoryStore: ensureFreeSpace(1048) called with curMem=50917367, maxMem=127452201615
> 16/01/13 02:23:38 INFO MemoryStore: Block broadcast_615 stored as values in memory (estimated size 1048.0 B, free 118.7 GB)
> 16/01/13 02:23:38 INFO MemoryStore: ensureFreeSpace(740) called with curMem=50918415, maxMem=127452201615
> 16/01/13 02:23:38 INFO MemoryStore: Block broadcast_615_piece0 stored as bytes in memory (estimated size 740.0 B, free 118.7 GB)
> 16/01/13 02:23:38 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.36.112:36581 (size: 740.0 B, free: 118.7 GB)
> 16/01/13 02:23:38 INFO SparkContext: Created broadcast 615 from broadcast at DAGScheduler.scala:861
> 16/01/13 02:23:38 INFO DAGScheduler: Submitting 928 missing tasks from ResultStage 3 (ParallelCollectionRDD[2448] at parallelize at transposeAvroToAvroChunks.scala:128)
> 16/01/13 02:23:38 INFO TaskSchedulerImpl: Adding task set 3.0 with 928 tasks
> 16/01/13 02:23:39 WARN TaskSetManager: Stage 3 contains a task of very large size (47027 KB). The maximum recommended task size is 100 KB.
> 16/01/13 02:23:39 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 1219, 172.31.34.184, PROCESS_LOCAL, 48156290 bytes)
> ...
> 16/01/13 02:27:13 INFO TaskSetManager: Starting task 927.0 in stage 3.0 (TID 2146, 172.31.42.67, PROCESS_LOCAL, 48224789 bytes)
> 16/01/13 02:27:17 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.36.112:36581 in memory (size: 17.4 KB, free: 118.7 GB)
> 16/01/13 02:27:21 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.35.157:51059 in memory (size: 17.4 KB, free: 10.4 GB)
> 16/01/13 02:27:21 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.47.118:34888 in memory (size: 17.4 KB, free: 10.4 GB)
> 16/01/13 02:27:22 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.38.42:48582 in memory (size: 17.4 KB, free: 10.4 GB)
> 16/01/13 02:27:38 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.41.68:59281 (size: 740.0 B, free: 10.4 GB)
> 16/01/13 02:27:55 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.47.118:59575 (size: 740.0 B, free: 10.4 GB)
> 16/01/13 02:28:47 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.40.24:55643 (size: 740.0 B, free: 10.4 GB)
> 16/01/13 02:28:49 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.47.118:53671 (size: 740.0 B, free: 10.4 GB)
>
> This is the end of the log, so it looks like all 928 tasks got started, but
> presumably somewhere in running they ran into an error. Nothing shows up in
> the executor logs.
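For reference, one way to act on the DEBUG-logging suggestion above is through the driver's log4j configuration. This is an illustrative snippet, assuming the stock conf/log4j.properties.template layout that Spark 1.x ships with, not configuration taken from this thread:

```properties
# conf/log4j.properties on the driver (start by copying log4j.properties.template)
# Raise the root level from the default INFO/WARN to DEBUG.
log4j.rootCategory=DEBUG, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

If the cluster is on Spark 1.4 or later, calling sc.setLogLevel("DEBUG") near the top of the driver program is a quicker alternative for the driver side.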
failure to parallelize an RDD
I transpose a matrix (colChunkOfA), stored as a 200-by-54843210 array of rows in Array[Array[Float]] format, into another matrix (rowChunk), also stored row-wise as a 54843210-by-200 Array[Array[Float]], using the following code:

val rowChunk = new Array[Tuple2[Int,Array[Float]]](numCols)
val colIndices = (0 until colChunkOfA.length).toArray

(0 until numCols).foreach( rowIdx => {
  rowChunk(rowIdx) = Tuple2(rowIdx, colIndices.map(colChunkOfA(_)(rowIdx)))
})

This succeeds, but the following code, which attempts to turn rowChunk into an RDD, fails silently: spark-submit just ends, and none of the executor logs indicate any errors occurring.

val parallelRowChunkRDD = sc.parallelize(rowChunk).cache
parallelRowChunkRDD.count

What is the culprit here?

Here is the log output starting from the count instruction:

16/01/13 02:23:38 INFO SparkContext: Starting job: count at transposeAvroToAvroChunks.scala:129
16/01/13 02:23:38 INFO DAGScheduler: Got job 3 (count at transposeAvroToAvroChunks.scala:129) with 928 output partitions
16/01/13 02:23:38 INFO DAGScheduler: Final stage: ResultStage 3(count at transposeAvroToAvroChunks.scala:129)
16/01/13 02:23:38 INFO DAGScheduler: Parents of final stage: List()
16/01/13 02:23:38 INFO DAGScheduler: Missing parents: List()
16/01/13 02:23:38 INFO DAGScheduler: Submitting ResultStage 3 (ParallelCollectionRDD[2448] at parallelize at transposeAvroToAvroChunks.scala:128), which has no missing parents
16/01/13 02:23:38 INFO MemoryStore: ensureFreeSpace(1048) called with curMem=50917367, maxMem=127452201615
16/01/13 02:23:38 INFO MemoryStore: Block broadcast_615 stored as values in memory (estimated size 1048.0 B, free 118.7 GB)
16/01/13 02:23:38 INFO MemoryStore: ensureFreeSpace(740) called with curMem=50918415, maxMem=127452201615
16/01/13 02:23:38 INFO MemoryStore: Block broadcast_615_piece0 stored as bytes in memory (estimated size 740.0 B, free 118.7 GB)
16/01/13 02:23:38 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.36.112:36581 (size: 740.0 B, free: 118.7 GB)
16/01/13 02:23:38 INFO SparkContext: Created broadcast 615 from broadcast at DAGScheduler.scala:861
16/01/13 02:23:38 INFO DAGScheduler: Submitting 928 missing tasks from ResultStage 3 (ParallelCollectionRDD[2448] at parallelize at transposeAvroToAvroChunks.scala:128)
16/01/13 02:23:38 INFO TaskSchedulerImpl: Adding task set 3.0 with 928 tasks
16/01/13 02:23:39 WARN TaskSetManager: Stage 3 contains a task of very large size (47027 KB). The maximum recommended task size is 100 KB.
16/01/13 02:23:39 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 1219, 172.31.34.184, PROCESS_LOCAL, 48156290 bytes)
...
16/01/13 02:27:13 INFO TaskSetManager: Starting task 927.0 in stage 3.0 (TID 2146, 172.31.42.67, PROCESS_LOCAL, 48224789 bytes)
16/01/13 02:27:17 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.36.112:36581 in memory (size: 17.4 KB, free: 118.7 GB)
16/01/13 02:27:21 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.35.157:51059 in memory (size: 17.4 KB, free: 10.4 GB)
16/01/13 02:27:21 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.47.118:34888 in memory (size: 17.4 KB, free: 10.4 GB)
16/01/13 02:27:22 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.38.42:48582 in memory (size: 17.4 KB, free: 10.4 GB)
16/01/13 02:27:38 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.41.68:59281 (size: 740.0 B, free: 10.4 GB)
16/01/13 02:27:55 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.47.118:59575 (size: 740.0 B, free: 10.4 GB)
16/01/13 02:28:47 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.40.24:55643 (size: 740.0 B, free: 10.4 GB)
16/01/13 02:28:49 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.47.118:53671 (size: 740.0 B, free: 10.4 GB)

This is the end of the log, so it looks like all 928 tasks got started, but presumably somewhere in running they ran into an error. Nothing shows up in the executor logs.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/failure-to-parallelize-an-RDD-tp25950.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
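One detail worth pulling out of the log above: the TaskSetManager warning shows each of the 928 tasks carrying roughly 47 MB of row data, so the driver has to serialize and ship on the order of 40+ GB of task payloads for this single job, on top of already holding the ~44 GB rowChunk array. A driver process dying under that load would be consistent with spark-submit ending silently while the executor logs stay clean. Below is a hedged sketch, not code from this thread, of doing the transpose as an RDD job instead, so the transposed matrix is never parallelized from the driver; it is shown on a tiny 2-by-4 matrix, and the object and variable names are illustrative:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: transpose on the cluster via a shuffle, rather than building the
// transposed matrix on the driver and shipping it inside oversized tasks.
object DistributedTransposeSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("transpose-sketch").setMaster("local[2]"))

    // A 2-by-4 matrix as (rowIdx, row) pairs, standing in for colChunkOfA.
    val rows = sc.parallelize(Seq(
      (0, Array(1f, 2f, 3f, 4f)),
      (1, Array(5f, 6f, 7f, 8f))
    ))

    // Explode each row into (colIdx, (rowIdx, value)) records and regroup by
    // column index, so the shuffle, not the driver, moves the data.
    val transposed = rows
      .flatMap { case (rowIdx, row) =>
        row.zipWithIndex.map { case (v, colIdx) => (colIdx, (rowIdx, v)) }
      }
      .groupByKey()
      .mapValues(_.toSeq.sortBy(_._1).map(_._2).toArray)

    // Each column of the input becomes one row of the 4-by-2 transpose.
    transposed.sortByKey().collect().foreach { case (i, r) =>
      println(s"row $i: ${r.mkString(", ")}")
    }
    sc.stop()
  }
}
```

For the real 200-by-54843210 chunk, the 200 source rows would themselves need to arrive as an RDD (for example, read directly from the Avro chunks) rather than via parallelize, since each source row is itself about 220 MB.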