sc.parallelize takes a second parameter, numSlices, which sets the total number of partitions. Are you using that?
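A minimal sketch of what I mean, in case it helps. The object name, the app name, the stand-in data and the slice count of 256 are all assumptions for illustration and not taken from your job; the master is assumed to come from spark-submit. It passes an explicit partition count and prints how the items end up distributed, which should make it easier to see whether there are enough partitions to give every executor work.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; names and numbers are illustrative only.
object ParallelizeSlices {
  def main(args: Array[String]): Unit = {
    // Master URL is assumed to be supplied by spark-submit.
    val conf = new SparkConf().setAppName("parallelize-slices")
    val sc = new SparkContext(conf)

    // Stand-in for the real list of 512k items.
    val items = (1 to 512000).toVector

    // Assumed value: a few partitions per executor core.
    val numSlices = 256

    // Second argument is the number of partitions; when omitted,
    // Spark falls back to sc.defaultParallelism.
    val rdd = sc.parallelize(items, numSlices)

    println(s"defaultParallelism = ${sc.defaultParallelism}")
    println(s"partitions         = ${rdd.partitions.length}")

    // Count the items in each partition to check the distribution.
    val sizes = rdd
      .mapPartitionsWithIndex((idx, it) => Iterator((idx, it.size)))
      .collect()

    sizes.take(5).foreach { case (idx, n) =>
      println(s"partition $idx: $n items")
    }

    sc.stop()
  }
}
```

If the partition count printed here is lower than the total number of cores across your executors, some of them will have nothing to run.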

Thanks
Best Regards

On Wed, Jul 29, 2015 at 9:27 PM, Kostas Kougios <[email protected]> wrote:

> Hi, I do an sc.parallelize with a list of 512k items. But sometimes not all
> executors are used, i.e. they don't have work to do and nothing is logged
> after:
>
> 15/07/29 16:35:22 WARN internal.ThreadLocalRandom: Failed to generate a seed from SecureRandom within 3 seconds. Not enough entrophy?
> 15/07/29 16:35:22 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 56477.
> 15/07/29 16:35:22 INFO netty.NettyBlockTransferService: Server created on 56477
> 15/07/29 16:35:22 INFO storage.BlockManagerMaster: Trying to register BlockManager
> 15/07/29 16:35:22 INFO storage.BlockManagerMaster: Registered BlockManager
>
> Any ideas why so? My last run has 3 of the 64 executors not used.
