Also, the level of parallelism is affected by the size of your input. Could this be a problem in your case?
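To make that concrete, here is a minimal PySpark sketch showing how to check how many partitions, and hence how many parallel tasks, an input produces ("input.txt" is just a hypothetical stand-in for your own data):

    from pyspark import SparkContext

    sc = SparkContext("local[*]", "parallelism-check")

    # parallelize() defaults to sc.defaultParallelism slices (the core
    # count under local[*]), so a collection spreads across all cores.
    rdd = sc.parallelize(range(1000))
    print(rdd.getNumPartitions())        # e.g. 64 on a 64-core machine

    # A file-based RDD is split by input size instead: a small file may
    # yield only one or two partitions, i.e. one or two parallel tasks.
    lines = sc.textFile("input.txt")
    print(lines.getNumPartitions())

    # repartition() redistributes the data if the input gave too few.
    print(lines.repartition(64).getNumPartitions())   # 64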
On Sunday, November 9, 2014, Aaron Davidson <ilike...@gmail.com> wrote:
> Oops, meant to cc the user list too.
>
> On Sat, Nov 8, 2014 at 3:13 PM, Aaron Davidson <ilike...@gmail.com> wrote:
>> The default local master is "local[*]", which should use all cores on
>> your system. So you should be able to just run "./bin/pyspark" and
>> "sc.parallelize(range(1000)).count()" and see that all your cores were
>> used.
>>
>> On Sat, Nov 8, 2014 at 2:20 PM, Blind Faith <person.of.b...@gmail.com> wrote:
>>> I am a Spark newbie and I use Python (PySpark). I am trying to run a
>>> program on a 64-core system, but no matter what I do, it always uses
>>> one core. It doesn't matter whether I run it with "spark-submit
>>> --master local[64] run.sh" or call x.repartition(64) on an RDD in my
>>> code; the Spark program always uses one core. Does anyone have
>>> experience running Spark programs successfully on multicore
>>> processors? Can someone provide a very simple example that properly
>>> runs on all cores of a multicore system?

--
Best Regards,
Sonal
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>
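P.S. For anyone finding this thread later, here is Aaron's check as one runnable sketch. Paste it into the ./bin/pyspark shell, where sc already exists and the master defaults to local[*]; glom() is just one convenient way to see how many parallel tasks the job used.

    # In the pyspark shell the SparkContext is already available as sc.
    data = sc.parallelize(range(1000))

    # count() runs one task per partition; watch a process monitor
    # (e.g. top/htop) while it runs to see all cores busy.
    data.count()

    # glom() collects each partition as a list, so the length below
    # equals the number of tasks the count() above used.
    print(len(data.glom().collect()))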