It's probably just because your application is only using 2 threads. Spark
should be allocating a large enough thread pool, but if the RDDs you are
operating on have only 2 partitions, for example, then at most 2 tasks can
run concurrently.
To give it a try, run

sc.parallelize(1 to 10, 20).mapPartitions { iter => Thread.sleep(10000000); iter }.count

and see whether 20 tasks are launched.
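The same effect can be seen in plain Scala without Spark. The sketch below (a hypothetical PartitionDemo object, not part of any library) submits a fixed number of work units to a fixed-size thread pool and counts how many distinct threads actually did work: the parallelism is capped by the number of work units (partitions), not by the pool size (the K in local[K]).

```scala
import java.util.concurrent.{Callable, ConcurrentHashMap, Executors}

// Illustrative sketch: with only numChunks work units, at most numChunks
// threads can ever be busy, no matter how large the pool is.
object PartitionDemo {
  def distinctThreads(numChunks: Int, poolSize: Int): Int = {
    val pool = Executors.newFixedThreadPool(poolSize)
    // Record the name of every thread that actually runs a task.
    val seen = ConcurrentHashMap.newKeySet[String]()
    val tasks = new java.util.ArrayList[Callable[Unit]]()
    for (_ <- 1 to numChunks) {
      tasks.add(new Callable[Unit] {
        def call(): Unit = {
          seen.add(Thread.currentThread.getName)
          Thread.sleep(200) // hold the thread so tasks overlap in time
        }
      })
    }
    pool.invokeAll(tasks) // blocks until all tasks have completed
    pool.shutdown()
    seen.size
  }
}
```

With 2 chunks and a pool of 20, at most 2 threads are ever used, which matches the symptom described below; repartitioning the RDD into more partitions is the usual remedy.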
--
Reynold Xin, AMPLab, UC Berkeley
http://rxin.org
On Sun, Sep 22, 2013 at 9:48 PM, Xiang Huo <[email protected]> wrote:
> Hi all,
>
> I am trying to run a Spark program on a server. It is not a cluster, just
> a single server. I want to configure my Spark program to use at most 20 CPUs,
> because this machine is also shared by other users.
>
> I know I can set local[K] as the Master URL to limit the number of worker
> threads in this program. But after I run my program, only one or two CPUs
> are used, and the program takes a long time to run with only one or two
> CPUs.
>
> Has anyone met a similar situation or have any suggestions?
>
> Thanks.
>
> Xiang
> --
> Xiang Huo
> Department of Computer Science
> University of Illinois at Chicago(UIC)
> Chicago, Illinois
> US
> Email: [email protected]
> or [email protected]
>