Re: rdd.count with 100 elements taking 1 second to run

Anshul Singhle Thu, 30 Apr 2015 01:15:02 -0700

Hi Akhil,

I discovered the reason for this problem. There was an issue with my
deployment (Google Cloud Platform). My Spark master was in a different
"region" than the slaves. This resulted in huge scheduler delays.
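
For anyone hitting the same symptom, here is a minimal sketch of how the
job can be timed from spark-shell. The `time` helper is hypothetical (it is
not part of Spark), and the stand-in collection at the end is only there so
the snippet runs without a cluster; in spark-shell you would time
`rdd.count` instead, as shown in the comments.

```scala
// Hypothetical helper (not part of Spark): evaluates a block once and
// returns its result together with the elapsed wall-clock time in ms.
def time[A](block: => A): (A, Double) = {
  val start = System.nanoTime()
  val result = block
  val elapsedMs = (System.nanoTime() - start) / 1e6
  (result, elapsedMs)
}

// In spark-shell, where `sc` is provided by the shell, usage would be:
//   val rdd = sc.parallelize(1 to 100)
//   rdd.count                      // warm-up run; the first job pays one-off costs
//   val (n, ms) = time(rdd.count)
// Stand-in so the sketch runs without a cluster:
val (n, ms) = time((1 to 100).size)
println(s"count=$n took $ms ms")
```

If the timed count stays around a second after a warm-up run while the
tasks themselves finish in a few milliseconds, the time is probably going
to scheduling and task launch rather than to the computation.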


Thanks,
Anshul

On Thu, Apr 30, 2015 at 1:39 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Does this speed it up?
>
>
> val rdd = sc.parallelize(1 to 100, 30)
> rdd.count
>
>
>
>
> Thanks
> Best Regards
>
> On Wed, Apr 29, 2015 at 1:47 AM, Anshul Singhle <ans...@betaglide.com>
> wrote:
>
>> Hi,
>>
>> I'm running the following code in my cluster (standalone mode) via spark
>> shell -
>>
>> val rdd = sc.parallelize(1 to 100)
>> rdd.count
>>
>> This takes around 1.2s to run.
>>
>> Is this expected or am I configuring something wrong?
>>
>> I'm using about 30 cores with 512MB executor memory
>>
>> As expected, GC time is negligible. I'm just getting some scheduler delay
>> and 1s to launch the task
>>
>> Thanks,
>>
>> Anshul
>>
>
>
