Are your driver running on the same m/c as master?
On 29 Apr 2015 03:59, "Anshul Singhle" <ans...@betaglide.com> wrote:

> Hi,
>
> I'm running short spark jobs on rdds cached in memory. I'm also using a
> long running job context. I want to be able to complete my jobs (on the
> cached rdd) in under 1 sec.
> I'm getting the following job times with about 15 GB of data distributed
> across 6 nodes. Each executor has about 20GB of memory available. My
> context has about 26 cores in total.
>
> If number of partitions < no of cores -
>
> Some jobs run in 3s others take about 6s -- the time difference can be
> explained by GC time.
>
> If number of partitions = no of cores -
>
> All jobs run in 4s. The initial tasks of each stage on every executor take
> about 1s.
>
> If partitions > cores -
>
> Jobs take more time. The initial tasks of each stage on every executor
> take about 1s. The other tasks run in 45-50 ms each. However, since the
> initial tasks again take about 1s each, the total time in this case is
> about 6s which is more than the previous case.
>
> Clearly the limiting factor here is the initial set of tasks. For every
> case, these tasks take 1s to run, no matter the amount of partitions. Hence
> best results are obtained with partitions = cores, because in that case.
> every core gets 1 task which takes 1s to run.
> In this case, I get 0 GC time. The only explanation is "scheduling delay"
> which is about 0.2 - 0.3 seconds. I looked at my task size and result size
> and that has no bearing on this delay. Also, I'm not getting the task size
> warnings in the logs.
>
> For what I can understand, the first time a task runs on a core, it takes
> 1s to run. Is this normal?
>
> Is it possible to get sub-second latencies?
>
> Can something be done about the scheduler delay?
>
> What other things can I look at to reduce this time?
>
> Regards,
>
> Anshul
>
>
>

Reply via email to