Hi xiefeng,

Even if your RDDs are tiny and reduced to a single partition, every job
still carries orchestration overhead: tasks have to be sent to the
executor(s), results have to be collected and reduced, and so on. None
of this is free.
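
If you want to see that fixed cost for yourself, here is a rough sketch
(not a proper benchmark). It assumes PySpark is installed and uses a
local[1] master; the helper names time_calls, spark_job_latencies and
summarize are made up for this example. The context is created once and
a warm-up job is run first, so context initialization is left out of
the numbers, as in your test.

```python
import statistics
import time


def time_calls(fn, runs=20):
    """Return the wall-clock latency of each fn() call, in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Reduce a list of latencies to a median and a maximum."""
    return {"median_ms": statistics.median(latencies), "max_ms": max(latencies)}


def spark_job_latencies(runs=20):
    """Time a trivial one-partition job on an already-initialized context."""
    # Imported lazily so the helpers above work without PySpark installed.
    from pyspark.sql import SparkSession

    # Assumption: a single-threaded local master, for illustration only.
    spark = (
        SparkSession.builder.master("local[1]")
        .appName("overhead-sketch")
        .getOrCreate()
    )
    try:
        rdd = spark.sparkContext.parallelize([1, 2, 3], 1)  # tiny, one partition
        rdd.count()  # warm-up job, so start-up cost is not measured
        return time_calls(rdd.count, runs)
    finally:
        spark.stop()
```

Calling summarize(spark_job_latencies()) gives the per-job latency of a
job that does essentially no computation, so whatever it reports is
scheduling and result-collection overhead rather than work on the data.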

If you need fast, [near] real-time processing, take a look at Spark
Streaming.

Regards,
-- 
  Bedrytski Aliaksandr
  [email protected]

On Mon, Sep 5, 2016, at 04:36, xiefeng wrote:
> The spark context will be reused, so the spark context initialization
> won't affect the throughput test.
> 
> 
> 
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Why-does-spark-take-so-much-time-for-simple-task-without-calculation-tp27628p27657.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: [email protected]
> 

