Probably network / shuffling cost, or broadcast variables? Can you provide more 
details on what the job does, plus some timings?
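For the timings to be comparable, it also helps to pin down how many resources each mode actually gets; local mode uses all cores of one box, while yarn mode only gets what you request. A rough sketch of how the two runs might be submitted (the executor counts and memory values here are illustrative, not a recommendation):

```shell
# Local mode: one JVM using every core on the driver host
spark-submit --master "local[*]" \
  --driver-memory 8g \
  my_streaming_app.jar

# YARN mode: resources must be requested explicitly; without
# flags like these, defaults can be far smaller than local[*]
spark-submit --master yarn --deploy-mode cluster \
  --num-executors 6 \
  --executor-cores 5 \
  --executor-memory 16g \
  my_streaming_app.jar
```

If the yarn run was launched with default executor settings, the "20-50% slower" gap may simply be fewer cores doing the work, before any network or shuffle cost comes into play.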

> On 9. Apr 2018, at 07:07, Junfeng Chen <darou...@gmail.com> wrote:
> 
> I have written a Spark Streaming application that reads Kafka data, converts 
> the JSON to Parquet, and saves it to HDFS. 
> What puzzles me is that the processing time in yarn mode is 20% to 50% 
> longer than in local mode. My cluster has three nodes with three node 
> managers, and all three hosts have the same hardware: 40 cores and 256 GB of memory.
> 
> Why is that, and how can I fix it? 
> 
> Regards,
> Junfeng Chen

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
