Probably network / shuffling cost? Or broadcast variables? Can you provide more 
details about what you are doing, along with some timings?

> On 9. Apr 2018, at 07:07, Junfeng Chen <> wrote:
> I have written a Spark Streaming application that reads Kafka data, converts 
> the JSON data to Parquet, and saves it to HDFS. 
> What puzzles me is that the processing time of the app in YARN mode is 20% to 
> 50% longer than in local mode. My cluster has three nodes with three node 
> managers, and all three hosts have the same hardware: 40 cores and 256GB of memory. 
> Why? How can I solve it? 
> Regards,
> Junfeng Chen
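
For reference, the pipeline described above might look roughly like the following Structured Streaming sketch. All names here (broker address, topic, schema fields, HDFS paths) are placeholders, since the original message does not include code:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

object KafkaJsonToParquet {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-json-to-parquet")
      .getOrCreate()
    import spark.implicits._

    // Placeholder schema for the incoming JSON records.
    val schema = new StructType()
      .add("id", LongType)
      .add("payload", StringType)

    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092") // assumed broker
      .option("subscribe", "events")                    // assumed topic
      .load()
      .selectExpr("CAST(value AS STRING) AS json")
      .select(from_json($"json", schema).as("data"))
      .select("data.*")

    events.writeStream
      .format("parquet")
      .option("path", "hdfs:///data/events")            // assumed output path
      .option("checkpointLocation", "hdfs:///checkpoints/events")
      .start()
      .awaitTermination()
  }
}
```

In local mode this runs entirely in one JVM, so a YARN deployment adds task serialization, scheduling, and network transfer between executors; comparing per-stage timings in the Spark UI for both modes would show where the extra 20–50% goes.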
