Re: How data locality is honored when spark is running on yarn

2016-01-27 Thread Saisai Shao
Hi Todd, There are two levels of locality-based scheduling when you run Spark on YARN, if dynamic allocation is enabled: 1. Container allocation is based on the locality ratio of pending tasks; this is YARN-specific and only works with dynamic allocation enabled. 2. Task scheduling is locality
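
Since the container-level locality described in point 1 only applies with dynamic allocation enabled, here is a rough sketch of how one might assemble a spark-submit command line that turns it on for a YARN run. The helper name build_spark_submit and the jar name my_app.jar are made up for illustration. The two configuration keys are standard Spark properties; enabling the external shuffle service alongside dynamic allocation is an assumption based on the usual YARN setup and may not match every cluster.

```python
import shlex


def build_spark_submit(app, deploy_mode="cluster", dynamic_allocation=True):
    """Return a spark-submit argument list for a YARN run.

    `app` is a hypothetical application jar or script path.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")

    conf = {}
    if dynamic_allocation:
        # Standard Spark properties; the values here are illustrative.
        conf["spark.dynamicAllocation.enabled"] = "true"
        # Assumption: the external shuffle service is running on the NodeManagers.
        conf["spark.shuffle.service.enabled"] = "true"

    args = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # Print a shell-safe command line for a cluster-mode run.
    print(" ".join(shlex.quote(a) for a in build_spark_submit("my_app.jar")))
```

The same helper covers both client and cluster mode by changing deploy_mode; passing dynamic_allocation=False leaves out the --conf flags entirely.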

How data locality is honored when spark is running on yarn

2016-01-27 Thread Todd
Hi, I am somewhat confused about how data locality is honored when Spark is running on YARN (client or cluster mode). Can someone please elaborate on this? Thanks!