Hi Hari,

Thanks for this information.

Do you have any resources on, or can you explain, why YARN has this as the
default behaviour? What would be the advantages, or in which scenarios would
it help, to have multiple assignments in a single heartbeat?
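
For reference, the option in question lives in capacity-scheduler.xml; a
minimal sketch of disabling it (the file location varies by distribution, so
treat that as an assumption):

```xml
<!-- capacity-scheduler.xml: disable multiple container assignments per
     node heartbeat so the scheduler spreads containers across nodes -->
<property>
  <name>yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled</name>
  <value>false</value>
</property>
```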


Regards
Akshay Bhardwaj
+91-97111-33849


On Mon, May 20, 2019 at 1:29 PM Hariharan <hariharan...@gmail.com> wrote:

> Hi Akshay,
>
> I believe HDP uses the capacity scheduler by default. In the capacity
> scheduler, assignment of multiple containers on the same node is
> determined by the option
> yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled,
> which is true by default. If you would like YARN to spread out the
> containers, you can set this to false.
>
> You can read more about this and associated parameters here
> -
> https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html
>
> ~ Hari
>
>
> On Mon, May 20, 2019 at 11:16 AM Akshay Bhardwaj
> <akshay.bhardwaj1...@gmail.com> wrote:
> >
> > Hi All,
> >
> > Just floating this email again. Grateful for any suggestions.
> >
> > Akshay Bhardwaj
> > +91-97111-33849
> >
> >
> > On Mon, May 20, 2019 at 12:25 AM Akshay Bhardwaj <
> > akshay.bhardwaj1...@gmail.com> wrote:
> >>
> >> Hi All,
> >>
> >> I am running Spark 2.3 on YARN using HDP 2.6
> >>
> >> I am running a Spark job using dynamic resource allocation on YARN,
> >> with a minimum of 2 executors and a maximum of 6. The job reads data
> >> from Parquet files on S3 buckets and stores enriched data to Cassandra.
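> >>
> >> For context, the allocation settings look roughly like this
> >> (spark-defaults.conf sketch; property names are the standard Spark
> >> dynamic allocation options, values as stated above):
> >>
> >> ```
> >> spark.dynamicAllocation.enabled true
> >> spark.dynamicAllocation.minExecutors 2
> >> spark.dynamicAllocation.maxExecutors 6
> >> spark.shuffle.service.enabled true
> >> ```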
> >>
> >> My question is: how does YARN decide on which nodes to launch
> >> containers? I have around 12 YARN nodes running in the cluster, but I
> >> still see repeated patterns of 3-4 containers launched on the same
> >> node for a particular job.
> >>
> >> What is the best way to start debugging this behaviour?
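> >>
> >> So far I have only looked at container placement via the YARN CLI,
> >> roughly like this (the application and attempt IDs below are
> >> hypothetical; requires a running cluster):
> >>
> >> ```shell
> >> # Find the running application's ID
> >> yarn application -list -appStates RUNNING
> >>
> >> # List the attempts for that application
> >> yarn applicationattempt -list application_1558300000000_0001
> >>
> >> # List containers for the attempt; the output shows which host
> >> # each container was assigned to
> >> yarn container -list appattempt_1558300000000_0001_000001
> >> ```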
> >>
> >> Akshay Bhardwaj
> >> +91-97111-33849
>
