Thanks for your answers. My understanding is that minimum-user-limit-percent handles resource sharing inside a queue. My problem is that a single user blocks not only its own queue (which is OK) but all the other queues as well (which is not OK).
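For what it's worth, the only lever I have found so far that looks relevant is Capacity Scheduler preemption. If I read the Hadoop 2.x documentation correctly, enabling the ProportionalCapacityPreemptionPolicy should let the ResourceManager reclaim containers from a queue that is running over its capacity. This is a sketch of what I would try, untested on my side (the timing and per-round values below are just the documented defaults):

yarn.resourcemanager.scheduler.monitor.enable=true
yarn.resourcemanager.scheduler.monitor.policies=org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy
# how often the policy looks for over-capacity queues (ms)
yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval=3000
# grace period before a container marked for preemption is actually killed (ms)
yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill=15000
# at most 10% of cluster resources preempted per round
yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round=0.1

Has anyone used this policy together with user-limit-factor > 1? In particular, does it manage to reclaim containers from a job whose containers are hung?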
My configuration is:

yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.accessible-node-labels.default.capacity=-1
yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity=-1
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.background.acl_administer_jobs=*
yarn.scheduler.capacity.root.background.acl_submit_applications=*
yarn.scheduler.capacity.root.background.capacity=10
yarn.scheduler.capacity.root.background.maximum-capacity=100
yarn.scheduler.capacity.root.background.state=RUNNING
yarn.scheduler.capacity.root.background.user-limit-factor=10
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.xxx.acl_administer_jobs=*
yarn.scheduler.capacity.root.xxx.acl_submit_applications=*
yarn.scheduler.capacity.root.xxx.capacity=50
yarn.scheduler.capacity.root.xxx.maximum-capacity=100
yarn.scheduler.capacity.root.xxx.state=RUNNING
yarn.scheduler.capacity.root.xxx.user-limit-factor=10
yarn.scheduler.capacity.root.default-node-label-expression=
yarn.scheduler.capacity.root.default.acl_administer_jobs=*
yarn.scheduler.capacity.root.default.acl_submit_applications=*
yarn.scheduler.capacity.root.default.capacity=40
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=10
yarn.scheduler.capacity.root.queues=default,xxx,background

I know that default.user-limit-factor=1 would solve the problem. But I want to allow a single user to have the full power of the cluster when no one else is using it; I would even say that is the whole point of multi-tenancy, otherwise it would be easier to run several separate Hadoop clusters. What I want is:

- To run a background job on my background queue (10%), which uses 100% of the cluster as long as no one else is using it.
- When a query is sent to the default queue (40%), to have the background job lower its activity to 20% of the cluster, so that the default queue query can access 80% of the cluster's resources (see my P.S. below the quoted thread for the arithmetic as I understand it).

That actually works pretty well with any standard Hive or Spark activity. But if I send my faulty Hive query to the background queue, then everything is frozen.

2015-05-28 18:00 GMT+02:00 Birender Saini <[email protected]>:

> Julien -
>
> Sounds like you are using the default Capacity Scheduler settings, which have
> minimum-user-limit-percent = 100, meaning the minimum guaranteed resources
> for a single user is 100%.
>
> Read more about this property here:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_system-admin-guide/content/setting_user_limits.html
>
> If you want to read more about the Capacity Scheduler and the key properties
> that can help you fine-tune multi-tenancy, see this:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_yarn_resource_mgt/content/ref-25c07006-4490-4e57-b04c-7582c6ee16b8.1.html
>
> Here's another article explaining how to tune Hive for Interactive and
> Batch Queries:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_performance_tuning/content/ch_performance_interactive_queue_chapter.html
>
> Biren Saini | Solutions Engineer, Hortonworks
> Mobile: 704-526-8148
> Email: [email protected]
> Website: http://www.hortonworks.com/
>
>
> On May 28, 2015, at 11:28 AM, Julien Carme <[email protected]> wrote:
>
> Hello,
>
> I am experimenting with multi-tenancy in Hadoop.
>
> I have a Hive query which never returns a result and whose
> containers seem to freeze forever. It is basically a join where all key
> values of both input tables are the same.
>
> I understand there can be bugs in Hive and that they will be corrected at
> some point, and twisted queries like this one might crash Hive.
>
> However, once this query is submitted, the whole cluster is frozen,
> including the other queues. The entire cluster is useless until you have
> manually killed the faulty application. If you want to use a single Hadoop
> cluster for several customers, this is a major issue.
>
> Is this the expected behavior? Once YARN has assigned all its containers,
> is waiting until they have finished their job the only thing it can do?
> What could be a solution to this problem?
>
> Thanks for your answers.
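P.S. Here is the arithmetic behind the 80/20 split I mention above, as I understand the Capacity Scheduler documentation (please correct me if I am wrong):

# a single user on a queue can grow up to capacity * user-limit-factor,
# capped by the queue's maximum-capacity
background: min(10% * 10, 100%) = 100% of the cluster when everything else is idle
default:    min(40% * 10, 100%) = 100% of the cluster when everything else is idle

# when both queues are busy, I would expect resources to converge towards the
# configured capacities of the active queues, i.e. default:background = 40:10,
# hence roughly 80/20; but without preemption, YARN can only reassign
# containers as they complete, and my faulty job never completes any.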
