Thanks for your answers. My understanding is that minimum-user-limit-percent handles resource sharing inside a queue. My problem is that a single user blocks not only its own queue (which is OK) but all the other queues as well (which is not OK).
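For what it's worth, the only lever I have found so far that looks relevant is Capacity Scheduler preemption. If I read the Hadoop 2.x documentation correctly, enabling the ProportionalCapacityPreemptionPolicy should let the ResourceManager reclaim containers from a queue that is running over its capacity. This is a sketch of what I would try, untested on my side (the timing and per-round values below are just the documented defaults):

yarn.resourcemanager.scheduler.monitor.enable=true
yarn.resourcemanager.scheduler.monitor.policies=org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy
# how often the policy looks for over-capacity queues (ms)
yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval=3000
# grace period before a container marked for preemption is actually killed (ms)
yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill=15000
# at most 10% of cluster resources preempted per round
yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round=0.1

Has anyone used this policy together with user-limit-factor > 1? In particular, does it manage to reclaim containers from a job whose containers are hung?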
My configuration is:

yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.accessible-node-labels.default.capacity=-1
yarn.scheduler.capacity.root.accessible-node-labels.default.maximum-capacity=-1
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.background.acl_administer_jobs=*
yarn.scheduler.capacity.root.background.acl_submit_applications=*
yarn.scheduler.capacity.root.background.capacity=10
yarn.scheduler.capacity.root.background.maximum-capacity=100
yarn.scheduler.capacity.root.background.state=RUNNING
yarn.scheduler.capacity.root.background.user-limit-factor=10
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.xxx.acl_administer_jobs=*
yarn.scheduler.capacity.root.xxx.acl_submit_applications=*
yarn.scheduler.capacity.root.xxx.capacity=50
yarn.scheduler.capacity.root.xxx.maximum-capacity=100
yarn.scheduler.capacity.root.xxx.state=RUNNING
yarn.scheduler.capacity.root.xxx.user-limit-factor=10
yarn.scheduler.capacity.root.default-node-label-expression=
yarn.scheduler.capacity.root.default.acl_administer_jobs=*
yarn.scheduler.capacity.root.default.acl_submit_applications=*
yarn.scheduler.capacity.root.default.capacity=40
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=10
yarn.scheduler.capacity.root.queues=default,xxx,background

I know that default.user-limit-factor=1 would solve the problem. But I want to allow a single user to have the full power of the cluster when no one else is using it; I would even say that is the whole point of multi-tenancy, otherwise it would be easier to run several separate Hadoop clusters. What I want is:

- To run a background job on my background queue (10%), which uses 100% of the cluster as long as no one else is using it.
- When a query is sent to the default queue (40%), to have the background job lower its activity to 20% of the cluster, so that the default queue query can access 80% of the cluster's resources (see my P.S. below the quoted thread for the arithmetic as I understand it).

That actually works pretty well with any standard Hive or Spark activity. But if I send my faulty Hive query to the background queue, then everything is frozen.

2015-05-28 18:00 GMT+02:00 Birender Saini <[email protected]>:

> Julien -
>
> Sounds like you are using the default Capacity Scheduler settings, which have
> minimum-user-limit-percent = 100, meaning the minimum guaranteed resources
> for a single user is 100%.
>
> Read more about this property here:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_system-admin-guide/content/setting_user_limits.html
>
> If you want to read more about the Capacity Scheduler and the key properties
> that can help you fine-tune multi-tenancy, see this:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_yarn_resource_mgt/content/ref-25c07006-4490-4e57-b04c-7582c6ee16b8.1.html
>
> Here's another article explaining how to tune Hive for Interactive and
> Batch Queries:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_performance_tuning/content/ch_performance_interactive_queue_chapter.html
>
> Biren Saini | Solutions Engineer, Hortonworks
> Mobile: 704-526-8148
> Email: [email protected]
> Website: http://www.hortonworks.com/
>
>
> On May 28, 2015, at 11:28 AM, Julien Carme <[email protected]> wrote:
>
> Hello,
>
> I am experimenting with multi-tenancy in Hadoop.
>
> I have a Hive query which never returns a result and whose
> containers seem to freeze forever. It is basically a join where all key
> values of both input tables are the same.
>
> I understand there can be bugs in Hive and that they will be corrected at
> some point, and twisted queries like this one might crash Hive.
>
> However, once this query is submitted, the whole cluster is frozen,
> including the other queues. The entire cluster is useless until you have
> manually killed the faulty application. If you want to use a single Hadoop
> cluster for several customers, this is a major issue.
>
> Is this the expected behavior? Once YARN has assigned all its containers,
> is waiting until they have finished their job the only thing it can do?
> What could be a solution to this problem?
>
> Thanks for your answers.
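P.S. Here is the arithmetic behind the 80/20 split I mention above, as I understand the Capacity Scheduler documentation (please correct me if I am wrong):

# a single user on a queue can grow up to capacity * user-limit-factor,
# capped by the queue's maximum-capacity
background: min(10% * 10, 100%) = 100% of the cluster when everything else is idle
default:    min(40% * 10, 100%) = 100% of the cluster when everything else is idle

# when both queues are busy, I would expect resources to converge towards the
# configured capacities of the active queues, i.e. default:background = 40:10,
# hence roughly 80/20; but without preemption, YARN can only reassign
# containers as they complete, and my faulty job never completes any.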
