Thanks for the information. Here's my understanding of the resource allocation (please correct me if I am wrong) and my scenario:
1. Assume the cluster is dedicated to a single Tez application, so I want to maximize that application's resource usage (memory/CPU).
2. Assume I have changed all the YARN-side configurations so that the memory/CPU allocation of each node is maximized (meaning each node can theoretically be fully utilized). The input is around 500GB~1TB.
3. Then I launch a Tez application (Hive on Tez). Tez chooses the number of tasks (in my case, usually about 3K), and each task usually runs for about 10~20 seconds.

In this case, I don't think the number of Tez tasks should be increased (each of them runs for only a few seconds, so I think each task is able to process its data). The swimlane picture is attached (for a smaller data size, but the DAG plans are the same). Container reuse is also switched on.

To maximize utilization, I would rather increase the number of containers so that more tasks can run in parallel, but I am not sure what the Tez AM bases the number of containers it requests from the RM on. Can I change the number of containers Tez asks for so that the job runs faster?

Xiaoyong

From: Jianfeng (Jeff) Zhang [mailto:[email protected]]
Sent: Friday, September 11, 2015 1:19 PM
To: [email protected]
Subject: Re: how to allocate more containers?

I think container reuse is enabled by default. You may disable it to get more containers, but that comes with a trade-off: resources are not used as efficiently.

Set tez.am.container.reuse.enabled = false

Best Regards,
Jeff Zhang

From: Jianfeng Zhang <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Friday, September 11, 2015 at 12:52 PM
To: "[email protected]" <[email protected]>
Subject: Re: how to allocate more containers?

Resource usage depends more on your cluster configuration (the resource scheduler configuration). Do you intend to increase parallelism (more tasks) to get more containers? There are also some configurations you can use to get containers more quickly, with some other trade-offs, but they would not give you more containers.

Best Regards,
Jeff Zhang

From: Xiaoyong Zhu <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Friday, September 11, 2015 at 12:38 PM
To: "[email protected]" <[email protected]>
Subject: how to allocate more containers?

Hi,

I am wondering if there is a configuration I can change to allocate more containers for a certain Tez application? I am using Hive on Tez.

Thanks!

Xiaoyong
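
For the scenario described in this thread (a dedicated cluster and roughly 3K short tasks), a rough upper bound on parallelism can be estimated from the node and container sizes. The sketch below is only back-of-the-envelope arithmetic, not how the Tez AM or the YARN scheduler actually computes anything. The function names and all numbers are hypothetical. It assumes node capacity comes from yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores, that the task container size comes from hive.tez.container.size, and it ignores queue limits and allocation rounding.

```python
import math


def max_concurrent_containers(num_nodes, node_memory_mb, node_vcores,
                              container_memory_mb, container_vcores=1,
                              am_memory_mb=0, count_vcores=True):
    """Rough upper bound on task containers that can run at the same time.

    Assumption: every node is identical and fully available to this job.
    Set count_vcores=False if the scheduler only accounts for memory.
    """
    per_node = node_memory_mb // container_memory_mb
    if count_vcores:
        per_node = min(per_node, node_vcores // container_vcores)
    total = num_nodes * per_node
    # The AM occupies a container too; subtract the task slots it displaces.
    am_slots = math.ceil(am_memory_mb / container_memory_mb)
    return max(total - am_slots, 0)


def waves(num_tasks, concurrent):
    """Number of rounds needed to run num_tasks with the given parallelism."""
    return math.ceil(num_tasks / concurrent)


# Hypothetical cluster: 10 nodes, 64 GB and 16 vcores each, 4 GB containers.
slots = max_concurrent_containers(10, 65536, 16, 4096, am_memory_mb=4096)
print(slots)               # 159
print(waves(3000, slots))  # 19
```

If the estimate is well above the number of containers the job actually gets, the limit is more likely in the scheduler configuration than in Tez, which matches the point made in the replies that resource usage depends mostly on the cluster configuration.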
