[ https://issues.apache.org/jira/browse/HADOOP-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12643441#action_12643441 ]
Vivek Ratan commented on HADOOP-4513:
-------------------------------------

Some details.

The limits on initialized jobs apply to waiting jobs only. Because of user quotas, we actually need only one limit: the number of initialized (waiting) jobs per user. That number should probably be 1, 2 or 3. Let's assume it's 2.

User quotas decide how many concurrent users the queue can support at a given time, in terms of running jobs. If the user quota is 25%, for example, the queue can run jobs from up to 4 users. Suppose there are waiting jobs from 4 or more users. Then we need to asynchronously initialize the first 2 waiting jobs from each of the 4 users the quota admits, for a total of 8 jobs. That's because any waiting job that runs next will come from one of these 8 jobs. If only 2 users have waiting jobs, then we just need to asynchronously initialize 2 jobs from each of these 2 users.

So it doesn't make sense to have a per-queue limit on the total number of initialized jobs. Having such a limit can actually cause incorrect behavior, as this pre-configured limit may be small enough to prevent initialization of jobs from one or more users.

Note also that the total number of initialized jobs at any given time may be higher than what the limits specify, for two reasons: jobs can shift their position in the wait queue because of priorities, and jobs can complete between runs of the init thread (the thread that handles asynchronous inits). As an example, consider a limit of 2 jobs/user. Suppose three users have submitted jobs that are waiting. Our thread will initialize 6 jobs, two from each of the three users. Now suppose that one of the users submits a high priority job, which jumps to the head of the wait queue. The next time our init thread runs, it will have to initialize this high priority job, even though the user already has two jobs initialized. Ideally, the thread would un-initialize one of the 2 previously initialized jobs.
This is a nice optimization, but we probably don't need it right away.

> Capacity scheduler should initialize tasks asynchronously
> ---------------------------------------------------------
>
>                 Key: HADOOP-4513
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4513
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Sreekanth Ramakrishnan
>
> Currently, the capacity scheduler initializes tasks on demand, as opposed to
> the eager initialization technique used by the default scheduler. This is
> done in order to save JT memory footprint. However, the initialization is
> done in the {{assignTasks}} API which is not a good idea as task
> initialization could be a time consuming operation. This JIRA is to move out
> the initialization outside the {{assignTasks}} API and do it asynchronously.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
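
The selection rule described in the comment (initialize the first N waiting jobs per user, for as many users as the user quota admits) could be sketched roughly as below. This is only an illustration under assumptions: InitSelector, WaitingJob, selectForInit and runInitPass are hypothetical names, not classes or methods from the capacity scheduler, and deriving the user count as 100 divided by the quota percentage is one reading of the 25% example. The sketch does not un-initialize anything, so it also reproduces the case where a high priority job pushes a user past the per-user limit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class InitSelector {

    /** Hypothetical stand-in for a waiting job; not the scheduler's real job class. */
    static final class WaitingJob {
        final String id;
        final String user;
        boolean initialized;

        WaitingJob(String id, String user) {
            this.id = id;
            this.user = user;
        }
    }

    /**
     * Walks the wait queue in order and returns the jobs that still need
     * initialization: the first perUserLimit waiting jobs of each of the
     * first maxUsers distinct users (maxUsers is an assumed parameter).
     */
    static List<WaitingJob> selectForInit(List<WaitingJob> waitQueue,
                                          int perUserLimit, int maxUsers) {
        Set<String> users = new LinkedHashSet<>();
        Map<String, Integer> seenPerUser = new HashMap<>();
        List<WaitingJob> toInit = new ArrayList<>();
        for (WaitingJob job : waitQueue) {
            if (!users.contains(job.user)) {
                if (users.size() >= maxUsers) {
                    continue; // maxUsers other users are already being considered
                }
                users.add(job.user);
            }
            // Position of this job among its user's waiting jobs (1-based).
            int seen = seenPerUser.merge(job.user, 1, Integer::sum);
            if (seen <= perUserLimit && !job.initialized) {
                toInit.add(job);
            }
        }
        return toInit;
    }

    /** One pass of the init thread; returns how many jobs in the queue are initialized. */
    static int runInitPass(List<WaitingJob> waitQueue, int perUserLimit, int maxUsers) {
        for (WaitingJob job : selectForInit(waitQueue, perUserLimit, maxUsers)) {
            job.initialized = true; // stands in for the real, expensive initialization
        }
        int total = 0;
        for (WaitingJob job : waitQueue) {
            if (job.initialized) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int maxUsers = 100 / 25; // assumption: a 25% user quota admits up to 4 users
        List<WaitingJob> queue = new ArrayList<>();
        for (String user : new String[] {"u1", "u2", "u3"}) {
            for (int i = 1; i <= 3; i++) {
                queue.add(new WaitingJob(user + "-job" + i, user));
            }
        }
        System.out.println(runInitPass(queue, 2, maxUsers)); // prints 6

        // A high priority job from u1 jumps to the head of the wait queue.
        queue.add(0, new WaitingJob("u1-urgent", "u1"));
        System.out.println(runInitPass(queue, 2, maxUsers)); // prints 7
    }
}
```

The second pass leaves 7 jobs initialized against a nominal 2 per user for 3 users, which is the overshoot the comment describes. Un-initializing one of u1's earlier jobs would bring the total back to 6.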