[jira] Commented: (HADOOP-5396) Queue ACLs should be refreshed without requiring a restart of the job tracker

Hemanth Yamijala (JIRA) Wed, 08 Apr 2009 10:14:34 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697104#action_12697104
 ]


Hemanth Yamijala commented on HADOOP-5396:
------------------------------------------

Some comments:

AdminOperationsProtocol & MRAdmin:

- Should the API be refreshQueueAcls() ?

QueueManager:
- QUEUE_PROPERTIES_FILE_NAME: Can be QUEUE_ACLS_FILE_NAME. 
- file name can be mapred-queue-acls.xml. The decision to keep queues and acl 
enabled flag in mapred-default.xml seems ok to me.
- Some improved error message ? "Queue ACL in job tracker conf support is 
deprecated, configure the same in : QUEUE_PROPERTIES_FILE_NAME". Something 
like: "Configuring queue ACLs in mapred-site.xml or hadoop-site.xml is 
deprecated. Configure queue ACLs in QUEUE_PROPERTIES_FILE_NAME."
- Code leading to backward compatibiliy, if we find that one ACL is configured, 
it is sufficient to warn. I think we can break out at this point.
- Clearing the schedulerInfoObjects on refresh seems dangerous. These objects 
are set by the scheduler. If the objects are cleared, and the schedulers are 
not notified about this, they may not set them back and this would result in 
loss of information.
- Since this refresh happens while the system is running, I think more care 
should be taken towards error handling. For e.g. if an admin makes a mistake in 
the queue config file - doesn't close a tag, say - and fires the command, 
currently I think the old information is lost and the new information will be 
lost too. I think it is mandatory that the old information is not lost and 
important that the admin be told there's a problem with the configuration.
- Combining the two comments above, maybe one solution could be to factor out 
the part parsing the ACLs into a separate method, something like HashMap 
parseQueueAcls(). This can be called from initialize and the aclsMap member 
varible should be replaced iff the returned map is valid (like non-null, or so 
on). The refreshQueueAcls method can have an equivalent call in the 
QueueManager (i.e. QueueManager.refreshAcls()) which in turn can call the same 
parseQueueAcls() method. Would this work ?

Tests:
- Regarding the test case, I am wondering if we can move to a more conventional 
unit test model to save on test time. So, we could directly test the logic in 
the QueueManager by calling the hasAccess API before and after 
QueueManager.refreshQueueAcls is called. We could set up multiple queues and 
add/remove users to each queue for both types of operations and verify that the 
operations after the refresh are appropriately different. 
- This should also test that erroneously created files do not clear the current 
settings

- The mapred-queue-acls template file should probably be described as "This is 
a template file for queue ACLs configuration properties"

> Queue ACLs should be refreshed without requiring a restart of the job tracker
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-5396
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5396
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>         Attachments: cluster_setup.pdf, commands_manual.pdf, 
> hadoop-5396-1.patch, HADOOP-5396-3-svn.txt, HADOOP-5396-4-svn.txt
>
>
> In large shared deployments of the M/R clusters, it is normal that new users 
> will periodically want to get access to some queues on the M/R framework. 
> Requiring a JT restart for each such change is operationally inconvenient and 
> seems an overkill. There should be a way for updating ACLs with new users 
> without requiring a JT restart.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-5396) Queue ACLs should be refreshed without requiring a restart of the job tracker

Reply via email to