[
https://issues.apache.org/jira/browse/HADOOP-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697104#action_12697104
]
Hemanth Yamijala commented on HADOOP-5396:
------------------------------------------
Some comments:
AdminOperationsProtocol & MRAdmin:
- Should the API be refreshQueueAcls() ?
QueueManager:
- QUEUE_PROPERTIES_FILE_NAME: Can be QUEUE_ACLS_FILE_NAME.
- file name can be mapred-queue-acls.xml. The decision to keep queues and acl
enabled flag in mapred-default.xml seems ok to me.
- Some improved error message ? "Queue ACL in job tracker conf support is
deprecated, configure the same in : QUEUE_PROPERTIES_FILE_NAME". Something
like: "Configuring queue ACLs in mapred-site.xml or hadoop-site.xml is
deprecated. Configure queue ACLs in QUEUE_PROPERTIES_FILE_NAME."
- Code leading to backward compatibiliy, if we find that one ACL is configured,
it is sufficient to warn. I think we can break out at this point.
- Clearing the schedulerInfoObjects on refresh seems dangerous. These objects
are set by the scheduler. If the objects are cleared, and the schedulers are
not notified about this, they may not set them back and this would result in
loss of information.
- Since this refresh happens while the system is running, I think more care
should be taken towards error handling. For e.g. if an admin makes a mistake in
the queue config file - doesn't close a tag, say - and fires the command,
currently I think the old information is lost and the new information will be
lost too. I think it is mandatory that the old information is not lost and
important that the admin be told there's a problem with the configuration.
- Combining the two comments above, maybe one solution could be to factor out
the part parsing the ACLs into a separate method, something like HashMap
parseQueueAcls(). This can be called from initialize and the aclsMap member
varible should be replaced iff the returned map is valid (like non-null, or so
on). The refreshQueueAcls method can have an equivalent call in the
QueueManager (i.e. QueueManager.refreshAcls()) which in turn can call the same
parseQueueAcls() method. Would this work ?
Tests:
- Regarding the test case, I am wondering if we can move to a more conventional
unit test model to save on test time. So, we could directly test the logic in
the QueueManager by calling the hasAccess API before and after
QueueManager.refreshQueueAcls is called. We could set up multiple queues and
add/remove users to each queue for both types of operations and verify that the
operations after the refresh are appropriately different.
- This should also test that erroneously created files do not clear the current
settings
- The mapred-queue-acls template file should probably be described as "This is
a template file for queue ACLs configuration properties"
> Queue ACLs should be refreshed without requiring a restart of the job tracker
> -----------------------------------------------------------------------------
>
> Key: HADOOP-5396
> URL: https://issues.apache.org/jira/browse/HADOOP-5396
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Hemanth Yamijala
> Assignee: Vinod K V
> Attachments: cluster_setup.pdf, commands_manual.pdf,
> hadoop-5396-1.patch, HADOOP-5396-3-svn.txt, HADOOP-5396-4-svn.txt
>
>
> In large shared deployments of the M/R clusters, it is normal that new users
> will periodically want to get access to some queues on the M/R framework.
> Requiring a JT restart for each such change is operationally inconvenient and
> seems an overkill. There should be a way for updating ACLs with new users
> without requiring a JT restart.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.