[ https://issues.apache.org/jira/browse/HDFS-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Wang updated HDFS-4680:
------------------------------
Attachment: hdfs-4680-3.patch
Thanks for the review, ATM. Here's a v3 that removes the extraneous printlns.
> Audit logging of delegation tokens for MR tracing
> -------------------------------------------------
>
> Key: HDFS-4680
> URL: https://issues.apache.org/jira/browse/HDFS-4680
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode, security
> Affects Versions: 2.0.3-alpha
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Attachments: hdfs-4680-1.patch, hdfs-4680-2.patch, hdfs-4680-3.patch
>
>
> HDFS audit logging tracks HDFS operations made by different users, e.g.
> creation and deletion of files. This is useful for after-the-fact root cause
> analysis and security. However, logging merely the username is insufficient
> for many use cases. For instance, it is common for a single user to run
> multiple MapReduce jobs (I believe this is the case with Hive). In this
> scenario, given a particular audit log entry, it is difficult to trace it
> back to the MR job or task that generated that entry.
> I see a number of potential options for implementing this.
> 1. Make an optional "client name" field part of the NN RPC format. We already
> pass a {{clientName}} as a parameter in many RPC calls, so this would
> essentially make it standardized. MR tasks could then set this field to the
> job and task ID.
> 2. This could be generalized to a set of optional key-value *tags* in the NN
> RPC format, which would then be audit logged. This has standalone benefits
> outside of just verifying MR task ids.
> 3. Neither of the above two options securely verifies that MR clients are
> who they claim to be. Doing this securely requires the JobTracker to sign
> MR task attempts and the NN to verify this signature. However, this is
> substantially more work, and could be built on top of idea #2 later.
> Thoughts welcomed.
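
As a rough illustration of options 2 and 3 above, here is a minimal Java sketch. Everything in it is hypothetical: the class `AuditTagSketch`, its method names, the `tag.` prefix, and the simplified tab-separated line layout are assumptions made for this example and are not part of HDFS or of the attached patches. It shows optional key-value tags being appended to an audit line, and a task attempt ID being signed with a shared secret (HMAC-SHA256 is an arbitrary choice here) so that a verifier can check it.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch only; none of these names exist in Hadoop.
final class AuditTagSketch {

    // Option 2: append optional key-value tags to a simplified audit line.
    // Tags are sorted by key so the output is deterministic. Values are not
    // escaped, so a real implementation would need to handle tabs and '='.
    static String formatAuditLine(String user, String cmd, String src,
                                  Map<String, String> tags) {
        StringBuilder sb = new StringBuilder();
        sb.append("ugi=").append(user)
          .append("\tcmd=").append(cmd)
          .append("\tsrc=").append(src);
        for (Map.Entry<String, String> e : new TreeMap<>(tags).entrySet()) {
            sb.append("\ttag.").append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Option 3, signing side (the JobTracker in the proposal): compute an
    // HMAC over the task attempt ID using a secret shared with the verifier.
    static byte[] sign(byte[] secret, String taskAttemptId) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(taskAttemptId.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }

    // Option 3, verifying side (the NN in the proposal): recompute the HMAC
    // and compare it with the presented signature in constant time.
    static boolean verify(byte[] secret, String taskAttemptId, byte[] signature) {
        return MessageDigest.isEqual(sign(secret, taskAttemptId), signature);
    }

    public static void main(String[] args) {
        byte[] secret = "demo-secret".getBytes(StandardCharsets.UTF_8);
        String attemptId = "attempt_0001_m_000000_0"; // made-up ID

        Map<String, String> tags = new HashMap<>();
        tags.put("attemptId", attemptId);
        System.out.println(formatAuditLine("alice", "create", "/tmp/f", tags));

        byte[] sig = sign(secret, attemptId);
        System.out.println("signature valid: " + verify(secret, attemptId, sig));
    }
}
```

In this sketch the unsigned tags correspond to options 1 and 2: they are useful for tracing but can be set to anything by the client. The signature check corresponds to option 3, where only a holder of the shared secret can produce a tag value that the verifier accepts.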