[ https://issues.apache.org/jira/browse/HADOOP-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594895#comment-13594895 ]
Chris Nauroth commented on HADOOP-8973:
---------------------------------------

{quote}
So please make both versions of checkDirs consistent, ideally without code duplication.
{quote}

Definitely agreed. Can we please keep the scope of this jira limited to Windows compatibility for the current logic? Arpit has already filed HADOOP-9367 for follow-up on the existing flaws in the code.

{quote}
It seems to me that what checkDir(FS, Path, Perm) is doing is sufficient for our use cases. The code to check permissions via the "implies" functions is used in a number of places in the code. What I am trying to understand is why the check is sufficient/correct for all those cases and not here. If there is an issue in that logic, then we have issues in a number of critical places in the code which should be fixed. It's more than a question of cross-platform compatibility. If there is no issue in that logic, then it's not clear why we need to do something different for this particular case.
{quote}

There are two broad categories of usage for {{FsAction#implies}}:
# Inside the implementation of the HDFS permission model. This includes things like {{FSDirectory}} and {{FSPermissionChecker}}. In this case, the logic of {{FsAction#implies}} is sufficient, because we know that this is the only kind of permission model supported by HDFS. It doesn't support ACLs (at least not yet).
# In various places that need to interact with local disk. This includes things like {{NameNode}} and {{DataNode}} initialization, {{ClientDistributedCacheManager}}, and {{FSDownload}}. In these cases, the logic of {{FsAction#implies}} is incomplete, because it doesn't cover all of the permission checks that the host OS might enforce on the local disk, such as ACLs.

I believe this is the key distinction: HDFS (a full permission model under our control) vs. local disk (a permission model partially invisible to our code).
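For context, {{FsAction#implies}} is essentially a subset test over the classic rwx permission bits. A simplified, self-contained model of that behavior (a sketch, not the Hadoop source; the hypothetical {{Action}} enum's ordering mirrors the octal rwx encoding so ordinals double as bitmasks) helps show why such a check can never see OS-level ACLs:

```java
// Simplified stand-in for org.apache.hadoop.fs.permission.FsAction.
// The declaration order mirrors the octal rwx encoding, so each
// constant's ordinal doubles as its permission bitmask.
enum Action {
  NONE,           // 0 = ---
  EXECUTE,        // 1 = --x
  WRITE,          // 2 = -w-
  WRITE_EXECUTE,  // 3 = -wx
  READ,           // 4 = r--
  READ_EXECUTE,   // 5 = r-x
  READ_WRITE,     // 6 = rw-
  ALL;            // 7 = rwx

  // Subset test: every bit required by 'that' must also be set in 'this'.
  boolean implies(Action that) {
    return that != null && (ordinal() & that.ordinal()) == that.ordinal();
  }
}

public class ImpliesDemo {
  public static void main(String[] args) {
    System.out.println(Action.READ_WRITE.implies(Action.READ));    // true
    System.out.println(Action.READ_EXECUTE.implies(Action.WRITE)); // false
    // The check reasons only about the mode bits it is handed. An NTFS
    // deny ACL on the directory is invisible here, so implies() can
    // return true even when an actual read on the local disk would fail.
    System.out.println(Action.ALL.implies(Action.READ_WRITE));     // true
  }
}
```

The second bullet above is exactly the case where this bitmask-only view falls short: the OS may veto an operation for reasons the mode bits never encode.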
I'm starting to think that instead of {{FsAction#implies}}, we need to write a wrapper API that looks more like {{access}} from unistd.h, and route all local disk interactions through it: http://linux.die.net/man/2/access

This API just says "tell me whether I can read/write/execute this file." It insulates the caller from the implementation details of the permission model. Again, though, I think this is far out of scope for the current issue.

{quote}
We need to be careful about the performance impact of how the check is done
{quote}

I believe the lists of directories to check are small in practice: things like {{dfs.namenode.name.dir}}, {{dfs.datanode.data.dir}}, and {{yarn.nodemanager.local-dirs}}. Also, there is no performance regression on Linux, because this logic is guarded by an {{if (WINDOWS)}} check. After migration to Java 7, even Windows won't need to do this.

> DiskChecker cannot reliably detect an inaccessible disk on Windows with NTFS
> ACLs
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-8973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8973
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: trunk-win
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-8973-branch-trunk-win.2.patch,
>                      HADOOP-8973-branch-trunk-win.patch
>
>
> DiskChecker.checkDir uses File.canRead, File.canWrite, and File.canExecute to
> check if a directory is inaccessible. These APIs are not reliable on Windows
> with NTFS ACLs due to a known JVM bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira