[
https://issues.apache.org/jira/browse/DRILL-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16240724#comment-16240724
]
ASF GitHub Bot commented on DRILL-4990:
---------------------------------------
Github user ppadma commented on a diff in the pull request:
https://github.com/apache/drill/pull/652#discussion_r149173203
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java
---
@@ -151,17 +152,32 @@ public WorkspaceSchemaFactory(
*/
public boolean accessible(final String userName) throws IOException {
final FileSystem fs = ImpersonationUtil.createFileSystem(userName,
fsConf);
+ boolean tryListStatus = false;
try {
- // We have to rely on the listStatus as a FileSystem can have
complicated controls such as regular unix style
- // permissions, Access Control Lists (ACLs) or Access Control
Expressions (ACE). Hadoop 2.7 version of FileSystem
- // has a limited private API (FileSystem.access) to check the
permissions directly
- // (see https://issues.apache.org/jira/browse/HDFS-6570). Drill
currently relies on Hadoop 2.5.0 version of
- // FileClient. TODO: Update this when DRILL-3749 is fixed.
- fs.listStatus(wsPath);
+ // access API checks if a user has certain permissions on a file or
directory.
+ // returns normally if requested permissions are granted and throws
an exception
+ // if access is denied. This API was added in HDFS 2.6 (see
HDFS-6570).
+ // It is less expensive (than listStatus which was being used
before) and hides the
+ // complicated access control logic underneath.
+ fs.access(wsPath, FsAction.READ);
} catch (final UnsupportedOperationException e) {
- logger.trace("The filesystem for this workspace does not support
this operation.", e);
+ logger.debug("The filesystem for this workspace does not support
access operation.", e);
+ tryListStatus = true;
} catch (final FileNotFoundException | AccessControlException e) {
- return false;
+ logger.debug("file {} not found or cannot be accessed",
wsPath.toString(), e);
+ tryListStatus = true;
--- End diff --
It can be trusted fine for regular file systems HDFS, MapR FS, Mac OS
Extended File System etc. I am not sure if it works for windows or not. We
were never able to figure out whether it is a problem with the test setup or it
is actually not supported.
Updated the diffs with your review comments taken care of. Please take a
look.
> Use new HDFS API access instead of listStatus to check if users have
> permissions to access workspace.
> -----------------------------------------------------------------------------------------------------
>
> Key: DRILL-4990
> URL: https://issues.apache.org/jira/browse/DRILL-4990
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.8.0
> Reporter: Padma Penumarthy
> Assignee: Padma Penumarthy
>
> For every query, we build the schema tree
> (runSQL->getPlan->getNewDefaultSchema->getRootSchema). All workspaces in all
> storage plugins are checked and are added to the schema tree if they are
> accessible by the user who initiated the query. For file system plugin,
> listStatus API is used to check if the workspace is accessible or not
> (WorkspaceSchemaFactory.accessible) by the user. The idea seem to be if the
> user does not have access to file(s) in the workspace, listStatus will
> generate an exception and we return false. But, listStatus (which lists all
> the entries of a directory) is an expensive operation when there are large
> number of files in the directory. A new API is added in Hadoop 2.6 called
> access (HDFS-6570) which provides the ability to check if the user has
> permissions on a file/directory. Use this new API instead of listStatus.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)