Fredy Wijaya has posted comments on this change. ( http://gerrit.cloudera.org:8080/12632 )
Change subject: IMPALA-8100: Add initial support for Ranger ...................................................................... Patch Set 11: (25 comments) http://gerrit.cloudera.org:8080/#/c/12632/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12632/9//COMMIT_MSG@30 PS9, Line 30: - Ran all FE tests : - Ran all E2E authorization tests > did you run these with ranger configured? I'm surprised they would pass sin For FE, it runs with both Ranger and Sentry via JUnit parameterized test. I ran E2E authorization tests to make sure no regression in the Sentry implementation. http://gerrit.cloudera.org:8080/#/c/12632/9/be/src/service/frontend.cc File be/src/service/frontend.cc: http://gerrit.cloudera.org:8080/#/c/12632/9/be/src/service/frontend.cc@41 PS9, Line 41: "org.apache.impala.authorization.sentry.SentryAuthorizationFactory", > I wonder if we should switch this to be a more human-friendly flag instead Phil brought up the same thing. I have a plan to make it more human-friendly in a different JIRA: https://issues.apache.org/jira/browse/IMPALA-8309 http://gerrit.cloudera.org:8080/#/c/12632/9/be/src/service/frontend.cc@46 PS9, Line 46: " uniquely identify the application that communicates with Ranger."); > Worth noting whether this is required, considering it has an empty default Done http://gerrit.cloudera.org:8080/#/c/12632/9/be/src/service/frontend.cc@47 PS9, Line 47: DEFINE_string(server_name, "", "(Required) The name to use for securing this impalad " > we probably can't rename the existing sentry-specific flags (for compat rea Done. server_name is also required for Ranger. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/pom.xml File fe/pom.xml: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/pom.xml@112 PS9, Line 112: <exclusions> > it seems that ranger-plugins-common has a pom dependency on mysql-connector Yes it's a service model. Running mvn dependency:tree on Ranger gave me this output. I think it's pretty light after excluding some of its transitive dependencies. [INFO] +- org.apache.ranger:ranger-plugins-common:jar:1.2.0:compile [INFO] | \- org.codehaus.jackson:jackson-jaxrs:jar:1.9.13:compile [INFO] +- org.apache.ranger:ranger-plugins-audit:jar:1.2.0:compile http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java File fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java@126 PS9, Line 126: Privilege.ANY); > I'm not 100% sure about this part of the change. If I don't have the 'VIEW_ If I don't have the 'VIEW_METADATA' privilege but I do have permissions on one or more columns, should I still be able to use 'DESCRIBE' in all cases? Yes, however the output of DESCRIBE will vary depending on whether or not the user has VIEW_METADATA privilege (the second level check). Filtering out the DESCRIBE output is part of this JIRA: https://issues.apache.org/jira/browse/IMPALA-6479. This behavior is pretty similar to "show databases/tables". A user can always execute "show databases/tables", however only databases/tables that the user has access to (with VIEW_METADATA privilege) will be displayed. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@77 PS9, Line 77: .table(authorizable.getTableName()); : break; : case COLUMN: : builder.server(config_.getServerName()) : .database(authorizable.getDbName()); : // * in Ranger means "all". For example to check access > I'm not understanding this. When you are saying "Any column access" do you Done. I agree it was confusing. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@106 PS9, Line 106: .function(authorizable.getFnName()); > does this need to be special cased? It seems all the other privileges alrea Ah yeah you're right. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@117 PS9, Line 117: if (request.getPrivilege() == Privilege.ANY) { > does this result in multiple audit events, one per implied privilege? Yes. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@131 PS9, Line 131: > can you use Collections.emptySet? Good catch, meant to leave it as a todo, but it's easy enough that i'll just do it i this patch. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationConfig.java@38 PS9, Line 38: RangerConf > nit: Preconditions.checkNotNull(foo) returns 'foo', so you can combine thes Done http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java@55 PS9, Line 55: Preconditions.checkArgument(!Strings.isNullOrEmpty( > is this the only validation that happens on this config? If so, we should m Ah yeah, good point. Done. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpalaResourceBuilder.java File fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpalaResourceBuilder.java: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpalaResourceBuilder.java@37 PS9, Line 37: r(String s > can we add Preconditions.checkNotNull() for all parameters in this file? Is Done http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpalaResourceBuilder.java@52 PS9, Line 52: public RangerImpalaResourceBuilder column(String columnName) { > general question about ranger semantics -- are resources case sensitive? sh That'll be handled in Ranger side, more specifically the service definition: https://gerrit.cloudera.org/c/12632/9/testdata/cluster/ranger/setup/impala_servicedef.json#142 http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java@90 PS9, Line 90: private static final String RANGER_SERVICE_TYPE = "impala"; : private static final String RANGER_APP_ID = "impala"; > can these be pulled from the config instead to avoid duplication? Ranger needs service type and app ID to create a Ranger config. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/java/org/apache/impala/analysis/AuthorizationStmtTest.java@110 PS9, Line 110: System.getenv("IMPALA_HOME") + "/fe/src/test/resources/sentry-site. > this is a little sketchy. Shouldn't this resource be on the classpath and t Yes, we could. This is some remnant of the old stuff. The createHadoopGroupAuthConfig doesn't understand reading a resource from a classpath. I plan to clean up a lot of authorization tests in this JIRA: https://issues.apache.org/jira/browse/IMPALA-8248 http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java File fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java@178 PS9, Line 178: SentryAuthorizationFactory authzFactory = new SentryAuthorizationFactory( > should we be parametrizing this test to be able to run against ranger as we There's a separate JIRA to refactor all the authorization tests to basically separate Sentry and Ranger specific tests and put the tests that are common to both: https://issues.apache.org/jira/browse/IMPALA-8248. I don't want this patch to get too big. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java File fe/src/test/java/org/apache/impala/common/FrontendTestBase.java: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/java/org/apache/impala/common/FrontendTestBase.java@308 PS9, Line 308: @Override : public boolean isEnabled() { return true; } : @Override : public AuthorizationProvider getProvider() { : return AuthorizationProvider.NONE; : } > I'm not sure what this means, semantically -- authorization enabled but NON Yes, it should be DUMMY or ALWAYS_TRUE provider, but I don't think this warrants for an additional enum value :) This is pretty much a hack to make non-authorization unit tests (see AnalyzerTest and ToSqlTest) easier because some statements, such as REFRESH AUTHORIZATION can only be executed when authorization is enabled, but we also don't want to run it with Sentry or Ranger. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/resources/ranger-impala-security.xml File fe/src/test/resources/ranger-impala-security.xml: http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/resources/ranger-impala-security.xml@21 PS9, Line 21: <value>test_impala</value> > how does this interact with the gflag for "service id"? service ID is defined in service definition: https://gerrit.cloudera.org/c/12632/9/testdata/cluster/ranger/setup/impala_servicedef.json#50 http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/resources/ranger-impala-security.xml@37 PS9, Line 37: <value>/tmp</value> > can we auto-configure this based on other existing scratch dir configs, at The scratch_dirs flag takes multiple directories, do we just pick any scratch directory randomly? http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/resources/ranger-impala-security.xml@48 PS9, Line 48: Polling interval in milliseconds to poll for changes in policies. > are these strings copied from somewhere? It's weird to see a question mark Yeah I copied this config file from Ranger. I will update all the descriptions. http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/resources/ranger-impala-security.xml@56 PS9, Line 56: RangerRestClient connection timeout in milliseconds. > milliseconds Done http://gerrit.cloudera.org:8080/#/c/12632/9/fe/src/test/resources/ranger-impala-security.xml@64 PS9, Line 64: RangerRestClient read timeout in milliseconds. > milliseconds Done http://gerrit.cloudera.org:8080/#/c/12632/9/testdata/bin/create-load-data.sh File testdata/bin/create-load-data.sh: http://gerrit.cloudera.org:8080/#/c/12632/9/testdata/bin/create-load-data.sh@308 PS9, Line 308: perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \ > you don't want to die() if $ENV{$1} is not defined here? leaving it unsubst Done http://gerrit.cloudera.org:8080/#/c/12632/9/testdata/cluster/ranger/setup/impala_servicedef.json File testdata/cluster/ranger/setup/impala_servicedef.json: http://gerrit.cloudera.org:8080/#/c/12632/9/testdata/cluster/ranger/setup/impala_servicedef.json@1 PS9, Line 1: { > I'm a little curious, at a high level, how this interacts with any ranger p This is something that I don't quite know the answer whether Impala should have the same service definition as Hive. We can't reuse the existing Hive service definition because it contains specific things to Hive, such as Hive JDBC driver, different access types/privileges, etc. Unlike with Sentry, the Ranger model prefers for each service to have their own service definition. IOW, there will not be shared policies between Impala and Hive. If we need to have shared policies between Impala and Hive, we need to come up with a new service definition, but that also means the existing Hive Ranger plugin may need to be updated. Let me know what you think? I also haven't spent much time on the deployment. I believe only the service definition will need to be deployed. All other files here are purely for testing. I'll look into this more once everything is in a better shape. For now, we can consider all these files here for testing. -- To view, visit http://gerrit.cloudera.org:8080/12632 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8cad9e609d20aae1ff645c84fd58a02afee70276 Gerrit-Change-Number: 12632 Gerrit-PatchSet: 11 Gerrit-Owner: Fredy Wijaya <[email protected]> Gerrit-Reviewer: Austin Nobis <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Fredy Wijaya <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Paul Rogers <[email protected]> Gerrit-Reviewer: Philip Zeyliger <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Thu, 14 Mar 2019 23:54:28 +0000 Gerrit-HasComments: Yes
