Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11797 )
Change subject: [sentry] Integrate AuthzProvider into CatalogManager ...................................................................... Patch Set 3: (33 comments) http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG@12 PS3, Line 12: as such as http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG@17 PS3, Line 17: More authorization endpoints, e.g ListTables : RPC, will be addressed in a follow up patch. Haven't reviewed this patch yet, but it's not obvious from the commit message why this patch covers the set of endpoints that it does and not the others. My guess is that you wanted to cover the "write" endpoints first and will do the read-only ones later. If that's right, please update the commit message to clarify. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/hms_itest-base.h File src/kudu/integration-tests/hms_itest-base.h: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/hms_itest-base.h@150 PS3, Line 150: boost::optional<const string&> user) { > warning: the parameter 'user' is copied for each invocation but only used a In places where you pass none, use /*foo=*/ comments to indicate what it is. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc File src/kudu/integration-tests/master_sentry-itest.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@147 PS3, Line 147: std::unique_ptr<SentryClient> sentry_client_; Nit: drop std:: prefix. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@152 PS3, Line 152: #if defined(__linux__) : { ExternalMiniCluster::BindMode::UNIQUE_LOOPBACK, }, : #elif defined(__APPLE__) : // On macOS, it's not possible to have unique loopback interfaces. : // Thus, binds to loopback ip address "127.0.0.1". : { ExternalMiniCluster::BindMode::LOOPBACK, }, : #endif I don't understand why this is here; why can't you just kDefaultBindMode? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@169 PS3, Line 169: ASSERT_TRUE(s.IsNotAuthorized()); Nit: indentation http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@183 PS3, Line 183: ASSERT_EVENTUALLY([&] { : ASSERT_OK(client_->DeleteTable(Substitute("$0.$1", kDatabaseName, "a"))); : }); Why does this need to be wrapped in ASSERT_EVENTUALLY? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@243 PS3, Line 243: // ASSERT_EVENTUALLY([&] { : // ASSERT_OK(table_alterer->Alter()); : // }); Why will this need to be wrapped? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.h File src/kudu/master/authz_provider.h: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.h@44 PS3, Line 44: virtual Status AuthorizeCreateTable(const std::string& user, Curious what motivated the reordering changes here? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc File src/kudu/master/authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@32 PS3, Line 32: DEFINE_string(trusted_user_acl, "impala", Hmm, why don't we default to an empty set and let users who integrate with Impala override this? Seems safer than this default. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@51 PS3, Line 51: vector<string> acls = strings::Split(FLAGS_trusted_user_acl, ",", strings::SkipEmpty()); Surprised you don't need a ScopedLeakCheckDisabler for g_trusted_users. Search around in the codebase; maybe you'll understand better than I when it's needed and when it isn't. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@52 PS3, Line 52: g_trusted_users = new unordered_set<string>(acls.begin(), acls.end()); Nit: indentation. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@538 PS3, Line 538: getting Nit: retrieving For this and for the other functions for which 'rpc' is optional, please update the various call-sites where you've passed nullptr with /*rpc=*/ comments. Below too. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@722 PS3, Line 722: // Delete the specified table in the catalog. Doc the effect that 'user' has. Below too. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@842 PS3, Line 842: // Looks up the table, locks it with the provided lock mode. And then authorize : // if the user can operate on the table. Note that authorization is skipped if : // user is not provided. How about "Looks up the table, locks it with the provided lock mode, and, if 'user' is provided, checks that the user is authorized to operate on the table." http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@851 PS3, Line 851: // 3. A privilege escalation could happen that due to the race condition that Bob : // drops a table that he is not allowed to. How about "Due to the race condition, there could be a privilege escalation where Bob drops a table that he is not authorized to drop." http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@859 PS3, Line 859: Status FindLockAndAuthorizeTable(const ReqClass& request, As a convention, we prefer to list IN parameters first and OUT parameters last. In this case, that means authz_func and user should precede table_info and table_lock. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@700 PS3, Line 700: // Always enables AuthzProvider. : authz_provider_.reset(new master::DefaultAuthzProvider()); You can do this in the initializer list. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@753 PS3, Line 753: hms::HmsCatalog::IsEnabled() Isn't this already guaranteed by L726? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@754 PS3, Line 754: authz_provider_.reset(new master::SentryAuthzProvider()); What's the advantage of initializing authz_provider_ with DefaultAuthzProvider in the CatalogManager constructor vs. doing it here, when !SentryAuthzProvider::IsEnabled()? BTW, are we guaranteed that RPCs cannot arrive to the master before this is initialized? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1243 PS3, Line 1243: authz_provider_->Stop(); As a general rule, deinitialize in the opposite order of initialization. So this should be deinitialized before the HMS stuff, I think. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1405 PS3, Line 1405: Skip authorization validation if a : // remote call context is not provided, which implies there is no remote : // calls. I don't even think this is necessary; it's obvious from inspection that if 'rpc' isn't provided, we'll skip authorization. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1628 PS3, Line 1628: string owner = req.has_owner() ? req.owner() : (rpc ? rpc->remote_user().username() : ""); What was the motivation for allowing 'rpc' to be nullptr? For ease of testing? Seems like the previous invariant was that nullptr was allowed if HMS wasn't configured, which meant tests that didn't care about the HMS could be lazy. Why didn't that work? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1798 PS3, Line 1798: return error(); Do we need to authorize in this call? What does 'table_name' describe? What does authorizing against it mean? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1818 PS3, Line 1818: table_name = lock.data().name(); I presume here we reset the name in case the client provided a table identifier with an ID and without a name? If so, add a comment. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1824 PS3, Line 1824: return error(); Don't need to authorize in this call to error(). http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1920 PS3, Line 1920: return SetupError(authz_provider_->AuthorizeGetTableMetadata(username, table_name), Is DeleteTable() called if a user drops a table in Beeline shell (i.e. through an HMS notification log event)? Do we need to authorize more than GetTableMetadata in that case? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@2296 PS3, Line 2296: to avoid the new table : // name is checked against during authorization. This last part isn't clear. As best I can tell, you've reordered the change name (includes HMS) and change schema/partitioning (excludes HMS) steps because if you didn't, schema/partitioning would authorize using the new table name. It doesn't get the name from the client request (that gets cleared out), but rather from the TableInfo maintained by the catalog manager. Is this important from a correctness standpoint? Or is this an implementation shortcut? It isn't clear. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@2380 PS3, Line 2380: return SetupError(authz_provider_->AuthorizeAlterTable(username, table_name, new_table), Didn't we already do this in AlterTableRpc? Shouldn't we auth using GetTableMetadata here? I'm comparing this to the DeleteTable family of calls and noticed this difference. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto File src/kudu/master/master.proto: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto@85 PS3, Line 85: // The caller is not authorized to perform the attempted operation. In your mind, how is this different from Status::NotAuthorized? If I issue a master DDL (or in general), when should I expect this and when should I expect Status::NotAuthorized? http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/sentry_authz_provider-test-base.h File src/kudu/master/sentry_authz_provider-test-base.h: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/sentry_authz_provider-test-base.h@43 PS3, Line 43: Status CreateRoleAndAddToGroups(const std::unique_ptr<sentry::SentryClient>& sentry_client, I think Dan gave feedback to convert these sorts of argument passes into unwrapped ones. I.e. turn const unique_ptr<Foo>& into either Foo* (if we're calling a non-const Foo method) or const Foo& (if calling a const Foo method). http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/mini_sentry.cc File src/kudu/sentry/mini_sentry.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/mini_sentry.cc@268 PS3, Line 268: <name>sentry.db.policy.store.owner.as.privilege</name> Add a comment above explaining what this does and why we need it. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/sentry_policy_service.thrift File src/kudu/sentry/sentry_policy_service.thrift: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/sentry_policy_service.thrift@26 PS3, Line 26: # - Rename enum TSentryGrantOption.TRUE and TSentryGrantOption.FALSE to avoid : # conflict with the macro definition in the system header. This rename is safe because only the value of the enum is sent on the wire, right? -- To view, visit http://gerrit.cloudera.org:8080/11797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab4aa027ae6eb4520db48ce348db552c9feec2a8 Gerrit-Change-Number: 11797 Gerrit-PatchSet: 3 Gerrit-Owner: Hao Hao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 11 Dec 2018 06:45:44 +0000 Gerrit-HasComments: Yes
