Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11797 )
Change subject: [sentry] Integrate AuthzProvider into CatalogManager ...................................................................... Patch Set 7: (30 comments) http://gerrit.cloudera.org:8080/#/c/11797/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11797/7//COMMIT_MSG@20 PS7, Line 20: exposedp exposed http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/client/client-test.cc@293 PS7, Line 293: /* rpc = */ Nit: convention is more like this: /*rpc=*/nullptr See https://gerrit.cloudera.org/c/12415/3/src/kudu/common/schema.cc#196 for more context. Applies elsewhere in your patch too. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/CMakeLists.txt File src/kudu/integration-tests/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/CMakeLists.txt@91 PS7, Line 91: ADD_KUDU_TEST(master_sentry-itest RUN_SERIAL true PROCESSORS 4) : ADD_KUDU_TEST(master_sentry_advance-itest RUN_SERIAL true PROCESSORS 4) Do these need to be sharded, given the parameterized tests in each and the time it takes to start a cluster with Sentry and HMS? How long does each take to run, in isolation? http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/hms_itest-base.h File src/kudu/integration-tests/hms_itest-base.h: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/hms_itest-base.h@52 PS7, Line 52: using client::KuduColumnSchema; : using client::KuduSchema; : using client::KuduSchemaBuilder; : using client::KuduTable; : using client::KuduTableCreator; : using client::sp::shared_ptr; : using hms::HmsClient; : using std::string; : using std::unique_ptr; : using strings::Substitute; > I'm not sure that 'using ...' in a header file is a good idea. Agreed with Alexey; I don't think this is a pattern we should promote. Almost all the current usages of this are from old code. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry-itest.cc File src/kudu/integration-tests/master_sentry-itest.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry-itest.cc@31 PS7, Line 31: // IWYU pragma: no_include "kudu/integration-tests/hms_itest-base.h" Again, what happened here? http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry-itest.cc@63 PS7, Line 63: ASSERT_EQ(set<string>({ Substitute("$0.$1", kDatabaseName, kTableName), Nit: unordered_set is more representative of the semantics you care about (i.e. deduplication but unsorted). http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry-itest.cc@98 PS7, Line 98: // Authorize create table. These comments don't add any useful information. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc File src/kudu/integration-tests/master_sentry_advance-itest.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@31 PS7, Line 31: // IWYU pragma: no_include "kudu/integration-tests/hms_itest-base.h" What happened here? Why shouldn't we include this? http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@37 PS7, Line 37: advanced > What is "advanced" referring to? Also not really seeing the distinction; why can't these tests just live in master_sentry-itest? http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@91 PS7, Line 91: altering But we're not necessarily altering, right? Below too. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@125 PS7, Line 125: // Authz funcs for delete table . Nit: misplaced http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@131 PS7, Line 131: // Authz funcs for alter table. On second thought, these comments aren't adding anything useful. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/registration-test.cc File src/kudu/integration-tests/registration-test.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/registration-test.cc@197 PS7, Line 197: s = catalog->GetTabletLocations(tablet_id, master::VOTER_REPLICA, &loc, nullptr); need /*rpc=*/ here http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h File src/kudu/integration-tests/sentry_itest-base.h: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@75 PS7, Line 75: class SentryTestBase : public HmsTestBase, : public master::SentryAuthzProviderTestBase { Let's not do multiple inheritance here, especially since SentryAuthzProviderTestBase looks like it doesn't need to exist (it's not a true test fixture, just an amalgam of free functions bound into a class). http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@191 PS7, Line 191: Status GetTabletLocations(const string& table, const string& /*new_table*/) { You could implement this in a different way, using the cluster's MiniClusterFsInspector to list all tablet IDs from disk. Not sure if it'll actually be simpler, but perhaps worth looking at. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@194 PS7, Line 194: RETURN_NOT_OK(cluster_->CreateClient(nullptr, &client_)); Could you create this second client instance without destroying client_, so it needn't be recreated below? Or is the issue that you can only kinit as one user at a time, so you have to explicitly switch like this? http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@223 PS7, Line 223: // Always enable Kerberos, as in non-Kerberized environment the logged : // in user is not service users that allowed to connect to the Sentry : // service. Can you rewrite this? The second half (beginning with "the logged in user...") isn't clear. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@268 PS7, Line 268: // Log in as the 'test-user' and reset the client to pick up the change in user. : ASSERT_OK(cluster_->kdc()->Kinit(kTestUser)); : ASSERT_OK(cluster_->CreateClient(nullptr, &client_)); See above comment about switching users. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@273 PS7, Line 273: void TearDown() override { Reverse order of SetUp, so you should stop the Sentry client first, then the HMS client. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1798 PS3, Line 1798: table = FindPtrOrNull(table_ids_map_, table_identifier.table_id()); > Yes, we only return TABLE_NOT_FOUND if the operation on the table is author Right, but I was specifically asking about this particular case, where the provided table ID and name don't match up to the same table. In such a case, the table name used in authorization comes from the RPC, but it might refer to the "wrong" table (if we're to trust the RPC's table ID more). http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@1794 PS7, Line 1794: shared_lock<LockType> l(lock_); Let's make sure this is dropped before any authorization calls. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@1828 PS7, Line 1828: RETURN_NOT_OK(authorize()); It's unfortunate that we have to hold TableMetadataLock during the authorization, which may be an RPC to Sentry. I believe this necessary for correctness; can you confirm and/or update the comment? http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@2729 PS7, Line 2729: shared_lock<LockType> l(lock_); Can we avoid holding this during the authorization? Seems like we could do this in two passes. The first pass iterates over normalized_table_names_map_ and grabs refs on the entries with lock_ held. The second iterates on the refs, acquires TableMetadataLocks, filters by name, and authorizes. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@2744 PS7, Line 2744: Status s = authz_provider_->AuthorizeGetTableMetadata(NormalizeTableName(table_name), : *user); This is going to be really expensive when there are a lot of tables. Is there a mechanism with which to do a bulk authorize on a resource set? http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@4640 PS7, Line 4640: shared_lock<LockType> l(lock_); Can we drop this before authorizing? http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@4642 PS7, Line 4642: // is enabled. Because tablet id is a random generated string which doesn't randomly http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@4643 PS7, Line 4643: On the other hand, we can always return : // NOT_AUTHORIZED error, which can be too restricted. Not really sure what this second sentence is offering. Is it suggesting an alternative approach which is also acceptable? If so, please clarify the comment. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/master-test-util.h File src/kudu/master/master-test-util.h: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/master-test-util.h@56 PS7, Line 56: nullptr /*rpc=*/ http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/sentry_authz_provider.cc@200 PS7, Line 200: if (AuthzProvider::IsTrustedUser(user)) { : return Status::OK(); : } You could push all of these down into Authorize(). It means a little bit more work for trusted callers, but it seems like it's all building in-memory structures, so maybe worth the code simplification. Doing so also means we're less likely to add a new AuthorizeFoo call in the future and forget to call IsTrustedUser() in it. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/sentry/mini_sentry.cc File src/kudu/sentry/mini_sentry.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/sentry/mini_sentry.cc@210 PS7, Line 210: // derive OWNER/ALL privileges from object's ownership. 'all' indicates an Nit: derives -- To view, visit http://gerrit.cloudera.org:8080/11797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab4aa027ae6eb4520db48ce348db552c9feec2a8 Gerrit-Change-Number: 11797 Gerrit-PatchSet: 7 Gerrit-Owner: Hao Hao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Wed, 13 Feb 2019 08:45:24 +0000 Gerrit-HasComments: Yes
