Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/11797 )
Change subject: [sentry] Integrate AuthzProvider into CatalogManager ...................................................................... Patch Set 4: (35 comments) http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG@12 PS3, Line 12: su > such as Done http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG@17 PS3, Line 17: : > Haven't reviewed this patch yet, but it's not obvious from the commit messa I added all endpoints in this patch. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/hms_itest-base.h File src/kudu/integration-tests/hms_itest-base.h: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/hms_itest-base.h@150 PS3, Line 150: boost::optional<const string&> user) { > In places where you pass none, use /*foo=*/ comments to indicate what it is Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc File src/kudu/integration-tests/master_sentry-itest.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@147 PS3, Line 147: return CreateKuduTable(hms_database.ToString(), hms_table.ToString()); > Nit: drop std:: prefix. Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@152 PS3, Line 152: } : : Status AlterTable(const string& table, const string& /*new_table*/) { : unique_ptr<KuduTableAlterer> table_alterer; : table_alterer.reset(client_->NewTableAlterer(table) : ->DropColumn("int32_val")); : re > I don't understand why this is here; why can't you just kDefaultBindMode? Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@169 PS3, Line 169: KuduSchema schema; > Nit: indentation Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@183 PS3, Line 183: } : : Sta > Why does this need to be wrapped in ASSERT_EVENTUALLY? My initial understanding is Sentry plugin of HMS is populating ownership privilege asynchronously. But after doing more research, I realize Sentry plugin blocks the HMS operations until the ownership is reflected in the Sentry server. Therefore, this is actually not needed. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@243 PS3, Line 243: ASSERT_OK(CreateRoleAndAddToGroups(sentry_client_.get(), kDevRole, kUserGroup)); : ASSERT_OK(CreateRoleAndAddToGroups(sentry_client_.get(), kAdminRole, kAdminGroup)); : > Why will this need to be wrapped? Unlike table creation, this needs to be wrapped. Because Kudu uses transactional listeners of the HMS to get notification of table alteration/deletion events, while Sentry uses post event listener (which is executed outside the HMS transaction). There is a chance that Kudu already finish the table renaming but the privilege renaming hasn't been reflected in the Sentry service. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.h File src/kudu/master/authz_provider.h: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.h@44 PS3, Line 44: virtual Status AuthorizeCreateTable(const std::string& table_name, > Curious what motivated the reordering changes here? Initially, the motivation was to generalize the parameter when using template functor in 'catalog_manager.cc'. After more thoughts, I think this is not actually necessary. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc File src/kudu/master/authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@32 PS3, Line 32: > Hmm, why don't we default to an empty set and let users who integrate with Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@51 PS3, Line 51: std::call_once(once, [] { > Surprised you don't need a ScopedLeakCheckDisabler for g_trusted_users. Sea Hmm, I am no sure why LSAN didn't flag it. But I add ScopedLeakCheckDisabler just in case. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@52 PS3, Line 52: debug::ScopedLeakCheckDisabler d; > Nit: indentation. Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@538 PS3, Line 538: retriev > Nit: retrieving Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@722 PS3, Line 722: FRIEND_TEST(kudu::client::ServiceUnavailableRetryClientTest, CreateTable); > Doc the effect that 'user' has. Below too. Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@842 PS3, Line 842: const PartitionPB& partition); : : // Builds the TabletLoca > How about "Looks up the table, locks it with the provided lock mode, and, i Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@851 PS3, Line 851: : // Looks up the table, locks it with the provid > How about "Due to the race condition, there could be a privilege escalation Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@859 PS3, Line 859: // 2. Alice drops table B and renames table B to A, and concurrently Bob drops B. > As a convention, we prefer to list IN parameters first and OUT parameters l Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@700 PS3, Line 700: } : > You can do this in the initializer list. Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@754 PS3, Line 754: authz_provider_.reset(new master::SentryAuthzProvider()); > What's the advantage of initializing authz_provider_ with DefaultAuthzProvi The only advantage is to ensure authz_provider_ is always initialized even when Init() is not called. However, since authz_provider_ is always used in RPC calls which guarantees the Init() is called using CheckOnline(), so I think I will move the initialization to Init(). http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1243 PS3, Line 1243: hms_notification_log_listener_->Shutdown(); > As a general rule, deinitialize in the opposite order of initialization. So Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1405 PS3, Line 1405: : : // a. Val > I don't even think this is necessary; it's obvious from inspection that if Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1628 PS3, Line 1628: Status s = hms_catalog_->CreateTable(table->id(), normalized_table_name, owner, schema); > What was the motivation for allowing 'rpc' to be nullptr? For ease of testi That works. After more thoughts, I don't see any reasons to allow 'rpc' to be nullptr. Adding a CHECK here. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1710 PS3, Line 1710: return SetupError(authz_provider_->AuthorizeGetTableMetadata(table_name, username), : resp, MasterErrorPB::NOT_AUTHORIZED); > Hrm, WDYT about baking IsTrustedUser into these AuthzProvider::Authorize* c Good point, although this means each new implementation of AuthzProvider needs to take the flag into account. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1771 PS3, Line 1771: ize = [&] { > I'm a little surprised we're using the authz function to authorize locking We are not using authz function to authorize locking the table, instead we only do authorization after taking the lock for the reasons documented in the 'catalog_manager.h'. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1778 PS3, Line 1778: { > nit: NOT_AUTHORIZED Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1798 PS3, Line 1798: table = FindPtrOrNull(table_ids_map_, table_identifier.table_id()); > Do we need to authorize in this call? What does 'table_name' describe? What Yes, we only return TABLE_NOT_FOUND if the operation on the table is authorized. 'table_name' is the name of the table either identified by the table id mapping or the table name stated in the rpc. Authorizing against it means to check if the user has the right privilege to operate on the table. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1818 PS3, Line 1818: return error(); > I presume here we reset the name in case the client provided a table identi Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1824 PS3, Line 1824: // Validate if the operation on the table is authorized. Reset the table > Don't need to authorize in this call to error(). Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1920 PS3, Line 1920: Status CatalogManager::DeleteTable(const DeleteTableRequestPB& req, > Is DeleteTable() called if a user drops a table in Beeline shell (i.e. thro Right, it should be AuthorizeDropTable as DeleteTableRpc. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@2296 PS3, Line 2296: : // The HMS allows renaming a table to the same n > This last part isn't clear. As best I can tell, you've reordered the change Apologize for the bad phrasing. It is important for both correctness and implementation standpoints. For correctness, the original table name should be used for authorization. It is odd that the user gets error message when they are not authorized to alter the table which they intent to rename the table to. For implementation, it is not possible to use the new table name for authorization. Because Kudu uses the transactional listener of the HMS to get notification of table alteration events, while Sentry uses post event listener (which is executed outside the HMS transaction). There is a chance that Kudu already finish the table renaming but the privilege renaming hasn't been reflected in the Sentry service. However, I realize that a more reasonable alternative is to skip authorization here in the first place, mainly because we already did authorization after acquiring the lock of the table at L2279. Moreover, even though a table renaming could happen before the changes of schema/partitioning taking place (since now table rename and table schema update are not transactional), it is ideal from the users' point of view, to not authorize against the new table name arose from other RPCs. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@2380 PS3, Line 2380: case AlterTableRequestPB::DROP_COLUMN: > Didn't we already do this in AlterTableRpc? Shouldn't we auth using GetTabl We should use alter table authorization rules here to guard any table alterations. And even though we already do this in AlterTableRpc, there isn't any double authorization when HMS integration is enabled. Because we skip the authorization in L2334. The same for DeleteTable family. Sorry, it was my mistake to use GetTableMetadata. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto File src/kudu/master/master.proto: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto@85 PS3, Line 85: // The caller is not authorized to perform the attempted operation. > It's worth noting what we do for scanner-hijacking (see SetupErrorAndRespon As you mentioned, I don't see a reason Status::NotAuthorized should be reflected as a RPC error. Thus, I prefer to keep it as the way it is. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto@85 PS3, Line 85: // The caller is not authorized to perform the attempted operation. > In your mind, how is this different from Status::NotAuthorized? If I issue As Andrew explained, Status::NotAuthorized is transformed to AppStatusPB. The MasterErrorPB is used to indicate that there is a NOT_AUTHORIZED error in the application layer, so the client unwraps the response and gets the original Status. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/sentry_authz_provider-test-base.h File src/kudu/master/sentry_authz_provider-test-base.h: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/sentry_authz_provider-test-base.h@43 PS3, Line 43: Status CreateRoleAndAddToGroups(sentry::SentryClient* sentry_client, > I think Dan gave feedback to convert these sorts of argument passes into un Done http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/mini_sentry.cc File src/kudu/sentry/mini_sentry.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/mini_sentry.cc@268 PS3, Line 268: <value>$6</value> > Add a comment above explaining what this does and why we need it. Done -- To view, visit http://gerrit.cloudera.org:8080/11797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab4aa027ae6eb4520db48ce348db552c9feec2a8 Gerrit-Change-Number: 11797 Gerrit-PatchSet: 4 Gerrit-Owner: Hao Hao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Mon, 04 Feb 2019 20:43:18 +0000 Gerrit-HasComments: Yes
