Hao Hao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11797 )

Change subject: [sentry] Integrate AuthzProvider into CatalogManager
......................................................................


Patch Set 4:

(35 comments)

http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG@12
PS3, Line 12: su
> such as
Done


http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG@17
PS3, Line 17:
            :
> Haven't reviewed this patch yet, but it's not obvious from the commit messa
I added all endpoints in this patch.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/hms_itest-base.h
File src/kudu/integration-tests/hms_itest-base.h:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/hms_itest-base.h@150
PS3, Line 150:                   boost::optional<const string&> user) {
> In places where you pass none, use /*foo=*/ comments to indicate what it is
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc
File src/kudu/integration-tests/master_sentry-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@147
PS3, Line 147:     return CreateKuduTable(hms_database.ToString(), 
hms_table.ToString());
> Nit: drop std:: prefix.
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@152
PS3, Line 152:   }
             :
             :   Status AlterTable(const string& table, const string& 
/*new_table*/) {
             :     unique_ptr<KuduTableAlterer> table_alterer;
             :     table_alterer.reset(client_->NewTableAlterer(table)
             :                                ->DropColumn("int32_val"));
             :     re
> I don't understand why this is here; why can't you just kDefaultBindMode?
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@169
PS3, Line 169:     KuduSchema schema;
> Nit: indentation
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@183
PS3, Line 183:   }
             :
             :   Sta
> Why does this need to be wrapped in ASSERT_EVENTUALLY?
My initial understanding is Sentry plugin of HMS is populating ownership 
privilege asynchronously. But after doing more research, I realize Sentry 
plugin blocks the HMS operations until the ownership is reflected in the Sentry 
server. Therefore, this is actually not needed.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@243
PS3, Line 243:     ASSERT_OK(CreateRoleAndAddToGroups(sentry_client_.get(), 
kDevRole, kUserGroup));
             :     ASSERT_OK(CreateRoleAndAddToGroups(sentry_client_.get(), 
kAdminRole, kAdminGroup));
             :
> Why will this need to be wrapped?
Unlike table creation, this needs to be wrapped. Because Kudu uses 
transactional listeners of the HMS to get notification of table 
alteration/deletion events, while Sentry uses post event listener (which is 
executed outside the HMS transaction). There is a chance that Kudu already 
finish the table renaming but the privilege renaming hasn't been reflected in 
the Sentry service.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.h
File src/kudu/master/authz_provider.h:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.h@44
PS3, Line 44:   virtual Status AuthorizeCreateTable(const std::string& 
table_name,
> Curious what motivated the reordering changes here?
Initially, the motivation was to generalize the parameter when using template 
functor in 'catalog_manager.cc'.

After more thoughts, I think this is not actually necessary.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc
File src/kudu/master/authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@32
PS3, Line 32:
> Hmm, why don't we default to an empty set and let users who integrate with
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@51
PS3, Line 51:   std::call_once(once, [] {
> Surprised you don't need a ScopedLeakCheckDisabler for g_trusted_users. Sea
Hmm, I am no sure why LSAN didn't flag it. But I add ScopedLeakCheckDisabler 
just in case.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@52
PS3, Line 52:    debug::ScopedLeakCheckDisabler d;
> Nit: indentation.
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@538
PS3, Line 538: retriev
> Nit: retrieving
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@722
PS3, Line 722:   FRIEND_TEST(kudu::client::ServiceUnavailableRetryClientTest, 
CreateTable);
> Doc the effect that 'user' has. Below too.
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@842
PS3, Line 842:                                              const PartitionPB& 
partition);
             :
             :   // Builds the TabletLoca
> How about "Looks up the table, locks it with the provided lock mode, and, i
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@851
PS3, Line 851:
             :   // Looks up the table, locks it with the provid
> How about "Due to the race condition, there could be a privilege escalation
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@859
PS3, Line 859:   //  2. Alice drops table B and renames table B to A, and 
concurrently Bob drops B.
> As a convention, we prefer to list IN parameters first and OUT parameters l
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@700
PS3, Line 700: }
             :
> You can do this in the initializer list.
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@754
PS3, Line 754:       authz_provider_.reset(new master::SentryAuthzProvider());
> What's the advantage of initializing authz_provider_ with DefaultAuthzProvi
The only advantage is to ensure authz_provider_ is always initialized even when 
Init() is not called. However, since authz_provider_ is always used in RPC 
calls which guarantees the Init() is called using CheckOnline(), so I think I 
will move the initialization to Init().


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1243
PS3, Line 1243:     hms_notification_log_listener_->Shutdown();
> As a general rule, deinitialize in the opposite order of initialization. So
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1405
PS3, Line 1405:
              :
              :   // a. Val
> I don't even think this is necessary; it's obvious from inspection that if
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1628
PS3, Line 1628:     Status s = hms_catalog_->CreateTable(table->id(), 
normalized_table_name, owner, schema);
> What was the motivation for allowing 'rpc' to be nullptr? For ease of testi
That works. After more thoughts, I don't see any reasons to allow 'rpc' to be 
nullptr. Adding a CHECK here.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1710
PS3, Line 1710: return 
SetupError(authz_provider_->AuthorizeGetTableMetadata(table_name, username),
              :                       resp, MasterErrorPB::NOT_AUTHORIZED);
> Hrm, WDYT about baking IsTrustedUser into these AuthzProvider::Authorize* c
Good point, although this means each new implementation of AuthzProvider needs 
to take the flag into account.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1771
PS3, Line 1771: ize = [&] {
> I'm a little surprised we're using the authz function to authorize locking
We are not using authz function to authorize locking the table, instead we only 
do authorization after taking the lock for the reasons documented in the 
'catalog_manager.h'.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1778
PS3, Line 1778:  {
> nit: NOT_AUTHORIZED
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1798
PS3, Line 1798:       table = FindPtrOrNull(table_ids_map_, 
table_identifier.table_id());
> Do we need to authorize in this call? What does 'table_name' describe? What
Yes, we only return TABLE_NOT_FOUND if the operation on the table is authorized.

'table_name' is the name of the table either identified by the table id mapping 
or the table name stated in the rpc. Authorizing against it means to check if 
the user has the right privilege to operate on the table.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1818
PS3, Line 1818:     return error();
> I presume here we reset the name in case the client provided a table identi
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1824
PS3, Line 1824:   // Validate if the operation on the table is authorized. 
Reset the table
> Don't need to authorize in this call to error().
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1920
PS3, Line 1920: Status CatalogManager::DeleteTable(const DeleteTableRequestPB& 
req,
> Is DeleteTable() called if a user drops a table in Beeline shell (i.e. thro
Right, it should be AuthorizeDropTable as DeleteTableRpc.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@2296
PS3, Line 2296:
              :     // The HMS allows renaming a table to the same n
> This last part isn't clear. As best I can tell, you've reordered the change
Apologize for the bad phrasing. It is important for both correctness and 
implementation standpoints.

For correctness, the original table name should be used for authorization. It 
is odd that the user gets error message when they are not authorized to alter 
the table which they intent to rename the table to.

For implementation, it is not possible to use the new table name for 
authorization. Because Kudu uses the transactional listener of the HMS to get 
notification of table alteration events, while Sentry uses post event listener 
(which is executed outside the HMS transaction). There is a chance that Kudu 
already finish the table renaming but the privilege renaming hasn't been 
reflected in the Sentry service.

However, I realize that a more reasonable alternative is to skip authorization 
here in the first place, mainly because we already did authorization after 
acquiring the lock of the table at L2279. Moreover, even though a table 
renaming could happen before the changes of schema/partitioning taking place 
(since now table rename and table schema update are not transactional), it is 
ideal from the users' point of view, to not authorize against the new table 
name arose from other RPCs.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@2380
PS3, Line 2380:       case AlterTableRequestPB::DROP_COLUMN:
> Didn't we already do this in AlterTableRpc? Shouldn't we auth using GetTabl
We should use alter table authorization rules here to guard any table 
alterations. And even though we already do this in AlterTableRpc, there isn't 
any double authorization when HMS integration is enabled. Because we skip the 
authorization in L2334.

The same for DeleteTable family. Sorry, it was my mistake to use 
GetTableMetadata.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto@85
PS3, Line 85:     // The caller is not authorized to perform the attempted 
operation.
> It's worth noting what we do for scanner-hijacking (see SetupErrorAndRespon
As you mentioned, I don't see a reason Status::NotAuthorized should be 
reflected as a RPC error. Thus, I prefer to keep it as the way it is.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto@85
PS3, Line 85:     // The caller is not authorized to perform the attempted 
operation.
> In your mind, how is this different from Status::NotAuthorized? If I issue
As Andrew explained, Status::NotAuthorized is transformed to AppStatusPB. The 
MasterErrorPB is used to indicate that there is a NOT_AUTHORIZED error in the 
application layer, so the client unwraps the response and gets the original 
Status.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/sentry_authz_provider-test-base.h
File src/kudu/master/sentry_authz_provider-test-base.h:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/sentry_authz_provider-test-base.h@43
PS3, Line 43:   Status CreateRoleAndAddToGroups(sentry::SentryClient* 
sentry_client,
> I think Dan gave feedback to convert these sorts of argument passes into un
Done


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/mini_sentry.cc
File src/kudu/sentry/mini_sentry.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/mini_sentry.cc@268
PS3, Line 268:     <value>$6</value>
> Add a comment above explaining what this does and why we need it.
Done



--
To view, visit http://gerrit.cloudera.org:8080/11797
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab4aa027ae6eb4520db48ce348db552c9feec2a8
Gerrit-Change-Number: 11797
Gerrit-PatchSet: 4
Gerrit-Owner: Hao Hao <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 04 Feb 2019 20:43:18 +0000
Gerrit-HasComments: Yes

Reply via email to