Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11797 )

Change subject: [sentry] Integrate AuthzProvider into CatalogManager
......................................................................


Patch Set 3:

(33 comments)

http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG@12
PS3, Line 12: as
such as


http://gerrit.cloudera.org:8080/#/c/11797/3//COMMIT_MSG@17
PS3, Line 17: More authorization endpoints, e.g ListTables
            : RPC, will be addressed in a follow up patch.
Haven't reviewed this patch yet, but it's not obvious from the commit message 
why this patch covers the set of endpoints that it does and not the others. My 
guess is that you wanted to cover the "write" endpoints first and will do the 
read-only ones later. If that's right, please update the commit message to 
clarify.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/hms_itest-base.h
File src/kudu/integration-tests/hms_itest-base.h:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/hms_itest-base.h@150
PS3, Line 150:                   boost::optional<const string&> user) {
> warning: the parameter 'user' is copied for each invocation but only used a
In places where you pass none, use /*foo=*/ comments to indicate what it is.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc
File src/kudu/integration-tests/master_sentry-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@147
PS3, Line 147:   std::unique_ptr<SentryClient> sentry_client_;
Nit: drop std:: prefix.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@152
PS3, Line 152: #if defined(__linux__)
             :       { ExternalMiniCluster::BindMode::UNIQUE_LOOPBACK, },
             : #elif defined(__APPLE__)
             : // On macOS, it's not possible to have unique loopback 
interfaces.
             : // Thus, binds to loopback ip address "127.0.0.1".
             :       { ExternalMiniCluster::BindMode::LOOPBACK, },
             : #endif
I don't understand why this is here; why can't you just kDefaultBindMode?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@169
PS3, Line 169:  ASSERT_TRUE(s.IsNotAuthorized());
Nit: indentation


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@183
PS3, Line 183:   ASSERT_EVENTUALLY([&] {
             :     ASSERT_OK(client_->DeleteTable(Substitute("$0.$1", 
kDatabaseName, "a")));
             :   });
Why does this need to be wrapped in ASSERT_EVENTUALLY?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/integration-tests/master_sentry-itest.cc@243
PS3, Line 243:   // ASSERT_EVENTUALLY([&] {
             :   //   ASSERT_OK(table_alterer->Alter());
             :   // });
Why will this need to be wrapped?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.h
File src/kudu/master/authz_provider.h:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.h@44
PS3, Line 44:   virtual Status AuthorizeCreateTable(const std::string& user,
Curious what motivated the reordering changes here?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc
File src/kudu/master/authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@32
PS3, Line 32: DEFINE_string(trusted_user_acl, "impala",
Hmm, why don't we default to an empty set and let users who integrate with 
Impala override this? Seems safer than this default.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@51
PS3, Line 51:     vector<string> acls = strings::Split(FLAGS_trusted_user_acl, 
",", strings::SkipEmpty());
Surprised you don't need a ScopedLeakCheckDisabler for g_trusted_users. Search 
around in the codebase; maybe you'll understand better than I when it's needed 
and when it isn't.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/authz_provider.cc@52
PS3, Line 52:       g_trusted_users = new unordered_set<string>(acls.begin(), 
acls.end());
Nit: indentation.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h
File src/kudu/master/catalog_manager.h:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@538
PS3, Line 538: getting
Nit: retrieving

For this and for the other functions for which 'rpc' is optional, please update 
the various call-sites where you've passed nullptr with /*rpc=*/ comments.

Below too.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@722
PS3, Line 722:   // Delete the specified table in the catalog.
Doc the effect that 'user' has. Below too.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@842
PS3, Line 842:   // Looks up the table, locks it with the provided lock mode. 
And then authorize
             :   // if the user can operate on the table. Note that 
authorization is skipped if
             :   // user is not provided.
How about "Looks up the table, locks it with the provided lock mode, and, if 
'user' is provided, checks that the user is authorized to operate on the table."


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@851
PS3, Line 851:   //  3. A privilege escalation could happen that due to the 
race condition that Bob
             :   //     drops a table that he is not allowed to.
How about "Due to the race condition, there could be a privilege escalation 
where Bob drops a table that he is not authorized to drop."


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.h@859
PS3, Line 859:   Status FindLockAndAuthorizeTable(const ReqClass& request,
As a convention, we prefer to list IN parameters first and OUT parameters last. 
In this case, that means authz_func and user should precede table_info and 
table_lock.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@700
PS3, Line 700:   // Always enables AuthzProvider.
             :   authz_provider_.reset(new master::DefaultAuthzProvider());
You can do this in the initializer list.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@753
PS3, Line 753: hms::HmsCatalog::IsEnabled()
Isn't this already guaranteed by L726?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@754
PS3, Line 754:       authz_provider_.reset(new master::SentryAuthzProvider());
What's the advantage of initializing authz_provider_ with DefaultAuthzProvider 
in the CatalogManager constructor vs. doing it here, when 
!SentryAuthzProvider::IsEnabled()?

BTW, are we guaranteed that RPCs cannot arrive to the master before this is 
initialized?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1243
PS3, Line 1243:   authz_provider_->Stop();
As a general rule, deinitialize in the opposite order of initialization. So 
this should be deinitialized before the HMS stuff, I think.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1405
PS3, Line 1405: Skip authorization validation if a
              :   // remote call context is not provided, which implies there 
is no remote
              :   // calls.
I don't even think this is necessary; it's obvious from inspection that if 
'rpc' isn't provided, we'll skip authorization.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1628
PS3, Line 1628:     string owner = req.has_owner() ? req.owner() : (rpc ? 
rpc->remote_user().username() : "");
What was the motivation for allowing 'rpc' to be nullptr? For ease of testing? 
Seems like the previous invariant was that nullptr was allowed if HMS wasn't 
configured, which meant tests that didn't care about the HMS could be lazy. Why 
didn't that work?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1798
PS3, Line 1798:         return error();
Do we need to authorize in this call? What does 'table_name' describe? What 
does authorizing against it mean?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1818
PS3, Line 1818:   table_name = lock.data().name();
I presume here we reset the name in case the client provided a table identifier 
with an ID and without a name? If so, add a comment.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1824
PS3, Line 1824:     return error();
Don't need to authorize in this call to error().


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1920
PS3, Line 1920:     return 
SetupError(authz_provider_->AuthorizeGetTableMetadata(username, table_name),
Is DeleteTable() called if a user drops a table in Beeline shell (i.e. through 
an HMS notification log event)? Do we need to authorize more than 
GetTableMetadata in that case?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@2296
PS3, Line 2296: to avoid the new table
              :     // name is checked against during authorization.
This last part isn't clear. As best I can tell, you've reordered the change 
name (includes HMS) and change schema/partitioning (excludes HMS) steps because 
if you didn't, schema/partitioning would authorize using the new table name. It 
doesn't get the name from the client request (that gets cleared out), but 
rather from the TableInfo maintained by the catalog manager.

Is this important from a correctness standpoint? Or is this an implementation 
shortcut? It isn't clear.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@2380
PS3, Line 2380:     return 
SetupError(authz_provider_->AuthorizeAlterTable(username, table_name, 
new_table),
Didn't we already do this in AlterTableRpc? Shouldn't we auth using 
GetTableMetadata here? I'm comparing this to the DeleteTable family of calls 
and noticed this difference.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto
File src/kudu/master/master.proto:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/master.proto@85
PS3, Line 85:     // The caller is not authorized to perform the attempted 
operation.
In your mind, how is this different from Status::NotAuthorized? If I issue a 
master DDL (or in general), when should I expect this and when should I expect 
Status::NotAuthorized?


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/sentry_authz_provider-test-base.h
File src/kudu/master/sentry_authz_provider-test-base.h:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/sentry_authz_provider-test-base.h@43
PS3, Line 43:   Status CreateRoleAndAddToGroups(const 
std::unique_ptr<sentry::SentryClient>& sentry_client,
I think Dan gave feedback to convert these sorts of argument passes into 
unwrapped ones. I.e. turn const unique_ptr<Foo>& into either Foo* (if we're 
calling a non-const Foo method) or const Foo& (if calling a const Foo method).


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/mini_sentry.cc
File src/kudu/sentry/mini_sentry.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/mini_sentry.cc@268
PS3, Line 268:     <name>sentry.db.policy.store.owner.as.privilege</name>
Add a comment above explaining what this does and why we need it.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/sentry_policy_service.thrift
File src/kudu/sentry/sentry_policy_service.thrift:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/sentry/sentry_policy_service.thrift@26
PS3, Line 26: #   - Rename enum TSentryGrantOption.TRUE and 
TSentryGrantOption.FALSE to avoid
            : #     conflict with the macro definition in the system header.
This rename is safe because only the value of the enum is sent on the wire, 
right?



--
To view, visit http://gerrit.cloudera.org:8080/11797
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab4aa027ae6eb4520db48ce348db552c9feec2a8
Gerrit-Change-Number: 11797
Gerrit-PatchSet: 3
Gerrit-Owner: Hao Hao <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 11 Dec 2018 06:45:44 +0000
Gerrit-HasComments: Yes

Reply via email to