Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11797 )

Change subject: [sentry] Integrate AuthzProvider into CatalogManager
......................................................................


Patch Set 7:

(30 comments)

http://gerrit.cloudera.org:8080/#/c/11797/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11797/7//COMMIT_MSG@20
PS7, Line 20: exposedp
exposed


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/client/client-test.cc@293
PS7, Line 293: /* rpc = */
Nit: convention is more like this:

  /*rpc=*/nullptr

See https://gerrit.cloudera.org/c/12415/3/src/kudu/common/schema.cc#196 for 
more context.

Applies elsewhere in your patch too.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/CMakeLists.txt
File src/kudu/integration-tests/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/CMakeLists.txt@91
PS7, Line 91: ADD_KUDU_TEST(master_sentry-itest RUN_SERIAL true PROCESSORS 4)
            : ADD_KUDU_TEST(master_sentry_advance-itest RUN_SERIAL true 
PROCESSORS 4)
Do these need to be sharded, given the parameterized tests in each and the time 
it takes to start a cluster with Sentry and HMS? How long does each take to 
run, in isolation?


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/hms_itest-base.h
File src/kudu/integration-tests/hms_itest-base.h:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/hms_itest-base.h@52
PS7, Line 52: using client::KuduColumnSchema;
            : using client::KuduSchema;
            : using client::KuduSchemaBuilder;
            : using client::KuduTable;
            : using client::KuduTableCreator;
            : using client::sp::shared_ptr;
            : using hms::HmsClient;
            : using std::string;
            : using std::unique_ptr;
            : using strings::Substitute;
> I'm not sure that 'using ...' in a header file is a good idea.
Agreed with Alexey; I don't think this is a pattern we should promote. Almost 
all the current usages of this are from old code.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry-itest.cc
File src/kudu/integration-tests/master_sentry-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry-itest.cc@31
PS7, Line 31: // IWYU pragma: no_include 
"kudu/integration-tests/hms_itest-base.h"
Again, what happened here?


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry-itest.cc@63
PS7, Line 63:   ASSERT_EQ(set<string>({ Substitute("$0.$1", kDatabaseName, 
kTableName),
Nit: unordered_set is more representative of the semantics you care about (i.e. 
deduplication but unsorted).


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry-itest.cc@98
PS7, Line 98:         // Authorize create table.
These comments don't add any useful information.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc
File src/kudu/integration-tests/master_sentry_advance-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@31
PS7, Line 31: // IWYU pragma: no_include 
"kudu/integration-tests/hms_itest-base.h"
What happened here? Why shouldn't we include this?


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@37
PS7, Line 37: advanced
> What is "advanced" referring to?
Also not really seeing the distinction; why can't these tests just live in 
master_sentry-itest?


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@91
PS7, Line 91: altering
But we're not necessarily altering, right?

Below too.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@125
PS7, Line 125:         // Authz funcs for delete table .
Nit: misplaced


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@131
PS7, Line 131:         // Authz funcs for alter table.
On second thought, these comments aren't adding anything useful.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/registration-test.cc
File src/kudu/integration-tests/registration-test.cc:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/registration-test.cc@197
PS7, Line 197:         s = catalog->GetTabletLocations(tablet_id, 
master::VOTER_REPLICA, &loc, nullptr);
need /*rpc=*/ here


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h
File src/kudu/integration-tests/sentry_itest-base.h:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@75
PS7, Line 75: class SentryTestBase : public HmsTestBase,
            :                        public master::SentryAuthzProviderTestBase 
{
Let's not do multiple inheritance here, especially since 
SentryAuthzProviderTestBase looks like it doesn't need to exist (it's not a 
true test fixture, just an amalgam of free functions bound into a class).


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@191
PS7, Line 191:   Status GetTabletLocations(const string& table, const string& 
/*new_table*/) {
You could implement this in a different way, using the cluster's 
MiniClusterFsInspector to list all tablet IDs from disk.

Not sure if it'll actually be simpler, but perhaps worth looking at.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@194
PS7, Line 194:     RETURN_NOT_OK(cluster_->CreateClient(nullptr, &client_));
Could you create this second client instance without destroying client_, so it 
needn't be recreated below? Or is the issue that you can only kinit as one user 
at a time, so you have to explicitly switch like this?


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@223
PS7, Line 223:     // Always enable Kerberos, as in non-Kerberized environment 
the logged
             :     // in user is not service users that allowed to connect to 
the Sentry
             :     // service.
Can you rewrite this? The second half (beginning with "the logged in user...") 
isn't clear.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@268
PS7, Line 268:     // Log in as the 'test-user' and reset the client to pick up 
the change in user.
             :     ASSERT_OK(cluster_->kdc()->Kinit(kTestUser));
             :     ASSERT_OK(cluster_->CreateClient(nullptr, &client_));
See above comment about switching users.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/sentry_itest-base.h@273
PS7, Line 273:   void TearDown() override {
Reverse order of SetUp, so you should stop the Sentry client first, then the 
HMS client.


http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1798
PS3, Line 1798:       table = FindPtrOrNull(table_ids_map_, 
table_identifier.table_id());
> Yes, we only return TABLE_NOT_FOUND if the operation on the table is author
Right, but I was specifically asking about this particular case, where the 
provided table ID and name don't match up to the same table. In such a case, 
the table name used in authorization comes from the RPC, but it might refer to 
the "wrong" table (if we're to trust the RPC's table ID more).


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@1794
PS7, Line 1794:     shared_lock<LockType> l(lock_);
Let's make sure this is dropped before any authorization calls.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@1828
PS7, Line 1828:   RETURN_NOT_OK(authorize());
It's unfortunate that we have to hold TableMetadataLock during the 
authorization, which may be an RPC to Sentry. I believe this necessary for 
correctness; can you confirm and/or update the comment?


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@2729
PS7, Line 2729:   shared_lock<LockType> l(lock_);
Can we avoid holding this during the authorization? Seems like we could do this 
in two passes. The first pass iterates over normalized_table_names_map_ and 
grabs refs on the entries with lock_ held. The second iterates on the refs, 
acquires TableMetadataLocks, filters by name, and authorizes.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@2744
PS7, Line 2744:       Status s = 
authz_provider_->AuthorizeGetTableMetadata(NormalizeTableName(table_name),
              :                                                             
*user);
This is going to be really expensive when there are a lot of tables. Is there a 
mechanism with which to do a bulk authorize on a resource set?


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@4640
PS7, Line 4640:     shared_lock<LockType> l(lock_);
Can we drop this before authorizing?


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@4642
PS7, Line 4642:     // is enabled. Because tablet id is a random generated 
string which doesn't
randomly


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@4643
PS7, Line 4643: On the other hand, we can always return
              :     // NOT_AUTHORIZED error, which can be too restricted.
Not really sure what this second sentence is offering. Is it suggesting an 
alternative approach which is also acceptable? If so, please clarify the 
comment.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/master-test-util.h
File src/kudu/master/master-test-util.h:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/master-test-util.h@56
PS7, Line 56: nullptr
/*rpc=*/


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/sentry_authz_provider.cc
File src/kudu/master/sentry_authz_provider.cc:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/sentry_authz_provider.cc@200
PS7, Line 200:   if (AuthzProvider::IsTrustedUser(user)) {
             :     return Status::OK();
             :   }
You could push all of these down into Authorize(). It means a little bit more 
work for trusted callers, but it seems like it's all building in-memory 
structures, so maybe worth the code simplification.

Doing so also means we're less likely to add a new AuthorizeFoo call in the 
future and forget to call IsTrustedUser() in it.


http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/sentry/mini_sentry.cc
File src/kudu/sentry/mini_sentry.cc:

http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/sentry/mini_sentry.cc@210
PS7, Line 210:   //    derive OWNER/ALL privileges from object's ownership. 
'all' indicates an
Nit: derives



--
To view, visit http://gerrit.cloudera.org:8080/11797
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab4aa027ae6eb4520db48ce348db552c9feec2a8
Gerrit-Change-Number: 11797
Gerrit-PatchSet: 7
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 13 Feb 2019 08:45:24 +0000
Gerrit-HasComments: Yes

Reply via email to