Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11797 )
Change subject: [sentry] Integrate AuthzProvider into CatalogManager ...................................................................... Patch Set 10: (9 comments) http://gerrit.cloudera.org:8080/#/c/11797/10/src/kudu/integration-tests/hms_itest-base.h File src/kudu/integration-tests/hms_itest-base.h: http://gerrit.cloudera.org:8080/#/c/11797/10/src/kudu/integration-tests/hms_itest-base.h@36 PS10, Line 36: // Create a database in the HMS catalog. Nit: "Creates" vs. "Create" (see how CheckTable and CheckTableDoesNotExist were commented). Below too. http://gerrit.cloudera.org:8080/#/c/11797/10/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/11797/10/src/kudu/integration-tests/master_hms-itest.cc@128 PS10, Line 128: NO_FATALS(CheckTable(hms_database_name, hms_table_name, /* user = */ none)); Nit: as before, should be /*user=*/ and not /* user = */ Elsewhere too. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc File src/kudu/integration-tests/master_sentry_advance-itest.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/integration-tests/master_sentry_advance-itest.cc@31 PS7, Line 31: > Hmm, when I included this, I got an 'duplicate symbols' compilation error. Sounds like we have a header without an "include once" guard. Can you post the entirety of the error? http://gerrit.cloudera.org:8080/#/c/11797/10/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/11797/10/src/kudu/master/catalog_manager.h@56 PS10, Line 56: class AuthzTokenTest_TestSingleMasterUnavailable_Test;; Nit: there's an extra semicolon here. http://gerrit.cloudera.org:8080/#/c/11797/9/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/11797/9/src/kudu/master/catalog_manager.h@861 PS9, Line 861: // The authorization step happens after table locking to ensure the validation This doesn't seem quite right; how can Alice rename B to A if Alice just dropped B? I think a simpler example would be "Alice has RENAME privileges on table B. Bob has DROP privileges on table B. Alice tries to rename B to A at the same time that Bob tries to drop B." I'd also talk about how, the goal here is to enforce that, once a DDL has been authorized, it is carried out on a table whose authz-affecting properties do not change. That's a bit confusing sounding; what I mean is that e.g. it is illegal for the table's name to change between the authorization of a particular action on the table and the client-visible execution of that action. http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/11797/3/src/kudu/master/catalog_manager.cc@1798 PS3, Line 1798: auto authorized_error = [&] { > Hmm, I see. I am not sure either the request's table name or table ID is mo Generally speaking we trust the table ID more than its name, because the name can be specified by the user in various client API calls while the ID is only attainable from a prior RPC (and can't be changed by the user). However, if the user is issuing raw RPCs at the wire protocol level, they can set both ID and name to be whatever they want. So let's take a step back and look at this from first principles. Here's what we have: 1. a table identifier with both an ID and a name 2. the ID maps to table A. 3. the name maps to table B. We want to report a good error for this case, and error() doesn't do that, because it'll just return a "table B not found" error. Ideally we'd want to say something about how the client provided two forms of table identification, each of which map to a different table. We'd want to mention both tables' names explicitly. To do this, we should authorize the user against both tables, then construct a custom error message for this case rather than relying on error(). http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@1794 PS7, Line 1794: "the table does not exist", SecureShortDebugString(table_identifier)), > Done Not done? 'l' is still in scope when authorized_error() is called, which calls authorize(), which calls authz_func(), which may make a Sentry call. http://gerrit.cloudera.org:8080/#/c/11797/7/src/kudu/master/catalog_manager.cc@2744 PS7, Line 2744: shared_lock<LockType> l(lock_); : for (const TableInfoMap::value_type &entry : normalized_table_n > I don't see there is a good way to do bulk authorization based on the curre That could work, but it's somewhat complicated and will make reasoning about the relationship between client RPCs and Sentry RPCs more difficult. Would it be really challenging to add a bulk authorization API to Sentry? For this use case we just need to be able to provide a list of authorizables. It could be limited such that all authorizables have to share the same scope. Another idea is to transform the list of tables into a (smaller hopefully) list of databases, and to authorize that list one-by-one, using the results to perform filtering. I guess that's not fundamentally different from fetching all server=server1 privileges, just would perhaps result in smaller chunks. Let's continue with this naive approach for the time being, but please add a strongly worded TODO here so that we'll remember to profile this and work out a better solution. http://gerrit.cloudera.org:8080/#/c/11797/10/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/11797/10/src/kudu/master/catalog_manager.cc@1816 PS10, Line 1816: NormalizeTableName(table_identifier.table_name()) Can reuse the local variable table_name here. -- To view, visit http://gerrit.cloudera.org:8080/11797 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iab4aa027ae6eb4520db48ce348db552c9feec2a8 Gerrit-Change-Number: 11797 Gerrit-PatchSet: 10 Gerrit-Owner: Hao Hao <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 07 Mar 2019 23:03:03 +0000 Gerrit-HasComments: Yes
