Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12833 )
Change subject: WIP [master] introduced SentryAuthzCache ...................................................................... Patch Set 6: (28 comments) http://gerrit.cloudera.org:8080/#/c/12833/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12833/2//COMMIT_MSG@7 PS2, Line 7: WIP [master] introduced SentryAuthzCache > Yeah, I didn't mean that to say, "We should focus on DML instead of DDL," o Ah, I see. Thank you for the reminder and the explanation! http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider-test.cc File src/kudu/master/sentry_authz_provider-test.cc: http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider-test.cc@211 PS5, Line 211: ASSERT_TRUE(s.IsNotAuthorized()) << s.ToString(); > Doesn't the superclass do this? Indeed -- it seems that left from some earlier revision. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider-test.cc@341 PS5, Line 341: // works as expected. : class TestAuthzHierarchy : : public SentryAuthzProviderTest, : public ::testing::WithParamInterface< : > I think if a user wanted that behavior, they could disable the cache, no? S I think it's a good thing that using cache allows to avoid service interruptions when Sentry goes down for some reason. As of now, I think people can follow Andrew's suggestion and disable authz caching at all if they want 'hard failures' when Sentry is not around. >From the other side, we can introduce some feature where we monitor Sentry's >presence and invalidating the cache as soon as we detected Sentry's down and >allow to toggle that as an option. However, I don't see why such a behavior >might be desirable at all. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider-test.cc@503 PS5, Line 503: : > Do you intend to extend this with more test cases? Yep, sure. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.h File src/kudu/master/sentry_authz_provider.h: http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.h@30 PS5, Line 30: #include "kudu/sentry/sentry_client.h" > With some minor rework you can avoid this inclusion, which would be nice be Right, that's needed for TTLCache declaration. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.h@102 PS5, Line 102: > function Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc File src/kudu/master/sentry_authz_provider.cc: http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@118 PS5, Line 118: > Nit: "A value" Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@130 PS5, Line 130: } > The explanation here doesn't match the runtime behavior. AFAICT, finest_sco Good catch -- it seems I forgot to implement the coarsening of the requested scope. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@133 PS5, Line 133: > Nit: "for a" Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@134 PS5, Line 134: : : metric_entity_(std::move(metric_entity)) { > Nit: "and the query led to a cache miss," Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@141 PS5, Line 141: : Status SentryAuthzProvider::Start() { : vector<HostPort> addresses; > Nit: "which means the cache will fetch exactly the authz scope level it was Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@276 PS5, Line 276: > nit: A Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@280 PS5, Line 280: > nit: Maybe name it 'user' instead? Or add a comment that this is used as a Sure, if it makes it clearer then let's use 'user' and 'authorizable' instead. For some reason I thought that 'subject' and 'object' are easier to comprehend in the context of authorizing some actions. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@294 PS5, Line 294: > Nit: will Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@295 PS5, Line 295: > Nit: drop 'the' Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@304 PS5, Line 304: > Nit: "The same" Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@305 PS5, Line 305: > privileges Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@307 PS5, Line 307: : : : : : : : > nit: May be slightly more readable starting at 0, eg: Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@367 PS5, Line 367: > This check seems unnecessary. I'm just paranoid :) Removed http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@368 PS5, Line 368: > Should note that this is an approximation; we're trying to gauge the size o Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@369 PS5, Line 369: > Do we need to add sizeof(p) as well? I think that's already accounted at line 368, no? http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@464 PS5, Line 464: > After discussion with Andrew offline, I agree that we actually should sanit Thanks for the productive discussion. I think we will end up caching some search-optimized structures instead of raw responses from Sentry, sure. With that, since it will be necessary to do all those transformations anyways, we can do any filtering and sanitize the responses as needed. I think Andrew already posted a patch for that. Maybe, we can start from that point and see how we can improve that further. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@492 PS5, Line 492: > Note that these checks are being applied both to the cached and non-cached I guess Hao already mentioned in one of comments above that it's assumed that Sentry sends sane responses, otherwise that must have been a bug that should be fixed. However, after discussion it seems the consensus is that we want to sanitize responses since we are about to filter out anything which is not Kudu-related anyways. As for the cached responses for this current approach, we need to do so because it might be a response for a higher scope of authz hierarchy and it might contain information for other branches which is not relevant. And even worse -- using information from those branches might result in wrong authz decisions. So, it's necessary to filter other unrelated branches out. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@524 PS5, Line 524: > Isn't this trivially true by reading the 10 lines just above? Not necessarily. This verification is meant to spot anomalies like the following: authorizable = /userA/serverS/databaseD privilege = /userA/serverS///columnC or authorizable = /userA/serverS/databaseD priviledge = /userA/serverS//tableT I added a comment; hopefully it's clearer now. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@525 PS5, Line 525: : : : : : : > Nit: more concise as a for loop: Done http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@601 PS5, Line 601: > You could also use SentryAUthorizableScope::FromString() and that'll be cas Done. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@603 PS5, Line 603: > Did you explore having Put() usage mirror Get()? That is, rather than using Yeah, that's an interesting idea. But as I understand that would be semantically wrong: inserting information, say, for the authz branch of particular table up to the server level will miss info for other branches down from the server level, and that's incorrect. We discussed various caching approaches during our latest authz sync-up meeting and it seems we are about to choose among few. Let's see how it goes. http://gerrit.cloudera.org:8080/#/c/12833/5/src/kudu/master/sentry_authz_provider.cc@608 PS5, Line 608: : : > I don't feel too strongly about it, but I agree and another benefit would b I figured that if doing proper authz scope coarsening, then keeping two separate HasPrivilege() calls makes more sense. As for direct call to GetSentryPrivileges(), I think it's back and forth movements: one of the intermediate version (not published, though) had that, but then I moved everything at authz_provider level. Now it's back in new PS for review. -- To view, visit http://gerrit.cloudera.org:8080/12833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idaefacd50736f1f152dae34e76778e17b2e84cbe Gerrit-Change-Number: 12833 Gerrit-PatchSet: 6 Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Hao Hao <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Tue, 02 Apr 2019 01:52:59 +0000 Gerrit-HasComments: Yes
