Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15483 )

Change subject: KUDU-3079 Add MiniRanger
......................................................................


Patch Set 16:

(43 comments)

http://gerrit.cloudera.org:8080/#/c/15483/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15483/16//COMMIT_MSG@33
PS16, Line 33: RESY
REST


http://gerrit.cloudera.org:8080/#/c/15483/16/build-support/dist_test.py
File build-support/dist_test.py:

http://gerrit.cloudera.org:8080/#/c/15483/16/build-support/dist_test.py@116
PS16, Line 116:      # Add Postgres and Ranger
Nit: terminate with a period.

May also want to note that these are symlinks.


http://gerrit.cloudera.org:8080/#/c/15483/16/build-support/run_dist_test.py
File build-support/run_dist_test.py:

http://gerrit.cloudera.org:8080/#/c/15483/16/build-support/run_dist_test.py@165
PS16, Line 165:   for bin_path in glob.glob(os.path.join(ROOT, "build/*/bin")):
Can you just merge this loop with the chrony one, and update the comment?


http://gerrit.cloudera.org:8080/#/c/15483/16/build-support/run_dist_test.py@176
PS16, Line 176: glob.glob("/usr/lib/jvm/java-1.8.0-*")[0]
This was already done above (L154); reuse.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/mini-cluster/external_mini_cluster.cc
File src/kudu/mini-cluster/external_mini_cluster.cc:

http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/mini-cluster/external_mini_cluster.cc@337
PS16, Line 337:   if (opts_.enable_ranger) {
Where do we configure the Kudu Masters to talk to MiniRanger? Or is that going 
to be in a follow-up?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/CMakeLists.txt
File src/kudu/ranger/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/CMakeLists.txt@69
PS16, Line 69: # need to chmod +x 
thirdparty/src/postgresql-42.2.10/postgresql-42.2.10.jar
What is this comment telling us? Is this a TODO?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/CMakeLists.txt@100
PS16, Line 100:   mini_ranger
Alphabetical sorting.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger-test.cc
File src/kudu/ranger/mini_ranger-test.cc:

http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger-test.cc@40
PS16, Line 40:   MiniRanger ranger;
Should be ranger_


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger-test.cc@41
PS16, Line 41:
Nit: empty line here.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger-test.cc@44
PS16, Line 44: TEST_F(MiniRangerTest, TestGrantPrivilege) {
We should at least a full lifecycle (start, restart, stop, etc)

Maybe also test persistence (i.e. you started Ranger and did something. If you 
restart it, are the effects of that change still there?)


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger-test.cc@52
PS16, Line 52:   policy.items.emplace_back(item);
std::move


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.h
File src/kudu/ranger/mini_ranger.h:

http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.h@40
PS16, Line 40: typedef std::pair<std::vector<std::string>, 
std::vector<ActionPB>> policy_item;
This needs documentation, and should be called PolicyItem.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.h@42
PS16, Line 42: struct AuthorizationPolicy {
Doc.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.h@50
PS16, Line 50: class MiniRanger {
Doc.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.h@57
PS16, Line 57: move
How did you get away with this? We shouldn't be 'using' anything in this 
header. Should be std::move.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.h@64
PS16, Line 64:   Status AddPolicy(const AuthorizationPolicy& policy) 
WARN_UNUSED_RESULT;
Doc.

Also, can we pass by value and move into it?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.h@69
PS16, Line 69:   static Status CreateRangerConfigs(const std::string& 
config_string,
Doc.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.h@94
PS16, Line 94:   std::string ranger_admin_url_;
Doc.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc
File src/kudu/ranger/mini_ranger.cc:

http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@53
PS16, Line 53:
What a mess. In future work, could you:
1. Take a stab at removing every property/parameter we don't need.
2. Documenting the effect of what remains?

I know it's a big ask, but these configs are just completely inscrutable. I 
have no idea how anyone in the future could modify them safely.

Might also consider storing these templates in a separate file, with a header 
to retrieve the de-templatized versions of them.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1132
PS16, Line 1132:     <value>simple</value>
Does this warrant a TODO or something? I presume we want to support Kerberos at 
some point.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1141
PS16, Line 1141:   // TODO(awong): codify that we're waiting for Postgres to 
become usable.
Yeah, this should be the next priority, otherwise the massive number of Sentry 
tests that we should (hopefully) bring to bear with Ranger will take a very 
long time to run.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1147
PS16, Line 1147:   if (ranger_process_ && ranger_process_->IsStarted()) {
What if Start() failed mid-way such that mini_pg_ has started but Ranger 
hasn't? Don't we want to explicitly stop mini_pg_ (so we can warn if it failed)?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1148
PS16, Line 1148:     CHECK_OK(Stop());
Should WARN_NOT_OK.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1157
PS16, Line 1157:   unique_ptr<Subprocess> ranger_shutdown(new Subprocess({
Holy moly you can't just send the thing a TERM/KILL? What happens if you don't 
do this fancy shutdown?

Also, doesn't need to be heap allocated.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1179
PS16, Line 1179:   if (data_root_.empty()) {
               :     data_root_ = GetTestDataDirectory();
               :   }
Do in the constructor and make data_root_ const?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1213
PS16, Line 1213:   if (!env->FileExists(admin_home)) {
Decompose this block into a helper.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1215
PS16, Line 1215:     env->CreateDir(admin_home);
RETURN_NOT_OK.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1223
PS16, Line 1223:     LOG(INFO) << "Starting Ranger out of " << admin_home;
Shouldn't this be outside the "fresh install" case?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1232
PS16, Line 1232:   RETURN_NOT_OK(GetFQDN(&fqdn));
What about setting up Ranger on a UNIQUE_LOOPBACK address?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1237
PS16, Line 1237:   uint16_t ranger_shutdown_port = GetRandomPort();
There should be a comment somewhere explaining all of the ports that we need to 
set up, and what purpose they serve.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1238
PS16, Line 1238:   RETURN_NOT_OK(CreateRangerConfigs(
Decompose all the config stuff into a helper.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1258
PS16, Line 1258:   if (fresh_install) {
Decompose this block into a helper.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1261
PS16, Line 1261:     symlink(JoinPathSegments(ranger_home_, 
"ews/webapp").c_str(),
               :             kWebAppDir.c_str());
Add a symlink function to env.h and env_posix.h (and a test to env-test.cc).

Can be done in a follow-up if you prefer, but soon (as this doesn't even check 
return value).


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1264
PS16, Line 1264: Much of this encapsulates setup.sh from apache/ranger
Link?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1275
PS16, Line 1275:     unique_ptr<Subprocess> db_setup(new Subprocess(
Doesn't need to be heap allocated.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1343
PS16, Line 1343:     EasyCurl curl;
               :     EasyJson service;
               :     service.Set("name", "kudu");
               :     service.Set("type", "kudu");
               :     faststring result;
               :     Status curl_result = 
curl.PostToURL(JoinPathSegments(ranger_admin_url_,
               :                                                   
"service/plugins/services"),
               :                                  service.ToString(), &result, 
{"Content-Type: application/json"},
               :                                  "admin", "admin");
Decompose into a helper.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1357
PS16, Line 1357: Status MiniRanger::CreateRangerConfigs(const string& 
config_string, const string& file_path) {
Pass by value and std::move both of these.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1366
PS16, Line 1366: int16_t MiniRanger::GetRandomPort() {
Duplicated from MiniPostgres; decompose into net_util somewhere.


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/ranger/mini_ranger.cc@1434
PS16, Line 1434:   LOG(INFO) << result.ToString();
If there's an error, will it be reflected in the HTTP status code (i.e. will 
curl fail?) Or do we need to process 'result' to find it?


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/util/curl_util.h
File src/kudu/util/curl_util.h:

PS16:
Can you unit test this change? See PasswdWebserverTest in webserver-test (which 
you may want to refactor to take advantage of the new functionality anyway).


http://gerrit.cloudera.org:8080/#/c/15483/16/src/kudu/util/curl_util.h@51
PS16, Line 51:                   const std::string& username = "",
             :                   const std::string& password = "");
These are likely to be the same with each call to FetchURL or PostToURL; could 
you plumb them as a setter instead?

Also, please make that setter mutually exclusive with set_use_spnego() (i.e. 
allow just one to be used).


http://gerrit.cloudera.org:8080/#/c/15483/16/thirdparty/build-thirdparty.sh
File thirdparty/build-thirdparty.sh:

http://gerrit.cloudera.org:8080/#/c/15483/16/thirdparty/build-thirdparty.sh@298
PS16, Line 298:   cp -R \
This isn't a directory, so why -R?

Also don't you want -f so you can overwrite the destination file if it exists 
(i.e. multiple invocations of this script).

Alternatively, maybe a symlink would be good enough?


http://gerrit.cloudera.org:8080/#/c/15483/16/thirdparty/patches/ranger-fixscripts.patch
File thirdparty/patches/ranger-fixscripts.patch:

PS16:
The patch is obviously just a limited view into the script itself, but is it OK 
to just outright remove these three variables? Looks like XAPOLICYMGR_DIR is 
needed to set XAPOLICYMGR_EWS_DIR just below; how does this work?



--
To view, visit http://gerrit.cloudera.org:8080/15483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15ab1eb8abe71c074c26b286073442882e101bc6
Gerrit-Change-Number: 15483
Gerrit-PatchSet: 16
Gerrit-Owner: Attila Bukor <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Hao Hao <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 21 Mar 2020 00:33:23 +0000
Gerrit-HasComments: Yes

Reply via email to