Thomas Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13299 )

Change subject: IMPALA-8538: HS2 + HTTP(S) + BASIC/LDAP based thrift server 
endpoint
......................................................................


Patch Set 3:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13299/2//COMMIT_MSG@21
PS2, Line 21:
            :
            : TODO
> It seems to me like it would be easier to just copy-paste this, since the T
Sure. I checked, and the differences between this version and thrift master's 
THttp* is quite minimal. Its certainly less work for me not to have to patch 
native-toolchain.

I'm interested in trying to contribute this upstream, but I assume that process 
will take awhile, so for this patch I'll go ahead and take the "[not for 
review]" off of the patch that copies the files that this patch is based on.


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc
File be/src/rpc/authentication.cc:

http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@502
PS2, Line 502:     LOG(ERROR) << "Failed to decode base64 auth string from: "
> for all of the log messages in this function, can we include the remote soc
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@505
PS2, Line 505:   }
> stack allocating here seems quite dangerous without constraining the length
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@507
PS2, Line 507:   if (colon == std::string::npos) {
> Looking at Base64Decode, it doesn't seem to null-terminate the output, but
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@521
PS2, Line 521:
> again I'd feel safer about the above code if we used C++ strings, like:
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@957
PS2, Line 957:
> this function is a bit weirdly named, since in the HTTP case, it isn't sett
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@977
PS2, Line 977:   switch (underlying_transport_type) {
> can we just LOG(FATAL) here since it would be a coding bug?
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/rpc/authentication.cc@998
PS2, Line 998: }
> same
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/service/impala-server.cc@2424
PS2, Line 2424:       ThriftServer* server;
> Maybe we should use '-1' to mean disable? port 0 usually means "use an ephe
The problem with that is that the equivalent flags, eg. for hs2_port and 
beewax, use 0 to mean disabled. I'm not sure its a good idea either to have 
inconsistent behavior between flags or to change the behavior of the existing 
flags.


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h
File be/src/transport/THttpServer.h:

http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h@71
PS2, Line 71:  Wraps a transport i
> that's a bit of an odd choice of type instead of std::string
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.h@80
PS2, Line 80:
> explicit
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp
File be/src/transport/THttpServer.cpp:

http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@68
PS2, Line 68: rwarded
> probably needs to be made case-insensitive (odd that x-forwarded-for is not
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@69
PS2, Line 69:
> probably need to also check that sz >= 7, otherwise we might read past the
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@76
PS2, Line 76:       }
            :       authorized_ = true;
> is base64AuthString_ were a string you could just use a simple assignment h
Done


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@79
PS2, Line 79:   }
> I think it's worth considering this a bit carefully. It seems that the desi
As discussed, I went with a design where we set the username on the connection 
context each time we authenticate and then check that the session username 
matches the current connection username when doing operations. This works 
because each connection corresponds to a single thread which will read some 
headers and process the corresponding rpc one at a time.

One concern this leaves me with is that now we're hitting ldap on every rpc. If 
that seems like a potential perf issue, I could add a list of already 
authenticated base64 strings here and check that before calling the auth fn, or 
I could add support for cookies, which is probably a better solution but I'm 
not sure how much work it would be.


http://gerrit.cloudera.org:8080/#/c/13299/2/be/src/transport/THttpServer.cpp@193
PS2, Line 193: }
> in theory shouldn't there be two CRLF after the last header?
Yeah I wasn't sure what to use for 'realm'. I guess the server name, which in 
this patch would be 'hiveserver2-http-frontend' or possibly 
'impala.thrift-server.hiveserver2-http-frontend'


http://gerrit.cloudera.org:8080/#/c/13299/2/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

http://gerrit.cloudera.org:8080/#/c/13299/2/bin/start-impala-cluster.py@204
PS2, Line 204: :
> flake8: E203 whitespace before ':'
Done



--
To view, visit http://gerrit.cloudera.org:8080/13299
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5569ac62ef3af2868b5d0581f5029dac736b2ff
Gerrit-Change-Number: 13299
Gerrit-PatchSet: 3
Gerrit-Owner: Thomas Marshall <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Mike Yoder <[email protected]>
Gerrit-Reviewer: Sudhanshu Arora <[email protected]>
Gerrit-Reviewer: Thomas Marshall <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Wed, 22 May 2019 20:56:00 +0000
Gerrit-HasComments: Yes

Reply via email to