Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23237 )

Change subject: IMPALA-14285: Add SAML2 authentication support for Coordinator 
Web UI
......................................................................


Patch Set 13:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/23237/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23237/13//COMMIT_MSG@31
PS13, Line 31:
Please add details on how this change was tested.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc
File be/src/rpc/authentication-util.cc:

http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@263
PS13, Line 263:   vector<string> split = Split(url, delimiter::Limit("/", 3));
Please add ctests for this function.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@277
PS13, Line 277:     const string& params_string, string* err_msg) {
Please add ctests for this function.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@280
PS13, Line 280:     if (key_value.size() == 2) {
What does it mean if key_value.size() is not 2?  Is that an error situation 
that needs to be handled?


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@326
PS13, Line 326: ValidateSaml2Response
This code eventually calls into 
org.pac4j.saml.sso.impl.SAML2AuthnResponseValidator which checks the SAML 
response's "InResponseTo" field to ensure the value of that field matches the 
id of a SAML request that was previously created.  Thus, if the SAML response 
is received on a different coordinator than where the SAML request was 
generated, this validation will fail.  If that is the case, is it possible to 
check other coordinators to see if they issued the SAML request?


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@174
PS13, Line 174: for HiveServer2
This help text would be more descriptive if it said something like "for client 
connections that use the hs2-http protocol"


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@180
PS13, Line 180: WebServer
"WebServer" is a somewhat generic term.  Would it make more sense to say "debug 
ui" instead?


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@190
PS13, Line 190: Highserver2
hs2-http connections


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@194
PS13, Line 194: Webserver
Same question as line 180.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h
File be/src/util/webserver.h:

http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h@202
PS13, Line 202:
              :   void returnUnauthorized(
              :       struct sq_connection* connection);
              :
              :   // Helper method to handle SAML authentication failure with 
metrics and return code
              :   sq_callback_result_t returnSamlAuthFailure(struct 
sq_connection* connection);
              :
              :   void returnWrappedResponse(
              :       struct sq_connection* connection, const 
impala::TWrappedHttpResponse& response);
              :
              :   sq_callback_result_t ReadPostBody(
              :       struct sq_connection* connection, int32_t max_length, 
std::string* out_body);
Please add comments to all of these functions explaining what they do.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h@329
PS13, Line 329: returnning
Nit: returning


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc
File be/src/util/webserver.cc:

http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc@373
PS13, Line 373:  || use_saml_
Since this is security related, I'm not a fan of the current approach where 
SAML may or may not work alongside LDAP/Kerberos.  We need to either test SAML 
in conjunction with each or else add a new entry for SAML in the AuthMode enum 
(defined in webserver.h) to enforce that SAML must be used by itself.  My 
preference would be the second option since it removes attack vectors.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc@851
PS13, Line 851:   bool use_saml = use_saml_ && !sq_get_header(connection, 
"X-Impala-EETest");
This line needs to check FLAGS_saml2_ee_test_mode, otherwise SAML auth can be 
ignored by passing the "X-Impala-EETest" http header if another auth method is 
not enabled.


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java:

http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@87
PS13, Line 87:   protected abstract String getCallBackUrl() throws Exception;
Nit: Place all the abstract methods together in the code.


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@93
PS13, Line 93:   protected void setRedirectCommon(WrappedWebContext webContext)
Nit: add the "final" keyword to this method so subclasses cannot override it.


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@108
PS13, Line 108:   public String validateAuthnResponseInner(WrappedWebContext 
webContext)
Nit: change to "protected" visibility and add "final" keyword


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@137
PS13, Line 137:   protected <T extends HiveSamlRelayStateInfo> void 
validateAuthnResponseCommon(
Nit: add the "final" keyword to this method so subclasses cannot override it.


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java:

http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java@46
PS13, Line 46:     setName(ImpalaSamlClientHS2.class.getSimpleName());
             :     setStateGenerator(HiveSamlRelayStateStoreHS2.get());
Name and StateGenerator are set by both subclasses of ImpalaSamlClientBase.  
These should be set by the ImpalaSamlClientBase superclass, in its constructor, 
so they are not accidentally deleted or forgotten if a third subclass is 
created.


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java@53
PS13, Line 53:     return BackendConfig.INSTANCE.getHS2Saml2SpCallbackUrl();
If this value will never change, then pass it to the ImpalaSamlClientBase 
constructor and have that class implement getCallBackUrl


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java@825
PS13, Line 825:     Preconditions.checkNotNull(client);
Nit:  client will be Null if SAML idp metadata was not configured.  Pass a 
message that explains which startup param is missing.


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java@844
PS13, Line 844:     Preconditions.checkNotNull(client);
Nit:  client will be Null if SAML idp metadata was not configured.  Pass a 
message that explains which startup param is missing.


http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py
File tests/custom_cluster/test_saml2_sso.py:

http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py@145
PS13, Line 145:     self._test_saml2_browser_workflow(attributes_xml, False)
Please add additional test cases using a SAML response:
* issued by an untrusted idp
* containing an expired assertion (notAfter in the past)
* containing an assertion that is not yet valid (notBefore in the future)
* containing a value for "InResponseTo" that has an invalid id
* containing a value for "InResponseTo" that matches a previous SAML response 
that was authenticated (in other words, a replay attack)


http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py@324
PS13, Line 324: class TestWebserverSaml(TestSamlBase):
Need assertions for the "impala.webserver.total-saml-auth-success" and 
"impala.webserver.total-saml-auth-failure" metrics



--
To view, visit http://gerrit.cloudera.org:8080/23237
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12540300529f9c240abf7196141ecb0ae6e37995
Gerrit-Change-Number: 23237
Gerrit-PatchSet: 13
Gerrit-Owner: Mihaly Szjatinya <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Mihaly Szjatinya <[email protected]>
Gerrit-Reviewer: Nandor Kollar <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Fri, 30 Jan 2026 00:53:25 +0000
Gerrit-HasComments: Yes

Reply via email to