Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/23237 )
Change subject: IMPALA-14285: Add SAML2 authentication support for Coordinator Web UI ...................................................................... Patch Set 13: (23 comments) http://gerrit.cloudera.org:8080/#/c/23237/13//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23237/13//COMMIT_MSG@31 PS13, Line 31: Please add details on how this change was tested. http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc File be/src/rpc/authentication-util.cc: http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@263 PS13, Line 263: vector<string> split = Split(url, delimiter::Limit("/", 3)); Please add ctests for this function. http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@277 PS13, Line 277: const string& params_string, string* err_msg) { Please add ctests for this function. http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@280 PS13, Line 280: if (key_value.size() == 2) { What does it mean if key_value.size() is not 2? Is that an error situation that needs to be handled? http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@326 PS13, Line 326: ValidateSaml2Response This code eventually calls into org.pac4j.saml.sso.impl.SAML2AuthnResponseValidator which checks the SAML response's "InResponseTo" field to ensure the value of that field matches the id of a SAML request that was previously created. Thus, if the SAML response is received on a different coordinator than where the SAML request was generated, this validation will fail. If that is the case, is it possible to check other coordinators to see if they issued the SAML request? http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc File be/src/util/backend-gflag-util.cc: http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@174 PS13, Line 174: for HiveServer2 This help text would be more descriptive if it said something like "for client connections that use the hs2-http protocol" http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@180 PS13, Line 180: WebServer "WebServer" is a somewhat generic term. Would it make more sense to say "debug ui" instead? http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@190 PS13, Line 190: Highserver2 hs2-http connections http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@194 PS13, Line 194: Webserver Same question as line 180. http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h File be/src/util/webserver.h: http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h@202 PS13, Line 202: : void returnUnauthorized( : struct sq_connection* connection); : : // Helper method to handle SAML authentication failure with metrics and return code : sq_callback_result_t returnSamlAuthFailure(struct sq_connection* connection); : : void returnWrappedResponse( : struct sq_connection* connection, const impala::TWrappedHttpResponse& response); : : sq_callback_result_t ReadPostBody( : struct sq_connection* connection, int32_t max_length, std::string* out_body); Please add comments to all of these functions explaining what they do. http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h@329 PS13, Line 329: returnning Nit: returning http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc File be/src/util/webserver.cc: http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc@373 PS13, Line 373: || use_saml_ Since this is security related, I'm not a fan of the current approach where SAML may or may not work alongside LDAP/Kerberos. We need to either test SAML in conjunction with each or else add a new entry for SAML in the AuthMode enum (defined in webserver.h) to enforce that SAML must be used by itself. My preference would be the second option since it removes attack vectors. http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc@851 PS13, Line 851: bool use_saml = use_saml_ && !sq_get_header(connection, "X-Impala-EETest"); This line needs to check FLAGS_saml2_ee_test_mode, otherwise SAML auth can be ignored by passing the "X-Impala-EETest" http header if another auth method is not enabled. http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java File fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java: http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@87 PS13, Line 87: protected abstract String getCallBackUrl() throws Exception; Nit: Place all the abstract methods together in the code. http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@93 PS13, Line 93: protected void setRedirectCommon(WrappedWebContext webContext) Nit: add the "final" keyword to this method so subclasses cannot override it. http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@108 PS13, Line 108: public String validateAuthnResponseInner(WrappedWebContext webContext) Nit: change to "protected" visibility and add "final" keyword http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@137 PS13, Line 137: protected <T extends HiveSamlRelayStateInfo> void validateAuthnResponseCommon( Nit: add the "final" keyword to this method so subclasses cannot override it. http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java File fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java: http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java@46 PS13, Line 46: setName(ImpalaSamlClientHS2.class.getSimpleName()); : setStateGenerator(HiveSamlRelayStateStoreHS2.get()); Name and StateGenerator are set by both subclasses of ImpalaSamlClientBase. These should be set by the ImpalaSamlClientBase superclass, in its constructor, so they are not accidentally deleted or forgotten if a third subclass is created. http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java@53 PS13, Line 53: return BackendConfig.INSTANCE.getHS2Saml2SpCallbackUrl(); If this value will never change, then pass it to the ImpalaSamlClientBase constructor and have that class implement getCallBackUrl http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java File fe/src/main/java/org/apache/impala/service/JniFrontend.java: http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java@825 PS13, Line 825: Preconditions.checkNotNull(client); Nit: client will be Null if SAML idp metadata was not configured. Pass a message that explains which startup param is missing. http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java@844 PS13, Line 844: Preconditions.checkNotNull(client); Nit: client will be Null if SAML idp metadata was not configured. Pass a message that explains which startup param is missing. http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py File tests/custom_cluster/test_saml2_sso.py: http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py@145 PS13, Line 145: self._test_saml2_browser_workflow(attributes_xml, False) Please add additional test cases using a SAML response: * issued by an untrusted idp * containing an expired assertion (notAfter in the past) * containing an assertion that is not yet valid (notBefore in the future) * containing a value for "InResponseTo" that has an invalid id * containing a value for "InResponseTo" that matches a previous SAML response that was authenticated (in other words, a replay attack) http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py@324 PS13, Line 324: class TestWebserverSaml(TestSamlBase): Need assertions for the "impala.webserver.total-saml-auth-success" and "impala.webserver.total-saml-auth-failure" metrics -- To view, visit http://gerrit.cloudera.org:8080/23237 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I12540300529f9c240abf7196141ecb0ae6e37995 Gerrit-Change-Number: 23237 Gerrit-PatchSet: 13 Gerrit-Owner: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Nandor Kollar <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Fri, 30 Jan 2026 00:53:25 +0000 Gerrit-HasComments: Yes
