Mihaly Szjatinya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23237 )

Change subject: IMPALA-14285: Add SAML2 authentication support for Coordinator 
Web UI
......................................................................


Patch Set 15:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/23237/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23237/13//COMMIT_MSG@31
PS13, Line 31:
> Please add details on how this change was tested.
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc
File be/src/rpc/authentication-util.cc:

http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@263
PS13, Line 263:   vector<string> split = Split(url, delimiter::Limit("/", 3));
> Please add ctests for this function.
Added.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@277
PS13, Line 277: bool ParseParams(std::map<string, string*>& params_to_check, 
const string& original,
> Please add ctests for this function.
Added.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@280
PS13, Line 280:     vector<string> key_value = Split(pair, 
delimiter::Limit("=", 1));
> What does it mean if key_value.size() is not 2?  Is that an error situation
It means the parameter doesn't have the standard key=value format. The code 
silently skips these parameters.

P.S. The function was only moved by me from authentication.cc to share it with 
webserver.cc


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/rpc/authentication-util.cc@326
PS13, Line 326:
Thanks Csaba. I can only add that without load balancer multiple coordinators 
works fine for both http-hs2 and the webserver.

These two statements seems to contradict each other:

> If there is a failover, then it is to client's task to retry the SAML2 auth 
> workflow

> The tests in test_saml2_sso.py should verify a new SAML assertion is 
> requested if the user presents an invalid one.


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@174
PS13, Line 174: for client conn
> This help text would be more descriptive if it said something like "for cli
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@180
PS13, Line 180: debug web
> "WebServer" is a somewhat generic term.  Would it make more sense to say "d
ack, maybe 'debug web UI'


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@190
PS13, Line 190:
> hs2-http connections
ack


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/backend-gflag-util.cc@194
PS13, Line 194:
> Same question as line 180.
ack


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h
File be/src/util/webserver.h:

http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h@202
PS13, Line 202:
              :   // Returns a '401 - Unauthorized' / 'AuthenticationRequired' 
to the client.
              :   void returnUnauthorized(
              :       struct sq_connection* connection);
              :
              :   // Helper method to handle SAML authentication failure with 
metrics and return code
              :   sq_callback_result_t returnSamlAuthFailure(struct 
sq_connection* connection);
              :
              :   // Returns a response based on a TWrappedHttpResponse that 
can come from Frontend code.
              :   void returnWrappedResponse(
              :       struct sq_connection* connection, const 
impala::TWrappedHttpResponse& response);
              :
> Please add comments to all of these functions explaining what they do.
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.h@329
PS13, Line 329: a before s
> Nit: returning
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc
File be/src/util/webserver.cc:

http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc@373
PS13, Line 373:  || use_saml_
@Csaba, thanks for details!

> We should still test one or two of the more likely combinations (such as 
> SAML+LDAP or SAML+Keberos).

As this would also involve the HS2, maybe we can have a different scope for 
this?


http://gerrit.cloudera.org:8080/#/c/23237/13/be/src/util/webserver.cc@851
PS13, Line 851:   bool use_saml = use_saml_ && !sq_get_header(connection, 
"X-Impala-EETest");
> This line needs to check FLAGS_saml2_ee_test_mode, otherwise SAML auth can
Actually the comment above is wrong here. I confused 'FLAGS_saml2_ee_test_mode' 
Impala flag with 'X-Impala-EETest' http header.

These are unrelated. We're checking the header here, which is sent from the 
bootstrap scripts based on a mandatory SAML flag.

Fixed the comment.


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java:

http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@87
PS13, Line 87:     // 1. Validate that InResponseTo matches a request ID we 
actually sent
> Nit: Place all the abstract methods together in the code.
This method is not even needed.


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@93
PS13, Line 93:
> Nit: add the "final" keyword to this method so subclasses cannot override i
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@108
PS13, Line 108:     WithLocationAction locationAction = (WithLocationAction) 
redirect.get();
> Nit: change to "protected" visibility and add "final" keyword
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientBase.java@137
PS13, Line 137:     return nameId;
> Nit: add the "final" keyword to this method so subclasses cannot override i
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java
File 
fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java:

http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java@46
PS13, Line 46:         ImpalaSamlClientHS2.class.getSimpleName(),
             :         HiveSamlRelayStateStoreHS2.get(),
> Name and StateGenerator are set by both subclasses of ImpalaSamlClientBase.
Good point, fixed


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java@53
PS13, Line 53:       throws InternalException {
> If this value will never change, then pass it to the ImpalaSamlClientBase c
Fixed


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java
File fe/src/main/java/org/apache/impala/service/JniFrontend.java:

http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java@825
PS13, Line 825:     Preconditions.checkNotNull(client, "SAML client not 
configured. "
> Nit:  client will be Null if SAML idp metadata was not configured.  Pass a
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/fe/src/main/java/org/apache/impala/service/JniFrontend.java@844
PS13, Line 844:
> Nit:  client will be Null if SAML idp metadata was not configured.  Pass a
Ack


http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py
File tests/custom_cluster/test_saml2_sso.py:

http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py@145
PS13, Line 145:                                          
assertation_id=str(uuid.uuid4()),
> Please add additional test cases using a SAML response:
Done. The first three are easy.

For replay attacks two security mechanisms are involved:
1. MessageReplaySecurityHandler (already in place) - automatically detects the 
same {msg_id} on credentialsExtractor.extract(). No changes needed for this one.
2. ImpalaSAMLMessageStore (added by me) - detects the same {request_id} as 
'InResponseTo'. Once in place, also checked by credentialsExtractor.extract(). 
It uses the same type of cache as HiveSamlRelayStateStoreBase.

NOTE: While Webserver returns 401 upon any of the above attacks, the HS2 for 
some reason returns 200 and a html form, only without token. I've crafted the 
tests to work with this, but perhaps we could just return the same 401 here:
https://gerrit.cloudera.org/#/c/23237/13/fe/src/main/java/org/apache/impala/authentication/saml/ImpalaSamlClientHS2.java@108
(This code is refactored but the logic is preserved)

NOTE2: I just realised that for the Webserver we're not invalidating the 
RelayState cache in the end of the sequence. For HS2 it is done in the end of 
Step 3/3. This makes the replay attack possible as Step 1 -> Step 2 -> Step 2 
again.
But for Webserver if we invalidate at Step 2/2, this will not even allow us to 
reach credentialsExtractor.extract(), thus making replay attacks impossible.
In that case we can remove the replay attack tests for Webserver. What do you 
think?

cc @Chaba


http://gerrit.cloudera.org:8080/#/c/23237/13/tests/custom_cluster/test_saml2_sso.py@324
PS13, Line 324:     self._test_common_protocol_checks()
> Need assertions for the "impala.webserver.total-saml-auth-success" and "imp
Good point.



--
To view, visit http://gerrit.cloudera.org:8080/23237
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12540300529f9c240abf7196141ecb0ae6e37995
Gerrit-Change-Number: 23237
Gerrit-PatchSet: 15
Gerrit-Owner: Mihaly Szjatinya <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Mihaly Szjatinya <[email protected]>
Gerrit-Reviewer: Nandor Kollar <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Wed, 04 Feb 2026 17:26:19 +0000
Gerrit-HasComments: Yes

Reply via email to