[GitHub] [hadoop] fapifta commented on a change in pull request #3658: HADOOP-17975 Fallback to simple auth does not work for a secondary DistributedFileSystem instance

GitBox Tue, 16 Nov 2021 15:37:02 -0800


fapifta commented on a change in pull request #3658:
URL: https://github.com/apache/hadoop/pull/3658#discussion_r750770358




##########
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
##########
@@ -1679,11 +1679,13 @@ private Connection getConnection(ConnectionId remoteId,
     private final boolean doPing; //do we need to send ping message
     private final int pingInterval; // how often sends ping to the server in 
msecs
     private String saslQop; // here for testing
+    private final AtomicBoolean fallbackToSimpleAuth;
     private final Configuration conf; // used to get the expected kerberos 
principal name
     
     ConnectionId(InetSocketAddress address, Class<?> protocol, 
                  UserGroupInformation ticket, int rpcTimeout,
-                 RetryPolicy connectionRetryPolicy, Configuration conf) {
+                 RetryPolicy connectionRetryPolicy, Configuration conf,
+                 AtomicBoolean fallbackToSimpleAuth) {

Review comment:
       Yes, they are not equals, but let me re-iterate... they should not be 
equals.
   Three things to note here:
   - every DfsClient creates a new AtomicBoolean initialized to false
   - these atomic booleans are set to true by ipc Client's Connection's 
setupIOStreams method which can be called only after we have a connection based 
on the ConnectionID which we would than possibly change in setupIOStreams.
   - in the ipc Client layer, a connection to a NameNode is not different from 
a connection to a DataNode, as it is a connection to a host port combination 
with given network related setup.
   
   Comparing the boolean values instead of the atomic boolean reference will 
bring us back to the initial problem. We will have a false value when we first 
try to get the connection based on the ConnectionID, then after we get the 
connection we change the value to true. Which is even worse, as in this case if 
fallback is on, we would have for sure two connections even for a single 
DfsClient, as when we get the connection the second time, the value would be 
true, so we would create an other connection for the new ConnectionID, and we 
possibly leak the previous one, as we would never create a ConnectionID with 
false again from that client. While on the other hand, from a second DfsClient, 
we would get the connection with the ID contains the false value, and we would 
have the problem we are trying to fix (unset AtomicBoolean prevents fallback of 
the second DfsClient).
   
   
   
   About the number of connections we discussed in the previous PR, though at 
that point NN was out of picture, but yes, this change affects the NN 
connections as well if there are multiple DfsClients that are active.
   
   > Your suggestion to add the client's ID to the ConnectionID would help, but 
there is a tradeoff there, as in this case client B has to set up a new socket 
connection to DataNode D.
   >That would cause two things:
   >
   > 1. is an overhead of creating a new connection instead of reusing what we 
already have.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [hadoop] fapifta commented on a change in pull request #3658: HADOOP-17975 Fallback to simple auth does not work for a secondary DistributedFileSystem instance

Reply via email to