[ https://issues.apache.org/jira/browse/HADOOP-18972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Charles Connell updated HADOOP-18972:
-------------------------------------
Description:
When {{SaslDataTransferServer}} or {{SaslDataTransferClient}} wants a SASL properties map to perform a handshake, it calls {{SaslPropertiesResolver#getServerProperties()}} or {{SaslPropertiesResolver#getClientProperties()}} and gets back a {{Map<String, String>}}. Every call returns the same {{Map}} object, and the callers sometimes call [put()|https://github.com/apache/hadoop/blob/rel/release-3.3.6/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/sasl/SaslDataTransferServer.java#L385] on it. This means that future users of {{SaslPropertiesResolver}} get back the wrong information.

I propose that {{SaslPropertiesResolver}} should return a copy of its internal map, so that callers can safely modify it.

I discovered this problem in my company's testing environment as we began to enable {{dfs.data.transfer.protection}} on our DataNodes, while our NameNodes were using {{IngressPortBasedResolver}} to give out block tokens with different QOPs depending on the port used. Our HDFS client applications then became unable to read from or write to HDFS because they could not find a QOP in common with the DataNodes during the SASL handshake. With multiple threads executing SASL handshakes at the same time, the properties map used by {{SaslDataTransferServer}} in a DataNode could be clobbered mid-use, since all threads shared the same map. Also, future clients that do not have a QOP embedded in their block tokens would connect to a server with the wrong SASL properties map. I think that one or both of these issues explains the problem that I saw. I eliminated this unsafety and saw the problem go away.
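The aliasing bug and the proposed fix can be illustrated with a minimal, self-contained sketch (the class and method names below are hypothetical stand-ins, not Hadoop's actual code): a getter that hands out the resolver's internal map lets one caller's put() leak into every later caller, while a getter that returns a fresh copy keeps each caller's mutation local.

```java
import java.util.HashMap;
import java.util.Map;

public class ResolverCopyDemo {
    // Hypothetical stand-in for a SaslPropertiesResolver-like class.
    static class SharedMapResolver {
        private final Map<String, String> props = new HashMap<>();

        SharedMapResolver() {
            props.put("javax.security.sasl.qop", "auth-conf");
        }

        // Buggy pattern: every caller receives the same Map object,
        // so a put() by one caller is visible to all future callers.
        Map<String, String> getServerPropertiesShared() {
            return props;
        }

        // Fixed pattern: each caller receives an independent copy,
        // which it can safely modify.
        Map<String, String> getServerPropertiesCopy() {
            return new HashMap<>(props);
        }
    }

    public static void main(String[] args) {
        SharedMapResolver resolver = new SharedMapResolver();

        // With the copying getter, a caller's mutation stays local...
        Map<String, String> copy = resolver.getServerPropertiesCopy();
        copy.put("javax.security.sasl.qop", "auth");
        // ...and the next caller still sees the original QOP.
        System.out.println(resolver.getServerPropertiesCopy()
                .get("javax.security.sasl.qop"));

        // With the shared getter, the mutation clobbers the internal map...
        resolver.getServerPropertiesShared().put("javax.security.sasl.qop", "auth");
        // ...so every future caller gets the wrong QOP.
        System.out.println(resolver.getServerPropertiesShared()
                .get("javax.security.sasl.qop"));
    }
}
```

The same shared map also explains the concurrency symptom: two threads handshaking at once can each call put() on the one map and race with each other, which a per-call copy avoids.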
> Bug in SaslPropertiesResolver allows mutation of internal state
> ---------------------------------------------------------------
>
>                 Key: HADOOP-18972
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18972
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Charles Connell
>            Priority: Minor
>              Labels: pull-request-available
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)