ptlrs opened a new pull request, #9997: URL: https://github.com/apache/ozone/pull/9997
## What changes were proposed in this pull request?

`XceiverClientGrpc#connectToDatanode` intermittently fails with an NPE. The cause is a race condition, for a given datanode, between creating a channel and creating its stub. When a new channel is created for a DN, it is put into the `channels` map. However, the presence of a channel in the map does not imply that the corresponding stub for the same DN also exists in the `asyncStubs` map. If the stub is accessed after the channel has been created but before the stub has been created, we can get an NPE.

This PR fixes the problem by:
- maintaining a single `dnChannelInfoMap` for both the channels and the stubs instead of two independent maps
- creating a `ChannelInfo` class to group the channel and stub

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14793

## How was this patch tested?

CI: https://github.com/ptlrs/ozone/actions/runs/23703558972
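
For reviewers, here is a minimal, self-contained sketch of the idea behind the fix, not the actual patch. `ChannelInfo` and `dnChannelInfoMap` are the names used in the description above. `FakeChannel`, `FakeStub`, `DnConnections`, and `connect` are hypothetical stand-ins for the real gRPC channel, stub, and client types. The use of `ConcurrentHashMap#computeIfAbsent` is an assumption about how the pair might be published atomically, and the real change may do this differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a gRPC channel to one datanode. */
final class FakeChannel {
  final String target;

  FakeChannel(String target) {
    this.target = target;
  }
}

/** Hypothetical stand-in for the async stub built on top of a channel. */
final class FakeStub {
  final FakeChannel channel;

  FakeStub(FakeChannel channel) {
    this.channel = channel;
  }
}

/** Groups the channel and its stub so they are always published together. */
final class ChannelInfo {
  private final FakeChannel channel;
  private final FakeStub stub;

  ChannelInfo(FakeChannel channel, FakeStub stub) {
    this.channel = channel;
    this.stub = stub;
  }

  FakeChannel getChannel() {
    return channel;
  }

  FakeStub getStub() {
    return stub;
  }
}

/** Hypothetical holder that keeps one map entry per datanode. */
final class DnConnections {
  // One map for both pieces: an entry exists only once channel AND stub exist.
  private final Map<String, ChannelInfo> dnChannelInfoMap = new ConcurrentHashMap<>();

  /** Creates the channel/stub pair at most once per datanode id. */
  ChannelInfo connect(String dnId) {
    return dnChannelInfoMap.computeIfAbsent(dnId, id -> {
      FakeChannel channel = new FakeChannel(id);
      return new ChannelInfo(channel, new FakeStub(channel));
    });
  }

  /** Returns the pair for a datanode, or null if it was never connected. */
  ChannelInfo get(String dnId) {
    return dnChannelInfoMap.get(dnId);
  }
}
```

With a single map, a reader that finds an entry for a DN always sees a fully built `ChannelInfo`, so the window in which a channel is visible without its stub no longer exists in this sketch.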
