Hey everyone!

We have been seeing test failures in YARN PRs for a while. We have
identified the problematic commit, which is HADOOP-18082
<https://issues.apache.org/jira/browse/HADOOP-18082>. However, that change
only exposed a race condition in ProtobufRpcEngine2 that was introduced in
HADOOP-17046 <https://issues.apache.org/jira/browse/HADOOP-17046>. We have
also fixed the underlying issue with a locking mechanism, presented in
HADOOP-18143 <https://issues.apache.org/jira/browse/HADOOP-18143>, but
since this is outside our area of expertise, we can neither verify nor
guarantee that the fix will not cause subtle issues in the RPC system.
Because this is a core part of Hadoop, we would appreciate feedback from
someone who is proficient in this area.

Regards,
Andras
