Hi Team,
Is there any recommended tuning for the RPC warnings/errors that we
encounter intermittently?
The error we see is the following:
WARNING: Sync RPC framework (inet) finds exception raised.
ERROR: failed to return resource to resource manager, failed to receive content (pquery.c:991)
However, this error disappears when we retry the query. In some
cases the query has to be retried more than once.
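To make the retry behaviour concrete, here is a minimal Python sketch of a client-side retry with backoff. It is only an illustration: `run_query` is a hypothetical callable standing in for however the query is submitted, and detecting the transient failure by matching the error text is an assumption, not something the server guarantees.

```python
import time

# Message fragment taken from the error above; matching on text is an assumption.
TRANSIENT_MARKER = "failed to return resource to resource manager"


def run_with_retry(run_query, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call run_query() and retry only when the transient RPC error appears.

    run_query is a hypothetical zero-argument callable that executes the
    query and raises an exception whose text contains the server error.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except Exception as exc:
            transient = TRANSIENT_MARKER in str(exc)
            if not transient or attempt == max_attempts:
                raise  # unrelated error, or retries exhausted
            # Back off a little longer after each failed attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Any other error is re-raised immediately, so the retry does not mask real query failures.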
The error appears to be raised when COMM2RM_CLIENT_FAIL_RECV is encountered.
The setup uses YARN as the resource manager, with the following
yarn-client configuration:
<configuration>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>rpc.client.connect.retry</name>
    <value>10</value>
  </property>
  <property>
    <name>rpc.client.connect.tcpnodelay</name>
    <value>true</value>
  </property>
  <property>
    <name>rpc.client.connect.timeout</name>
    <value>600000</value>
  </property>
  <property>
    <name>rpc.client.max.idle</name>
    <value>10000</value>
  </property>
  <property>
    <name>rpc.client.ping.interval</name>
    <value>10000</value>
  </property>
  <property>
    <name>rpc.client.read.timeout</name>
    <value>3600000</value>
  </property>
  <property>
    <name>rpc.client.socket.linger.timeout</name>
    <value>-1</value>
  </property>
  <property>
    <name>rpc.client.timeout</name>
    <value>3600000</value>
  </property>
  <property>
    <name>rpc.client.write.timeout</name>
    <value>3600000</value>
  </property>
  <property>
    <name>yarn.client.failover.max.attempts</name>
    <value>15</value>
  </property>
</configuration>
I would appreciate some recommendations.
Regards,
Gagan Brahmi