Hi Gagan,

This looks like a sync failure between the QD and the resource manager, and is not related to libyarn's RPC. Could you attach the master's log file?

Thanks!
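Since the query goes through on retry, a client-side retry may serve as a stopgap while the root cause is investigated. Below is a minimal sketch in Python, not a recommended fix. It assumes a hypothetical `run_query` callable supplied by your client code, and it assumes the error text quoted in this thread shows up in the exception message; neither is part of HAWQ itself.

```python
import time

# Assumption: the error string reported in this thread appears in the
# exception message raised by the (hypothetical) client call.
TRANSIENT_MARKER = "failed to return resource to resource manager"


def run_with_retry(run_query, sql, attempts=3, delay_s=1.0, sleep=time.sleep):
    """Call run_query(sql), retrying only on the transient RM error.

    run_query is a hypothetical callable provided by the caller; any other
    error, or the last failed attempt, is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return run_query(sql)
        except Exception as exc:
            transient = TRANSIENT_MARKER in str(exc)
            if not transient or attempt == attempts:
                raise
            # Linear backoff before the next attempt.
            sleep(delay_s * attempt)
```

Usage would look like `run_with_retry(my_client_call, "SELECT 1", attempts=5)`. Errors that do not match the marker are raised immediately, so genuine query failures are not masked.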
On Fri, May 13, 2016 at 12:58 AM, Gagan Brahmi <[email protected]> wrote:
> Hi Team,
>
> Do we have some recommended tuning for the RPC warning/errors
> encountered intermittently?
>
> The error which is seen is the following:
>
> WARNING: Sync RPC framework (inet) finds exception raised.
> ERROR: failed to return resource to resource manager, failed to
> receive content (pquery.c:991)
>
> This error however, disappears when we retry the query. There are
> cases when the query is to be retried more than once.
>
> The error looks to be invoked when COMM2RM_CLIENT_FAIL_RECV is encountered.
>
> The setup is using YARN resource manager. And the following is the
> yarn-client configuration used:
>
> <configuration>
>
>   <property>
>     <name>hadoop.security.authentication</name>
>     <value>kerberos</value>
>   </property>
>
>   <property>
>     <name>rpc.client.connect.retry</name>
>     <value>10</value>
>   </property>
>
>   <property>
>     <name>rpc.client.connect.tcpnodelay</name>
>     <value>true</value>
>   </property>
>
>   <property>
>     <name>rpc.client.connect.timeout</name>
>     <value>600000</value>
>   </property>
>
>   <property>
>     <name>rpc.client.max.idle</name>
>     <value>10000</value>
>   </property>
>
>   <property>
>     <name>rpc.client.ping.interval</name>
>     <value>10000</value>
>   </property>
>
>   <property>
>     <name>rpc.client.read.timeout</name>
>     <value>3600000</value>
>   </property>
>
>   <property>
>     <name>rpc.client.socket.linger.timeout</name>
>     <value>-1</value>
>   </property>
>
>   <property>
>     <name>rpc.client.timeout</name>
>     <value>3600000</value>
>   </property>
>
>   <property>
>     <name>rpc.client.write.timeout</name>
>     <value>3600000</value>
>   </property>
>
>   <property>
>     <name>yarn.client.failover.max.attempts</name>
>     <value>15</value>
>   </property>
>
> </configuration>
>
> I would appreciate some recommendations.
>
>
> Regards,
> Gagan Brahmi
>
