We have a specific environment where we need to harmonize socket connection
timeouts across all Hadoop daemons, and for some downstream components as
well. While reviewing the socket connection timeouts set via NetUtils and
UrlConnection (HttpURLConnection), I compiled the following list of
configurations:

   - ipc.client.connect.timeout
   - dfs.client.socket-timeout
   - dfs.datanode.socket.write.timeout
   - dfs.client.fsck.connect.timeout
   - dfs.client.fsck.read.timeout
   - dfs.federation.router.connect.timeout
   - dfs.qjournal.http.open.timeout.ms
   - dfs.qjournal.http.read.timeout.ms
   - dfs.checksum.ec.socket-timeout
   - hadoop.security.kms.client.timeout
   - mapreduce.reduce.shuffle.connect.timeout
   - mapreduce.reduce.shuffle.read.timeout
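
For illustration, harmonizing a few of these would look something like the
fragment below in core-site.xml / hdfs-site.xml (the values here are
arbitrary examples, not recommendations; all of these keys take
milliseconds):

```xml
<!-- Example only: illustrative values for harmonizing timeouts -->
<property>
  <name>ipc.client.connect.timeout</name>
  <value>20000</value>
</property>
<property>
  <name>dfs.client.socket-timeout</name>
  <value>60000</value>
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>60000</value>
</property>
```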
Moreover, although “dfs.datanode.socket.reuse.keepalive” is not a socket
timeout config per se, its value is applied as SocketOptions#SO_TIMEOUT
when opsProcessed != 0 (so a blocking read on the InputStream waits only
for this interval, after which it results in a SocketTimeoutException).
Similarly, “ipc.ping.interval” and “ipc.client.rpc-timeout.ms” are also
used to set SocketOptions#SO_TIMEOUT on the socket.
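
To make the SO_TIMEOUT behavior concrete, here is a small standalone sketch
(not Hadoop code; plain java.net) showing that once setSoTimeout is applied,
a blocking read gives up with SocketTimeoutException after the configured
interval:

```java
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {
    public static void main(String[] args) throws Exception {
        // Server that accepts a connection but never writes any data.
        try (ServerSocket server = new ServerSocket(0)) {
            Thread acceptor = new Thread(() -> {
                try {
                    Socket peer = server.accept();
                    Thread.sleep(5000); // hold the connection open, send nothing
                    peer.close();
                } catch (Exception ignored) {
                }
            });
            acceptor.setDaemon(true);
            acceptor.start();

            try (Socket client = new Socket()) {
                client.connect(
                        new InetSocketAddress("127.0.0.1", server.getLocalPort()), 1000);
                // Equivalent to SocketOptions#SO_TIMEOUT: a blocking read now
                // waits at most 500 ms before throwing SocketTimeoutException.
                client.setSoTimeout(500);
                InputStream in = client.getInputStream();
                try {
                    in.read();
                    System.out.println("read returned (unexpected)");
                } catch (SocketTimeoutException e) {
                    System.out.println("SocketTimeoutException after SO_TIMEOUT");
                }
            }
        }
    }
}
```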

I may have missed some socket timeout configs in the above list. Any
feedback on the list, or pointers to configs I have missed, would be
greatly appreciated.
