Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "SocketTimeout" page has been changed by SteveLoughran:
https://wiki.apache.org/hadoop/SocketTimeout?action=diff&rev1=5&rev2=6

Comment:
add that long haul networks can trigger transient outages

   * The remote machine crashing. This cannot easily be distinguished from a 
network partitioning.
   * A change in the firewall settings of one of the machines preventing 
communication.
   * The settings are wrong and the client is trying to talk to the wrong 
machine, one that is not on the network. That could be an error in Hadoop 
configuration files, or an entry in the DNS tables or the /etc/hosts file.
+  * If it's over a long-haul network (i.e. out of cluster), it may be a 
transient failure due to the network playing up.
-  * If using a client of an object store such as the Amazon S3 and OpenStack 
Swift clients, socket timeouts may be caused by remote-throttling of client 
requests: your program is making too many PUT/DELETE requests and is being 
deliberately blocked by the far end. This is most likely to happen when 
creating many small files, or performing bulk deletes (e.g. deleting a 
directory with many child entries). 
+  * If using a client of an object store such as the Amazon S3 and OpenStack 
clients, socket timeouts may be caused by remote-throttling of client requests: 
your program is making too many PUT/DELETE requests and is being deliberately 
blocked by the far end. This is most likely to happen when creating many small 
files, or performing bulk deletes (e.g. deleting a directory with many child 
entries). It can also arise from a transient failure of the long-haul link.
  
  Comparing this exception to the ConnectionRefused error, the latter indicates 
there is a server at the far end, but no program running on it can receive 
inbound connections on the chosen port. A Socket Timeout usually means that 
there is something there, but either it or the network is not working properly.
  
@@ -26, +27 @@

   1. Can you telnet to the target host and port?
   1. Can you telnet to the target host and port from any other machine?
   1. On the target machine, can you telnet to the port using localhost as the 
hostname? If this works but external network connections time out, it's usually 
a firewall issue.
-  1. If it is a remote object store: is the address correct? Does it only 
happen on bulk operations? If the latter, it's probably due to throttling at 
the far end.
+  1. If it is a remote object store: is the address correct? Does it go away 
when you repeat the operation? Does it only happen on bulk operations? If the 
latter, it's probably due to throttling at the far end.
  
  Remember: These are [[YourNetworkYourProblem|your network configuration 
problems]]. Only you can fix them.
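
The telnet checks listed above can also be scripted. What follows is a minimal sketch, not part of the wiki page: the function name `probe` and the default five-second timeout are assumptions made for illustration. It attempts a plain TCP connection and reports whether the attempt connected, was refused, or timed out, which is the same distinction the page draws between ConnectionRefused and a socket timeout.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Try one TCP connection to host:port and classify the outcome.

    Hypothetical helper for illustration; it only tests reachability,
    not whether the service behind the port is healthy.
    """
    try:
        # create_connection resolves the host name and connects,
        # giving up after `timeout` seconds.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        # A machine answered, but nothing is listening on that port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer in time: host down, firewall drop, or network trouble.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. a DNS lookup failure or no route to host.
        return f"error: {exc}"
```

Running it from the client, from a second machine, and on the target host itself with `localhost` as the host name mirrors the three telnet steps. If only the `localhost` run reports "connected", a firewall is the likely suspect.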
  
