On Mon, Jun 25, 2012 at 9:00 AM, Frédéric Fondement <[email protected]> wrote:
> 2012-06-25 10:25:30,646 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> DatanodeRegistration(10.120.0.5:50010,
> storageID=DS-1339564791-127.0.0.1-50010-1296151113818, infoPort=50075,
> ipcPort=50020):DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/10.120.0.5:50010
> ...
> You might have guessed that the local machine is 10.120.0.5. Unsurprisingly,
> the process on port 50010 is the datanode. Port 42564 changes with each
> error instance, and seems to correspond to the regionserver process. If
> I ask for processes connected to port 50010 using 'lsof -i :50010', I
> get an impressive number of sockets (~400). Is that normal?

Is stuff otherwise working? I don't think the exceptions above are an issue. HBase opens all its files in HDFS on startup. The above is a timeout on a file because there has been no activity on it in 8 minutes; that's what HDFS does server-side. When the DFS client later goes to read on this closed socket, the connection will be put back up without complaint.

On the 400 files: HDFS keeps a running thread or so per open file in the datanode (your lsof shows this?).

> I need to add that current load (requests, IOs, CPU, ...) is rather slow.

You mean 'low' or slow?

St.Ack

> I can't find any other error in the namenode or regionserver logs.
>
> All the best,
>
> Frédéric.
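[Editor's note: a minimal sketch of the kind of socket counting described above. The sample lines are synthetic, made-up lsof output for illustration; on a real datanode host you would pipe `lsof -i :50010` directly instead of the `$sample` variable.]

```shell
# Hypothetical lsof-style output for connections on the datanode port
# (synthetic data; real output comes from: lsof -i :50010)
sample='java 1234 hdfs 55u IPv4 TCP 10.120.0.5:50010->10.120.0.5:42564 (ESTABLISHED)
java 1234 hdfs 56u IPv4 TCP 10.120.0.5:50010->10.120.0.5:42565 (ESTABLISHED)
java 1234 hdfs 57u IPv4 TCP 10.120.0.5:50010->10.120.0.5:42566 (CLOSE_WAIT)'

# Count only established connections, as one might on a live node with:
#   lsof -i :50010 | grep -c ESTABLISHED
printf '%s\n' "$sample" | grep -c ESTABLISHED
```

The 480000 ms figure in the exception is the datanode's write timeout (the `dfs.datanode.socket.write.timeout` property, 8 minutes by default), which matches the server-side idle-timeout behavior described in the reply.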
