Hey, increase the open files limit setting; for example:

Regards,
Shashwat Shriparv
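On a typical Linux box the limit can be checked with ulimit and raised persistently in /etc/security/limits.conf (the 'hadoop' user name and the value 32768 below are illustrative, not taken from this thread):

    # check the current per-process open file limit for the current user
    ulimit -n

    # raise it persistently for the user running the datanode/regionserver,
    # e.g. in /etc/security/limits.conf:
    hadoop  soft  nofile  32768
    hadoop  hard  nofile  32768

The new limits take effect on the user's next login session, so the daemons need a restart from a fresh shell to pick them up.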
On Mon, Jun 25, 2012 at 10:21 PM, Stack <[email protected]> wrote:
> On Mon, Jun 25, 2012 at 9:00 AM, Frédéric Fondement
> <[email protected]> wrote:
> > 2012-06-25 10:25:30,646 ERROR
> > org.apache.hadoop.hdfs.server.datanode.DataNode:
> > DatanodeRegistration(10.120.0.5:50010,
> > storageID=DS-1339564791-127.0.0.1-50010-1296151113818, infoPort=50075,
> > ipcPort=50020):DataXceiver
> > java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> > channel to be ready for write. ch :
> > java.nio.channels.SocketChannel[connected local=/10.120.0.5:50010
> > ...
> >
> > You might have guessed that the local machine is 10.120.0.5.
> > Unsurprisingly, the process on port 50010 is the datanode. Port 42564
> > changes from one error instance to the next and seems to correspond to
> > the regionserver process. If I list the processes connected to port
> > 50010 with 'lsof -i :50010', I see an impressive number of sockets
> > (~400). Is that normal?
>
> Stuff is working? I don't think the exceptions above are an issue.
>
> HBase opens all its files in HDFS on startup.
>
> The above is a timeout on the file because there has been no activity
> in 8 minutes. That's what HDFS does server-side.
>
> When the dfs client later goes to read on this socket after it has been
> closed, the connection will be put back up without complaint.
>
> On the 400 files: HDFS keeps roughly one running thread per open file in
> the datanode (your lsof shows this?).
>
> > I need to add that the current load (requests, IOs, CPU, ...) is
> > rather slow.
>
> You mean 'low' or slow?
>
> St.Ack
>
> > I can't find any other error in the namenode or regionserver logs.
> >
> > All the best,
> >
> > Frédéric.

--
∞ Shashwat Shriparv
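For reference, the 480000 ms server-side timeout Stack mentions corresponds to the HDFS datanode property dfs.datanode.socket.write.timeout (480000 ms = 8 minutes, which is its default). If someone really needed to tune it, a sketch of the hdfs-site.xml override would look like this (the value shown is just the default, not a recommendation):

    <property>
      <name>dfs.datanode.socket.write.timeout</name>
      <value>480000</value>
    </property>

Since Stack's point is that the timeouts are harmless and the connection is re-established on the next read, leaving the default in place is the usual choice.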
