My experience debugging these kinds of issues is that 95% of the time it's a configuration issue, 4.99% of the time it's an environment or network issue (network splits, lost packets, etc.), and the remaining 0.01% is an actual HDFS issue.
The fact that you had issues even with no load makes me think it's a configuration issue. Can we see your HDFS config?

BTW, the HBase log was pointing at 10.1.104.1 as the one having an issue. Is that the log we are looking at? (It doesn't seem so.)

Thx,

J-D

On Sun, Apr 10, 2011 at 12:05 PM, Eran Kutner <[email protected]> wrote:
> This is how it's configured in /etc/security/limits.con on all the
> slaves in the cluster:
> hadoop - nofile 32768
> hdfs - nofile 32768
> hbase - nofile 32768
> hadoop - nproc 32000
> hdfs - nproc 32000
> hbase - nproc 32000
>
> When hbase is loading it prints:
> ulimit -n 32768
>
>
> -eran
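
One way to confirm that the limits quoted above are really in effect for a running daemon (and not just for a login shell) is to read that process's `/proc/<pid>/limits` on Linux. The following is a minimal sketch, not something from this thread: the helper names `parse_limits` and `limits_for_pid` are made up for illustration, the PID is whatever you pass on the command line, and the fixed column widths (25/20/20) are an assumption about how the kernel lays out that file.

```python
import sys
from pathlib import Path


def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard)}.

    Assumes the fixed-width layout: a 25-character name column, then
    20-character soft and hard columns, each followed by one space.
    """
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        name = line[:25].strip()
        soft = line[26:46].strip()
        hard = line[47:67].strip()
        if name:
            limits[name] = (soft, hard)
    return limits


def limits_for_pid(pid):
    """Read and parse the limits of a running process (Linux only)."""
    return parse_limits(Path(f"/proc/{pid}/limits").read_text())


if __name__ == "__main__":
    # Pass the PID of the process to inspect, e.g. a DataNode or
    # RegionServer; defaults to this script's own process.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    limits = limits_for_pid(pid)
    for name in ("Max open files", "Max processes"):
        soft, hard = limits.get(name, ("?", "?"))
        print(f"{name}: soft={soft} hard={hard}")
```

Reading another user's process may require running the script as that user or as root. If the values differ from what limits.conf says, the daemon was probably started from a session that did not pick up those settings.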
