Hi Jamie,

Colocating the RS and the DN is good practice and very common. The "could not complete file" errors you're seeing shouldn't be happening - the question is why.
Do you see any swap usage on your node? Any other exceptions in the log before the "Could not complete file" errors? What version of Hadoop are you running?

Thanks
-Todd

On Wed, Jul 7, 2010 at 9:11 AM, Jamie Cockrill <[email protected]> wrote:

> Dear all,
>
> My current HBase/Hadoop architecture has HBase region servers on the
> same physical boxes as the HDFS data-nodes. I'm getting an awful lot
> of region server crashes. The last thing that happens appears to be a
> DroppedSnapshot Exception, caused by an IOException: could not
> complete write to file <file on HDFS>. I am running it under load, how
> heavy that is I'm not sure how that is quantified, but I'm guessing it
> is a load issue.
>
> Is it common practice to put region servers on data-nodes? Is it
> common to see region server crashes when either the HDFS or region
> server (or both) is under heavy load? I'm guessing that is the case as
> I've seen a few similar posts. I've not got a great deal of capacity
> to be separating region servers from HDFS data nodes, but it might be
> an argument I could make.
>
> Thanks
>
> Jamie
>

--
Todd Lipcon
Software Engineer, Cloudera
