That is a very good point. I had only been considering the data locality features of rack configuration, not the fault tolerance aspect. I'll give that a try. Thank you.

-Matthew
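As a concrete reference for the rack idea Daniel describes below: in stock Hadoop, rack awareness comes from a script named by topology.script.file.name in the configuration. Hadoop invokes it with one or more hostnames or IPs as arguments and expects one rack path per argument on stdout. A minimal sketch, assuming shell and made-up hostnames for the transient EC2 nodes:

    #!/bin/sh
    # Hypothetical topology script: put the short-lived nodes in their own
    # rack so HDFS will not place every replica of a block on them.
    # Hadoop passes hostnames/IPs as arguments and expects one rack path
    # per argument on stdout.
    for host in "$@"; do
      case "$host" in
        temp-node-*) echo "/aws/rack-transient" ;;   # made-up naming scheme
        *)           echo "/aws/rack-permanent" ;;
      esac
    done

With a layout like this, the default block placement policy spreads replicas across racks, so at least one copy of each block should live off the transient rack and the temp nodes can go away without stranding the only copy.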
On Aug 23, 2010, at 8:30 PM, Daniel Einspanjer wrote:

> Matthew,
>
> Maybe instead of changing the replication factor, you could spin up new
> nodes with a different datacenter/rack configuration, which would cause
> hadoop to ensure the replicas are not solely on those temp nodes?
>
> Matthew LeMieux <[email protected]> wrote:
>
> J-D,
>
> Thank you for the very fast response. I'm running in the Amazon cloud,
> and would like to be able to expand and shrink the cluster quickly to
> support hadoop jobs, so I've been playing with the replication factor
> and the number of machines.
>
> If I add 5 more machines for a job I'm running overnight that needs
> some extra power, then removing those machines the next morning becomes
> a tedious task, because some of the generated data may reside
> exclusively on the 5 new machines. Since the HDFS node decommission
> process takes so long, I've been setting the replication factor high
> and removing machines incrementally. For the most part it works great,
> because replication is much faster than decommissioning. Hadoop M/R
> jobs and HDFS seem completely content to deal with nodes coming and
> going as long as there is at least one replica from which to replicate.
> Since the Hadoop M/R jobs pull data from HBase, do some processing, and
> then put the data back into HBase, I assumed that hosting the Hadoop
> M/R and HBase on the same HDFS would be faster than copying the data to
> a new transient cluster.
>
> Is there a way to tell HBase that I want the replication factor to be
> high, but not to panic (i.e., go crazy with logs) if it isn't actually
> at that number? Sort of like a soft limit and a hard limit...
>
> Thank you,
>
> Matthew
>
> P.S. I do need the data; I wish there were a way to speed this up. I
> didn't find any evidence of a crash, just a looooong recovery.
>
> On Aug 23, 2010, at 4:30 PM, Jean-Daniel Cryans wrote:
>
>> Matthew,
>>
>> Yes, the master won't start until the log splitting is done. By the
>> looks of it (such a high number of logs, when the max is 32 before the
>> region server force-flushes hlogs), it seems that your region servers
>> in a previous run weren't able to ensure that each HLog had 3
>> replicas, even though dfs.replication was set to that number. Check
>> your previous region server logs; you should see instances of:
>>
>>     LOG.warn("HDFS pipeline error detected. " +
>>         "Found " + numCurrentReplicas + " replicas but expecting " +
>>         this.initialReplication + " replicas. " +
>>         " Requesting close of hlog.");
>>
>> And that probably made one log per insert. If you're still testing and
>> your data isn't important, feel free to kill -9 the master and nuke
>> the HBase HDFS folder, or at least the .logs folder. If not, you'll
>> have to wait. Once it's done, make sure you have a proper
>> configuration before any other testing.
>>
>> J-D
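For the test-data case J-D mentions, the "nuke the .logs folder" step would look something like the sketch below. The /hbase root is HBase's default and matches the paths in the logs further down; the master pid is a placeholder. This throws away any edits still sitting in the write-ahead logs, so it is only for disposable data:

    # Test data only: discards unflushed edits in the WALs.
    kill -9 MASTER_PID            # placeholder; find the real pid with jps
    hadoop fs -rmr /hbase/.logs   # skip the split on the next master start
    # or wipe HBase's entire HDFS directory to start completely fresh:
    # hadoop fs -rmr /hbase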
>> On Mon, Aug 23, 2010 at 4:22 PM, Matthew LeMieux <[email protected]> wrote:
>>> I have a cluster of 3 machines where the NameNode is separate from
>>> the HMaster, based on the distribution from Cloudera (CDH3). I have
>>> been running it successfully for a couple of weeks. As of this
>>> morning, it is completely unusable. I'm looking for some help on how
>>> to fix it. Details are below. Thank you.
>>>
>>> This morning I found HBase to be unresponsive and tried restarting
>>> it. That didn't help. For example, running "hbase shell" and then
>>> "list" hangs.
>>>
>>> The master and region server processes start up, but the master does
>>> not recognize that the region servers are there. I am getting the
>>> following in the master's log file:
>>>
>>> 2010-08-23 23:04:16,100 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>> 2010-08-23 23:05:16,110 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>> 2010-08-23 23:06:16,120 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>> 2010-08-23 23:07:16,130 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>> 2010-08-23 23:08:16,140 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>> 2010-08-23 23:09:16,146 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>> 2010-08-23 23:10:16,150 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>> 2010-08-23 23:11:16,160 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>> 2010-08-23 23:12:16,170 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
>>>
>>> Meanwhile, the region servers show this in their log files:
>>>
>>> 2010-08-23 23:05:21,006 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server zookeeper:2181
>>> 2010-08-23 23:05:21,028 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to zookeeper:2181, initiating session
>>> 2010-08-23 23:05:21,168 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server zookeeper:2181, sessionid = 0x12aa0cc2520000e, negotiated timeout = 40000
>>> 2010-08-23 23:05:21,172 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state: SyncConnected, type: None, path: null
>>> 2010-08-23 23:05:21,177 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Set watcher on master address ZNode /hbase/master
>>> 2010-08-23 23:05:21,421 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got master:60000
>>> 2010-08-23 23:05:21,421 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at master:60000 that we are up
>>> 2010-08-23 23:05:22,056 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020
>>>
>>> The Region server process is obviously waiting on something:
>>>
>>> /tmp/hbaselog$ sudo strace -p7592
>>> Process 7592 attached - interrupt to quit
>>> futex(0x7f65534739e0, FUTEX_WAIT, 7602, NULL
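One aside on the strace output: FUTEX_WAIT here is just a parked JVM thread, so a Java-level thread dump usually says more than strace for a process like this. A quick sketch, using the pid from the trace above:

    # jstack ships with the JDK and prints every Java thread's stack
    sudo jstack 7592      # run as the JVM's owner if jstack complains
    # alternatively, kill -QUIT 7592 makes the JVM dump its threads into
    # the region server's .out file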
>>> The Master isn't idle; it appears to be trying to do some sort of
>>> recovery, having woken up to find 0 region servers. Here is an
>>> excerpt from its log:
>>>
>>> 2010-08-23 23:10:06,290 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12142 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435, length=1150
>>> 2010-08-23 23:10:06,290 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering filehdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
>>> 2010-08-23 23:10:06,510 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
>>> 2010-08-23 23:10:06,513 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Pushed=3 entries from hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
>>> 2010-08-23 23:10:06,513 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12143 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451, length=448
>>> 2010-08-23 23:10:06,513 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering filehdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
>>> 2010-08-23 23:10:06,721 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
>>> 2010-08-23 23:10:06,723 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Pushed=2 entries from hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
>>> 2010-08-23 23:10:06,723 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12144 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704468, length=582
>>> 2010-08-23 23:10:06,723 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering filehdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704468
>>>
>>> It looks like the Master is sequentially going through the logs up to
>>> 143261, having started at 1, and is currently at 12144. At the
>>> current rate, it will take around 12 hours to complete. Do I have to
>>> wait for it to complete before the master will recognize the region
>>> servers? If it doesn't have any region servers, then what the heck is
>>> the master doing anyway?
>>>
>>> Thank you for your help,
>>>
>>> Matthew
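For what it's worth, the excerpt itself gives a rate: consecutive hlogs complete at 23:10:06,290, 23:10:06,513, and 23:10:06,723, i.e. roughly 0.21 s per log. Assuming that rate holds for the rest of the run, a rough estimate of the remaining split time is:

    (143261 - 12144) logs x 0.21 s/log = ~27,500 s = ~7.6 hours

which is the same order of magnitude as Matthew's 12-hour figure.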
