See my reply under the 'importing HFiles' subject.

On Wed, Aug 25, 2010 at 8:05 AM, Matthew LeMieux <[email protected]> wrote:
> For those who are curious, using rack awareness to speed up the process of adding and removing nodes did not work in my experiment.
>
> I set up a primary rack and a transient rack. The primary rack consists of permanent machines that always stay in the cluster, while the transient rack is meant to come and go as more processing power is needed.
>
> Once the extra rack was no longer needed, HDFS was using up time and bandwidth to duplicate the data on the primary rack over to the transient rack, rather than replicating the data on the transient rack over to the primary rack. With the alternative method of putting all machines in one rack, the data on the permanent machines does not need to be copied over to the transient machines, making the process of replication much faster.
>
> At this point, I'm considering running two separate clusters: a permanent HBase cluster and a transient M/R cluster. I could use distcp to quickly copy the data between clusters. I've seen examples of bulk loading a table using the HFileOutputFormat. But once I've copied the files over to the M/R cluster, is there a way to read in the files, i.e., is there an HFileInputFormat equivalent?
>
> FYI,
>
> Matthew
>
> On Aug 23, 2010, at 8:30 PM, Daniel Einspanjer wrote:
>
> > Matthew,
> >
> > Maybe instead of changing the replication factor, you could spin up new nodes with a different datacenter/rack configuration, which would cause Hadoop to ensure the replicas are not solely on those temp nodes?
> >
> > Matthew LeMieux <[email protected]> wrote:
> >
> > J-D,
> >
> > Thank you for the very fast response. I'm running in the Amazon cloud, and would like to be able to expand and shrink the cluster quickly to support Hadoop jobs, so I've been playing with the replication factor and the number of machines.
> >
> > If I add 5 more machines for a job I'm running overnight that needs some extra power, then removing those machines the next morning becomes a tedious task, because some of the data generated resides perhaps exclusively on the 5 new machines. Since the HDFS node decommission process takes so long, I've been setting the replication factor high and removing machines incrementally. For the most part, it works great because the replication speed is much faster than the decommission speed. Hadoop M/R jobs and HDFS seem completely content to deal with nodes coming and going as long as there is at least 1 replica from which to replicate. Since the Hadoop M/R jobs pull data from HBase, do some processing, and then put the data back into HBase, I assumed that hosting the Hadoop M/R and HBase on the same HDFS would be faster than copying the data to a new transient cluster.
> >
> > Is there a way to tell HBase that I want the replication factor to be high, but not to panic (i.e., go crazy with logs) if it isn't actually at that number? Sort of like a soft limit and a hard limit...
> >
> > Thank you,
> >
> > Matthew
> >
> > P.S. I do need the data; I wish there was a way to speed this up. I didn't find any evidence of a crash, just a looooong recovery.
> >
> > On Aug 23, 2010, at 4:30 PM, Jean-Daniel Cryans wrote:
> >
> >> Matthew,
> >>
> >> Yes, the master won't start until the log splitting is done.
> >> By the looks of it (such a high number of logs, when the max is 32 before the region server force-flushes hlogs), it seems that your region servers in a previous run weren't able to ensure that each HLog had 3 replicas even though dfs.replication was set to that number. Check your previous region server logs; you should see instances of:
> >>
> >>   LOG.warn("HDFS pipeline error detected. " +
> >>     "Found " + numCurrentReplicas + " replicas but expecting " +
> >>     this.initialReplication + " replicas. " +
> >>     " Requesting close of hlog.");
> >>
> >> And that probably made 1 log per insert. If you're still testing and your data isn't important, feel free to kill -9 the master and nuke the HBase HDFS folder, or at least the .logs folder. If not, you'll have to wait. Once it's done, make sure you have a proper configuration before any other testing.
> >>
> >> J-D
> >>
> >> On Mon, Aug 23, 2010 at 4:22 PM, Matthew LeMieux <[email protected]> wrote:
> >>> I have a cluster of 3 machines where the NameNode is separate from the HMaster, based on the distribution from Cloudera (CDH3). I have been running it successfully for a couple of weeks. As of this morning, it is completely unusable. I'm looking for some help on how to fix it. Details are below. Thank you.
> >>>
> >>> This morning I found HBase to be unresponsive and tried restarting it. That didn't help. For example, running "hbase shell" and then "list" hangs.
> >>>
> >>> The master and region server processes start up, but the master does not recognize that the region servers are there. I am getting the following in the master's log file:
> >>>
> >>> 2010-08-23 23:04:16,100 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>> 2010-08-23 23:05:16,110 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>> 2010-08-23 23:06:16,120 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>> 2010-08-23 23:07:16,130 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>> 2010-08-23 23:08:16,140 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>> 2010-08-23 23:09:16,146 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>> 2010-08-23 23:10:16,150 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>> 2010-08-23 23:11:16,160 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>> 2010-08-23 23:12:16,170 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
> >>>
> >>> Meanwhile, the region servers show this in their log files:
> >>>
> >>> 2010-08-23 23:05:21,006 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server zookeeper:2181
> >>> 2010-08-23 23:05:21,028 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to zookeeper:2181, initiating session
> >>> 2010-08-23 23:05:21,168 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server zookeeper:2181, sessionid = 0x12aa0cc2520000e, negotiated timeout = 40000
> >>> 2010-08-23 23:05:21,172 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state: SyncConnected, type: None, path: null
> >>> 2010-08-23 23:05:21,177 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Set watcher on master address ZNode /hbase/master
> >>> 2010-08-23 23:05:21,421 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Read ZNode /hbase/master got master:60000
> >>> 2010-08-23 23:05:21,421 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Telling master at master:60000 that we are up
> >>> 2010-08-23 23:05:22,056 INFO org.apache.hadoop.hbase.regionserver.ShutdownHook: Installed shutdown hook thread: Shutdownhook:regionserver60020
> >>>
> >>> The region server process is obviously waiting on something:
> >>>
> >>> /tmp/hbaselog$ sudo strace -p7592
> >>> Process 7592 attached - interrupt to quit
> >>> futex(0x7f65534739e0, FUTEX_WAIT, 7602, NULL
> >>>
> >>> The master isn't idle; it appears to be trying to do some sort of recovery, having woken up to find 0 region servers. Here is an excerpt from its log:
> >>>
> >>> 2010-08-23 23:10:06,290 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12142 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435, length=1150
> >>> 2010-08-23 23:10:06,290 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering filehdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
> >>> 2010-08-23 23:10:06,510 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
> >>> 2010-08-23 23:10:06,513 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Pushed=3 entries from hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704435
> >>> 2010-08-23 23:10:06,513 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12143 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451, length=448
> >>> 2010-08-23 23:10:06,513 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering filehdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
> >>> 2010-08-23 23:10:06,721 INFO org.apache.hadoop.hbase.util.FSUtils: Finished lease recover attempt for hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
> >>> 2010-08-23 23:10:06,723 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Pushed=2 entries from hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704451
> >>> 2010-08-23 23:10:06,723 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Splitting hlog 12144 of 143261: hdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704468, length=582
> >>> 2010-08-23 23:10:06,723 INFO org.apache.hadoop.hbase.util.FSUtils: Recovering filehdfs://namenode:9000/hbase/.logs/master,60020,1282577331142/master%3A60020.1282581704468
> >>>
> >>> It looks like the master is sequentially going through logs up to 143261, having started at 1, and is currently at 12144. At the current rate, it will take around 12 hours to complete. Do I have to wait for it to complete before the master will recognize the region servers? If it doesn't have any region servers, then what the heck is the master doing anyway?
> >>>
> >>> Thank you for your help,
> >>>
> >>> Matthew
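For reference, a minimal sketch of the rack mapping Daniel suggests above, assuming Hadoop's script-based topology resolution (the topology.script.file.name property in core-site.xml points at the script); the rack names and IP ranges are made up for illustration:

  #!/bin/bash
  # Hypothetical topology script: print one rack name per datanode
  # address passed on the command line, so HDFS spreads replicas across
  # the permanent and transient racks. IPs and rack names are invented.
  for host in "$@"; do
    case "$host" in
      10.0.1.*) echo "/primary-rack"   ;;   # permanent HBase/HDFS nodes
      10.0.2.*) echo "/transient-rack" ;;   # nodes added for M/R bursts
      *)        echo "/default-rack"   ;;
    esac
  done

Whether this actually helps when the transient rack is later removed is exactly what Matthew's experiment above calls into question.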
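The distcp half of Matthew's two-cluster idea is straightforward; here is a sketch with placeholder namenode hostnames and paths (the HFileInputFormat question is separate and not answered here):

  # Copy a table's files from the permanent HBase cluster to the
  # transient M/R cluster, and copy job output back afterwards.
  # Hostnames and paths below are placeholders.
  hadoop distcp hdfs://hbase-nn:9000/hbase/mytable \
                hdfs://mr-nn:9000/staging/mytable
  hadoop distcp hdfs://mr-nn:9000/staging/job-output \
                hdfs://hbase-nn:9000/staging/job-output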
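And a sketch of the "nuke the .logs folder" recovery J-D describes, only applicable when the data is disposable; the log path and the master PID are illustrative:

  # 1. Check whether the region servers were hitting the pipeline warning
  #    J-D quotes (log location depends on the install; this assumes the
  #    CDH3 packaging).
  grep "HDFS pipeline error detected" /var/log/hbase/*regionserver*.log*

  # 2. If the data really is disposable: stop the master and remove the
  #    queued write-ahead logs so it skips the hours-long split.
  #    THIS THROWS AWAY UNFLUSHED EDITS.
  kill -9 <master-pid>
  hadoop fs -rmr /hbase/.logs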
