A few quick thoughts:

- We run with DDR IB as our primary interconnect. We use local disks, however. Things work well.
- If you're going to use both Ethernet and IPoIB for access: in the past there were issues when using different network adapters in HBase. For us, hostnames map to the IB IP addresses. Every Ethernet-only machine that accesses these machines then has a /32 static route to the InfiniBand IP via the paired Ethernet adapter for each node.
- If you switch to a local filesystem that does sync, you'll also need to add FSUtils support for that filesystem (see https://issues.apache.org/jira/browse/HBASE-4169 for an example).
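The static-route setup in the second point can be sketched roughly as below. The IPoIB addresses and the interface name are hypothetical examples, not taken from this thread; the script only prints the commands, which you would run as root on each Ethernet-only machine.

```shell
# Hypothetical IPoIB addresses of the HBase nodes (assumptions for
# illustration); eth0 is the paired Ethernet adapter on this machine.
IB_IPS="192.168.10.11 192.168.10.12"

for ip in $IB_IPS; do
  # Print the /32 static-route command for each node; pipe the output
  # to 'sh' as root to actually install the routes.
  echo ip route add "${ip}/32" dev eth0
done
```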
Good luck!
Jacques

On Mon, Dec 5, 2011 at 2:04 PM, Taylor, Ronald C <[email protected]> wrote:

> Hello Lars,
>
> Thanks for your previous help. Got a new question for you. I now have the opportunity to try using Hadoop and HBase on a newly installed cluster here, at a nominal cost. A lot of compute power (480+ nodes, 16 cores per node going up to 32 by the end of FY12, 64 GB RAM per node, with a few fat nodes with 256 GB). One local drive of 1 TB per node, and a four-petabyte Lustre file system. Hadoop jobs are already running on this new cluster, on terabyte-size data sets.
>
> Here's the drawback: I cannot permanently store HBase tables on local disk. After a job finishes, the disks are reclaimed. So, if I want to build a continuously available data warehouse (basically for analytics runs, not for real-time web access by a large community at present; just me and other internal bioinformatics folk here at PNNL), I need to put the HBase tables on the Lustre file system.
>
> Now, all the nodes in this cluster have a very fast InfiniBand QDR network interconnect. I think it's something like 40 gigabits/sec, as compared to the 1 gigabit/sec that you might see in a run-of-the-mill Hadoop cluster. And I just read a couple of white papers that say that if the network interconnect is good enough, the loss of data locality when you use Lustre with Hadoop is not such a bad thing. That is, I Googled and found several papers on HDFS vs. Lustre. The latest one I found (2011) is a white paper from a company called Xyratex. Here's a quote from it:
>
> The use of clustered file systems as a backend for Hadoop storage has been studied previously. The performance of distributed file systems such as Lustre, Ceph, PVFS, and GPFS with Hadoop has been compared to that of HDFS.
> Most of these investigations have shown that non-HDFS file systems perform more poorly than HDFS, although with various optimizations and tuning efforts, a clustered file system can reach parity with HDFS. However, a consistent limitation in the studies of HDFS and non-HDFS performance with Hadoop is that they used the network infrastructure to which Hadoop is limited, TCP/IP, typically over 1 GigE. In HPC environments, where much faster network interconnects are available, significantly better clustered file system performance with Hadoop is possible.
>
> Anyway, I am not principally worried about speed or efficiency right now; this cluster is big enough that even if I do not use it most efficiently, I'll still be doing better than with my very small current cluster, which has very limited RAM and antique processors.
>
> My question is: will HBase work at all on Lustre? That is, on pp. 52-54 of your O'Reilly HBase book, you say that
>
> "... you are not locked into HDFS because the "FileSystem" used by HBase has a pluggable architecture and can be used to replace HDFS with any other supported system. The possibilities are endless and waiting for the brave at heart." ... "You can select a different filesystem implementation by using a URI pattern, where the scheme (the part before the first ":", i.e., the colon) part of the URI identifies the driver to be used."
>
> We use HDFS by setting the URI to
>
>     hdfs://<namenode>:port/<path>
>
> And you say that to simply use the local file system on a desktop Linux box (which would not replicate data or maintain copies of the files, i.e., no fault tolerance) one uses
>
>     file:///<path>
>
> So, can I simply change this one param and point HBase to a location in the Lustre file system? That is, use
>
>     <property>
>       <name>hbase.rootdir</name>
>       <value>file:///pic/scratch/rtaylor/hbase</value>
>     </property>
>
> where "/pic" points to the root of the Lustre system. Or use something similar?
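[Editor's note: a minimal hbase-site.xml sketch along the lines of the question above. The Lustre path is the one from the message; the hbase.cluster.distributed property is an assumption about what a multi-node deployment would additionally need, not something confirmed in this thread.]

```xml
<!-- Sketch only: points hbase.rootdir at a Lustre-backed path via the
     local-filesystem driver (file://), as described in the message. -->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file:///pic/scratch/rtaylor/hbase</value>
  </property>
  <!-- Assumption: a fully distributed (multi-node) HBase deployment
       would also need this set to true; it is not part of the
       original question. -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
```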
> I am told that all of the Lustre OSTs are backed by RAID6, so my HBase tables would be fairly safe from hardware failure. If you put a file into the Lustre file system, chances are very slim that you are going to lose it to a hardware failure. Also, I can make copies periodically to our gigantic file storage cluster in a separate building. This does not need to be a production HBase system (at least, for now). This is more of a data warehouse / analytics / data integration environment for several bioinformatics scientists, a system that we can afford to have go down from time to time, in a research environment.
>
> Note that when I use Hadoop by itself, or Hadoop with HBase tables as sources and sinks, only the HBase accesses would be from the Lustre file system. The Hadoop program would still be able to use HDFS on local disks on the subset of nodes allotted to it on the cluster, as the Hadoop programs now running on this new cluster are doing. My problem is just that I don't want to have to rebuild the HBase tables every time I want to do something, since the local disk space is reclaimed for other possible users after a job finishes. But I can get permanent (well, yearly renewal) disk space on the Lustre system.
>
> So, any advice before I give this a try? Will changing this one HBase config parameter suffice to get me started? Or are there other things involved?
>
> - Ron
>
> Ronald Taylor, Ph.D.
> Computational Biology & Bioinformatics Group
> Pacific Northwest National Laboratory (U.S. Dept of Energy/Battelle)
> Richland, WA 99352
> phone: (509) 372-6568
> email: [email protected]
>
> -----Original Message-----
> From: Lars George [mailto:[email protected]]
> Sent: Wednesday, November 30, 2011 6:34 AM
> To: [email protected]
> Subject: Re: getting HBase up after an unexpected power failure - need some advice
>
> Hey,
>
> Looks like you have a corrupted ZK. Try and stop ZK (after stopping HBase, of course) and restart it.
> If that also fails, then wipe the data dir ZK uses (check the config, for example the zoo.cfg for standalone ZK nodes). ZK is going to recreate the data files and should be able to move forward.
>
> Cheers,
> Lars
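[Editor's note: the recovery sequence Lars describes can be sketched as below. The dataDir path is an assumption for illustration (read the real value from the dataDir= line of your zoo.cfg), and the actual stop/start commands are shown as comments since their exact paths depend on your deployment.]

```shell
# Hypothetical ZooKeeper dataDir (an assumption; read the real value
# from the "dataDir=" line of zoo.cfg).
ZK_DATADIR="${ZK_DATADIR:-/tmp/zk-demo-data}"
mkdir -p "$ZK_DATADIR/version-2"        # stand-in for real ZK state here

# 1. Stop HBase first, then ZooKeeper:
#      bin/stop-hbase.sh
#      zkServer.sh stop
# 2. Wipe the data dir (version-2 holds ZK's snapshots and txn logs):
rm -rf "$ZK_DATADIR/version-2"
# 3. Restart ZK, then HBase; ZK recreates its data files on startup:
#      zkServer.sh start
#      bin/start-hbase.sh

[ -d "$ZK_DATADIR/version-2" ] || echo "ZK data dir wiped: $ZK_DATADIR"
```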
