I wrote an email to the hbase-user mailing list one or two months ago that summarizes the pros and cons of using S3 instead of HDFS. The title is "On storing HBase data in AWS S3".

J-D

On Tue, Dec 15, 2009 at 9:58 AM, Andrew Purtell <[email protected]> wrote:
> Hi John,
>
> Thanks for writing us with the updates. I'm glad you noticed the errors about
> missing classes etc. because the pastebin only showed the master did not
> start so we would have needed to dig deeper in logs (time consuming).
>
> Please let me know about your experiences using S3 to back the HBase rootdir.
> We haven't looked into this because we were not aware of any users actively
> trying this out.
>
>    - Andy
>
>
> ________________________________
> From: John Roberts <[email protected]>
> To: [email protected]
> Sent: Tue, December 15, 2009 8:32:52 AM
> Subject: Re: HBase: using S3 for storage
>
> Never mind - I set my hbase.rootdir to
> s3://net.montrix.test.s3.amazonaws.com:80/ and it worked and I can see files
> being written to my net.montrix.test.s3.amazonaws.com bucket in S3.
>
> - John
>
>
> ________________________________
> From: John Roberts <[email protected]>
> To: [email protected]
> Sent: Tue, December 15, 2009 6:39:53 AM
> Subject: Re: HBase: using S3 for storage
>
> There were additional error messages in my master log file which indicated
> that I was missing some jar's. I downloaded jets3t-0.7.1.jar and
> commons-codec-1.4.jar and set the JETS3T_HOME variable in my hbase-env.sh
> file. This got me to the point where it is now trying to use S3. Now I get
> the errors below in my master log file. At this point the only question
> seems to be exactly what to set my hbase.rootdir property to. My S3 account
> has buckets "net.montrix.test" as well as
> "net.montrix.test.s3.amazonaws.com". I tried setting my hbase.rootdir value
> to this:
>
> s3://net.montrix.test.s3.amazonaws.com:80/tmp/hbase-jroberts/hbase
>
> The location of my hbase root dir on my local file system is
> /tmp/hbase-jroberts/hbase. That resulted in the error below. So either my
> hbase.rootdir value is wrong or perhaps the fs.default.name property in my
> hadoop-site.xml is wrong? I have it set to s3://hbase.
>
> John
>
> 2009-12-15 06:18:52,917 INFO org.apache.hadoop.hbase.master.HMaster: My address is localhost.localdomain:60000
> 2009-12-15 06:18:54,696 ERROR org.apache.hadoop.hbase.master.HMaster: Can not start master
> org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException:
> S3 GET failed for '/%2Ftmp%2Fhbase-jroberts%2Fhbase' XML Error Message:
> <?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><Message>The specified bucket does not exist</Message><BucketName>net.montrix.test.s3.amazonaws.com</BucketName><RequestId>E7E72017C69AB6DF</RequestId><HostId>LHSezOrfx3LrWI+IWQ1Icbz0/FRndFDsyQWIn3Oaru1ui6JXfq9Zfz1tgfUET7TG</HostId></Error>
>     at org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:156)
>
>
> ________________________________
> From: John Roberts <[email protected]>
> To: [email protected]
> Sent: Tue, December 15, 2009 1:33:45 AM
> Subject: Re: HBase: using S3 for storage
>
> The stack trace is here: http://pastebin.ca/1715521
>
> I set my hbase.rootdir value to the following:
>
> s3://net.montrix.test.s3.amazonaws.com:80/
>
> Note that the net.montrix.test bucket exists in my S3 account. Thanks for
> looking at this.
>
> John
>
>
> ________________________________
> From: Andrew Purtell <[email protected]>
> To: [email protected]
> Sent: Mon, December 14, 2009 11:27:41 PM
> Subject: Re: HBase: using S3 for storage
>
> Hi John,
>
> Can you pastebin that stack trace?
>
>    - Andy
>
>
> ________________________________
> From: John Roberts <[email protected]>
> To: [email protected]
> Sent: Mon, December 14, 2009 6:49:50 PM
> Subject: HBase: using S3 for storage
>
> I'm running HBase version 0.20.2 and am trying to get my HBase server
> to use S3 for storage instead of the local file system. I tried
> following the instructions here but could not get it to work:
>
> http://developer.amazonwebservices.com/connect/thread.jspa?messageID=139683
>
> My HBase version does not have a hadoop-site.xml file so I created one in the
> conf directory with the following parameters:
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>s3://hbase</value>
>   </property>
>
>   <property>
>     <name>fs.s3.awsAccessKeyId</name>
>     <value>id</value>
>   </property>
>
>   <property>
>     <name>fs.s3.awsSecretAccessKey</name>
>     <value>secret</value>
>   </property>
>
> </configuration>
>
> I also updated the hbase.rootdir property with the S3 url as per the reference
> above. When I ran the hbase shell and tried to put a value into a table I got
> a deep stack trace with no mention of S3.
>
> Has anyone gotten HBase to use S3? If so - could you send me the config
> changes you made to get it to work? Thanks!
>
> John
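
For reference, the change John reports as working in the quoted thread can be sketched as an hbase-site.xml fragment. This is only a minimal sketch under stated assumptions, not a verified setup: the bucket name example-hbase-bucket is a hypothetical placeholder, the credential properties (fs.s3.awsAccessKeyId, fs.s3.awsSecretAccessKey) are assumed to stay in conf/hadoop-site.xml as in John's original message, and the jets3t and commons-codec jars he mentions are assumed to be available to HBase.

```xml
<configuration>
  <!-- Hypothetical bucket name; substitute a bucket that already exists in your S3 account. -->
  <property>
    <name>hbase.rootdir</name>
    <value>s3://example-hbase-bucket/</value>
  </property>
</configuration>
```

John's working value spelled the bucket as net.montrix.test.s3.amazonaws.com:80, which matches a bucket of that exact name in his account; the thread does not establish whether the :80 suffix is required.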
