About one or two months ago I wrote an email to the hbase-user mailing list
that summarizes the pros and cons of using S3 instead of HDFS. The title is
"On storing HBase data in AWS S3".

J-D

On Tue, Dec 15, 2009 at 9:58 AM, Andrew Purtell <[email protected]> wrote:
> Hi John,
>
> Thanks for writing us with the updates. I'm glad you noticed the errors about
> missing classes etc., because the pastebin only showed that the master did not
> start, so we would have needed to dig deeper into the logs (time consuming).
>
> Please let me know about your experiences using S3 to back the HBase rootdir. 
> We haven't looked into this because we were not aware of any users actively 
> trying this out.
>
>    - Andy
>
>
>
>
> ________________________________
> From: John Roberts <[email protected]>
> To: [email protected]
> Sent: Tue, December 15, 2009 8:32:52 AM
> Subject: Re: HBase: using S3 for storage
>
> Never mind - I set my hbase.rootdir to 
> s3://net.montrix.test.s3.amazonaws.com:80/ and it worked and I can see files 
> being written to my net.montrix.test.s3.amazonaws.com bucket in S3.
>
> - John
>
>
>
>
> ________________________________
> From: John Roberts <[email protected]>
> To: [email protected]
> Sent: Tue, December 15, 2009 6:39:53 AM
> Subject: Re: HBase: using S3 for storage
>
>
> There were additional error messages in my master log file which indicated
> that I was missing some jars.  I downloaded jets3t-0.7.1.jar and
> commons-codec-1.4.jar and set the JETS3T_HOME variable in my hbase-env.sh 
> file.   This got me to the point where it is now trying to use S3.  Now I get 
> the errors below in my master log file.  At this point the only question 
> seems to be exactly what to set my hbase.rootdir property to.  My S3 account 
> has buckets "net.montrix.test" as well as 
> "net.montrix.test.s3.amazonaws.com".   I tried setting my hbase.rootdir value 
> to this:
>
> s3://net.montrix.test.s3.amazonaws.com:80/tmp/hbase-jroberts/hbase
>
> The location of my hbase root dir on my local file system is 
> /tmp/hbase-jroberts/hbase.  That resulted in the error below.   So either my 
> hbase.rootdir value is wrong or perhaps the fs.default.name property in my 
> hadoop-site.xml is wrong?  I have it set to s3://hbase.
>
> John
>
> 2009-12-15 06:18:52,917 INFO org.apache.hadoop.hbase.master.HMaster: My 
> address is localhost.localdomain:60000
> 2009-12-15 06:18:54,696 ERROR org.apache.hadoop.hbase.master.HMaster: Can not 
> start master
> org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: 
> S3 GET failed for '/%2Ftmp%2Fhbase-jroberts%2Fhbase' XML Error Message: <?xml 
> version="1.0" encoding="UTF-8"?><Error><Code>NoSuchBucket</Code><Message>The 
> specified bucket does not 
> exist</Message><BucketName>net.montrix.test.s3.amazonaws.com</BucketName><RequestId>E7E72017C69AB6DF</RequestId><HostId>LHSezOrfx3LrWI+IWQ1Icbz0/FRndFDsyQWIn3Oaru1ui6JXfq9Zfz1tgfUET7TG</HostId></Error>
>        at 
> org.apache.hadoop.fs.s3.Jets3tFileSystemStore.get(Jets3tFileSystemStore.java:156)
>
>
>
>
>
> ________________________________
> From: John Roberts <[email protected]>
> To: [email protected]
> Sent: Tue, December 15, 2009 1:33:45 AM
> Subject: Re: HBase: using S3 for storage
>
>
> The stack trace is here: http://pastebin.ca/1715521
>
> I set my hbase.rootdir value to the following:
>
> s3://net.montrix.test.s3.amazonaws.com:80/
>
> Note that the net.montrix.test bucket exists in my S3 account.  Thanks for 
> looking at this.
>
> John
>
>
>
>
> ________________________________
> From: Andrew Purtell <[email protected]>
> To: [email protected]
> Sent: Mon, December 14, 2009 11:27:41 PM
> Subject: Re: HBase: using S3 for storage
>
> Hi John,
>
> Can you pastebin that stack trace?
>
>   - Andy
>
>
>
>
> ________________________________
> From: John Roberts <[email protected]>
> To: [email protected]
> Sent: Mon, December 14, 2009 6:49:50 PM
> Subject: HBase: using S3 for storage
>
> I'm running HBase version 0.20.2 and am trying to get my HBase server
> to use S3 for storage instead of the local file system.  I tried
> following the instructions here but could not get it to work:
>
> http://developer.amazonwebservices.com/connect/thread.jspa?messageID=139683
>
> My HBase version does not have a hadoop-site.xml file so I created one in the 
> conf directory with the following parameters:
>
> <configuration>
> <property>
>  <name>fs.default.name</name>
>  <value>s3://hbase</value>
> </property>
>
> <property>
>  <name>fs.s3.awsAccessKeyId</name>
>  <value>id</value>
> </property>
>
> <property>
>  <name>fs.s3.awsSecretAccessKey</name>
>  <value>secret</value>
> </property>
>
> </configuration>
>
> I also updated the hbase.rootdir property with the S3 url as per the reference
> above.  When I ran the hbase shell and tried to put a value into a table I got
> a deep stack trace with no mention of S3.
>
> Has anyone gotten HBase to use S3?  If so - could you send me the config 
> changes you made to get it to work?  Thanks!
>
> John
>
>
>
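
For reference, the setting John reports as working in the thread above can be sketched as a config fragment. This is only an illustration, not a tested setup. The rootdir value is copied from his message. Putting it in hbase-site.xml is an assumption, since the thread does not say which file he edited. The credentials are assumed to stay in the hadoop-site.xml he describes.

```xml
<!-- hbase-site.xml (assumed location; the thread only names the property) -->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <!-- Value John reported as working; the host part of the URI must be
         the name of a bucket that exists in the S3 account. -->
    <value>s3://net.montrix.test.s3.amazonaws.com:80/</value>
  </property>
</configuration>
```

The NoSuchBucket error quoted earlier in the thread names net.montrix.test.s3.amazonaws.com as the bucket, which suggests that the host part of the s3:// URI is taken as the bucket name.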
