I've tried running hbase on s3; see a previous post I made on this list. Basically, it's considerably slower than hdfs, especially for random reads. I also think there could be consistency issues when running on s3: the master creates a file and tells a region server to read it, and the region server gets a file-not-found error. This happened to me a couple of times when running s3 as the main map/reduce filesystem. We've basically decided to run our own hdfs and just use s3 for backup.
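To illustrate the file-not-found case: one way a reader could tolerate a file that has been created but is not visible yet is to retry with backoff. This is only a minimal sketch, not something hbase or hadoop does and not what we ran. The function name `read_with_retry` and its parameters are hypothetical, and a local path stands in for the s3 object.

```python
import time


def read_with_retry(path, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Read a file, retrying with exponential backoff if it is not visible yet.

    Hypothetical helper: `path` is a local file standing in for an s3 object,
    and `sleep` is injectable so the wait can be replaced in tests.
    """
    for attempt in range(attempts):
        try:
            with open(path, "rb") as f:
                return f.read()
        except FileNotFoundError:
            if attempt == attempts - 1:
                raise  # still missing after the last attempt: give up
            # Wait base_delay, then 2x, 4x, ... before trying again.
            sleep(base_delay * (2 ** attempt))
```

Retrying only hides the delay; it does nothing for the slow random reads, which is the bigger reason we went back to hdfs.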
3) I see no need for diff images (we use one for all).

have fun,
-clint

On Wed, May 7, 2008 at 11:12 AM, Jim R. Wilson <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I'm about to embark on a mystical journey through hosted web-services
> with my trusted friend hbase. Here are some questions for my fellow
> travelers:
>
> 1) Has anyone done this before? If so, what lifesaving tips can you offer?
> 2) Should I attempt to build an hdfs out of ec2 persistent storage, or
> just use S3?
> 3) How many images will I need? Just one, or master/slave?
> 4) What version of hadoop/hbase should I use? (The hadoop/ec2
> instructions[1] seem to favor the unreleased 0.17, but there doesn't
> seem to be a public image with 0.17 at the ready)
>
> Thanks in advance for any advice, I'm gearing up for quite a trip :)
>
> [1] http://wiki.apache.org/hadoop/AmazonEC2
>
> -- Jim R. Wilson (jimbojw)
>
