I used EMR for our project, and it works, although it took some time to set up. EMR requires an S3 bucket, but S3 has a file size limit of 5 GB, so large inputs need some extra care. Has anyone else encountered the file size problem on S3? I think a 5 GB size limit is unreasonable when we want to use the system to deal with large data sets.
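One way to take that extra care is to split large inputs into parts below the limit before uploading them. The following is only a rough sketch of that idea, not what we actually ran: `split_file`, its parameters, and the `.partNNNNN` naming are made up for illustration, and the 5 GB constant is simply the limit mentioned above.

```python
import os

# Assumed limit, taken from the 5 GB figure mentioned in this message.
S3_MAX_FILE_BYTES = 5 * 1024 ** 3


def split_file(src_path, out_dir, part_bytes=S3_MAX_FILE_BYTES,
               buf_bytes=64 * 1024 * 1024):
    """Split src_path into numbered parts of at most part_bytes each.

    Hypothetical helper (not part of any EMR or S3 tool). Returns the
    list of part paths in order. Data is copied in buf_bytes pieces so
    a whole part never has to fit in memory.
    """
    if part_bytes <= 0 or buf_bytes <= 0:
        raise ValueError("part_bytes and buf_bytes must be positive")
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(src_path)
    parts = []
    with open(src_path, "rb") as src:
        index = 0
        while True:
            # Read the first piece before creating the part file, so an
            # exhausted input never produces an empty trailing part.
            chunk = src.read(min(buf_bytes, part_bytes))
            if not chunk:
                break
            part_path = os.path.join(out_dir, "%s.part%05d" % (base, index))
            with open(part_path, "wb") as dst:
                written = 0
                while chunk:
                    dst.write(chunk)
                    written += len(chunk)
                    if written >= part_bytes:
                        break
                    chunk = src.read(min(buf_bytes, part_bytes - written))
            parts.append(part_path)
            index += 1
    return parts
```

Note that this cuts on raw byte boundaries, so a record can end up straddling two parts. For line-oriented input you would probably want to cut at the last newline before the limit instead.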
On Sun, Jan 10, 2010 at 8:06 PM, Ted Dunning <[email protected]> wrote:

> This seems the easiest answer so far!
>
> On Sun, Jan 10, 2010 at 8:03 PM, deneche abdelhakim <[email protected]> wrote:
>
> > % hadoop-ec2 launch-cluster --env REPO=testing --env HADOOP_VERSION=0.20 \
> >     my-hadoop-cluster 10
>
> --
> Ted Dunning, CTO
> DeepDyve

--
Chenmin Liang
Language Technologies Institute, School of Computer Science
Carnegie Mellon University
