Hi all,

I'm interested in building a solution that uses multiple compute nodes
in an EC2 or Rackspace cloud environment to do massively parallel
processing while serving HTTP requests, which means I need results to
be aggregated within 1-4 seconds.
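To make the shape of it concrete, here's a rough Python sketch of the
deadline-bounded scatter-gather pattern I have in mind (the worker URLs
and the merge step are just placeholders, nothing that exists yet):

from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
from urllib.request import urlopen

# Placeholder worker endpoints -- in practice, the cloud nodes.
WORKER_URLS = [
    "http://worker1.example.com/query",
    "http://worker2.example.com/query",
]

DEADLINE_SECONDS = 4  # overall budget for one HTTP request

def fetch_partial(url):
    # Ask one worker for its partial result.
    return urlopen(url, timeout=DEADLINE_SECONDS).read()

def scatter_gather():
    # Fan the work out to every worker, then keep whatever comes
    # back before the deadline; stragglers are simply dropped.
    pool = ThreadPoolExecutor(max_workers=len(WORKER_URLS))
    futures = [pool.submit(fetch_partial, url) for url in WORKER_URLS]
    results = []
    try:
        for fut in as_completed(futures, timeout=DEADLINE_SECONDS):
            try:
                results.append(fut.result())
            except Exception:
                pass  # a failed worker contributes nothing
    except FuturesTimeout:
        pass  # deadline hit; return whatever arrived in time
    pool.shutdown(wait=False)  # let stragglers finish in the background
    return results  # aggregation/merge of partial results goes here

The hard part is making the fan-out and the data distribution efficient
across many nodes, which leads to my questions below.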

From what I gather, Hadoop is designed for job-oriented batch tasks and
the minimum job completion time is 30 seconds. Also, HDFS is meant for
storing a small number of large files, as opposed to many small files.

My question: is there a framework similar to Hadoop that is designed
more for on-demand, low-latency parallel computing? And is there a
technology similar to HDFS that is better at moving small files around
and making them available to slave nodes on demand?
