On Mon, Aug 18, 2014 at 10:31 AM, J. Roeleveld <jo...@antarean.org> wrote:
>
> I wouldn't use Hadoop for storage of files. It's only useful if you have a lot
> (and I do mean a LOT) of data where a query only returns a very small amount.

Not to mention that the data needs to sit in a small number of large
files.  HDFS works in blocks measured in megabytes (the default block
size is 64 MB or 128 MB depending on the version), and every file
carries its own metadata and task overhead no matter how small it is.
I tried using it to process gentoo-x86 and the sheer number of small
files just clobbered the thing.  Since in my job the files were really
just static data and not the actual subject of the map/reduce, I
instead replicated the data to all the nodes and had them read it from
the local filesystem.
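
A minimal sketch of that local-read approach, assuming a Hadoop
Streaming-style mapper whose input is a list of relative paths (one
per line) rather than the file contents.  The LOCAL_DATA_ROOT variable,
the /var/data/replica default, and the line-count "work" are all
hypothetical placeholders, not details of the original job:

```python
import os
import sys

# Hypothetical location of the replicated tree on every node (an assumption).
LOCAL_ROOT = os.environ.get("LOCAL_DATA_ROOT", "/var/data/replica")


def map_paths(lines, local_root=LOCAL_ROOT):
    """Yield (relative_path, line_count) for each path named in the input.

    The job input only carries file names; the contents are read from the
    node's local copy instead of from HDFS.
    """
    root = os.path.realpath(local_root)
    for line in lines:
        rel = line.strip()
        if not rel:
            continue  # ignore blank input lines
        full = os.path.realpath(os.path.join(root, rel))
        # Skip anything that resolves outside the replicated tree.
        if os.path.commonpath([root, full]) != root:
            continue
        try:
            with open(full, "rb") as fh:
                count = sum(1 for _ in fh)  # placeholder for the real work
        except OSError:
            continue  # missing or unreadable on this node
        yield rel, count


def main():
    # Streaming mappers read records on stdin and write key<TAB>value lines.
    for rel, count in map_paths(sys.stdin):
        print(f"{rel}\t{count}")
```

A real mapper script would call main() under the usual __main__ guard.
The point of the design is that HDFS only ever sees one modest input
file (the path list), so the number of small files never reaches it.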

Hadoop is a very specialized tool.  It does what it does very well,
but if you want to use it for something other than map/reduce then
consider carefully whether it is the right tool for the job.

--
Rich
