Do you have to use HDFS with map/reduce? I don't fully understand how closely
bound map/reduce is to HDFS.
In our application it might make more sense to accrue data using MogileFS
and place post-processed data (i.e. larger data) into HDFS for additional
processing.
Comments?
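(For context: as far as I understand, MapReduce jobs go through Hadoop's FileSystem abstraction rather than talking to HDFS directly, so in principle you can point a job at another store. A hedged sketch of a hadoop-site.xml fragment that runs MapReduce over the local filesystem instead of HDFS; the property name is from the 0.1x-era releases, so verify it against your version's hadoop-default.xml:)

```xml
<!-- hadoop-site.xml fragment: use the local filesystem instead of HDFS.
     Property name per 0.1x-era Hadoop; check your release's defaults. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
</configuration>
```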
Ted Dunning <[EMAIL PROTECTED]> wrote:
Hadoop may not be what you want for storing lots and lots of files.
If you need to store >10^7 files or if you are storing lots of small (<40MB)
files, then you may prefer a solution like MogileFS. It is engineered for a
very different purpose than Hadoop, but may be more appropriate for what you
want. It is also already intended for web-scale reliable applications so
there is a bit more that you can do for redundancy.
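To put rough numbers on the small-files concern: the namenode keeps the entire namespace in RAM, and a commonly cited ballpark (an assumption here, not a figure from this thread) is on the order of ~150 bytes of heap per namespace object (file or block). A quick back-of-the-envelope sketch:

```python
# Back-of-the-envelope namenode heap estimate for many small files.
# ~150 bytes per namespace object is a commonly cited ballpark, not a
# guarantee -- treat it as an assumption and measure on your own cluster.
BYTES_PER_OBJECT = 150

def namenode_heap_estimate(num_files, blocks_per_file=1):
    """Rough heap needed for the namespace: one object per file plus
    one per block (directories ignored for simplicity)."""
    objects = num_files + num_files * blocks_per_file
    return objects * BYTES_PER_OBJECT

# 10^7 small files, each fitting in a single block:
print(namenode_heap_estimate(10**7))  # ~3 GB of namenode heap
```

The point being that at 10^7+ files the namenode's memory, not disk capacity, becomes the limiting resource.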
On the other hand, HDFS might be just what you need.
On 9/5/07 1:03 PM, "Dongsheng Wang" wrote:
>
> We are looking at using HDFS as a long-term storage solution. We want to use
> it to store lots of files. The files can be big or small; they are images,
> videos, etc. We only write the files once and may read them many times.
> Sounds like HDFS is a perfect fit.
>
> The concern is that since it's been engineered to support MapReduce there may
> be fundamental assumptions that the data being stored by HDFS is transient in
> nature. Obviously for our scalable storage solution zero data loss or
> corruption is a heavy requirement.
>
> Is anybody using HDFS as a long term storage solution? Interested in any info.
> Thanks
>
> - ds
>
>