On Sep 5, 2007, at 1:03 PM, Dongsheng Wang wrote:
The concern is that since it’s been engineered to support MapReduce, there may be fundamental assumptions that the data being stored by HDFS is transient in nature. Obviously, for our scalable storage solution, zero data loss or corruption is a heavy requirement.
Actually, HDFS was started before Map/Reduce. *smile* Map/Reduce does not use HDFS for the transient storage, just the persistent storage at the end of the job. HDFS should be a fairly good match for your needs.
Is anybody using HDFS as a long-term storage solution? Interested in any info. Thanks
We have 1 petabyte of relatively persistent user data... -- Owen
