We have plans for Hadoop very similar to what C G describes below, but we've found the stability of HDFS to be quite troublesome. We've corrupted HDFS three different ways in a few weeks: 1) running jstack on the NameNode; 2) loading lots of small files into HDFS, which caused it to hang on a Map/Reduce job and then show corruption on restart; 3) upgrading to a newer version of Hadoop. As a result, we are very uncertain about treating HDFS as a reliable long-term data store.
That being said, we're excited about the opportunities created by Hadoop so we're going to put some time into making it more reliable and creating a utility to archive data out of HDFS for backup purposes.

On 9/5/07, C G <[EMAIL PROTECTED]> wrote:
>
> Our intention is to use HDFS as the core of a large "data repository". We
> store "raw" data within HDFS on a more-or-less permanent basis, and
> map/reduce it to produce load files for our data warehouse. We have other
> plans as well all centered around storing data on a very long term basis in
> HDFS. So you're in good company...
>
> Our plan is for a 64T HDFS repository, with a replication factor of 3
> for a ~21T data space.
>
> C G
>
>
> Dongsheng Wang <[EMAIL PROTECTED]> wrote:
>
> We are looking at using HDFS as a long term storage solution. We want to
> use it to stored lots of files. The file could be big and small, they are
> images, videos etc... We only write the files once, and may read them many
> times. Sounds like it is perfect to use HDFS.
>
> The concern is that since it's been engineered to support MapReduce there
> may be fundamental assumptions that the data being stored by HDFS is
> transient in nature. Obviously for our scalable storage solution zero data
> loss or corruption is a heavy requirement.
>
> Is anybody using HDFS as a long term storage solution? Interested in any
> info. Thanks
>
> - ds
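
For anyone checking the capacity figures quoted above, here is a minimal Python sketch of the arithmetic. The function name is hypothetical, and it assumes usable space is simply raw space divided by the replication factor, ignoring non-DFS disk usage, reserved space, and temporary Map/Reduce output:

```python
def usable_capacity_tb(raw_tb: float, replication: int) -> float:
    """Approximate usable space when every block is stored `replication` times.

    Assumption: no allowance for non-DFS usage or reserved space.
    """
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_tb / replication


if __name__ == "__main__":
    # 64T raw with a replication factor of 3, as in the quoted plan
    print(f"{usable_capacity_tb(64, 3):.1f} TB usable")
```

With 64T raw and a replication factor of 3 this comes out at roughly 21T, which matches the ~21T data space mentioned in the quoted message.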
