HDFS does not really meet your needs. I think that MapR's solution would. I will contact you off-line with the details.
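To give a rough sense of why the large number of small files is a concern, here is a back-of-envelope sketch of NameNode heap use. The figure of roughly 150 bytes of heap per namespace object (file or block) is a commonly quoted rule of thumb, not a measurement, and the function name and the one-year retention period are assumptions made only for illustration. The 100,000 files per backup comes from your description.

```python
# Back-of-envelope estimate of NameNode heap for a small-file backup workload.
# ASSUMPTION: ~150 bytes of heap per namespace object (file or block).
# This is a rule of thumb; real usage varies with Hadoop version and path lengths.
BYTES_PER_OBJECT = 150


def namenode_heap_bytes(files_per_backup, backups, blocks_per_file=1):
    """Estimate heap needed to track every file and block in memory.

    Hypothetical helper: each file costs one file object plus one
    object per block (small files occupy a single block each).
    """
    files = files_per_backup * backups
    objects = files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    # 100,000 files per daily backup (from the requirements), kept for one
    # year (assumed), for a single client and with no de-duplication.
    estimate = namenode_heap_bytes(files_per_backup=100_000, backups=365)
    print(f"Estimated NameNode heap: {estimate / 1e9:.2f} GB")
```

The estimate grows linearly with the number of files rather than with the number of bytes stored, so many KB-sized files cost as much metadata as many large ones.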
On Thu, Oct 6, 2011 at 3:35 PM, Hemant kulkarni <kulkarnihem...@gmail.com> wrote:

> Hi all,
> We are a small software development firm working on data backup
> software. We have a backup product which copies data from client
> machine to data store. Currently we provide a specialized hardware to
> store data (1-3TB disks and servers). We want to provide solution to
> some customers (mining company) with following requirements
> 1] Huge data storage capacity (initially starting with 100 TB but
> should be easy to increase)
> 2] Initially this facility is used as data storage but in future
> company plans to add data processing software (some MapReduce jobs)
> 3] Most of data is unstructured (mostly images, text files and videos)
> 4] Many times data is duplicate of some original. So need de-duplication
> 5] Mostly data is added every time (daily backup) and occasionally
> read. (Write every day new data and read on weekly)
> 6] Data copied is in terms of files (every backup is 100,000 files each
> file is some MB and some files in KB)
> 7] This is data storage so latency requirements are not very strict
> 8] Some part of data have very high HA requirements. Should be copied
> to data centers outside country on timely basis (weekly, but data size
> is small like few TB)
> 9] Currently we provide some sort of HSM (Hierarchical Storage
> Management). Company needs something similar in new solution
> 10] Single namespace and versioning of files is another requirement
>
> As I understood HDFS doesn't suit directly for such storage due to
> following design consideration
> 1] Large no of small files
> 2] duplicate data
> 3] write many read once requirement
>
> Here are my questions
> 1] Does HDFS support our client requirements? or at least can it be
> configured to suit needs?
> 2] is there any customization of HDFS (if possible) which will serve the
> purpose
>
> is there any other solution which will work?
>
> All thoughts/suggestions are welcome
>
> Regards,
> Hemant.
>
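
On requirement 4 (de-duplication), one common approach is to key stored content by a cryptographic hash so that identical files are kept only once. The following is a minimal in-memory sketch of whole-file de-duplication and not a description of how HDFS or any particular product does it. The class and method names are hypothetical, and a real backup store would persist the data and would likely de-duplicate at chunk level.

```python
import hashlib


class DedupStore:
    """Toy whole-file de-duplicating store (hypothetical, in-memory only)."""

    def __init__(self):
        self.blobs = {}  # SHA-256 hex digest -> file contents
        self.names = {}  # file name -> SHA-256 hex digest

    def put(self, name, data):
        """Store data under name; return True if the content was new."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            # Only the first copy of any given content is actually stored.
            self.blobs[digest] = data
        # Every name still maps to its content, so the namespace is complete.
        self.names[name] = digest
        return is_new

    def get(self, name):
        """Return the contents previously stored under name."""
        return self.blobs[self.names[name]]


if __name__ == "__main__":
    store = DedupStore()
    store.put("monday/report.txt", b"same bytes")
    store.put("tuesday/report.txt", b"same bytes")  # duplicate content
    print(len(store.names), "names,", len(store.blobs), "stored blob(s)")
```

Because daily backups tend to repeat most of their content, the number of stored blobs can be far smaller than the number of file names, although the name-to-digest map itself still grows with every file.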