Hi Ananth,

A general approach to this in HDFS is to use Sequence Files or Hadoop Archives. In layman's terms, you pack many small files into one larger file and build your own logic on top of it. That said, you will probably pay a penalty for random access, since HDFS was not designed for it. There are other solutions on top of Hadoop, such as HBase, that handle this (among many others).
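
To make the "pack small files into a larger file" idea concrete, here is a minimal Python sketch. It is not the SequenceFile or Hadoop Archive format and does not touch HDFS; the record layout and the names `pack_files`, `read_file`, and `scan` are assumptions made up for illustration. It only shows the general shape: length-prefixed key/value records appended to one container, plus an offset index you would maintain yourself if you need random access.

```python
import io
import struct

# Hypothetical record layout for this sketch (NOT the SequenceFile format):
# 4-byte key length, 4-byte value length (both big-endian), key bytes, value bytes.
HEADER = struct.Struct(">II")


def pack_files(files, out):
    """Append each (name, data) pair to `out` and return {name: record offset}."""
    index = {}
    for name, data in files.items():
        key = name.encode("utf-8")
        index[name] = out.tell()  # remember where this record starts
        out.write(HEADER.pack(len(key), len(data)))
        out.write(key)
        out.write(data)
    return index


def read_file(container, index, name):
    """Random access: seek to the record's offset and return its value bytes."""
    container.seek(index[name])
    key_len, value_len = HEADER.unpack(container.read(HEADER.size))
    key = container.read(key_len).decode("utf-8")
    if key != name:
        raise ValueError(f"index points at {key!r}, expected {name!r}")
    return container.read(value_len)


def scan(container):
    """Sequential access: yield (name, data) for every record, in write order."""
    container.seek(0)
    while True:
        header = container.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of container
        key_len, value_len = HEADER.unpack(header)
        key = container.read(key_len).decode("utf-8")
        yield key, container.read(value_len)


if __name__ == "__main__":
    buf = io.BytesIO()  # stands in for one large file
    idx = pack_files({"a.mp3": b"AAAA", "b.mp3": b"BB"}, buf)
    print(read_file(buf, idx, "b.mp3"))
    print([name for name, _ in scan(buf)])
```

The sequential `scan` is the cheap path in a layout like this; the offset index is the extra logic you take on yourself if you want to fetch a single small file without reading everything before it.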
I know this is very concise, but let me know your business case and I can go into more detail.

Regards,
Alex K

On Tue, Jul 6, 2010 at 1:51 PM, Ananth Sarathy <[email protected]> wrote:

> Yea I know I can use a nas or San. I am not really asking about this as a
> use case on what the best way to do it is, but rather what the best way
> to use hdfs is if it was decided that hdfs WAS the filesystem you were
> going to use to serve lots of small files.
>
> sent from my nexus one
>
> On Jul 6, 2010 3:43 PM, "Patrick Angeles" <[email protected]> wrote:
> If all you want is dumb storage for small-ish files, you can always just
> use NAS or SAN.
>
> For the MP3 example, you might want to consider HBase... you can store
> associated meta-data in column families.
>
> On Tue, Jul 6, 2010 at 3:33 PM, Ananth Sarathy
> <[email protected]> wrote:
> >
> > So I am aware of the problem with small files
> > and I have read this article
> >
> > http://www.cloud...
>
