If all you want is dumb storage for small-ish files, you can always just use NAS or SAN.
For the MP3 example, you might want to consider HBase... you can store associated meta-data in column families.

On Tue, Jul 6, 2010 at 3:33 PM, Ananth Sarathy <[email protected]> wrote:

> So I am aware of the problem with small files
> and I have read this article
>
> http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>
> I am just wondering if there has been any real change in this? For
> example's sake, suppose you just want an HDFS Cluster that never does any
> m/r jobs but would store an MP3 of every song known to exist, in
> /ARTIST/ALBUM/song kind of structure. And if some one wanted they could
> just go HDFS://U2/Joshua Tree/withOrWithoutyou.mp3
>
> Yes I know there are practical issues with this example such as search and
> browsing, but let's ignore those. I don't really want to have to write a
> file system to go on top of a file system for this kind of example, so I'd
> imagine I would use the har, but wanted to know if there is any other
> thoughts out there.
>
> Also, I was wondering if there were any tips and tricks for using
> har...auto archiving, things like that?
>
> Ananth T Sarathy
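
P.S. To make the HBase suggestion a bit more concrete, here is a rough sketch of how one row per song might be laid out. This is only an illustration: plain Python dicts stand in for the table, no HBase client is involved, and the family names ("meta", "data"), the helper functions, and the sample tag values are all my own assumptions rather than anything HBase prescribes.

```python
# Hypothetical layout: one row per song, keyed by artist/album/song.
# Plain dicts stand in for an HBase table; no HBase client is used.


def make_row_key(artist, album, song):
    """Build a row key that mirrors the /ARTIST/ALBUM/song path."""
    return "/".join([artist, album, song])


def make_row(mp3_bytes, **metadata):
    """Group cells by column family: 'meta' for tags, 'data' for the audio."""
    return {
        "meta": {name: str(value) for name, value in metadata.items()},
        "data": {"mp3": mp3_bytes},
    }


# row key -> {column family -> {qualifier -> value}}
table = {}

key = make_row_key("U2", "Joshua Tree", "withOrWithoutyou.mp3")
# The bytes and tags below are placeholder values for illustration only.
table[key] = make_row(b"...audio bytes...", year=1987, genre="rock")

# Looking a song up by its path is a single get on the row key.
row = table[key]
print(key, row["meta"])
```

The point of the row key is that it gives you the same path-style addressing as the HDFS layout in the original question, while the metadata sits next to the audio in its own family instead of in separate small files.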
