Thank you, everybody, for your input - it was very useful. I need to do my homework now, and I will be back with an update. The device really exists. It is not cheap, but it may make sense for the NN of a serious cluster.
Sincerely,
Mark

On Fri, Oct 12, 2012 at 10:46 PM, Ravi Prakash <ravi...@ymail.com> wrote:
> Maybe at a slight tangent, but for each write operation on HDFS (e.g.
> create a file, delete a file, create a directory), the NN waits until
> the edit has been *flushed* to disk. So I can imagine such a
> hypothetical(?) disk would tremendously speed up the NN even as it is.
> Mark, can you please please please send me 5 of these disks? :-P
>
> To answer your question, you probably want to change BlockManager and
> FSNamesystem, both basically being the crux of the HDFS NN. It's going
> to be a pretty significant undertaking.
>
> @memory-mapped files: they would lose data in case of failure (unless
> of course you use special hardware - thinking of which, it's really not
> soooo special, so maybe worth trying). Has anyone tried this before?
>
> ------------------------------
> *From:* Lance Norskog <goks...@gmail.com>
> *To:* user@hadoop.apache.org
> *Sent:* Friday, October 12, 2012 12:01 AM
> *Subject:* Re: Using a hard drive instead of
>
> This is why memory-mapped files were invented.
>
> On Thu, Oct 11, 2012 at 9:34 PM, Gaurav Sharma
> <gaurav.gs.sha...@gmail.com> wrote:
> > If you don't mind sharing, what hard drive do you have with these
> > properties:
> > - "performance of RAM"
> > - "can accommodate very many threads"
> >
> > On Oct 11, 2012, at 21:27, Mark Kerzner <mark.kerz...@shmsoft.com>
> > wrote:
> >
> > Harsh,
> >
> > I agree with you about many small files, and I was giving this only
> > by way of example. However, the hard drive I am talking about can be
> > 1-2 TB in size, and that's pretty good; you can't easily get that
> > much memory. In addition, it would be more resistant to power
> > failures than RAM. And yes, it has the performance of RAM and can
> > accommodate very many threads.
> >
> > Mark
> >
> > On Thu, Oct 11, 2012 at 11:16 PM, Harsh J <ha...@cloudera.com> wrote:
> >> Hi Mark,
> >>
> >> Note that the NameNode does random memory access to serve back any
> >> information or mutation request you send to it, and there can be
> >> many concurrent clients. So do you mean a 'very fast hard drive'
> >> that's faster than RAM for random access itself? The NameNode does
> >> persist its block information onto disk for various purposes, but
> >> making the NameNode use disk storage completely (rather than just
> >> having parts of it disk-cached) wouldn't make much sense to me.
> >> Performance-wise, that would feel like trying to communicate with a
> >> process that's swapping.
> >>
> >> The too-many-files issue is bloated up to sound like a NameNode
> >> issue, but in reality it isn't. HDFS lets you process lots of files
> >> really fast, aside from helping store them for long periods, and a
> >> lot of tiny files only slows such operations down with the overhead
> >> of opening and closing each file while reading them all. With a
> >> single file or a few large files, all you do is block (data) reads
> >> with very few NameNode communications, so you end up going much
> >> faster. The same is true for local filesystems as well, but not
> >> many think of that.
> >>
> >> On Fri, Oct 12, 2012 at 9:29 AM, Mark Kerzner
> >> <mark.kerz...@shmsoft.com> wrote:
> >> > Hi,
> >> >
> >> > Imagine I have a very fast hard drive that I want to use for the
> >> > NameNode. That is, I want the NameNode to store its block
> >> > information on this hard drive instead of in memory.
> >> >
> >> > Why would I do it? Scalability (no federation needed), many files
> >> > are not a problem, and warm fail-over is automatic. What would I
> >> > need to change in the NameNode to tell it to use the hard drive?
> >> >
> >> > Thank you,
> >> > Mark
> >>
> >> --
> >> Harsh J
>
> --
> Lance Norskog
> goks...@gmail.com
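Ravi's point above about edit flushes is easy to see in miniature. The sketch below is plain Java NIO, not HDFS's actual FSEditLog; the class name, the edits.log file, and the OP_MKDIR record strings are invented for illustration. It acknowledges each edit only after FileChannel.force() returns, which is exactly the synchronous flush that a device with RAM-like write latency would shorten:

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Toy write-ahead edit log: every mutation is acknowledged only after
// it has been forced to the storage device, mirroring the behavior
// Ravi describes. Not HDFS code; names are made up for illustration.
public class DurableEditLog implements AutoCloseable {

    private final FileChannel channel;

    public DurableEditLog(Path path) throws IOException {
        channel = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    // Append one edit record and block until it is durable on disk.
    public void logEdit(String edit) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(
                (edit + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        // force(false) = fsync the data without file metadata. On a
        // spinning disk this costs milliseconds per edit; a RAM-speed
        // device shrinks exactly this wait.
        channel.force(false);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        try (DurableEditLog log = new DurableEditLog(Paths.get("edits.log"))) {
            long start = System.nanoTime();
            for (int i = 0; i < 1000; i++) {
                log.logEdit("OP_MKDIR /user/dir" + i);
            }
            long elapsedNs = System.nanoTime() - start;
            // 1000 edits, 1000 ns per microsecond
            System.out.printf("avg latency per durable edit: %d us%n",
                    elapsedNs / 1000 / 1000);
        }
    }
}

Commenting out the force() call shows how much of the per-edit cost is the flush itself.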
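Lance's memory-mapped-files suggestion and Ravi's durability caveat fit together in a second sketch, again plain java.nio rather than anything the NameNode actually does; the namespace.img file name and the record contents are invented. Reads and writes against the mapped buffer run at memory speed, but the changes sit in the OS page cache until force() (or the kernel) writes them back, so a power failure can lose recent edits unless the hardware itself is non-volatile - the "special hardware" Ravi mentions:

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Toy memory-mapped metadata store. Not NameNode code; the file name
// and record format are made up for illustration.
public class MappedMetadataDemo {

    public static void main(String[] args) throws IOException {
        long size = 64L * 1024 * 1024; // 64 MB region for the demo

        try (FileChannel ch = FileChannel.open(Paths.get("namespace.img"),
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {

            // Map the file into the process address space; the file is
            // grown to 'size' if needed.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);

            // This write happens at memory speed and is paged out
            // lazily by the OS.
            map.put("inode 1001 /user/mark".getBytes(StandardCharsets.UTF_8));

            // Without this force(), a crash right here could lose the
            // record - Ravi's point about needing non-volatile hardware
            // if you skip explicit flushes.
            map.force();
        }
    }
}

Whether this could beat the NameNode's in-memory maps would come down to how the metadata structures serialize into a flat mapped region, which is the nontrivial part Ravi points at in BlockManager and FSNamesystem.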