Hello all,

I am looking for the right thing to read...

I am writing a MapReduce Speech Recognition application. I want to run many Speech Recognizers in parallel.

Speech Recognizers not only use a lot of CPU, they also use a large amount of memory. On top of that, in my application they spend much of their time idle, waiting for data, so optimizing what runs when is non-trivial.

I am trying to better understand how Hadoop manages resources. Does it automatically figure out the right number of mappers to instantiate? How? What happens when other people are sharing the cluster? What resource management is the responsibility of application developers?
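To make the question concrete, here is roughly what I imagine the job driver would look like, assuming a YARN-based cluster where mapreduce.map.memory.mb / mapreduce.map.java.opts are the right knobs to declare per-task memory (RecognizerMapper is hypothetical; the real map() would wrap a recognizer):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecognizerJob {

    // Hypothetical mapper: one recognizer instance per map task.
    public static class RecognizerMapper
            extends Mapper<LongWritable, Text, Text, Text> {
        // setup() would load the ~500 MB recognizer model once per task.

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Placeholder: the real code would decode audio and emit a transcript.
            context.write(new Text(key.toString()), new Text("TRANSCRIPT"));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Declare that each map task needs roughly 1 GB (500 MB model plus overhead).
        conf.setInt("mapreduce.map.memory.mb", 1024);
        conf.set("mapreduce.map.java.opts", "-Xmx900m");

        Job job = Job.getInstance(conf, "speech-recognition");
        job.setJarByClass(RecognizerJob.class);
        job.setMapperClass(RecognizerMapper.class);
        job.setNumReduceTasks(0); // map-only: each mapper just decodes its input
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Is declaring memory like this all that is expected of me, or does the scheduler need more information than that?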

For example, let's say each Speech Recognizer uses 500 MB and I have 1,000,000 files to process. What would happen if I created 1,000,000 mappers, each with one Speech Recognizer? Would that be suboptimal only because of task setup overhead, or would the system try to allocate 500 TB of memory and explode?
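In case it clarifies the question, the alternative I am considering instead of one mapper per file is packing many files into each split, along these lines (CombineTextInputFormat and the 128 MB figure are guesses on my part, and my inputs are audio rather than text, so the real input format would have to differ):

import org.apache.hadoop.mapreduce.lib.input.CombineTextInputFormat;

// In the driver above, before submitting the job: pack many small
// input files into each split so the job does not create 1,000,000 map tasks.
job.setInputFormatClass(CombineTextInputFormat.class);
CombineTextInputFormat.setMaxInputSplitSize(job, 128L * 1024 * 1024); // ~128 MB per split

Is that the intended mechanism, or does the framework already handle this for me?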

Thank you in advance
Peter
