Re: Caching frequently map input files

Jason Venner Mon, 11 Feb 2008 08:44:51 -0800

I would propose either storing the files in HBase, which will keep an active copy available, or replicating the files manually to all of your machines and having a task that mmaps each file into shared memory. The mmap can lock the pages in and fault them in to ensure they are resident.

Then have your jobs attach the shared memory, or simply read the files normally.
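
As a rough illustration of the mmap step, here is a minimal sketch in Python (my choice of language for brevity, not anything Hadoop-specific). The function name map_and_prefault is made up for this example. It maps a file read-only and touches one byte per page so that every page is faulted in. It does not lock the pages; pinning them would need mlock(2), which this sketch does not attempt.

```python
import mmap
import os


def map_and_prefault(path):
    """Map a file read-only and touch every page so it is faulted in.

    Hypothetical helper for illustration only. It does not pin pages;
    the kernel may still evict them under memory pressure.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0:
            raise ValueError("cannot mmap an empty file")
        # Length 0 maps the whole file. The mapping remains valid
        # after the file object is closed.
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    # Reading one byte per page forces the kernel to bring that page in.
    for offset in range(0, size, mmap.PAGESIZE):
        _ = mapped[offset]

    return mapped
```

A long-running task would keep the returned mapping open for as long as the file should stay warm. Because the mapping is file-backed, the pages it touches live in the OS page cache, so jobs that read the same local file normally can benefit as well.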


Shimi K wrote:

Does Hadoop cache frequently used map input files (LRU/MRU)? Or does it load
files from disk each time a file is needed, even if it is the same file
that was required by the last job on the same node?

I am currently using version 0.14.4

- Shimi

--
Jason Venner
Attributor - Publish with Confidence <http://www.attributor.com/>
Attributor is hiring Hadoop Wranglers, contact if interested
