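I believe you want to ship the data to each node in your cluster before
the MapReduce job begins, so that the mappers can read the files from
their local disk instead of from HDFS. This is what the DistributedCache
is for. The Hadoop tutorial on YDN (Yahoo! Developer Network) has some
good info on this: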

http://developer.yahoo.com/hadoop/tutorial/module5.html#auxdata
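
As far as I recall, with the 0.20-era API the driver registers the file
with DistributedCache.addCacheFile(uri, conf), and each mapper can get the
local copies in setup() via DistributedCache.getLocalCacheFiles(conf).
Once you have a local path, the mapper-side part is plain Java. Below is
a rough sketch of that part only, using just the JDK so it can be tried
without a cluster. The class name SideDataLookup and the tab-separated
key/value layout are my assumptions, not anything Hadoop prescribes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper: reads a side-data file from local disk once and
 * keeps it in memory so that every map() call can look values up cheaply.
 * Assumes one "key<TAB>value" pair per line.
 */
final class SideDataLookup {
    private final Map<String, String> table = new HashMap<>();

    /** Load the whole file; intended to be called once, e.g. from a mapper's setup(). */
    SideDataLookup(Path localFile) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(localFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // skip lines without a key/value separator
                }
                // Split on the first tab only; the value may contain tabs.
                table.put(line.substring(0, tab), line.substring(tab + 1));
            }
        }
    }

    /** Returns the value for the key, or null if the key is not in the file. */
    String get(String key) {
        return table.get(key);
    }

    /** Number of distinct keys loaded. */
    int size() {
        return table.size();
    }
}
```

In a real job you would build one SideDataLookup in setup() from the
local path the cache hands you, then call get() inside map(). Loading in
setup() rather than map() matters, since map() runs once per record.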

-Prashant Kommireddi

On Fri, Nov 25, 2011 at 1:05 AM, Andy Doddington <a...@doddington.net> wrote:

> I have a series of mappers that I would like to be passed data using the
> distributed cache mechanism. At the moment, I am using HDFS to pass the
> data, but this seems wasteful to me, since they are all reading the same
> data.
>
> Is there a piece of example code that shows how data files can be placed
> in the cache and accessed by mappers?
>
> Thanks,
>
>        Andy Doddington
>
>
