Given the goal of shared data accessible across the Map instances,
can someone please explain some of the differences between using:
- setNumTasksToExecutePerJvm() and then having statically declared
data initialised in Mapper.configure(); and
- a MultithreadedMapRunner?

Regards,
Shane


On Wed, Nov 26, 2008 at 6:41 AM, Doug Cutting <[EMAIL PROTECTED]> wrote:
> tim robertson wrote:
>>
>> Thanks Alex - this will allow me to share the shapefile, but I need to
>> read it, parse it and store the objects in the index only once per
>> job per JVM.
>> Is Mapper.configure() the best place to do this?  E.g. will it
>> only be called once per job?
>
> In 0.19, with HADOOP-249, all tasks from a job can be run in a single JVM.
>  So, yes, you could access a static cache from Mapper.configure().
>
> Doug
>
>
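
For reference, a minimal sketch of the static-cache approach Doug describes.
It is plain Java with no Hadoop dependency: CachingMapperSketch, loadIndex()
and the "shapes.shp" path are hypothetical stand-ins, and configure(String)
only plays the role of Mapper.configure(JobConf). It assumes JVM reuse has
been enabled on the job (for example JobConf.setNumTasksToExecutePerJvm(-1)),
so that several tasks of the same job pass through the same JVM and therefore
see the same static field.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a Mapper implementation; not a Hadoop class.
final class CachingMapperSketch {
    // One copy per JVM (per classloader), shared by every instance.
    private static volatile Map<String, String> index;

    // Counts how many times the expensive load actually ran.
    static final AtomicInteger loads = new AtomicInteger();

    // Plays the role of Mapper.configure(JobConf): called once per task.
    void configure(String shapefilePath) {
        synchronized (CachingMapperSketch.class) {
            if (index == null) {
                // Only the first task in this JVM pays the parsing cost.
                index = loadIndex(shapefilePath);
            }
        }
    }

    // Placeholder for reading and parsing the shapefile into an index.
    private static Map<String, String> loadIndex(String path) {
        loads.incrementAndGet();
        Map<String, String> m = new HashMap<>();
        m.put("source", path);
        return Collections.unmodifiableMap(m);
    }

    // Read-only access to the shared index from map().
    String lookup(String key) {
        return index.get(key);
    }

    public static void main(String[] args) {
        // Simulate three tasks of one job running in the same JVM.
        for (int i = 0; i < 3; i++) {
            CachingMapperSketch task = new CachingMapperSketch();
            task.configure("shapes.shp");
        }
        System.out.println("loads = " + loads.get());
    }
}
```

The lock on the class object is not strictly needed if tasks in a reused JVM
run one after another, but it keeps the same initialisation safe if the
mapper is ever driven from several threads. Note that the cache is keyed on
nothing: the first path wins for the lifetime of the JVM, which matches the
"once per job per JVM" requirement but would need a key if the input could
differ between tasks.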
