Thank you, Kevin, for the detailed explanation. I went ahead and shared both.
Since I tested on my own machine, it worked :) but obviously that was a fluke,
and I need to change my code to run on the cluster.
Sincerely,
Mark

On Wed, Sep 9, 2009 at 2:57 PM, Kevin Peterson <[email protected]> wrote:

> On Tue, Sep 8, 2009 at 1:16 PM, Mark Kerzner <[email protected]> wrote:
>
> > Hi,
> > I have some code that's common between the main class, mapper, and
> > reducer.
> > Can I put it only in the main class and use it from mapper and reducer?
> >
> > A similar question about static variables in the main class: are they
> > available from the mapper and reducer?
> >
> >
> Code yes, data no.
>
> Your mapper and reducer will have the full jar file that contains the job
> (unless you are doing something very strange). You can include any code
> you need to share, just as you would in any other Java app.
>
> You can't pass data in static variables though. The main class is only
> going to run on the machine you submit the job from. When the mappers and
> reducers start up, they will start in separate JVMs, usually not even on
> the same physical node. If you need to distribute a large amount of data,
> you can use the distributed cache. If you just need to pass some settings,
> you could accomplish it by setting child opts (options passed to the JVMs
> for the mappers and reducers) in the config. If you need some sort of
> coordination more complicated than this, you should look into ZooKeeper.
>
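A minimal sketch of the "code yes, data no" point above, in plain Java with
no Hadoop API. It assumes that java.util.Properties can stand in for the job
config, and that a byte array can stand in for whatever is shipped to the task
JVMs. The class name, the key "myjob.threshold", and both helper methods are
hypothetical. The idea is that the task side rebuilds its settings from what
was shipped and never reads the static field that main() set.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;

class SettingsSketch {
    // Set only in main(); a task running in another JVM would see 0 here.
    static int threshold = 0;

    // Hypothetical setting name, chosen for illustration.
    static final String THRESHOLD_KEY = "myjob.threshold";

    // Submit side: turn the settings into bytes that can leave this JVM.
    static byte[] serialize(Properties conf) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            conf.store(out, null);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Task side: rebuild the settings from the shipped bytes only.
    // Deliberately does not touch the static field above.
    static int readThreshold(byte[] wire) {
        try {
            Properties conf = new Properties();
            conf.load(new ByteArrayInputStream(wire));
            return Integer.parseInt(conf.getProperty(THRESHOLD_KEY, "0"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        threshold = 42; // visible only inside the submitting JVM

        Properties conf = new Properties();
        conf.setProperty(THRESHOLD_KEY, Integer.toString(threshold));

        byte[] wire = serialize(conf);           // what gets shipped
        System.out.println(readThreshold(wire)); // prints 42
    }
}
```

The shared code (readThreshold here) can live in any class inside the job
jar; only the data has to travel explicitly.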
