On Tue, Sep 8, 2009 at 1:16 PM, Mark Kerzner <[email protected]> wrote:
> Hi,
> I have some code that's common between the main class, mapper, and reducer.
> Can I put it only in the main class and use it from mapper and reducer?
>
> A similar question about static variables in the main - are they available
> from mapper and reducer?

Code yes, data no.

Your mapper and reducer will have the full jar file that contains the job (unless you are doing something very strange), so you can include any code you need to share, just as you would in any other Java app.

You can't pass data in static variables, though. The main class only runs on the machine you submit the job from. The mappers and reducers start up in separate JVMs, usually not even on the same physical node, so a static variable set in main is never set in the tasks.

If you need to distribute a large amount of data, you can use the distributed cache. If you just need to pass some settings, you could do it by setting the child opts (the options passed to the JVMs for the mappers and reducers, i.e. mapred.child.java.opts) in the config. If you need coordination more complicated than this, you should look into ZooKeeper.
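To make the "code yes, settings via child opts" part concrete, here is a minimal sketch in plain Java, with no Hadoop classes involved. The class names JobShared and SharedCodeDemo and the property name myjob.delimiter are made up for illustration. It assumes the task JVMs are started with something like -Dmyjob.delimiter=: added to mapred.child.java.opts, which the demo simulates with System.setProperty.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper packaged in the job jar. The driver, mapper, and
// reducer can all call it because every task JVM loads the same jar.
final class JobShared {
    // Hypothetical property name; assumed to be passed to task JVMs as
    // -Dmyjob.delimiter=: via the child opts.
    static final String DELIMITER_PROPERTY = "myjob.delimiter";

    private JobShared() {}

    // Shared code: stateless, so it behaves the same in every JVM.
    static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    // Shared setting: read from this JVM's system properties on each call,
    // falling back to a comma when the -D option was not supplied.
    static String delimiter() {
        return System.getProperty(DELIMITER_PROPERTY, ",");
    }

    // Normalizes a line and splits it on the configured delimiter.
    static String[] splitRecord(String line) {
        return normalize(line).split(Pattern.quote(delimiter()));
    }
}

final class SharedCodeDemo {
    public static void main(String[] args) {
        // Stands in for the -D option a real task JVM would receive.
        System.setProperty(JobShared.DELIMITER_PROPERTY, ":");
        System.out.println(String.join("|", JobShared.splitRecord(" A:b:C ")));
    }
}
```

Note that the helper keeps nothing in a mutable static field: the value comes from the JVM's own startup options each time, which is why it survives the jump to a separate JVM where a static set in main would not.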
