On Dec 9, 2008, at 7:34 PM, Aaron Kimball wrote:

That's true, but you should be aware that you no longer have an
OutputCollector available in the close() method.

True, but in practice you can keep a handle to it from the map method and it will work perfectly. Both streaming and pipes depend on this to work. (Both of them do their processing asynchronously, so close needs to wait for the subprocess to finish. Because of this, the contract with the Mapper and Reducer is very loose, and the collect method may be called in between calls to the map method.) In the context object API (HADOOP-1230), the context object will be passed to cleanup, to make it clear that cleanup can also write records.

-- Owen
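
For readers following the thread, here is a minimal sketch of the pattern described above: save the collector handle in map() and use it again in close(). To keep it runnable without a Hadoop install, `Collector` is a simplified stand-in for Hadoop's `OutputCollector`, and `CloseEmitSketch`, `CountingMapper`, and `run` are hypothetical names chosen for illustration, not part of Hadoop. A real mapper would implement the framework's Mapper interface instead.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CloseEmitSketch {

    // Simplified stand-in for Hadoop's OutputCollector (assumption, not the real type).
    public interface Collector<K, V> {
        void collect(K key, V value) throws IOException;
    }

    // Hypothetical mapper that counts its input records and emits the total from close().
    public static class CountingMapper {
        private Collector<String, Long> out; // handle saved from the last map() call
        private long count = 0;

        public void map(String key, String value, Collector<String, Long> output)
                throws IOException {
            out = output; // keep the handle so close() can still write records
            count++;
        }

        public void close() throws IOException {
            // map() is never called on empty input, so the handle may be null.
            if (out != null) {
                out.collect("records", count);
            }
        }
    }

    // Drives the mapper roughly as a framework would: map() per record, then close().
    public static List<String> run(List<String> lines) throws IOException {
        final List<String> emitted = new ArrayList<String>();
        Collector<String, Long> collector = (k, v) -> emitted.add(k + "=" + v);
        CountingMapper mapper = new CountingMapper();
        for (String line : lines) {
            mapper.map("", line, collector);
        }
        mapper.close();
        return emitted;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run(Arrays.asList("a", "b", "c"))); // prints [records=3]
    }
}
```

The null check in close() matters because a task with no input records never sees a collector at all, so this approach cannot emit anything in that case.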
