Is it a good or a bad idea to write business logic in the cleanup() method of a Mapper? We think our Mapper could run faster if it keeps accumulating data in a HashMap while it runs and writes that data out later, in the cleanup() method.
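To make the idea concrete, here is a rough sketch of the pattern, written as plain Java so that it runs without Hadoop on the classpath. The class `WordCountMapper`, the `runTask` driver, and the word-count logic are all made up for illustration. They only imitate the order in which Hadoop's `Mapper.run()` calls `map()` for each record and then `cleanup()` once, and they are not the real API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class InMapperCombinerSketch {

    /** Hypothetical stand-in for a Hadoop Mapper; not the real Hadoop class. */
    static class WordCountMapper {
        // Per-task buffer: grows with the number of distinct keys seen.
        private final Map<String, Integer> counts = new HashMap<>();

        /** Called once per input record; emits nothing, only accumulates. */
        void map(String line) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }

        /** Called once after the last record; flushes the buffered totals. */
        void cleanup(BiConsumer<String, Integer> context) {
            counts.forEach(context);
            counts.clear();
        }
    }

    /** Imitates the task lifecycle: map() per record, then cleanup() once. */
    public static Map<String, Integer> runTask(List<String> lines) {
        WordCountMapper mapper = new WordCountMapper();
        Map<String, Integer> emitted = new TreeMap<>(); // stands in for context.write()
        try {
            for (String line : lines) {
                mapper.map(line);
            }
        } finally {
            mapper.cleanup(emitted::put);
        }
        return emitted;
    }

    public static void main(String[] args) {
        // prints {a=2, b=2, c=1}
        System.out.println(runTask(Arrays.asList("a b a", "b c")));
    }
}
```

In this sketch each key is emitted once per task instead of once per occurrence, which is where we expect the speed-up to come from. The cost is that the whole buffer has to fit in the task's memory.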
1) Does the MapReduce paradigm guarantee that cleanup() will always be called before the reducer starts?
2) Is cleanup() meant strictly for cleaning up resources that are no longer needed?
3) We understand that the HashMap can grow and that this could cause memory issues, but for the sake of argument let's say the memory requirements are manageable.
Please let me know. Thanks.