Hi, I was looking for a similar solution, to keep track of the minimum/maximum heap usage in a Hadoop counter. I have currently implemented this as a very generic DoFn that only does work in its initialize() and cleanup() methods. However, this means weaving the DoFn into multiple parts of the pipeline so that it runs on every mapper and reducer. It would be a lot cleaner if there were some way to have initialize() and cleanup() hooks at the pipeline level that are executed on each and every JVM.
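
To make the idea concrete, here is a rough sketch of the bookkeeping such a DoFn could do, written against plain JDK classes only so that it runs without Crunch or Hadoop on the classpath. The class name HeapUsageTracker and its methods are hypothetical, not part of any library. The assumption is that sample() would be called once from initialize() and once from cleanup(), and that the resulting values would then be reported through whatever counter mechanism the job uses.

```java
import java.lang.management.ManagementFactory;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not a Crunch or Hadoop class): remembers the smallest
 * and largest heap usage observed in this JVM across calls to sample().
 */
class HeapUsageTracker {
    private final LongSupplier heapUsedBytes;
    // Until sample() is called at least once, these hold their sentinel values.
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    /** Reads the used heap size from the JVM's MemoryMXBean. */
    HeapUsageTracker() {
        this(() -> ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed());
    }

    /** Allows a different measurement source, for example a fake one in tests. */
    HeapUsageTracker(LongSupplier heapUsedBytes) {
        this.heapUsedBytes = heapUsedBytes;
    }

    /** Takes one measurement and updates the running minimum and maximum. */
    void sample() {
        long used = heapUsedBytes.getAsLong();
        min = Math.min(min, used);
        max = Math.max(max, used);
    }

    long minBytes() {
        return min;
    }

    long maxBytes() {
        return max;
    }
}
```

Keeping the measurement separate from the DoFn means the same tracker could be driven from a pipeline-level hook, if one existed, without changing the measuring code.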
Best regards, Leen
