On Feb 9, 2008, at 4:21 PM, Jeff Eastman wrote:
I'm trying to wait until close() to output the cluster centroids to the reducer, but the OutputCollector is not available.
You hit on exactly the right solution. Actually, because of Pipes and Streaming, you have a lot more guarantees than you would expect. In particular, you can call output.collect even while the framework is between calls to map or reduce, right up until close() finishes.
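
For illustration, here is a rough sketch of that pattern: save the collector handed to map() in a field, then use it from close(). It is only a sketch under assumptions. The OutputCollector interface below is a hypothetical stand-in that mirrors the shape of org.apache.hadoop.mapred.OutputCollector so the example compiles without Hadoop on the classpath, and CentroidMapper, its running-mean logic, and the "centroid" key are made up for the example rather than taken from the actual clustering code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class CloseOutputSketch {

    // Hypothetical stand-in with the same shape as Hadoop's
    // org.apache.hadoop.mapred.OutputCollector (assumption: used here
    // only so the sketch runs without Hadoop).
    interface OutputCollector<K, V> {
        void collect(K key, V value) throws IOException;
    }

    // Hypothetical mapper that accumulates points and emits one
    // centroid (here just the mean) when close() is called.
    static class CentroidMapper {
        private OutputCollector<String, Double> output; // saved from map()
        private double sum;
        private long count;

        public void map(String key, Double value,
                        OutputCollector<String, Double> output) throws IOException {
            this.output = output; // remember the collector for close()
            sum += value;
            count++;
        }

        public void close() throws IOException {
            // output is null if map() was never called, so guard against that.
            if (output != null && count > 0) {
                output.collect("centroid", sum / count);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> emitted = new ArrayList<>();
        OutputCollector<String, Double> collector =
                (k, v) -> emitted.add(k + "=" + v);

        CentroidMapper mapper = new CentroidMapper();
        mapper.map("p1", 1.0, collector);
        mapper.map("p2", 3.0, collector);
        mapper.close(); // nothing is emitted until here

        System.out.println(emitted); // prints [centroid=2.0]
    }
}
```

In a real job the class would implement the Hadoop Mapper interface and the framework would supply the collector. One thing to watch is that close() sees no collector at all if map() was never called for that task, which is why the sketch checks for null.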
-- Owen
