I am faced with a similar problem. I want to process an entire set of bugs, including their entire history, once. Then I want to incrementally process a combination of the latest output plus the changes since the last run.
I hit upon a way of handling multiple inputs. Perhaps if there were something in the data format that told you where each record came from, the same mapper could process both inputs and the reducer could merge them?

Michael Toback
SMTS, VMWare
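
In case it helps to make that idea concrete, below is a minimal sketch of the tagging approach using the org.apache.hadoop.mapreduce API. Everything specific in it is an assumption for illustration rather than anything from this thread: the directory layout (the previous run's output under a path containing "previous", the deltas under a second path), the tab-separated record format with the bug ID as the first field, and the class names (BugMergeJob, TaggingMapper, MergingReducer). The mapper derives a source tag from the input path, so a single mapper class can read both inputs, and the reducer lets the delta records win over the previously computed output.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BugMergeJob {

    // One mapper handles both inputs; it prefixes each record with a tag
    // derived from the input path so the reducer knows where it came from.
    public static class TaggingMapper
            extends Mapper<LongWritable, Text, Text, Text> {
        private String tag;

        @Override
        protected void setup(Context context) {
            Path path = ((FileSplit) context.getInputSplit()).getPath();
            // Assumes the previous run's output lives under a "previous" directory.
            tag = path.toString().contains("previous") ? "PREV" : "DELTA";
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumes the bug ID is the first tab-separated field.
            String[] fields = value.toString().split("\t", 2);
            if (fields.length < 2) {
                return; // skip malformed lines
            }
            context.write(new Text(fields[0]), new Text(tag + "\t" + fields[1]));
        }
    }

    // The reducer sees all tagged records for one bug ID and merges them,
    // letting DELTA records (changes since the last run) override the old output.
    public static class MergingReducer
            extends Reducer<Text, Text, Text, Text> {
        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            String merged = null;
            for (Text value : values) {
                String[] tagged = value.toString().split("\t", 2);
                if (merged == null || "DELTA".equals(tagged[0])) {
                    merged = tagged[1]; // changes since the last run win
                }
            }
            if (merged != null) {
                context.write(key, new Text(merged));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(),
                "merge previous output with deltas");
        job.setJarByClass(BugMergeJob.class);
        job.setMapperClass(TaggingMapper.class);
        job.setReducerClass(MergingReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0])); // previous output
        FileInputFormat.addInputPath(job, new Path(args[1])); // new changes
        FileOutputFormat.setOutputPath(job, new Path(args[2]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

An alternative would be MultipleInputs with a separate mapper per source, but tagging inside one mapper keeps it closest to the suggestion above: the "where did this come from" marker travels with the data, and the reducer is the only place that needs to know how to merge.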
