You could write your own reducer in the new paradigm. The documents within a row are ordered by record id, so the first one in could be your primedoc document, if that's what you are after.

That said, I think a better approach would be to implement a secondary sort in Hadoop to enforce the record id ordering in MapReduce, so you don't have to buffer the whole row. I could implement that in the mapreduce lib in Blur; just create an issue and I will give it a try.

Aaron

On Mon, May 20, 2013 at 8:37 AM, Tim Williams <[email protected]> wrote:

> In the move to outputformat, I don't see how we get our "last chance"
> to fiddle with the indexed docs like we do today with the reducer
> approach (e.g. documentsToIndex(..) Is that right? My current usage
> of documentsToIndex is likely flawed in the new "temporary index"
> paradigm anyway because I kinda depend on them being buffered, so i
> reckon I'd have to come up with something different anyway...
>
> --tim
>
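
For what it's worth, here is a rough sketch of the secondary sort idea in plain Java, with no Hadoop or Blur dependency. Every name in it (SecondarySortSketch, RowRecordKey, partition, shuffle) is made up for illustration and is not Blur's API. The assumption is a composite key of (row id, record id): sort on both parts, but partition and group on the row id only. In a real job those three pieces would be plugged in through Job.setSortComparatorClass, Job.setPartitionerClass and Job.setGroupingComparatorClass.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of a secondary sort. All names are hypothetical.
class SecondarySortSketch {

    // Composite key: the natural key (row id) plus the field to order by (record id).
    static final class RowRecordKey {
        final String rowId;
        final String recordId;

        RowRecordKey(String rowId, String recordId) {
            this.rowId = rowId;
            this.recordId = recordId;
        }
    }

    // Sort order: row id first, then record id (lexicographic on the strings).
    // In Hadoop this ordering would be applied by the framework during the shuffle.
    static final Comparator<RowRecordKey> SORT_ORDER =
            Comparator.comparing((RowRecordKey k) -> k.rowId)
                      .thenComparing(k -> k.recordId);

    // Partitioning looks at the row id only, so every record of a row
    // ends up at the same reducer.
    static int partition(RowRecordKey key, int numReducers) {
        return (key.rowId.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Simulates sort + grouping: sort on the full composite key, then group on
    // row id only. The lists exist only to show the result; a real reducer would
    // walk the already-ordered values one at a time instead of buffering a row.
    static Map<String, List<String>> shuffle(List<RowRecordKey> keys) {
        List<RowRecordKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_ORDER);
        Map<String, List<String>> rows = new LinkedHashMap<>();
        for (RowRecordKey k : sorted) {
            rows.computeIfAbsent(k.rowId, r -> new ArrayList<>()).add(k.recordId);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<RowRecordKey> keys = List.of(
                new RowRecordKey("row2", "rec1"),
                new RowRecordKey("row1", "rec3"),
                new RowRecordKey("row1", "rec1"),
                new RowRecordKey("row1", "rec2"));
        // prints {row1=[rec1, rec2, rec3], row2=[rec1]}
        System.out.println(shuffle(keys));
    }
}
```

With the ordering done by the framework, the reducer sees the records of a row in record id order and can treat the first one as the primedoc without holding the rest in memory.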
