Re: Reducer to outputformat

Aaron McCurry Mon, 20 May 2013 06:14:37 -0700

You could write your own reducer in the new paradigm.  The documents
within a row are ordered by record id, so the first one in could be your
primedoc document, if that's what you are after.

That said, I think a better approach would be to implement a secondary sort
in Hadoop to enforce the record id ordering in MapReduce, so you don't have
to buffer the whole row.  I could implement that in the MapReduce lib in
Blur; just create an issue and I will give it a try.
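
To make the secondary sort idea concrete, here is a minimal sketch in plain
Java that simulates what the shuffle would do.  It is not Blur code: the
names RowRecordKey, SORT_ORDER, sameRow and firstRecordPerRow are made up
for illustration.  In a real Hadoop job the same pieces would be a composite
WritableComparable key, a Partitioner that looks only at the row id, and
comparators registered with Job.setSortComparatorClass and
Job.setGroupingComparatorClass.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SecondarySortSketch {

    // Hypothetical composite key: the natural key (row id) plus the
    // field we want ordered within each row (record id).
    static final class RowRecordKey {
        final String rowId;
        final String recordId;

        RowRecordKey(String rowId, String recordId) {
            this.rowId = rowId;
            this.recordId = recordId;
        }
    }

    // Sort order: row id first, then record id. This plays the role of
    // the sort comparator applied during the shuffle.
    static final Comparator<RowRecordKey> SORT_ORDER = (a, b) -> {
        int byRow = a.rowId.compareTo(b.rowId);
        return byRow != 0 ? byRow : a.recordId.compareTo(b.recordId);
    };

    // Grouping rule: keys with the same row id go to the same reduce
    // call, regardless of record id.
    static boolean sameRow(RowRecordKey a, RowRecordKey b) {
        return a.rowId.equals(b.rowId);
    }

    // Simulates the reduce side: walk the sorted keys as a stream and
    // pick the first record of each row without buffering the row.
    static List<String> firstRecordPerRow(List<RowRecordKey> keys) {
        List<RowRecordKey> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_ORDER);

        List<String> firsts = new ArrayList<>();
        RowRecordKey previous = null;
        for (RowRecordKey key : sorted) {
            if (previous == null || !sameRow(previous, key)) {
                firsts.add(key.rowId + "/" + key.recordId);
            }
            previous = key;
        }
        return firsts;
    }
}
```

Because the records arrive already ordered by record id within each row,
the reducer only has to remember the previous key, which is the point of
not buffering the whole row.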

Aaron

On Mon, May 20, 2013 at 8:37 AM, Tim Williams <[email protected]> wrote:

> In the move to outputformat, I don't see how we get our "last chance"
> to fiddle with the indexed docs like we do today with the reducer
> approach (e.g. documentsToIndex(..)).  Is that right?  My current usage
> of documentsToIndex is likely flawed in the new "temporary index"
> paradigm anyway because I kinda depend on them being buffered, so I
> reckon I'd have to come up with something different anyway...
>
> --tim
>
