What are the numbers like? Is it 1k rows you need to process? 1M? 10B? Your question is more about scaling (or the need to).
J-D

On Tue, Apr 20, 2010 at 8:39 AM, Andrey <atimerb...@gmx.net> wrote:
> Dear All,
>
> Suppose I have a list of rowIDs of an HBase table. I want to fetch each row
> by its rowID, perform some operations on its values, and then store the
> results somewhere. Is there a good way to do this in a MapReduce manner?
>
> As far as I understand, a mapper usually takes a Scan to form its inputs. It
> is certainly possible to build a Scan containing a RowFilter (EQUAL to a
> particular rowID) for each row. That strategy would work, but it is
> inefficient, since every filter would be evaluated against every scanned row.
>
> So, is there a good MapReduce practice for this kind of situation? (E.g.,
> issuing a Get inside the map() method.) If yes, could you kindly point me to
> a good code example?
>
> Thank you in advance.
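For reference, the Get-inside-map() idea mentioned in the question can be sketched roughly as below. This is not an official example, just one way to do it: the mapper reads rowIDs from a plain text file (one per line) and issues a point Get per row, so only the requested rows are fetched. The table name "mytable" and the column coordinates "f:qual" are placeholders, and the sketch assumes a running HBase cluster plus the HBase client on the job classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input: a text file of rowIDs, one per line (standard TextInputFormat),
// so no Scan over the whole table is needed.
public class GetByRowIdMapper
    extends Mapper<LongWritable, Text, Text, Text> {

  private Connection connection;
  private Table table;

  @Override
  protected void setup(Context context) throws IOException {
    // Open the HBase connection once per task, not once per record.
    Configuration conf = HBaseConfiguration.create(context.getConfiguration());
    connection = ConnectionFactory.createConnection(conf);
    table = connection.getTable(TableName.valueOf("mytable")); // placeholder name
  }

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    String rowId = line.toString().trim();
    if (rowId.isEmpty()) {
      return;
    }
    // Point lookup by rowID instead of a filtered full-table scan.
    Result result = table.get(new Get(Bytes.toBytes(rowId)));
    if (!result.isEmpty()) {
      // Placeholder "processing": emit one column's value ("f:qual" is assumed).
      byte[] value = result.getValue(Bytes.toBytes("f"), Bytes.toBytes("qual"));
      if (value != null) {
        context.write(new Text(rowId), new Text(Bytes.toString(value)));
      }
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    if (table != null) table.close();
    if (connection != null) connection.close();
  }
}
```

Note that each Get is a round trip to a region server, which is why the row count matters so much: for millions of rows you would want to buffer rowIDs and use the batched `table.get(List<Get>)` form, and past some fraction of the table a plain Scan with client-side filtering can become cheaper than many point Gets.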