In my practice, I put the time as the first part of the rowkey, so that I can process only the newly added rows. I don't think this approach is good, and it won't suit other cases, because the rowkey design is so important. I would also like to hear any good ideas.
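To make the time-prefix idea concrete, here is a minimal sketch in plain Java. It does not use the HBase API: a TreeMap stands in for the sorted table, and the names TimePrefixedKeys, rowKey, startKey and rowsSince are made up for illustration. It assumes non-negative millisecond timestamps, zero-padded to 13 digits so that the sorted key order matches time order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

// Hypothetical helper, not part of HBase: shows why a time-prefixed
// rowkey lets a job read only the rows added since the last run.
final class TimePrefixedKeys {

    // Zero-pad the timestamp to 13 digits so that lexicographic order
    // of the keys equals numeric order of the timestamps.
    // Assumption: timestamps are non-negative milliseconds.
    static String rowKey(long timestampMillis, String id) {
        return String.format(Locale.ROOT, "%013d_%s", timestampMillis, id);
    }

    // Smallest possible key for a given time: every row written at or
    // after sinceMillis sorts at or after this prefix.
    static String startKey(long sinceMillis) {
        return String.format(Locale.ROOT, "%013d", sinceMillis);
    }

    // The TreeMap plays the role of the sorted table. Reading from the
    // start key to the end returns only rows with timestamp >= sinceMillis.
    static List<String> rowsSince(TreeMap<String, String> table, long sinceMillis) {
        return new ArrayList<>(table.tailMap(startKey(sinceMillis), true).values());
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put(rowKey(1000L, "a"), "old");
        table.put(rowKey(2000L, "b"), "new1");
        table.put(rowKey(3000L, "c"), "new2");
        // The previous run finished at t=2000, so start reading from there.
        System.out.println(rowsSince(table, 2000L));
    }
}
```

With a real table, the same start key would presumably be passed as the start row of the scan that feeds the MR job. The weakness is the one I worry about above: the key layout is fixed around time, new writes all land at the end of the key space, and rows that are updated later keep their old key, so this only finds newly added rows.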
Another question: how can I remove all rows that were inserted more than three months ago?

On Wed, Mar 4, 2009 at 12:45 AM, Slava Gorelik <[email protected]> wrote:
> Hi. I have a small question about MR jobs. Is it possible to run an MR job on
> part of the table?
> For example, I have an MR job running on a table, and the next time I run this
> job, I want to get only newly added or updated rows.
>
> Thank You and Best Regards.
>
