Yes, but as I understand it, this is not an MR job. This is scanner usage.

Best Regards.
Slava.

On Wed, Mar 4, 2009 at 4:17 AM, schubert zhang <[email protected]> wrote:

> Yes, we can tell the HBase API to scan only rows starting from a given key.
>
> // get rows from startRow to the end of the table
> HTable.getScanner(final byte[][] columns, final byte[] startRow)
>
> // get rows from startRow to the end of the table; only cells with
> // time stamp <= timestamp are retrieved
> HTable.getScanner(final byte[][] columns, final byte[] startRow, long timestamp)
>
> // get row range [startRow, stopRow)
> HTable.getScanner(final byte[][] columns, final byte[] startRow, final byte[] stopRow)
>
> // get row range [startRow, stopRow); only cells with
> // time stamp <= timestamp are retrieved
> HTable.getScanner(final byte[][] columns, final byte[] startRow, final byte[] stopRow, final long timestamp)
>
> Can any expert share ideas about:
> 1. If the rowkey is not chronological, how can I process only the newly
>    added/updated rows?
> 2. How can I remove the old rows that were inserted three months ago?
>
> Schubert
>
> On Wed, Mar 4, 2009 at 3:10 AM, Slava Gorelik <[email protected]> wrote:
>
> > Thank you for the answer. How can you tell MR jobs which rows you want
> > to get? Is it possible to tell an MR job to give me only rows that start
> > with some key?
> >
> > Best Regards.
> > Slava
> >
> > On Tue, Mar 3, 2009 at 7:33 PM, schubert zhang <[email protected]> wrote:
> >
> > > In my practice, I define the 'time' as the first part of the rowkey, so
> > > I can process only the newly added rows.
> > > I think my practice is not good and not appropriate for other cases,
> > > since the rowkey definition is so important.
> > > I would also like to hear any good ideas.
> > >
> > > Another question is, how can I remove all rows that were inserted three
> > > months ago?
> > >
> > > On Wed, Mar 4, 2009 at 12:45 AM, Slava Gorelik <[email protected]> wrote:
> > >
> > > > Hi. I have a small question about MR jobs. Is it possible to run an
> > > > MR job on part of the table?
> > > > For example, I have an MR job running on a table, and the next time I
> > > > run this job, I want to get only newly added or updated rows.
> > > >
> > > > Thank You and Best Regards.
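
To make the time-prefixed rowkey idea from the thread concrete, here is a minimal sketch in plain Java. It does not use the HBase API at all: a sorted in-memory TreeMap stands in for the table, and the class name, key layout (13-digit zero-padded millisecond timestamp, a dash, then an id) and method names are all assumptions made for illustration. It only shows why a [startRow, stopRow) range scan picks out newly added rows, and why old rows form one contiguous key range, when time is the first part of the rowkey.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a table whose rows are sorted by rowkey.
// This is NOT HBase code; a TreeMap only mimics the sorted key order.
class TimePrefixedScanSketch {
    private final NavigableMap<String, String> rows = new TreeMap<>();

    // Assumed key layout: zero-padded millisecond timestamp first, then an id,
    // so that lexicographic key order equals chronological order.
    static String rowKey(long epochMillis, String id) {
        return String.format("%013d-%s", epochMillis, id);
    }

    void put(long epochMillis, String id, String value) {
        rows.put(rowKey(epochMillis, id), value);
    }

    // Returns values for the half-open row range [startRow, stopRow),
    // where both bounds are derived from timestamps.
    List<String> scan(long startMillis, long stopMillis) {
        String startRow = String.format("%013d", startMillis);
        String stopRow = String.format("%013d", stopMillis);
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).values());
    }

    // Drops every row whose key sorts before the cutoff timestamp.
    void removeOlderThan(long cutoffMillis) {
        rows.headMap(String.format("%013d", cutoffMillis), false).clear();
    }

    public static void main(String[] args) {
        TimePrefixedScanSketch table = new TimePrefixedScanSketch();
        table.put(1000L, "a", "old");
        table.put(2000L, "b", "mid");
        table.put(3000L, "c", "new");
        // Process only rows added at or after t=2000.
        System.out.println(table.scan(2000L, 4000L));
        // Purge rows older than t=2000, then list what remains.
        table.removeOlderThan(2000L);
        System.out.println(table.scan(0L, 4000L));
    }
}
```

The sketch relies entirely on the key being chronological. If the rowkey is not time-first, as in question 1 above, neither the range scan nor the purge applies as written.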
