Perhaps you would be interested in the tableindexed package. (It's in the transactional contrib; see the docs in o.a.h.h.client.tableindexed, or look at the tests.)
It will allow you to get a scanner whose results are ordered by a column's values (if you have an index on that column).

-clint

On Thu, Sep 10, 2009 at 5:49 AM, Keith Thomas <[email protected]> wrote:

> stack-3 wrote:
> >
> > On Wed, Sep 9, 2009 at 7:52 PM, Keith Thomas <[email protected]> wrote:
> >
> >> I think I'm looking at the same problem with HBase as Dingding Ye. I need
> >> to be able to retrieve a list of rows sorted by data in a column and I'm
> >> not sure how to go about it without resorting to performing the sort on
> >> the client, which feels like I'm just giving up.
> >
> > s3> You want to sort rows in the table by other than the row key, or is it
> > s3> just that you want to sort the content of a row by other than its
> > s3> column name?
> >
> > I want to sort by the content of a column in each row.
> >
> > s3> How big is the set you want to look at? Is it full table or some
> > s3> subset of rows?
> >
> > I am writing the data access layer, not the app itself. I have to conform
> > to a certain API. It is up to the application itself to use certain
> > limits, although I may impose configurable limits in my layer just to be
> > conservative in this brave new world I am exploring. Ideally I'd like to
> > be able to do both, i.e. retrieve a full table or a subset. I think that
> > once I've written the full table support I'll worry about collecting just
> > a subset.
> >
> >> My current thinking is to create a map class that outputs key/value pairs
> >> where the key is the field I want to sort upon and the value is row key.
> >> This way I will get nice sorted input going into my reduce class. I guess
> >> I would have to have one reduce class instance.
> >
> > s3> Why one reduce? Write your own partitioner and impose a total order?
> >
> > Thanks, I will read up on this, thanks for the direction.
> >
> >> However, I am unclear how I can return the row keys and the families with
> >> their column data to the client from the reduce class. All the examples I
> >> have found so far write the results to files/tables whereas I want to
> >> return objects to a client.
> >
> > s3> Yeah.... bit tough making your client into a reduce sink (Can be done,
> > s3> it just has to be available to the full cluster)
> >
> > I guess the thing I'm definitely completely stuck upon is how to get
> > something like Result back to the client when I'm writing my own
> > map/reduce classes.
> >
> >> In the Hadoop Javadocs I notice a bunch of Comparators but as yet I've
> >> not figured out their purpose. If I spend the cycles understanding the
> >> purpose of these Comparators are they likely to be of help to me in
> >> formulating an alternate/better approach to that described above?
> >
> > s3> In HBase all is lexicographically ordered. Tables are ordered by
> > s3> rows. Row content is ordered by columns.
> >
> > Thanks
> >
> > St.Ack
> >
> >> --
> >> View this message in context:
> >> http://www.nabble.com/Possible-to-set-the-results%27-sort-method--tp20047852p25376341.html
> >> Sent from the HBase User mailing list archive at Nabble.com.
>
> --
> View this message in context:
> http://www.nabble.com/Possible-to-set-the-results%27-sort-method--tp20047852p25382714.html
> Sent from the HBase User mailing list archive at Nabble.com.
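
To make the index idea concrete, here is a minimal sketch in plain Java of how a secondary index gives you rows ordered by a column's value. It is not the tableindexed API: the class and method names (ColumnIndexSketch, buildIndex, rowKeysOrderedByColumn) are made up for illustration, and a TreeMap stands in for a sorted HBase table. The assumption is that the index key is the column value followed by the primary row key, so that a plain in-order scan of the index yields the primary row keys sorted by that column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of HBase or the tableindexed contrib.
public class ColumnIndexSketch {
    // Separator between the indexed value and the row key.
    // Assumption: this character never occurs inside a column value.
    private static final char SEP = '\u0000';

    /**
     * Builds a sorted "index table" from (row key -> column value) pairs.
     * Index key = column value + SEP + row key; index value = row key.
     * Appending the row key keeps index keys unique when values repeat.
     */
    static TreeMap<String, String> buildIndex(Map<String, String> rowKeyToColumnValue) {
        TreeMap<String, String> index = new TreeMap<>();
        for (Map.Entry<String, String> e : rowKeyToColumnValue.entrySet()) {
            index.put(e.getValue() + SEP + e.getKey(), e.getKey());
        }
        return index;
    }

    /**
     * "Scans" the index in key order and returns the primary row keys,
     * which come out ordered by column value (ties broken by row key).
     */
    static List<String> rowKeysOrderedByColumn(TreeMap<String, String> index) {
        return new ArrayList<>(index.values());
    }
}
```

For example, with rows row1=pear, row2=apple and row3=mango, rowKeysOrderedByColumn(buildIndex(table)) returns [row2, row3, row1]. Because the ordering is lexicographic, numeric values would need a fixed-width or otherwise order-preserving encoding, since "10" sorts before "9" as a string.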
