On Wed, Sep 9, 2009 at 7:52 PM, Keith Thomas <[email protected]> wrote:
>
> I think I'm looking at the same problem with HBase as Dingding Ye. I need to
> be able to retrieve a list of rows sorted by data in a column and I'm not
> sure how to go about it without resorting to performing the sort on the
> client which feels like I'm just giving up.
>

Do you want to sort rows in the table by something other than the row key, or
is it just that you want to sort the content of a row by something other than
its column name? How big is the set you want to look at? Is it the full table
or some subset of rows?

> My current thinking is to create a map class that outputs key/value pairs
> where the key is the field I want to sort upon and the value is row key.
> This way I will get nice sorted input going into my reduce class. I guess I
> would have to have one reduce class instance.
>

Why one reduce? Write your own partitioner and impose a total order?

> However, I am unclear how I can return the row keys and the families with
> their column data to the client from the reduce class. All the examples I
> have found so far write the results to files/tables whereas I want to return
> objects to a client.
>

Yeah.... bit tough making your client into a reduce sink (it can be done, the
client just has to be reachable from the full cluster).

> In the Hadoop Javadocs I notice a bunch of Comparators but as yet I've not
> figured out their purpose. If I spend the cycles understanding the purpose of
> these Comparators are they likely to be of help to me in formulating an
> alternate/better approach to that described above?
>

In HBase all is lexicographically ordered. Tables are ordered by rows. Row
content is ordered by columns.

St.Ack

> --
> View this message in context:
> http://www.nabble.com/Possible-to-set-the-results%27-sort-method--tp20047852p25376341.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
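To make the partitioner suggestion above concrete, here is a rough sketch of
the idea in plain Python. It only simulates the shuffle; it is not Hadoop or
HBase code, and the names range_partition, total_order_sort, and the split
points are made up for illustration. The assumption is that you pick sorted
split points ahead of time (for example by sampling the sort field), route
each key to the reducer that owns its range, and let each reducer sort its own
share. Reading the reducer outputs in reducer order then gives one total
order, so more than one reduce can be used. Keys are compared as raw bytes,
which is the same lexicographic comparison HBase applies to row keys.

```python
from bisect import bisect_right


def range_partition(key, split_points):
    """Return the reducer index for `key`.

    `split_points` must be sorted. Keys below the first split point go to
    reducer 0; keys at or above the last split point go to the last reducer.
    """
    return bisect_right(split_points, key)


def total_order_sort(pairs, split_points):
    """Simulate map -> partition -> reduce for (sort_field, row_key) pairs."""
    # One bucket per reducer: len(split_points) + 1 key ranges.
    buckets = [[] for _ in range(len(split_points) + 1)]

    # "Shuffle": send each pair to the reducer that owns its key range.
    for sort_field, row_key in pairs:
        buckets[range_partition(sort_field, split_points)].append(
            (sort_field, row_key)
        )

    # "Reduce": each reducer sorts only its own range.
    for bucket in buckets:
        bucket.sort()

    # Concatenating reducer outputs in reducer order yields a total order.
    return [pair for bucket in buckets for pair in bucket]


if __name__ == "__main__":
    # Hypothetical data: the key is the field to sort on, the value is the row key.
    pairs = [
        (b"pear", b"row3"),
        (b"apple", b"row1"),
        (b"zebra", b"row2"),
        (b"mango", b"row4"),
    ]
    for sort_field, row_key in total_order_sort(pairs, [b"h", b"q"]):
        print(sort_field.decode(), row_key.decode())
```

How well this balances depends entirely on the split points: if they do not
match the real distribution of the sort field, one reducer ends up with most
of the data.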
