On Wed, Sep 9, 2009 at 7:52 PM, Keith Thomas <[email protected]> wrote:

>
> I think I'm looking at the same problem with HBase as Dingding Ye. I need to
> be able to retrieve a list of rows sorted by data in a column, and I'm not
> sure how to go about it without resorting to performing the sort on the
> client, which feels like I'm just giving up.
>
>

Do you want to sort rows in the table by something other than the row key, or
do you just want to sort the content of a row by something other than its
column name?

How big is the set you want to look at?  Is it full table or some subset of
rows?



> My current thinking is to create a map class that outputs key/value pairs
> where the key is the field I want to sort upon and the value is row key.
> This way I will get nice sorted input going into my reduce class. I guess I
> would have to have one reduce class instance.
>

Why one reduce?  You could write your own partitioner and impose a total order
across several reducers.
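
A rough sketch of the partitioner idea, in plain Java rather than against the
Hadoop Partitioner interface: pick sorted split points, and send each key to
the reducer whose range contains it. Each reducer then sorts its own range,
and concatenating the reducer outputs in partition order gives a total order.
The class name, the split points "h" and "p", and the sample keys are all
made up for illustration. Hadoop also ships a TotalOrderPartitioner that
works along these lines, so check that before rolling your own.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; this is NOT Hadoop's Partitioner API.
final class RangePartitionerSketch {
    // Sorted boundaries between partitions; n split points give n + 1 partitions.
    private final String[] splitPoints;

    RangePartitionerSketch(String... splitPoints) {
        this.splitPoints = splitPoints.clone();
        Arrays.sort(this.splitPoints);
    }

    int numPartitions() {
        return splitPoints.length + 1;
    }

    // Keys below splitPoints[0] go to partition 0; keys in
    // [splitPoints[i-1], splitPoints[i]) go to partition i; keys at or
    // above the last split point go to the last partition.
    int getPartition(String key) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        RangePartitionerSketch p = new RangePartitionerSketch("h", "p");
        for (String key : new String[] {"apple", "hbase", "mango", "zebra"}) {
            System.out.println(key + " -> reducer " + p.getPartition(key));
        }
    }
}
```

The split points need to divide the key space roughly evenly, otherwise one
reducer ends up with most of the data.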



>
> However, I am unclear how I can return the row keys and the families with
> their column data to the client from the reduce class. All the examples I
> have found so far write the results to files/tables whereas I want to return
> objects to a client.
>

Yeah, it's a bit tough making your client into a reduce sink. (It can be done,
but the client has to be available to the full cluster.)



>
> In the Hadoop Javadocs I notice a bunch of Comparators but as yet I've not
> figured out their purpose. If I spend the cycles understanding the purpose of
> these Comparators, are they likely to be of help to me in formulating an
> alternate/better approach to that described above?
>


In HBase everything is ordered lexicographically.  Tables are ordered by row
key.  Row content is ordered by column name.
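
To illustrate what lexicographic ordering means for key design, here is a
small stand-alone sketch that uses a java.util.TreeMap as a stand-in for a
sorted table. It is not HBase code: HBase compares raw bytes, while TreeMap
compares Java strings, which agree for plain ASCII keys like these. The class
and method names and the 10-digit padding width are assumptions made for the
example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: a TreeMap standing in for a lexicographically sorted table.
final class LexOrderSketch {
    // Zero-pad a non-negative number to a fixed width so that string order
    // matches numeric order.
    static String pad(long n) {
        return String.format("%010d", n);
    }

    // Insert the keys into a sorted map and return them in iteration order,
    // which is the order a scan over a sorted table would see them.
    static List<String> sortedKeys(String... keys) {
        TreeMap<String, String> table = new TreeMap<>();
        for (String k : keys) {
            table.put(k, "row data");
        }
        return new ArrayList<>(table.keySet());
    }

    public static void main(String[] args) {
        // Unpadded numeric keys sort as text, not as numbers.
        System.out.println(sortedKeys("9", "10", "100"));
        // Padded keys come back in numeric order.
        System.out.println(sortedKeys(pad(9), pad(10), pad(100)));
    }
}
```

With unpadded keys, "10" and "100" sort before "9". If you build a key from
the field you want to sort on, a fixed-width encoding like this keeps the
scan order meaningful.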

St.Ack


>
> --
> View this message in context:
> http://www.nabble.com/Possible-to-set-the-results%27-sort-method--tp20047852p25376341.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
>
