In your simple example, you can probably use the LIMIT clause, but for more 
advanced cases, here's a patch for a ROW_SEQUENCE UDF (not committed to trunk 
yet):

https://issues.apache.org/jira/browse/HIVE-1304

The caveat is that since we don't actually have a SQL/OLAP implementation yet, 
you have to use ORDER BY at the nested query level (rather than the OVER clause 
level where it belongs) and cross your fingers.

JVS

On May 25, 2010, at 12:13 PM, Kortni Smith wrote:

Hi,

Is there a hive equivalent to Oracle’s rownum, row_number() or the ability to 
loop through a resultset?

I have been struggling to create a hive query that will give me max X records, 
per something, when sorted by something.  For example, I have book data, 
multiple records for any given isbn, and want the lowest 5 priced books per 
isbn.

I can accomplish this in oracle with the following:

select isbn, price from
(
select isbn, price, row_number() over (partition by isbn order by price asc) rn
from kstest
)
where rn <= 5;

Any  ideas would be greatly appreciated.
Thank you,

Kortni Smith | Software Developer
AbeBooks.com  <http://www.abebooks.com/> Passion for books.

[email protected]<mailto:[email protected]>
phone: 250.412.3272  |  fax: 250.475.6014

Suite 500 - 655 Tyee Rd. Victoria, BC. Canada V9A 6X5

www.abebooks.com  |  www.abebooks.co.uk  |  www.abebooks.de
www.abebooks.fr  |  www.abebooks.it  |  www.iberlibro.com


Reply via email to