[jira] Commented: (HBASE-1605) TableInputFormat should support 'limit'

Chris K Wensel (JIRA) Thu, 02 Jul 2009 11:32:11 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726607#action_12726607
 ]


Chris K Wensel commented on HBASE-1605:
---------------------------------------

Good questions.

In SQL, LIMIT returns the first N rows of the result set and is typically used 
with OFFSET to allow pagination.

In Cascading, the Limit Operation only allows each task to see N/M rows, where 
M is the number of tasks (accounting for remainders). There is no notion of 
OFFSET, as Limit in this case is really used for unit/integration testing or 
sampling.
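
To make the N/M arithmetic concrete, here is a rough sketch of how a total 
limit could be divided across tasks. The function name per_task_limits and the 
choice to hand the remainder to the first tasks are my own assumptions for 
illustration; this is not Cascading's actual code.

```python
def per_task_limits(total_limit, num_tasks):
    """Split a total row limit across tasks as evenly as possible.

    Hypothetical helper: the first (total_limit % num_tasks) tasks each
    take one extra row so the quotas sum exactly to total_limit.
    """
    if num_tasks <= 0:
        raise ValueError("num_tasks must be positive")
    if total_limit < 0:
        raise ValueError("total_limit must be non-negative")
    base, extra = divmod(total_limit, num_tasks)
    return [base + (1 if i < extra else 0) for i in range(num_tasks)]


if __name__ == "__main__":
    # N = 10 rows over M = 3 tasks
    print(per_task_limits(10, 3))
```

The quotas always sum to N, so the total is respected no matter how many 
tasks run, but which rows get through depends on what each task happens to 
read.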

Re HBase, you guys should choose the model that makes the most sense for 
typical HBase consumer applications. What I'm after is an even load across 
many mappers while, orthogonally, limiting the total number of rows processed.

Having this work with a Filter would also be very nice, i.e. give me 1k rows 
that satisfy this condition. But I guess if I want the first 1k rows that 
satisfy the filter, we might be limited to a single region (and a single 
mapper, as I see the code now).

So maybe there are two modes: sample and result. Sample returns N 'random' 
rows (the top N/M from each of the M regions). Result returns N ordered rows 
(ordered by virtue of coming from a single region).
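
A toy sketch of the two modes, to show the difference in what comes back. 
Regions are modeled as plain Python lists of row keys already in key order; 
sample_limit and result_limit are hypothetical names, and nothing here touches 
the real HBase or TableInputFormat API.

```python
from itertools import chain, islice


def sample_limit(regions, n):
    """'sample' mode: take the top N/M rows from each of the M regions.

    The remainder goes to the first regions. If a region holds fewer rows
    than its quota, fewer than n rows come back overall.
    """
    if not regions:
        return []
    base, extra = divmod(n, len(regions))
    rows = []
    for i, region in enumerate(regions):
        quota = base + (1 if i < extra else 0)
        rows.extend(region[:quota])
    return rows


def result_limit(regions, n):
    """'result' mode: take the first n rows in overall key order.

    Assumes the regions are listed in key order, so reading them one after
    another yields ordered rows; this is the single-reader case.
    """
    return list(islice(chain.from_iterable(regions), n))


if __name__ == "__main__":
    regions = [["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2", "c3"]]
    print(sample_limit(regions, 4))
    print(result_limit(regions, 4))
```

Sample mode spreads the reads over every region, so it can run on many 
mappers at once. Result mode drains the regions in order and stops early, 
which is why it looks like a single-mapper job.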

Anyway, just throwing that out there. The current use case would be happy with 
either, though 'result' is probably the most useful when coupled with 
HBASE-1172.




> TableInputFormat should support 'limit'
> ---------------------------------------
>
>                 Key: HBASE-1605
>                 URL: https://issues.apache.org/jira/browse/HBASE-1605
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Chris K Wensel
>
> Would be useful if TableInputFormat could be passed a 'limit' property value 
> that limited the total result set to the value of 'limit'.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
