[ 
https://issues.apache.org/jira/browse/HBASE-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872949#comment-13872949
 ] 

Jimmy Xiang commented on HBASE-9343:
------------------------------------

bq. 1. If start row, end row and limit not specified, then the whole table will 
be scanned.
This does not sound good. If the table is huge, are we still going to return 
the whole table? I was wondering whether we should have a default maximum data 
size: if the cap is reached, return what we have so far with a flag saying there 
is more data to fetch. Without a cap, the REST server can easily run out of 
memory (OOM).
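
To make the capping idea concrete, here is a minimal sketch. It is not taken from the patch: the function name, the `max_bytes` default, the row shape and the `more` flag are all hypothetical, chosen only to illustrate "return what we have so far plus a flag".

```python
from typing import Iterable, List, Tuple

# Hypothetical row shape for illustration: (row key, serialized value).
Row = Tuple[bytes, bytes]


def capped_scan(rows: Iterable[Row], max_bytes: int = 1024) -> Tuple[List[Row], bool]:
    """Collect rows until adding the next one would exceed max_bytes.

    Returns (rows_so_far, more). `more` is True when the cap cut the scan
    short, so the caller knows there is more data to fetch.
    """
    collected: List[Row] = []
    used = 0
    for key, value in rows:
        size = len(key) + len(value)
        # Always admit the first row so an oversized row cannot stall the scan.
        if collected and used + size > max_bytes:
            return collected, True
        collected.append((key, value))
        used += size
    return collected, False
```

A client that sees `more == True` could issue a follow-up scan starting just after the last row key it received, which keeps the server side stateless.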

> Implement stateless scanner for Stargate
> ----------------------------------------
>
>                 Key: HBASE-9343
>                 URL: https://issues.apache.org/jira/browse/HBASE-9343
>             Project: HBase
>          Issue Type: Improvement
>          Components: REST
>    Affects Versions: 0.94.11
>            Reporter: Vandana Ayyalasomayajula
>            Assignee: Vandana Ayyalasomayajula
>            Priority: Minor
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-9343_94.00.patch, HBASE-9343_94.01.patch, 
> HBASE-9343_trunk.00.patch, HBASE-9343_trunk.01.patch, 
> HBASE-9343_trunk.01.patch, HBASE-9343_trunk.02.patch, 
> HBASE-9343_trunk.03.patch, HBASE-9343_trunk.04.patch, 
> HBASE-9343_trunk.05.patch
>
>
> The current scanner implementation stores state and is hence not 
> well suited to REST server failure scenarios. This JIRA proposes to 
> implement a stateless scanner. In the first version of the patch, a new 
> resource class "ScanResource" has been added, and all the scan parameters are 
> specified as query params. 
> The following are the scan parameters:
> startrow - The start row for the scan.
> endrow - The end row for the scan.
> columns - The columns to scan.
> starttime, endtime - To retrieve only columns within a specific range of 
> version timestamps; both start and end time must be specified.
> maxversions - To limit the number of versions of each column to be returned.
> batchsize - To limit the maximum number of values returned for each call to 
> next().
> limit - The number of rows to return in the scan operation.
> More on the start row, end row and limit parameters:
> 1. If start row, end row and limit are not specified, then the whole table 
> will be scanned.
> 2. If start row and limit (say N) are specified, then the scan operation will 
> return N rows from the start row specified.
> 3. If only the limit parameter (say N) is specified, then the scan operation 
> will return N rows from the start of the table.
> 4. If limit (say N) and end row are specified, then the scan operation will 
> return N rows from the start of the table up to the end row. If the end row 
> is reached before N rows (say after M rows, with M < N), then M rows will be 
> returned to the user.
> 5. If start row, end row and limit (say N) are specified and N < the number 
> of rows between start row and end row, then N rows from the start row 
> will be returned to the user. If N > the number of rows between start row 
> and end row (say M), then M rows will be returned to the user.
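
The five cases above reduce to one rule: take rows in key order from the start row (or the start of the table) and stop at the end row or after N rows, whichever comes first. A rough sketch of that rule over an in-memory list of row keys follows. The function name is hypothetical, and treating the end row as exclusive is an assumption (it matches the usual HBase Scan stop-row convention, but the description does not say so).

```python
from typing import Iterable, List, Optional


def select_rows(
    keys: Iterable[str],
    startrow: Optional[str] = None,
    endrow: Optional[str] = None,
    limit: Optional[int] = None,
) -> List[str]:
    """Return the row keys a scan with these parameters would visit.

    Assumption: endrow is exclusive. A parameter left as None is treated
    as "not specified".
    """
    selected: List[str] = []
    for key in sorted(keys):
        if startrow is not None and key < startrow:
            continue  # before the start row
        if endrow is not None and key >= endrow:
            break  # reached the end row
        if limit is not None and len(selected) >= limit:
            break  # already returned N rows
        selected.append(key)
    return selected
```

For example, with row keys `a` through `e`, `select_rows(keys, startrow="b", endrow="e", limit=10)` yields the three rows `b`, `c`, `d` (case 5 with M < N), while calling it with no parameters returns the whole table (case 1).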



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
