[ https://issues.apache.org/jira/browse/HBASE-13099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14336521#comment-14336521 ]
stack commented on HBASE-13099:
-------------------------------
We use the state of the Result (null, empty) to flag the state of the scan on
the client side. [~jonathan.lawlor] is now adding a 'partial' flag on Result to
do 'chunking': it indicates that the Result holds only part of a row, which a
client probably doesn't care about but the running Scan does (this flag is
overloaded).
Where would we tag on the LastEvaluatedKey? Would it just be the last KV in
the Result? Could the client-side scan read this and use it when going back to
the server?
It would be good to decouple client and server.
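To make the question concrete, here is a rough sketch of what a stateless, resumable scan could look like. It is not HBase code: the table is a plain in-memory TreeMap, and Page, scanPage, scanAll and the byte accounting are all made-up names for illustration. The only assumption carried over from the issue is that the "LastEvaluatedKey" is simply the last row key returned, and that the client hands it back as an exclusive start key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class StatelessScanSketch {

    /** Hypothetical response: one page of rows plus the key to resume after (null when done). */
    static final class Page {
        final NavigableMap<String, String> rows;
        final String lastEvaluatedKey;

        Page(NavigableMap<String, String> rows, String lastEvaluatedKey) {
            this.rows = rows;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** "Server side": keeps no per-scan state; everything it needs arrives with the request. */
    static Page scanPage(NavigableMap<String, String> table, String exclusiveStartKey, int maxBytes) {
        NavigableMap<String, String> remaining =
            exclusiveStartKey == null ? table : table.tailMap(exclusiveStartKey, false);
        NavigableMap<String, String> rows = new TreeMap<>();
        int bytes = 0;
        String lastKey = null;
        for (Map.Entry<String, String> e : remaining.entrySet()) {
            int size = e.getKey().length() + e.getValue().length();
            // Always return at least one row so the client makes progress.
            if (!rows.isEmpty() && bytes + size > maxBytes) {
                return new Page(rows, lastKey); // more data left: tell the client where we stopped
            }
            rows.put(e.getKey(), e.getValue());
            bytes += size;
            lastKey = e.getKey();
        }
        return new Page(rows, null); // ran off the end of the table: scan is complete
    }

    /** "Client side": the only scan state is the last key seen, fed back on the next call. */
    static List<String> scanAll(NavigableMap<String, String> table, int maxBytes) {
        List<String> keys = new ArrayList<>();
        String startKey = null;
        do {
            Page page = scanPage(table, startKey, maxBytes);
            keys.addAll(page.rows.keySet());
            startKey = page.lastEvaluatedKey;
        } while (startKey != null);
        return keys;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 10; i++) {
            table.put("row" + i, "value" + i);
        }
        System.out.println(scanAll(table, 25));
    }
}
```

In this sketch a null lastEvaluatedKey is the only "done" signal, so the null/empty Result states would not need to be overloaded for that purpose. Whether the last KV is a fine enough resume point when a row is split into partials is a separate question the sketch does not answer.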
On the server side, when a lease expires, we do this to clean up outstanding
region scanners:
  @Override
  public synchronized void close() {
    if (storeHeap != null) {
      storeHeap.close();
      storeHeap = null;
    }
    if (joinedHeap != null) {
      joinedHeap.close();
      joinedHeap = null;
    }
    // no need to synchronize here.
    scannerReadPoints.remove(this);
    this.filterClosed = true;
  }
We probably need to keep the above, or at least revisit it. A server-side timer
on the scanner that returns once we've done "10 seconds" or "1MB" of work is
coming up in other issues. The server-side lease-checking facility might be the
place to do this: it already tries to clean up expired server-side scanners,
and it could periodically check how far outstanding scans have got. It is
probably better, though, to rip out the lease checking and move the checks into
the region scanner itself. The scanner knows where it is, so rather than have a
foreign thread interrupt it, it can interrupt itself (this works unless the
scanner gets stuck, but I'd guess a Lease interrupting a running scanner
probably doesn't work either).
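A minimal sketch of the "scanner checks itself" idea, under stated assumptions: this is not the RegionScanner API. Batch, nextBatch and the injected clock are hypothetical names, and cells are just byte arrays. The scanner looks at its own size and time budget between cells and returns early with a "more to come" marker, so no foreign thread has to interrupt it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

public class SelfLimitingScannerSketch {

    /** Hypothetical result of one call: the cells gathered and whether a limit cut the call short. */
    static final class Batch {
        final List<byte[]> cells;
        final boolean stoppedOnLimit;

        Batch(List<byte[]> cells, boolean stoppedOnLimit) {
            this.cells = cells;
            this.stoppedOnLimit = stoppedOnLimit;
        }
    }

    /**
     * Pulls cells from the source until the size or time budget is used up.
     * The clock is passed in (for example System::nanoTime) so the check can be tested.
     */
    static Batch nextBatch(Iterator<byte[]> source, long maxBytes, long maxNanos, LongSupplier clock) {
        long start = clock.getAsLong();
        List<byte[]> out = new ArrayList<>();
        long bytes = 0;
        while (source.hasNext()) {
            // Check the budget between cells, but always return at least one cell.
            if (!out.isEmpty()
                && (bytes >= maxBytes || clock.getAsLong() - start >= maxNanos)) {
                return new Batch(out, true);
            }
            byte[] cell = source.next();
            out.add(cell);
            bytes += cell.length;
        }
        return new Batch(out, false); // source exhausted within budget
    }

    public static void main(String[] args) {
        List<byte[]> cells = Arrays.asList(new byte[400], new byte[400], new byte[400], new byte[400]);
        Iterator<byte[]> it = cells.iterator();
        // Roughly "1KB or 10 seconds", whichever comes first.
        Batch b = nextBatch(it, 1024, 10_000_000_000L, System::nanoTime);
        System.out.println(b.cells.size() + " cells, stoppedOnLimit=" + b.stoppedOnLimit);
    }
}
```

As noted above, this only helps while the scanner keeps reaching the check; a scanner stuck inside a single next() would never see its own deadline.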
> Scans as in DynamoDB
> --------------------
>
> Key: HBASE-13099
> URL: https://issues.apache.org/jira/browse/HBASE-13099
> Project: HBase
> Issue Type: Brainstorming
> Components: Client, regionserver
> Reporter: Nicolas Liochon
>
> cc: [[email protected]] - as discussed offline.
> DynamoDB has a very simple way to manage scans server side:
> ??citation??
> The data returned from a Query or Scan operation is limited to 1 MB; this
> means that if you scan a table that has more than 1 MB of data, you'll need
> to perform another Scan operation to continue to the next 1 MB of data in the
> table.
> If you query or scan for specific attributes that match values that amount to
> more than 1 MB of data, you'll need to perform another Query or Scan request
> for the next 1 MB of data. To do this, take the LastEvaluatedKey value from
> the previous request, and use that value as the ExclusiveStartKey in the next
> request. This will let you progressively query or scan for new data in 1 MB
> increments.
> When the entire result set from a Query or Scan has been processed, the
> LastEvaluatedKey is null. This indicates that the result set is complete
> (i.e. the operation processed the “last page” of data).
> ??citation??
> This means that there is no state server side: the work is done client side.