[ https://issues.apache.org/jira/browse/ACCUMULO-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845480#comment-13845480 ]
Chris McCubbin commented on ACCUMULO-261:
-----------------------------------------
I'm running into the need for this setting yet again. I have an iterator
stack that is expensive to re-seek. Sometimes I want all the results ("bulk")
and sometimes I only want a few ("top-k"). There is no good "one size fits
all" table.scan.max.memory setting for this case. If I set it small, the
re-seek overhead kills performance on the bulk scan. If I set it large, the
scan looks ahead over far too many entries for the top-k use case, and
performance is again poor.
A related problem is that one can only call "setBatchSize" on Scanners, not on
BatchScanners.
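To make the trade-off concrete, here is a toy cost model in Java. It is not
Accumulo code. The class name, the method, and the cost numbers are all
hypothetical, and it assumes that each returned batch costs one re-seek, that
every entry read costs a fixed amount, and that the table holds at least as
many entries as are read.

```java
// Toy cost model only; nothing here calls Accumulo. All names and numbers are
// hypothetical and chosen just to illustrate the trade-off.
final class ScanCostSketch {

    /**
     * Estimated server-side work for a scan that needs `wanted` entries when
     * the server fills batches of `batchEntries` entries. Each batch pays one
     * re-seek; every entry read (including look-ahead past `wanted`) pays
     * `entryCost`.
     */
    static long cost(long wanted, long batchEntries, long reseekCost, long entryCost) {
        long batches = (wanted + batchEntries - 1) / batchEntries; // ceiling division
        long entriesRead = batches * batchEntries;                 // whole batches are read
        return batches * reseekCost + entriesRead * entryCost;
    }

    public static void main(String[] args) {
        long reseek = 1000;  // assumed cost of re-seeking the iterator stack
        long perEntry = 1;   // assumed cost of reading one entry
        long[] batchSizes = {10, 10_000};
        for (long b : batchSizes) {
            System.out.println("batch=" + b
                    + " bulk=" + cost(1_000_000, b, reseek, perEntry)
                    + " topK=" + cost(10, b, reseek, perEntry));
        }
    }
}
```

Under these assumed numbers the small batch is far cheaper for the top-k scan
and far more expensive for the bulk scan, and the large batch is the reverse,
which is why a single table-wide value cannot serve both.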
> Scanner should support batch size specified in bytes
> ----------------------------------------------------
>
> Key: ACCUMULO-261
> URL: https://issues.apache.org/jira/browse/ACCUMULO-261
> Project: Accumulo
> Issue Type: New Feature
> Components: client
> Reporter: John Vines
>
> Currently the scanner allows a user to set the batch size as a number of
> entries. Unfortunately this isn't very useful if entry sizes vary widely and
> you want to keep your internal footprint within a threshold. So we should
> also allow users to set the batch size as a maximum number of bytes to bring
> back.
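
As an illustration of what a byte-based batch size could mean, here is a
minimal Java sketch that groups entries into batches capped by total bytes
instead of entry count. It is not the Accumulo implementation or API. The
class and method names are hypothetical, and entries are modeled as plain
byte arrays.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Accumulo: batches entries by total bytes.
final class ByteBoundedBatcher {

    /**
     * Splits `entries` into consecutive batches whose total size is at most
     * `maxBytes`. An entry larger than `maxBytes` is returned in a batch of
     * its own so that the scan always makes progress.
     */
    static List<List<byte[]>> batch(List<byte[]> entries, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long currentBytes = 0;
        for (byte[] entry : entries) {
            // Close the current batch if adding this entry would exceed the cap.
            if (!current.isEmpty() && currentBytes + entry.length > maxBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(entry);
            currentBytes += entry.length;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With a count-based batch size, the memory held per batch grows with the
largest entries; with a cap like this one it stays near `maxBytes` except when
a single entry is itself larger than the cap.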