[ https://issues.apache.org/jira/browse/HBASE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267220#comment-13267220 ]
Hudson commented on HBASE-1996:
-------------------------------

Integrated in HBase-TRUNK-security #190 (See [https://builds.apache.org/job/HBase-TRUNK-security/190/])
    HBASE-2214 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly (Ferdy Galema) (Revision 1333122)

     Result = SUCCESS
tedyu :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java
* /hbase/trunk/src/main/protobuf/Client.proto
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java


> Configure scanner buffer in bytes instead of number of rows
> -----------------------------------------------------------
>
>                 Key: HBASE-1996
>                 URL: https://issues.apache.org/jira/browse/HBASE-1996
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Dave Latham
>            Assignee: Dave Latham
>             Fix For: 0.90.0
>
>         Attachments: 1966.patch, 1996-0.20.3-v2.patch, 1996-0.20.3-v3.patch, 1996-0.20.3.patch
>
>
> Currently, the default scanner fetches a single row at a time.
> This makes for very slow scans on tables where the rows are not large. You can
> change the setting for an HTable instance or for each Scan.
> It would be better to have a default that performs reasonably well so that
> people stop running into slow scans because they are evaluating HBase, aren't
> familiar with the setting, or simply forgot. Unfortunately, if we increase
> the value of the current setting, then we run the risk of running OOM for
> tables with large rows. Let's change the setting so that it works with a
> size in bytes, rather than in rows. This will allow us to set a reasonable
> default so that tables with small rows will scan performantly and tables with
> large rows will not run OOM.
> Note that the case is very similar to table writes as well. When disabling
> auto flush, we buffer a list of Puts to commit at once. That buffer is
> measured in bytes, so that a small number of large Puts or a lot of small
> Puts can each fit in a single flush. If that buffer were measured in number
> of Puts, it would have the same problem that we have for the scan buffer, and
> we wouldn't be able to set a good default value for tables with different
> size rows. Changing the scan buffer to be configured like the write buffer
> will make it more consistent.
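
The trade-off described in the issue can be illustrated with a small, self-contained sketch. It is not HBase code: the class `ScanBufferSketch`, both method names, and the "stop before the limit is exceeded, but always return at least one row" policy are assumptions made for illustration only. It contrasts a buffer limited by row count with one limited by bytes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the idea in this issue; not taken from HBase.
public class ScanBufferSketch {

    // Old-style limit: take up to maxRows rows, regardless of their size.
    public static List<byte[]> nextBatchByRows(List<byte[]> rows, int start, int maxRows) {
        List<byte[]> batch = new ArrayList<>();
        for (int i = start; i < rows.size() && batch.size() < maxRows; i++) {
            batch.add(rows.get(i));
        }
        return batch;
    }

    // Byte-based limit: keep adding rows while the batch stays within maxBytes.
    // Assumed policy: always return at least one row so the scan makes progress.
    public static List<byte[]> nextBatchByBytes(List<byte[]> rows, int start, long maxBytes) {
        List<byte[]> batch = new ArrayList<>();
        long used = 0;
        for (int i = start; i < rows.size(); i++) {
            byte[] row = rows.get(i);
            if (!batch.isEmpty() && used + row.length > maxBytes) {
                break;
            }
            batch.add(row);
            used += row.length;
        }
        return batch;
    }

    public static void main(String[] args) {
        List<byte[]> smallRows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) smallRows.add(new byte[10]);

        List<byte[]> largeRows = new ArrayList<>();
        for (int i = 0; i < 5; i++) largeRows.add(new byte[4096]);

        long limit = 1024; // one byte budget serves both tables
        System.out.println("small rows per batch: " + nextBatchByBytes(smallRows, 0, limit).size());
        System.out.println("large rows per batch: " + nextBatchByBytes(largeRows, 0, limit).size());
    }
}
```

With a single byte budget, the table of small rows gets many rows per round trip while the table of large rows is capped near the budget, which is the property a row-count default cannot provide.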