[ https://issues.apache.org/jira/browse/HBASE-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Gray updated HBASE-1481:
---------------------------------

    Attachment: HBASE-1481-v1.patch

Patch adds a new filter called FirstKeyOnlyFilter.

It's extremely simple, but this does generally accomplish what we want. The only further optimizations to row counting I can think of:

- prevent sending back even an entire KV per row (all we really need is the count, but this breaks the API)
- once we work at issues like HBASE-1517, we should seek to the next row after we look at the first KV (if we have a million columns in a row, we don't need to iterate all of them to do a row count)

The latter issue gets me thinking about what filters could do to push that kind of information to the QueryMatcher....

> Add fast row key only scanning
> ------------------------------
>
>                 Key: HBASE-1481
>                 URL: https://issues.apache.org/jira/browse/HBASE-1481
>             Project: Hadoop HBase
>          Issue Type: Improvement
>    Affects Versions: 0.19.3
>            Reporter: Lars George
>            Priority: Minor
>             Fix For: 0.21.0
>
>         Attachments: HBASE-1481-v1.patch
>
>
> Instead of requiring a user to set up a scanner with any column and scan the
> table to gather all row keys while ignoring the column value we should have a
> fast and lightweight scanner that for example takes a "null" for the column
> list and then simply returns only the matching keys of all non-empty or
> deleted rows. Filters should still be applicable.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
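
The difference between the two row-counting strategies discussed in the comment above (keep only the first KV of each row but still iterate every column, versus seek straight to the next row) can be sketched with a toy model. This is not HBase code and does not reproduce the attached patch: `RowCountSketch`, `Cell`, `Stats`, `countIterating`, `countWithSeek`, and `seekToNextRow` are hypothetical names chosen for illustration, and the binary search only stands in for whatever index-based seek a real store would use.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical stand-ins, not HBase classes.
final class RowCountSketch {

    // A cell identified by row key and column; values do not matter for counting.
    static final class Cell {
        final String row;
        final String column;
        Cell(String row, String column) { this.row = row; this.column = column; }
    }

    // Number of distinct rows seen and number of cells the scan had to touch.
    static final class Stats {
        final int rows;
        final int cellsExamined;
        Stats(int rows, int cellsExamined) { this.rows = rows; this.cellsExamined = cellsExamined; }
    }

    // First-key-only behaviour: count a row on its first cell, but still
    // walk over every remaining cell of that row.
    static Stats countIterating(List<Cell> sortedCells) {
        int rows = 0;
        int examined = 0;
        String currentRow = null;
        for (Cell c : sortedCells) {
            examined++;
            if (!c.row.equals(currentRow)) {
                currentRow = c.row;
                rows++;
            }
        }
        return new Stats(rows, examined);
    }

    // Seek variant: after the first cell of a row, jump directly to the next row.
    static Stats countWithSeek(List<Cell> sortedCells) {
        int rows = 0;
        int examined = 0;
        int i = 0;
        while (i < sortedCells.size()) {
            examined++;
            rows++;
            i = seekToNextRow(sortedCells, i, sortedCells.get(i).row);
        }
        return new Stats(rows, examined);
    }

    // Binary search for the first index after 'from' whose row key is greater
    // than 'row'. Assumption: this models an index lookup, so its comparisons
    // are not counted as examined cells.
    static int seekToNextRow(List<Cell> sortedCells, int from, String row) {
        int lo = from + 1;
        int hi = sortedCells.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedCells.get(mid).row.compareTo(row) <= 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Three rows with 3, 1 and 4 columns, already sorted by row then column.
    static List<Cell> sample() {
        List<Cell> cells = new ArrayList<>();
        for (String col : new String[] {"a", "b", "c"}) cells.add(new Cell("r1", col));
        cells.add(new Cell("r2", "a"));
        for (String col : new String[] {"a", "b", "c", "d"}) cells.add(new Cell("r3", col));
        return cells;
    }

    public static void main(String[] args) {
        Stats iter = countIterating(sample());
        Stats seek = countWithSeek(sample());
        System.out.println("iterating: rows=" + iter.rows + " examined=" + iter.cellsExamined);
        System.out.println("seeking:   rows=" + seek.rows + " examined=" + seek.cellsExamined);
    }
}
```

In this model both strategies report the same row count; only the number of cells touched differs, which is the saving the comment anticipates for very wide rows.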