[
https://issues.apache.org/jira/browse/HBASE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988390#action_12988390
]
Jonathan Gray commented on HBASE-3488:
--------------------------------------
So would the idea be to count columns, or versions of columns, rather than
rows? As I recall, most of the row-counting code uses FirstKeyOnlyFilter and
is optimized to count unique rows, regardless of whether they have one version
of one column or a million versions of a million columns.
Also, I don't recommend the {{Result.getMap()}} API. It's a convenience method
but it's not especially performant (it iterates all the keys, parses stuff,
allocates new byte[]s, and builds up the map). Instead you should just use
{{Result.raw()}} and operate on the array of KeyValues it returns.
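
To illustrate working on the flat KeyValue array instead of building the
nested map, here is a minimal sketch. It deliberately does not use the HBase
classes: {{Cell}} is a hypothetical stand-in for KeyValue, and {{count}} is an
illustrative helper, not part of RowCounter. It assumes the cells of one row
arrive with all versions of a column adjacent to each other, which is the
sorted order I would expect from a Result.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FlatCellCount {

    /** Hypothetical stand-in for an HBase KeyValue: one version of one column. */
    static final class Cell {
        final byte[] family;
        final byte[] qualifier;
        final long timestamp;

        Cell(String family, String qualifier, long timestamp) {
            this.family = family.getBytes(StandardCharsets.UTF_8);
            this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
            this.timestamp = timestamp;
        }
    }

    /**
     * Returns {distinct columns, total versions} for the cells of one row.
     * Assumes cells with the same family and qualifier are adjacent.
     */
    static int[] count(Cell[] cells) {
        int columns = 0;
        Cell prev = null;
        for (Cell c : cells) {
            // A new column starts whenever family or qualifier changes.
            boolean sameColumn = prev != null
                    && Arrays.equals(prev.family, c.family)
                    && Arrays.equals(prev.qualifier, c.qualifier);
            if (!sameColumn) {
                columns++;
            }
            prev = c;
        }
        // Every cell is one version, so the array length is the version count.
        return new int[] {columns, cells.length};
    }

    public static void main(String[] args) {
        Cell[] row = {
            new Cell("f", "a", 3L),
            new Cell("f", "a", 2L),
            new Cell("f", "b", 5L),
        };
        int[] counts = count(row);
        System.out.println("columns=" + counts[0] + " versions=" + counts[1]);
    }
}
```

A single pass over the array answers both the "columns" and the "versions of
columns" question without allocating any intermediate maps, which is the
reason to prefer it over the nested-map view.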
> Allow RowCounter to retrieve multiple versions of rows
> ------------------------------------------------------
>
> Key: HBASE-3488
> URL: https://issues.apache.org/jira/browse/HBASE-3488
> Project: HBase
> Issue Type: Bug
> Components: util
> Affects Versions: 0.90.0
> Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> Currently RowCounter retrieves only the latest version of each row.
> Some applications store multiple versions of the same row.
> RowCounter should accept a new parameter for the number of versions to return.
> The Scan object would be configured with this version parameter.
> Then the following API should be called:
> {code}
> public NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long,
> byte[]>>> getMap() {
> {code}