[ https://issues.apache.org/jira/browse/HBASE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008522#comment-13008522 ]
Subbu M Iyer commented on HBASE-3488:
-------------------------------------
Test Setup
For simplicity, let's assume that we have a table with the following data:
R1 -> CF1 with columns c1-c5 (5 cells, all with 3 versions)
R2 -> CF1 with column c1 (1 cell, one version)
      CF2 with columns c1, c2 (2 cells: c1 = 2 versions, c2 = 3 versions)
      CF3 with columns c1-c3 (3 cells, one version each)
      CF4 with columns c1-c4 (4 cells, one version each)
R3 -> CF1 with columns c1-c9 (9 cells, one version each)
      CF2 with columns c10-c20 (10 cells, one version each)
Running the CellCounter program will print the following stats:
Total number of rows = 3
Number of distinct CFs = CF1-CF4 = 4
Number of distinct cells = c1-c20 = 20
Total number of cells (across all rows and CFs) = 34
Avg number of CFs per row = 4/3 = 1.33
Avg number of cells per CF = 34/4 = 8.5
Versions:
CF1:c1 = 3 versions
CF1:c2 = 3 versions
CF1:c3 = 3 versions
CF1:c4 = 3 versions
CF1:c5 = 3 versions
CF2:c1 = 2 versions
CF2:c2 = 3 versions
All other CF:c combinations = 1 version.
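The arithmetic above can be checked with a small stand-alone sketch in plain Java. It does not use the HBase client API or the actual CellCounter code; the class CellCounterSketch and its helpers are hypothetical names made up for illustration. Two assumptions are baked in: R3's CF2 is modelled as ten columns (c11-c20) to match the stated count of 10 cells, and the version count reported for a CF:column pair is taken to be the maximum seen across rows.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class CellCounterSketch {

    /** One logical cell: row key, column family, qualifier, stored version count. */
    static final class Cell {
        final String row, family, qualifier;
        final int versions;
        Cell(String row, String family, String qualifier, int versions) {
            this.row = row; this.family = family;
            this.qualifier = qualifier; this.versions = versions;
        }
    }

    /** Adds cells c<from>..c<to> for one row and family, each with the same version count. */
    static void addRange(List<Cell> cells, String row, String family,
                         int from, int to, int versions) {
        for (int i = from; i <= to; i++) {
            cells.add(new Cell(row, family, "c" + i, versions));
        }
    }

    /** Builds the sample table described in the test setup. */
    static List<Cell> sampleTable() {
        List<Cell> cells = new ArrayList<>();
        addRange(cells, "R1", "CF1", 1, 5, 3);
        addRange(cells, "R2", "CF1", 1, 1, 1);
        cells.add(new Cell("R2", "CF2", "c1", 2));
        cells.add(new Cell("R2", "CF2", "c2", 3));
        addRange(cells, "R2", "CF3", 1, 3, 1);
        addRange(cells, "R2", "CF4", 1, 4, 1);
        addRange(cells, "R3", "CF1", 1, 9, 1);
        addRange(cells, "R3", "CF2", 11, 20, 1); // assumption: ten columns, c11-c20
        return cells;
    }

    /** Number of distinct row keys. */
    static int rowCount(List<Cell> cells) {
        Set<String> rows = new HashSet<>();
        for (Cell c : cells) rows.add(c.row);
        return rows.size();
    }

    /** Number of distinct column families across all rows. */
    static int familyCount(List<Cell> cells) {
        Set<String> families = new HashSet<>();
        for (Cell c : cells) families.add(c.family);
        return families.size();
    }

    /** Maximum version count per "CF:qualifier" across all rows (assumed interpretation). */
    static Map<String, Integer> maxVersions(List<Cell> cells) {
        Map<String, Integer> result = new TreeMap<>();
        for (Cell c : cells) {
            result.merge(c.family + ":" + c.qualifier, c.versions, Math::max);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Cell> cells = sampleTable();
        int rows = rowCount(cells);
        int families = familyCount(cells);
        System.out.println("Total number of rows = " + rows);
        System.out.println("Number of distinct CFs = " + families);
        System.out.println("Total number of cells = " + cells.size());
        System.out.println("Avg number of CFs per row = " + families / (double) rows);
        System.out.println("Avg number of cells per CF = " + cells.size() / (double) families);
        maxVersions(cells).forEach((col, v) -> {
            if (v > 1) System.out.println(col + " = " + v + " versions");
        });
    }
}
```

Note that the averages here follow the definitions used above (distinct CFs divided by rows, total cells divided by distinct CFs), not a per-row or per-CF mean.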
Ted: Is this your expectation? Please let me know.
> Allow RowCounter to retrieve multiple versions of rows
> ------------------------------------------------------
>
> Key: HBASE-3488
> URL: https://issues.apache.org/jira/browse/HBASE-3488
> Project: HBase
> Issue Type: Bug
> Components: util
> Affects Versions: 0.90.0
> Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> Currently RowCounter only retrieves the latest version for each row.
> Some applications store multiple versions for the same row.
> RowCounter should accept a new parameter for the number of versions to return.
> The Scan object would be configured with the version parameter (for scan.maxVersions).
> Then the following API should be called:
> {code}
> public KeyValue[] raw() {
> {code}