[ https://issues.apache.org/jira/browse/TRAFODION-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655959#comment-16655959 ]
ASF GitHub Bot commented on TRAFODION-3223:
-------------------------------------------
Github user asfgit closed the pull request at:
https://github.com/apache/trafodion/pull/1730
> Row count estimation code works poorly on time-ordered aged-out data
> --------------------------------------------------------------------
>
> Key: TRAFODION-3223
> URL: https://issues.apache.org/jira/browse/TRAFODION-3223
> Project: Apache Trafodion
> Issue Type: Bug
> Components: sql-cmp
> Affects Versions: any
> Reporter: David Wayne Birdsall
> Assignee: David Wayne Birdsall
> Priority: Major
>
> The estimateRowCountBody method in HBaseClient.java samples cells from the
> first 500 rows of the first HFile it sees in order to estimate the number of
> rows in a Trafodion table. If the table has a time-ordered key and data are
> aged out over time, large clumps of "delete" tombstones can accumulate in one
> or more HFiles. If estimateRowCountBody happens to sample such an HFile, it
> incorrectly concludes that most cells are "delete" tombstones and therefore
> drastically underestimates the row count.
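
The failure mode described above can be illustrated with a small standalone model. This is only a sketch, not the Trafodion code: the class RowCountSketch, the methods estimateRows and agedOutFile, and the one-cell-per-row simplification are all hypothetical, and the extrapolation formula is an assumption about how a prefix sample might be scaled up. It models an HFile as an ordered list of cells and compares an estimate taken from the first 500 cells with one taken from the whole file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not the real estimateRowCountBody implementation.
class RowCountSketch {
    enum CellType { PUT, DELETE }

    // Estimate live rows by scaling the fraction of PUT cells seen in the
    // first sampleSize cells up to the whole file (assumes one cell per row).
    static long estimateRows(List<CellType> cells, int sampleSize) {
        int n = Math.min(sampleSize, cells.size());
        if (n == 0) {
            return 0;
        }
        int puts = 0;
        for (int i = 0; i < n; i++) {
            if (cells.get(i) == CellType.PUT) {
                puts++;
            }
        }
        double putFraction = (double) puts / n;
        return Math.round(putFraction * cells.size());
    }

    // Build a file whose oldest keys (the head of the file) have been aged
    // out, leaving a clump of delete tombstones ahead of the live data.
    static List<CellType> agedOutFile(int tombstones, int puts) {
        List<CellType> cells = new ArrayList<>(tombstones + puts);
        for (int i = 0; i < tombstones; i++) {
            cells.add(CellType.DELETE);
        }
        for (int i = 0; i < puts; i++) {
            cells.add(CellType.PUT);
        }
        return cells;
    }

    public static void main(String[] args) {
        List<CellType> file = agedOutFile(1000, 9000);
        System.out.println("estimate from first 500 cells: " + estimateRows(file, 500));
        System.out.println("estimate from whole file:      " + estimateRows(file, file.size()));
    }
}
```

In this model the 500-cell sample sees only tombstones, so the extrapolated estimate collapses even though 9000 live rows follow. Sampling from more than one position in the file, or from more than one HFile, would be one way to reduce that bias; whether the actual fix takes that approach is not stated here.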