David Wayne Birdsall created TRAFODION-3223:
-----------------------------------------------
Summary: Row count estimation code works poorly on time-ordered
aged-out data
Key: TRAFODION-3223
URL: https://issues.apache.org/jira/browse/TRAFODION-3223
Project: Apache Trafodion
Issue Type: Bug
Components: sql-cmp
Affects Versions: any
Reporter: David Wayne Birdsall
Assignee: David Wayne Birdsall
The estimateRowCountBody method in HBaseClient.java estimates the number of
rows in a Trafodion table by sampling cells from the first 500 rows of the
first HFile it sees. If the table has a time-ordered key and data are aged out
over time, one or more HFiles can contain large clumps of "delete" tombstones.
If estimateRowCountBody happens to sample such an HFile, it incorrectly
concludes that most cells are "delete" tombstones and therefore drastically
underestimates the row count.
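
The bias described above can be illustrated with a small, self-contained
model. This is only a sketch under stated assumptions, not the actual
estimateRowCountBody code: the class RowCountEstimateSketch, its method names,
the fixed four cells per row, and the rule "scale total cells by the sampled
live fraction" are all hypothetical stand-ins chosen to show why sampling only
the front of an HFile misleads when tombstones are clumped there.

```java
import java.util.Arrays;

/** Hypothetical model of prefix sampling; NOT the Trafodion implementation. */
final class RowCountEstimateSketch {

    /**
     * Scales the file's total cell count by the fraction of sampled cells
     * that are live (not tombstones), then divides by cells per row.
     * This estimation rule is an assumption made for illustration.
     */
    static long estimateRows(long totalCells, long sampledCells,
                             long sampledTombstones, int cellsPerRow) {
        if (sampledCells <= 0 || cellsPerRow <= 0) {
            throw new IllegalArgumentException(
                "sampledCells and cellsPerRow must be positive");
        }
        double liveFraction =
            (double) (sampledCells - sampledTombstones) / sampledCells;
        return Math.round(totalCells * liveFraction / cellsPerRow);
    }

    /** Counts tombstones among the first sampleSize cells (true = tombstone). */
    static long countTombstonesInPrefix(boolean[] isTombstone, int sampleSize) {
        int limit = Math.min(sampleSize, isTombstone.length);
        long count = 0;
        for (int i = 0; i < limit; i++) {
            if (isTombstone[i]) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int cellsPerRow = 4;                     // assumed column count
        boolean[] cells = new boolean[400_000];  // one modeled HFile
        // Aged-out data on a time-ordered key: tombstones clump at the front.
        Arrays.fill(cells, 0, 1_900, true);

        // Sample only the cells of the first 500 rows, as the issue describes.
        int sampleSize = 500 * cellsPerRow;
        long seen = countTombstonesInPrefix(cells, sampleSize);
        long prefixEstimate =
            estimateRows(cells.length, sampleSize, seen, cellsPerRow);

        // Same rule applied to every cell in the file, for comparison.
        long allTombstones = countTombstonesInPrefix(cells, cells.length);
        long fullEstimate =
            estimateRows(cells.length, cells.length, allTombstones, cellsPerRow);

        System.out.println("prefix-sample estimate: " + prefixEstimate);
        System.out.println("whole-file estimate:    " + fullEstimate);
    }
}
```

In this model the prefix sample sees 1,900 tombstones among 2,000 cells and
estimates 5,000 rows, while the same rule applied to the whole file gives
99,525 rows. The clump occupies under half a percent of the file yet dominates
the sample.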