David Wayne Birdsall created TRAFODION-3223:
-----------------------------------------------

             Summary: Row count estimation code works poorly on time-ordered aged-out data
                 Key: TRAFODION-3223
                 URL: https://issues.apache.org/jira/browse/TRAFODION-3223
             Project: Apache Trafodion
          Issue Type: Bug
          Components: sql-cmp
    Affects Versions: any
            Reporter: David Wayne Birdsall
            Assignee: David Wayne Birdsall


The estimateRowCountBody method in HBaseClient.java estimates the number of 
rows in a Trafodion table by sampling cells from the first 500 rows of the 
first HFile it sees. If the table has a time-ordered key and data are aged out 
over time, one or more HFiles can contain large clumps of "delete" tombstones. 
If estimateRowCountBody happens to sample such an HFile, it incorrectly 
concludes that most cells in the table are "delete" tombstones and therefore 
drastically underestimates the row count.
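
The failure mode can be illustrated with a small simulation. This is only a 
sketch under stated assumptions, not the actual estimateRowCountBody logic: it 
assumes one cell per row and assumes the estimate is the file's total cell 
count scaled by the fraction of non-delete cells in the sample. All names 
(TombstoneSkewSketch, CellKind, estimateRows, buildFile) are hypothetical and 
do not come from HBaseClient.java.

```java
import java.util.ArrayList;
import java.util.List;

public class TombstoneSkewSketch {
    /** Hypothetical stand-in for a cell in an HFile: a put or a delete tombstone. */
    public enum CellKind { PUT, DELETE }

    /**
     * Assumed estimator (not the real one): look at the first sampleSize cells,
     * compute the fraction that are puts, and scale the total cell count by it.
     */
    public static long estimateRows(List<CellKind> cells, int sampleSize) {
        int n = Math.min(sampleSize, cells.size());
        if (n == 0) {
            return 0;
        }
        int puts = 0;
        for (int i = 0; i < n; i++) {
            if (cells.get(i) == CellKind.PUT) {
                puts++;
            }
        }
        double putFraction = (double) puts / n;
        return Math.round(putFraction * cells.size());
    }

    /** Builds a simulated file with all tombstones clumped ahead of the puts. */
    public static List<CellKind> buildFile(int deletes, int puts) {
        List<CellKind> cells = new ArrayList<>(deletes + puts);
        for (int i = 0; i < deletes; i++) {
            cells.add(CellKind.DELETE);
        }
        for (int i = 0; i < puts; i++) {
            cells.add(CellKind.PUT);
        }
        return cells;
    }

    public static void main(String[] args) {
        // 1,000 tombstones for aged-out keys, followed by 99,000 puts.
        List<CellKind> file = buildFile(1000, 99000);

        // Sampling only the leading 500 cells sees nothing but tombstones.
        System.out.println("first-500 estimate: " + estimateRows(file, 500));

        // Sampling every cell reflects the real put/delete mix.
        System.out.println("whole-file estimate: " + estimateRows(file, file.size()));
    }
}
```

In this simulated file the leading sample contains only tombstones, so the 
estimate collapses to zero even though tombstones make up 1% of the cells. 
Placing the tombstones at the start of the file is itself an assumption, made 
here because aged-out keys in a time-ordered key space are the oldest and 
would sort first.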



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
