Karan Mehta created PHOENIX-4953:
------------------------------------

             Summary: DefaultStatisticsCollector fails to capture the last 
guidepost of every region
                 Key: PHOENIX-4953
                 URL: https://issues.apache.org/jira/browse/PHOENIX-4953
             Project: Phoenix
          Issue Type: Bug
            Reporter: Karan Mehta


The issue was found during a sanity test run when the count of all rows from 
all the guideposts didn't match the actual number of rows in the table. 

The `DefaultStatisticsCollector#collectStatistics()` method iterates over a
list of cells and keeps track of the size of the KVs. If the size exceeds the
guidepost width, it adds an entry to `GuidePostsInfo` using the
`GuidePostsInfoBuilder#addGuidePostOnCollection()` method.

However, for the last batch of rows in a region, which doesn't cross the
GUIDE_POSTS_WIDTH threshold, the code doesn't create any entry using the
Builder class. Ideally, we would cover that scenario by introducing a small
guidepost with the corresponding row key and the size of that guidepost (since
we can persist both to the SYSTEM.STATS table). This is acceptable because
GUIDE_POSTS_WIDTH is only an estimate/best effort for the distribution of
data.
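
To make the gap concrete, here is a minimal, self-contained sketch of the
accumulate-and-emit loop described above. It is not the Phoenix source: the
class `GuidePostSketch`, the `GuidePost` holder, the `flushTail` flag and the
use of a row index in place of a row key are all hypothetical names chosen
for illustration. With `flushTail` set to false the trailing rows are dropped,
which mirrors the reported behavior; with it set to true a final, smaller
guidepost is emitted, as proposed.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not Phoenix classes.
public class GuidePostSketch {

    /** One guidepost: index of its last row, plus the bytes and rows it covers. */
    static final class GuidePost {
        final int lastRowIndex;
        final long byteCount;
        final long rowCount;

        GuidePost(int lastRowIndex, long byteCount, long rowCount) {
            this.lastRowIndex = lastRowIndex;
            this.byteCount = byteCount;
            this.rowCount = rowCount;
        }
    }

    /**
     * Walks the rows of one region in order, emitting a guidepost each time
     * the accumulated size exceeds the width. If flushTail is true, any rows
     * left over at the end get their own (smaller) guidepost.
     */
    static List<GuidePost> collect(long[] rowSizes, long width, boolean flushTail) {
        List<GuidePost> posts = new ArrayList<>();
        long bytes = 0;
        long rows = 0;
        for (int i = 0; i < rowSizes.length; i++) {
            bytes += rowSizes[i];
            rows++;
            if (bytes > width) {
                posts.add(new GuidePost(i, bytes, rows));
                bytes = 0;
                rows = 0;
            }
        }
        // Proposed behavior: do not lose the last partial batch.
        if (flushTail && rows > 0) {
            posts.add(new GuidePost(rowSizes.length - 1, bytes, rows));
        }
        return posts;
    }

    /** Sums the row counts recorded across all guideposts. */
    static long totalRows(List<GuidePost> posts) {
        long total = 0;
        for (GuidePost p : posts) {
            total += p.rowCount;
        }
        return total;
    }
}
```

For five rows of 40 bytes each and a width of 100, `collect(sizes, 100, false)`
yields one guidepost covering 3 rows, so the summed row count is 3 rather
than 5. With `flushTail` set to true, a second guidepost covering the
remaining 2 rows (80 bytes) is added and the total matches the table.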



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
