Bin Shi created PHOENIX-4916:
--------------------------------
Summary: When collecting statistics, the estimated size of a guide
post may only count part of cells of the last row
Key: PHOENIX-4916
URL: https://issues.apache.org/jira/browse/PHOENIX-4916
Project: Phoenix
Issue Type: Bug
Reporter: Bin Shi
Assignee: Bin Shi
In DefaultStatisticsCollector.collectStatistics(...), the code iterates over
all cells of the current row. Once the accumulated estimated size plus the
size of the current cell is >= the guide post width, it skips all the
remaining cells. As a result, the estimated size of a guide post may count
only some of the cells of the last row.
This problem can be ignored in clusters with real data, where the guide post
width is much larger than the row size, but it does affect unit tests and
integration tests, because they use a very small guide post width, which
makes the estimated size reported for a query inaccurate.
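To illustrate the behavior described above, here is a minimal, self-contained
sketch. It is not the Phoenix implementation: the class name, the method
names, and the modeling of a row as an array of cell sizes in bytes are all
hypothetical, and the "whole row" variant is only one possible way to address
the problem, not necessarily the fix that will be committed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, NOT Phoenix code: each row is modeled as an array of
// cell sizes in bytes, and each method returns the estimated size of the
// first guide post.
public class GuidePostSizeSketch {

    // Mimics the reported behavior: as soon as the accumulated size plus the
    // current cell reaches the width, the remaining cells of the row are skipped.
    static long sizeSkippingRemainingCells(List<long[]> rows, long guidePostWidth) {
        long size = 0;
        for (long[] row : rows) {
            for (long cellSize : row) {
                size += cellSize;
                if (size >= guidePostWidth) {
                    return size; // later cells of this row are never counted
                }
            }
        }
        return size;
    }

    // Assumed alternative: count every cell of the row first, then compare
    // the accumulated size against the width.
    static long sizeCountingWholeRow(List<long[]> rows, long guidePostWidth) {
        long size = 0;
        for (long[] row : rows) {
            for (long cellSize : row) {
                size += cellSize;
            }
            if (size >= guidePostWidth) {
                return size;
            }
        }
        return size;
    }

    public static void main(String[] args) {
        // Two rows of three 10-byte cells each, with a tiny guide post width
        // like the ones used in tests.
        List<long[]> rows = Arrays.asList(new long[] {10, 10, 10}, new long[] {10, 10, 10});
        System.out.println(sizeSkippingRemainingCells(rows, 15)); // prints 20
        System.out.println(sizeCountingWholeRow(rows, 15));       // prints 30
    }
}
```

With a guide post width of 15 bytes, the first variant reports 20 bytes for
a row that actually holds 30. The error is bounded by the size of one row,
which is why it is negligible when the width is much larger than a row and
noticeable when the width is very small.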
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)