[ https://issues.apache.org/jira/browse/PHOENIX-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Taylor updated PHOENIX-2707:
----------------------------------
Description:
In order to accurately track the rows and bytes being scanned, we need to be
able to differentiate between a table+family that has zero guideposts and one
for which guideposts have not been collected. See PHOENIX-2706, which attempted
to do the same, but in a limited/broken way.
This new approach will do the following instead:
- in DefaultStatisticsCollector.writeStatsToStatsTable(), if there are no stats
to write, write a row with the physicalTable+columnFamily for the row key and a
byteCount and rowCount of zero (in addition to what's being done now).
- otherwise, always write a Delete for the physicalTable+columnFamily row. In
this way, this row remains only if all regions of a given column family have
no guideposts.
- in StatisticsWriter.deleteStats(), always write a Delete marker for the
physicalTable+columnFamily row.
- in BaseResultIterators.getParallelScans(), set areStatsEnabled to true or
false based on whether gps == GuidePostsInfo.EMPTY_GUIDEPOST. When the table
has no rows, getGuidePosts() would instead return a different GuidePostsInfo
(one that indicates there are no rows).
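
The marker-row bookkeeping described above can be sketched as a small
in-memory model. This is not Phoenix code: the class GuidePostMarkerSketch,
its StatsState enum and all method signatures are hypothetical, the "stats
table" is a plain HashMap, and a Delete is assumed to mask the marker row
regardless of the order in which regions report. The description does not
spell out which value areStatsEnabled takes, so the sketch assumes stats are
treated as enabled whenever they have been collected.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of the proposed marker row; hypothetical, not Phoenix code. */
class GuidePostMarkerSketch {
    enum StatsState { NOT_COLLECTED, ZERO_GUIDEPOSTS, HAS_GUIDEPOSTS }

    /** Stand-in for the stats-table rows of one physicalTable+columnFamily. */
    private static final class FamilyStats {
        boolean markerPut;      // some region wrote the zero-count marker row
        boolean markerDeleted;  // some region wrote a Delete for the marker row
        int guidePostCount;     // guideposts written across all regions
    }

    private final Map<String, FamilyStats> statsTable = new HashMap<>();

    private static String rowKey(String table, String family) {
        return table + "/" + family;
    }

    /** Models one region's stats write for one column family. */
    void writeRegionStats(String table, String family, int regionGuidePosts) {
        FamilyStats s = statsTable.computeIfAbsent(rowKey(table, family), k -> new FamilyStats());
        if (regionGuidePosts == 0) {
            s.markerPut = true;          // nothing to write: put the zero-count marker
        } else {
            s.markerDeleted = true;      // otherwise: always delete the marker
            s.guidePostCount += regionGuidePosts;
        }
    }

    /** Models deleting stats; simplified here to dropping everything for the family. */
    void deleteStats(String table, String family) {
        statsTable.remove(rowKey(table, family));
    }

    /** The marker survives only if no region wrote a Delete for it. */
    StatsState state(String table, String family) {
        FamilyStats s = statsTable.get(rowKey(table, family));
        if (s == null) {
            return StatsState.NOT_COLLECTED;
        }
        if (s.guidePostCount > 0) {
            return StatsState.HAS_GUIDEPOSTS;
        }
        return (s.markerPut && !s.markerDeleted)
                ? StatsState.ZERO_GUIDEPOSTS
                : StatsState.NOT_COLLECTED;
    }

    /** Assumption: stats count as enabled whenever they have been collected. */
    boolean areStatsEnabled(String table, String family) {
        return state(table, family) != StatsState.NOT_COLLECTED;
    }

    public static void main(String[] args) {
        GuidePostMarkerSketch stats = new GuidePostMarkerSketch();
        System.out.println("before collection: " + stats.state("T", "CF"));
        stats.writeRegionStats("T", "CF", 0);   // region 1 has no guideposts
        stats.writeRegionStats("T", "CF", 0);   // region 2 has no guideposts
        System.out.println("all regions empty: " + stats.state("T", "CF"));
        stats.writeRegionStats("T", "CF", 3);   // region 3 has guideposts
        System.out.println("one region with guideposts: " + stats.state("T", "CF"));
    }
}
```

The point of the three-valued state is that "collected, but empty" and "never
collected" no longer collapse into the same answer for the scan planner.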
> Differentiate between a table+family have zero guideposts from not having
> collected guideposts
> ----------------------------------------------------------------------------------------------
>
> Key: PHOENIX-2707
> URL: https://issues.apache.org/jira/browse/PHOENIX-2707
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor