[ 
https://issues.apache.org/jira/browse/PHOENIX-2707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Taylor updated PHOENIX-2707:
----------------------------------
    Description: 
In order to accurately track the rows and bytes being scanned, we need to be 
able to differentiate between a table+family having zero guideposts and not 
having collected guideposts. See PHOENIX-2706, which attempted to do the same, 
but in a limited/broken way.

This new approach will do the following instead:
- in DefaultStatisticsCollector.writeStatsToStatsTable(), if there are no stats 
to write, write a row with the physicalTable+columnFamily for the row key and a 
byteCount and rowCount of zero (in addition to what's being done now).
- otherwise, always write a Delete for the physicalTable+columnFamily row. This 
way, the row remains only if all regions of a given column family have no 
guideposts.
- in StatisticsWriter.deleteStats(), always write a Delete marker for the 
physicalTable+columnFamily row.
- in BaseResultIterators.getParallelScans(), set areStatsEnabled to true or 
false based on whether gps==GuidePostsInfo.EMPTY_GUIDEPOST. When the table has 
no rows, getGuidePosts() would instead return a different GuidePostsInfo (one 
that indicates there are no rows). 
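
The marker-row idea in the list above can be sketched with a toy in-memory 
model. This is only an illustration, not Phoenix code: the class 
GuidePostMarkerSketch, its methods, the StatsState enum, and the key layout are 
all hypothetical names chosen for the example, and a TreeMap stands in for the 
stats table. It also assumes the reader treats any existing guidepost row as 
taking precedence over the marker row, so the result does not depend on the 
order in which regions report.

```java
import java.util.Collections;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy model of the marker-row approach; every name here is hypothetical. */
public class GuidePostMarkerSketch {

    /** The three situations a reader needs to tell apart. */
    public enum StatsState { NOT_COLLECTED, COLLECTED_EMPTY, COLLECTED }

    // Stand-in for the stats table: row key -> {byteCount, rowCount}.
    private final TreeMap<String, long[]> rows = new TreeMap<>();

    // Marker row key: physicalTable + columnFamily (separator is an assumption).
    private static String markerKey(String physicalTable, String columnFamily) {
        return physicalTable + '\u0000' + columnFamily;
    }

    // View of all guidepost rows stored under the given table + family.
    private SortedMap<String, long[]> guidePostRows(String table, String family) {
        String prefix = markerKey(table, family);
        return rows.subMap(prefix + '\u0000', prefix + '\u0001');
    }

    /** Models one region writing its collected stats. */
    public void writeStats(String table, String family, Map<String, long[]> guidePosts) {
        if (guidePosts.isEmpty()) {
            // No stats to write: record a marker row with zero byte and row counts.
            rows.put(markerKey(table, family), new long[] {0L, 0L});
            return;
        }
        // Otherwise: delete the marker row, then write the guidepost rows.
        rows.remove(markerKey(table, family));
        SortedMap<String, long[]> target = guidePostRows(table, family);
        for (Map.Entry<String, long[]> e : guidePosts.entrySet()) {
            target.put(markerKey(table, family) + '\u0000' + e.getKey(), e.getValue());
        }
    }

    /** Models deleting stats: guidepost rows and the marker row both go away. */
    public void deleteStats(String table, String family) {
        guidePostRows(table, family).clear();
        rows.remove(markerKey(table, family));
    }

    /** Reader side: guidepost rows win over the marker row (an assumption). */
    public StatsState stateOf(String table, String family) {
        if (!guidePostRows(table, family).isEmpty()) {
            return StatsState.COLLECTED;
        }
        return rows.containsKey(markerKey(table, family))
                ? StatsState.COLLECTED_EMPTY
                : StatsState.NOT_COLLECTED;
    }

    public static void main(String[] args) {
        GuidePostMarkerSketch stats = new GuidePostMarkerSketch();
        System.out.println(stats.stateOf("T", "CF"));
        stats.writeStats("T", "CF", Collections.<String, long[]>emptyMap());
        System.out.println(stats.stateOf("T", "CF"));
        stats.writeStats("T", "CF", Collections.singletonMap("gp1", new long[] {100L, 10L}));
        System.out.println(stats.stateOf("T", "CF"));
    }
}
```

In this model a reader would treat NOT_COLLECTED as "stats not available" and 
COLLECTED_EMPTY as "stats available, zero rows", which is the distinction the 
description asks for.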

  was:In order to accurately track the rows and bytes being scanned, we need to 
be able to differentiate between a table+family have zero guideposts from not 
having collected guideposts. See PHOENIX-2706 which attempted to do the same, 
but in a limited/broken way.


> Differentiate between a table+family have zero guideposts from not having 
> collected guideposts
> ----------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-2707
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2707
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
