[ https://issues.apache.org/jira/browse/PHOENIX-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096913#comment-15096913 ]

Hudson commented on PHOENIX-2143:
---------------------------------

FAILURE: Integrated in Phoenix-master #1074 (See 
[https://builds.apache.org/job/Phoenix-master/1074/])
PHOENIX-2143 Use guidepost bytes instead of region name in stats primary 
(jtaylor: rev 90cf5730058246914e7fc616c43f2837fd499824)
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsScanner.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsUtil.java
* phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataProtocol.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/StatsCollectorWithSplitsAndMultiCFIT.java
* phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixDatabaseMetaData.java
* phoenix-core/src/main/java/org/apache/phoenix/coprocessor/UngroupedAggregateRegionObserver.java
* phoenix-core/src/main/java/org/apache/phoenix/util/MetaDataUtil.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/StatsCollectorIT.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java
* phoenix-core/src/main/java/org/apache/phoenix/util/ScanUtil.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsWriter.java
* phoenix-core/src/main/java/org/apache/phoenix/query/QueryConstants.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/GuidePostsInfo.java
* phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java


> Use guidepost bytes instead of region name in stats primary key
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-2143
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2143
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: Ankit Singhal
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2143.patch, PHOENIX-2143_v2.patch, 
> PHOENIX-2143_v3.patch, PHOENIX-2143_v4.patch, PHOENIX-2143_v4_rebased.patch, 
> PHOENIX-2143_wip.patch, PHOENIX-2143_wip_2.patch
>
>
> Our current SYSTEM.STATS table uses the region name as the last column in the 
> primary key constraint. Instead, we should use the MIN_KEY column (which 
> corresponds to the region start key). The advantage is that the stats 
> would then be ordered by region start key, allowing us to approximate the 
> number of guideposts that would be traversed given the start/stop row of a 
> scan:
> {code}
> SELECT SUM(guide_posts_count) FROM SYSTEM.STATS WHERE min_key > :1 AND 
> min_key < :2
> {code}
> where :1 is the start row and :2 is the stop row of the scan. With an UNNEST 
> operator for ARRAYs, we could get a better approximation.
> As part of the upgrade to the new Phoenix version containing this fix, stats 
> could simply be dropped and they'd be recalculated with the new schema.
> An alternative, even more granular approach would be to *not* use arrays to 
> store the guide posts, but instead store them as individual rows with a 
> schema like this:
> |PHYSICAL_NAME|VARCHAR|
> |COLUMN_FAMILY|VARCHAR|
> |GUIDE_POST_KEY|VARBINARY|
> In this alternative, the maintenance during compaction is higher, though, as 
> you'd need to run a separate query to do the deletion of the old guideposts, 
> followed by a commit of the new guideposts. The other disadvantage (besides 
> requiring multiple queries) is that this couldn't be done transactionally.
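
To illustrate the range query quoted above, here is a minimal sketch in Java. It assumes an in-memory sorted map as a stand-in for SYSTEM.STATS rows keyed by MIN_KEY; the class `GuidePostEstimate` and its methods are hypothetical names for illustration and are not part of Phoenix. Keys are compared as unsigned bytes in lexicographic order, which is how HBase orders row keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical helper, not a Phoenix class: models stats rows as
// MIN_KEY -> GUIDE_POSTS_COUNT, sorted by region start key.
final class GuidePostEstimate {
    // Unsigned lexicographic byte order (requires Java 9+).
    static final Comparator<byte[]> BYTES = Arrays::compareUnsigned;

    static TreeMap<byte[], Long> newStats() {
        return new TreeMap<>(BYTES);
    }

    // Sums the counts of all entries whose key lies strictly between
    // startRow and stopRow, mirroring "min_key > :1 AND min_key < :2".
    static long estimate(TreeMap<byte[], Long> stats, byte[] startRow, byte[] stopRow) {
        long total = 0;
        for (long count : stats.subMap(startRow, false, stopRow, false).values()) {
            total += count;
        }
        return total;
    }

    private static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        TreeMap<byte[], Long> stats = newStats();
        stats.put(key("a"), 10L);
        stats.put(key("g"), 20L);
        stats.put(key("m"), 30L);
        stats.put(key("t"), 40L);
        // Regions starting at "g" and "m" fall inside ("b", "n").
        System.out.println(estimate(stats, key("b"), key("n"))); // prints 50
    }
}
```

Because both bounds are exclusive, the region that contains the start row but begins before it (the one starting at "a" in this example) is not counted, so the result is only an approximation, as the description says.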



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
