[ https://issues.apache.org/jira/browse/PHOENIX-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622368#comment-16622368 ]

ASF GitHub Bot commented on PHOENIX-4008:
-----------------------------------------

Github user BinShi-SecularBird commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/351#discussion_r219249172
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/schema/stats/StatsCollectorIT.java ---
    @@ -736,6 +737,71 @@ public void testEmptyGuidePostGeneratedWhenDataSizeLessThanGPWidth() throws Exce
             }
         }
     
    +    @Test
    +    public void testCollectingAllVersionsOfCells() throws Exception {
    +        String tableName = generateUniqueName();
    +        try (Connection conn = DriverManager.getConnection(getUrl())) {
    +            long guidePostWidth = 70;
    +            String ddl =
    +                    "CREATE TABLE " + tableName + " (k INTEGER PRIMARY KEY, c1.a bigint, c2.b bigint)"
    +                            + " GUIDE_POSTS_WIDTH=" + guidePostWidth
    +                            + ", USE_STATS_FOR_PARALLELIZATION=true" + ", VERSIONS=3";
    +            conn.createStatement().execute(ddl);
    +            conn.createStatement().execute("upsert into " + tableName + " values (100,100,3)");
    +            conn.commit();
    +            conn.createStatement().execute("UPDATE STATISTICS " + tableName);
    +
    +            ConnectionQueryServices queryServices =
    +                    conn.unwrap(PhoenixConnection.class).getQueryServices();
    +
    +            // The table only has one row. All cells just has one version, and the data size of the row
    +            // is less than the guide post width, so we generate empty guide post.
    +            try (Table statsHTable =
    --- End diff --
    
    refactored


> UPDATE STATISTIC should collect all versions of cells
> -----------------------------------------------------
>
>                 Key: PHOENIX-4008
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4008
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Samarth Jain
>            Assignee: Bin Shi
>            Priority: Major
>         Attachments: PHOENIX-4008_0918.patch
>
>
> In order to truly measure the size of data when calculating guide posts, 
> UPDATE STATISTIC should take into account all versions of cells. We should 
> also be setting the max versions on the scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
