GitHub user twdsilva commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/351#discussion_r218663493
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java ---
    @@ -1279,6 +1279,7 @@ private long updateStatisticsInternal(PName physicalName, PTable logicalTable, M
                 MutationPlan plan = compiler.compile(Collections.singletonList(tableRef), null, cfs, null, clientTimeStamp);
                 Scan scan = plan.getContext().getScan();
                 scan.setCacheBlocks(false);
    +            scan.readAllVersions();
    --- End diff --
    
    scan.readAllVersions() isn't used when we run a query with table sampling.
    If a row has 100 versions and you run a query, you will see only the latest
    one, or, if an SCN is set, the last row at the timestamp just before the SCN.
    If the guideposts are calculated using all the versions, they will count
    data that a query never reads, and sampling will be incorrect.
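
    To make the concern concrete, here is a minimal sketch in plain Java. It
    does not use the Phoenix or HBase APIs: the class and method names are
    hypothetical, a TreeMap keyed by timestamp stands in for the versions of one
    cell, and it assumes the SCN acts as an exclusive upper bound on visible
    timestamps. It compares the bytes counted over all versions with the bytes
    a query would actually read.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names exist in Phoenix or HBase.
class VersionVisibilitySketch {

    // Value a reader would see: the newest version when scn is null, otherwise
    // the newest version with a timestamp strictly below scn (assumed semantics).
    static String visibleValue(NavigableMap<Long, String> versions, Long scn) {
        Map.Entry<Long, String> entry =
                (scn == null) ? versions.lastEntry() : versions.lowerEntry(scn);
        return entry == null ? null : entry.getValue();
    }

    // Size a stats pass would record if it scanned every version of the cell.
    // Values are ASCII, so one char is counted as one byte.
    static long bytesAllVersions(NavigableMap<Long, String> versions) {
        long total = 0;
        for (String value : versions.values()) {
            total += value.length();
        }
        return total;
    }

    // Size a stats pass would record if it scanned only what a query can see.
    static long bytesVisible(NavigableMap<Long, String> versions, Long scn) {
        String value = visibleValue(versions, scn);
        return value == null ? 0 : value.length();
    }

    // Builds one cell with 100 versions at timestamps 1..100, 9 bytes each.
    static NavigableMap<Long, String> hundredVersions() {
        NavigableMap<Long, String> versions = new TreeMap<>();
        for (long ts = 1; ts <= 100; ts++) {
            versions.put(ts, String.format("value-%03d", ts));
        }
        return versions;
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> versions = hundredVersions();
        System.out.println("latest:       " + visibleValue(versions, null));
        System.out.println("at SCN 50:    " + visibleValue(versions, 50L));
        System.out.println("all versions: " + bytesAllVersions(versions) + " bytes");
        System.out.println("visible:      " + bytesVisible(versions, null) + " bytes");
    }
}
```

    In this toy model the all-versions pass counts 100 times more data than a
    query reads, so guideposts spaced by that byte count would not match what
    a sampled query sees.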

