Github user twdsilva commented on a diff in the pull request:
https://github.com/apache/phoenix/pull/351#discussion_r218663493
--- Diff: phoenix-core/src/main/java/org/apache/phoenix/schema/MetaDataClient.java ---
    @@ -1279,6 +1279,7 @@ private long updateStatisticsInternal(PName physicalName, PTable logicalTable, M
                 MutationPlan plan = compiler.compile(Collections.singletonList(tableRef), null, cfs, null, clientTimeStamp);
                 Scan scan = plan.getContext().getScan();
                 scan.setCacheBlocks(false);
    +            scan.readAllVersions();
--- End diff --
scan.readAllVersions() isn't used when we run a query with table sampling.
If you have 100 versions of a row and run a query, you will only see the latest
one; if an SCN is set, you will see the last row at the timestamp just before
the SCN. If the guideposts are calculated using all the versions, then sampling
will be incorrect.
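
To make the mismatch concrete, here is a minimal sketch in plain Java. It does not use the HBase or Phoenix APIs: the class VersionVisibilitySketch, its methods, and the TreeMap standing in for one row's versions are all hypothetical, and it assumes the SCN acts as an exclusive upper bound on the timestamp. It only compares how many versions an all-versions scan would count against how many a query would actually return.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only; none of these names exist in Phoenix or HBase.
public class VersionVisibilitySketch {

    // Hypothetical stand-in for one row: timestamp -> value, one entry per version.
    static TreeMap<Long, String> rowWithVersions(int count) {
        TreeMap<Long, String> versions = new TreeMap<>();
        for (long ts = 1; ts <= count; ts++) {
            versions.put(ts, "v" + ts);
        }
        return versions;
    }

    // What a query returns: the latest version, or, when an SCN is set,
    // the newest version with a timestamp strictly below the SCN (assumed exclusive).
    static String visibleValue(TreeMap<Long, String> versions, Long scn) {
        Map.Entry<Long, String> entry =
                (scn == null) ? versions.lastEntry() : versions.lowerEntry(scn);
        return entry == null ? null : entry.getValue();
    }

    // What a stats scan would count if it read every version of the row.
    static int cellsSeenByAllVersionsScan(TreeMap<Long, String> versions) {
        return versions.size();
    }

    // What a query counts for the same row: at most one version.
    static int cellsSeenByQuery(TreeMap<Long, String> versions, Long scn) {
        return visibleValue(versions, scn) == null ? 0 : 1;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> row = rowWithVersions(100);
        System.out.println("all-versions scan counts: " + cellsSeenByAllVersionsScan(row));
        System.out.println("query without SCN counts: " + cellsSeenByQuery(row, null));
        System.out.println("query without SCN sees:   " + visibleValue(row, null));
        System.out.println("query with SCN=51 sees:   " + visibleValue(row, 51L));
    }
}
```

In this model the stats side sees 100 entries for a row that contributes a single result to a query, which is the kind of skew that would throw off guideposts used for sampling.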
---