Daniel Dai created HIVE-17421:
---------------------------------
Summary: Clear incorrect stats after replication
Key: HIVE-17421
URL: https://issues.apache.org/jira/browse/HIVE-17421
Project: Hive
Issue Type: Bug
Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai
After replication, some stats summaries are incorrect. If
hive.compute.query.using.stats is set to true, queries on the destination side
will return wrong results.
This does not happen with bootstrap replication, because the stats summaries
are stored in table properties and are replicated to the destination. In
incremental replication, however, this does not work. When the table is
created, the stats summaries are empty (e.g., numRows=0). Later, when data is
inserted, the stats summaries are updated through
update_table_column_statistics/update_partition_column_statistics, but neither
event is captured in incremental replication. Thus on the destination side,
count(*) returns 0. The simple solution is to remove the
COLUMN_STATS_ACCURATE property after incremental replication.
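
A minimal sketch of the failure and the proposed fix, written as a toy Python
model rather than Hive code. The dict layout, the helper names count_rows and
clear_stats_flag, and the sample property value are assumptions made for
illustration; only the property names numRows and COLUMN_STATS_ACCURATE and
the setting hive.compute.query.using.stats come from the description above.

```python
# Hypothetical model, not Hive code: a table is a dict of properties plus rows.
COLUMN_STATS_ACCURATE = "COLUMN_STATS_ACCURATE"


def count_rows(table, use_stats=True):
    """Mimic hive.compute.query.using.stats: answer count(*) from numRows
    only while the table still claims that its stats are accurate."""
    props = table["properties"]
    if use_stats and COLUMN_STATS_ACCURATE in props:
        return int(props["numRows"])
    return len(table["rows"])  # otherwise fall back to scanning the data


def clear_stats_flag(table):
    """The proposed fix: drop the accuracy marker after an incremental load."""
    table["properties"].pop(COLUMN_STATS_ACCURATE, None)


# Destination table after incremental replication: the rows arrived, but the
# stats update events did not, so numRows still holds its creation-time value.
replica = {
    "properties": {
        "numRows": "0",
        COLUMN_STATS_ACCURATE: '{"BASIC_STATS":"true"}',  # illustrative value
    },
    "rows": [("a",), ("b",), ("c",)],
}

stale = count_rows(replica)   # answered from the stale numRows
clear_stats_flag(replica)
fixed = count_rows(replica)   # answered by scanning the rows
```

In this model the stats are not corrected, only distrusted: once the marker is
gone, the count falls back to reading the data, which is the trade-off the
simple solution accepts.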