[ https://issues.apache.org/jira/browse/HIVE-18395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321833#comment-16321833 ]
Steve Yeom edited comment on HIVE-18395 at 1/11/18 8:02 AM:
------------------------------------------------------------

The above two comments indicate the following two use-case scenarios for a reader statement:

1. The table data directory has a newest transaction id (xid) of, let's say, 10, but the Metastore has stats with latest xid 9 (if we keep the last xid for a table). This indicates that the corresponding Metastore-side transaction to update the stats for data transaction 10 failed. In this case, COLUMN_STATS_ACCURATE may be false, and the compiler can choose an aggregate execution plan (not a plan that uses the stats on the Metastore).

2. A reader statement started and verified during compilation that both the data directory and the Metastore stats have, for example, 20 as the latest xid. But when it starts execution and acquires an SLock on the Metastore stats, it finds that the latest xid is 22. This indicates that a new transaction 22 committed and updated the stats. In this case, to keep the default snapshot isolation level on an acid/MM table, the reader should not use the stats but should execute an aggregate execution plan, unless the Metastore DBMS supports MVCC or snapshot isolation. Recompilation is possibly needed.

was (Author: steveyeom2017):

The above two comments indicate the following two use-case scenarios for a reader statement:

1. The table data directory has a newest transaction id (xid) of, let's say, 10, but the Metastore has stats with latest xid 9 (if we keep the last xid for a table). This indicates that the corresponding Metastore-side transaction to update the stats for data transaction 10 failed. In this case, COLUMN_STATS_ACCURATE may be false, and the compiler can choose an aggregate execution plan (not a plan that uses the stats on the Metastore).

2. A reader statement started and verified during compilation that both the data directory and the Metastore stats have, for example, 20 as the latest xid. But when it starts execution and acquires an SLock on the Metastore stats, it finds that the latest xid is 22. This indicates that a new transaction 22 committed and updated the stats. In this case, to keep the default snapshot isolation level on an acid/MM table, the reader should not use the stats but should execute an aggregate execution plan. Recompilation is possibly needed.

> Using stats for aggregates on Acid/MM is off even with "hive.compute.query.using.stats" is true.
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18395
>                 URL: https://issues.apache.org/jira/browse/HIVE-18395
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Steve Yeom
>            Priority: Minor
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
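
The two reader scenarios described in the comment can be sketched as a pair of checks, one at compile time and one at execution time. This is only an illustrative sketch, not Hive code: the names `Plan`, `choose_at_compile`, and `recheck_at_execution` are hypothetical, and treating any xid mismatch (not only a lagging stats xid) as stale stats is an assumption made for simplicity.

```python
from enum import Enum


class Plan(Enum):
    # Hypothetical plan labels, used only for this sketch.
    USE_METASTORE_STATS = "use_metastore_stats"
    AGGREGATE_SCAN = "aggregate_scan"


def choose_at_compile(data_dir_xid: int, stats_xid: int, stats_accurate: bool) -> Plan:
    """Scenario 1: the stats xid differs from the data directory xid,
    or the stats are flagged as inaccurate, so fall back to an aggregate plan."""
    if not stats_accurate or stats_xid != data_dir_xid:
        return Plan.AGGREGATE_SCAN
    return Plan.USE_METASTORE_STATS


def recheck_at_execution(compiled_xid: int, locked_stats_xid: int,
                         metastore_has_snapshot_isolation: bool) -> Plan:
    """Scenario 2: after acquiring the shared lock, the stats xid may have moved on.
    The stats remain usable only if the xid is unchanged, or if the Metastore DBMS
    itself provides MVCC / snapshot isolation (an assumption passed in as a flag)."""
    if locked_stats_xid == compiled_xid or metastore_has_snapshot_isolation:
        return Plan.USE_METASTORE_STATS
    return Plan.AGGREGATE_SCAN
```

With the numbers from the comment, `choose_at_compile(10, 9, True)` selects the aggregate plan, and `recheck_at_execution(20, 22, False)` does the same because transaction 22 committed between compilation and execution.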