Aggarwal-Raghav commented on PR #5475: URL: https://github.com/apache/hive/pull/5475#issuecomment-2380447766
**Explanation:** When an insert query runs on an Iceberg table, _StatsTask_ makes a _getTable_ call:
https://github.com/apache/hive/blob/d85b87cfd750623d365d39c73df6d58e1220128a/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java#L103

The table returned by this _getTable_ call carries a stale snapshotID. The reason is that in _HiveIcebergMetaHook.java_ the getTable call to IcebergUtil uses skipcache=false, so it returns the cached table with the previous snapshotID, which causes the problem:
https://github.com/apache/hive/blob/d85b87cfd750623d365d39c73df6d58e1220128a/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java#L1067

_For Iceberg tables, DML and DDL operations are no longer disjoint sets, i.e. insert queries also change metadata, so we cannot use the cache here._

Screenshots of the state before and after the fix are attached.

**Before:**
<img width="958" alt="Before_fix_beeline" src="https://github.com/user-attachments/assets/ee11b3be-7911-4bfa-bfa3-f70eb7c6fd97">
<img width="1502" alt="Before_fix_mysql" src="https://github.com/user-attachments/assets/35ca8ada-9d58-47fd-8981-42a2983f8357">

**Even though the latest snapshotID is 4050123687297981987, the backend DB still stores 8504103028255089587.**

**After:**
<img width="1012" alt="After_fix_beeline" src="https://github.com/user-attachments/assets/2a7d2ed9-302b-442f-8d86-ea92840f8943">
<img width="1512" alt="After_fix_mysql" src="https://github.com/user-attachments/assets/7a9d0722-745e-4c39-a990-b15cb9124c72">
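
To illustrate the stale-snapshot behaviour described in the explanation, here is a minimal, self-contained sketch. It is not Hive or Iceberg code: `StaleSnapshotDemo`, `Catalog`, `TableMeta`, `commit`, and the `skipCache` parameter are hypothetical names invented for this example, and the cache is modelled as a plain map. The only assumption carried over from the comment is that a cached lookup can return the table as it looked before the insert committed a new snapshot.

```java
import java.util.HashMap;
import java.util.Map;

public class StaleSnapshotDemo {

    /** Hypothetical stand-in for table metadata; only the snapshot ID matters here. */
    static final class TableMeta {
        final long snapshotId;

        TableMeta(long snapshotId) {
            this.snapshotId = snapshotId;
        }
    }

    /** Hypothetical catalog with a cache in front of the source of truth. */
    static final class Catalog {
        private final Map<String, TableMeta> store = new HashMap<>();
        private final Map<String, TableMeta> cache = new HashMap<>();

        /** Simulates a commit (e.g. an insert) that produces a new snapshot. */
        void commit(String name, long snapshotId) {
            // The cache is deliberately NOT invalidated, to mirror the reported problem.
            store.put(name, new TableMeta(snapshotId));
        }

        /** Returns table metadata, optionally bypassing the cache. */
        TableMeta getTable(String name, boolean skipCache) {
            if (skipCache) {
                TableMeta fresh = store.get(name);
                cache.put(name, fresh); // refresh the cache with the latest state
                return fresh;
            }
            // Cached path: may return metadata loaded before the latest commit.
            return cache.computeIfAbsent(name, store::get);
        }
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();

        // Table state before the insert; the first read populates the cache.
        catalog.commit("t", 8504103028255089587L);
        catalog.getTable("t", false);

        // The insert commits a new snapshot.
        catalog.commit("t", 4050123687297981987L);

        // A cached read still sees the previous snapshot ID.
        System.out.println("cached: " + catalog.getTable("t", false).snapshotId);
        // Bypassing the cache returns the latest snapshot ID.
        System.out.println("fresh:  " + catalog.getTable("t", true).snapshotId);
    }
}
```

In this toy model the cached read reports 8504103028255089587 while the cache-bypassing read reports 4050123687297981987, which matches the mismatch shown in the "Before" screenshots.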
