marton-bod opened a new pull request #2329: URL: https://github.com/apache/iceberg/pull/2329
This patch:
- Introduces a new snapshot summary metric, `total-files-size`. It was missing until now, even though its companion metrics `added-files-size` and `removed-files-size` already exist. Adding the total makes this group consistent with the other metric groups.
- Populates the HMS statistics from these snapshot metrics on HiveTableOperations commit. With these stats populated, Hive read query planning is significantly faster. In some cases query compilation time improved by more than 10x, because without HMS stats the Hive query planner recursively lists the data files to gather their sizes before execution.
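
To illustrate the two changes, here is a minimal, self-contained sketch. It is not the code in this patch: the class `SnapshotStatsSketch` and its methods are hypothetical, the summary is modeled as a plain `Map<String, String>`, and no Iceberg or Hive classes are used. The summary keys `total-data-files` and `total-records`, and the HMS parameter names `numFiles`, `numRows` and `totalSize`, are assumptions about which keys are involved; only the three `*-files-size` keys are named in the description above.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotStatsSketch {
  // Snapshot summary keys named in the PR description.
  static final String ADDED_FILES_SIZE = "added-files-size";
  static final String REMOVED_FILES_SIZE = "removed-files-size";
  static final String TOTAL_FILES_SIZE = "total-files-size";
  // Assumed summary keys for the other totals (not named in the description).
  static final String TOTAL_DATA_FILES = "total-data-files";
  static final String TOTAL_RECORDS = "total-records";

  // Assumed HMS table parameter names for basic table statistics.
  static final String HMS_NUM_FILES = "numFiles";
  static final String HMS_NUM_ROWS = "numRows";
  static final String HMS_TOTAL_SIZE = "totalSize";

  // New running total = previous total + bytes added - bytes removed.
  // A missing added/removed entry is treated as zero.
  static long nextTotalFilesSize(long previousTotal, Map<String, String> summary) {
    long added = Long.parseLong(summary.getOrDefault(ADDED_FILES_SIZE, "0"));
    long removed = Long.parseLong(summary.getOrDefault(REMOVED_FILES_SIZE, "0"));
    return previousTotal + added - removed;
  }

  // Copies the totals from a snapshot summary into HMS-style table parameters.
  // Keys absent from the summary are skipped rather than written as zero,
  // so that no misleading statistic is published.
  static Map<String, String> toHmsParameters(Map<String, String> summary) {
    Map<String, String> params = new HashMap<>();
    copyIfPresent(summary, TOTAL_DATA_FILES, params, HMS_NUM_FILES);
    copyIfPresent(summary, TOTAL_RECORDS, params, HMS_NUM_ROWS);
    copyIfPresent(summary, TOTAL_FILES_SIZE, params, HMS_TOTAL_SIZE);
    return params;
  }

  private static void copyIfPresent(
      Map<String, String> from, String fromKey, Map<String, String> to, String toKey) {
    String value = from.get(fromKey);
    if (value != null) {
      to.put(toKey, value);
    }
  }

  public static void main(String[] args) {
    Map<String, String> summary = new HashMap<>();
    summary.put(ADDED_FILES_SIZE, "300");
    summary.put(REMOVED_FILES_SIZE, "100");
    summary.put(TOTAL_DATA_FILES, "12");
    summary.put(TOTAL_RECORDS, "5000");

    // Derive the new total from a hypothetical previous total of 1000 bytes.
    long total = nextTotalFilesSize(1000L, summary);
    summary.put(TOTAL_FILES_SIZE, Long.toString(total));

    System.out.println(toHmsParameters(summary));
  }
}
```

Skipping absent keys instead of defaulting them to zero is a design choice of this sketch: a planner that reads a size of zero may make worse decisions than one that finds no statistic at all.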
