deniskuzZ commented on PR #5215: URL: https://github.com/apache/hive/pull/5215#issuecomment-2222272885
> > i don't know the details, need to check the doc, but if iceberg exposes `record_count` via the partitions meta-table, why would the select be expensive? it's just a 1-row fetch with a spec filter?
>
> If the table has many manifest files as well as data files, I think getting the `record_count` is a little expensive, as the **record_count of a partition** is the sum of the `record_count` of all its data files, so iceberg needs to go through all the data file entries in the manifest files to get this value. But I think we can tolerate this cost in most cases as long as the table is not too huge. So we can try it the way you suggested, and refine it further once the iceberg repo has finished the full partition stats API.
>
> > > iceberg partition row count is that in Hive base code we regard iceberg table as non-partitioned table, and so some partition prune optimization like `HivePartitionPruneRule`
> >
> > btw, Hive supports partition pruning for iceberg tables
>
> Yes, I guess you are referring to [HIVE-24962](https://issues.apache.org/jira/browse/HIVE-24962). We do have the ability to prune iceberg partitions when scanning data, thanks to [HIVE-24962](https://issues.apache.org/jira/browse/HIVE-24962). But some existing optimization rules like `HivePartitionPruneRule` are not applied to partitioned iceberg tables. They are applied to common Hive tables, where they determine whether the partitioned table can use `StatsOptimizer::transform` to do the `count(*)` optimization.
>
> So we may need some magic change to let optimization rules like `HivePartitionPruneRule` know that partitioned iceberg tables also support partition pruning.

FYI https://github.com/apache/iceberg/pull/8502

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
