dengzhhu653 commented on code in PR #6089:
URL: https://github.com/apache/hive/pull/6089#discussion_r2638281029
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java:
##########
@@ -2029,65 +2064,48 @@ private List<ColumnStatisticsObj> aggrStatsUseDB(String catName, String dbName,
+ " inner join " + TBLS + " on " + PARTITIONS + ".\"TBL_ID\" = " + TBLS + ".\"TBL_ID\""
+ " inner join " + DBS + " on " + TBLS + ".\"DB_ID\" = " + DBS + ".\"DB_ID\""
+ " where " + DBS + ".\"CTLG_NAME\" = ? and " + DBS + ".\"NAME\" = ? and " + TBLS + ".\"TBL_NAME\" = ? "
- + " and " + PART_COL_STATS + ".\"COLUMN_NAME\" in (" + makeParams(colNames.size()) + ")"
- + " and " + PARTITIONS + ".\"PART_NAME\" in (" + makeParams(partNames.size()) + ")"
+ + " and " + PART_COL_STATS + ".\"COLUMN_NAME\" in (%1$s)"
+ + " and " + PARTITIONS + ".\"PART_NAME\" in (%2$s)"
+ " and " + PART_COL_STATS + ".\"ENGINE\" = ? "
+ " group by " + PART_COL_STATS + ".\"COLUMN_NAME\", " + PART_COL_STATS + ".\"COLUMN_TYPE\"";
Review Comment:
I don't think this is fully right. Let's take an example:
In the db, colA has stats for part1, part2, part3, and colB has stats for
part1, part2, part3, part4, part5.
If we want to aggregate the column stats over the partitions
(part1, part2, part3), both colA and colB have stats for every requested
partition, so we just fetch the stats and aggregate them.
However, if we want the aggregated stats for the partitions
(part1, part2, part3, part4, part5), colB has stats for all of them, so we
just aggregate them. For colA, part4 and part5 are missing, so we need to
make some assumptions about those two missing stats.
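
To make the example concrete, here is a minimal standalone sketch of the per-column coverage check I have in mind. It is not the code in MetaStoreDirectSql: the class `PartitionStatsCoverage`, the method `missingPartitions`, and the in-memory map standing in for the PART_COL_STATS rows are all hypothetical and used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, for illustration only; not part of Hive.
final class PartitionStatsCoverage {

    /**
     * For each column, returns the requested partitions that have no stats.
     * A column whose stats cover every requested partition maps to an empty set.
     */
    static Map<String, Set<String>> missingPartitions(
            Map<String, Set<String>> partsWithStatsByColumn,
            List<String> requestedParts) {
        Map<String, Set<String>> missing = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : partsWithStatsByColumn.entrySet()) {
            // Start from everything requested, then drop what the column already has.
            Set<String> gap = new TreeSet<>(requestedParts);
            gap.removeAll(e.getValue());
            missing.put(e.getKey(), gap);
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed stand-in for the stats rows described in the example above.
        Map<String, Set<String>> stats = new LinkedHashMap<>();
        stats.put("colA", Set.of("part1", "part2", "part3"));
        stats.put("colB", Set.of("part1", "part2", "part3", "part4", "part5"));

        // Both columns are fully covered: plain aggregation is enough.
        System.out.println(missingPartitions(stats,
                List.of("part1", "part2", "part3")));
        // prints {colA=[], colB=[]}

        // colA lacks part4 and part5: its aggregate needs extra assumptions.
        System.out.println(missingPartitions(stats,
                List.of("part1", "part2", "part3", "part4", "part5")));
        // prints {colA=[part4, part5], colB=[]}
    }
}
```

The point is that the decision is per column: a single query over all requested columns and partitions can only be aggregated directly for the columns whose gap is empty.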
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]