flyrain commented on code in PR #4456:
URL: https://github.com/apache/iceberg/pull/4456#discussion_r843175443
##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -402,9 +405,32 @@ private void setHmsTableParameters(String newMetadataLocation, Table tbl, TableM
       parameters.put(StatsSetupConst.TOTAL_SIZE, summary.get(SnapshotSummary.TOTAL_FILE_SIZE_PROP));
     }
+    setSnapshotStats(metadata, parameters);
+
     tbl.setParameters(parameters);
   }
+  private void setSnapshotStats(TableMetadata metadata, Map<String, String> parameters) {
+    Snapshot currentSnapshot = metadata.currentSnapshot();
+    if (currentSnapshot != null) {
+      parameters.put(TableProperties.CURRENT_SNAPSHOT_ID, String.valueOf(currentSnapshot.snapshotId()));
+      parameters.put(TableProperties.CURRENT_SNAPSHOT_TIMESTAMP, String.valueOf(currentSnapshot.timestampMillis()));
+      try {
+        String summary = JsonUtil.mapper().writeValueAsString(currentSnapshot.summary());
+        if (summary.length() <= HIVE_TABLE_PROPERTY_VALUE_SIZE_MAX) {
+          parameters.put(TableProperties.CURRENT_SNAPSHOT_SUMMARY, summary);
Review Comment:
Yes, these duplicated properties (numFiles, numRows, totalSize) are handy for
existing HMS consumers: because they are standard Hive table properties,
existing tools can use them out of the box. The snapshot summary has many more
stats, though, and it may not be a good idea to split them into individual HMS
table properties. For example, any change to the summary would then require a
code change here.
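
To make the trade-off concrete, here is a minimal, self-contained sketch of the "one JSON property" approach. It is not the PR's code: the class name, the property key, and the 4000-character limit are assumptions for illustration, and the hand-rolled `toJson` only stands in for `JsonUtil.mapper().writeValueAsString(...)` so the example runs without Jackson.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Iceberg.
class SnapshotSummaryProps {
  // Assumed size limit for one HMS property value; stands in for HIVE_TABLE_PROPERTY_VALUE_SIZE_MAX.
  static final int VALUE_SIZE_MAX = 4000;
  // Assumed key; stands in for TableProperties.CURRENT_SNAPSHOT_SUMMARY.
  static final String SUMMARY_KEY = "current-snapshot-summary";

  // Quotes a string for JSON, escaping only backslashes and double quotes (enough for this sketch).
  static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  // Serializes a flat string map as a JSON object, preserving the map's iteration order.
  static String toJson(Map<String, String> summary) {
    StringJoiner json = new StringJoiner(",", "{", "}");
    summary.forEach((k, v) -> json.add(quote(k) + ":" + quote(v)));
    return json.toString();
  }

  // Stores the whole summary under a single key, and skips it if the value is too large.
  // New summary fields flow through without touching this method.
  static void setSummary(Map<String, String> summary, Map<String, String> parameters) {
    String json = toJson(summary);
    if (json.length() <= VALUE_SIZE_MAX) {
      parameters.put(SUMMARY_KEY, json);
    }
  }

  public static void main(String[] args) {
    Map<String, String> summary = new LinkedHashMap<>();
    summary.put("added-data-files", "2");
    summary.put("total-records", "10");

    Map<String, String> parameters = new LinkedHashMap<>();
    setSummary(summary, parameters);
    System.out.println(parameters);
  }
}
```

With one property per summary field, each new or renamed field would instead need its own `parameters.put(...)` line and key constant, which is the maintenance cost described above.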
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]