szehon-ho commented on code in PR #4456:
URL: https://github.com/apache/iceberg/pull/4456#discussion_r843073580


##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -90,6 +92,7 @@
   private static final String HIVE_LOCK_CHECK_MAX_WAIT_MS = 
"iceberg.hive.lock-check-max-wait-ms";
   private static final String HIVE_ICEBERG_METADATA_REFRESH_MAX_RETRIES = 
"iceberg.hive.metadata-refresh-max-retries";
   private static final String HIVE_TABLE_LEVEL_LOCK_EVICT_MS = 
"iceberg.hive.table-level-lock-evict-ms";
+  private static final long HIVE_TABLE_PROPERTY_VALUE_SIZE_MAX = 4000;

Review Comment:
   Should we should expose this configuration, otherwise it may not match what 
works for all users (different database and backends)?  I am not sure what 
others think.



##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -402,9 +405,32 @@ private void setHmsTableParameters(String 
newMetadataLocation, Table tbl, TableM
       parameters.put(StatsSetupConst.TOTAL_SIZE, 
summary.get(SnapshotSummary.TOTAL_FILE_SIZE_PROP));
     }
 
+    setSnapshotStats(metadata, parameters);
+
     tbl.setParameters(parameters);
   }
 
+  private void setSnapshotStats(TableMetadata metadata, Map<String, String> 
parameters) {
+    Snapshot currentSnapshot = metadata.currentSnapshot();

Review Comment:
   I think we need a way to clear the properties if they are not there anymore 
in current snapshot (like original code setHmsProperties).  For example if the 
summary becomes suddenly over the length in new snapshot, then it will not 
override the old summary and would be wrong.



##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -402,9 +405,32 @@ private void setHmsTableParameters(String 
newMetadataLocation, Table tbl, TableM
       parameters.put(StatsSetupConst.TOTAL_SIZE, 
summary.get(SnapshotSummary.TOTAL_FILE_SIZE_PROP));
     }
 
+    setSnapshotStats(metadata, parameters);
+
     tbl.setParameters(parameters);
   }
 
+  private void setSnapshotStats(TableMetadata metadata, Map<String, String> 
parameters) {
+    Snapshot currentSnapshot = metadata.currentSnapshot();
+    if (currentSnapshot != null) {
+      parameters.put(TableProperties.CURRENT_SNAPSHOT_ID, 
String.valueOf(currentSnapshot.snapshotId()));
+      parameters.put(TableProperties.CURRENT_SNAPSHOT_TIMESTAMP, 
String.valueOf(currentSnapshot.timestampMillis()));
+      try {
+        String summary = 
JsonUtil.mapper().writeValueAsString(currentSnapshot.summary());
+        if (summary.length() <= HIVE_TABLE_PROPERTY_VALUE_SIZE_MAX) {
+          parameters.put(TableProperties.CURRENT_SNAPSHOT_SUMMARY, summary);

Review Comment:
   Also had one concern, there's some repeating information 
(setHmsTableParameters puts some but not all of the summary).  Did we ever 
consider just inlining these fields from the summary like they did over there?



##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -402,9 +405,32 @@ private void setHmsTableParameters(String 
newMetadataLocation, Table tbl, TableM
       parameters.put(StatsSetupConst.TOTAL_SIZE, 
summary.get(SnapshotSummary.TOTAL_FILE_SIZE_PROP));
     }
 
+    setSnapshotStats(metadata, parameters);
+
     tbl.setParameters(parameters);
   }
 
+  private void setSnapshotStats(TableMetadata metadata, Map<String, String> 
parameters) {
+    Snapshot currentSnapshot = metadata.currentSnapshot();
+    if (currentSnapshot != null) {
+      parameters.put(TableProperties.CURRENT_SNAPSHOT_ID, 
String.valueOf(currentSnapshot.snapshotId()));
+      parameters.put(TableProperties.CURRENT_SNAPSHOT_TIMESTAMP, 
String.valueOf(currentSnapshot.timestampMillis()));
+      try {
+        String summary = 
JsonUtil.mapper().writeValueAsString(currentSnapshot.summary());
+        if (summary.length() <= HIVE_TABLE_PROPERTY_VALUE_SIZE_MAX) {
+          parameters.put(TableProperties.CURRENT_SNAPSHOT_SUMMARY, summary);
+        } else {
+          LOG.warn("Not expose the current snapshot({}) summary in HMS since 
it exceeds {} characters",

Review Comment:
   nit: "Not exposing".



##########
hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java:
##########
@@ -468,4 +473,54 @@ public void testUUIDinTableProperties() throws Exception {
       catalog.dropTable(tableIdentifier);
     }
   }
+
+  @Test
+  public void testSnapshotStatsTableProperties() throws Exception {
+    Schema schema = new Schema(
+        required(1, "id", Types.IntegerType.get(), "unique ID"),
+        required(2, "data", Types.StringType.get())
+    );
+    TableIdentifier tableIdentifier = TableIdentifier.of(DB_NAME, "tbl");
+    String location = temp.newFolder("tbl").toString();
+
+    try {
+      catalog.buildTable(tableIdentifier, schema)
+          .withLocation(location)
+          .create();
+
+      String tableName = tableIdentifier.name();
+      org.apache.hadoop.hive.metastore.api.Table hmsTable =
+          metastoreClient.getTable(tableIdentifier.namespace().level(0), 
tableName);
+
+      // check whether parameters are in expected state
+      Map<String, String> parameters = hmsTable.getParameters();
+      Assert.assertEquals("0", parameters.get(TableProperties.SNAPSHOT_COUNT));

Review Comment:
   Can we put some error messages in these asserts, like in other tests?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to