rdblue commented on a change in pull request #2048:
URL: https://github.com/apache/iceberg/pull/2048#discussion_r553694732



##########
File path: parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java
##########
@@ -226,6 +226,9 @@ private CompressionCodecName codec() {
       set("parquet.avro.write-old-list-structure", "false");
       MessageType type = ParquetSchemaUtil.convert(schema, name);
 
+      // Check that our metrics make sense
+      metricsConfig.validateProperties(schema);

Review comment:
       Is this the right place to do the validation?
   
   If a user adds a bad property or performs some schema update that causes a 
validation error, that would break all writes to the table. To me, it doesn't 
seem like we are catching the problem early enough and possibly allowing a typo 
to break scheduled jobs.
   
   What do you think about adding this validation when altering the table? 
`UpdateProperties` could check whether any properties starting with 
`write.metadata.metrics` were modified and run this. Similarly, `UpdateSchema` 
could run this as well, although I think that we should probably modify 
`UpdateSchema` to simply update the properties for column renames (if that's 
easily done).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to