FANNG1 commented on code in PR #10189:
URL: https://github.com/apache/gravitino/pull/10189#discussion_r2888231554
##########
api/src/main/java/org/apache/gravitino/policy/PolicyContents.java:
##########
@@ -128,4 +176,233 @@ public String toString() {
+ '}';
}
}
+
+ /** Built-in policy content for Iceberg compaction strategy. */
+ public static class IcebergCompactionContent implements PolicyContent {
+ /** Property key for strategy type. */
+ public static final String STRATEGY_TYPE_KEY = "strategy.type";
+ /** Strategy type value for compaction. */
+ public static final String STRATEGY_TYPE_VALUE = "compaction";
+ /** Property key for job template name. */
+ public static final String JOB_TEMPLATE_NAME_KEY = "job.template-name";
+ /** Built-in job template name for Iceberg rewrite data files. */
+ public static final String JOB_TEMPLATE_NAME_VALUE =
"builtin-iceberg-rewrite-data-files";
+ /** Prefix for rewrite options propagated to job options. */
+ public static final String JOB_OPTIONS_PREFIX = "job.options.";
+ /** Rule key for trigger expression. */
+ public static final String TRIGGER_EXPR_KEY = "trigger-expr";
+ /** Rule key for score expression. */
+ public static final String SCORE_EXPR_KEY = "score-expr";
+ /** Rule key for minimum data file MSE threshold. */
+ public static final String MIN_DATAFILE_MSE_KEY = "minDatafileMse";
+ /** Rule key for minimum delete file count threshold. */
+ public static final String MIN_DELETE_FILE_NUMBER_KEY =
"minDeleteFileNumber";
+ /** Rule key for data file MSE score weight. */
+ public static final String DATAFILE_MSE_WEIGHT_KEY = "datafileMseWeight";
+ /** Rule key for delete file number score weight. */
+ public static final String DELETE_FILE_NUMBER_WEIGHT_KEY =
"deleteFileNumberWeight";
+ /** Metric name for data file MSE. */
+ public static final String DATAFILE_MSE_METRIC = "custom-datafile_mse";
+ /** Metric name for delete file number. */
+ public static final String DELETE_FILE_NUMBER_METRIC =
"custom-delete_file_number";
Review Comment:
Good point. I unified the key style and switched the metric key to
`custom-data_file_mse`. To avoid breaking existing emitted stats, I also
added a compatibility mapping in the optimizer so that the legacy
`custom-datafile_mse` key still works, with coverage in
`TestGravitinoPolicyCompactionStrategy`. Addressed in commit 6212a7c50.
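For readers following the thread, here is a minimal sketch of how such a legacy-key mapping could work. It is not the code from the commit: the class `LegacyMetricKeys`, the method `normalize`, and the choice to let the new key win when both are present are all assumptions made for illustration. Only the two metric key strings come from this discussion.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; not the optimizer code referenced in the review. */
final class LegacyMetricKeys {
  /** New metric key named in the review comment. */
  static final String DATA_FILE_MSE_METRIC = "custom-data_file_mse";
  /** Legacy metric key that already-emitted stats may still carry. */
  static final String LEGACY_DATAFILE_MSE_METRIC = "custom-datafile_mse";

  private LegacyMetricKeys() {}

  /**
   * Returns a copy of the stats in which the legacy key is rewritten to the new key.
   * Assumption: if both keys are present, the value under the new key is kept.
   */
  static Map<String, Long> normalize(Map<String, Long> stats) {
    Map<String, Long> normalized = new HashMap<>(stats);
    Long legacyValue = normalized.remove(LEGACY_DATAFILE_MSE_METRIC);
    if (legacyValue != null) {
      normalized.putIfAbsent(DATA_FILE_MSE_METRIC, legacyValue);
    }
    return normalized;
  }
}
```

Normalizing once at the boundary keeps the rest of the evaluation path aware of only one key name.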
##########
api/src/main/java/org/apache/gravitino/policy/PolicyContents.java:
##########
@@ -42,6 +48,48 @@ public static PolicyContent custom(
return new CustomContent(rules, supportedObjectTypes, properties);
}
+ /**
+ * Creates an iceberg compaction policy content.
+ *
+ * @param minDatafileMse minimum threshold for custom-datafile_mse
+ * @param minDeleteFileNumber minimum threshold for custom-delete_file_number
+ * @param rewriteOptions rewrite options forwarded as job.options.*
+ * @return iceberg compaction policy content
+ */
+ public static PolicyContent icebergCompaction(
+ long minDatafileMse, long minDeleteFileNumber, Map<String, String>
rewriteOptions) {
Review Comment:
Done. The 3-arg `icebergCompaction(...)` overload now calls the 5-arg
overload to remove duplication and keep defaults in one path. Addressed in
commit 6212a7c50.
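The delegation pattern described here can be sketched as follows. This is an illustration under stated assumptions, not the PR code: it assumes the two extra parameters of the 5-arg overload are the score weights, the default weight values are placeholders, the rule key names are hypothetical, and a plain `Map` stands in for `PolicyContent`. Only the `job.options.` prefix is taken from the diff above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the factory; returns a Map instead of PolicyContent. */
final class CompactionFactorySketch {
  // Placeholder defaults; the real values are not shown in this thread.
  static final long DEFAULT_DATA_FILE_MSE_WEIGHT = 1L;
  static final long DEFAULT_DELETE_FILE_NUMBER_WEIGHT = 1L;
  // Prefix taken from the JOB_OPTIONS_PREFIX constant in the diff.
  static final String JOB_OPTIONS_PREFIX = "job.options.";

  private CompactionFactorySketch() {}

  /** 3-arg overload: supplies the default weights and delegates. */
  static Map<String, Object> icebergCompaction(
      long minDataFileMse, long minDeleteFileNumber, Map<String, String> rewriteOptions) {
    return icebergCompaction(
        minDataFileMse,
        minDeleteFileNumber,
        DEFAULT_DATA_FILE_MSE_WEIGHT,
        DEFAULT_DELETE_FILE_NUMBER_WEIGHT,
        rewriteOptions);
  }

  /** 5-arg overload: the single place where the content is assembled. */
  static Map<String, Object> icebergCompaction(
      long minDataFileMse,
      long minDeleteFileNumber,
      long dataFileMseWeight,
      long deleteFileNumberWeight,
      Map<String, String> rewriteOptions) {
    Map<String, Object> content = new LinkedHashMap<>();
    // Hypothetical rule keys, written in the dataFileXxx style.
    content.put("minDataFileMse", minDataFileMse);
    content.put("minDeleteFileNumber", minDeleteFileNumber);
    content.put("dataFileMseWeight", dataFileMseWeight);
    content.put("deleteFileNumberWeight", deleteFileNumberWeight);
    // Forward each rewrite option under the job.options. prefix.
    rewriteOptions.forEach((key, value) -> content.put(JOB_OPTIONS_PREFIX + key, value));
    return content;
  }
}
```

With this shape, changing a default or a key name touches only the 5-arg path and the constants.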
##########
api/src/main/java/org/apache/gravitino/policy/PolicyContents.java:
##########
@@ -128,4 +176,233 @@ public String toString() {
+ '}';
}
}
+
+ /** Built-in policy content for Iceberg compaction strategy. */
+ public static class IcebergCompactionContent implements PolicyContent {
Review Comment:
Done. I split `IcebergCompactionContent` into a dedicated file
`api/src/main/java/org/apache/gravitino/policy/IcebergCompactionContent.java`
and kept `PolicyContents` as factory utilities only. Addressed in commit
6212a7c50.
##########
api/src/main/java/org/apache/gravitino/policy/PolicyContents.java:
##########
@@ -128,4 +176,233 @@ public String toString() {
+ '}';
}
}
+
+ /** Built-in policy content for Iceberg compaction strategy. */
+ public static class IcebergCompactionContent implements PolicyContent {
+ /** Property key for strategy type. */
+ public static final String STRATEGY_TYPE_KEY = "strategy.type";
+ /** Strategy type value for compaction. */
+ public static final String STRATEGY_TYPE_VALUE = "compaction";
+ /** Property key for job template name. */
+ public static final String JOB_TEMPLATE_NAME_KEY = "job.template-name";
+ /** Built-in job template name for Iceberg rewrite data files. */
+ public static final String JOB_TEMPLATE_NAME_VALUE =
"builtin-iceberg-rewrite-data-files";
+ /** Prefix for rewrite options propagated to job options. */
+ public static final String JOB_OPTIONS_PREFIX = "job.options.";
+ /** Rule key for trigger expression. */
+ public static final String TRIGGER_EXPR_KEY = "trigger-expr";
+ /** Rule key for score expression. */
+ public static final String SCORE_EXPR_KEY = "score-expr";
+ /** Rule key for minimum data file MSE threshold. */
+ public static final String MIN_DATAFILE_MSE_KEY = "minDatafileMse";
+ /** Rule key for minimum delete file count threshold. */
+ public static final String MIN_DELETE_FILE_NUMBER_KEY =
"minDeleteFileNumber";
+ /** Rule key for data file MSE score weight. */
+ public static final String DATAFILE_MSE_WEIGHT_KEY = "datafileMseWeight";
Review Comment:
Agreed. I changed the naming style from `datafileXxx` to `dataFileXxx`
across policy content, DTO fields, converters, and tests so that names follow
word boundaries consistently. Addressed in commit 6212a7c50.