omarsmak commented on code in PR #945:
URL: https://github.com/apache/polaris/pull/945#discussion_r1943771052
##########
polaris-core/src/main/resources/schemas/policies/system/data-compaction/2025-02-03.json:
##########
@@ -0,0 +1,41 @@
+{
+ "license": "Licensed under the Apache License, Version 2.0
(http://www.apache.org/licenses/LICENSE-2.0)",
+ "$id":
"https://polaris.apache.org/schemas/policies/system/data-compaction/2025-02-03.json",
+ "title": "Data Compaction Policy",
+ "description": "Inheritable Polaris policy schema for Iceberg table data
compaction.",
+ "type": "object",
+ "properties": {
+ "version": {
+ "type": "string",
+ "const": "2025-02-03",
+ "description": "Schema version."
+ },
+ "enable": {
+ "type": "boolean",
+ "description": "Enable or disable data compaction."
+ },
+ "target_file_size_bytes": {
Review Comment:
1- it is not about being optional, having defined as policy schema,
implicitly adds the expectations to the users that any TMS would support this
and can open up a lot of false/positives.
2- Yes I am aware of Spark rewrite data files procedure call, but there is a
difference here, the procedure call is a manual one time thing same as a DDL
query, that takes the target file size as input if the user wishes to do so,
but it doesn't save that as policy, so that implicitly put it as engine
implementation details. For example if I have the same procedure written in
Flink, I am not obligated to have the target file size as an input as long as I
respect the iceberg table property.
I understand this might add flexibility, but also we need to think about the
consequences further ahead. I personally think, if this is really needed and
TMS users know their engine would support that, then really using the `config`
is enough here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]