amogh-jahagirdar commented on code in PR #16213:
URL: https://github.com/apache/iceberg/pull/16213#discussion_r3196042053


##########
core/src/main/java/org/apache/iceberg/ManifestInfo.java:
##########
@@ -53,6 +55,28 @@ interface ManifestInfo {
           "min_sequence_number",
           Types.LongType.get(),
           "Minimum sequence number of files in this manifest");
+  Types.StructType PARTITION_SUMMARY_TYPE =
+      Types.StructType.of(
+          Types.NestedField.required(
+              509,
+              "contains_null",
+              Types.BooleanType.get(),
+              "True if any file has a null partition value"),
+          Types.NestedField.optional(
+              518,
+              "contains_nan",
+              Types.BooleanType.get(),
+              "True if any file has a nan partition value"),
+          Types.NestedField.optional(
+              510, "lower_bound", Types.BinaryType.get(), "Partition lower 
bound for all files"),
+          Types.NestedField.optional(
+              511, "upper_bound", Types.BinaryType.get(), "Partition upper 
bound for all files"));
+  Types.NestedField PARTITION_SUMMARIES =

Review Comment:
   Even if we keep partition tuples for data file entries, I don't think we 
neccessarily need to keep partition field summaries for manifests as well, we 
can still use stats at this level in the tree. In fact, I largely think we 
shouldn't since it adds more complexity to the structure.
   
   Ultimately, at the manifest level we want to preserve pruning but I think we 
can do that with stats, so long as we ensure that stats on source columns are 
collected and we have the rules to aggregate them correct. That was independent 
of the decision to keep the tuple or not. Let me know if I'm missing something 
@nastra  @anoopj @rdblue 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to