ajantha-bhat commented on code in PR #7581:
URL: https://github.com/apache/iceberg/pull/7581#discussion_r1201742560
##########
core/src/main/java/org/apache/iceberg/PartitionsTable.java:
##########
@@ -47,6 +47,16 @@ public class PartitionsTable extends BaseMetadataTable {
new Schema(
Types.NestedField.required(1, "partition",
Partitioning.partitionType(table)),
Types.NestedField.required(4, "spec_id", Types.IntegerType.get()),
+ Types.NestedField.required(
+ 9,
+ "last_updated_at",
+ Types.TimestampType.withZone(),
+ "Partition last updated timestamp"),
+ Types.NestedField.required(
+ 10,
+ "last_updated_snapshot_id",
Review Comment:
I am worried that most of these snapshots will already be expired, so we
will end up not using that field much.
Is the main purpose of storing the snapshot id to find which operation last
updated this partition? If so, maybe we can store the operation type itself
directly.
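
A minimal sketch of the concern, not the actual `PartitionsTable` code: the
class, map, and method names below are hypothetical stand-ins. In Iceberg
itself the equivalent lookup would presumably go through
`Table.snapshot(long)` and `Snapshot.operation()`. The point is only that an
id pointing at an expired snapshot resolves to nothing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class ExpiredSnapshotSketch {
  // Hypothetical stand-in for table metadata: live snapshot id -> operation.
  static final Map<Long, String> liveSnapshots = new HashMap<>();

  // Resolve the operation behind a partition's last_updated_snapshot_id.
  // Returns empty when that snapshot is no longer in the metadata.
  static Optional<String> lastOperation(long lastUpdatedSnapshotId) {
    return Optional.ofNullable(liveSnapshots.get(lastUpdatedSnapshotId));
  }

  public static void main(String[] args) {
    liveSnapshots.put(1L, "append");
    liveSnapshots.put(2L, "overwrite");

    // While snapshot 1 is live, the operation can be recovered from the id.
    System.out.println(lastOperation(1L).isPresent());

    // Simulate snapshot expiration: the stored id now resolves to nothing.
    liveSnapshots.remove(1L);
    System.out.println(lastOperation(1L).isPresent());
  }
}
```

Storing the operation type in the row itself would avoid this lookup
entirely, at the cost of one more column.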
> I think sequence number will be good too, but do you mean
> fileSequenceNumber or dataSequenceNumber? Maybe worth another pr if there's
> more discussion there?
I guess it is fileSequenceNumber.
Yes, we can discuss that separately. I think `data_file_size_in_bytes` per
partition is another good candidate to store here.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]