+1 for improving how commit timestamps are assigned so that they are monotonic, since this requirement has come up across multiple discussions: notifications, multi-table transactions, time travel accuracy, and row timestamps. It would be good to have a single, consistent way to represent and assign timestamps that could be leveraged across multiple features.
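To make the "monotonic assignment" idea concrete for anyone skimming the quoted thread below: the Lamport-style rule Steven mentions essentially means never handing out a commit timestamp that is less than or equal to the last one assigned, even if the wall clock steps backwards. A minimal sketch of that rule (class and method names are illustrative only, not from the proposal or the Iceberg codebase):

import java.time.Clock;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a Lamport-style monotonic commit timestamp generator. A REST
// catalog could seed lastAssignedNanos from the latest snapshot's commit
// timestamp at startup so the guarantee also holds across restarts.
class MonotonicCommitTimestamps {
  private final Clock clock;
  private final AtomicLong lastAssignedNanos;

  MonotonicCommitTimestamps(Clock clock, long seedNanos) {
    this.clock = clock;
    this.lastAssignedNanos = new AtomicLong(seedNanos);
  }

  // Returns a value strictly greater than any previously returned one,
  // even if the wall clock drifts backwards between calls.
  long nextCommitTimestampNanos() {
    long wallClockNanos = clock.millis() * 1_000_000L;
    return lastAssignedNanos.updateAndGet(last -> Math.max(wallClockNanos, last + 1));
  }
}

When the clock is healthy this returns ordinary wall-clock time; when it drifts backwards it degrades to "last value + 1", which keeps time travel and window queries well ordered at the cost of temporary inaccuracy.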
On Thu, Jan 15, 2026 at 4:05 PM Ryan Blue <[email protected]> wrote: > Yeah, to add my perspective on that discussion, I think my primary concern > is that people expect timestamps to be monotonic, and if they aren't, then a > `_last_update_timestamp` field just makes the problem worse. But it is > _nice_ to have row-level timestamps. So I would be okay if we revisit how > we assign commit timestamps and improve it so that you get monotonic > behavior. > > On Thu, Jan 15, 2026 at 2:23 PM Steven Wu <[email protected]> wrote: > >> We had an offline discussion with Ryan. I revised the proposal as follows. >> >> 1. V4 would require writers to generate *monotonic* snapshot timestamps. >> The proposal doc has a section that describes a recommended implementation >> using Lamport timestamps. >> 2. Expose a *last_update_timestamp* metadata column that inherits from the >> snapshot timestamp. >> >> This is a relatively low-friction change that can fix the time travel >> problem and enable use cases like latency tracking, temporal queries, TTL, and >> auditing. >> >> There is no accuracy requirement on the timestamp values. In practice, >> modern servers with NTP have pretty reliable wall clocks. E.g., the Java >> library implements this validation >> <https://github.com/apache/iceberg/blob/035e0fb39d2a949f6343552ade0a7d6c2967e0db/core/src/main/java/org/apache/iceberg/TableMetadata.java#L369-L377> >> that >> protects against backward clock drift of up to one minute for snapshot >> timestamps. I don't think we have heard many complaints of commit failures due >> to that clock drift validation. >> >> Would appreciate feedback on the revised proposal. >> >> https://docs.google.com/document/d/1cXr_RwEO6o66S8vR7k3NM8-bJ9tH2rkh4vSdMXNC8J8/edit?tab=t.0 >> >> Thanks, >> Steven >> >> On Tue, Jan 13, 2026 at 8:40 PM Anton Okolnychyi <[email protected]> >> wrote: >> >>> Steven, I was referring to the fact that CURRENT_TIMESTAMP() is usually >>> evaluated quite early in engines, so we could theoretically have another >>> expression closer to the commit time. You are right, though, it won't be >>> the actual commit time given that we have to write it into the files. Also, >>> I don't think generating a timestamp for a row as it is being written is >>> going to be beneficial. To sum up, expression-based defaults would allow us >>> to capture the time the transaction or write starts, but not the actual >>> commit time. >>> >>> Russell, if the goal is to know what happened to the table in a given >>> time frame, isn't the changelog scan the way to go? It would assign commit >>> ordinals based on lineage and include row-level diffs. How would you be >>> able to determine changes with row timestamps by just looking at the latest >>> snapshot? >>> >>> It does seem promising to make snapshot timestamps strictly increasing >>> to avoid ambiguity during time travel. >>> >>> On Tue, Jan 13, 2026 at 4:33 PM Ryan Blue <[email protected]> wrote: >>> >>>> > Whether or not "t" is an atomic clock time is not as important as the >>>> query between time bounds making sense. >>>> >>>> I'm not sure I get it then. If we want monotonically increasing times, >>>> but they don't have to be real times, then how do you know what notion of >>>> "time" you care about for these filters? Or to put it another way, how do >>>> you know that your "before" and "after" times are reasonable? If the >>>> boundaries of these time queries can move around a bit, by how much? 
>>>> >>>> It seems to me that row IDs can play an important role here because you >>>> have the order guarantee that we seem to want for this use case: if >>>> snapshot A was committed before snapshot B, then the rows from A have row >>>> IDs that are always less than the row IDs of B. The problem is that we >>>> don't know where those row IDs start and end once A and B are no longer >>>> tracked. Using a "timestamp" seems to work, but I still worry that without >>>> reliable timestamps that correspond with some guarantee to real timestamps, >>>> we are creating a feature that seems reliable but isn't. >>>> >>>> I'm somewhat open to the idea of introducing a snapshot timestamp that >>>> the catalog guarantees is monotonically increasing. But if we did that, >>>> wouldn't we still need to know the association between these timestamps and >>>> snapshots after the snapshot metadata expires? My mental model is that this >>>> would be used to look for data that arrived, say, 3 weeks ago on Dec 24th. >>>> Since the snapshots metadata is no longer around we could use the row >>>> timestamp to find those rows. But how do we know that the snapshot >>>> timestamps correspond to the actual timestamp range of Dec 24th? Is it just >>>> "close enough" as long as we don't have out-of-order timestamps? This is >>>> what I mean by needing to keep track of the association between timestamps >>>> and snapshots after the metadata expires. Seems like you either need to >>>> keep track of what the catalog's clock was for events you care about, or >>>> you don't really care about exact timestamps. >>>> >>>> On Tue, Jan 13, 2026 at 2:22 PM Russell Spitzer < >>>> [email protected]> wrote: >>>> >>>>> The key goal here is the ability to answer the question "what happened >>>>> to the table in some time window (before < t < after)?" >>>>> Whether or not "t" is an atomic clock time is not as important as the >>>>> query between time bounds making sense. >>>>> Downstream applications (from what I know) are mostly sensitive to >>>>> getting discrete and well-defined answers to >>>>> this question like: >>>>> >>>>> 1 < t < 2 should be exclusive of >>>>> 2 < t < 3 should be exclusive of >>>>> 3 < t < 4 >>>>> >>>>> And the union of these should be the same as the query asking for 1 < >>>>> t < 4 >>>>> >>>>> Currently this is not possible because we have no guarantee of >>>>> ordering in our timestamps >>>>> >>>>> Snapshots >>>>> A -> B -> C >>>>> Sequence numbers >>>>> 50 -> 51 -> 52 >>>>> Timestamp >>>>> 3 -> 1 -> 2 >>>>> >>>>> This makes time travel always a little wrong to start with. >>>>> >>>>> The Java implementation only allows one minute of negative time on >>>>> commit so we actually kind of do have this as a >>>>> "light monotonicity" requirement but as noted above there is no spec >>>>> requirement for this. While we do have sequence >>>>> number and row id, we still don't have a stable way of associating >>>>> these with a consistent time in an engine-independent way. >>>>> >>>>> Ideally we just want to have one consistent way of answering the >>>>> question "what did the table look like at time t" >>>>> which I think we get by adding in a new field that is a timestamp, set >>>>> by the Catalog close to commit time, >>>>> that always goes up. >>>>> >>>>> I'm not sure we can really do this with an engine expression since >>>>> they won't know when the data is actually committed >>>>> when writing files? 
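To make Russell's A -> B -> C example concrete, here is a toy sketch (made-up values, plain Java, not Iceberg APIs) of why non-monotonic snapshot timestamps make window queries misleading: the windows still partition the history, but their order no longer matches commit order.

import java.util.List;

// Toy illustration only: sequence numbers increase 50 -> 51 -> 52, but the
// recorded timestamps went 3 -> 1 -> 2.
class TimeWindowExample {
  record Snapshot(String name, long sequenceNumber, long timestamp) {}

  static List<Snapshot> inWindow(List<Snapshot> history, long after, long before) {
    return history.stream()
        .filter(s -> s.timestamp() > after && s.timestamp() < before)
        .toList();
  }

  public static void main(String[] args) {
    List<Snapshot> history = List.of(
        new Snapshot("A", 50, 3),
        new Snapshot("B", 51, 1),
        new Snapshot("C", 52, 2));

    // "What happened between t=0 and t=2?" returns only B, which actually
    // committed *after* A.
    System.out.println(inWindow(history, 0, 2));
    // "What happened between t=2 and t=4?" returns only A, the oldest commit,
    // and C sits exactly on the exclusive boundary of both windows.
    System.out.println(inWindow(history, 2, 4));
    // With monotonic timestamps, window order and commit order always agree.
  }
}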
>>>>> >>>>> >>>>> On Tue, Jan 13, 2026 at 3:35 PM Anton Okolnychyi < >>>>> [email protected]> wrote: >>>>> >>>>>> This seems like a lot of new complexity in the format. I would like >>>>>> us to explore whether we can build the considered use cases on top of >>>>>> expression-based defaults instead. >>>>>> >>>>>> We already plan to support CURRENT_TIMESTAMP() and similar functions >>>>>> that are part of the SQL standard definition for default values. This >>>>>> would >>>>>> provide us a way to know the relative row order. True, this usually will >>>>>> represent the start of the operation. We may define COMMIT_TIMESTAMP() >>>>>> or a >>>>>> similar expression for the actual commit time, if there are use cases >>>>>> that >>>>>> need that. Plus, we may explore an approach similar to MySQL that allows >>>>>> users to reset the default value on update. >>>>>> >>>>>> - Anton >>>>>> >>>>>> On Tue, Jan 13, 2026 at 11:04 AM Russell Spitzer < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> I think this is the right step forward. Our current "timestamp" >>>>>>> definition is too ambiguous to be useful, so establishing >>>>>>> a well-defined and monotonic timestamp could be really great. I also >>>>>>> like the ability for rows to know this value without >>>>>>> having to rely on snapshot information, which can be expired. >>>>>>> >>>>>>> On Mon, Jan 12, 2026 at 11:03 AM Steven Wu <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> I have revised the row timestamp proposal with the following >>>>>>>> changes. >>>>>>>> * a new commit_timestamp field in snapshot metadata that has >>>>>>>> nanosecond precision. >>>>>>>> * this optional field is only set by the REST catalog server >>>>>>>> * it needs to be monotonic (e.g. implemented using Lamport >>>>>>>> timestamp) >>>>>>>> >>>>>>>> >>>>>>>> https://docs.google.com/document/d/1cXr_RwEO6o66S8vR7k3NM8-bJ9tH2rkh4vSdMXNC8J8/edit?tab=t.0#heading=h.efdngoizchuh >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Steven >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Dec 12, 2025 at 2:36 PM Steven Wu <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Thanks for the clarification, Ryan. >>>>>>>>> >>>>>>>>> For long-running streaming jobs that commit periodically, it is >>>>>>>>> difficult to establish the constant value of current_timestamp across >>>>>>>>> all >>>>>>>>> writer tasks for each commit cycle. I guess streaming writers may >>>>>>>>> just need >>>>>>>>> to write the wall clock time when appending a row to a data file for >>>>>>>>> the >>>>>>>>> default value of current_timestamp. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Dec 12, 2025 at 1:44 PM Ryan Blue <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> I don't think that every row would have a different value. That >>>>>>>>>> would be up to the engine, but I would expect engines to insert >>>>>>>>>> `CURRENT_TIMESTAMP` into the plan and then replace it with a >>>>>>>>>> constant, >>>>>>>>>> resulting in a consistent value for all rows. >>>>>>>>>> >>>>>>>>>> You're right that this would not necessarily be the commit time. >>>>>>>>>> But neither is the commit timestamp from Iceberg's snapshot. I'm not >>>>>>>>>> sure >>>>>>>>>> how we are going to define "good enough" for this purpose. I think >>>>>>>>>> at least >>>>>>>>>> `CURRENT_TIMESTAMP` has reliable and known behavior when you look at >>>>>>>>>> how it >>>>>>>>>> is handled in engines. 
And if you want the Iceberg timestamp, then >>>>>>>>>> use a >>>>>>>>>> periodic query of the snapshots table to keep track of them in a >>>>>>>>>> table you >>>>>>>>>> can join to. I don't think this rises to the need for a table feature >>>>>>>>>> unless we can guarantee that it is correct. >>>>>>>>>> >>>>>>>>>> On Fri, Dec 12, 2025 at 1:19 PM Steven Wu <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> > Postgres `current_timestamp` captures the transaction start >>>>>>>>>>> time [1, 2]. Should we extend the same semantic to Iceberg: all >>>>>>>>>>> rows added >>>>>>>>>>> in the same snapshot should have the same timestamp value? >>>>>>>>>>> >>>>>>>>>>> Let me clarify my last comment. >>>>>>>>>>> >>>>>>>>>>> created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP >>>>>>>>>>> >>>>>>>>>>> Since Postgres current_timestamp captures the transaction start >>>>>>>>>>> time, all rows added in the same insert transaction would have the >>>>>>>>>>> same >>>>>>>>>>> value as the transaction timestamp with the column definition above. >>>>>>>>>>> >>>>>>>>>>> If we extend a similar semantic to Iceberg, all rows added in >>>>>>>>>>> the same Iceberg transaction/snapshot should have the same >>>>>>>>>>> timestamp? >>>>>>>>>>> >>>>>>>>>>> Ryan, I understand your comment about using the current_timestamp >>>>>>>>>>> expression as a column default value; you were thinking that the >>>>>>>>>>> engine would >>>>>>>>>>> set the column value to the wall clock time when appending a row to >>>>>>>>>>> a data >>>>>>>>>>> file, right? Almost every row would have a different timestamp >>>>>>>>>>> value. >>>>>>>>>>> >>>>>>>>>>> On Fri, Dec 12, 2025 at 10:26 AM Steven Wu <[email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The `current_timestamp` expression may not always carry the right >>>>>>>>>>>> semantics for these use cases. E.g., latency tracking is interested >>>>>>>>>>>> in when >>>>>>>>>>>> records are added / committed to the table, not when the record was >>>>>>>>>>>> appended to an uncommitted data file in the processing engine. >>>>>>>>>>>> Record creation and Iceberg commit can be minutes or even hours >>>>>>>>>>>> apart. >>>>>>>>>>>> >>>>>>>>>>>> A row timestamp inherited from the snapshot timestamp has no overhead >>>>>>>>>>>> at the initial commit and has very minimal storage overhead >>>>>>>>>>>> during file >>>>>>>>>>>> rewrite. Per-row current_timestamp would have distinct values for >>>>>>>>>>>> every row >>>>>>>>>>>> and has more storage overhead. >>>>>>>>>>>> >>>>>>>>>>>> OLTP databases deal with small row-level transactions. Postgres >>>>>>>>>>>> `current_timestamp` captures the transaction start time [1, 2]. >>>>>>>>>>>> Should we >>>>>>>>>>>> extend the same semantic to Iceberg: all rows added in the same >>>>>>>>>>>> snapshot >>>>>>>>>>>> should have the same timestamp value? >>>>>>>>>>>> >>>>>>>>>>>> [1] >>>>>>>>>>>> https://www.postgresql.org/docs/current/functions-datetime.html >>>>>>>>>>>> [2] >>>>>>>>>>>> https://neon.com/postgresql/postgresql-date-functions/postgresql-current_timestamp >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Dec 11, 2025 at 4:07 PM Micah Kornfield < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> Micah, are 1 and 2 the same? 3 is covered by this proposal. 
>>>>>>>>>>>>>> To support the created_by timestamp, we would need to >>>>>>>>>>>>>> implement the following row lineage behavior: >>>>>>>>>>>>>> * Initially, it inherits from the snapshot timestamp >>>>>>>>>>>>>> * during rewrite (like compaction), it should be persisted >>>>>>>>>>>>>> into data files. >>>>>>>>>>>>>> * during update, it needs to be carried over from the >>>>>>>>>>>>>> previous row. This is similar to the row_id carry over for row >>>>>>>>>>>>>> updates. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Sorry for the shorthand. These are not the same: >>>>>>>>>>>>> >>>>>>>>>>>>> 1. Insertion time - time the row was inserted. >>>>>>>>>>>>> 2. Created by - The system that created the record. >>>>>>>>>>>>> 3. Updated by - The system that last updated the record. >>>>>>>>>>>>> >>>>>>>>>>>>> Depending on the exact use case, these might or might not have >>>>>>>>>>>>> utility. I'm just wondering if there will be more examples like >>>>>>>>>>>>> this in the >>>>>>>>>>>>> future. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> The created_by column would likely incur significantly higher >>>>>>>>>>>>>> storage overhead compared to the updated_by column. As rows are >>>>>>>>>>>>>> updated >>>>>>>>>>>>>> over time, the cardinality for this column in data files can be >>>>>>>>>>>>>> high. Hence, >>>>>>>>>>>>>> the created_by column may not compress well. This is a similar >>>>>>>>>>>>>> problem for >>>>>>>>>>>>>> the row_id column. One side effect of enabling row lineage by >>>>>>>>>>>>>> default for >>>>>>>>>>>>>> V3 tables is the storage overhead of the row_id column after >>>>>>>>>>>>>> compaction, >>>>>>>>>>>>>> especially for narrow tables with few columns. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> I agree. I think this analysis also shows that some consumers >>>>>>>>>>>>> of Iceberg might not necessarily want to have all these columns, >>>>>>>>>>>>> so we >>>>>>>>>>>>> might want to make them configurable, rather than mandating them >>>>>>>>>>>>> for all >>>>>>>>>>>>> tables. Ryan's thought on default values seems like it would >>>>>>>>>>>>> solve the >>>>>>>>>>>>> issues I was raising. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Micah >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Dec 11, 2025 at 3:47 PM Ryan Blue <[email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> > An explicit timestamp column adds more burden to >>>>>>>>>>>>>> application developers. While some databases require an explicit >>>>>>>>>>>>>> column in >>>>>>>>>>>>>> the schema, those databases provide triggers to auto-set the >>>>>>>>>>>>>> column value. >>>>>>>>>>>>>> For Iceberg, the snapshot timestamp is the closest to the >>>>>>>>>>>>>> trigger timestamp. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Since the use cases don't require an exact timestamp, this >>>>>>>>>>>>>> seems like the best solution to get what people want (an >>>>>>>>>>>>>> insertion >>>>>>>>>>>>>> timestamp) that has clear and well-defined behavior. Since >>>>>>>>>>>>>> `current_timestamp` is defined by the SQL spec, it makes sense >>>>>>>>>>>>>> to me that >>>>>>>>>>>>>> we could use it and have reasonable behavior. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I've talked with Anton about this before and maybe he'll jump >>>>>>>>>>>>>> in on this thread. I think that we may need to extend default >>>>>>>>>>>>>> values to >>>>>>>>>>>>>> include default value expressions, like `current_timestamp` that >>>>>>>>>>>>>> is allowed >>>>>>>>>>>>>> by the SQL spec. 
That would solve the problem as well as some >>>>>>>>>>>>>> others (like >>>>>>>>>>>>>> `current_date` or `current_user`) and would not create a >>>>>>>>>>>>>> potentially >>>>>>>>>>>>>> misleading (and heavyweight) timestamp feature in the format. >>>>>>>>>>>>>> >>>>>>>>>>>>>> > Also some environments may have stronger clock service, >>>>>>>>>>>>>> like Spanner TrueTime service. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Even in cases like this, commit retries can reorder commits >>>>>>>>>>>>>> and make timestamps out of order. I don't think that we should >>>>>>>>>>>>>> be making >>>>>>>>>>>>>> guarantees or even exposing metadata that people might mistake >>>>>>>>>>>>>> as having >>>>>>>>>>>>>> those guarantees. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Dec 9, 2025 at 2:22 PM Steven Wu < >>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ryan, thanks a lot for the feedback! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regarding the concern for reliable timestamps, we are not >>>>>>>>>>>>>>> proposing using timestamps for ordering. With NTP in modern >>>>>>>>>>>>>>> computers, they >>>>>>>>>>>>>>> are generally reliable enough for the intended use cases. Also >>>>>>>>>>>>>>> some >>>>>>>>>>>>>>> environments may have stronger clock service, like Spanner >>>>>>>>>>>>>>> TrueTime service >>>>>>>>>>>>>>> <https://docs.cloud.google.com/spanner/docs/true-time-external-consistency> >>>>>>>>>>>>>>> . >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > joining to timestamps from the snapshots metadata table. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> As you also mentioned, it depends on the snapshot history, >>>>>>>>>>>>>>> which is often retained for a few days due to performance >>>>>>>>>>>>>>> reasons. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > embedding a timestamp in DML (like `current_timestamp`) >>>>>>>>>>>>>>> rather than relying on an implicit one from table metadata. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> An explicit timestamp column adds more burden to application >>>>>>>>>>>>>>> developers. While some databases require an explicit column in >>>>>>>>>>>>>>> the schema, >>>>>>>>>>>>>>> those databases provide triggers to auto set the column value. >>>>>>>>>>>>>>> For Iceberg, >>>>>>>>>>>>>>> the snapshot timestamp is the closest to the trigger timestamp. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also, the timestamp set during computation (like streaming >>>>>>>>>>>>>>> ingestion or relative long batch computation) doesn't capture >>>>>>>>>>>>>>> the time the >>>>>>>>>>>>>>> rows/files are added to the Iceberg table in a batch fashion. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > And for those use cases, you could also keep a longer >>>>>>>>>>>>>>> history of snapshot timestamps, like storing a catalog's event >>>>>>>>>>>>>>> log for >>>>>>>>>>>>>>> long-term access to timestamp info >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> this is not really consumable by joining the regular table >>>>>>>>>>>>>>> query with catalog event log. I would also imagine catalog >>>>>>>>>>>>>>> event log is >>>>>>>>>>>>>>> capped at shorter retention (maybe a few months) compared to >>>>>>>>>>>>>>> data retention >>>>>>>>>>>>>>> (could be a few years). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Dec 9, 2025 at 1:32 PM Ryan Blue <[email protected]> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I don't think it is a good idea to expose timestamps at the >>>>>>>>>>>>>>>> row level. 
Timestamps in metadata that would be carried down >>>>>>>>>>>>>>>> to the row >>>>>>>>>>>>>>>> level already confuse people that expect them to be useful or >>>>>>>>>>>>>>>> reliable, >>>>>>>>>>>>>>>> rather than for debugging. I think extending this to the row >>>>>>>>>>>>>>>> level would >>>>>>>>>>>>>>>> only make the problem worse. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> You can already get this information by projecting the last >>>>>>>>>>>>>>>> updated sequence number, which is reliable, and joining to >>>>>>>>>>>>>>>> timestamps from >>>>>>>>>>>>>>>> the snapshots metadata table. Of course, the drawback there is >>>>>>>>>>>>>>>> losing the >>>>>>>>>>>>>>>> timestamp information when snapshots expire, but since it >>>>>>>>>>>>>>>> isn't reliable >>>>>>>>>>>>>>>> anyway I'd be fine with that. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Some of the use cases, like auditing and compliance, are >>>>>>>>>>>>>>>> probably better served by embedding a timestamp in DML (like >>>>>>>>>>>>>>>> `current_timestamp`) rather than relying on an implicit one >>>>>>>>>>>>>>>> from table >>>>>>>>>>>>>>>> metadata. And for those use cases, you could also keep a >>>>>>>>>>>>>>>> longer history of >>>>>>>>>>>>>>>> snapshot timestamps, like storing a catalog's event log for >>>>>>>>>>>>>>>> long-term >>>>>>>>>>>>>>>> access to timestamp info. I think that would be better than >>>>>>>>>>>>>>>> storing it at >>>>>>>>>>>>>>>> the row level. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Mon, Dec 8, 2025 at 3:46 PM Steven Wu < >>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> For V4 spec, I have a small proposal [1] to expose the row >>>>>>>>>>>>>>>>> timestamp concept that can help with many use cases like >>>>>>>>>>>>>>>>> temporal queries, >>>>>>>>>>>>>>>>> latency tracking, TTL, auditing and compliance. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This *_last_updated_timestamp_ms * metadata column >>>>>>>>>>>>>>>>> behaves very similarly to the >>>>>>>>>>>>>>>>> *_last_updated_sequence_number* for row lineage. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Initially, it inherits from the snapshot timestamp. >>>>>>>>>>>>>>>>> - During rewrite (like compaction), its values are >>>>>>>>>>>>>>>>> persisted in the data files. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Would love to hear what you think. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>> Steven >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1cXr_RwEO6o66S8vR7k3NM8-bJ9tH2rkh4vSdMXNC8J8/edit?usp=sharing >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>
