sergioferragut commented on issue #6513: URL: https://github.com/apache/druid/issues/6513#issuecomment-941683407
Hi, I'm very new to Druid. I think this subject is important, at least in the IoT space, and even though this issue is old, ingestion timestamp calculation keeps coming up as an issue, request, or question. I think there are different data pipeline timestamps involved when streaming event messages:

* *event timestamp*: I think this is what most people in this discussion are concerned with, because this value is used as the primary partitioning timestamp. When the timestamp comes from the streamed message, ingestion is idempotent: re-ingesting the same message produces the same row. With system time it would not be. Using system time here doesn't make sense to me whether or not idempotency matters, because the time of the event becomes meaningless unless the event we are modeling IS the ingestion of the message, which I guess could be a use case.
* *published timestamp*: when the message was published to the message bus. This is the timestamp on the message envelope, and it is useful for measuring the lag between the event and its publication. Using the Druid system time at ingestion for this doesn't make sense at all, because it isn't representative of the publish time: if there are performance issues or recoveries, a long time could pass between publication and ingestion.
* *ingestion timestamp*: when the rows are ingested into the database. The immediate use is to measure event-to-ingestion lag. This is what I believe most comments on the subject are referring to.

Isn't this essentially the ability to use a *now()* or *current_timestamp()* function in a transformSpec expression? I think the more relevant case is when this value is ingested as a dimension rather than as the primary timestamp, or when a metric is calculated as *(now() - __time)*.

Anyway, I think it comes down to allowing *now()* or *current_timestamp()* in ingestion spec expressions, in both transform and metric specifications.

Sorry for the long comment.
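To make the three timestamps concrete, here is a minimal Python sketch of what I mean, outside of Druid entirely. The field names (`event_time`, `published_time`, `ingested_time`, `publish_lag_ms`, `ingest_lag_ms`) and the function `annotate_row` are hypothetical, made up for illustration; this is not Druid's API or expression language.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; naive values are assumed to be UTC."""
    ts = datetime.fromisoformat(value)
    return ts if ts.tzinfo is not None else ts.replace(tzinfo=timezone.utc)


def annotate_row(row: dict, ingested_at: Optional[datetime] = None) -> dict:
    """Return a copy of `row` with an ingestion timestamp and two lag values.

    `event_time` and `published_time` are hypothetical field names for the
    event timestamp and the message-envelope timestamp.
    """
    if ingested_at is None:
        # Plays the role of the requested now(): differs on every run,
        # so the output is not idempotent.
        ingested_at = datetime.now(timezone.utc)

    event_time = parse_ts(row["event_time"])
    published_time = parse_ts(row["published_time"])
    one_ms = timedelta(milliseconds=1)

    out = dict(row)  # leave the input row untouched
    out["ingested_time"] = ingested_at.isoformat()
    # Event-to-publication lag, in whole milliseconds.
    out["publish_lag_ms"] = (published_time - event_time) // one_ms
    # Event-to-ingestion lag, i.e. the (now() - __time) idea.
    out["ingest_lag_ms"] = (ingested_at - event_time) // one_ms
    return out
```

The event timestamp stays the primary time, and the ingestion time is only an extra column plus a derived lag. Passing `ingested_at` explicitly keeps a replay reproducible, while letting it default to the system clock gives a different value on every run, which is the idempotency trade-off mentioned above.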
:-)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
