htpawel commented on PR #37936:
URL: https://github.com/apache/airflow/pull/37936#issuecomment-2024692435

   Ok, I did further investigation and now I know everything :D 
   Statsd by convention **should always** emit milliseconds. (See `ms` in their 
code: `self._send_stat(stat, '%0.6f|ms' % delta, rate)`). And that is why they 
are requesting time delta object or value in milliseconds: (in their code: 
`delta can be either a number of milliseconds or a timedelta.`). But later you 
need to do something with those metrics, so we (and probably most of the 
people) use statsd-exporter from prometheus. And in statsd-exporter 
documentation you can read `Timers will be accepted with the ms statsd type. 
Statsd timer data is transmitted in milliseconds, while Prometheus expects the 
unit to be seconds. The exporter converts all timer observations to seconds.`. 
So that is why I thought it is emitted in seconds, I was just checking 
statsd-exporter /metrics endpoint. So it turns out that not this, but other 
metrics are wrong in documentation, eg. dagrun.schedule_delay.<dag_id> as it 
passes time delta to Stats.timing so emits milliseconds but airflow 
documentation say
 s those are seconds. And above two from this PR are indeed emitted in seconds, 
but obviously it is not correct either as those should be milliseconds and are 
emitted with `ms` suffix. So there are more things f*cked up generally. 
Timing/Timer should always receive milliseconds value or time delta object and 
it emits milliseconds, which can be later consumed by statsd-exporter and 
exporter in seconds for prometheus to scrape them. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to