Hi community,

To better match the slogan `Choose good tools, Back home early. Use Right Scheduler, Sleep Tight`, this mail thread proposes improving the monitoring of DolphinScheduler.
Currently, the officially released versions of DS offer only a `statistics` feature and no metrics that can be exposed to and monitored in external systems such as the `Prometheus + Grafana` solution. Metrics also help users prevent scheduling failures and track down bugs when failures do happen.

We found that DS has already integrated `Micrometer`, so we are working on adding metrics to DolphinScheduler based on the previous work in https://github.com/apache/dolphinscheduler/pull/6840

This proposal has been brought up and discussed several times in the community bi-weekly meetings, and an initial PR has been submitted by Wenjun. For details such as progress and action items, please check these two links:

https://docs.qq.com/doc/DTGFiSkRIbHBIeVp3
https://github.com/apache/dolphinscheduler/issues/9324

Discussions and suggestions are welcome and appreciated! You can either reply directly to this mail thread or comment on the GitHub issue https://github.com/apache/dolphinscheduler/issues/9324

Thanks!

--
Best Regards
Eric Gao
