layoaster opened a new issue, #14234:
URL: https://github.com/apache/druid/issues/14234

   ### Affected Version
   0.23.0
   
   ### Description
   
   I've set up the Prometheus Emitter (along with the `http` one) to export metrics to my Prometheus. However, the [`ingest/*` metrics](https://druid.apache.org/docs/0.23.0/operations/metrics.html#other-ingestion-metrics) are not emitted/populated in Prometheus. Curiously enough, the `ingest/kafka/*` metrics are reported properly.
   
   My cluster has 6 MiddleManagers (with Peons) and periodic batch ingestion tasks (`index_parallel`) scheduled.
   
   Here is my metrics-related [configuration](https://gist.github.com/layoaster/929d7c722068e4414146c843cc900ce1). I don't see any error messages in the Druid logs, and this is the corresponding metric mapping:
   
   ```json
   {
       "ingest/events/thrownAway" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "count", "help": "Number of events rejected because they 
are outside the windowPeriod."},
       "ingest/events/unparseable" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "count", "help": "Number of events rejected because the 
events are unparseable." },
       "ingest/events/duplicate" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "count", "help": "Number of events rejected because the 
events are duplicated."},
       "ingest/events/processed" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "count", "help": "Number of events successfully processed 
per emission period." },
       "ingest/events/messageGap" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "gauge", "help": "Time gap in milliseconds between the 
latest ingested event timestamp and the current system timestamp of metrics 
emission."},
       "ingest/rows/output" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "count", "help": "Number of Druid rows persisted."},
       "ingest/persists/count" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "count", "help": "Number of times persist occurred." },
       "ingest/persists/time" : { "dimensions" : ["dataSource"], "type" : 
"timer", "conversionFactor": 1000.0, "help": "Seconds spent doing intermediate 
persist."},
       "ingest/persists/cpu" : { "dimensions" : ["dataSource"], "type" : 
"timer", "conversionFactor": 1000000000.0, "help": "Cpu time in Seconds spent 
on doing intermediate persist." },
       "ingest/persists/backPressure" : { "dimensions" : ["dataSource", 
"taskId", "taskType"], "type" : "gauge", "help": "Seconds spent creating 
persist tasks and blocking waiting for them to finish." },
       "ingest/persists/failed" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "count", "help": "Number of persists that failed." },
       "ingest/handoff/failed" : { "dimensions" : ["dataSource", "taskId", 
"taskType"], "type" : "count", "help": "Number of handoffs that failed." },
       "ingest/merge/time" : { "dimensions" : ["dataSource"], "type" : "timer", 
"conversionFactor": 1000.0, "help": "Seconds spent merging intermediate 
segments" },
       "ingest/merge/cpu" : { "dimensions" : ["dataSource"], "type" : "timer", 
"conversionFactor": 1000000000.0, "help": "Cpu time in Seconds spent on merging 
intermediate segments."}
   }
   ```
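   One thing I'm now wondering: batch peons are short-lived, so if the emitter uses the default exporter strategy, Prometheus may simply never get a chance to scrape them before the task exits. If that's the cause, switching the peons to the pushgateway strategy might help. A rough sketch of what I mean (property names taken from the prometheus-emitter docs; the Pushgateway address is a placeholder):
   
   ```properties
   # Assumption: short-lived peon tasks exit before Prometheus scrapes the
   # exporter endpoint, so push metrics to a Pushgateway instead.
   druid.emitter.prometheus.strategy=pushgateway
   druid.emitter.prometheus.pushGatewayAddress=http://prometheus-pushgateway:9091
   ```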
   
   I'm not really sure whether I'm doing something wrong or whether these metrics are simply not reported for native batch ingestion tasks. Lastly, is there any metric that reports the current `status` of the Kafka supervisors?
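   In the meantime, I'm considering polling the Overlord's `/druid/indexer/v1/supervisor/<id>/status` endpoint myself and exposing the state as a numeric gauge. A rough sketch of the idea (the exact response shape and the state-to-number mapping are my assumptions, not anything Druid ships):
   
   ```python
   import json
   
   # Hypothetical mapping from supervisor states to gauge values; the set of
   # states and their meaning is an assumption based on the Overlord API docs.
   STATE_VALUES = {"RUNNING": 1, "SUSPENDED": 0, "UNHEALTHY_SUPERVISOR": -1}
   
   def supervisor_state_value(status_response: str) -> int:
       """Extract the supervisor state from a /status response body and
       convert it to a number a Prometheus gauge could expose."""
       body = json.loads(status_response)
       # Assumption: the state lives under payload.state in the response.
       state = body.get("payload", {}).get("state", "")
       return STATE_VALUES.get(state, -2)  # -2 = unknown state
   
   # Illustrative (truncated) response body:
   sample = '{"id": "my_supervisor", "payload": {"state": "RUNNING"}}'
   print(supervisor_state_value(sample))  # 1
   ```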
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
