pjain1 opened a new pull request #10407:
URL: https://github.com/apache/druid/pull/10407
### Description
Currently there is no way to know how much data a task processes during
ingestion. This PR adds the `ingest/events/processedBytes` metric, which emits
the number of bytes read since the last emission.
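
To illustrate the "bytes read since the last emission" semantics, here is a minimal sketch of a delta counter. The class name `ProcessedBytesCounter` and its methods are hypothetical stand-ins for illustration only; they are not the `InputStats` API added by this PR.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a task-level stats holder; not the actual Druid class.
class ProcessedBytesCounter {
    // Running total of bytes processed; may be updated from reader threads.
    private final AtomicLong total = new AtomicLong();
    // Total observed at the previous emission; guarded by the method lock.
    private long lastEmitted = 0L;

    // Called by readers whenever they consume bytes.
    void increment(long bytes) {
        total.addAndGet(bytes);
    }

    long getTotal() {
        return total.get();
    }

    // Called by a periodic monitor: returns the bytes processed since the
    // previous call and remembers the new baseline.
    synchronized long deltaSinceLastEmission() {
        long current = total.get();
        long delta = current - lastEmitted;
        lastEmitted = current;
        return delta;
    }
}
```

A monitor built this way would report only the increment on each tick, so an idle task reports zero rather than repeating its cumulative total.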
- This PR adds an `InputStats` class, which is present in all task types and
acts as a holder for task-level counts, such as processed bytes in this case.
Standardized metrics can thus be added across task types in the future and
emitted using `InputStatsMonitor`, which is automatically initialized for all
tasks.
- This PR provides a convenience wrapper class named `CountableInputEntity`,
which can wrap any `InputEntity` to count the number of bytes processed through
that `InputEntity`. New implementations can emit this metric just by wrapping
the base input entity in it when creating an
`InputEntityIteratingReader`.
- Since Kafka and Kinesis do not use `InputEntity`, processed
bytes are incremented directly in `SeekableStreamIndexTaskRunner`, which has
access to `InputStats`.
- This does not support Firehoses
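
The wrapping approach described above can be sketched with a plain `java.io` stream decorator. This is only an illustration of the counting-wrapper idea: `CountingInputStream` is a hypothetical name, and the sketch deliberately avoids the Druid `InputEntity` interface, whose exact signature is not shown in this description.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical decorator: counts every byte read through the wrapped stream.
class CountingInputStream extends FilterInputStream {
    private final AtomicLong counter;

    CountingInputStream(InputStream in, AtomicLong counter) {
        super(in);
        this.counter = counter;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            counter.incrementAndGet(); // one byte consumed
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            counter.addAndGet(n); // n bytes consumed; -1 means end of stream
        }
        return n;
    }
}
```

A caller would wrap the base stream once, for example `new CountingInputStream(new ByteArrayInputStream(data), counter)`, and read as usual. A source that does not go through such a wrapper, as with the streaming case above, could instead add each record's length to the same counter directly.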
<hr>
This PR has:
- [x] been self-reviewed.
- [x] added documentation for new or modified features or behaviors.
- [x] added Javadocs for most classes and all non-trivial methods. Linked
related entities via Javadoc links.
- [x] added unit tests or modified existing tests to cover new code paths,
ensuring the threshold for [code
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
is met.
- [x] been tested in a test Druid cluster.
<hr>
##### Key changed/added classes in this PR
* `InputStats`
* `InputStatsMonitor`
* `CountableInputEntity`
* `AbstractBatchIndexTask`
* `SeekableStreamIndexTask`