jianghuazhu commented on PR #7191:
URL: https://github.com/apache/ozone/pull/7191#issuecomment-2348121763
@errose28, your suggestion is very good.
I will modify this PR later.
All time-based metrics in all commands use MutableRate to calculate the
average time. I think the existing logic for calculating the total time should
be deleted.
In addition, I think two new subtasks should be created:
jira 1: Add a transferredBytes metric to all replication commands, including
ECReconstructionCoordinatorTask and ReplicationTask. An abstract method should
be added to AbstractReplicationTask:
```
public abstract void addTransferredBytes(long transferredBytes);
```
For the transferredBytes metric in ECReconstructionCoordinatorTask, we need
to do some work in ECReconstructionCoordinator and
ECBlockReconstructedStripeInputStream.
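
A minimal sketch of how jira 1 might look, assuming a heavily simplified stand-in for AbstractReplicationTask (the real class carries much more state). The `SketchReplicationTask` subclass and the `getTransferredBytes` accessor are hypothetical names used only for illustration; they are not part of Ozone.

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified stand-in for the real AbstractReplicationTask (assumption).
abstract class AbstractReplicationTask {
  /** Called by a task each time it has transferred more data. */
  public abstract void addTransferredBytes(long transferredBytes);
}

// Hypothetical subclass: accumulates the bytes so a metric could read them.
class SketchReplicationTask extends AbstractReplicationTask {
  private final AtomicLong transferredBytes = new AtomicLong();

  @Override
  public void addTransferredBytes(long bytes) {
    // AtomicLong keeps the counter safe if several threads report progress.
    transferredBytes.addAndGet(bytes);
  }

  // Hypothetical accessor a metrics class could poll.
  long getTransferredBytes() {
    return transferredBytes.get();
  }
}
```

In this sketch each task only accumulates a counter; how the value is then published (for example through the existing replication metrics) is left open.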
jira 2: Shared queues seem to be useful only for replication commands, so we
should add a default method to CommandHandler:
```
default long getQueueTime() {
  return -1;
}
```
If other commands need to use shared queues, they can simply override this
method. In addition, this metric also needs to be collected in
CommandHandlerMetrics#getMetrics().
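
A rough sketch of the idea in jira 2, assuming a simplified stand-in for the CommandHandler interface. The `name()` method, the two handler classes, and `QueueTimeCollector` are hypothetical and exist only to make the example self-contained. Skipping handlers that return -1 is one possible way to collect the metric, not necessarily what CommandHandlerMetrics#getMetrics() would do.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the real CommandHandler interface (assumption).
interface CommandHandler {
  String name(); // hypothetical, used only as a map key below

  /** -1 signals that this handler does not use a shared queue. */
  default long getQueueTime() {
    return -1;
  }
}

// Hypothetical handler without a shared queue: keeps the default.
class CloseHandlerSketch implements CommandHandler {
  @Override
  public String name() {
    return "close";
  }
}

// Hypothetical handler with a shared queue: overrides the default.
class ReplicateHandlerSketch implements CommandHandler {
  @Override
  public String name() {
    return "replicate";
  }

  @Override
  public long getQueueTime() {
    return 42L; // placeholder value for illustration
  }
}

// Hypothetical collector: reports queue time only where it is meaningful.
final class QueueTimeCollector {
  static Map<String, Long> collect(List<CommandHandler> handlers) {
    Map<String, Long> queueTimes = new LinkedHashMap<>();
    for (CommandHandler handler : handlers) {
      long queueTime = handler.getQueueTime();
      if (queueTime >= 0) {
        queueTimes.put(handler.name(), queueTime);
      }
    }
    return queueTimes;
  }
}
```

With a default method, existing handlers compile unchanged, and only handlers backed by a shared queue need to override getQueueTime().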
In addition, there is another issue worth discussing. There are timeout
metrics in the replication commands and DeleteContainerCommandHandler, but the
timeoutCount of DeleteContainerCommandHandler is not used in many places. In
theory, all commands can time out or fail. Currently, for commands other than
the replication commands, we only know the total number of calls. The
granularity discussed here may be a bit too fine.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]