scwhittle commented on code in PR #26085:
URL: https://github.com/apache/beam/pull/26085#discussion_r1213150951
##########
runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/windmill/GrpcWindmillServer.java:
##########
@@ -969,6 +987,76 @@ protected void startThrottleTimer() {
getWorkThrottleTimer.start();
}
+ private class GetWorkTimingInfosTracker {
+ private final Map<State, Duration> getWorkStreamLatencies;
+
+ public GetWorkTimingInfosTracker() {
+ this.getWorkStreamLatencies = new EnumMap<>(State.class);
+ }
+
+ public void addTimingInfo(Collection<GetWorkStreamTimingInfo> infos) {
+    // We want to record the duration of each stage and also reflect the total
+    // work item processing time. This can be tricky because the timings of
+    // different StreamingGetWorkResponseChunks can be interleaved. The current
+    // strategy is to record the maximum duration of each stage across chunks,
+    // which lets us identify the slow stage. Note, however, that the sum of
+    // the slowest per-stage durations may be larger than the duration from
+    // first chunk creation to last chunk reception by the user worker.
Review Comment:
I was thinking we could scale the transmit times so that those portions of the
latency attribution match the time it takes from generating the work to
assembling it on the user worker.
The elapsed transmit time would be something like [GET_WORK_CREATION_END time,
last chunk arrived at user worker].
Then we can scale the transmit latency components (i.e.
GET_WORK_IN_TRANSIT_TO_USER_WORKER and GET_WORK_IN_TRANSIT_TO_DISPATCHER) so
that their sum equals that transmit time.
I think this is worthwhile as we look into low-latency processing: user worker
processing isn't necessarily going to take longer, and it's confusing if the
total of all the latencies can be larger than the actual total time to process.
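To illustrate the suggestion, here is a minimal, hypothetical sketch (not the PR's actual implementation; the class and method names are made up) of proportionally scaling the recorded transit durations so their sum equals the transmit time observed on the user worker:

```java
import java.time.Duration;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: shrink (or stretch) the recorded per-stage transit
// durations so their sum matches the wall-clock transmit time actually
// observed, i.e. [GET_WORK_CREATION_END, last chunk arrived at user worker].
public class TransitLatencyScaler {

  enum State {
    GET_WORK_IN_TRANSIT_TO_DISPATCHER,
    GET_WORK_IN_TRANSIT_TO_USER_WORKER
  }

  // Scales each recorded duration by (observed transmit millis / recorded sum
  // millis) so the attributed transit latencies sum to the observed time while
  // preserving their relative proportions.
  static Map<State, Duration> scaleToObservedTransmitTime(
      Map<State, Duration> recorded, Duration observedTransmit) {
    Map<State, Duration> scaled = new EnumMap<>(State.class);
    long sumMillis = recorded.values().stream().mapToLong(Duration::toMillis).sum();
    if (sumMillis <= 0) {
      return scaled; // nothing recorded, nothing to attribute
    }
    double factor = (double) observedTransmit.toMillis() / sumMillis;
    recorded.forEach(
        (state, d) ->
            scaled.put(state, Duration.ofMillis(Math.round(d.toMillis() * factor))));
    return scaled;
  }

  public static void main(String[] args) {
    Map<State, Duration> recorded = new EnumMap<>(State.class);
    recorded.put(State.GET_WORK_IN_TRANSIT_TO_DISPATCHER, Duration.ofMillis(300));
    recorded.put(State.GET_WORK_IN_TRANSIT_TO_USER_WORKER, Duration.ofMillis(100));
    // Suppose the observed transmit window was only 200 ms: the 300:100 ratio
    // is preserved, but the components now sum to 200 ms instead of 400 ms.
    Map<State, Duration> scaled =
        scaleToObservedTransmitTime(recorded, Duration.ofMillis(200));
    System.out.println(scaled.get(State.GET_WORK_IN_TRANSIT_TO_DISPATCHER).toMillis()); // 150
    System.out.println(scaled.get(State.GET_WORK_IN_TRANSIT_TO_USER_WORKER).toMillis()); // 50
  }
}
```

With this scaling the attributed latencies can never sum to more than the measured end-to-end transmit time, which addresses the confusion the comment describes.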
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]