phet commented on code in PR #4087:
URL: https://github.com/apache/gobblin/pull/4087#discussion_r1896430362

##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/RecommendScalingForWorkUnitsLinearHeuristicImpl.java:
##########
@@ -27,16 +27,22 @@
 
 /**
- * Simple config-driven linear relationship between `remainingWork` and the resulting `setPoint`
+ * Simple config-driven linear recommendation for how many containers to use to complete the "remaining work" within a given {@link TimeBudget}, per:
  *
- *
- * TODO: describe algo!!!!!
+ * a. from {@link WorkUnitsSizeSummary}, find how many (remaining) "top-level" {@link org.apache.gobblin.source.workunit.MultiWorkUnit}s of some mean size
+ * b. from the configured {@link #AMORTIZED_NUM_BYTES_PER_MINUTE}, find the expected "processing rate" in bytes / minute
+ * 1. estimate the time required for processing a mean-sized `MultiWorkUnit` (MWU)
+ * c. from {@link JobState}, find per-container `MultiWorkUnit` parallelism capacity (aka. "worker-slots") to base the recommendation upon
+ * 2. calculate the per-container throughput of MWUs per minute
+ * 3. estimate the total per-container-minutes required to process all MWUs
+ * d. from the {@link TimeBudget}, find the target number of minutes in which to complete processing of all MWUs
+ * 4. recommend the number of containers so all MWU processing should finish within the target number of minutes

Review Comment:
   no, the input parameterization is lettered and the algo calculations are numbered
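
For readers following the javadoc in the hunk above, the lettered inputs (a-d) and numbered calculations (1-4) can be sketched roughly as below. This is only an illustration of the arithmetic as the comment describes it, not the code from the PR: the class name `LinearScalingSketch`, the method `recommendNumContainers`, and all parameter names are hypothetical, and rounding the result up to a whole container is an assumption.

```java
// Hypothetical sketch of the lettered inputs (a-d) and numbered steps (1-4)
// from the javadoc; none of these names come from the Gobblin codebase.
final class LinearScalingSketch {

    private LinearScalingSketch() {}

    /**
     * @param numMWUs                 (a) count of remaining top-level MultiWorkUnits
     * @param meanBytesPerMWU         (a) mean size of one MultiWorkUnit, in bytes
     * @param amortizedBytesPerMinute (b) configured processing rate, in bytes / minute
     * @param workerSlotsPerContainer (c) MWUs one container can process in parallel
     * @param targetMinutes           (d) minutes in which all MWUs should finish
     */
    static long recommendNumContainers(long numMWUs,
                                       double meanBytesPerMWU,
                                       long amortizedBytesPerMinute,
                                       int workerSlotsPerContainer,
                                       long targetMinutes) {
        if (numMWUs < 0 || meanBytesPerMWU <= 0 || amortizedBytesPerMinute <= 0
                || workerSlotsPerContainer <= 0 || targetMinutes <= 0) {
            throw new IllegalArgumentException("inputs must be positive (numMWUs may be zero)");
        }

        // 1. time required to process one mean-sized MWU
        double minutesPerMWU = meanBytesPerMWU / amortizedBytesPerMinute;

        // 2. per-container throughput, in MWUs per minute
        double mwusPerMinutePerContainer = workerSlotsPerContainer / minutesPerMWU;

        // 3. total container-minutes needed to process every MWU
        double totalContainerMinutes = numMWUs / mwusPerMinutePerContainer;

        // 4. containers needed to finish within the target
        //    (rounding up is an assumption of this sketch)
        return (long) Math.ceil(totalContainerMinutes / targetMinutes);
    }

    public static void main(String[] args) {
        // 1000 MWUs of 600 MB each, 200 MB/minute, 4 worker-slots, 60-minute budget
        long n = recommendNumContainers(1000, 600_000_000.0, 200_000_000L, 4, 60);
        System.out.println("recommended containers: " + n);
    }
}
```

With these example figures, each MWU takes 3 minutes, one container handles 4/3 MWUs per minute, the job needs about 750 container-minutes, and a 60-minute budget therefore calls for 13 containers after rounding up. The result scales linearly with the number of MWUs, which matches the "linear recommendation" wording in the javadoc.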