phet commented on code in PR #4087:
URL: https://github.com/apache/gobblin/pull/4087#discussion_r1896430362


##########
gobblin-temporal/src/main/java/org/apache/gobblin/temporal/ddm/activity/impl/RecommendScalingForWorkUnitsLinearHeuristicImpl.java:
##########
@@ -27,16 +27,22 @@
 
 
 /**
- * Simple config-driven linear relationship between `remainingWork` and the resulting `setPoint`
+ * Simple config-driven linear recommendation for how many containers to use to complete the "remaining work" within a given {@link TimeBudget}, per:
  *
- *
- * TODO: describe algo!!!!!
+ *   a. from {@link WorkUnitsSizeSummary}, find how many (remaining) "top-level" {@link org.apache.gobblin.source.workunit.MultiWorkUnit}s of some mean size
+ *   b. from the configured {@link #AMORTIZED_NUM_BYTES_PER_MINUTE}, find the expected "processing rate" in bytes / minute
+ * 1. estimate the time required for processing a mean-sized `MultiWorkUnit` (MWU)
+ *   c. from {@link JobState}, find per-container `MultiWorkUnit` parallelism capacity (aka. "worker-slots") to base the recommendation upon
+ * 2. calculate the per-container throughput of MWUs per minute
+ * 3. estimate the total per-container-minutes required to process all MWUs
+ *   d. from the {@link TimeBudget}, find the target number of minutes in which to complete processing of all MWUs
+ * 4. recommend the number of containers so all MWU processing should finish within the target number of minutes

Review Comment:
   no, the input parameterization is lettered and the algo calculations are 
numbered
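
   For illustration only, the lettered inputs (a-d) and numbered calculations (1-4) in the Javadoc can be sketched roughly as below. This is not the code from the PR: the class name, method name, and parameter names are hypothetical, and the rounding choices (round up, never recommend fewer than one container) are assumptions the Javadoc does not state.

```java
// Hypothetical sketch of the Javadoc's steps; NOT the implementation in PR #4087.
final class LinearScalingSketch {
  private LinearScalingSketch() {}

  /**
   * @param numMWUs                 (a) remaining top-level MultiWorkUnits
   * @param meanBytesPerMWU         (a) mean size of one MultiWorkUnit, in bytes
   * @param bytesPerMinute          (b) configured amortized processing rate; must be > 0
   * @param workerSlotsPerContainer (c) per-container MWU parallelism; must be > 0
   * @param targetMinutes           (d) time budget, in minutes; must be > 0
   */
  static int recommendNumContainers(long numMWUs, double meanBytesPerMWU, long bytesPerMinute,
      int workerSlotsPerContainer, long targetMinutes) {
    // 1. time required to process one mean-sized MWU
    double minutesPerMWU = meanBytesPerMWU / bytesPerMinute;
    // 2. per-container throughput, in MWUs per minute
    double mwusPerMinutePerContainer = workerSlotsPerContainer / minutesPerMWU;
    // 3. total container-minutes required to process all MWUs
    double containerMinutes = numMWUs / mwusPerMinutePerContainer;
    // 4. containers needed to finish within the target (assumed: round up, minimum of 1)
    return (int) Math.max(1, Math.ceil(containerMinutes / targetMinutes));
  }
}
```

   With 100 MWUs of mean size 600 bytes, a rate of 60 bytes / minute, 5 worker-slots per container, and a 50-minute budget, the steps give 10 minutes per MWU, 0.5 MWUs per container-minute, 200 container-minutes, and so 4 containers.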



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
