Yicong-Huang commented on code in PR #3550:
URL: https://github.com/apache/texera/pull/3550#discussion_r2285786232
##########
core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/CostEstimator.scala:
##########
@@ -36,7 +39,15 @@ import scala.util.{Failure, Success, Try}
* A cost estimator should estimate a cost of running a region under the
given resource constraints as units.
*/
trait CostEstimator {
- def estimate(region: Region, resourceUnits: Int): Double
+
+ /**
+ * Uses the given resource units to allocate resources to the region, and
determine a cost based on the allocation.
+ *
+ * Note currently the ResourceAllocator is not cost-based and thus we use a
cost model that does not rely on the
+ * allocator, i.e., the cost estimation process is external to the
ResourceAllocator.
+ * @return An updated region with allocated resources and an estimated cost
of this region.
Review Comment:
for the return, I suggest we only return resource config and a cost. the
caller should be responsible to evaluate the cost and decide the resource
config to attach to the region.
##########
core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/CostBasedScheduleGenerator.scala:
##########
@@ -601,16 +589,31 @@ class CostBasedScheduleGenerator(
}
/**
- * The cost function used by the search. Takes a region plan, generates one
or more (to be done in the future)
- * schedules based on the region plan, and calculates the cost of the
schedule(s) using Cost Estimator. Uses the cost
- * of the best schedule (currently only considers one schedule) as the cost
of the region plan.
+ * Takes a region DAG, generates one or more (to be done in the future)
schedules based on the region DAG, allocates
+ * resources to each region in the region DAG, and calculates the cost of
the schedule(s) using Cost Estimator. Uses
+ * the cost of the best schedule (currently only considers one schedule) as
the cost of the region DAG.
*
* @return A cost determined by the cost estimator.
*/
- private def evaluate(regionPlan: RegionPlan): Double = {
+ private def allocateResourcesAndEvaluateCost(
+ regionDAG: DirectedAcyclicGraph[Region, RegionLink]
+ ): Double = {
+ val regionPlan =
+ RegionPlan(regionDAG.vertexSet().asScala.toSet,
regionDAG.edgeSet().asScala.toSet)
val schedule = generateScheduleFromRegionPlan(regionPlan)
// In the future we may allow multiple regions in a level and split the
resources.
- schedule.map(level => level.map(region => costEstimator.estimate(region,
1)).sum).sum
+ schedule
+ .map(level =>
+ level
+ .map(region => {
+ val (newRegion, regionCost) =
costEstimator.allocateResourcesAndEstimateCost(region, 1)
Review Comment:
`newRegion` is confusing. Better to return a resource config, instead of the
region with the updated resource.
##########
core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/CostEstimator.scala:
##########
@@ -73,14 +85,21 @@ class DefaultCostEstimator(
case Some(_) =>
}
- override def estimate(region: Region, resourceUnits: Int): Double = {
- this.operatorEstimatedTimeOption match {
+ override def allocateResourcesAndEstimateCost(
+ region: Region,
+ resourceUnits: Int
+ ): (Region, Double) = {
+ // Currently the dummy cost from resourceAllocator is discarded.
+ val (newRegion, _) = resourceAllocator.allocate(region)
Review Comment:
we might want to change the return of `resourceAllocator.allocate` to just
resourceConfigs, instead of an updated region.
##########
core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/CostEstimator.scala:
##########
@@ -73,14 +85,21 @@ class DefaultCostEstimator(
case Some(_) =>
}
- override def estimate(region: Region, resourceUnits: Int): Double = {
- this.operatorEstimatedTimeOption match {
+ override def allocateResourcesAndEstimateCost(
+ region: Region,
+ resourceUnits: Int
+ ): (Region, Double) = {
+ // Currently the dummy cost from resourceAllocator is discarded.
+ val (newRegion, _) = resourceAllocator.allocate(region)
Review Comment:
I am confused with the role of `resource allocator` and `cost estimator`. if
resource allocator is also returning a cost, why do we need another cost
estimator? are we going to stop letting resource allocator to emit a cost?
##########
core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/CostBasedScheduleGenerator.scala:
##########
@@ -601,16 +589,31 @@ class CostBasedScheduleGenerator(
}
/**
- * The cost function used by the search. Takes a region plan, generates one
or more (to be done in the future)
- * schedules based on the region plan, and calculates the cost of the
schedule(s) using Cost Estimator. Uses the cost
- * of the best schedule (currently only considers one schedule) as the cost
of the region plan.
+ * Takes a region DAG, generates one or more (to be done in the future)
schedules based on the region DAG, allocates
+ * resources to each region in the region DAG, and calculates the cost of
the schedule(s) using Cost Estimator. Uses
+ * the cost of the best schedule (currently only considers one schedule) as
the cost of the region DAG.
*
* @return A cost determined by the cost estimator.
*/
- private def evaluate(regionPlan: RegionPlan): Double = {
+ private def allocateResourcesAndEvaluateCost(
+ regionDAG: DirectedAcyclicGraph[Region, RegionLink]
Review Comment:
what's the difference between a `regionDAG` and a `regionPlan`?
##########
core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/CostEstimator.scala:
##########
@@ -73,14 +85,21 @@ class DefaultCostEstimator(
case Some(_) =>
}
- override def estimate(region: Region, resourceUnits: Int): Double = {
- this.operatorEstimatedTimeOption match {
+ override def allocateResourcesAndEstimateCost(
+ region: Region,
+ resourceUnits: Int
+ ): (Region, Double) = {
+ // Currently the dummy cost from resourceAllocator is discarded.
+ val (newRegion, _) = resourceAllocator.allocate(region)
+ // We use a cost model that does not rely on the resource allocation.
+ // TODO: Once the ResourceAllocator actually calculates a cost, we can use
its calculated cost.
Review Comment:
if we use the calculated cost from the resource allocator, we don't need
cost estimator any more. Maybe you should change the name to `region cost` and
`region plan cost`, as well as `regionResourceAllocator` and
`regionPlanCostEstimator` to differentiate their roles.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]