seojangho opened a new pull request #51: [NEMO-102] Stage Partitioning by PhysicalPlanGenerator
URL: https://github.com/apache/incubator-nemo/pull/51

JIRA: [NEMO-102: Stage Partitioning by PhysicalPlanGenerator](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-102)

**Major changes:**
- Removed the StageId property
- Replaced DefaultStagePartitioningPass with StagePartitioner in nemo-runtime-common
- Modified PhysicalPlanGenerator to use StagePartitioner
- Added stage-level properties. Properties shared by all vertices in a stage become stage-level properties (except for properties ignored by StagePartitioner).
- Ad-hoc properties in Task and Stage, such as containerType, can now be handled by stage-level properties.
- Replaced ScheduleGroupPass with DefaultScheduleGroupPass, which does not require the StageId property when assigning ScheduleGroupIndex

**Minor changes to note:**
- Removed StageBuilder and StageEdgeBuilder
- Added a feature to the visualizer to display stage-level ExecutionProperties
- Modified the visualizer to properly display other ExecutionProperties
- Modified the visualizer to properly draw StageEdges so that those edges are not cut by stage boundaries
- Removed the parallelism equality check in DAGBuilder, which requires the IR-level StageId property (the check is replaced by the constructor of Stage and the sanity check in PhysicalPlanGenerator)
- Renamed DataStoreProperty to InterTaskDataStoreProperty because it only controls data flow that spans different stages
- Modified ExecutionPropertyMap#toString to emit the canonical name of the property key
- Added equality test cases to ExecutionPropertyMapTest
- Removed the `idToIRVertex` parameter from the constructor of PhysicalPlan. (The parameter value can be inferred from the IR DAG supplied to the constructor.)
- Increased the capacity of resources in `beam_sample_executor_resources.json`, because some integration test cases require scheduling a large ScheduleGroup at once. (For example, ScheduleGroup 0 in AlternatingLeastSquareITCase_pado requires 15 slots in the Transient resource.)
- Modified StageEdge to use consistent naming for executionProperties when emitting JSON (edgeProperties → executionProperties)

**Tests for the changes:**
- Added StagePartitionerTest to test StagePartitioner under various test scenarios.
- Renamed ScheduleGroupPassTest to DefaultScheduleGroupPassTest and made it test DefaultScheduleGroupPass under various test scenarios.
- Existing tests should cover the changes to PhysicalPlanGenerator.

**Other comments:**
- The legacy ScheduleGroupPass forces stages that contain a SourceVertex to have a ScheduleGroupIndex of zero. Since DefaultScheduleGroupPass does not employ this kind of trick, ScheduleGroup 0 for `TestPlanGenerator.PlanType.TwoVerticesJoined` in FaultToleranceTest is split into two ScheduleGroups. That's why I made a modification like `if (stage.getScheduleGroupIndex() == 0 || stage.getScheduleGroupIndex() == 1) {` in FaultToleranceTest.

resolves [NEMO-102](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-102)
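
For readers unfamiliar with the stage-level property idea described under the major changes, here is a minimal sketch of how properties shared by all vertices in a stage could be collected while skipping ignored keys. It is not the code from this PR: the class `StagePropertySketch`, the method `commonProperties`, the use of plain string maps in place of ExecutionPropertyMap, and the key `hypotheticalIgnoredKey` are all assumptions made for illustration.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical illustration only; none of these names come from the Nemo codebase.
class StagePropertySketch {

    /**
     * Returns the key/value pairs on which every vertex agrees,
     * leaving out any key listed in ignoredKeys.
     */
    public static Map<String, String> commonProperties(
            Collection<Map<String, String>> vertexProperties,
            Set<String> ignoredKeys) {
        Map<String, String> common = null;
        for (Map<String, String> props : vertexProperties) {
            if (common == null) {
                // Seed the candidate set from the first vertex, minus ignored keys.
                common = new HashMap<>(props);
                common.keySet().removeAll(ignoredKeys);
            } else {
                // Drop every candidate that this vertex lacks or sets differently.
                common.entrySet().removeIf(
                        e -> !Objects.equals(e.getValue(), props.get(e.getKey())));
            }
        }
        return common == null ? new HashMap<>() : common;
    }

    public static void main(String[] args) {
        Map<String, String> v1 = Map.of(
                "parallelism", "4",
                "containerType", "Transient",
                "hypotheticalIgnoredKey", "a");
        Map<String, String> v2 = Map.of(
                "parallelism", "4",
                "containerType", "Reserved",
                "hypotheticalIgnoredKey", "a");
        // Only "parallelism" is both shared and not ignored.
        System.out.println(commonProperties(List.of(v1, v2), Set.of("hypotheticalIgnoredKey")));
    }
}
```

In this sketch, a property such as containerType is promoted to the stage level only when every vertex in the stage carries the same value for it; a value that differs between vertices, or a key on the ignore list, stays out of the stage-level map.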
