seojangho opened a new pull request #51: [NEMO-102] Stage Partitioning by 
PhysicalPlanGenerator
URL: https://github.com/apache/incubator-nemo/pull/51
 
 
   JIRA: [NEMO-102: Stage Partitioning by 
PhysicalPlanGenerator](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-102)
   
   **Major changes:**
   - Removed StageId property
   - Replaced DefaultStagePartitioningPass with StagePartitioner in 
nemo-runtime-common
   - Modified PhysicalPlanGenerator to use StagePartitioner
   - Added stage-level properties: properties shared by all vertices in a 
stage become stage-level properties (except for properties ignored by 
StagePartitioner)
   - Ad-hoc properties in Task and Stage, such as containerType, can now be 
handled by stage-level properties
   - Replaced ScheduleGroupPass with DefaultScheduleGroupPass, which does not 
require the StageId property when assigning ScheduleGroupIndex
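
To illustrate the stage-level property rule above, here is a minimal sketch, not the code in this PR. It assumes properties can be modeled as plain string key/value maps; `StagePropertySketch`, `commonProperties`, and the property keys used in `main` are hypothetical names chosen for illustration and are not part of Nemo's API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public final class StagePropertySketch {
  /**
   * Keeps only the key/value pairs that every vertex in the stage agrees on,
   * skipping keys that the partitioner is told to ignore.
   * Assumes property values are non-null.
   */
  static Map<String, String> commonProperties(
      final List<Map<String, String>> vertexProperties,
      final Set<String> ignoredKeys) {
    final Map<String, String> common = new HashMap<>();
    if (vertexProperties.isEmpty()) {
      return common;
    }
    // Start from the first vertex and drop the ignored keys.
    common.putAll(vertexProperties.get(0));
    common.keySet().removeAll(ignoredKeys);
    // Drop every entry that some other vertex lacks or disagrees with.
    for (final Map<String, String> props
        : vertexProperties.subList(1, vertexProperties.size())) {
      common.entrySet().removeIf(e -> !e.getValue().equals(props.get(e.getKey())));
    }
    return common;
  }

  public static void main(final String[] args) {
    // Hypothetical keys and values, for illustration only.
    final Map<String, String> v1 =
        Map.of("parallelism", "4", "containerType", "Transient", "ignoredKey", "a");
    final Map<String, String> v2 =
        Map.of("parallelism", "4", "containerType", "Transient", "ignoredKey", "b");
    System.out.println(commonProperties(List.of(v1, v2), Set.of("ignoredKey")));
  }
}
```

In this sketch, a property that differs between vertices is simply left out of the stage-level map; how the real StagePartitioner treats such a property is not shown here.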
   
   **Minor changes to note:**
   - Removed StageBuilder and StageEdgeBuilder
   - Added a feature to the visualizer to display stage-level 
ExecutionProperties
   - Modified the visualizer to properly display other ExecutionProperties
   - Modified the visualizer to properly draw StageEdges so that those edges 
are not cut by stage boundaries
   - Removed the parallelism equality check in DAGBuilder, which required the 
IR-level StageId property (the check is now performed by the constructor of 
Stage and by sanity checking in PhysicalPlanGenerator)
   - Renamed DataStoreProperty to InterTaskDataStoreProperty because it only 
controls data flow that spans different stages
   - Modified ExecutionPropertyMap#toString to emit the canonical name of each 
property key
   - Added equality test cases to ExecutionPropertyMapTest
   - Removed `idToIRVertex` parameter from the constructor of PhysicalPlan. 
(The parameter value can be inferred from the IR dag supplied to the 
constructor.)
   - Increased the capacity of resources in 
`beam_sample_executor_resources.json`, because some integration test cases 
require scheduling a large ScheduleGroup at once. (For example, ScheduleGroup 0 
in AlternatingLeastSquareITCase_pado requires 15 slots in Transient resources.)
   - Modified StageEdge to use consistent naming for executionProperties when 
emitting json (edgeProperties → executionProperties)
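
The parallelism check that moved out of DAGBuilder could look roughly like the following. This is a minimal sketch, not the code in this PR: `StageParallelismCheck` and `sharedParallelism` are hypothetical names, and parallelism values are passed as plain integers rather than read from Nemo's execution properties.

```java
import java.util.List;

public final class StageParallelismCheck {
  /**
   * Returns the parallelism shared by all vertices of a stage,
   * or throws if the list is empty or the vertices disagree.
   */
  static int sharedParallelism(final List<Integer> vertexParallelisms) {
    if (vertexParallelisms.isEmpty()) {
      throw new IllegalArgumentException("A stage needs at least one vertex");
    }
    final int expected = vertexParallelisms.get(0);
    for (final int parallelism : vertexParallelisms) {
      if (parallelism != expected) {
        throw new IllegalStateException(
            "Parallelism mismatch within stage: " + expected + " vs " + parallelism);
      }
    }
    return expected;
  }

  public static void main(final String[] args) {
    // All vertices agree, so the stage-wide parallelism is well defined.
    System.out.println(sharedParallelism(List.of(4, 4, 4)));
  }
}
```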
   
   **Tests for the changes:**
   - Added StagePartitionerTest to test StagePartitioner under various test 
scenarios.
   - Renamed ScheduleGroupPassTest to DefaultScheduleGroupPassTest and made it 
test DefaultScheduleGroupPass under various test scenarios.
   - Existing tests should cover changes on PhysicalPlanGenerator.
   
   **Other comments:**
   - The legacy ScheduleGroupPass forces stages that contain a SourceVertex to 
have a ScheduleGroupIndex of zero. Since DefaultScheduleGroupPass does not 
employ this kind of trick, in FaultToleranceTest ScheduleGroup 0 for 
`TestPlanGenerator.PlanType.TwoVerticesJoined` is split into two 
ScheduleGroups. That is why I changed the condition in FaultToleranceTest to 
`if (stage.getScheduleGroupIndex() == 0 || stage.getScheduleGroupIndex() == 1) {`.
   
   resolves 
[NEMO-102](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-102)
