jeongyooneo opened a new pull request #115: [NEMO-96] Modularize DataSkewPolicy to use MetricVertex and BarrierVertex
URL: https://github.com/apache/incubator-nemo/pull/115

JIRA: [NEMO-96: Modularize DataSkewPolicy to use MetricVertex and BarrierVertex](https://issues.apache.org/jira/projects/NEMO/issues/NEMO-96)

**Major changes:**
- Handle dynamic optimization via `MetricCollectionVertex` and `AggregationBarrierVertex` instead of `MetricCollectionBarrierVertex`
- For each shuffle edge with main output, a `MetricCollectionVertex` is inserted at compile time at the end of its source tasks to collect key frequency data
- For each shuffle edge with main output, an `AggregationBarrierVertex` is inserted at compile time. It aggregates the task-level key frequency data, which each `MetricCollectionVertex` collects and emits as additional tagged output

**Minor changes to note:**
- Added the encoder/decoder factories needed for aggregating dynamic optimization data (here, key frequency data)
- Modified `PipelineTranslator` to extract key encoders/decoders
- Modified `DataSkewRuntimePass` and the related code path to handle `Object`-type keys instead of integer hash-index keys

**Tests for the changes:**
- N/A (the unit tests for skew handling and `PerKeyMedianITCase` cover the changes)

**Other comments:**
- N/A

Closes #GITHUB_PR_NUMBER
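
For reviewers who want a picture of the collect-then-aggregate flow described under the major changes, here is a minimal, standalone sketch. It is not the code in this PR: `KeyFrequencySketch`, `collectKeyFrequencies`, and `aggregate` are hypothetical names, and the sketch assumes that key frequency data can be modeled as a plain `Map<Object, Long>` of key to count. The first method stands in for what a per-task metric collection step might do; the second stands in for the barrier-side aggregation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names are Nemo classes. */
final class KeyFrequencySketch {
  private KeyFrequencySketch() { }

  /** Per-task step (assumed): count how often each key occurs in one task's output. */
  static Map<Object, Long> collectKeyFrequencies(final List<?> keys) {
    final Map<Object, Long> frequencies = new HashMap<>();
    for (final Object key : keys) {
      frequencies.merge(key, 1L, Long::sum);
    }
    return frequencies;
  }

  /** Barrier step (assumed): merge the per-task maps into one overall frequency map. */
  static Map<Object, Long> aggregate(final List<Map<Object, Long>> perTask) {
    final Map<Object, Long> total = new HashMap<>();
    for (final Map<Object, Long> taskFrequencies : perTask) {
      taskFrequencies.forEach((key, count) -> total.merge(key, count, Long::sum));
    }
    return total;
  }

  public static void main(final String[] args) {
    // Two tasks feeding the same shuffle edge, each reporting its own counts.
    final Map<Object, Long> task0 = collectKeyFrequencies(Arrays.asList("a", "b", "a"));
    final Map<Object, Long> task1 = collectKeyFrequencies(Arrays.asList("a", "c"));
    System.out.println(aggregate(Arrays.asList(task0, task1)));
  }
}
```

Keys are typed as `Object` in the sketch to mirror the move away from integer hash-index keys noted above; the real encoders/decoders and tagged-output plumbing are deliberately left out.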
