[ https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Ma updated TEZ-3269: ------------------------- Attachment: TEZ-3269-4.patch Thanks @sseth. bq. final where possible Fixed. bq. Would be good to have some more documentation. I have added more. Please let me know if it is enough. bq. the ceil method Mind clarify what you mean here? bq. getTotalStatsAtIndex to getCurrentlyKnownStatsAtIndex Fixed. bq. expectedTotalSourceTasksOutputSize / numOfPartitions Fixed. bq. onVertexStarted - Should this be split up a little more Good point. Fixed it by adding an overridable onVertexStartedCheck function. bq. This ensures that averaging of stats based on output size isn't accidentally hit on a 0 sized partition? This is used to detect the case partition stats reporting isn't enabled. The new patch is updated to use boolean partitionStatsReported instead. bq. which the current ShuffleVertexManager doesn't do yet? That is right. TEZ-2962 is the jira for that. bq. Will the parallelism ever end up getting increased? yes. When {{FairRoutingType#FAIR_PARALLELISM}} is used, there will be multiple destination tasks processing the same partition and each destination task will process a range of source tasks of that partition. bq. Any thoughts on what it will take to move this to support multiple source vertices Current implementation supports multiple source vertices if REDUCE_PARALLELISM is used. More work is needed for the FAIR_PARALLELISM case. Opened TEZ-3500 for further discussion. > Provide basic fair routing and scheduling functionality via custom > VertexManager and EdgeManager > ------------------------------------------------------------------------------------------------ > > Key: TEZ-3269 > URL: https://issues.apache.org/jira/browse/TEZ-3269 > Project: Apache Tez > Issue Type: Sub-task > Reporter: Ming Ma > Assignee: Ming Ma > Attachments: TEZ-3269-2.patch, TEZ-3269-3.patch, TEZ-3269-4.patch, > TEZ-3269.patch > > > With TEZ-3206 and TEZ-3216, we can build a custom VertexManager and > EdgeManager that uses partition stats to do fair routing as well as the > scheduling based on destination tasks’ dependency on source tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)