[
https://issues.apache.org/jira/browse/TEZ-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ming Ma updated TEZ-3269:
-------------------------
Attachment: TEZ-3269.patch
Here is the draft patch. It supports two policies w.r.t. fair routing.
* One policy is auto reduce, somewhat similar to ShuffleVertexManager where the
number of partitions can be reduced based on data size. Instead of using the
overall size as ShuffleVertexManager does, it uses per-partition stats, e.g.
TEZ-2962.
* Another routing policy is fair routing. Any destination task can fetch a
consecutive range of partitions from a consecutive range of source tasks. Note
that the patch only supports one bipartite edge. Supporting more than one
bipartite edge requires more work. We can open another jira if we need to
support that.
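
The two policies above can be sketched roughly as follows. This is only an
illustration and is not taken from the patch: the class name
FairRoutingSketch, the methods groupPartitions and splitSourceTasks, and the
target size parameter are all hypothetical. It assumes per-partition sizes in
MB are already known, groups consecutive partitions until a target size is
reached (auto reduce), and splits the source tasks into consecutive ranges so
that several destination tasks could share one oversized partition (fair
routing).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the TEZ-3269 patch.
class FairRoutingSketch {

    // Auto reduce: group consecutive partitions so that each group holds at
    // most about targetSizeMb of data. Each int[] is {firstPartition, lastPartition}.
    static List<int[]> groupPartitions(int[] partitionSizesMb, long targetSizeMb) {
        List<int[]> ranges = new ArrayList<>();
        int start = 0;
        long running = 0;
        for (int p = 0; p < partitionSizesMb.length; p++) {
            // Close the current group before adding p would exceed the target.
            if (p > start && running + partitionSizesMb[p] > targetSizeMb) {
                ranges.add(new int[] {start, p - 1});
                start = p;
                running = 0;
            }
            running += partitionSizesMb[p];
        }
        if (partitionSizesMb.length > 0) {
            ranges.add(new int[] {start, partitionSizesMb.length - 1});
        }
        return ranges;
    }

    // Fair routing: split source tasks 0..numSourceTasks-1 into up to `pieces`
    // consecutive ranges of nearly equal length. Each int[] is {firstTask, lastTask}.
    static List<int[]> splitSourceTasks(int numSourceTasks, int pieces) {
        List<int[]> ranges = new ArrayList<>();
        int base = numSourceTasks / pieces;
        int extra = numSourceTasks % pieces;
        int start = 0;
        for (int i = 0; i < pieces; i++) {
            int len = base + (i < extra ? 1 : 0);
            if (len == 0) {
                break; // more pieces requested than source tasks available
            }
            ranges.add(new int[] {start, start + len - 1});
            start += len;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] r : groupPartitions(new int[] {10, 20, 70, 5, 5, 90}, 100)) {
            System.out.println("partitions " + r[0] + ".." + r[1]);
        }
        for (int[] r : splitSourceTasks(10, 3)) {
            System.out.println("source tasks " + r[0] + ".." + r[1]);
        }
    }
}
```

In this sketch a destination task's input would be described by one partition
range plus one source task range. How the patch actually chooses those ranges
may differ.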
Besides the core routing functionality, the patch also includes the
following.
* Move stats from global to per source vertex. This allows a more accurate
estimation of the partition size, given that one source vertex can be much
larger than the others. In addition, change the stats from long to int, as the
unit is MB. So there is some impact on memory even for ShuffleVertexManager,
but the net impact should be acceptable. For joining, say, 20 source vertexes
with 20k destination tasks, the size is 4 bytes * 20k * 20 = 1.6 MB. If we want
to be safe, we can make this change specific to FairShuffleVertexManager. But
we might need it anyway for TEZ-2962.
* Refactor test code.
** The common test cases for both ShuffleVertexManager and
FairShuffleVertexManager are moved to TestShuffleVertexManagerBase.
** TestShuffleVertexManager was still verifying EdgeManager via
EdgeManagerPlugin#routeDataMovementEventToDestination. It should use
EdgeManagerPluginOnDemand instead, as that is what is actually used.
** Break testShuffleVertexManagerAutoParallelism into individual test cases.
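
The memory estimate for per-source-vertex stats can be checked with a short
calculation. This is a back-of-the-envelope sketch, not code from the patch:
the class StatsMemoryEstimate and the method statsBytes are hypothetical, and
it assumes one stats entry per partition per source vertex, with the number of
partitions equal to the number of destination tasks.

```java
// Hypothetical helper for the memory estimate; not part of the patch.
class StatsMemoryEstimate {

    // Bytes needed to hold one stats entry per partition for each source vertex.
    static long statsBytes(int sourceVertices, int partitions, int bytesPerEntry) {
        return (long) sourceVertices * partitions * bytesPerEntry;
    }

    public static void main(String[] args) {
        // Per-source-vertex stats stored as int (4 bytes per entry).
        System.out.println(statsBytes(20, 20_000, Integer.BYTES));
        // The same layout stored as long (8 bytes per entry), for comparison.
        System.out.println(statsBytes(20, 20_000, Long.BYTES));
        // A single global array of long entries, for comparison.
        System.out.println(statsBytes(1, 20_000, Long.BYTES));
    }
}
```

For 20 source vertexes and 20k partitions, the int layout comes to 1,600,000
bytes, versus 160,000 bytes for a single global array of longs.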
> Provide basic fair routing and scheduling functionality via custom
> VertexManager and EdgeManager
> ------------------------------------------------------------------------------------------------
>
> Key: TEZ-3269
> URL: https://issues.apache.org/jira/browse/TEZ-3269
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Ming Ma
> Attachments: TEZ-3269.patch
>
>
> With TEZ-3206 and TEZ-3216, we can build a custom VertexManager and
> EdgeManager that use partition stats to do fair routing, as well as
> scheduling based on destination tasks’ dependency on source tasks.