[
https://issues.apache.org/jira/browse/TEZ-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383568#comment-15383568
]
Peter Slawski commented on TEZ-3356:
------------------------------------
Hi [~rajesh.balamohan],
Thanks for the quick reply. The case arises when the parallelism for a vertex
is not yet known at the time ShuffleVertexManager::initialize() is called, i.e.
numTasks is set to {{-1}}. The stats field is then left uninitialized when the
vertex manager is initialized, because updatePendingTasks() exits early, as
shown below.
{code:java}
int tasks = getContext().getVertexNumTasks(getContext().getVertexName());
if (tasks == pendingTasks.size() || tasks <= 0) {
  return;
}
{code}
To clarify, in this case we are using a manager plugin that extends
ShuffleVertexManager. This manager sets the parallelism while the DAG is
running, which happens after ShuffleVertexManager::initialize() has been
called for a vertex with that custom manager.
See Pig's PigGraceShuffleVertexManager, which sets a vertex's parallelism
during DAG execution,
[here|https://github.com/apache/pig/blob/40300960d2edaa8097e551dd8692e29d6018ffc4/src/org/apache/pig/backend/hadoop/executionengine/tez/runtime/PigGraceShuffleVertexManager.java#L172].
When Pig submits a job with grace parallelism, the parallelism for vertices
is {{-1}}, which is set
[here|https://github.com/apache/pig/blob/7cf1a945772f49ff620d7eab75bf2c7e635ab2ae/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java#L871].
So, in this case, ShuffleVertexManager::initialize() will not initialize the
stats field.
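To make the sequence concrete, here is a minimal standalone sketch of the failure. It is not the real Tez code: StatsSketchManager and all of its members are hypothetical stand-ins, the pending-task list is reduced to a counter, and the null check that throws IllegalStateException is only an assumption about how the failure surfaces.

```java
/** Simplified model of the scenario; NOT the real Tez ShuffleVertexManager. */
class StatsSketchManager {
    private int numTasks;          // -1 means "parallelism not known yet"
    private int pendingTaskCount;  // stands in for pendingTasks.size()
    long[] stats;                  // per-partition sizes; null until allocated

    StatsSketchManager(int numTasks) {
        this.numTasks = numTasks;
    }

    /** Mirrors initialize(): it relies on updatePendingTasks() to allocate stats. */
    void initialize() {
        updatePendingTasks();
    }

    /** Hypothetical hook a custom manager would use to set parallelism at runtime. */
    void setParallelism(int tasks) {
        this.numTasks = tasks;
    }

    void updatePendingTasks() {
        if (numTasks == pendingTaskCount || numTasks <= 0) {
            return; // early exit: with numTasks == -1, stats is never allocated
        }
        pendingTaskCount = numTasks;
        if (stats == null) {
            stats = new long[numTasks];
        }
    }

    /** Aggregates one reported partition size; assumed to require initialized stats. */
    void onVertexManagerEvent(int partition, long size) {
        if (stats == null) {
            throw new IllegalStateException("partition stats not initialized");
        }
        stats[partition] += size;
    }

    public static void main(String[] args) {
        StatsSketchManager manager = new StatsSketchManager(-1);
        manager.initialize();       // exits early, stats stays null
        manager.setParallelism(4);  // parallelism decided later, during the DAG
        try {
            manager.onVertexManagerEvent(0, 100L);
        } catch (IllegalStateException e) {
            System.out.println("Event processing failed: " + e.getMessage());
        }
    }
}
```

In this model, setting the parallelism later does not help on its own, because nothing re-runs updatePendingTasks() before the first event is handled.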
> Fix initializing of stats when custom ShuffleVertexManager is used
> ------------------------------------------------------------------
>
> Key: TEZ-3356
> URL: https://issues.apache.org/jira/browse/TEZ-3356
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.4
> Reporter: Peter Slawski
> Assignee: Peter Slawski
> Attachments: TEZ-3356.1.patch
>
>
> When using a custom ShuffleVertexManager to set a vertex’s parallelism, the
> partition stats field will be left uninitialized even after the manager
> itself gets initialized. This results in an IllegalStateException being thrown,
> as the stats field will not yet be initialized when VertexManagerEvents are
> processed upon the start of the vertex. Note that these events contain
> partition sizes which are aggregated and stored in this stats field.
>
> Apache Pig’s grace auto-parallelism feature uses a custom
> ShuffleVertexManager which sets a vertex’s parallelism upon the completion of
> one of its parent’s parents. Thus, this corner case is hit and Pig scripts
> with grace parallelism enabled would fail if the DAG consists of at least one
> vertex having grandparents.
>
> The fix should be straightforward. Simply update pending tasks before,
> rather than after, VertexManagerEvents are processed, so that the
> partition stats field is initialized by the time it is needed.
>
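The ordering change described in the issue can be sketched roughly as follows. This is an illustration only, not the attached patch: PatchedSketchManager and its members are hypothetical names, events are modeled as {partition, size} pairs, and the vertex-start hook is a simplified stand-in for the real plugin callback.

```java
/** Simplified model of the proposed ordering; NOT the actual TEZ-3356 patch. */
class PatchedSketchManager {
    private int numTasks;          // -1 means "parallelism not known yet"
    private int pendingTaskCount;  // stands in for pendingTasks.size()
    long[] stats;                  // per-partition sizes; null until allocated

    PatchedSketchManager(int numTasks) {
        this.numTasks = numTasks;
    }

    void initialize() {
        updatePendingTasks();
    }

    /** Hypothetical hook a custom manager would use to set parallelism at runtime. */
    void setParallelism(int tasks) {
        this.numTasks = tasks;
    }

    void updatePendingTasks() {
        if (numTasks == pendingTaskCount || numTasks <= 0) {
            return;
        }
        pendingTaskCount = numTasks;
        if (stats == null) {
            stats = new long[numTasks];
        }
    }

    /** Vertex start: refresh pending tasks first, then process the queued events. */
    void onVertexStarted(long[][] events) {
        updatePendingTasks(); // moved ahead of event processing so stats exists
        for (long[] event : events) {
            handleEvent((int) event[0], event[1]);
        }
    }

    private void handleEvent(int partition, long size) {
        if (stats == null) {
            throw new IllegalStateException("partition stats not initialized");
        }
        stats[partition] += size;
    }

    public static void main(String[] args) {
        PatchedSketchManager manager = new PatchedSketchManager(-1);
        manager.initialize();       // still exits early at this point
        manager.setParallelism(4);  // set by the custom manager at runtime
        manager.onVertexStarted(new long[][] {{0, 100}, {2, 50}, {0, 25}});
        System.out.println(java.util.Arrays.toString(manager.stats));
    }
}
```

Because the refresh happens at vertex start, it sees the parallelism chosen at runtime, so the stats array is sized before the first partition size is aggregated.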
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)