[ https://issues.apache.org/jira/browse/TEZ-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159558#comment-14159558 ]
Rajesh Balamohan edited comment on TEZ-1635 at 10/5/14 3:23 PM:
----------------------------------------------------------------
Attaching the successful and hung job details for tez_smb_1.q with additional
logs in org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.
tez_smb_1.q: (DAG snapshot is already attached)
- Map_1 [Map_2] - MultiMRInput, initializer=MRInputAMSplitGenerator
- Map_1 [s1] - MRInputLegacy, initializer=MRInputAMSplitGenerator
- (Map_1 [Map_2], Map_1 [s1]) --> Map_1[MapTezProcessor]
- Map_1[MapTezProcessor] --> Map_1[out_Map_1] MROutput
Map_1 vertexManager is org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex
Successful job:
===============
- When CustomPartitionVertex.onRootVertexInitialized() gets called for "s1",
CustomPartitionVertex.processAllEvents() is invoked, which internally populates
the bucketToTaskMap data structure.
- When CustomPartitionVertex.onRootVertexInitialized() gets called for "Map_2",
CustomPartitionVertex.processAllSideEvents() is invoked, which depends on
bucketToTaskMap to generate the InputDataInformationEvent.
Failure/hung job:
===============
- CPV.onRootVertexInitialized() gets called for "Map_2" first. This ends up
calling CPV.processAllSideEvents(). Since the bucketToTaskMap structure is still
empty at that point, it does *not* generate any InputDataInformationEvent.
- CPV.onRootVertexInitialized() gets called for "s1" later.
In this case, the events pertaining to MultiMRInput (Map_2) are never sent to Tez
from CustomPartitionVertex. [~hagleitn] - Is this the expected behavior of
CustomPartitionVertex?
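
The ordering dependency described above can be illustrated with a small, self-contained toy model. This is only a sketch under simplifying assumptions, not the Hive CustomPartitionVertex code: the class OrderingSketch, the run() helper, the two-argument callback signature, the one-task-per-bucket mapping and the string "events" are all hypothetical stand-ins. Only the names onRootVertexInitialized, processAllEvents, processAllSideEvents and bucketToTaskMap come from the analysis above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT the Hive CustomPartitionVertex source.
class OrderingSketch {
    // Stand-in for bucketToTaskMap: bucket id -> task index.
    private final Map<Integer, Integer> bucketToTaskMap = new HashMap<>();
    // Stand-in for the InputDataInformationEvents that would be sent to Tez.
    private final List<String> sideEvents = new ArrayList<>();

    // Simplified callback (hypothetical signature): "s1" is treated as the
    // main input, any other input name as a side input.
    void onRootVertexInitialized(String inputName, int numBuckets) {
        if ("s1".equals(inputName)) {
            processAllEvents(numBuckets);
        } else {
            processAllSideEvents(inputName, numBuckets);
        }
    }

    // Populates the bucket -> task mapping (assumed: one task per bucket).
    private void processAllEvents(int numBuckets) {
        for (int b = 0; b < numBuckets; b++) {
            bucketToTaskMap.put(b, b);
        }
    }

    // Emits a side event only for buckets that already have a task mapping.
    private void processAllSideEvents(String inputName, int numBuckets) {
        for (int b = 0; b < numBuckets; b++) {
            Integer task = bucketToTaskMap.get(b);
            if (task != null) {
                sideEvents.add(inputName + ":bucket" + b + "->task" + task);
            }
        }
    }

    int sideEventCount() {
        return sideEvents.size();
    }

    // Runs the two callbacks in the given order and returns how many side
    // events were generated.
    static int run(String first, String second) {
        OrderingSketch cpv = new OrderingSketch();
        cpv.onRootVertexInitialized(first, 2);
        cpv.onRootVertexInitialized(second, 2);
        return cpv.sideEventCount();
    }

    public static void main(String[] args) {
        System.out.println("s1 then Map_2: " + run("s1", "Map_2")); // prints 2
        System.out.println("Map_2 then s1: " + run("Map_2", "s1")); // prints 0
    }
}
```

In this model the "Map_2 then s1" order produces no side events at all, which mirrors the hung job: nothing revisits the side input once the mapping has been filled in.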
> Dag gets stuck intermittently
> -----------------------------
>
> Key: TEZ-1635
> URL: https://issues.apache.org/jira/browse/TEZ-1635
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Vikram Dixit K
> Priority: Blocker
> Attachments: Screen Shot 2014-10-05 at 9.46.31 AM.png,
> syslog_dag_1412109415326_0002_10.gz, tez_smb_1_hung_job.log,
> tez_smb_1_successful_job.log
>
>
> Attaching logs for the dag.