[
https://issues.apache.org/jira/browse/PIG-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rohini Palaniswamy updated PIG-4699:
------------------------------------
    Attachment: sample-output.txt
                PIG-4699-1.patch
The patch also reduces logging when the DAG status information does not
change while the script is running.
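
To illustrate the idea of suppressing repeated status lines, here is a minimal
sketch. It is not the code from the patch: the class name
ChangeOnlyStatusLogger, its methods, and the status strings are all
hypothetical, and it assumes the status can be compared as a plain string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/**
 * Hypothetical helper (not from PIG-4699-1.patch): prints a status line
 * only when it differs from the last line that was printed.
 */
public class ChangeOnlyStatusLogger {
    // Last status that was actually printed; null until the first report.
    private String lastStatus;
    // Record of printed lines, kept so the behavior is easy to inspect.
    private final List<String> emitted = new ArrayList<>();

    /** Returns true if the status was printed, false if it was a repeat. */
    public boolean report(String status) {
        if (Objects.equals(status, lastStatus)) {
            return false; // unchanged since the last report, so stay quiet
        }
        lastStatus = status;
        emitted.add(status);
        System.out.println(status);
        return true;
    }

    public List<String> getEmitted() {
        return emitted;
    }
}
```

A polling loop would call report() on every poll and rely on it to drop
consecutive duplicates, so an idle DAG produces one line instead of many.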
Attached a sample output to make the changes easier to understand. I could not
find a way to get the vertex task timing stats, which would be more useful, so
I just put the records and bytes information in there for now. There are a
couple of cases where the alias and feature information on the vertex is
empty. I need to look into those and fix them, and will file a separate JIRA
for that.
The unrelated minor change to DataSinkDescriptor.create in TezDAGBuilder.java
is just to fix deprecation warnings.
> Print Job stats information in Tez like mapreduce
> -------------------------------------------------
>
> Key: PIG-4699
> URL: https://issues.apache.org/jira/browse/PIG-4699
> Project: Pig
> Issue Type: Improvement
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4699-1.patch, sample-output.txt
>
>
> Job stats information in MapReduce is extremely useful when debugging or
> when looking for performance bottlenecks, since it shows which of the
> MapReduce jobs is taking time. Without it, it is hard to figure out the same
> in Tez, or to tell which aliases are being processed in which vertices.