[ 
https://issues.apache.org/jira/browse/PIG-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohini Palaniswamy updated PIG-4699:
------------------------------------
    Attachment: sample-output.txt
                PIG-4699-1.patch

The patch also reduces logging when the DAG status information does not
change while the script is running.
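
The idea of logging a status only when it changes can be sketched as below.
This is a minimal illustration, not the code in the patch: the class
ChangeOnlyStatusLogger, its methods, and the status strings are all
hypothetical, and the real patch works against the Tez DAG status rather
than plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/**
 * Hypothetical helper (not part of Pig or Tez): emits a status line only
 * when it differs from the last status that was emitted.
 */
public class ChangeOnlyStatusLogger {
    // Every status actually written, kept so callers can inspect the output.
    private final List<String> emitted = new ArrayList<>();
    // Last status written; null until the first call to log().
    private String lastStatus;

    /** Returns true if the status was logged, false if it was a repeat. */
    public boolean log(String status) {
        if (Objects.equals(status, lastStatus)) {
            return false; // unchanged since the last poll, so stay quiet
        }
        lastStatus = status;
        emitted.add(status);
        System.out.println(status);
        return true;
    }

    /** Statuses written so far, in order. */
    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        ChangeOnlyStatusLogger logger = new ChangeOnlyStatusLogger();
        // Placeholder strings standing in for successive status polls.
        String[] polls = {"status A", "status A", "status B", "status B", "status A"};
        for (String poll : polls) {
            logger.log(poll);
        }
    }
}
```

Only the previous status is remembered, so a status that recurs after a
different one is logged again, which keeps the log an ordered record of
transitions.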

Attached a sample output to help understand the changes. I could not find a
way to get the vertex task timing stats, which would be more useful, so I just
put in the records and bytes information for now. There are a couple of cases
where the alias and feature information on the vertex is empty. I need to look
into those and fix them, and will file a separate JIRA for that.

The unrelated minor change to DataSinkDescriptor.create in TezDAGBuilder.java
just fixes deprecation warnings.

> Print Job stats information in Tez like mapreduce
> -------------------------------------------------
>
>                 Key: PIG-4699
>                 URL: https://issues.apache.org/jira/browse/PIG-4699
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.16.0
>
>         Attachments: PIG-4699-1.patch, sample-output.txt
>
>
>    Job stats information in mapreduce is extremely useful while debugging or
> finding performance bottlenecks, since it shows which of the mapreduce jobs
> is taking time. Without it, it is hard to figure out the same in Tez, or
> which aliases are being processed in the vertices.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
