[
https://issues.apache.org/jira/browse/PIG-4757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rohini Palaniswamy updated PIG-4757:
------------------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)
Committed to trunk. Thanks for the review, Daniel.
> Job stats on successfully read/output records wrong with multiple
> inputs/outputs
> --------------------------------------------------------------------------------
>
> Key: PIG-4757
> URL: https://issues.apache.org/jira/browse/PIG-4757
> Project: Pig
> Issue Type: Bug
> Components: tez
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: 0.16.0
>
> Attachments: PIG-4757-1.patch, PIG-4757-2.patch, PIG-4757-3.patch
>
>
> TezVertexStats uses TaskCounter.INPUT_RECORDS_PROCESSED to display the number
> of records read from MRInput. With a replicated join or a scalar, that counter
> also includes the records of the replicated join input. POSimpleTezLoad needs
> a Pig-specific counter (MULTI_INPUTS_RECORD_COUNTER).
> TezVertexStats uses TaskCounter.OUTPUT_RECORDS to display the number of
> records stored to MROutput when there is a single store. With multiple stores
> it uses MULTI_STORE_RECORD_COUNTER, and there are no issues. With a single
> store plus another output, the value from OUTPUT_RECORDS is wrong.
> MULTI_STORE_RECORD_COUNTER should be used in all cases, even when there is
> only one store.
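
The two counter problems described above can be illustrated with a minimal
sketch. It is not the committed patch and does not use the Tez or Pig APIs:
the class CounterSketch, the method simulate, and the plain map standing in
for counters are all hypothetical. It also assumes that the generic counters
aggregate over every input or output of a vertex, which is what the
description implies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CounterSketch {

    /**
     * Hypothetical model of one vertex; a plain map stands in for real counters.
     * Assumption: the generic counters aggregate over every input or output of
     * the vertex, while the Pig-specific ones track only the load or the store.
     */
    static Map<String, Long> simulate(long loaded, long replicated,
                                      long stored, long otherOutput) {
        Map<String, Long> counters = new LinkedHashMap<>();
        // Generic input counter: load records plus replicated join records.
        counters.put("INPUT_RECORDS_PROCESSED", loaded + replicated);
        // Pig-specific counter, incremented only for records from the load.
        counters.put("MULTI_INPUTS_RECORD_COUNTER", loaded);
        // Generic output counter: store records plus the other output's records.
        counters.put("OUTPUT_RECORDS", stored + otherOutput);
        // Pig-specific counter, incremented only for records written to the store.
        counters.put("MULTI_STORE_RECORD_COUNTER", stored);
        return counters;
    }

    public static void main(String[] args) {
        // 100 loaded records, 5 replicated join records,
        // 100 stored records, 20 records sent to another output.
        simulate(100, 5, 100, 20)
            .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In this model the generic counters overstate the load and store record counts
whenever the vertex has an extra input or output, while the Pig-specific
counters match the load and the store exactly.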