[jira] Commented: (PIG-1299) Implement Pig counter to track number of output rows for each output files

Pradeep Kamath (JIRA) Tue, 06 Apr 2010 14:25:58 -0700

    [ https://issues.apache.org/jira/browse/PIG-1299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854188#action_12854188 ]


Pradeep Kamath commented on PIG-1299:
-------------------------------------

Changes are mostly good. A few comments:
1) Instead of creating a wrapper RecordWriter in MapReducePOStoreImpl, the 
counter should be incremented in POStore.getNext(). POStore holds a reference 
to MapReducePOStoreImpl, so the counter is available there. This way we keep 
our contract with StoreFunc: the RecordWriter instance passed to 
prepareToWrite() is the same one returned by 
StoreFunc.getOutputFormat().getRecordWriter(). With this change, the 
modification to BinStorage should be reverted.
2) Is the check for store.isMultiStore() required in MapReducePOStoreImpl? I 
think MapReducePOStoreImpl is used only with multi-store POStore(s), so the 
check seems redundant.
3) Please address the javac warnings where possible. Unit tests along the 
lines of those in TestCounters would also be good.
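
To illustrate comment 1, here is a minimal sketch of counting in the store 
operator rather than in a wrapping writer. All types below are simplified, 
hypothetical stand-ins (SimpleRecordWriter, StoreImplSketch, 
StoreOperatorSketch), not the actual Pig or Hadoop classes, and the AtomicLong 
merely stands in for whatever counter the patch uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Small demo driver; all names in this file are hypothetical.
class PoStoreCounterDemo {
    public static void main(String[] args) {
        List<String> sink = new ArrayList<>();
        SimpleRecordWriter writer = sink::add;   // writer "owned" by the StoreFunc
        StoreImplSketch impl = new StoreImplSketch(writer);
        StoreOperatorSketch store = new StoreOperatorSketch(impl);

        store.getNext("row-1");
        store.getNext("row-2");

        System.out.println("records written: " + impl.getRecordCount());
        System.out.println("writer unwrapped: " + (impl.getWriter() == writer));
    }
}

// Stand-in for a RecordWriter; NOT the Hadoop interface.
interface SimpleRecordWriter {
    void write(String tuple);
}

// Stand-in for MapReducePOStoreImpl: holds the per-output counter and hands
// out the StoreFunc's own writer without wrapping it.
class StoreImplSketch {
    private final SimpleRecordWriter writer;
    private final AtomicLong outputRecords = new AtomicLong();

    StoreImplSketch(SimpleRecordWriter writer) {
        this.writer = writer;
    }

    // Returns the very instance that was passed in, so the "same writer"
    // contract described in comment 1 holds.
    SimpleRecordWriter getWriter() {
        return writer;
    }

    void incrementRecordCounter() {
        outputRecords.incrementAndGet();
    }

    long getRecordCount() {
        return outputRecords.get();
    }
}

// Stand-in for POStore: increments the counter in getNext() after each write.
class StoreOperatorSketch {
    private final StoreImplSketch impl;

    StoreOperatorSketch(StoreImplSketch impl) {
        this.impl = impl;
    }

    void getNext(String tuple) {
        impl.getWriter().write(tuple);
        impl.incrementRecordCounter();
    }
}
```

Because the counting lives in the operator, the store function keeps seeing 
its own writer instance, and no store function needs to change to accommodate 
a wrapper.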

> Implement Pig counter to track number of output rows for each output files
> ----------------------------------------------------------------------------
>
>                 Key: PIG-1299
>                 URL: https://issues.apache.org/jira/browse/PIG-1299
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.8.0
>
>         Attachments: PIG-1299.patch
>
>
> When running a multi-store query, the Hadoop job tracker often displays only 
> 0 for the "Reduce output records" or "Map output records" counters. This is 
> incorrect and misleading. Pig should implement an "output records" counter 
> for each output file in the query. 

