[ https://issues.apache.org/jira/browse/PIG-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977698#comment-13977698 ]
Mona Chitnis commented on PIG-3891:
-----------------------------------
Linking the original JIRA that introduced this change. The issue is probably
in how the counters are reported as a whole: I'm getting the following output
for a sample Pig test (in map-reduce mode, of course), even though the job
succeeded and produced its output.
{quote}
Input(s):
Successfully read 0 records from: "/user/pig/tests/data/pigmix/page_views"
Output(s):
Successfully stored 0 records in: "/user/chitnis//L1out"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
{quote}
> FileBasedOutputSizeReader does not calculate size of files in sub-directories
> -----------------------------------------------------------------------------
>
> Key: PIG-3891
> URL: https://issues.apache.org/jira/browse/PIG-3891
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.12.0
> Reporter: Rohini Palaniswamy
>
> FileBasedOutputSizeReader only includes files in the top-level output
> directory. So if files are stored under subdirectories (e.g., by
> MultiStorage), it does not report the bytes written correctly.
> 0.11 shows the correct number of total bytes written, so this is a
> regression. A quick look at the code shows that
> JobStats.addOneOutputStats() in 0.11 also does not iterate recursively, and
> its code is the same as in FileBasedOutputSizeReader. Need to investigate
> where the correct value comes from in 0.11 and fix it in 0.12.1/0.13.
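
To illustrate the difference the description points at, here is a minimal
sketch that contrasts a top-level-only size sum with a recursive one. It is
not Pig's code: the class and method names (OutputSizeSketch, topLevelSize,
recursiveSize) are made up for this example, and it uses plain java.nio on
the local filesystem instead of the Hadoop FileSystem API that Pig works
against.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical illustration only; not taken from the Pig code base.
public class OutputSizeSketch {

    // Sums only the regular files directly under dir, which mirrors the
    // behaviour described in the issue (subdirectories are ignored).
    static long topLevelSize(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(OutputSizeSketch::sizeOf)
                          .sum();
        }
    }

    // Sums the regular files at any depth under dir.
    static long recursiveSize(Path dir) throws IOException {
        try (Stream<Path> entries = Files.walk(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(OutputSizeSketch::sizeOf)
                          .sum();
        }
    }

    // Wraps the checked exception so the method can be used in a stream.
    private static long sizeOf(Path p) {
        try {
            return Files.size(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: one file at the top level, one in a subdirectory,
        // roughly the shape a MultiStorage-style output would have.
        Path out = Files.createTempDirectory("out");
        Files.write(out.resolve("part-00000"), new byte[10]);
        Path sub = Files.createDirectory(out.resolve("key1"));
        Files.write(sub.resolve("key1-0000"), new byte[25]);

        System.out.println("top-level only: " + topLevelSize(out));
        System.out.println("recursive:      " + recursiveSize(out));
    }
}
```

With the layout built in main, the top-level sum is 10 bytes and the
recursive sum is 35 bytes, which is the kind of undercount the description
reports for output stored under subdirectories.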