Rohini Palaniswamy created PIG-3891:
---------------------------------------

             Summary: FileBasedOutputSizeReader does not calculate size of files in sub-directories
                 Key: PIG-3891
                 URL: https://issues.apache.org/jira/browse/PIG-3891
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.12.0
            Reporter: Rohini Palaniswamy


FileBasedOutputSizeReader only counts files in the top-level output 
directory. If files are stored under subdirectories (for example, by 
MultiStorage), the reported number of bytes written is incorrect. 

0.11 shows the correct total number of bytes written, so this is a regression. 
A quick look at the code shows that JobStats.addOneOutputStats() in 0.11 
also does not iterate recursively, and its code is the same as in 
FileBasedOutputSizeReader. We need to investigate where the correct value 
comes from in 0.11 and fix this in 0.12.1/0.13.
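
For illustration, the difference between a top-level listing and a recursive 
one can be sketched roughly as below. This is not Pig's code: it uses 
java.nio.file on a local directory as a stand-in for the Hadoop FileSystem 
API so that it runs standalone, and the class and method names 
(OutputSizeSketch, topLevelSize, recursiveSize) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical sketch; local-filesystem stand-in, not the Pig implementation.
final class OutputSizeSketch {

    // Sums only the regular files directly under dir. Files inside
    // subdirectories are skipped, which mirrors the behavior reported here.
    static long topLevelSize(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(OutputSizeSketch::sizeOf)
                          .sum();
        }
    }

    // Sums the regular files at any depth under dir.
    static long recursiveSize(Path dir) throws IOException {
        try (Stream<Path> entries = Files.walk(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(OutputSizeSketch::sizeOf)
                          .sum();
        }
    }

    // Wraps the checked IOException so the method can be used in a stream.
    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With an output directory that holds one 10-byte file at the top level and one 
25-byte file in a subdirectory, topLevelSize would count only the first file 
while recursiveSize would count both.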



--
This message was sent by Atlassian JIRA
(v6.2#6252)
