[ https://issues.apache.org/jira/browse/PIG-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387186#comment-15387186 ]
Rohini Palaniswamy commented on PIG-3891:
-----------------------------------------

A few comments:
- Could you fix this to run in Tez mode as well? I did not realize this one was not fixed. Change
{code}
pigServer = new PigServer(ExecType.MAPREDUCE, cluster.getProperties());
pigServerLocal = new PigServer(ExecType.LOCAL);
{code}
to
{code}
pigServer = new PigServer(cluster.getExecType(), cluster.getProperties());
pigServerLocal = new PigServer(Util.getLocalTestMode());
{code}
It would also involve changes to the test, such as MRJobStats -> JobStats. You can test by running
ant test -Dhadoopversion=23 -Dexectype=tez -Dtestcase=TestMultiStorage
- Can you add asserts for getMultiStoreCounters() as well, covering the individual output bytes written?
- The test name is too verbose. Could you rename the test to just testOutputStats and add a comment at the beginning of the test saying
//Test if bytes written is correct with sub-directories and multiple MultiStorage statements.

The rest looks good.

> FileBasedOutputSizeReader does not calculate size of files in sub-directories
> -----------------------------------------------------------------------------
>
>                 Key: PIG-3891
>                 URL: https://issues.apache.org/jira/browse/PIG-3891
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Nandor Kollar
>         Attachments: PIG-3891-1.patch, PIG-3891-2.patch, PIG-3891-3.patch
>
>
> FileBasedOutputSizeReader only includes files in the top-level output
> directory. So if files are stored under subdirectories (for example, by
> MultiStorage), it does not report the bytes written correctly.
> 0.11 shows the correct number of total bytes written, so this is a
> regression. A quick look at the code shows that
> JobStats.addOneOutputStats() in 0.11 also does not iterate recursively, and
> its code is the same as FileBasedOutputSizeReader. Need to investigate where the
> correct value comes from in 0.11 and fix it in 0.12.1/0.13.
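
For illustration only, here is a rough sketch of the difference the description points at: summing only the files directly under an output directory versus walking its sub-directories. This is not Pig's code. It uses java.nio on the local filesystem rather than the Hadoop FileSystem API, and the class and method names (OutputSizeSketch, topLevelSize, recursiveSize) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical sketch, not Pig's FileBasedOutputSizeReader.
public class OutputSizeSketch {

    // Sums only regular files directly under dir; files in sub-directories are missed.
    static long topLevelSize(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(OutputSizeSketch::sizeOf)
                          .sum();
        }
    }

    // Sums regular files at any depth under dir.
    static long recursiveSize(Path dir) throws IOException {
        try (Stream<Path> entries = Files.walk(dir)) {
            return entries.filter(Files::isRegularFile)
                          .mapToLong(OutputSizeSketch::sizeOf)
                          .sum();
        }
    }

    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Mimic an output directory with one top-level file and one file in a sub-directory.
        Path out = Files.createTempDirectory("output-size-sketch");
        Path sub = Files.createDirectories(out.resolve("key1"));
        Files.write(out.resolve("top-level-file"), new byte[5]);
        Files.write(sub.resolve("nested-file"), new byte[10]);

        System.out.println("top-level only: " + topLevelSize(out));
        System.out.println("recursive:      " + recursiveSize(out));
    }
}
```

With output laid out like this, the top-level sum misses the nested file, which is the kind of undercount the issue reports for MultiStorage output.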