[ https://issues.apache.org/jira/browse/HADOOP-6029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-6029:
--------------------------------

    Attachment: TEST-org.apache.hadoop.mapred.TestReduceFetch.txt
                FAILING-PARTIALMEM-TEST-org.apache.hadoop.mapred.TestReduceFetch.txt

Jothi and I came across another TestReduceFetch failure. 
{noformat}
Testcase: testReduceFromDisk took 78.436 sec
Testcase: testReduceFromPartialMem took 60.701 sec
        FAILED
Expected at least 1MB fewer bytes read from local (21159650) than written to HDFS (21036680)
junit.framework.AssertionFailedError: Expected at least 1MB fewer bytes read from local (21159650) than written to HDFS (21036680)
        at org.apache.hadoop.mapred.TestReduceFetch.testReduceFromPartialMem(TestReduceFetch.java:276)
        at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
        at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
        at junit.extensions.TestSetup.run(TestSetup.java:27)

Testcase: testReduceFromMem took 52.097 sec
{noformat}
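
For reference, the numbers in the assertion message can be checked by hand. The sketch below is only inferred from the wording of the message, not copied from TestReduceFetch; the class name ReduceFetchCheck and the helper readsFarEnoughBelowWrites are hypothetical. It shows that in the failing run the bytes read from local disk were not 1MB below the bytes written to HDFS; they were actually 122970 bytes above.

```java
public class ReduceFetchCheck {
    static final long ONE_MB = 1024L * 1024L;

    // Hypothetical helper: true when the bytes read from local disk are
    // at least 1MB fewer than the bytes written to HDFS. The exact
    // comparison used by the real test is an assumption here.
    static boolean readsFarEnoughBelowWrites(long localBytesRead,
                                             long hdfsBytesWritten) {
        return hdfsBytesWritten - localBytesRead >= ONE_MB;
    }

    public static void main(String[] args) {
        // Values taken from the failure message above.
        long localRead = 21159650L;
        long hdfsWritten = 21036680L;
        System.out.println("hdfsWritten - localRead = "
            + (hdfsWritten - localRead));
        System.out.println("check passes: "
            + readsFarEnoughBelowWrites(localRead, hdfsWritten));
    }
}
```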

The failure above looks like a memory issue. 
ReduceTask.ReduceCopier.ShuffleRamManager reserves memory for the in-memory 
shuffle based on Runtime.getRuntime().maxMemory(), and the value this returns 
appears to be machine-dependent. In the run that failed with the trace above, 
Runtime.maxMemory() returned a smaller value than in the run where the test 
passed. With the smaller value, shuffled files start hitting the disk, and the 
testcase fails because it doesn't expect that many files to hit the disk. I am 
attaching two logs: one from a successful run (all tests passed) and one from 
the failed testReduceFromPartialMem run. In both logs, job_0002 is the job for 
the testReduceFromPartialMem test.
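
To make the machine dependence concrete, here is a minimal sketch of how a shuffle memory budget derived from Runtime.getRuntime().maxMemory() shrinks with the JVM heap limit. This is not the ShuffleRamManager code: the class ShuffleBudgetSketch, the helper inMemoryBudget, and the 0.70 fraction are all assumptions made for illustration.

```java
public class ShuffleBudgetSketch {
    // Hypothetical helper: bytes set aside for in-memory shuffle, taken
    // as a fixed fraction of the maximum heap the JVM reports.
    static long inMemoryBudget(long maxHeapBytes, double fraction) {
        return (long) (maxHeapBytes * fraction);
    }

    public static void main(String[] args) {
        // maxMemory() depends on the JVM, its -Xmx setting and the
        // machine, so this value can differ between test hosts.
        long maxHeap = Runtime.getRuntime().maxMemory();
        double fraction = 0.70; // assumed value, for illustration only
        System.out.println("maxMemory      = " + maxHeap);
        System.out.println("shuffle budget = "
            + inMemoryBudget(maxHeap, fraction));
    }
}
```

Running this on the passing and the failing machine (or with different -Xmx values) should show whether the reported heap limit really differs; a smaller budget would mean more map outputs spill to disk.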

Nicholas, could you please upload the logs of the test failure you saw? Thanks!

> TestReduceFetch failed.
> -----------------------
>
>                 Key: HADOOP-6029
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6029
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Tsz Wo (Nicholas), SZE
>         Attachments: FAILING-PARTIALMEM-TEST-org.apache.hadoop.mapred.TestReduceFetch.txt, TEST-org.apache.hadoop.mapred.TestReduceFetch.txt
>
>
> {noformat}
> Testcase: testReduceFromMem took 23.625 sec
>       FAILED
> Non-zero read from local: 83
> junit.framework.AssertionFailedError: Non-zero read from local: 83
>       at org.apache.hadoop.mapred.TestReduceFetch.testReduceFromMem(TestReduceFetch.java:289)
>       at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
>       at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
>       at junit.extensions.TestSetup.run(TestSetup.java:27)
> {noformat}
> Ran TestReduceFetch a few times on a clean trunk.  It failed consistently.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
