[ https://issues.apache.org/jira/browse/HADOOP-3285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591582#action_12591582 ]
Hadoop QA commented on HADOOP-3285:
-----------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12380747/3285-5.patch
against trunk revision 645773.
@author +1. The patch does not contain any @author tags.
tests included +1. The patch appears to include 3 new or modified tests.
javadoc +1. The javadoc tool did not generate any warning messages.
javac +1. The applied patch does not generate any new javac compiler
warnings.
release audit +1. The applied patch does not generate any new release
audit warnings.
findbugs +1. The patch does not introduce any new Findbugs warnings.
core tests +1. The patch passed core unit tests.
contrib tests +1. The patch passed contrib unit tests.
Test results:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2305/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2305/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2305/artifact/trunk/build/test/checkstyle-errors.html
Console output:
http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2305/console
This message is automatically generated.
> map tasks with node local splits do not always read from local nodes
> --------------------------------------------------------------------
>
> Key: HADOOP-3285
> URL: https://issues.apache.org/jira/browse/HADOOP-3285
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Runping Qi
> Assignee: Owen O'Malley
> Priority: Blocker
> Fix For: 0.17.0
>
> Attachments: 3285-3.patch, 3285-4.patch, 3285-5.patch, 3285.patch,
> 3285.patch
>
>
> I ran a simple map/reduce job counting the number of records in the input
> data.
> The number of reducers was set to 1.
> I did not set the number of mappers. Thus by default, all splits except the
> last split of a file contain one dfs block (128MB in my case).
> The web gui indicated that 99% of map tasks had local splits.
> Thus I expected that most of the dfs reads should have come from the local
> data nodes.
> However, when I examined the traffic on the network interfaces,
> I found that about 50% of each node's traffic went through the loopback
> interface and the other 50% went through the ethernet card!
> Also, the switch monitoring indicated that a lot of traffic went through the
> links and across racks!
> This indicated that the data locality feature does not work as expected.
> To confirm that, I set the number of map tasks to a very high number so that
> it forced the split size down to about 27MB.
> The web gui indicated that 99% of map tasks had local splits, as
> expected.
> The ethernet interface monitor showed that almost 100% of the traffic went
> through the loopback interface, as it should.
> Also, the switch monitoring indicated that there was very little traffic
> through the links and across racks.
> This implies that some corner cases are not handled properly.
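For readers following the two experiments in the description above, here is a minimal sketch of how the requested number of map tasks can shrink the split size. It assumes the rule used by the old `org.apache.hadoop.mapred.FileInputFormat`, namely max(minSize, min(goalSize, blockSize)) with goalSize = total input bytes / requested maps. The class name, the 1280MB input size, and the 47 requested maps are hypothetical values chosen only to reproduce a split of roughly 27MB; they are not taken from the report.

```java
// Illustrative sketch only; not code from the patch attached to this issue.
public class SplitSizeSketch {

    // Assumed split-size rule: never below minSize, never above one block,
    // otherwise the per-map share of the input.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // goalSize is the total input divided by the requested number of maps
    // (treated as 1 when no positive number is requested).
    static long splitSizeFor(long totalBytes, int requestedMaps,
                             long minSize, long blockSize) {
        long goalSize = totalBytes / Math.max(requestedMaps, 1);
        return computeSplitSize(goalSize, minSize, blockSize);
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        long blockSize = 128 * MB;   // block size given in the report
        long totalBytes = 1280 * MB; // hypothetical input size

        // Few requested maps: the split is capped at one dfs block.
        System.out.println(splitSizeFor(totalBytes, 1, 1, blockSize) / MB);  // prints 128

        // Many requested maps: goalSize drops below the block size.
        System.out.println(splitSizeFor(totalBytes, 47, 1, blockSize) / MB); // prints 27
    }
}
```

Under these assumptions, the first case corresponds to splits of one full block and the second to the smaller splits used in the confirming run. The sketch says nothing about which replica a map task actually reads from.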