[
https://issues.apache.org/jira/browse/MAPREDUCE-4464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463833#comment-13463833
]
Harsh J commented on MAPREDUCE-4464:
------------------------------------
Hi Clint,
Sorry for the delay here!
I noticed that the line:
bq. String host = u.getHost();
which is the one that can carry a null, is then used in the lookup as:
bq. List<MapOutputLocation> loc = mapLocations.get(host);
Hence, I think the ideal fix would be to throw an exception, because the
code that follows relies heavily on host:
{code}
URI u = URI.create(event.getTaskTrackerHttp());
String host = u.getHost();
TaskAttemptID taskId = event.getTaskAttemptId();
URL mapOutputLocation = new URL(event.getTaskTrackerHttp() +
                        "/mapOutput?job=" + taskId.getJobID() +
                        "&map=" + taskId +
                        "&reduce=" + getPartition());
List<MapOutputLocation> loc = mapLocations.get(host);
if (loc == null) {
  loc = Collections.synchronizedList
    (new LinkedList<MapOutputLocation>());
  mapLocations.put(host, loc);
}
loc.add(new MapOutputLocation(taskId, host, mapOutputLocation));
numNewMaps++;
{code}
As its usage shows, if host itself cannot be determined and is consistently
null, we cannot really work with it, so throwing an IOException makes sense.
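To make the idea concrete, here is a minimal sketch of such a check. It is not the code from the attached patch: the class name HostCheck, the helper requireHost, the sample hostnames and the message text are all made up for illustration. It only assumes the standard java.net.URI behaviour, where getHost() returns null when the authority cannot be parsed as a valid hostname (an underscore in the name is one such case).

```java
import java.io.IOException;
import java.net.URI;

// Hypothetical helper for illustration only; not part of ReduceTask or the patch.
final class HostCheck {

    // Returns the host part of the tracker HTTP address, or fails early
    // with an IOException if java.net.URI could not determine a host.
    static String requireHost(String taskTrackerHttp) throws IOException {
        URI u = URI.create(taskTrackerHttp);
        String host = u.getHost();
        if (host == null) {
            // Message text is an assumption, chosen to name the bad address.
            throw new IOException(
                "Could not determine host from tracker address: '"
                + taskTrackerHttp + "'");
        }
        return host;
    }
}
```

With a check like this in front of the lookup, an address such as "http://bad_host:50060" would fail with a message naming the offending address instead of a bare NullPointerException further down.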
I'm currently running test-patch on your patch for branch-1; depending on the
results, I'll either commit it or post some further comments.
MR2 may be similarly affected on the netty side, but it may already be failing
properly. I don't have the time to verify that at the moment (perhaps another
JIRA), so I'll just focus on the MR1 side for now.
Thanks for the patch!
> Reduce tasks failing with NullPointerException in ConcurrentHashMap.get()
> -------------------------------------------------------------------------
>
> Key: MAPREDUCE-4464
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4464
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: task
> Affects Versions: 1.0.0
> Reporter: Clint Heath
> Assignee: Clint Heath
> Priority: Minor
> Attachments: MAPREDUCE-4464_new.patch, MAPREDUCE-4464.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> If DNS does not resolve hostnames properly, reduce tasks can fail with a very
> misleading exception.
> As per my peer Ahmed's diagnosis:
> In ReduceTask, it seems that event.getTaskTrackerHttp() returns a malformed
> URI, and so host from:
> {code}
> String host = u.getHost();
> {code}
> evaluates to null, and the NullPointerException is then thrown in the
> ConcurrentHashMap.
> I have written a patch to check for a null hostname condition when getHost is
> called in the getMapCompletionEvents method and print an intelligible warning
> message rather than suppressing it until later when it becomes confusing and
> misleading.
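The failure described in the issue can be reproduced outside Hadoop with a few lines of plain Java. This is only a sketch under assumptions: the class NullHostDemo, the method lookupThrowsNpe and the sample addresses are hypothetical, and the map is a stand-in for mapLocations with a simplified value type. It relies on two standard-library facts: java.net.URI.getHost() returns null when the hostname cannot be parsed, and ConcurrentHashMap.get() rejects null keys.

```java
import java.net.URI;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical reproduction for illustration only; not Hadoop code.
final class NullHostDemo {

    // Returns true if looking up the host of the given address in a
    // ConcurrentHashMap throws a NullPointerException.
    static boolean lookupThrowsNpe(String trackerHttp) {
        // Stand-in for mapLocations; the value type is simplified.
        Map<String, List<String>> mapLocations = new ConcurrentHashMap<>();
        String host = URI.create(trackerHttp).getHost(); // may be null
        try {
            mapLocations.get(host); // ConcurrentHashMap does not accept null keys
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

Calling lookupThrowsNpe with an address whose hostname contains an underscore, for example "http://bad_host:50060", takes the NullPointerException path, while a well-formed hostname does not.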