Map-side joins always empty when an input has NullWritable value
----------------------------------------------------------------
Key: MAPREDUCE-2597
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2597
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 0.20.2
Environment: Linux cluster
Reporter: Michael White
A map-side join often takes as one of its inputs a sorted list of keys that
have no associated value, e.g. as an exact-match filter. Such an input would
typically use NullWritable as its value class.
However, when performing a map-side join in Hadoop 0.20.2, any input whose
value class is NullWritable results in the Mapper never being called, so the
join output is always empty. I found this with a 3-way map-side join, and a
colleague tells me he ran into the same issue. I have not specifically tested
a 2-way join, so the bug may be specific to n-way joins for n>2 (though I
suspect not).
The current workaround is to use some other value class (e.g. IntWritable) and
fill it with an arbitrary value.
For a join, the value class should have no bearing on the set of keys that are
considered matching.
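
To illustrate the expected behavior, here is a minimal sketch in plain Java
(no Hadoop classes) of an inner merge join that compares keys only. It is not
Hadoop's join implementation. The class KeyOnlyJoinSketch, the helpers entry,
innerJoinKeys and demo, and the sample data are all hypothetical; a null value
stands in for NullWritable, and each input is assumed to be sorted with unique
keys.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Hadoop's join code.
class KeyOnlyJoinSketch {

    // Builds one (key, value) record; value may be null (stand-in for NullWritable).
    static Map.Entry<String, Object> entry(String key, Object value) {
        return new AbstractMap.SimpleEntry<String, Object>(key, value);
    }

    // Inner merge join over inputs sorted by key (assumes unique keys per input).
    // Only keys are compared; values are never inspected.
    static List<String> innerJoinKeys(List<List<Map.Entry<String, Object>>> inputs) {
        List<String> matched = new ArrayList<String>();
        if (inputs.isEmpty()) {
            return matched;
        }
        int[] pos = new int[inputs.size()];
        while (true) {
            // Find the largest key among the current heads; stop when any input is exhausted.
            String max = null;
            for (int i = 0; i < pos.length; i++) {
                if (pos[i] >= inputs.get(i).size()) {
                    return matched;
                }
                String key = inputs.get(i).get(pos[i]).getKey();
                if (max == null || key.compareTo(max) > 0) {
                    max = key;
                }
            }
            // Advance every input whose head is behind the largest key.
            boolean allEqual = true;
            for (int i = 0; i < pos.length; i++) {
                if (inputs.get(i).get(pos[i]).getKey().compareTo(max) < 0) {
                    pos[i]++;
                    allEqual = false;
                }
            }
            // All heads agree, so the key is present in every input.
            if (allEqual) {
                matched.add(max);
                for (int i = 0; i < pos.length; i++) {
                    pos[i]++;
                }
            }
        }
    }

    // Two valued inputs plus a key-only filter whose values are all null.
    static List<String> demo() {
        List<Map.Entry<String, Object>> a =
            Arrays.asList(entry("a", 1), entry("b", 2), entry("d", 4));
        List<Map.Entry<String, Object>> b =
            Arrays.asList(entry("b", "x"), entry("c", "y"), entry("d", "z"));
        List<Map.Entry<String, Object>> filter =
            Arrays.asList(entry("b", null), entry("d", null), entry("e", null));
        return innerJoinKeys(Arrays.asList(a, b, filter));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [b, d]
    }
}
```

In this sketch the key-only input narrows the result to the keys it shares
with the other two inputs, which is the exact-match filter behavior described
above; the null values play no part in deciding which keys match.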