[
https://issues.apache.org/jira/browse/HADOOP-9295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Parks updated HADOOP-9295:
--------------------------------
Description:
Verified the trunk looks the same as 1.0.3 for this issue.
When mappers output MapWritables with different class types, and the Reducer
then reads them in via an iterator (multiple calls to readFields() without
instantiating a new object), you'll get this:
java.lang.IllegalArgumentException: Id 1 exists but maps to org.me.ClassTypeOne and not org.me.ClassTypeTwo
    at org.apache.hadoop.io.AbstractMapWritable.addToMap(AbstractMapWritable.java:73)
    at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:201)
It happens because AbstractMapWritable accumulates class type entries in its
class-to-ID and ID-to-class hash maps across calls to readFields().
Those need to be cleared to support multiple calls to readFields().
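
The mechanism can be sketched without any Hadoop dependency. The class below is
a simplified, hypothetical stand-in and not the Hadoop source: the names
ClassIdRegistrySketch, clearOnRead and readsBothRecords are invented for
illustration, class names are plain strings, and each "record" is reduced to
the id-to-class table it would carry. It only illustrates how a reused object
that keeps its mappings between reads can hit an "Id 1 exists but maps to ..."
collision, and how clearing the maps first avoids it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Simplified illustration only; NOT the actual AbstractMapWritable code. */
class ClassIdRegistrySketch {
    // Two-way lookup between a small numeric id and a class name.
    private final Map<Byte, String> idToClass = new HashMap<>();
    private final Map<String, Byte> classToId = new HashMap<>();
    // Hypothetical switch: clear both maps at the start of every read.
    private final boolean clearOnRead;

    ClassIdRegistrySketch(boolean clearOnRead) {
        this.clearOnRead = clearOnRead;
    }

    /** Registers one mapping and rejects an id already bound to another class. */
    private void addToMap(String className, byte id) {
        String known = idToClass.get(id);
        if (known != null && !known.equals(className)) {
            throw new IllegalArgumentException(
                "Id " + id + " exists but maps to " + known + " and not " + className);
        }
        idToClass.put(id, className);
        classToId.put(className, id);
    }

    /** Simulates one readFields() call on a reused object. */
    void readFields(Map<Byte, String> serializedTable) {
        if (clearOnRead) {
            // The proposed remedy: forget mappings left over from the previous record.
            idToClass.clear();
            classToId.clear();
        }
        for (Map.Entry<Byte, String> entry : serializedTable.entrySet()) {
            addToMap(entry.getValue(), entry.getKey());
        }
    }

    /** Reads two records that both use id 1 for different classes into one object. */
    static boolean readsBothRecords(boolean clearOnRead) {
        ClassIdRegistrySketch reused = new ClassIdRegistrySketch(clearOnRead);
        try {
            reused.readFields(Collections.singletonMap((byte) 1, "org.me.ClassTypeOne"));
            reused.readFields(Collections.singletonMap((byte) 1, "org.me.ClassTypeTwo"));
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("without clearing: " + readsBothRecords(false));
        System.out.println("with clearing:    " + readsBothRecords(true));
    }
}
```

In this sketch the second read fails when the maps are left untouched and
succeeds when they are cleared first. Whether clearing at the start of
readFields() is the right place for the real fix is for the attached test and
review to settle.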
I've attached a JUnit test that both demonstrates the problem and contains an
embedded, fixed version of MapWritable and ArrayMapWritable (note the //TODO
comments in the code where it was fixed in 2 places).
If there's a better way to submit this recommended bug fix, someone please feel
free to let me know.
was:
Verified the trunk looks the same as 1.0.3 for this issue.
When you save two different class types in two MapWritables, then try to read
them in on the Reducer over an iterator (multiple calls to readFields without
instantiating a new object) you'll get this:
java.lang.IllegalArgumentException: Id 1 exists but maps to org.me.ClassTypeOne and not org.me.ClassTypeTwo
    at org.apache.hadoop.io.AbstractMapWritable.addToMap(AbstractMapWritable.java:73)
    at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:201)
It happens because AbstractMapWritable accumulates class type entries in its
ClassType to ID (and vice versa) Hashmaps.
Those need to be cleared to support multiple calls to readFields().
I've attached a JUnit test that both demonstrates the problem and contains an
embedded, fixed version of MapWritable and ArrayMapWritable (note the //TODO
comments in the code where it was fixed in 2 places).
> AbstractMapWritable throws exception when calling readFields() multiple times
> when the maps contain different class types
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-9295
> URL: https://issues.apache.org/jira/browse/HADOOP-9295
> Project: Hadoop Common
> Issue Type: Bug
> Components: io
> Affects Versions: 1.0.3
> Reporter: David Parks
> Priority: Critical
> Attachments: MapWritableBugTest.java