[
https://issues.apache.org/jira/browse/HADOOP-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680528#action_12680528
]
Hong Tang commented on HADOOP-5452:
-----------------------------------
I suspect this restriction exists for performance reasons. To deserialize
an object, the SequenceFile reader needs to know the concrete type of the
serialized bytes. In other words, if objects of any subclass of the key class
were admissible, SequenceFile might have to store a per-key or per-value
string to record the actual type of each key or value object.
Typically, you would write a wrapper class over the set of possible key
types plus a numeric tag. The serialized form of your wrapper object is simply
the numeric tag followed by the actual object in serialized form. This
minimizes the per-key or per-value overhead by using small integers instead
of long class-name strings.
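
A minimal sketch of such a tagged wrapper, written against plain java.io so
that it runs without Hadoop on the classpath. The class name TaggedValue, the
tag constants, and the choice of Float and String as the admissible types are
all assumptions made for illustration; they are not part of Hadoop or of the
reporter's code. Hadoop's own org.apache.hadoop.io.GenericWritable appears to
follow a similar idea and may be the better starting point in a real job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical wrapper: serialized form is a one-byte tag followed by the payload.
final class TaggedValue {
    // Tag values are assumptions for this sketch; a real job would fix its own set.
    static final byte TAG_FLOAT = 0;
    static final byte TAG_STRING = 1;

    private Object value; // always a Float or a String

    TaggedValue() {}

    TaggedValue(Object value) { set(value); }

    void set(Object value) {
        if (!(value instanceof Float) && !(value instanceof String)) {
            throw new IllegalArgumentException("unsupported value: " + value);
        }
        this.value = value;
    }

    Object get() { return value; }

    // Mirrors the shape of Writable.write: tag first, then the payload.
    void write(DataOutput out) throws IOException {
        if (value instanceof Float) {
            out.writeByte(TAG_FLOAT);
            out.writeFloat((Float) value);
        } else {
            out.writeByte(TAG_STRING);
            out.writeUTF((String) value);
        }
    }

    // Mirrors the shape of Writable.readFields: the tag selects the concrete type.
    void readFields(DataInput in) throws IOException {
        byte tag = in.readByte();
        switch (tag) {
            case TAG_FLOAT:
                value = in.readFloat();
                break;
            case TAG_STRING:
                value = in.readUTF();
                break;
            default:
                throw new IOException("unknown type tag: " + tag);
        }
    }

    // Helper for trying the round trip in memory.
    static byte[] toBytes(TaggedValue v) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        v.write(new DataOutputStream(buf));
        return buf.toByteArray();
    }

    static TaggedValue fromBytes(byte[] bytes) throws IOException {
        TaggedValue v = new TaggedValue();
        v.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return v;
    }
}
```

With this layout the declared value class is always TaggedValue, so the exact
class check passes, and the per-record cost of recording the type is a single
byte rather than a class-name string.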
> Relax the strict type check by allowing subclasses pass the check
> -----------------------------------------------------------------
>
> Key: HADOOP-5452
> URL: https://issues.apache.org/jira/browse/HADOOP-5452
> Project: Hadoop Core
> Issue Type: Improvement
> Reporter: he yongqiang
>
> The type check like:
> {code}
> if (key.getClass() != keyClass)
>   throw new IOException("wrong key class: "+key.getClass().getName()
>                         +" is not "+keyClass);
> if (val.getClass() != valClass)
>   throw new IOException("wrong value class: "+val.getClass().getName()
>                         +" is not "+valClass);
> {code}
> is used in many places where a type check is needed.
> I found it in org.apache.hadoop.io.SequenceFile,
> org.apache.hadoop.mapred.IFile, and org.apache.hadoop.mapred.MapTask. Because I
> searched only for (key.getClass() != keyClass), similar code may also appear in
> other classes.
> I suggest relaxing the strict type check by using
> {code}
> if (!keyClass.isAssignableFrom(key.getClass()))
> {code}
> The error in my situation is listed below:
> {panel:borderStyle=dashed| borderColor=#ccc| titleBGColor=#F7D6C1|
> bgColor=#FFFFCE}
> java.io.IOException: Type mismatch in value from map: expected
> cn.ac.ict.vega.type.Type, recieved cn.ac.ict.vega.type.Type$Float
> at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:553)
> at
> cn.ac.ict.vega.parse.mapreduce.block.FilterColumnBlockMapper.map(FilterColumnBlockMapper.java:77)
> at
> cn.ac.ict.vega.parse.mapreduce.block.BlockMapRunner.run(BlockMapRunner.java:33)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.Child.main(Child.java:155)
> {panel}
> Float is a subclass of Type, and I would like it to pass the check. I use Type
> instead of Float because I cannot determine in advance whether the value is a
> Float, a String, or some other type.
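
For a relaxed check like the one proposed in the description, the direction of
Class.isAssignableFrom matters: the declared class is the receiver and the
runtime class of the object is the argument. The sketch below contrasts the
exact check with a subclass-tolerant one. BaseType and FloatType are
hypothetical stand-ins for cn.ac.ict.vega.type.Type and its nested Float
class, and TypeCheck is an illustrative helper, not Hadoop code.

```java
// Hypothetical stand-ins for the reporter's Type and Type$Float classes.
class BaseType {}

class FloatType extends BaseType {}

final class TypeCheck {
    private TypeCheck() {}

    // Exact check, as in the quoted snippet: only the declared class itself passes.
    static boolean strict(Class<?> declared, Object o) {
        return o.getClass() == declared;
    }

    // Relaxed check: the declared class or any of its subclasses passes.
    static boolean relaxed(Class<?> declared, Object o) {
        return declared.isAssignableFrom(o.getClass());
    }
}
```

Here TypeCheck.relaxed(BaseType.class, new FloatType()) accepts the subclass
instance while TypeCheck.strict rejects it. Passing the check on the write
side does not by itself tell a reader which concrete class to instantiate,
which is the concern raised in the comment above.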
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.