[ https://issues.apache.org/jira/browse/HADOOP-7425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064305#comment-13064305 ]
steven zhuang commented on HADOOP-7425:
---------------------------------------
Hi Sudharsan, sure, you can use the data and scripts I uploaded (later) to do
the check.
I used "org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner" instead of
the newer "org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedPartitioner"
in the streaming command because in 0.21.0
org.apache.hadoop.mapred.JobConf.setPartitionerClass() still expects a
class that extends "org.apache.hadoop.mapred.Partitioner":
public void setPartitionerClass(Class<? extends Partitioner> theClass) {
  setClass("mapred.partitioner.class", theClass, Partitioner.class);
}
As for why KeyFieldBasedPartitioner is configured twice here:
the "o.a.h.mapred.lib.KeyFieldBasedPartitioner" class now extends
"o.a.h.mapreduce.lib.partition.KeyFieldBasedPartitioner" so that it can use
potential newer features (none have appeared yet), and that parent class is
Configurable. So in ReflectionUtils.setConf, the line:
"if (theObject instanceof Configurable) { ((Configurable)
theObject).setConf(conf); }"
is executed, which configures the partitioner once and adds a KeyDescription
to the KeyDescription list.
Later, whether or not the line above was executed, this line:
"setJobConf(theObject, conf);"
is always executed, which configures the Partitioner a second time and adds
another KeyDescription to the list, even though we specified only one.
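
The double configuration described above can be reproduced without Hadoop on
the classpath. The sketch below is only an illustration: every type in it is a
hypothetical stand-in (a String plays the role of the Configuration, and the
two partitioner classes only imitate the inheritance relationship described in
this comment). It also assumes that setJobConf ends up calling configure() on
the object and that the old-API configure() delegates to the inherited
setConf(); neither detail is quoted in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the Hadoop types; these are NOT the real classes.
class DoubleConfigureSketch {

    /** Stand-in for the Configurable interface (setter only). */
    interface Configurable {
        void setConf(String conf);
    }

    /** Stand-in for the old-API interface that exposes configure(). */
    interface JobConfigurable {
        void configure(String job);
    }

    /** Plays the new-API partitioner: every setConf call appends a key description. */
    static class NewApiPartitioner implements Configurable {
        final List<String> keyDescriptions = new ArrayList<>();

        @Override
        public void setConf(String conf) {
            keyDescriptions.add(conf);
        }
    }

    /** Plays the old-API partitioner, which extends the new-API one. */
    static class OldApiPartitioner extends NewApiPartitioner implements JobConfigurable {
        @Override
        public void configure(String job) {
            // Assumption: the old-API class delegates to the inherited setConf.
            setConf(job);
        }
    }

    /** Mirrors the control flow of ReflectionUtils.setConf as quoted in the issue. */
    static void setConf(Object theObject, String conf) {
        if (conf != null) {
            if (theObject instanceof Configurable) {
                ((Configurable) theObject).setConf(conf);   // first configuration
            }
            setJobConf(theObject, conf);                    // second configuration
        }
    }

    /** Assumption: setJobConf invokes configure() when the object supports it. */
    static void setJobConf(Object theObject, String conf) {
        if (theObject instanceof JobConfigurable) {
            ((JobConfigurable) theObject).configure(conf);
        }
    }

    /** Configures a fresh partitioner once and reports how many key descriptions it holds. */
    static int countKeyDescriptions(String keySpec) {
        OldApiPartitioner partitioner = new OldApiPartitioner();
        setConf(partitioner, keySpec);
        return partitioner.keyDescriptions.size();
    }

    public static void main(String[] args) {
        // One key spec goes in, but two key descriptions are recorded.
        System.out.println(countKeyDescriptions("-k1,1")); // prints 2
    }
}
```

Under these assumptions a single call to setConf leaves the partitioner with
two identical key descriptions, which matches the behaviour reported here.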
> ReflectionUtils.setConf would configure anything Configurable twice
> -------------------------------------------------------------------
>
> Key: HADOOP-7425
> URL: https://issues.apache.org/jira/browse/HADOOP-7425
> Project: Hadoop Common
> Issue Type: Bug
> Components: util
> Affects Versions: 0.21.0
> Reporter: steven zhuang
> Attachments: test.tar
>
>
> In the setConf method of org.apache.hadoop.util.ReflectionUtils, any
> instance of Configurable would be configured twice.
> In 0.21.0, KeyFieldBasedPartitioner implements the Configurable interface.
> When configured twice, it gets two KeyDescriptions and returns the wrong
> partition number.
>   public static void setConf(Object theObject, Configuration conf) {
>     if (conf != null) {
>       if (theObject instanceof Configurable) {
>         ((Configurable) theObject).setConf(conf);
>       }
>       setJobConf(theObject, conf);
>     }
>   }