[ 
https://issues.apache.org/jira/browse/HADOOP-7425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064305#comment-13064305
 ] 

steven zhuang commented on HADOOP-7425:
---------------------------------------

Hi Sudharsan, sure, you can use the data and scripts I uploaded (later) to do 
the check. 

I used "org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner" instead of 
the newer "org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedPartitioner" 
in the streaming command because, in 0.21.0, 
org.apache.hadoop.mapred.JobConf.setPartitionerClass() still expects a 
class that extends "org.apache.hadoop.mapred.Partitioner":

public void setPartitionerClass(Class<? extends Partitioner> theClass) {
  setClass("mapred.partitioner.class", theClass, Partitioner.class);
}
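
To illustrate why only the old-API class is accepted, here is a minimal 
sketch that does not use Hadoop at all. Every type in it is a hypothetical 
stand-in: OldApiPartitioner plays the role of 
"org.apache.hadoop.mapred.Partitioner", NewApiPartitioner the role of 
"org.apache.hadoop.mapreduce.Partitioner", and the two KeyField classes mirror 
the inheritance described in this comment. It only shows the assignability 
check implied by the "? extends Partitioner" bound.

```java
public class PartitionerBoundSketch {
    // Hypothetical stand-in for org.apache.hadoop.mapred.Partitioner (old API).
    interface OldApiPartitioner {}

    // Hypothetical stand-in for org.apache.hadoop.mapreduce.Partitioner (new API).
    static abstract class NewApiPartitioner {}

    // Stand-in for o.a.h.mapreduce.lib.partition.KeyFieldBasedPartitioner.
    static class NewKeyFieldPartitioner extends NewApiPartitioner {}

    // Stand-in for o.a.h.mapred.lib.KeyFieldBasedPartitioner: it extends the
    // new-API class and is also an old-API Partitioner.
    static class OldKeyFieldPartitioner extends NewKeyFieldPartitioner
            implements OldApiPartitioner {}

    // Mirrors the constraint "Class<? extends Partitioner>" at run time.
    static boolean acceptedByOldApi(Class<?> candidate) {
        return OldApiPartitioner.class.isAssignableFrom(candidate);
    }

    public static void main(String[] args) {
        System.out.println(acceptedByOldApi(OldKeyFieldPartitioner.class)); // true
        System.out.println(acceptedByOldApi(NewKeyFieldPartitioner.class)); // false
    }
}
```

With the real classes the same mismatch would already show up at compile time 
as a type error on the setPartitionerClass call.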

As for why KeyFieldBasedPartitioner is configured twice here: the 
"o.a.h.mapred.lib.KeyFieldBasedPartitioner" class now extends 
"o.a.h.mapreduce.lib.partition.KeyFieldBasedPartitioner" (to pick up potential 
newer features, which have not appeared yet), and that class is Configurable. 
So in ReflectionUtils.setConf, the line:

"if (theObject instanceof Configurable) { ((Configurable) 
theObject).setConf(conf); }" 

is executed, which configures the partitioner once and adds a KeyDescription 
to the KeyDescription list.
Later, whether or not the line above was executed, this line:

"setJobConf(theObject, conf);"

is always executed, which configures the Partitioner a second time and adds 
another KeyDescription to the list, even though we specified only one.
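
The sequence above can be reproduced with a small self-contained sketch. It 
does not use Hadoop: Configurable, JobConfigurable, and SketchPartitioner are 
hypothetical stand-ins, the configuration is reduced to a plain String, and 
the assumption that configure() simply delegates to setConf() is mine. Only 
the shape of setConf/setJobConf follows the code quoted in this issue.

```java
import java.util.ArrayList;
import java.util.List;

public class DoubleConfigureSketch {
    // Hypothetical stand-ins for the Hadoop interfaces; the "conf" is just a String here.
    interface Configurable { void setConf(String keySpec); }
    interface JobConfigurable { void configure(String keySpec); }

    // Stand-in partitioner: every configuration call appends one key description.
    static class SketchPartitioner implements Configurable, JobConfigurable {
        final List<String> keyDescriptions = new ArrayList<>();

        @Override
        public void setConf(String keySpec) { keyDescriptions.add(keySpec); }

        // Assumption: the old-API configure() delegates to the new-API setConf().
        @Override
        public void configure(String keySpec) { setConf(keySpec); }
    }

    // Same control flow as the setConf method quoted in the issue description.
    static void setConf(Object theObject, String conf) {
        if (conf != null) {
            if (theObject instanceof Configurable) {
                ((Configurable) theObject).setConf(conf);
            }
            setJobConf(theObject, conf);
        }
    }

    // Simplified stand-in: calls configure() on anything JobConfigurable.
    static void setJobConf(Object theObject, String conf) {
        if (theObject instanceof JobConfigurable) {
            ((JobConfigurable) theObject).configure(conf);
        }
    }

    // Configures one partitioner with a single key spec and reports the list size.
    static int descriptionsAfterSetConf() {
        SketchPartitioner partitioner = new SketchPartitioner();
        setConf(partitioner, "-k1,1");
        return partitioner.keyDescriptions.size();
    }

    public static void main(String[] args) {
        System.out.println(descriptionsAfterSetConf()); // prints 2, although one spec was given
    }
}
```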


> ReflectionUtils.setConf would configure anything Configurable twice
> -------------------------------------------------------------------
>
>                 Key: HADOOP-7425
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7425
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.21.0
>            Reporter: steven zhuang
>         Attachments: test.tar
>
>
> In the setConf method of org.apache.hadoop.util.ReflectionUtils, any 
> instance of Configurable would be configured twice.
> In 0.21.0, KeyFieldBasedPartitioner implements the Configurable interface. 
> When configured twice, it gets two KeyDescriptions and returns the wrong 
> partition number. 
> public static void setConf(Object theObject, Configuration conf) {
>     if (conf != null) {
>       if (theObject instanceof Configurable) {
>         ((Configurable) theObject).setConf(conf);
>       }
>       setJobConf(theObject, conf);
>     }
>   }

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
