[ https://issues.apache.org/jira/browse/MAPREDUCE-190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved MAPREDUCE-190.
----------------------------------------
Resolution: Incomplete
I'm going to close this out as stale.
> MultipleOutputs should use newer Hadoop serialization interface since 0.19
> --------------------------------------------------------------------------
>
> Key: MAPREDUCE-190
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-190
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Environment: Environment-independent issue
> Reporter: Mikhail Yakshin
>
> We have a system based on Hadoop 0.18 / Cascading 0.8.1, and I'm now trying
> to port it to Hadoop 0.19 / Cascading 1.0. The first serious problem I've run
> into is that we use MultipleOutputs extensively in jobs that deal with
> sequence files storing Cascading's Tuples.
> Since Cascading 0.9, Tuples are no longer WritableComparable; they implement
> the generic Hadoop serialization interface and framework instead. However, in
> Hadoop 0.19, MultipleOutputs still requires the older WritableComparable
> interface for named output key classes.
> Thus, trying to do something like:
> {noformat}
> MultipleOutputs.addNamedOutput(conf, "output-name",
> MySpecialMultiSplitOutputFormat.class, Tuple.class, Tuple.class);
> mos = new MultipleOutputs(conf);
> ...
> mos.getCollector("output-name", reporter).collect(tuple1, tuple2);
> {noformat}
> yields an error:
> {noformat}
> java.lang.RuntimeException: java.lang.RuntimeException: class cascading.tuple.Tuple not org.apache.hadoop.io.WritableComparable
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:752)
>     at org.apache.hadoop.mapred.lib.MultipleOutputs.getNamedOutputKeyClass(MultipleOutputs.java:252)
>     at org.apache.hadoop.mapred.lib.MultipleOutputs$InternalFileOutputFormat.getRecordWriter(MultipleOutputs.java:556)
>     at org.apache.hadoop.mapred.lib.MultipleOutputs.getRecordWriter(MultipleOutputs.java:425)
>     at org.apache.hadoop.mapred.lib.MultipleOutputs.getCollector(MultipleOutputs.java:511)
>     at org.apache.hadoop.mapred.lib.MultipleOutputs.getCollector(MultipleOutputs.java:476)
>     at my.namespace.MyReducer.reduce(MyReducer.java:xxx)
> {noformat}
> As I understand it, MultipleOutputs should eventually be ported to use the
> more generic Hadoop serialization framework.
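
For readers unfamiliar with the failure, the trace suggests that the named output key class is looked up through a configuration call that also demands a specific interface, and that the lookup throws when the stored class does not implement it. The following is a minimal plain-Java sketch of that kind of check, not Hadoop's actual code. `NamedOutputTypeCheck`, `WritableComparableLike`, `LegacyKey`, `TupleLike`, and `getClassChecked` are all hypothetical stand-ins invented for illustration, and the message format is only modeled on the one in the trace above.

```java
import java.util.HashMap;
import java.util.Map;

public class NamedOutputTypeCheck {
    // Hypothetical stand-in for org.apache.hadoop.io.WritableComparable.
    interface WritableComparableLike {}

    // A key type that implements the older interface.
    static class LegacyKey implements WritableComparableLike {}

    // Hypothetical stand-in for cascading.tuple.Tuple: serializable through a
    // pluggable framework, but not an implementor of the older interface.
    static class TupleLike {}

    // Toy replacement for a job configuration: property name -> class.
    private final Map<String, Class<?>> classes = new HashMap<>();

    void setClass(String name, Class<?> value) {
        classes.put(name, value);
    }

    // Returns the stored class, but only if it implements the required interface.
    <U> Class<? extends U> getClassChecked(String name, Class<U> xface) {
        Class<?> theClass = classes.get(name);
        if (theClass == null) {
            return null;
        }
        if (!xface.isAssignableFrom(theClass)) {
            // Assumed message shape, modeled on "class X not Y" in the trace.
            throw new RuntimeException(theClass + " not " + xface.getName());
        }
        return theClass.asSubclass(xface);
    }

    // Registers keyClass as a named output key class and reports the outcome.
    static String describe(Class<?> keyClass) {
        NamedOutputTypeCheck conf = new NamedOutputTypeCheck();
        conf.setClass("named.output.key", keyClass);
        try {
            conf.getClassChecked("named.output.key", WritableComparableLike.class);
            return "accepted";
        } catch (RuntimeException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(LegacyKey.class));
        System.out.println(describe(TupleLike.class));
    }
}
```

In this sketch the check depends only on the declared interface, so a type that is perfectly serializable by other means is still rejected. That appears to be the situation the reporter describes for Tuple.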