[ https://issues.apache.org/jira/browse/CRUNCH-539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabriel Reid resolved CRUNCH-539.
---------------------------------
    Resolution: Fixed
      Assignee: Gabriel Reid
 Fix Version/s: 0.13.0

Pushed to master

> Use of TupleWritable.setConf fails in mapper/reducer
> ----------------------------------------------------
>
>                 Key: CRUNCH-539
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-539
>             Project: Crunch
>          Issue Type: Bug
>    Affects Versions: 0.12.0
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>             Fix For: 0.13.0
>
>         Attachments: CRUNCH-539.patch
>
>
> In (at least) more recent versions of Hadoop 2, the implicit call to
> TupleWritable.setConf that happens when using TupleWritables fails with a
> ClassNotFoundException for (ironically) the TupleWritable class itself.
> This appears to be due to the way that ObjectInputStream resolves classes in
> its [resolveClass
> method|https://docs.oracle.com/javase/7/docs/api/java/io/ObjectInputStream.html#resolveClass(java.io.ObjectStreamClass)],
> together with the way that the context classloader is set within a Hadoop
> mapper or reducer.
> This is similar to PIG-2532.
> This can be reproduced in the local job tracker (at least) in Hadoop 2.7.0,
> but it can't be reproduced in Crunch integration tests (due to classloading
> setup). It appears that this issue is only present in Crunch 0.12.
> The following code within a simple pipeline will cause this issue to occur:
> {code}
> PTable<String, Integer> yearTemperatures = ... /* Writable-based PTable */
> PTable<String, Integer> maxTemps = yearTemperatures
>     .groupByKey()
>     .combineValues(Aggregators.MAX_INTS())
>     .top(1); // LINE THAT CAUSES THE ERROR
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
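For context on the failure mode described above: ObjectInputStream.resolveClass, by default, resolves classes against the "latest user-defined classloader" on the call stack rather than the thread context classloader that Hadoop sets up for mappers and reducers, so classes visible only to the job's classloader (such as TupleWritable in this setup) can fail to load. The usual workaround, as in PIG-2532, is to subclass ObjectInputStream and consult the context classloader first. The sketch below is a generic illustration of that pattern, not the actual CRUNCH-539.patch; the class name is hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

// Hypothetical helper (not Crunch code): an ObjectInputStream that prefers
// the thread context classloader when resolving classes, the standard fix
// for ClassNotFoundException of this kind (cf. PIG-2532).
class ContextClassLoaderObjectInputStream extends ObjectInputStream {

  ContextClassLoaderObjectInputStream(InputStream in) throws IOException {
    super(in);
  }

  @Override
  protected Class<?> resolveClass(ObjectStreamClass desc)
      throws IOException, ClassNotFoundException {
    ClassLoader ctxLoader = Thread.currentThread().getContextClassLoader();
    if (ctxLoader != null) {
      try {
        // Try the context classloader first, where Hadoop has made the
        // job's classes visible.
        return Class.forName(desc.getName(), false, ctxLoader);
      } catch (ClassNotFoundException e) {
        // Fall through to the default resolution below.
      }
    }
    return super.resolveClass(desc);
  }
}
```

Deserialization that otherwise fails inside a task can then be routed through this stream instead of a plain ObjectInputStream.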