On 1/12/10 6:53 PM, Wilkes, Chris wrote:
> I created my own Writable class to store 3 pieces of information. In my
> mapreducer.Reducer class I collect all of them and then process as a
> group, ie:
>
> reduce(key, values, context) {
>   List<Foo> myFoos = new ArrayList();
>   for (Foo value : values) {
>     myFoos.add(value);
>   }
> }
snip
>
> Am I doing something wrong? Should I expect this VALUEIN object to
> change from underneath me? I'm using hadoop 0.20.1 (from a cloudera
> tarball)

That's the documented behavior. Hadoop reuses the same Writable instance
and replaces its *members* in the readFields() method in most cases (all
cases?). The instance of Foo in your example will be the same object each
time through the loop and simply has its members overwritten by each call
to readFields().

Currently, you're building a list of references to that one object. At
the end of your for loop, you'll have a list of N entries that all point
to the same Foo and so all show the same data (the last value read). This
is one of those "gotchas."

If you really need to build a list like this, you'd have to resort to
making a deep copy of each value before adding it, but you're better off
avoiding that if you can, as it will drastically impact performance and
adds the requirement that all values for a given key fit in memory.

Hope this helps.
--
Eric Sammer
e...@lifeless.net
http://esammer.blogspot.com
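
To make the reuse behavior concrete, here is a minimal sketch in plain Java that does not use Hadoop at all. Foo, ReuseDemo and reusingValues() are hypothetical names made up for illustration; the iterable only imitates a framework that hands back one instance and overwrites its fields on every step. It compares collecting the reference with collecting a copy:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReuseDemo {

    // Hypothetical stand-in for the custom Writable; not a Hadoop class.
    static final class Foo {
        int id;
        Foo(int id) { this.id = id; }
        // Copy constructor. With only a primitive field this is a full deep copy.
        Foo(Foo other) { this.id = other.id; }
    }

    // Imitates the framework: a single Foo is returned on every call to
    // next(), with its field overwritten first (assumed simulation).
    static Iterable<Foo> reusingValues(final int[] ids) {
        final Foo shared = new Foo(0);
        return new Iterable<Foo>() {
            @Override
            public Iterator<Foo> iterator() {
                return new Iterator<Foo>() {
                    private int pos = 0;
                    @Override
                    public boolean hasNext() { return pos < ids.length; }
                    @Override
                    public Foo next() {
                        if (!hasNext()) throw new NoSuchElementException();
                        shared.id = ids[pos++]; // overwrite members in place
                        return shared;          // same object every time
                    }
                    @Override
                    public void remove() { throw new UnsupportedOperationException(); }
                };
            }
        };
    }

    // The pattern from the question: stores N references to one object.
    static List<Foo> collectByReference(int[] ids) {
        List<Foo> out = new ArrayList<Foo>();
        for (Foo value : reusingValues(ids)) {
            out.add(value);
        }
        return out;
    }

    // The workaround: copy each value before keeping it.
    static List<Foo> collectByCopy(int[] ids) {
        List<Foo> out = new ArrayList<Foo>();
        for (Foo value : reusingValues(ids)) {
            out.add(new Foo(value));
        }
        return out;
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3};
        for (Foo f : collectByReference(ids)) System.out.print(f.id + " "); // prints "3 3 3 "
        System.out.println();
        for (Foo f : collectByCopy(ids)) System.out.print(f.id + " ");      // prints "1 2 3 "
        System.out.println();
    }
}
```

In a real reducer the copy would have to be made from the Writable itself, for example with a copy constructor like the one above. I believe WritableUtils.clone(value, conf) can also do this, but check that against the API docs for your Hadoop version. Either way, the memory caveat still applies: every value for the key is held on the heap at once.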