HADOOP-2399 has caused a lot of problems for users so far, and the saga still continues :-(
I remember spending 18 straight hours in 2008 with a user debugging this issue.

- milind

---
Milind Bhandarkar
Greenplum Labs, EMC
(Disclaimer: Opinions expressed in this email are those of the author, and do
not necessarily represent the views of any organization, past or present, the
author might be affiliated with.)

On 8/3/11 4:19 AM, "Joey Echeverria" <j...@cloudera.com> wrote:

>Hadoop reuses objects as an optimization. If you need to keep a copy
>in memory, you need to call clone yourself. I've never used Avro, but
>my guess is that the BARs are not reused, only the FOO.
>
>-Joey
>
>On Wed, Aug 3, 2011 at 3:18 AM, Vyacheslav Zholudev
><vyacheslav.zholu...@gmail.com> wrote:
>> Hi all,
>>
>> I'm using Avro as a serialization format, and assume I have a generated
>> specific class FOO that I use as a Mapper output format:
>>
>> class FOO {
>>     int a;
>>     List<BAR> barList;
>> }
>>
>> where BAR is another generated specific Java class.
>>
>> When I iterate over "Iterable<FOO> values" in the Reducer, it is clear
>> that the same object of class FOO is reused, i.e.
>>
>> Iterator<FOO> it = values.iterator();
>> FOO foo1 = it.next();
>> FOO foo2 = it.next();
>> assertThat(foo1 == foo2, is(true));
>>
>> So I have the following questions:
>> 1) Is the list barList reused across next() calls?
>> 2) If yes, can the objects inside barList also be reused? For
>> example, if the first time next() is called the list contains two BAR
>> objects, and the next time next() is called barList contains three objects,
>> two of which are equal by reference to the two from the first next() call.
>> In other words, does Hadoop maintain some sort of "object pool"?
>> 3) Why doesn't AvroTools generate clone() methods, since that would be
>> quite straightforward and, more importantly, useful given that objects are
>> reused?
>>
>> Thanks a lot in advance!
>>
>> Vyacheslav
>
>
>--
>Joseph Echeverria
>Cloudera, Inc.
>443.305.9434
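
For reference, a minimal sketch of the defensive-copy approach Joey describes,
assuming an Avro version that provides GenericData#deepCopy (reachable through
SpecificData) and a hypothetical reducer whose value type is the generated FOO
class directly; the actual Avro MapReduce wrapper types in a real job may differ:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.avro.specific.SpecificData;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reducer that buffers FOO values (FOO being the generated Avro
// specific class from the thread, not shown here). Because Hadoop reuses the
// value object across next() calls, each value is deep-copied before being
// stored so the buffered objects do not all alias the same reused instance.
public class BufferingReducer extends Reducer<Text, FOO, Text, NullWritable> {

  @Override
  protected void reduce(Text key, Iterable<FOO> values, Context context)
      throws IOException, InterruptedException {
    List<FOO> buffered = new ArrayList<FOO>();
    for (FOO value : values) {
      // deepCopy recursively copies nested records and lists (e.g. barList),
      // so the copies share no state with the reused FOO instance.
      buffered.add(SpecificData.get().deepCopy(value.getSchema(), value));
    }
    // ... work with the independent, buffered copies here ...
  }
}

The same deepCopy call works for any specific or generic Avro record, which is
one way to get clone-like behavior without generated clone() methods.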