On 8/17/11 1:32 AM, "Vyacheslav Zholudev" <[email protected]> wrote:
> Hi Scott,
>
> The pair types are Pair<CharSequence, SomeSpecificJavaClass>, but in essence
> when I call "collect()" then I always provide a java.lang.String object.
>
> The reduce method is
> reduce(CharSequence key, Iterable<SomeSpecificJavaClass> values, .....)

What happens if you change it to Pair<String, SomeSpecificJavaClass> or
<Utf8, SomeSpecificJavaClass>? Does the problem persist?

>
> Some more detailed info:
> the jobtracker and namenode run with:
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03, mixed mode)
>
> the tasktrackers and datanodes run with:
> java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
>
> Hadoop version is:
> cdh3u1
>
> Thanks for suggestions,
> Vyacheslav
>
>
> On Aug 17, 2011, at 3:56 AM, Scott Carey wrote:
>
>> On 8/16/11 3:56 PM, "Vyacheslav Zholudev" <[email protected]>
>> wrote:
>>
>>> Hi, Scott,
>>>
>>> thanks for your reply.
>>>
>>>> What Avro version is this happening with? What JVM version?
>>>
>>> We are using Avro 1.5.1 and Sun JDK 6, but the exact version I will have
>>> to look up.
>>>
>>>> On a hunch, have you tried adding -XX:-UseLoopPredicate to the JVM args
>>>> if it is Sun and JRE 6u21 or later? (some issues in loop predicates
>>>> affect Java 6 too, just not as many as the recent news on Java7).
>>>>
>>>> Otherwise, it may likely be the same thing as AVRO-782. Any extra
>>>> information related to that issue would be welcome.
>>>
>>> I will have to collect it. In the meanwhile, do you have any reasonable
>>> explanations of the issue besides it being something like AVRO-782?
>>
>> What is your key type (map output schema, first type argument of Pair)?
>> Is your key a Utf8 or String? I don't have a reasonable explanation at
>> this point, I haven't looked into it in depth with a good reproducible
>> case. I have my suspicions with how recycling of the key works since Utf8
>> is mutable and its backing byte[] can end up shared.
>>
>>>
>>> Thanks a lot,
>>> Vyacheslav
>>>
>>>>
>>>> Thanks!
>>>>
>>>> -Scott
>>>>
>>>>
>>>> On 8/16/11 8:39 AM, "Vyacheslav Zholudev"
>>>> <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'm having multiple hadoop jobs that use the avro mapred API.
>>>>> Only in one of the jobs I have a visible mismatch between a number of
>>>>> map output records and reducer input records.
>>>>>
>>>>> Does anybody encountered such a behavior? Can anybody think of possible
>>>>> explanations of this phenomenon?
>>>>>
>>>>> Any pointers/thoughts are highly appreciated!
>>>>>
>>>>> Best,
>>>>> Vyacheslav
>>>>
>>>
>>> Best,
>>> Vyacheslav
>>>
>>
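
To make the key-recycling suspicion above concrete, here is a minimal sketch that does not use Avro at all. MutableKey is a hypothetical stand-in for a mutable, reusable string key (it is not Avro's Utf8, and nothing here claims to reproduce what Avro or Hadoop actually do internally). It only shows the general hazard: if code keeps references to a key object that the framework overwrites in place, distinct keys can silently collapse into one.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KeyReuseSketch {

    /** Hypothetical mutable key with a reusable buffer; NOT Avro's Utf8. */
    static final class MutableKey {
        private byte[] bytes = new byte[16];
        private int length;

        /** Overwrites the contents in place, growing the buffer only when needed. */
        MutableKey set(String value) {
            byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
            if (encoded.length > bytes.length) {
                bytes = new byte[encoded.length];
            }
            System.arraycopy(encoded, 0, bytes, 0, encoded.length);
            length = encoded.length;
            return this;
        }

        @Override
        public String toString() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    /** Stores references to the one recycled key object, then reads them back. */
    static Set<String> distinctKeysHoldingReferences(List<String> input) {
        MutableKey reused = new MutableKey();
        List<MutableKey> held = new ArrayList<MutableKey>();
        for (String s : input) {
            held.add(reused.set(s)); // same object every time
        }
        Set<String> distinct = new HashSet<String>();
        for (MutableKey k : held) {
            distinct.add(k.toString()); // every entry now shows the last value
        }
        return distinct;
    }

    /** Copies each key to an immutable String before the key is overwritten. */
    static Set<String> distinctKeysCopying(List<String> input) {
        MutableKey reused = new MutableKey();
        Set<String> distinct = new HashSet<String>();
        for (String s : input) {
            distinct.add(reused.set(s).toString());
        }
        return distinct;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c");
        System.out.println("holding references: " + distinctKeysHoldingReferences(keys));
        System.out.println("copying first:      " + distinctKeysCopying(keys));
    }
}
```

With three distinct input keys, the first method ends up with a single distinct value and the second with three. Whether anything like this is behind the record-count mismatch in the job above is not established in this thread; the sketch only illustrates why switching the key type to an immutable String is a reasonable experiment.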
