[
https://issues.apache.org/jira/browse/AVRO-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019552#comment-13019552
]
ey-chih chow commented on AVRO-792:
-----------------------------------
Thanks. I tested the third patch in our environment. Unfortunately, it
did not fix the problem. The trace from our VM follows.
===============================================================================================================================
cloudera@cloudera-demo:~/src/ngpipes-etl/dist$ hadoop jar ngpipesjobs.jar
com.ngmoco.ngpipes.etl.NgEventETLJob input/etl/test_avro_bugfix/2011-04-12/0200
etl_out avro/ngpipes-events.avdl
Input Path => input/etl/test_avro_bugfix/2011-04-12/0200
Log Start Time => 2011:04:12:02
Setting Job Name => NgEventETLJob 2011:04:12:02 2011:04:12:03
Output Path => etl_out
Fetching From URL => http://partner.plusplus.com/admin/products.json
isProduction => false
11/04/12 10:18:14 INFO etl.NgEventETLJob: Setting plus.json.games.table
11/04/12 10:18:14 INFO mapred.FileInputFormat: Total input paths to process : 4
11/04/12 10:18:15 INFO mapred.JobClient: Running job: job_201104081805_0001
11/04/12 10:18:16 INFO mapred.JobClient: map 0% reduce 0%
11/04/12 10:18:28 INFO mapred.JobClient: map 20% reduce 0%
11/04/12 10:18:29 INFO mapred.JobClient: map 40% reduce 0%
11/04/12 10:18:35 INFO mapred.JobClient: map 80% reduce 0%
11/04/12 10:18:39 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:18:43 INFO mapred.JobClient: map 100% reduce 26%
11/04/12 10:18:46 INFO mapred.JobClient: Task Id :
attempt_201104081805_0001_r_000000_0, Status : FAILED
11/04/12 10:18:47 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:18:57 INFO mapred.JobClient: Task Id :
attempt_201104081805_0001_r_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
at
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
at
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
at
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
at
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
at
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:19:05 INFO mapred.JobClient: map 100% reduce 26%
11/04/12 10:19:08 INFO mapred.JobClient: Task Id :
attempt_201104081805_0001_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
at
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
at
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
at
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
at
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
at
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
at
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:19:10 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:19:22 INFO mapred.JobClient: Job complete: job_201104081805_0001
11/04/12 10:19:22 INFO mapred.JobClient: Counters: 31
11/04/12 10:19:22 INFO mapred.JobClient:
com.ngmoco.ngpipes.utils.NgPipesGlobals$EventClassCounter
11/04/12 10:19:22 INFO mapred.JobClient: PLUS_EVENT=249
11/04/12 10:19:22 INFO mapred.JobClient: REV_EVENT=1
11/04/12 10:19:22 INFO mapred.JobClient: PC_REV_EVENT=1
11/04/12 10:19:22 INFO mapred.JobClient: Job Counters
11/04/12 10:19:22 INFO mapred.JobClient: Launched reduce tasks=4
11/04/12 10:19:22 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=34290
11/04/12 10:19:22 INFO mapred.JobClient: Total time spent by all reduces
waiting after reserving slots (ms)=0
11/04/12 10:19:22 INFO mapred.JobClient: Total time spent by all maps
waiting after reserving slots (ms)=0
11/04/12 10:19:22 INFO mapred.JobClient: Launched map tasks=5
11/04/12 10:19:22 INFO mapred.JobClient: Data-local map tasks=5
11/04/12 10:19:22 INFO mapred.JobClient: Failed reduce tasks=1
11/04/12 10:19:22 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=53407
11/04/12 10:19:22 INFO mapred.JobClient:
com.ngmoco.ngpipes.etl.NgEventETLMapper$EventSourceTypes
11/04/12 10:19:22 INFO mapred.JobClient: PLUS_SERVER=222
11/04/12 10:19:22 INFO mapred.JobClient: PLUS_CLIENT=28
11/04/12 10:19:22 INFO mapred.JobClient: FileSystemCounters
11/04/12 10:19:22 INFO mapred.JobClient: HDFS_BYTES_READ=472855
11/04/12 10:19:22 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1164803
11/04/12 10:19:22 INFO mapred.JobClient:
com.ngmoco.ngpipes.etl.NgEventETLMapper$Event
11/04/12 10:19:22 INFO mapred.JobClient: ERR_NO_AFAM=133
11/04/12 10:19:22 INFO mapred.JobClient: ERR_NULL_VALUE=109
11/04/12 10:19:22 INFO mapred.JobClient: DISCARDED_EVENTS=1058
11/04/12 10:19:22 INFO mapred.JobClient: ERR_NO_PUBL=112
11/04/12 10:19:22 INFO mapred.JobClient: ERR_MAPPING_ASKU_TO_AFAM=676
11/04/12 10:19:22 INFO mapred.JobClient: ERR_NO_ASKU=225
11/04/12 10:19:22 INFO mapred.JobClient: ERR_EMPTY_MAP=182
11/04/12 10:19:22 INFO mapred.JobClient: ERR_OTHER=45
11/04/12 10:19:22 INFO mapred.JobClient: Map-Reduce Framework
11/04/12 10:19:22 INFO mapred.JobClient: Combine output records=0
11/04/12 10:19:22 INFO mapred.JobClient: Map input records=1281
11/04/12 10:19:22 INFO mapred.JobClient: Spilled Records=205
11/04/12 10:19:22 INFO mapred.JobClient: Map output bytes=41281
11/04/12 10:19:22 INFO mapred.JobClient: Map input bytes=468793
11/04/12 10:19:22 INFO mapred.JobClient: Combine input records=0
11/04/12 10:19:22 INFO mapred.JobClient: Map output records=205
11/04/12 10:19:22 INFO mapred.JobClient: SPLIT_RAW_BYTES=889
11/04/12 10:19:22 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
at com.ngmoco.ngpipes.etl.NgEventETLJob.runJob(NgEventETLJob.java:160)
at com.ngmoco.ngpipes.etl.NgEventETLJob.run(NgEventETLJob.java:108)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.ngmoco.ngpipes.etl.NgEventETLJob.main(NgEventETLJob.java:189)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
cloudera@cloudera-demo:~/src/ngpipes-etl/dist$ hadoop jar ngpipesjobs.jar
com.ngmoco.ngpipes.etl.NgEventETLJob input/etl/test_avro_bugfix/2011-04-12/0200
etl_out avro/ngpipes-events.avdl
Input Path => input/etl/test_avro_bugfix/2011-04-12/0200
Log Start Time => 2011:04:12:02
Setting Job Name => NgEventETLJob 2011:04:12:02 2011:04:12:03
Output Path => etl_out
Fetching From URL => http://partner.plusplus.com/admin/products.json
isProduction => false
11/04/12 10:30:33 INFO etl.NgEventETLJob: Setting plus.json.games.table
11/04/12 10:30:34 INFO mapred.FileInputFormat: Total input paths to process : 4
11/04/12 10:30:34 INFO mapred.JobClient: Running job: job_201104081805_0002
11/04/12 10:30:35 INFO mapred.JobClient: map 0% reduce 0%
11/04/12 10:30:44 INFO mapred.JobClient: map 40% reduce 0%
11/04/12 10:30:51 INFO mapred.JobClient: map 60% reduce 0%
11/04/12 10:30:52 INFO mapred.JobClient: map 80% reduce 0%
11/04/12 10:30:55 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:31:00 INFO mapred.JobClient: map 100% reduce 33%
11/04/12 10:31:03 INFO mapred.JobClient: Task Id :
attempt_201104081805_0002_r_000000_0, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
at
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
at
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
at
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
at
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
at
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
at
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:31:04 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:31:11 INFO mapred.JobClient: map 100% reduce 33%
11/04/12 10:31:14 INFO mapred.JobClient: Task Id :
attempt_201104081805_0002_r_000000_1, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
at
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
at
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
at
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
at
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
at
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
at
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:31:16 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:31:26 INFO mapred.JobClient: Task Id :
attempt_201104081805_0002_r_000000_2, Status : FAILED
java.lang.ArrayIndexOutOfBoundsException: 3
at
org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
at
org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:246)
at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
at
org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:223)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:123)
at
org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:147)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:119)
at
org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:110)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
at
org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
at
org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
at
org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
at
org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:39)
at
com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
at
org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
at
org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
at org.apache.hadoop.mapred.Child.main(Child.java:234)
11/04/12 10:31:34 INFO mapred.JobClient: map 100% reduce 13%
11/04/12 10:31:38 INFO mapred.JobClient: map 100% reduce 0%
11/04/12 10:31:38 INFO mapred.JobClient: Job complete: job_201104081805_0002
11/04/12 10:31:38 INFO mapred.JobClient: Counters: 31
11/04/12 10:31:38 INFO mapred.JobClient:
com.ngmoco.ngpipes.utils.NgPipesGlobals$EventClassCounter
11/04/12 10:31:38 INFO mapred.JobClient: PLUS_EVENT=249
11/04/12 10:31:38 INFO mapred.JobClient: REV_EVENT=1
11/04/12 10:31:38 INFO mapred.JobClient: PC_REV_EVENT=1
11/04/12 10:31:38 INFO mapred.JobClient: Job Counters
11/04/12 10:31:38 INFO mapred.JobClient: Launched reduce tasks=4
11/04/12 10:31:38 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=32438
11/04/12 10:31:38 INFO mapred.JobClient: Total time spent by all reduces
waiting after reserving slots (ms)=0
11/04/12 10:31:38 INFO mapred.JobClient: Total time spent by all maps
waiting after reserving slots (ms)=0
11/04/12 10:31:38 INFO mapred.JobClient: Launched map tasks=5
11/04/12 10:31:38 INFO mapred.JobClient: Data-local map tasks=5
11/04/12 10:31:38 INFO mapred.JobClient: Failed reduce tasks=1
11/04/12 10:31:38 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=52425
11/04/12 10:31:38 INFO mapred.JobClient:
com.ngmoco.ngpipes.etl.NgEventETLMapper$EventSourceTypes
11/04/12 10:31:38 INFO mapred.JobClient: PLUS_SERVER=222
11/04/12 10:31:38 INFO mapred.JobClient: PLUS_CLIENT=28
11/04/12 10:31:38 INFO mapred.JobClient: FileSystemCounters
11/04/12 10:31:38 INFO mapred.JobClient: HDFS_BYTES_READ=472855
11/04/12 10:31:38 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1164803
11/04/12 10:31:38 INFO mapred.JobClient:
com.ngmoco.ngpipes.etl.NgEventETLMapper$Event
11/04/12 10:31:38 INFO mapred.JobClient: ERR_NO_AFAM=133
11/04/12 10:31:38 INFO mapred.JobClient: ERR_NULL_VALUE=109
11/04/12 10:31:38 INFO mapred.JobClient: DISCARDED_EVENTS=1058
11/04/12 10:31:38 INFO mapred.JobClient: ERR_NO_PUBL=112
11/04/12 10:31:38 INFO mapred.JobClient: ERR_MAPPING_ASKU_TO_AFAM=676
11/04/12 10:31:38 INFO mapred.JobClient: ERR_NO_ASKU=225
11/04/12 10:31:38 INFO mapred.JobClient: ERR_EMPTY_MAP=182
11/04/12 10:31:38 INFO mapred.JobClient: ERR_OTHER=45
11/04/12 10:31:38 INFO mapred.JobClient: Map-Reduce Framework
11/04/12 10:31:38 INFO mapred.JobClient: Combine output records=0
11/04/12 10:31:38 INFO mapred.JobClient: Map input records=1281
11/04/12 10:31:38 INFO mapred.JobClient: Spilled Records=205
11/04/12 10:31:38 INFO mapred.JobClient: Map output bytes=41281
11/04/12 10:31:38 INFO mapred.JobClient: Map input bytes=468793
11/04/12 10:31:38 INFO mapred.JobClient: Combine input records=0
11/04/12 10:31:38 INFO mapred.JobClient: Map output records=205
11/04/12 10:31:38 INFO mapred.JobClient: SPLIT_RAW_BYTES=889
11/04/12 10:31:38 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1246)
at com.ngmoco.ngpipes.etl.NgEventETLJob.runJob(NgEventETLJob.java:160)
at com.ngmoco.ngpipes.etl.NgEventETLJob.run(NgEventETLJob.java:108)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.ngmoco.ngpipes.etl.NgEventETLJob.main(NgEventETLJob.java:189)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
===========================================================================================================================
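For readers following the traces above: the exception surfaces where the decoder takes a union branch index read from the stream and uses it to index an array of alternatives. The sketch below is not Avro code and does not claim to identify the cause of this issue. It is a dependency-free illustration, assuming the branch index is stored as a zig-zag varint (as in Avro's binary encoding), of how reading from a wrong offset can produce an out-of-range or even negative index. The class name, the BRANCHES array, and both helper methods are hypothetical.

```java
public class UnionIndexSketch {
    // Hypothetical stand-in for the alternatives of a three-branch union.
    static final String[] BRANCHES = {"null", "string", "record"};

    /** Decodes one zig-zag varint int starting at pos[0] and advances pos[0]. */
    static int readZigZagInt(byte[] buf, int[] pos) {
        int raw = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos[0]++] & 0xff;
            raw |= (b & 0x7f) << shift;
            shift += 7;
        } while ((b & 0x80) != 0 && shift < 35);
        // Undo zig-zag: even values map to non-negative, odd values to negative.
        return (raw >>> 1) ^ -(raw & 1);
    }

    /** Reads a branch index at the given offset and looks up the branch. */
    static String readBranch(byte[] buf, int offset) {
        int index = readZigZagInt(buf, new int[] {offset});
        return BRANCHES[index]; // throws if the index is not 0, 1, or 2
    }

    public static void main(String[] args) {
        // Branch index 1 (zig-zag 0x02), string length 3 (zig-zag 0x06), then "abc".
        byte[] buf = {0x02, 0x06, 'a', 'b', 'c'};

        // Aligned read: the first byte really is the branch index.
        System.out.println(readBranch(buf, 0));

        // Misaligned read: the string length is misinterpreted as a branch index.
        try {
            readBranch(buf, 1);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("misaligned read failed: " + e.getMessage());
        }
    }
}
```

In this toy buffer, a read that starts one byte late decodes the string length as index 3, and a read that starts at the payload byte 'a' decodes a negative index. Both are the kind of value that appears in the exception messages, though the real job may fail for a different reason.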
> map reduce job for avro 1.5 generates ArrayIndexOutOfBoundsException
> --------------------------------------------------------------------
>
> Key: AVRO-792
> URL: https://issues.apache.org/jira/browse/AVRO-792
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.5.0
> Environment: Mac with VMWare running Linux training-vm-Ubuntu
> Reporter: ey-chih chow
> Assignee: Thiruvalluvan M. G.
> Priority: Blocker
> Fix For: 1.5.1
>
> Attachments: AVRO-792-2.patch, AVRO-792-3.patch, AVRO-792.patch
>
> Original Estimate: 504h
> Remaining Estimate: 504h
>
> We have an Avro map/reduce job that worked with Avro 1.4 but is broken
> with Avro 1.5. The M/R job with Avro 1.5 worked fine in our debugging
> environment, but broke when we moved to a real cluster. In one instance of
> testing, the job had 23 reducers. Four of them succeeded and the rest failed
> because of an ArrayIndexOutOfBoundsException. Here are two
> instances of the stack traces:
> =================================================================================
> java.lang.ArrayIndexOutOfBoundsException: -1576799025
> at
> org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
> at
> org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> at
> org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> at
> org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:232)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> at
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> at
> org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
> at
> org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
> at
> org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
> at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
> at
> org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
> at
> com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:46)
> at
> com.ngmoco.ngpipes.etl.NgEventETLReducer.reduce(NgEventETLReducer.java:1)
> at
> org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
> at
> org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
> at
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> at org.apache.hadoop.mapred.Child.main(Child.java:234)
> =====================================================================================================
> java.lang.ArrayIndexOutOfBoundsException: 40
> at
> org.apache.avro.io.parsing.Symbol$Alternative.getSymbol(Symbol.java:364)
> at
> org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:229)
> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> at
> org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:206)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142)
> at
> org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:166)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at
> org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> at
> org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:86)
> at
> org.apache.avro.mapred.AvroSerialization$AvroWrapperDeserializer.deserialize(AvroSerialization.java:68)
> at
> org.apache.hadoop.mapred.Task$ValuesIterator.readNextValue(Task.java:1136)
> at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1076)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:246)
> at
> org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:242)
> at
> org.apache.avro.mapred.HadoopReducerBase$ReduceIterable.next(HadoopReducerBase.java:47)
> at
> com.ngmoco.ngpipes.sourcing.sessions.NgSessionReducer.reduce(NgSessionReducer.java:74)
> at
> com.ngmoco.ngpipes.sourcing.sessions.NgSessionReducer.reduce(NgSessionReducer.java:1)
> at
> org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:60)
> at
> org.apache.avro.mapred.HadoopReducerBase.reduce(HadoopReducerBase.java:30)
> at
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:468)
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
> at org.apache.hadoop.mapred.Child.main(Child.java:234)
> =====================================================================================================
> The signature of our map() is:
> public void map(Utf8 input, AvroCollector<Pair<Utf8, GenericRecord>>
> collector, Reporter reporter) throws IOException;
> and reduce() is:
> public void reduce(Utf8 key, Iterable<GenericRecord> values,
> AvroCollector<GenericRecord> collector, Reporter reporter) throws IOException;
> All the GenericRecords are of the same schema.
> There are many changes in the area of serialization/deserialization between
> Avro 1.4 and 1.5, but we could not figure out why the exceptions were generated.