Spoke too soon... this now happens no matter which Avro files I load. As far as I can tell, nothing has changed regarding the jars, etc. Confused.
I think this happens when Avro is parsing the schema?

Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;

java.lang.NoSuchMethodError: org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
    at org.apache.avro.Schema.<clinit>(Schema.java:82)
    at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.<clinit>(AvroStorageUtils.java:49)
    at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:163)
    at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:144)
    at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:269)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
    at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
    at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
    at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
    at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
    at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
    at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
    at org.apache.pig.PigServer$Graph.compile(PigServer.java:1679)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:495)
    at org.apache.pig.Main.main(Main.java:111)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
================================================================================

On Thu, Feb 2, 2012 at 2:53 PM, Russell Jurney <[email protected]> wrote:

> Further examination shows that the problematic emails I am encoding are
> formatted in ISO-8859-1, not UTF-8. That is why I am getting character
> problems. Looks like it is not an Avro problem after all. Thanks! :)
>
> On Thu, Feb 2, 2012 at 2:49 PM, Russell Jurney <[email protected]> wrote:
>
>> A little bit more searching shows this:
>>
>> http://www.harshj.com/2010/04/25/writing-and-reading-avro-data-files-using-python/
>>
>> On Thu, Feb 2, 2012 at 2:48 PM, Russell Jurney <[email protected]> wrote:
>>
>>> The jars being used are:
>>>
>>> REGISTER /me/pig/build/ivy/lib/Pig/avro-1.5.3.jar
>>> REGISTER /me/pig/build/ivy/lib/Pig/json-simple-1.1.jar
>>> REGISTER /me/pig/contrib/piggybank/java/piggybank.jar
>>> REGISTER /me/pig/build/ivy/lib/Pig/jackson-core-asl-1.7.3.jar
>>> REGISTER /me/pig/build/ivy/lib/Pig/jackson-mapper-asl-1.7.3.jar
>>>
>>> On Thu, Feb 2, 2012 at 2:41 PM, James Baldassari <[email protected]> wrote:
>>>
>>>> HI Russell,
>>>>
>>>> I'm not sure about the Python error, but the Java error looks like a
>>>> classpath problem, not a schema parsing issue. The NoSuchMethodError in
>>>> the stack trace indicates that Avro was trying to invoke a method in the
>>>> Jackson library that wasn't present at run-time. My guess is that your
>>>> program (or Pig?)
>>>> either has two incompatible versions of the Jackson
>>>> library on its classpath or maybe Avro's Jackson dependency has been
>>>> excluded and a version that is incompatible with Avro is on the classpath.
>>>>
>>>> Which version of Avro is being used? Running 'mvn dependency:tree' in
>>>> Avro trunk I see that it's depending on Jackson 1.8.6. Can you verify that
>>>> only one version of Jackson is on the classpath and that it's the version
>>>> that is required by whatever version of Avro is on the classpath?
>>>>
>>>> -James
>>>>
>>>> On Thu, Feb 2, 2012 at 5:21 PM, Russell Jurney <[email protected]> wrote:
>>>>
>>>>> Correction: when I read the file in Python, I get the error below. It
>>>>> looks like a unicode problem? Can one tell Avro how to handle this?
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File "./cat_avro", line 21, in <module>
>>>>>     for record in df_reader:
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/datafile.py", line 354, in next
>>>>>     datum = self.datum_reader.read(self.datum_decoder)
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 445, in read
>>>>>     return self.read_data(self.writers_schema, self.readers_schema, decoder)
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 490, in read_data
>>>>>     return self.read_record(writers_schema, readers_schema, decoder)
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 690, in read_record
>>>>>     field_val = self.read_data(field.type, readers_field.type, decoder)
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 488, in read_data
>>>>>     return self.read_union(writers_schema, readers_schema, decoder)
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 654, in read_union
>>>>>     return self.read_data(selected_writers_schema, readers_schema, decoder)
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 458, in read_data
>>>>>     return self.read_data(writers_schema, s, decoder)
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 468, in read_data
>>>>>     return decoder.read_utf8()
>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 233, in read_utf8
>>>>>     return unicode(self.read_bytes(), "utf-8")
>>>>> UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 543: invalid start byte
>>>>>
>>>>> On Thu, Feb 2, 2012 at 2:06 PM, Russell Jurney <[email protected]> wrote:
>>>>>
>>>>>> I am writing Avro records in Ruby using the avro ruby gem in 1.8.7.
>>>>>> I have problems with loading these files sometimes. As a result, I am
>>>>>> unable to write large files that are readable.
>>>>>>
>>>>>> The exception I get is below. Anyone have an idea what this means?
>>>>>> It looks like Avro is having trouble parsing the schema. The avro files
>>>>>> parse in Ruby and Python, just not Pig. Are there more rigorous checks in
>>>>>> Java?
>>>>>>
>>>>>> Pig Stack Trace
>>>>>> ---------------
>>>>>> ERROR 2998: Unhandled internal error.
>>>>>> org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>>>>>>
>>>>>> java.lang.NoSuchMethodError: org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>>>>>>     at org.apache.avro.Schema.<clinit>(Schema.java:82)
>>>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.<clinit>(AvroStorageUtils.java:49)
>>>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:163)
>>>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:144)
>>>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:269)
>>>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
>>>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
>>>>>>     at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
>>>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
>>>>>>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>>>>>>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>>>>>>     at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
>>>>>>     at org.apache.pig.PigServer$Graph.compile(PigServer.java:1679)
>>>>>>     at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
>>>>>>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
>>>>>>     at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>>>>>>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>>>>>>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>>>>>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>>>>>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>>>>>>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>>>>>>     at org.apache.pig.Main.run(Main.java:495)
>>>>>>     at org.apache.pig.Main.main(Main.java:111)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>
>>>>>> ================================================================================
>>>>>>
>>>>>> --
>>>>>> Russell Jurney
>>>>>> twitter.com/rjurney
>>>>>> [email protected]
>>>>>> datasyndrome.com

--
Russell Jurney
twitter.com/rjurney
[email protected]
datasyndrome.com
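
Following up on James's suggestion to verify that only one Jackson is on the classpath: one way to check might be to scan the jar directories for every archive that bundles the JsonFactory class named in the NoSuchMethodError. The script below is only a rough sketch. The function name jars_containing and the constant TARGET are made up for illustration, and which directories to scan (Pig's build tree, Hadoop's lib directory, and so on) is an assumption that depends on the install. Note that a "fat" jar can bundle Jackson classes without having "jackson" in its file name, which is why this looks inside the archives rather than at the names.

```python
import sys
import zipfile
from pathlib import Path

# Class whose missing method triggered the NoSuchMethodError in the trace.
TARGET = "org/codehaus/jackson/JsonFactory.class"


def jars_containing(entry, roots):
    """Return sorted paths of .jar files under `roots` that contain `entry`.

    Each root may be a directory (searched recursively) or a single jar file.
    """
    hits = []
    for root in roots:
        root = Path(root)
        candidates = [root] if root.is_file() else sorted(root.rglob("*.jar"))
        for jar in candidates:
            try:
                with zipfile.ZipFile(jar) as zf:
                    if entry in zf.namelist():
                        hits.append(str(jar))
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or corrupt archives rather than abort the scan.
                continue
    return sorted(hits)


if __name__ == "__main__":
    # Pass one or more directories or jar files on the command line.
    for path in jars_containing(TARGET, sys.argv[1:]):
        print(path)
```

If this prints more than one jar, the copy that the JVM loads first may be an older Jackson without the enable(JsonParser.Feature) method, which would be consistent with the error above. That is a guess to test, not a confirmed diagnosis.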

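
On the ISO-8859-1 finding quoted above: Avro strings are UTF-8 on the wire, so bytes in another encoding need to be transcoded before they are written. The files in this thread were written from Ruby, so the snippet below is not the code that was used; it is a small Python sketch of the idea, with a hypothetical helper name (to_unicode) and a made-up sample value. Byte 0xa0 is a no-break space in ISO-8859-1 and is not a valid UTF-8 start byte, which matches the UnicodeDecodeError in the traceback.

```python
def to_unicode(raw, fallback="iso-8859-1"):
    """Decode bytes as UTF-8, falling back to `fallback` if they are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # With the default ISO-8859-1 fallback every byte maps to a code point,
        # so this second decode cannot raise.
        return raw.decode(fallback)


# Made-up sample: an ISO-8859-1 body containing a no-break space (0xa0).
body = b"price:\xa0100"
text = to_unicode(body)
utf8_bytes = text.encode("utf-8")  # bytes that are safe to store in an Avro string field
```

The fallback is a heuristic: text in some other single-byte encoding would decode without error but could come out with wrong characters, so using the charset declared in the email headers, when available, is likely the better choice.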