Cleaned up my environment by unsetting HADOOP_HOME and removing some old Jackson jars from my CLASSPATH, and Pig's AvroStorage works again.
Woot!

On Thu, Feb 2, 2012 at 3:47 PM, Russell Jurney <[email protected]> wrote:

> Spoken too soon... this happens no matter what avros I load now. I can't
> figure that anything has changed regarding jars, etc. Confused.
>
> I think this happens when Avro is parsing the schema?
>
> Pig Stack Trace
> ---------------
> ERROR 2998: Unhandled internal error.
> org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>
> java.lang.NoSuchMethodError:
> org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>     at org.apache.avro.Schema.<clinit>(Schema.java:82)
>     at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.<clinit>(AvroStorageUtils.java:49)
>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:163)
>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:144)
>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:269)
>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
>     at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
>     at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>     at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
>     at org.apache.pig.PigServer$Graph.compile(PigServer.java:1679)
>     at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
>     at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>     at org.apache.pig.Main.run(Main.java:495)
>     at org.apache.pig.Main.main(Main.java:111)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>
> ================================================================================
>
> On Thu, Feb 2, 2012 at 2:53 PM, Russell Jurney <[email protected]> wrote:
>
>> Further examination shows that the problematic emails I am encoding are
>> formatted in ISO-8859-1, not UTF-8. That is why I am getting character
>> problems. Looks like it is not an Avro problem after all. Thanks! :)
>>
>> On Thu, Feb 2, 2012 at 2:49 PM, Russell Jurney <[email protected]> wrote:
>>
>>> A little bit more searching shows this:
>>>
>>> http://www.harshj.com/2010/04/25/writing-and-reading-avro-data-files-using-python/
>>>
>>> On Thu, Feb 2, 2012 at 2:48 PM, Russell Jurney <[email protected]> wrote:
>>>
>>>> The jars being used are:
>>>>
>>>> REGISTER /me/pig/build/ivy/lib/Pig/avro-1.5.3.jar
>>>> REGISTER /me/pig/build/ivy/lib/Pig/json-simple-1.1.jar
>>>> REGISTER /me/pig/contrib/piggybank/java/piggybank.jar
>>>> REGISTER /me/pig/build/ivy/lib/Pig/jackson-core-asl-1.7.3.jar
>>>> REGISTER /me/pig/build/ivy/lib/Pig/jackson-mapper-asl-1.7.3.jar
>>>>
>>>> On Thu, Feb 2, 2012 at 2:41 PM, James Baldassari <[email protected]> wrote:
>>>>
>>>>> HI Russell,
>>>>>
>>>>> I'm not sure about the Python error, but the Java error looks like a
>>>>> classpath problem, not a schema parsing issue. The NoSuchMethodError in
>>>>> the stack trace indicates that Avro was trying to invoke a method in the
>>>>> Jackson library that wasn't present at run-time. My guess is that your
>>>>> program (or Pig?) either has two incompatible versions of the Jackson
>>>>> library on its classpath or maybe Avro's Jackson dependency has been
>>>>> excluded and a version that is incompatible with Avro is on the classpath.
>>>>>
>>>>> Which version of Avro is being used? Running 'mvn dependency:tree' in
>>>>> Avro trunk I see that it's depending on Jackson 1.8.6. Can you verify
>>>>> that only one version of Jackson is on the classpath and that it's the
>>>>> version that is required by whatever version of Avro is on the classpath?
>>>>>
>>>>> -James
>>>>>
>>>>> On Thu, Feb 2, 2012 at 5:21 PM, Russell Jurney <[email protected]> wrote:
>>>>>
>>>>>> Correction: when I read the file in Python, I get the error below.
>>>>>> It looks like a unicode problem? Can one tell Avro how to handle this?
>>>>>>
>>>>>> Traceback (most recent call last):
>>>>>>   File "./cat_avro", line 21, in <module>
>>>>>>     for record in df_reader:
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/datafile.py", line 354, in next
>>>>>>     datum = self.datum_reader.read(self.datum_decoder)
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 445, in read
>>>>>>     return self.read_data(self.writers_schema, self.readers_schema, decoder)
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 490, in read_data
>>>>>>     return self.read_record(writers_schema, readers_schema, decoder)
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 690, in read_record
>>>>>>     field_val = self.read_data(field.type, readers_field.type, decoder)
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 488, in read_data
>>>>>>     return self.read_union(writers_schema, readers_schema, decoder)
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 654, in read_union
>>>>>>     return self.read_data(selected_writers_schema, readers_schema, decoder)
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 458, in read_data
>>>>>>     return self.read_data(writers_schema, s, decoder)
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 468, in read_data
>>>>>>     return decoder.read_utf8()
>>>>>>   File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/avro-_AVRO_VERSION_-py2.6.egg/avro/io.py", line 233, in read_utf8
>>>>>>     return unicode(self.read_bytes(), "utf-8")
>>>>>> UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position
>>>>>> 543: invalid start byte
>>>>>>
>>>>>> On Thu, Feb 2, 2012 at 2:06 PM, Russell Jurney <[email protected]> wrote:
>>>>>>
>>>>>>> I am writing Avro records in Ruby using the avro ruby gem in 1.8.7.
>>>>>>> I have problems with loading these files sometimes. As a result, I am
>>>>>>> unable to write large files that are readable.
>>>>>>>
>>>>>>> The exception I get is below. Anyone have an idea what this means?
>>>>>>> It looks like Avro is having trouble parsing the schema. The avro
>>>>>>> files parse in Ruby and Python, just not Pig. Are there more rigorous
>>>>>>> checks in Java?
>>>>>>>
>>>>>>> Pig Stack Trace
>>>>>>> ---------------
>>>>>>> ERROR 2998: Unhandled internal error.
>>>>>>> org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>>>>>>>
>>>>>>> java.lang.NoSuchMethodError:
>>>>>>> org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
>>>>>>>     at org.apache.avro.Schema.<clinit>(Schema.java:82)
>>>>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorageUtils.<clinit>(AvroStorageUtils.java:49)
>>>>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:163)
>>>>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getAvroSchema(AvroStorage.java:144)
>>>>>>>     at org.apache.pig.piggybank.storage.avro.AvroStorage.getSchema(AvroStorage.java:269)
>>>>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:150)
>>>>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.getSchema(LOLoad.java:109)
>>>>>>>     at org.apache.pig.newplan.logical.visitor.LineageFindRelVisitor.visit(LineageFindRelVisitor.java:100)
>>>>>>>     at org.apache.pig.newplan.logical.relational.LOLoad.accept(LOLoad.java:218)
>>>>>>>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>>>>>>>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>>>>>>>     at org.apache.pig.newplan.logical.visitor.CastLineageSetter.<init>(CastLineageSetter.java:57)
>>>>>>>     at org.apache.pig.PigServer$Graph.compile(PigServer.java:1679)
>>>>>>>     at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1610)
>>>>>>>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1582)
>>>>>>>     at org.apache.pig.PigServer.registerQuery(PigServer.java:584)
>>>>>>>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:942)
>>>>>>>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
>>>>>>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:188)
>>>>>>>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164)
>>>>>>>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
>>>>>>>     at org.apache.pig.Main.run(Main.java:495)
>>>>>>>     at org.apache.pig.Main.main(Main.java:111)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>>
>>>>>>> ================================================================================
>>>>>>>
>>>>>>> --
>>>>>>> Russell Jurney
>>>>>>> twitter.com/rjurney
>>>>>>> [email protected]
>>>>>>> datasyndrome.com

--
Russell Jurney
twitter.com/rjurney
[email protected]
datasyndrome.com
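
For anyone who lands on this thread with the same NoSuchMethodError: the fix above came down to having more than one Jackson version visible to Pig. Below is a rough sketch of how one might list artifacts that show up on a classpath in several versions. It assumes jars follow the usual `name-version.jar` naming; `find_version_conflicts` is a hypothetical helper written for this note, not part of Pig or Avro, and it only looks at file names, not at what Hadoop or Pig add to the classpath on their own.

```python
import os
import re
from collections import defaultdict

# Matches "<artifact>-<version>.jar", e.g. "jackson-core-asl-1.7.3.jar".
# Assumption: the version starts at the first "-<digit>" in the file name.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def find_version_conflicts(classpath, sep=os.pathsep):
    """Return {artifact: sorted versions} for artifacts seen in 2+ versions."""
    versions = defaultdict(set)
    for entry in classpath.split(sep):
        match = JAR_NAME.match(os.path.basename(entry.strip()))
        if match:
            versions[match.group("artifact")].add(match.group("version"))
    # Keep only artifacts that appear with more than one distinct version.
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}


if __name__ == "__main__":
    # Inspect the CLASSPATH of the current shell; prints nothing if it is clean.
    conflicts = find_version_conflicts(os.environ.get("CLASSPATH", ""))
    for artifact, found in sorted(conflicts.items()):
        print("%s: %s" % (artifact, ", ".join(found)))
```

Jars without a version in the file name (such as piggybank.jar) are skipped, so an empty result does not prove the classpath is clean.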

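
On the Python side of the thread, the 0xa0 byte in the traceback fits the ISO-8859-1 diagnosis: in that encoding 0xa0 is a non-breaking space, while in UTF-8 it can only be a continuation byte, never a start byte. A minimal sketch of decoding with a fallback before handing strings to an Avro writer follows. It is written for Python 3 (the thread itself used Python 2.6 and Ruby), and `to_text` is a hypothetical helper name, not an Avro API.

```python
def to_text(raw, encodings=("utf-8", "iso-8859-1")):
    """Decode bytes by trying each encoding in order; return (text, encoding)."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            # Not valid in this encoding; try the next candidate.
            continue
    raise ValueError("none of %r could decode the input" % (encodings,))


if __name__ == "__main__":
    # b"\xa0" alone is invalid UTF-8, so the ISO-8859-1 fallback is used.
    text, used = to_text(b"Hello\xa0world")
    print(used, repr(text))
    # Re-encode as UTF-8 so readers that call read_utf8() can decode it.
    print(text.encode("utf-8"))
```

Because ISO-8859-1 maps every byte to a character, the fallback never fails, which also means it can silently mis-decode data that was really in some other encoding. Using the charset declared in the email headers, when there is one, is likely the safer choice.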