Hello,

I hope I did not step on anyone's toes with this mail. I have not received any feedback. Can someone give me a hint where to search, or point me in the right direction?

Thanks in advance.

Kind regards.
Ralf

From: Klüber, Ralf [mailto:[email protected]]
Sent: Wednesday, August 06, 2014 3:27 PM
To: [email protected]
Subject: Creating and Reading Avro in Amazon EMR

Hello,

I am trying to
(i) read Avro files in Pig on Amazon EMR, which I created in my local cluster from JSONs (complex, nested, including arrays) and uploaded to S3
(ii) create Avro files in EMR from those complex JSONs uploaded to S3

In my local Cloudera cluster I was able to load and work with the data in the Avro file. I was not able to load the existing Avro files in Amazon EMR.

My EMR cluster is
´´
AMI version: 3.0.4
Amazon 2.2.0
Hive 0.11.0.2, Pig 0.11.1.1
Impala 1.2.1
´´

I searched a lot, but I could not find much about EMR/Avro. I am stuck. Is there an example somewhere, with data, schemas and Pig scripts, that I can try?

I hope this, as my 1st post to this mailing list, complies with your standards in terms of provided information and tone ,-). If not, apologies, and let me try a 2nd time.

In Pig I try this
´´
REGISTER s3://p3insight/libs/avro-1.7.4.jar;
-- REGISTER s3://p3insight/libs/pig/piggybank.jar;
REGISTER s3://p3insight/libs/jackson-mapper-asl-1.9.9.jar;
REGISTER s3://p3insight/libs/jackson-core-2.3.4.jar
-- REGISTER s3://p3insight/libs/jackson-core-asl-1.9.9.jar;
REGISTER s3://p3insight/libs/json-simple-1.1.1.jar;
REGISTER /home/hadoop/pig/lib/piggybank.jar
a = LOAD 's3://p3iqubole/data/avro/' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
´´

The output is as follows
´´
<line 1, column 4> pig script failed to validate: java.lang.RuntimeException: could not instantiate 'org.apache.pig.piggybank.storage.avro.AvroStorage' with arguments 'null'
Details at logfile: /mnt/var/log/apps/pig.log
´´

The content of the log file is:
´´
Pig Stack Trace
---------------
ERROR 1200: Pig script failed to parse:
<line 1, column 4> pig script failed to validate: java.lang.RuntimeException: could not instantiate 'org.apache.pig.piggybank.storage.avro.AvroStorage' with arguments 'null'

Failed to parse: Pig script failed to parse:
<line 1, column 4> pig script failed to validate: java.lang.RuntimeException: could not instantiate 'org.apache.pig.piggybank.storage.avro.AvroStorage' with arguments 'null'
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1571)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1544)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:988)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:542)
    at org.apache.pig.Main.main(Main.java:159)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by:
<line 1, column 4> pig script failed to validate: java.lang.RuntimeException: could not instantiate 'org.apache.pig.piggybank.storage.avro.AvroStorage' with arguments 'null'
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:835)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3235)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1314)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:798)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:516)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:391)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
    ... 15 more
Caused by: java.lang.RuntimeException: could not instantiate 'org.apache.pig.piggybank.storage.avro.AvroStorage' with arguments 'null'
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:618)
    at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:823)
    ... 21 more
Caused by: java.lang.NoClassDefFoundError: org/json/simple/parser/ParseException
    at java.lang.Class.getDeclaredConstructors0(Native Method)
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2493)
    at java.lang.Class.getConstructor0(Class.java:2803)
    at java.lang.Class.newInstance(Class.java:345)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:588)
    ... 22 more
Caused by: java.lang.ClassNotFoundException: org.json.simple.parser.ParseException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 27 more
´´

Kind regards.
Ralf Klüber
