Jackson is no longer needed, right? Or is it coming back in 0.11?

Russell Jurney
http://datasyndrome.com
On Jan 9, 2013, at 10:26 AM, Cheolsoo Park <[email protected]> wrote:

> Hi Milind,
>
> Please try this:
>
> REGISTER build/ivy/lib/Pig/avro-1.7.1.jar
> REGISTER build/ivy/lib/Pig/json-simple-1.1.jar
> REGISTER build/ivy/lib/Pig/jackson-mapper-asl-1.8.8.jar
> REGISTER build/ivy/lib/Pig/jackson-core-asl-1.8.8.jar
> REGISTER contrib/piggybank/java/piggybank.jar
>
> employee = LOAD '/home/cheolsoo/workspace/avro/emplyees' USING
>     org.apache.pig.piggybank.storage.avro.AvroStorage('multiple_schemas');
> DESCRIBE employee;
> DUMP employee;
>
> I have two Avro files in my input directory:
>
> $ java -jar /home/cheolsoo/workspace/avro/avro-tools-1.7.1.jar tojson record_employee.avro
> {"name":"a","age":0,"dept":"b","office":"c","salary":0.0}
>
> $ java -jar /home/cheolsoo/workspace/avro/avro-tools-1.7.1.jar tojson record_employee2.avro
> {"name":"a","age":0,"dept":"b","office":"c","salary":0}
>
> record_employee.avro contains a float, and record_employee2.avro contains an int.
>
> The output looks as follows:
>
> ...
> employee: {name: chararray,age: int,dept: chararray,office: chararray,salary: float}
> ...
> (a,0,b,c,0.0)
> (a,0,b,c,0)
>
> Thanks,
> Cheolsoo
>
>
> On Wed, Jan 9, 2013 at 6:49 AM, Milind Vaidya <[email protected]> wrote:
>
>> Environment:
>>
>> Pig version: 0.11
>> Hadoop 0.23.6.0.1301071353
>>
>>
>> Script:
>>
>> REGISTER /homes/immilind/HadoopLocal/Jars/avro-1.7.1.jar
>> REGISTER /homes/immilind/HadoopLocal/Jars/jackson-all-1.8.10.jar
>> REGISTER /homes/immilind/HadoopLocal/Jars/jackson-core-asl-1.8.10.jar
>> REGISTER /homes/immilind/HadoopLocal/Jars/jackson-jaxrs-1.8.10.jar
>> REGISTER /homes/immilind/HadoopLocal/Jars/jackson-mapper-asl-1.8.10.jar
>> REGISTER /homes/immilind/HadoopLocal/Jars/jackson-xc-1.8.10.jar
>> REGISTER /home/gs/pig/current/lib-hadoop23/piggybank.jar
>>
>> employee = load '/user/immilind/AvroData' using
>>     org.apache.pig.piggybank.storage.avro.AvroStorage( );
>> dump employee;
>>
>>
>> Schemas:
>>
>> {
>>   "type" : "record",
>>   "name" : "employee",
>>   "fields" : [
>>     {"name" : "name", "type" : "string", "default" : "NU"},
>>     {"name" : "age", "type" : "int", "default" : 0},
>>     {"name" : "dept", "type" : "string", "default" : "DU"},
>>     {"name" : "office", "type" : "string", "default" : "OU"},
>>     {"name" : "salary", "type" : "float", "default" : 0.0}
>>   ]
>> }
>>
>> {
>>   "type" : "record",
>>   "name" : "employee",
>>   "fields" : [
>>     {"name" : "name", "type" : "string", "default" : "NU"},
>>     {"name" : "age", "type" : "int", "default" : 0},
>>     {"name" : "dept", "type" : "string", "default" : "DU"},
>>     {"name" : "office", "type" : "string", "default" : "OU"},
>>     {"name" : "salary", "type" : "int", "default" : 0}
>>   ]
>> }
>>
>>
>> The two schemas differ in only one field. Per the schema evolution and
>> merging rules, I expect the "int" fields to be loaded as "float". Instead,
>> the job fails due to a field mismatch.
>>
>> I am referring to:
>>
>> A similar thread named "Working with changing schemas (avro) in Pig":
>> https://mail-archives.apache.org/mod_mbox/pig-user/201204.mbox/%3ccab-acjm6b39omtwypyijbbojxl8muyrjdzrxmxfbjmtxxcm...@mail.gmail.com%3E
>>
>> JIRA:
>> https://issues.apache.org/jira/browse/PIG-2579
>>
>> How do I use the "multiple_schemas" option with AvroStorage, as suggested
>> by this JIRA?
>>
>> The mergeType function, which defines the merging rules for primitive types:
>> https://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
>>
>> Can anybody suggest what is going wrong?
>>
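For reference, the numeric-widening idea behind `mergeType` in AvroStorageUtils can be sketched as follows. This is a minimal illustration of Avro-style schema resolution for primitive types, not the actual Piggy Bank code; the function name and widening order here are assumptions based on Avro's schema-resolution rules (int < long < float < double).

```python
# Sketch of Avro-style numeric widening for two primitive type names.
# NOTE: an illustration of the idea behind AvroStorageUtils.mergeType,
# not the Piggy Bank implementation itself.

# Assumed widening order, per Avro schema resolution.
WIDENING_ORDER = ["int", "long", "float", "double"]

def merge_primitive(a: str, b: str) -> str:
    """Return the wider of two numeric Avro primitive types.

    Raises ValueError for non-mergeable combinations, mirroring the
    "field mismatch" failure seen when schemas cannot be reconciled.
    """
    if a == b:
        return a
    if a in WIDENING_ORDER and b in WIDENING_ORDER:
        # Pick whichever type sits later in the widening order.
        return max(a, b, key=WIDENING_ORDER.index)
    raise ValueError(f"cannot merge {a} with {b}")

# The two employee schemas differ only in 'salary': float vs int.
print(merge_primitive("float", "int"))  # int widens to float
```

Under this rule, the merged `salary` field comes out as `float`, which matches the `employee: {... salary: float}` schema that Cheolsoo's `multiple_schemas` run reports above.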
