>> java.lang.RuntimeException : Dataum 23.0 is not in union ["null" , "int"]
Since you specify no Avro schema in the STORE command, AvroStorage derives the output Avro schema from the Pig schema. By default, AvroStorage converts every primitive type to a nullable union. In this case, final has an integer field, so AvroStorage converts it to the union ["null", "int"].

According to the error message, one record contains a float (23.0) instead of an integer, so the write fails. I would run DESCRIBE and DUMP on final to find which column is causing the mismatch. It's hard to tell what the exact problem is without seeing your data and schema.

Thanks,
Cheolsoo

On Sun, Jun 9, 2013 at 11:01 AM, abhishek dodda <[email protected]> wrote:

> hi all,
>
> Running pig with avro storage and facing the below issue
>
> pig 0.10 and avro 1.7
>
> *org.apache.avro.file.DataFileWriter$AppenderWriteException :
> java.lang.RuntimeException : Dataum 23.0 is not in union ["null" , "int"]*
>
> my pig script does the following
>
> a = load '/user/abhi/abc.txt' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> b = load '/user/abhi/def.txt' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> joined = join a by $0 left outer , b by $0;
>
> final = foreach joined generate
> {
> ...
> ...
> };
>
> store final into '/user/abhi/output' using
> org.apache.pig.piggybank.storage.avro.AvroStorage();
>
> Thanks
> abhishek
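
To illustrate why a value like 23.0 is rejected by the union ["null", "int"], here is a rough Python sketch of union matching. It is only a simplified model and not AvroStorage's or Avro's actual code: the function name branch_for and the type table inside it are assumptions made for illustration.

```python
# Simplified model of matching a datum against an Avro union.
# NOT AvroStorage's implementation: branch_for and its type rules
# are hypothetical and cover only a few primitive types.

def branch_for(datum, union):
    """Return the first union branch that accepts datum, or raise."""
    for branch in union:
        if branch == "null" and datum is None:
            return branch
        # bool is a subclass of int in Python, so exclude it explicitly.
        if (branch == "int" and isinstance(datum, int)
                and not isinstance(datum, bool)):
            return branch
        if branch in ("float", "double") and isinstance(datum, float):
            return branch
    raise RuntimeError(f"Datum {datum!r} is not in union {union}")


nullable_int = ["null", "int"]
print(branch_for(None, nullable_int))  # a missing value takes the "null" branch
print(branch_for(23, nullable_int))    # an integer takes the "int" branch
try:
    branch_for(23.0, nullable_int)     # a float matches neither branch
except RuntimeError as err:
    print(err)
```

In this model the float only passes if the union itself has a float or double branch, which is why the declared type of the field and the values actually stored in it need to agree.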
