FYI:
https://issues.apache.org/jira/browse/AVRO-993

I expect that Avro 1.6.2 will add these methods back in.
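
The thread below does not spell out the mechanism, so take this as a guess and not a diagnosis of AVRO-993. One common way for a subclass override to stop being invoked after a library upgrade is a changed or removed hook method in the base class. The sketch below uses hypothetical class and method names only (BaseWriter, ListWriter, getField); none of them are the real Avro or Pig signatures.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; these are NOT Avro or Pig classes.
public class OverrideDrift {

    /** Stand-in for a library base class as it looks after an upgrade. */
    static class BaseWriter {
        // After the upgrade the hook takes an extra 'position' argument.
        protected Object getField(Object record, String name, int position) {
            return ((Map<?, ?>) record).get(name); // base class expects a Map
        }

        public Object write(Object record, String name, int position) {
            return getField(record, name, position);
        }
    }

    /** Stand-in for a subclass compiled against the older library. */
    static class ListWriter extends BaseWriter {
        // Old hook signature. It no longer overrides anything, so write()
        // never calls it. An @Override annotation would have made the
        // compiler flag this when building against the newer library.
        protected Object getField(Object record, String name) {
            return ((List<?>) record).get(0);
        }
    }

    /** Returns "ok" if the write succeeds, or the name of the failure. */
    static String run() {
        try {
            new ListWriter().write(Arrays.asList("a"), "f", 0);
            return "ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the base-class hook runs against the subclass's data type and the cast fails, which has the same shape as the trace quoted below. Whether that is what actually happens between piggybank and Avro 1.5.x/1.6.0 is an assumption.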

On 1/11/12 1:47 AM, "Andrew Kenworthy" <[email protected]> wrote:

>Hi Stan,
>
>Thank you for your feedback. I've run the script passing "-D
>mapred.child.java.opts=-verbose:class" and have the following in my logs:
>
>[Loaded org.apache.avro.generic.GenericDatumWriter from file:/var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/ankenworthy/jobcache/job_201111230039_0146/jars/job.jar]
>[Loaded org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter from file:/var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/ankenworthy/jobcache/job_201111230039_0146/jars/job.jar]
>
>I assume the .../job_201111230039_0146/jars/job.jar is the one prepared
>by pig using the jars I have REGISTER-ed, in which case the classes are
>the ones I expect, or have I misread that?
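
As a complement to the -verbose:class output above, the same question can be asked from code. This is a minimal sketch, assuming a small helper class (WhichJar is a made-up name) run with the same classpath as the job. Run standalone, it only reports the local classpath and not what the Hadoop task JVM sees.

```java
import java.security.CodeSource;

public class WhichJar {

    /** Returns the location a class was loaded from, or a note if unknown. */
    static String originOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap loader (e.g. java.lang.String)
        // typically have no code source.
        return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Fully qualified class names are passed as arguments, for example
        // org.apache.avro.generic.GenericDatumWriter
        for (String name : args) {
            System.out.println(name + " <- " + originOf(Class.forName(name)));
        }
    }
}
```

Passing the two class names from the log lines above should print the jar each one resolves to on that classpath.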
>
>Regards,
>
>Andrew
>
>
>
>>________________________________
>> From: Stan Rosenberg <[email protected]>
>>To: [email protected]; Andrew Kenworthy <[email protected]>
>>Sent: Tuesday, January 10, 2012 5:36 PM
>>Subject: Re: Simple AvroStorage LOAD and STORE with Avro 1.6.0
>> 
>>Andrew,
>>
>>Something looks odd in this stack trace:
>>
>>Caused by: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
>>>         at org.apache.avro.generic.GenericData.getField(GenericData.java:525)
>>>         at org.apache.avro.generic.GenericData.getField(GenericData.java:540)
>>>         at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:103)
>>>         at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
>>>         at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:99)
>>
>>PigAvroDatumWriter overrides 'GenericDatumWriter.writeRecord' in order
>>to extract values from a tuple.  Thus, I would expect the third frame
>>in the trace to be PigAvroDatumWriter.writeRecord, not
>>GenericDatumWriter.writeRecord.  Perhaps someone else has more insight
>>as to why it's not getting invoked.  In the meantime, please confirm
>>that both PigAvroDatumWriter and GenericDatumWriter are loaded from
>>the right jar files. (You can do this by temporarily changing the pig
>>script to invoke the JVM with 'java -verbose' and 'grep' the output
>>for these classes.)
>>
>>Best,
>>
>>stan
>>
>>On Tue, Jan 10, 2012 at 8:03 AM, Andrew Kenworthy
>><[email protected]> wrote:
>>> Hi Stan,
>>>
>>> here's the full stacktrace:
>>>
>>> org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
>>>         at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:261)
>>>         at org.apache.pig.piggybank.storage.avro.PigAvroRecordWriter.write(PigAvroRecordWriter.java:49)
>>>         at org.apache.pig.piggybank.storage.avro.AvroStorage.putNext(AvroStorage.java:580)
>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
>>>         at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:530)
>>>         at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:238)
>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:231)
>>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>>>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
>>> Caused by: java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
>>>         at org.apache.avro.generic.GenericData.getField(GenericData.java:525)
>>>         at org.apache.avro.generic.GenericData.getField(GenericData.java:540)
>>>         at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:103)
>>>         at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:65)
>>>         at org.apache.pig.piggybank.storage.avro.PigAvroDatumWriter.write(PigAvroDatumWriter.java:99)
>>>         at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:57)
>>>         at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:255)
>>>         ... 18 more
>>>
>>>
>>> Andrew
>>>
>>>
>>>
>>>>________________________________
>>>> From: Stan Rosenberg <[email protected]>
>>>>To: [email protected]; Andrew Kenworthy <[email protected]>
>>>>Sent: Monday, January 9, 2012 5:30 PM
>>>>Subject: Re: Simple AvroStorage LOAD and STORE with Avro 1.6.0
>>>>
>>>>Andrew,
>>>>
>>>>The source of the problem may be AvroStorage in piggybank.  Could you
>>>>please include the entire stack trace?
>>>>
>>>>stan
>>>>
>>>>On Mon, Jan 9, 2012 at 4:15 AM, Andrew Kenworthy
>>>><[email protected]> wrote:
>>>>> Hallo,
>>>>>
>>>>> When I run a simple pig script to LOAD and STORE avro data, I get:-
>>>>>
>>>>> java.lang.ClassCastException: org.apache.pig.data.BinSedesTuple cannot be cast to org.apache.avro.generic.IndexedRecord
>>>>>
>>>>>
>>>>> Script:
>>>>>
>>>>> REGISTER /tmp/avro-1.6.0.jar;
>>>>> --REGISTER /tmp/avro-1.5.4.jar
>>>>> --REGISTER /tmp/avro-1.4.1.jar;
>>>>>
>>>>> REGISTER /tmp/piggybank-0.9.1.jar;
>>>>> REGISTER /tmp/json-simple-1.1.jar;
>>>>> REGISTER /tmp/jackson-core-asl-1.8.4.jar;
>>>>> REGISTER /tmp/jackson-mapper-asl-1.8.4.jar;
>>>>>
>>>>> avroData=LOAD '$DATA_INPUTDIR' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
>>>>>
>>>>> dataSubset = FOREACH avroData GENERATE myField1, myField2;
>>>>> describe dataSubset;
>>>>> -----------------------------------------------
>>>>> -- shows:
>>>>> -- dataSubset : {myField1: int,myField2: int}
>>>>> -----------------------------------------------
>>>>> STORE dataSubset INTO '$OUTPUTDIR' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
>>>>>
>>>>> If I use the 1.5.4 jar I get the same error, but the script works
>>>>>with the 1.4.1 version. If I just write one field, then it works with
>>>>>1.6.0.
>>>>>
>>>>> I see there's been a related issue fixed here:
>>>>>
>>>>> https://issues.apache.org/jira/browse/PIG-2202
>>>>> https://issues.apache.org/jira/browse/PIG-2195
>>>>>
>>>>> Can anyone confirm that this or similar works with avro 1.6.0,
>>>>>and/or point me in the right direction concerning where the problem
>>>>>may lie?
>>>>>
>>>>> Many thanks,
>>>>>
>>>>> Andrew
>>>>
>>>>
>>>>
>>
>>
