Does it fail if the script is more complicated?

1.  Try STORE instead of DUMP.  I have seen cases where the former works
and the latter does not in older Pig versions.
2.  Do something with the data that triggers a job that is not map-only,
for example a GROUP on the first column (see the sketch after this list).
I have seen situations where a LOAD feeding straight into a DUMP with no
work in between causes mysterious issues in older versions as well.
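
A minimal sketch of both variations, reusing the relation name from your
script; '$output' below is just a placeholder output path:

raw = LOAD '$input' USING PigStorage() AS (a:int, b:int, c:int);

-- variation 1: STORE instead of DUMP
STORE raw INTO '$output' USING PigStorage();

-- variation 2: force a job that is not map-only by grouping on the first column
grpd = GROUP raw BY a;
DUMP grpd;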

I have no idea whether the above variations will have a different outcome;
it's just a hunch and they are easy things to try.  If one of them works,
or if the failure stack traces differ, that may give others more clues to
work with as well.

On 12/15/11 3:39 PM, "Rohini U" <[email protected]> wrote:

>Yes, that's from a failing task.
>
>The script just reads tab-separated data and dumps it to the screen.
>Please find it below:
>
>raw = LOAD '$input' USING PigStorage() AS (a:int, b:int, c:int);
>DESCRIBE raw;
>dump raw;
>
>where $input has the following (the file has actual tabs, but here I am
>writing them down as \t)
>
>1\t2\t3
>4\t5\t6
>
>I can successfully run this script with the Pig 0.8.1 that comes with the
>CDH3 bundle.
>
>Thanks,
>-Rohini
>
>On Thu, Dec 15, 2011 at 3:20 PM, Dmitriy Ryaboy <[email protected]>
>wrote:
>
>> The stack trace you pasted is from a failing task?
>> What was the script? Can you read the input data using other tools?
>> Can you run the same script using the version of Pig that ships with
>> CDH?
>>
>> D
>>
>> On Thu, Dec 15, 2011 at 11:31 AM, Rohini U <[email protected]> wrote:
>> > Yes, it started a map reduce job; however, all I can see in the error
>> > logs is what is given below. There is nothing else.
>> >
>> > java.io.EOFException
>> >        at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2281)
>> >        at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750)
>> >        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780)
>> >        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:280)
>> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readObject(PigSplit.java:264)
>> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:209)
>> >        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>> >        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>> >        at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:349)
>> >        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:611)
>> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>> >        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>> >        at java.security.AccessController.doPrivileged(Native Method)
>> >        at javax.security.auth.Subject.doAs(Subject.java:396)
>> >        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>> >        at org.apache.hadoop.mapred.Child.main(Child.java:264)
>> >
>> >
>> >
>> >
>> > On Thu, Dec 15, 2011 at 10:01 AM, Dmitriy Ryaboy <[email protected]>
>> wrote:
>> >
>> >> Should just work. Did it start a mapreduce job? Can you get task
>> >> failure or job setup failure logs?
>> >>
>> >> On Thu, Dec 15, 2011 at 7:14 AM, Rohini U <[email protected]> wrote:
>> >>
>> >> > Hi,
>> >> >
>> >> > I am trying to use Pig 0.9.1 with the CDH3u1-packaged Hadoop. I
>> >> > compiled Pig without the Hadoop jars to avoid conflicts, and I am
>> >> > using that jar to run Pig jobs. Things run fine in local mode, but
>> >> > on the cluster I get the following error:
>> >> >
>> >> > 2011-12-15 05:15:49,641 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
>> >> > 2011-12-15 05:15:49,643 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backed error: java.io.EOFException
>> >> >        at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2281)
>> >> >        at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750)
>> >> >        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780)
>> >> >        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:280)
>> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readObject(PigSplit.java:264)
>> >> >        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit.readFields(PigSplit.java:209)
>> >> >        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
>> >> >        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
>> >> >        at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:349)
>> >> >        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:611)
>> >> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>> >> >        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>> >> >        at java.security.AccessController.doPrivileged(Native Method)
>> >> >        at javax.security.auth.Subject.doAs(Subject.java:396)
>> >> >        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>> >> >        at org.apache.hadoop.mapred.Child.main(Child.java:264)
>> >> >
>> >> > Any insights into what might be wrong?
>> >> >
>> >> > Is there any way we can make Pig 0.9 work with the CDH3u1 version
>> >> > of Hadoop?
>> >> >
>> >> > Thanks
>> >> > -Rohini
>> >> >
>> >>
>>
