Hi Cheolsoo,
I'm not specifically setting default_parallel in my script
anywhere and I see this in the log file :-
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Neither PARALLEL nor default parallelism is set for this job. Setting number of reducers to 1
So I guess I'm not using PARALLEL. Is it worth trying to compile Pig against
the Hadoop 0.23.x LocalJobRunner? How do I tell which Pig jar file I'm
currently using?
Thanks
Malc
-----Original Message-----
From: Cheolsoo Park [mailto:[email protected]]
Sent: 12 November 2012 16:29
To: [email protected]
Subject: Re: Intermittent NullPointerException
Hi Malcolm,
How do you run your script? Do you run your script in parallel? Hadoop 1.0.x
LocalJobRunner is not thread-safe, and Pig is by default built with Hadoop
1.0.x. I have seen a similar problem before (
https://issues.apache.org/jira/browse/PIG-2852).
If you're running your script in parallel, one workaround is to use the Hadoop
0.23.x LocalJobRunner, which is thread-safe. You can do the following:
- If you're using the standalone pig.jar, please download the Pig source
tarball and run "ant clean jar -Dhadoopversion=23" to build pig.jar.
- If you're using installed Hadoop with pig-withouthadoop.jar, please
install Hadoop 0.23.x, download the Pig source tarball, and run "ant clean
jar-withouthadoop -Dhadoopversion=23" to build pig-withouthadoop.jar.
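As a sketch of the standalone route (tarball and directory names are assumed for 0.10.0; adjust them to match your download):

```shell
# Assumed artifact names for the Pig 0.10.0 source tarball.
tar xzf pig-0.10.0.tar.gz
cd pig-0.10.0
# Rebuild against Hadoop 0.23.x (needs ant and a JDK on the PATH):
ant clean jar -Dhadoopversion=23
# The rebuilt jar lands under build/:
ls build/pig-*.jar
```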
Hope this is helpful.
Thanks,
Cheolsoo
On Mon, Nov 12, 2012 at 7:14 AM, Malcolm Tye
<[email protected]>wrote:
> Hi,
>
> I'm running Pig 0.10.0 in local mode on some small text files.
> There is no intention to run it on Hadoop at all. We have a job that
> runs every 5 minutes, and about 3% of the time the job fails with the
> error below. It happens at random places within the Pig script.
>
>
> 2012-10-19 14:15:37,719 [Thread-15] WARN
> org.apache.hadoop.mapred.LocalJobRunner - job_local_0004
> java.lang.NullPointerException
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:286)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:158)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:360)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:330)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:95)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNext(POUnion.java:165)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>
> In the Pig log, I get:
>
> ERROR 2244: Job failed, hadoop does not return any error message
>
> org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job
> failed, hadoop does not return any error message
>     at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>     at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
>     at org.apache.pig.Main.run(Main.java:555)
>     at org.apache.pig.Main.main(Main.java:111)
> ================================================================================
>
> Pig script is attached.
>
> Any help gratefully received.
>
> Thanks
>
> Malc