Michael Knapp created PIG-3651:
----------------------------------

             Summary: Cannot Launch Pig Job on Hadoop (Infinite Loop?)
                 Key: PIG-3651
                 URL: https://issues.apache.org/jira/browse/PIG-3651
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.11
            Reporter: Michael Knapp


There must be an infinite loop in the pig code, I wrote a very simple pig 
script, all it has is one load statement, and the next line dumps it to the 
console.  When I run that pig gets to the MapReduceLauncher and just hangs, 
never making any progress.

I'm running CDH4.5 on a CentOS 6.4 VM, all installed from Cloudera's yum repo. 
It is configured to all run in pseudo-distributed mode. Everything is running 
as a service and appears to be configured correctly (thank heaven!)

When I run the pig script on the command line I get a whole bunch of log 
statements and it looks like it is running, but once it starts, it never makes 
any progress, no matter how long I wait. These are the last couple lines:

2014-01-05 15:10:41,113 [JobControl] INFO  
org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: 
job_1388936205793_0006
2014-01-05 15:10:41,511 [JobControl] INFO  
org.apache.hadoop.yarn.client.YarnClientImpl - Submitted application 
application_1388936205793_0006 to ResourceManager at /0.0.0.0:8032
2014-01-05 15:10:41,564 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - 
The url to track the job: 
http://localhost:8088/proxy/application_1388936205793_0006/
2014-01-05 15:10:41,653 [main] INFO  
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher 
- 0% complete

I modified the pig script to point to my local filesystem file, and ran the pig 
script in local mode, and the job finished successfully in seconds. The local 
copy of the file is identical to the one hdfs has. 

I tried adding all my hadoop jars to the classpath, both in my pig script 
(using REGISTER) and in the pig bash script, it made no difference.

I ran jstack on the process and saw this:

"main" prio=10 tid=0x00007ff468018000 nid=0x2bbf in Object.wait() 
[0x00007ff46f14c000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000fba8c6e0> (a 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1)
        at java.lang.Thread.join(Thread.java:1288)
        - locked <0x00000000fba8c6e0> (a 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1)
        at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:282)
        at org.apache.pig.PigServer.launchPlan(PigServer.java:1266)
        at 
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1251)
        at org.apache.pig.PigServer.storeEx(PigServer.java:933)
        at org.apache.pig.PigServer.store(PigServer.java:900)
        at org.apache.pig.PigServer.openIterator(PigServer.java:813)
        at 
org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
        at 
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
        at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
        at 
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
        at org.apache.pig.Main.run(Main.java:604)
        at org.apache.pig.Main.main(Main.java:157)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

I think the MayReduceLauncher classes launchPig method must have an infinite 
loop in it somehow.  I looked in the source, that method is hundreds of lines 
(you really should split it up).  It does contain a while loop, I suspect that 
it is waiting for some other thread to start.

So you probably need to do a couple things:
1.  Split up that method, make it modular
2.  Have some kind of timeout, something that checks that it's making progress 
3.  Figure out why it's hung (I really can't figure it out, I followed all of 
CDH4.5's instructions very carefully)
4.  Whatever configuration setting I'm missing, probably needs to be documented 
a little beter.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to