Here is the answer:

In general, the PREP state does indeed exit rapidly: it exits as soon as
the connection to the resource manager is made.

However, if a connection to the resource manager cannot be made, the job
stays hung in PREP forever.

To debug this, check the resourcemanager port in
/etc/hadoop/conf/yarn-site.xml and make sure it matches the jobTracker
value in job.properties.
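
The comparison described above can be scripted. What follows is only a minimal sketch under a few assumptions of mine, not something from the Oozie tooling: it assumes the ResourceManager address is set explicitly as yarn.resourcemanager.address in yarn-site.xml (if it is left unset or uses ${...} substitution, this sketch will not resolve it), that job.properties is a plain key=value file, and that a bare TCP connect is a good enough reachability test. The function names are hypothetical helpers.

```python
import socket
import xml.etree.ElementTree as ET


def read_yarn_rm_address(yarn_site_path):
    """Return the value of yarn.resourcemanager.address, or None if unset."""
    root = ET.parse(yarn_site_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "yarn.resourcemanager.address":
            return prop.findtext("value", "").strip()
    return None


def read_job_property(job_properties_path, key="jobTracker"):
    """Return the value of `key` from a simple key=value properties file."""
    with open(job_properties_path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and lines that are not key=value pairs.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, value = line.split("=", 1)
            if name.strip() == key:
                return value.strip()
    return None


def can_connect(address, timeout=3.0):
    """Try a plain TCP connection to host:port; True if it succeeds."""
    host, _, port = address.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        return False


def check(yarn_site_path, job_properties_path):
    """Compare the two settings and report whether they agree."""
    rm = read_yarn_rm_address(yarn_site_path)
    jt = read_job_property(job_properties_path)
    return {
        "resourcemanager": rm,
        "jobTracker": jt,
        "match": rm is not None and rm == jt,
    }
```

Calling check("/etc/hadoop/conf/yarn-site.xml", "job.properties") and then can_connect() on the jobTracker value should show whether the mismatch or an unreachable port is the likely reason the action never leaves PREP.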

Should we categorize this as a bug in the oozie job submission process?
Or is it expected to hang when the resourcemanager connection fails?


On Sun, Feb 16, 2014 at 9:06 AM, Jay Vyas <[email protected]> wrote:

> There must be a serious problem then, because I use:
>
> oozie job -oozie http://localhost:11000/oozie -config job.properties -run
>
> It goes to PREP, and "stays there", for a very long time, without ever
> really converting to run.  I think eventually it times out.
>
> Any thoughts on what could be going on?  I don't see much in the logs.
>
> Here is the output:
>
> [root@mrg42 oozie-smoke]# ./run.sh
> result = job: 0000002-140215165147905-oozie-oozi-W
> if 500 error, then test that jar is in oozie web-inf/lib or libext
> 2
> 0000002-140215165147905-oozie-oozi-W is the job id, proceed?
>
> 2014-02-16 09:03:48,022  INFO ActionStartXCommand:539 - USER[root]
> GROUP[-] TOKEN[] APP[hello-world-wf]
> JOB[0000002-140215165147905-oozie-oozi-W]
> ACTION[0000002-140215165147905-oozie-oozi-W@:start:] Start action
> [0000002-140215165147905-oozie-oozi-W@:start:] with user-retry state :
> userRetryCount [0], userRetryMax [0], userRetryInterval [10]
> 2014-02-16 09:03:48,022  WARN ActionStartXCommand:542 - USER[root]
> GROUP[-] TOKEN[] APP[hello-world-wf]
> JOB[0000002-140215165147905-oozie-oozi-W]
> ACTION[0000002-140215165147905-oozie-oozi-W@:start:]
> [***0000002-140215165147905-oozie-oozi-W@:start:***]Action status=DONE
> 2014-02-16 09:03:48,022  WARN ActionStartXCommand:542 - USER[root]
> GROUP[-] TOKEN[] APP[hello-world-wf]
> JOB[0000002-140215165147905-oozie-oozi-W]
> ACTION[0000002-140215165147905-oozie-oozi-W@:start:]
> [***0000002-140215165147905-oozie-oozi-W@:start:***]Action updated in DB!
> 2014-02-16 09:03:48,075  INFO ActionStartXCommand:539 - USER[root]
> GROUP[-] TOKEN[] APP[hello-world-wf]
> JOB[0000002-140215165147905-oozie-oozi-W]
> ACTION[0000002-140215165147905-oozie-oozi-W@java-node] Start action
> [0000002-140215165147905-oozie-oozi-W@java-node] with user-retry state :
> userRetryCount [0], userRetryMax [0], userRetryInterval [10]
> 2014-02-16 09:03:48,200  WARN JavaActionExecutor:542 - USER[root] GROUP[-]
> TOKEN[] APP[hello-world-wf] JOB[0000002-140215165147905-oozie-oozi-W]
> ACTION[0000002-140215165147905-oozie-oozi-W@java-node] credentials is
> null for the action
> Job ID : 0000002-140215165147905-oozie-oozi-W
>
> ------------------------------------------------------------------------------------------------------------------------------------
> Workflow Name : hello-world-wf
> App Path      : glusterfs:///user/tom/oozietest/
> Status        : RUNNING
> Run           : 0
> User          : root
> Group         : -
> Created       : 2014-02-16 14:03 GMT
> Started       : 2014-02-16 14:03 GMT
> Last Modified : 2014-02-16 14:03 GMT
> Ended         : -
> CoordAction ID: -
>
> Actions
>
> ------------------------------------------------------------------------------------------------------------------------------------
> ID                                              Status    Ext ID                 Ext Status Err Code
>
> ------------------------------------------------------------------------------------------------------------------------------------
> 0000002-140215165147905-oozie-oozi-W@:start:    OK        -                      OK         -
>
> ------------------------------------------------------------------------------------------------------------------------------------
> 0000002-140215165147905-oozie-oozi-W@java-node  PREP      -                      -          -
>
> ------------------------------------------------------------------------------------------------------------------------------------
>
>
>
>
>
>
> On Sun, Feb 16, 2014 at 2:28 AM, Mohammad Islam <[email protected]> wrote:
>
>> In general, the PREP-->RUNNING transition should rarely be visible in
>> normal cases. If you use the command "oozie job -run", the job should
>> stay in PREP for only a few ms and then transition directly to RUNNING.
>>
>> Did you use "oozie job -submit"? In that case, Oozie keeps the job in
>> PREP until the user executes "oozie job -start".  Using
>> "oozie job -run ..." directly is encouraged.
>>
>> Regards,
>> Mohammad
>>
>>
>>
>>
>> On Saturday, February 15, 2014 5:07 PM, Jay Vyas <[email protected]>
>> wrote:
>>
>> Hi oozie :
>>
>> What exactly happens between the PREP->RUNNING stage of an oozie workflow
>> ?
>>
>> I notice that I have seen some jobs frozen in this state.
>> --
>> Jay Vyas
>> http://jayunit100.blogspot.com
>>
>
>
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com
>



-- 
Jay Vyas
http://jayunit100.blogspot.com
