Hello Sir,

I have installed GridWay-5.6.1 on a machine running globus-4.0.7 with PBS.
When I submit a job with the gwsubmit command, the job status stays
pending indefinitely. My grid has only a single machine, named "saum.grid".
The contents of the relevant log files are given below:

Content of gwd.log:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sun Apr  3 18:45:04 2011 [IM][I]: Hosts discovered by MAD (mds4): saum.grid
Sun Apr  3 18:46:32 2011 [UM][I]: Executing command grid-proxy-info -identity
Sun Apr  3 18:46:32 2011 [UM][I]: User proxy info, /O=Grid/OU=GlobusTest/OU=simpleCA-saum.grid/OU=grid/CN=guser01
Sun Apr  3 18:46:32 2011 [UM][I]: Loading execution MADs for user guser01 (/O=Grid/OU=GlobusTest/OU=simpleCA-saum.grid/OU=grid/CN=guser01).
Sun Apr  3 18:46:33 2011 [TM][I]: -- MARK --
Sun Apr  3 18:46:34 2011 [UM][I]:     Execution MAD ws loaded (exec:gw_em_mad_ws, args:, mode:rsl2).
Sun Apr  3 18:46:34 2011 [UM][I]: Loading transfer MADs for user guser01 (/O=Grid/OU=GlobusTest/OU=simpleCA-saum.grid/OU=grid/CN=guser01).
Sun Apr  3 18:46:35 2011 [UM][I]:     Transfer MAD gridftp loaded (exec: gw_tm_mad_ftp, arg: ).
Sun Apr  3 18:46:35 2011 [UM][I]: User guser01 (/O=Grid/OU=GlobusTest/OU=simpleCA-saum.grid/OU=grid/CN=guser01) registered.
Sun Apr  3 18:46:35 2011 [DM][I]: New job 0 allocated and initialized.
Sun Apr  3 18:46:38 2011 [UM][I]: -- MARK --
Sun Apr  3 18:46:41 2011 [EM][I]: -- MARK --
Sun Apr  3 18:46:45 2011 [IM][I]: -- MARK --
Sun Apr  3 18:46:46 2011 [DM][I]: Dispatching job 0 to saum.grid (workq).
Sun Apr  3 18:47:05 2011 [IM][I]: Discovering hosts.
Sun Apr  3 18:47:05 2011 [DM][I]: -- MARK --
Sun Apr  3 18:47:39 2011 [IM][I]: Hosts discovered by MAD (mds4): saum.grid
Sun Apr  3 18:47:56 2011 [DM][I]: Rescheduling job 0.
Sun Apr  3 18:47:58 2011 [UM][I]: -- MARK --
Sun Apr  3 18:48:20 2011 [TM][I]: -- MARK --
Sun Apr  3 18:48:21 2011 [IM][I]: -- MARK --
Sun Apr  3 18:48:21 2011 [EM][I]: -- MARK --
Sun Apr  3 18:49:26 2011 [TM][I]: -- MARK --
Sun Apr  3 18:49:31 2011 [IM][I]: -- MARK --
Sun Apr  3 18:49:36 2011 [EM][I]: -- MARK --
Sun Apr  3 18:49:38 2011 [DM][I]: -- MARK --
Sun Apr  3 18:49:46 2011 [DM][I]: -- MARK --
Sun Apr  3 18:49:50 2011 [IM][I]: Discovering hosts.
Sun Apr  3 18:49:51 2011 [IM][I]: Discovering hosts.
Sun Apr  3 18:50:42 2011 [IM][I]: Hosts discovered by MAD (mds4): saum.grid
============================================================================================================

result of *gwps* command:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
USER         JID DM   EM   START    END      EXEC    XFER    EXIT
NAME            HOST
guser01:0    0   pend ---- 18:46:35 --:--:-- 0:00:52 0:00:28 --
jt              saum.grid/PBS
================================================================================================


result of *gwhistory 0*:
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HID START    END      PROLOG  WRAPPER EPILOG  MIGR    REASON QUEUE
HOST
0   18:54:01 18:54:11 0:00:01 0:00:06 0:00:03 0:00:00 err    workq
saum.grid/PBS
0   18:46:46 18:47:56 0:00:04 0:00:46 0:00:20 0:00:00 err    workq
saum.grid/PBS
================================================================================================


Content of the job.log file:
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sun Apr  3 18:46:35 2011 [DM][I]: ----------- Job configuration file (jt) values -----------
Sun Apr  3 18:46:35 2011 [DM][I]:     EXECUTABLE             : /bin/hostname
Sun Apr  3 18:46:35 2011 [DM][I]:     ARGUMENTS              :
Sun Apr  3 18:46:35 2011 [DM][I]:     INPUT_FILES   (Total 0):
Sun Apr  3 18:46:35 2011 [DM][I]:     OUTPUT_FILES  (Total 0):
Sun Apr  3 18:46:35 2011 [DM][I]:     RESTART_FILES (Total 0):
Sun Apr  3 18:46:35 2011 [DM][I]:     STDIN_FILE             : /dev/null
Sun Apr  3 18:46:35 2011 [DM][I]:     STDOUT_FILE            : stdout.${JOB_ID}
Sun Apr  3 18:46:35 2011 [DM][I]:     STDERR_FILE            : stderr.${JOB_ID}
Sun Apr  3 18:46:35 2011 [DM][I]:     REQUIREMENTS           :
Sun Apr  3 18:46:35 2011 [DM][I]:     RANK                   :
Sun Apr  3 18:46:35 2011 [DM][I]:     RESCHEDULING_INTERVAL  : 0
Sun Apr  3 18:46:35 2011 [DM][I]:     RESCHEDULING_THRESHOLD : 300
Sun Apr  3 18:46:35 2011 [DM][I]:     SUSPENSION_TIMEOUT     : 600
Sun Apr  3 18:46:35 2011 [DM][I]:     CPULOAD_THRESHOLD      : 50
Sun Apr  3 18:46:35 2011 [DM][I]:     RESCHEDULE_ON_FAILURE  : yes
Sun Apr  3 18:46:35 2011 [DM][I]:     NUMBER_OF_RETRIES      : 1
Sun Apr  3 18:46:35 2011 [DM][I]:     CHECKPOINT_INTERVAL    : 0
Sun Apr  3 18:46:35 2011 [DM][I]:     CHECKPOINT_URL         :
Sun Apr  3 18:46:35 2011 [DM][I]:     WRAPPER                : /usr/local/gw-5.6.1/libexec/gw_wrapper.sh
Sun Apr  3 18:46:35 2011 [DM][I]:     MONITOR                :
Sun Apr  3 18:46:35 2011 [DM][I]:     PRE_WRAPPER            :
Sun Apr  3 18:46:35 2011 [DM][I]:     PRE_WRAPPER_ARGUMENTS  :
Sun Apr  3 18:46:35 2011 [DM][I]:     TYPE                   : single
Sun Apr  3 18:46:35 2011 [DM][I]:     NP                     : 1
Sun Apr  3 18:46:35 2011 [DM][I]:     DEADLINE               : 0:00:00 0
Sun Apr  3 18:46:35 2011 [DM][I]: ----------------------------------------------------------
Sun Apr  3 18:46:35 2011 [DM][I]: New state is PENDING.
Sun Apr  3 18:46:46 2011 [DM][I]: New state is PROLOG.
Sun Apr  3 18:46:46 2011 [TM][I]: Creating remote job working directory:
Sun Apr  3 18:46:46 2011 [TM][I]:     Target url: gsiftp://saum.grid/~/.gw_guser01_0/.
Sun Apr  3 18:46:48 2011 [TM][I]:     Remote job directory created.
Sun Apr  3 18:46:48 2011 [TM][I]: Staging input files:
Sun Apr  3 18:46:48 2011 [TM][I]:     Source: /home/guser01/GridWay.
Sun Apr  3 18:46:48 2011 [TM][I]:     Copying file file:///usr/local/gw-5.6.1/var/0/job.env.
Sun Apr  3 18:46:48 2011 [TM][W]:     Skipping file /bin/hostname, absolute path.
Sun Apr  3 18:46:48 2011 [TM][W]:     Skipping file /dev/null, absolute path.
Sun Apr  3 18:46:48 2011 [TM][I]:     Copying file file:///usr/local/gw-5.6.1/libexec/gw_wrapper.sh.
Sun Apr  3 18:46:49 2011 [TM][I]:     File file:///usr/local/gw-5.6.1/var/0/job.env copied.
Sun Apr  3 18:46:50 2011 [TM][I]:     File file:///usr/local/gw-5.6.1/libexec/gw_wrapper.sh copied.
Sun Apr  3 18:46:50 2011 [TM][I]: All input files copied.
Sun Apr  3 18:46:50 2011 [DM][I]: Prolog done:
Sun Apr  3 18:46:50 2011 [DM][I]:     Total time      : 4
Sun Apr  3 18:46:50 2011 [DM][I]: New state is WRAPPER.
Sun Apr  3 18:46:50 2011 [EM][I]: Submitting wrapper to saum.grid/PBS, RSL used is in /usr/local/gw-5.6.1/var/0/job.rsl.0.
Sun Apr  3 18:47:33 2011 [EM][I]: New execution state is PENDING.
Sun Apr  3 18:47:36 2011 [EM][I]: Execution state is PENDING.
Sun Apr  3 18:47:36 2011 [EM][I]: New execution state is ACTIVE.
Sun Apr  3 18:47:36 2011 [EM][I]: New execution state is DONE.
Sun Apr  3 18:47:36 2011 [DM][I]: Wrapper DONE:
Sun Apr  3 18:47:36 2011 [DM][I]:     Active time     : 0
Sun Apr  3 18:47:36 2011 [DM][I]:     Suspension time : 46
Sun Apr  3 18:47:36 2011 [DM][I]:     Total time      : 46
Sun Apr  3 18:47:36 2011 [DM][I]: New state is EPILOG_STD.
Sun Apr  3 18:47:36 2011 [TM][I]: Staging output files:
Sun Apr  3 18:47:36 2011 [TM][I]:     Source: gsiftp://saum.grid/~/.gw_guser01_0/.
Sun Apr  3 18:47:36 2011 [TM][I]:     Copying file stdout.wrapper.
Sun Apr  3 18:47:36 2011 [TM][I]:     Copying file stderr.wrapper.
Sun Apr  3 18:47:49 2011 [TM][I]:     File stdout.wrapper copied.
Sun Apr  3 18:47:51 2011 [TM][I]:     File stderr.wrapper copied.
Sun Apr  3 18:47:51 2011 [TM][I]: All output files copied.
Sun Apr  3 18:47:51 2011 [DM][E]: Unable to find exit code, assuming that the job failed or was cancelled.
Sun Apr  3 18:47:51 2011 [DM][I]: New state is EPILOG_RESTART.
Sun Apr  3 18:47:51 2011 [TM][I]: Staging output files:
Sun Apr  3 18:47:51 2011 [TM][I]:     Source: gsiftp://saum.grid/~/.gw_guser01_0/.
Sun Apr  3 18:47:51 2011 [TM][I]:     Copying file stdout.execution.
Sun Apr  3 18:47:51 2011 [TM][I]:     Copying file stderr.execution.
Sun Apr  3 18:47:53 2011 [TM][E]:     Copy of file stdout.execution failed.
Sun Apr  3 18:47:54 2011 [TM][E]:     Copy of file stderr.execution failed.
Sun Apr  3 18:47:54 2011 [TM][W]: Some output files were not copied.
Sun Apr  3 18:47:54 2011 [TM][W]: Removing remote directory:
Sun Apr  3 18:47:54 2011 [TM][W]:     Target url: gsiftp://saum.grid/~/.gw_guser01_0/.
Sun Apr  3 18:47:56 2011 [TM][I]:     Remote job directory removed.
Sun Apr  3 18:47:56 2011 [DM][E]: Epilog failed:
Sun Apr  3 18:47:56 2011 [DM][E]:     Total time      : 20
Sun Apr  3 18:47:56 2011 [DM][I]: Rescheduling job.
Sun Apr  3 18:47:56 2011 [DM][I]: New state is PENDING.
Sun Apr  3 18:54:01 2011 [DM][I]: New state is PROLOG.
Sun Apr  3 18:54:01 2011 [TM][I]: Creating remote job working directory:
Sun Apr  3 18:54:01 2011 [TM][I]:     Target url: gsiftp://saum.grid/~/.gw_guser01_0/.
Sun Apr  3 18:54:01 2011 [TM][I]:     Remote job directory created.
Sun Apr  3 18:54:01 2011 [TM][I]: Staging input files:
Sun Apr  3 18:54:01 2011 [TM][I]:     Source: /home/guser01/GridWay.
Sun Apr  3 18:54:01 2011 [TM][I]:     Copying file file:///usr/local/gw-5.6.1/var/0/job.env.
Sun Apr  3 18:54:01 2011 [TM][W]:     Skipping file /bin/hostname, absolute path.
Sun Apr  3 18:54:01 2011 [TM][W]:     Skipping file /dev/null, absolute path.
Sun Apr  3 18:54:01 2011 [TM][I]:     Copying file file:///usr/local/gw-5.6.1/libexec/gw_wrapper.sh.
Sun Apr  3 18:54:02 2011 [TM][I]:     File file:///usr/local/gw-5.6.1/var/0/job.env copied.
Sun Apr  3 18:54:02 2011 [TM][I]:     File file:///usr/local/gw-5.6.1/libexec/gw_wrapper.sh copied.
Sun Apr  3 18:54:02 2011 [TM][I]: All input files copied.
Sun Apr  3 18:54:02 2011 [DM][I]: Prolog done:
Sun Apr  3 18:54:02 2011 [DM][I]:     Total time      : 1
Sun Apr  3 18:54:02 2011 [DM][I]: New state is WRAPPER.
Sun Apr  3 18:54:02 2011 [EM][I]: Submitting wrapper to saum.grid/PBS, RSL used is in /usr/local/gw-5.6.1/var/0/job.rsl.1.
Sun Apr  3 18:54:07 2011 [EM][I]: New execution state is PENDING.
Sun Apr  3 18:54:08 2011 [EM][I]: Execution state is PENDING.
Sun Apr  3 18:54:08 2011 [EM][I]: New execution state is ACTIVE.
Sun Apr  3 18:54:08 2011 [EM][I]: New execution state is DONE.
Sun Apr  3 18:54:08 2011 [DM][I]: Wrapper DONE:
Sun Apr  3 18:54:08 2011 [DM][I]:     Active time     : 0
Sun Apr  3 18:54:08 2011 [DM][I]:     Suspension time : 6
Sun Apr  3 18:54:08 2011 [DM][I]:     Total time      : 6
Sun Apr  3 18:54:08 2011 [DM][I]: New state is EPILOG_STD.
Sun Apr  3 18:54:08 2011 [TM][I]: Staging output files:
Sun Apr  3 18:54:08 2011 [TM][I]:     Source: gsiftp://saum.grid/~/.gw_guser01_0/.
Sun Apr  3 18:54:08 2011 [TM][I]:     Copying file stdout.wrapper.
Sun Apr  3 18:54:08 2011 [TM][I]:     Copying file stderr.wrapper.
Sun Apr  3 18:54:09 2011 [TM][I]:     File stdout.wrapper copied.
Sun Apr  3 18:54:10 2011 [TM][I]:     File stderr.wrapper copied.
Sun Apr  3 18:54:10 2011 [TM][I]: All output files copied.
Sun Apr  3 18:54:10 2011 [DM][E]: Unable to find exit code, assuming that the job failed or was cancelled.
Sun Apr  3 18:54:10 2011 [DM][I]: New state is EPILOG_RESTART.
Sun Apr  3 18:54:10 2011 [TM][I]: Staging output files:
Sun Apr  3 18:54:10 2011 [TM][I]:     Source: gsiftp://saum.grid/~/.gw_guser01_0/.
Sun Apr  3 18:54:10 2011 [TM][I]:     Copying file stdout.execution.
Sun Apr  3 18:54:10 2011 [TM][I]:     Copying file stderr.execution.
Sun Apr  3 18:54:10 2011 [TM][E]:     Copy of file stdout.execution failed.
Sun Apr  3 18:54:11 2011 [TM][E]:     Copy of file stderr.execution failed.
Sun Apr  3 18:54:11 2011 [TM][W]: Some output files were not copied.
Sun Apr  3 18:54:11 2011 [TM][W]: Removing remote directory:
Sun Apr  3 18:54:11 2011 [TM][W]:     Target url: gsiftp://saum.grid/~/.gw_guser01_0/.
Sun Apr  3 18:54:11 2011 [TM][I]:     Remote job directory removed.
Sun Apr  3 18:54:11 2011 [DM][E]: Epilog failed:
Sun Apr  3 18:54:11 2011 [DM][E]:     Total time      : 3
Sun Apr  3 18:54:11 2011 [DM][I]: Rescheduling job.
Sun Apr  3 18:54:11 2011 [DM][I]: New state is PENDING.
=========================================================================================================
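
To make the failing steps in job.log easier to spot, the error and warning records can be filtered out with a short script. This is only a minimal sketch and not a GridWay tool: the names RECORD, parse_records and problems are made up for illustration, and the regular expression assumes the "timestamp [MODULE][LEVEL]: message" layout seen in the log above. Lines that do not start with a timestamp are treated as continuations of the previous record.

```python
import re

# Header of one log record, e.g.
#   "Sun Apr  3 18:47:51 2011 [DM][E]: Unable to find exit code, ..."
# Assumed layout: timestamp, [MODULE][LEVEL], then the message.
RECORD = re.compile(
    r"^(?P<stamp>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}) "
    r"\[(?P<module>\w+)\]\[(?P<level>\w)\]:(?P<message>.*)$"
)


def parse_records(lines):
    """Turn raw log lines into dicts, folding continuation lines
    (lines without a timestamp) into the previous record."""
    records = []
    for raw in lines:
        line = raw.strip()
        match = RECORD.match(line)
        if match:
            record = match.groupdict()
            record["message"] = record["message"].strip()
            records.append(record)
        elif line and records:
            previous = records[-1]
            previous["message"] = " ".join(
                part for part in (previous["message"], line) if part
            )
    return records


def problems(records, levels=("E", "W")):
    """Keep only records whose level is in `levels` (errors and warnings by default)."""
    return [r for r in records if r["level"] in levels]


# Small demonstration on lines copied from the job.log above.
sample = [
    "Sun Apr  3 18:47:51 2011 [TM][I]: All output files copied.",
    "Sun Apr  3 18:47:51 2011 [DM][E]: Unable to find exit code, assuming that",
    "the job failed or was cancelled.",
    "Sun Apr  3 18:47:53 2011 [TM][E]:     Copy of file stdout.execution failed.",
]
for rec in problems(parse_records(sample)):
    print(rec["stamp"], rec["module"], rec["level"], rec["message"])
```

To run it on the real file, pass an open file handle instead of the sample list, for example parse_records(open("job.log")), where the path is whatever location the job.log is stored in on your installation.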



And the Globus container log corresponding to GridWay's *gwsubmit* is:
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2011-04-03 18:35:17,183 INFO  exec.StateMachine [RunQueueThread_1,logJobAccepted:3424] Job 05217630-5df3-11e0-aafa-e2327fe73aea accepted for local user 'guser01'
2011-04-03 18:35:18,497 INFO  exec.StateMachine [RunQueueThread_2,logJobSubmitted:3436] Job 05217630-5df3-11e0-aafa-e2327fe73aea submitted with local job ID '7.saum.grid'
2011-04-03 18:35:23,168 INFO  exec.StateMachine [RunQueueThread_11,logJobSucceeded:3446] Job 05217630-5df3-11e0-aafa-e2327fe73aea finished successfully
2011-04-03 18:47:29,586 INFO  exec.StateMachine [RunQueueThread_13,logJobAccepted:3424] Job b9e4ae10-5df4-11e0-aafa-e2327fe73aea accepted for local user 'guser01'
2011-04-03 18:47:31,373 INFO  exec.StateMachine [RunQueueThread_14,logJobSubmitted:3436] Job b9e4ae10-5df4-11e0-aafa-e2327fe73aea submitted with local job ID '8.saum.grid'
2011-04-03 18:47:32,411 INFO  exec.StateMachine [RunQueueThread_5,logJobSucceeded:3446] Job b9e4ae10-5df4-11e0-aafa-e2327fe73aea finished successfully
=======================================================================================================================



Please tell me what is wrong with GridWay's *gwsubmit*. How should I solve
this issue?

_ _ _ _ _ _ _ _ _ _
     Regards
Saumesh Kumar
  IIT Roorkee
