Hi,

Does a Globus submission with data staging work?

Please send me the output of the following command:

globusrun-ws -submit -F saum.grid -Ft PBS -s -c /bin/uname -a
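
To make the relevant entries in logs like the job.log below easier to spot, here is a minimal Python sketch that keeps only the error and warning lines. It assumes the "[MODULE][LEVEL]:" layout seen in the quoted logs; the function name and the sample lines are illustrative and are not part of GridWay. Wrapped continuation lines are not rejoined.

```python
import re

# Matches entries such as:
#   Sun Apr  3 18:47:51 2011 [DM][E]: Unable to find exit code
# Layout assumed from the quoted logs: timestamp, [module], [level], message.
LOG_LINE = re.compile(
    r"^(?P<when>.+?) \[(?P<module>[A-Z]{2})\]\[(?P<level>[IWE])\]: ?(?P<msg>.*)$"
)

def problem_entries(lines, levels=("E", "W")):
    """Return (timestamp, module, level, message) for entries at the given levels."""
    found = []
    for line in lines:
        # Drop e-mail quote markers ("> ") and trailing whitespace before matching.
        text = line.lstrip("> ").rstrip()
        match = LOG_LINE.match(text)
        if match and match.group("level") in levels:
            found.append((
                match.group("when"),
                match.group("module"),
                match.group("level"),
                match.group("msg").strip(),
            ))
    return found

# Sample lines copied from the quoted job.log.
sample = [
    "Sun Apr  3 18:47:51 2011 [TM][I]: All output files copied.",
    "Sun Apr  3 18:47:51 2011 [DM][E]: Unable to find exit code, assuming that",
    "Sun Apr  3 18:47:53 2011 [TM][E]:     Copy of file stdout.execution failed.",
]

for entry in problem_entries(sample):
    print(entry)
```

To run it over a whole file, pass the open file object instead of the sample list, for example problem_entries(open("job.log")).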


Regards,

Dr. Eduardo Huedo Cuesta
Associate Professor (Profesor Titular), Universidad Complutense de Madrid
http://dsa-research.org/ehuedo



2011/4/4 Saumesh Kumar <[email protected]>

> Hello Sir,
>
> I have installed GridWay-5.6.1 on a machine running globus-4.0.7 with PBS.
> When I submit a job with the gwsubmit command, the job status remains
> pending indefinitely. My grid has only a single machine, named "saum.grid".
> The contents of the various log files are given below:
>
> Content of gwd.log:
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Sun Apr  3 18:45:04 2011 [IM][I]: Hosts discovered by MAD (mds4): saum.grid
>
> Sun Apr  3 18:46:32 2011 [UM][I]: Executing command grid-proxy-info
> -identity
> Sun Apr  3 18:46:32 2011 [UM][I]: User proxy info,
> /O=Grid/OU=GlobusTest/OU=simpleCA-saum.grid/OU=grid/CN=guser01
> Sun Apr  3 18:46:32 2011 [UM][I]: Loading execution MADs for user guser01
> (/O=Grid/OU=GlobusTest/OU=simpleCA-saum.grid/OU=grid/CN=guser01).
> Sun Apr  3 18:46:33 2011 [TM][I]: -- MARK --
> Sun Apr  3 18:46:34 2011 [UM][I]:     Execution MAD ws loaded
> (exec:gw_em_mad_ws, args:, mode:rsl2).
> Sun Apr  3 18:46:34 2011 [UM][I]: Loading transfer MADs for user guser01
> (/O=Grid/OU=GlobusTest/OU=simpleCA-saum.grid/OU=grid/CN=guser01).
> Sun Apr  3 18:46:35 2011 [UM][I]:     Transfer MAD gridftp loaded (exec:
> gw_tm_mad_ftp, arg: ).
> Sun Apr  3 18:46:35 2011 [UM][I]: User guser01
> (/O=Grid/OU=GlobusTest/OU=simpleCA-saum.grid/OU=grid/CN=guser01) registered.
> Sun Apr  3 18:46:35 2011 [DM][I]: New job 0 allocated and initialized.
> Sun Apr  3 18:46:38 2011 [UM][I]: -- MARK --
> Sun Apr  3 18:46:41 2011 [EM][I]: -- MARK --
> Sun Apr  3 18:46:45 2011 [IM][I]: -- MARK --
> Sun Apr  3 18:46:46 2011 [DM][I]: Dispatching job 0 to saum.grid (workq).
> Sun Apr  3 18:47:05 2011 [IM][I]: Discovering hosts.
> Sun Apr  3 18:47:05 2011 [DM][I]: -- MARK --
> Sun Apr  3 18:47:39 2011 [IM][I]: Hosts discovered by MAD (mds4): saum.grid
>
> Sun Apr  3 18:47:56 2011 [DM][I]: Rescheduling job 0.
> Sun Apr  3 18:47:58 2011 [UM][I]: -- MARK --
> Sun Apr  3 18:48:20 2011 [TM][I]: -- MARK --
> Sun Apr  3 18:48:21 2011 [IM][I]: -- MARK --
> Sun Apr  3 18:48:21 2011 [EM][I]: -- MARK --
> Sun Apr  3 18:49:26 2011 [TM][I]: -- MARK --
> Sun Apr  3 18:49:31 2011 [IM][I]: -- MARK --
> Sun Apr  3 18:49:36 2011 [EM][I]: -- MARK --
> Sun Apr  3 18:49:38 2011 [DM][I]: -- MARK --
> Sun Apr  3 18:49:46 2011 [DM][I]: -- MARK --
> Sun Apr  3 18:49:50 2011 [IM][I]: Discovering hosts.
> Sun Apr  3 18:49:51 2011 [IM][I]: Discovering hosts.
> Sun Apr  3 18:50:42 2011 [IM][I]: Hosts discovered by MAD (mds4): saum.grid
>
>
> ============================================================================================================
>
> Result of the *gwps* command:
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> USER         JID DM   EM   START    END      EXEC    XFER    EXIT NAME            HOST
> guser01:0    0   pend ---- 18:46:35 --:--:-- 0:00:52 0:00:28 --   jt              saum.grid/PBS
>
> ================================================================================================
>
>
> Result of *gwhistory 0*:
>
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> HID START    END      PROLOG  WRAPPER EPILOG  MIGR    REASON QUEUE HOST
> 0   18:54:01 18:54:11 0:00:01 0:00:06 0:00:03 0:00:00 err    workq saum.grid/PBS
> 0   18:46:46 18:47:56 0:00:04 0:00:46 0:00:20 0:00:00 err    workq saum.grid/PBS
>
> ================================================================================================
>
>
> Content of the job.log file:
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Sun Apr  3 18:46:35 2011 [DM][I]: ----------- Job configuration file (jt)
> values -----------
> Sun Apr  3 18:46:35 2011 [DM][I]:     EXECUTABLE             :
> /bin/hostname
> Sun Apr  3 18:46:35 2011 [DM][I]:     ARGUMENTS              :
> Sun Apr  3 18:46:35 2011 [DM][I]:     INPUT_FILES   (Total 0):
> Sun Apr  3 18:46:35 2011 [DM][I]:     OUTPUT_FILES  (Total 0):
> Sun Apr  3 18:46:35 2011 [DM][I]:     RESTART_FILES (Total 0):
> Sun Apr  3 18:46:35 2011 [DM][I]:     STDIN_FILE             : /dev/null
> Sun Apr  3 18:46:35 2011 [DM][I]:     STDOUT_FILE            :
> stdout.${JOB_ID}
> Sun Apr  3 18:46:35 2011 [DM][I]:     STDERR_FILE            :
> stderr.${JOB_ID}
> Sun Apr  3 18:46:35 2011 [DM][I]:     REQUIREMENTS           :
> Sun Apr  3 18:46:35 2011 [DM][I]:     RANK                   :
> Sun Apr  3 18:46:35 2011 [DM][I]:     RESCHEDULING_INTERVAL  : 0
> Sun Apr  3 18:46:35 2011 [DM][I]:     RESCHEDULING_THRESHOLD : 300
> Sun Apr  3 18:46:35 2011 [DM][I]:     SUSPENSION_TIMEOUT     : 600
> Sun Apr  3 18:46:35 2011 [DM][I]:     CPULOAD_THRESHOLD      : 50
> Sun Apr  3 18:46:35 2011 [DM][I]:     RESCHEDULE_ON_FAILURE  : yes
> Sun Apr  3 18:46:35 2011 [DM][I]:     NUMBER_OF_RETRIES      : 1
> Sun Apr  3 18:46:35 2011 [DM][I]:     CHECKPOINT_INTERVAL    : 0
> Sun Apr  3 18:46:35 2011 [DM][I]:     CHECKPOINT_URL         :
> Sun Apr  3 18:46:35 2011 [DM][I]:     WRAPPER                :
> /usr/local/gw-5.6.1/libexec/gw_wrapper.sh
> Sun Apr  3 18:46:35 2011 [DM][I]:     MONITOR                :
> Sun Apr  3 18:46:35 2011 [DM][I]:     PRE_WRAPPER            :
> Sun Apr  3 18:46:35 2011 [DM][I]:     PRE_WRAPPER_ARGUMENTS  :
> Sun Apr  3 18:46:35 2011 [DM][I]:     TYPE                   : single
> Sun Apr  3 18:46:35 2011 [DM][I]:     NP                     : 1
> Sun Apr  3 18:46:35 2011 [DM][I]:     DEADLINE               : 0:00:00 0
> Sun Apr  3 18:46:35 2011 [DM][I]:
> ----------------------------------------------------------
> Sun Apr  3 18:46:35 2011 [DM][I]: New state is PENDING.
> Sun Apr  3 18:46:46 2011 [DM][I]: New state is PROLOG.
> Sun Apr  3 18:46:46 2011 [TM][I]: Creating remote job working directory:
> Sun Apr  3 18:46:46 2011 [TM][I]:     Target url:
> gsiftp://saum.grid/~/.gw_guser01_0/.
> Sun Apr  3 18:46:48 2011 [TM][I]:     Remote job directory created.
> Sun Apr  3 18:46:48 2011 [TM][I]: Staging input files:
> Sun Apr  3 18:46:48 2011 [TM][I]:     Source: /home/guser01/GridWay.
> Sun Apr  3 18:46:48 2011 [TM][I]:     Copying file
> file:///usr/local/gw-5.6.1/var/0/job.env.
> Sun Apr  3 18:46:48 2011 [TM][W]:     Skipping file /bin/hostname, absolute
> path.
> Sun Apr  3 18:46:48 2011 [TM][W]:     Skipping file /dev/null, absolute
> path.
> Sun Apr  3 18:46:48 2011 [TM][I]:     Copying file
> file:///usr/local/gw-5.6.1/libexec/gw_wrapper.sh.
> Sun Apr  3 18:46:49 2011 [TM][I]:     File
> file:///usr/local/gw-5.6.1/var/0/job.env copied.
> Sun Apr  3 18:46:50 2011 [TM][I]:     File
> file:///usr/local/gw-5.6.1/libexec/gw_wrapper.sh copied.
> Sun Apr  3 18:46:50 2011 [TM][I]: All input files copied.
> Sun Apr  3 18:46:50 2011 [DM][I]: Prolog done:
> Sun Apr  3 18:46:50 2011 [DM][I]:     Total time      : 4
> Sun Apr  3 18:46:50 2011 [DM][I]: New state is WRAPPER.
> Sun Apr  3 18:46:50 2011 [EM][I]: Submitting wrapper to saum.grid/PBS, RSL
> used is in /usr/local/gw-5.6.1/var/0/job.rsl.0.
> Sun Apr  3 18:47:33 2011 [EM][I]: New execution state is PENDING.
> Sun Apr  3 18:47:36 2011 [EM][I]: Execution state is PENDING.
> Sun Apr  3 18:47:36 2011 [EM][I]: New execution state is ACTIVE.
> Sun Apr  3 18:47:36 2011 [EM][I]: New execution state is DONE.
> Sun Apr  3 18:47:36 2011 [DM][I]: Wrapper DONE:
> Sun Apr  3 18:47:36 2011 [DM][I]:     Active time     : 0
> Sun Apr  3 18:47:36 2011 [DM][I]:     Suspension time : 46
> Sun Apr  3 18:47:36 2011 [DM][I]:     Total time      : 46
> Sun Apr  3 18:47:36 2011 [DM][I]: New state is EPILOG_STD.
> Sun Apr  3 18:47:36 2011 [TM][I]: Staging output files:
> Sun Apr  3 18:47:36 2011 [TM][I]:     Source:
> gsiftp://saum.grid/~/.gw_guser01_0/.
> Sun Apr  3 18:47:36 2011 [TM][I]:     Copying file stdout.wrapper.
> Sun Apr  3 18:47:36 2011 [TM][I]:     Copying file stderr.wrapper.
> Sun Apr  3 18:47:49 2011 [TM][I]:     File stdout.wrapper copied.
> Sun Apr  3 18:47:51 2011 [TM][I]:     File stderr.wrapper copied.
> Sun Apr  3 18:47:51 2011 [TM][I]: All output files copied.
> Sun Apr  3 18:47:51 2011 [DM][E]: Unable to find exit code, assuming that
> the job failed or was cancelled.
> Sun Apr  3 18:47:51 2011 [DM][I]: New state is EPILOG_RESTART.
> Sun Apr  3 18:47:51 2011 [TM][I]: Staging output files:
> Sun Apr  3 18:47:51 2011 [TM][I]:     Source:
> gsiftp://saum.grid/~/.gw_guser01_0/.
> Sun Apr  3 18:47:51 2011 [TM][I]:     Copying file stdout.execution.
> Sun Apr  3 18:47:51 2011 [TM][I]:     Copying file stderr.execution.
> Sun Apr  3 18:47:53 2011 [TM][E]:     Copy of file stdout.execution failed.
> Sun Apr  3 18:47:54 2011 [TM][E]:     Copy of file stderr.execution failed.
> Sun Apr  3 18:47:54 2011 [TM][W]: Some output files were not copied.
> Sun Apr  3 18:47:54 2011 [TM][W]: Removing remote directory:
> Sun Apr  3 18:47:54 2011 [TM][W]:     Target url:
> gsiftp://saum.grid/~/.gw_guser01_0/.
> Sun Apr  3 18:47:56 2011 [TM][I]:     Remote job directory removed.
> Sun Apr  3 18:47:56 2011 [DM][E]: Epilog failed:
> Sun Apr  3 18:47:56 2011 [DM][E]:     Total time      : 20
> Sun Apr  3 18:47:56 2011 [DM][I]: Rescheduling job.
> Sun Apr  3 18:47:56 2011 [DM][I]: New state is PENDING.
> Sun Apr  3 18:54:01 2011 [DM][I]: New state is PROLOG.
> Sun Apr  3 18:54:01 2011 [TM][I]: Creating remote job working directory:
> Sun Apr  3 18:54:01 2011 [TM][I]:     Target url:
> gsiftp://saum.grid/~/.gw_guser01_0/.
> Sun Apr  3 18:54:01 2011 [TM][I]:     Remote job directory created.
> Sun Apr  3 18:54:01 2011 [TM][I]: Staging input files:
> Sun Apr  3 18:54:01 2011 [TM][I]:     Source: /home/guser01/GridWay.
> Sun Apr  3 18:54:01 2011 [TM][I]:     Copying file
> file:///usr/local/gw-5.6.1/var/0/job.env.
> Sun Apr  3 18:54:01 2011 [TM][W]:     Skipping file /bin/hostname, absolute
> path.
> Sun Apr  3 18:54:01 2011 [TM][W]:     Skipping file /dev/null, absolute
> path.
> Sun Apr  3 18:54:01 2011 [TM][I]:     Copying file
> file:///usr/local/gw-5.6.1/libexec/gw_wrapper.sh.
> Sun Apr  3 18:54:02 2011 [TM][I]:     File
> file:///usr/local/gw-5.6.1/var/0/job.env copied.
> Sun Apr  3 18:54:02 2011 [TM][I]:     File
> file:///usr/local/gw-5.6.1/libexec/gw_wrapper.sh copied.
> Sun Apr  3 18:54:02 2011 [TM][I]: All input files copied.
> Sun Apr  3 18:54:02 2011 [DM][I]: Prolog done:
> Sun Apr  3 18:54:02 2011 [DM][I]:     Total time      : 1
> Sun Apr  3 18:54:02 2011 [DM][I]: New state is WRAPPER.
> Sun Apr  3 18:54:02 2011 [EM][I]: Submitting wrapper to saum.grid/PBS, RSL
> used is in /usr/local/gw-5.6.1/var/0/job.rsl.1.
> Sun Apr  3 18:54:07 2011 [EM][I]: New execution state is PENDING.
> Sun Apr  3 18:54:08 2011 [EM][I]: Execution state is PENDING.
> Sun Apr  3 18:54:08 2011 [EM][I]: New execution state is ACTIVE.
> Sun Apr  3 18:54:08 2011 [EM][I]: New execution state is DONE.
> Sun Apr  3 18:54:08 2011 [DM][I]: Wrapper DONE:
> Sun Apr  3 18:54:08 2011 [DM][I]:     Active time     : 0
> Sun Apr  3 18:54:08 2011 [DM][I]:     Suspension time : 6
> Sun Apr  3 18:54:08 2011 [DM][I]:     Total time      : 6
> Sun Apr  3 18:54:08 2011 [DM][I]: New state is EPILOG_STD.
> Sun Apr  3 18:54:08 2011 [TM][I]: Staging output files:
> Sun Apr  3 18:54:08 2011 [TM][I]:     Source:
> gsiftp://saum.grid/~/.gw_guser01_0/.
> Sun Apr  3 18:54:08 2011 [TM][I]:     Copying file stdout.wrapper.
> Sun Apr  3 18:54:08 2011 [TM][I]:     Copying file stderr.wrapper.
> Sun Apr  3 18:54:09 2011 [TM][I]:     File stdout.wrapper copied.
> Sun Apr  3 18:54:10 2011 [TM][I]:     File stderr.wrapper copied.
> Sun Apr  3 18:54:10 2011 [TM][I]: All output files copied.
> Sun Apr  3 18:54:10 2011 [DM][E]: Unable to find exit code, assuming that
> the job failed or was cancelled.
> Sun Apr  3 18:54:10 2011 [DM][I]: New state is EPILOG_RESTART.
> Sun Apr  3 18:54:10 2011 [TM][I]: Staging output files:
> Sun Apr  3 18:54:10 2011 [TM][I]:     Source:
> gsiftp://saum.grid/~/.gw_guser01_0/.
> Sun Apr  3 18:54:10 2011 [TM][I]:     Copying file stdout.execution.
> Sun Apr  3 18:54:10 2011 [TM][I]:     Copying file stderr.execution.
> Sun Apr  3 18:54:10 2011 [TM][E]:     Copy of file stdout.execution failed.
> Sun Apr  3 18:54:11 2011 [TM][E]:     Copy of file stderr.execution failed.
> Sun Apr  3 18:54:11 2011 [TM][W]: Some output files were not copied.
> Sun Apr  3 18:54:11 2011 [TM][W]: Removing remote directory:
> Sun Apr  3 18:54:11 2011 [TM][W]:     Target url:
> gsiftp://saum.grid/~/.gw_guser01_0/.
> Sun Apr  3 18:54:11 2011 [TM][I]:     Remote job directory removed.
> Sun Apr  3 18:54:11 2011 [DM][E]: Epilog failed:
> Sun Apr  3 18:54:11 2011 [DM][E]:     Total time      : 3
> Sun Apr  3 18:54:11 2011 [DM][I]: Rescheduling job.
> Sun Apr  3 18:54:11 2011 [DM][I]: New state is PENDING.
>
> =========================================================================================================
>
>
>
> And the Globus container log corresponding to GridWay's *gwsubmit* is:
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> 2011-04-03 18:35:17,183 INFO  exec.StateMachine
> [RunQueueThread_1,logJobAccepted:3424] Job
> 05217630-5df3-11e0-aafa-e2327fe73aea accepted for local user 'guser01'
> 2011-04-03 18:35:18,497 INFO  exec.StateMachine
> [RunQueueThread_2,logJobSubmitted:3436] Job
> 05217630-5df3-11e0-aafa-e2327fe73aea submitted with local job ID
> '7.saum.grid'
> 2011-04-03 18:35:23,168 INFO  exec.StateMachine
> [RunQueueThread_11,logJobSucceeded:3446] Job
> 05217630-5df3-11e0-aafa-e2327fe73aea finished successfully
> 2011-04-03 18:47:29,586 INFO  exec.StateMachine
> [RunQueueThread_13,logJobAccepted:3424] Job
> b9e4ae10-5df4-11e0-aafa-e2327fe73aea accepted for local user 'guser01'
> 2011-04-03 18:47:31,373 INFO  exec.StateMachine
> [RunQueueThread_14,logJobSubmitted:3436] Job
> b9e4ae10-5df4-11e0-aafa-e2327fe73aea submitted with local job ID
> '8.saum.grid'
> 2011-04-03 18:47:32,411 INFO  exec.StateMachine
> [RunQueueThread_5,logJobSucceeded:3446] Job
> b9e4ae10-5df4-11e0-aafa-e2327fe73aea finished successfully
>
> =======================================================================================================================
>
>
>
> Please tell me what is wrong with GridWay's *gwsubmit*. How should I
> solve the issue?
>
> _ _ _ _ _ _ _ _ _ _
>      Regards,
> Saumesh Kumar
>   IIT Roorkee
>
