It appears that you are mixing two different versions of the Globus Toolkit 
here.
globus-job-submit is a pre-WS GRAM (GT2) client. The URL in your Condor 
submit file below, on the other hand, tries to use GT4 (web services) and, 
worse, tries to use it on port 2119, which is a GT2 port.

Try the following grid_resource line in your Condor-G submit file:

grid_resource = gt2 head.beng02.com/jobmanager-pbs

I think you will be fine if you use a pure GT2 service. What you have below is 
a GT4 URL on a GT2 port, and that will certainly not work.
Also, the GT4 (web services) client is mostly broken in Condor 7.6.6 and will 
be fully deprecated in the next major release of Condor.
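
For reference, here is a rough sketch of what the whole submit file might look 
like with only the grid_resource line changed. Everything else is copied from 
the file you posted below. The jobmanager-pbs contact assumes the PBS job 
manager is installed under that name on head.beng02.com, so adjust it if your 
site uses a different name.

# Sketch only: GT2 (pre-WS GRAM) contact string instead of the GT4 URL.
# "jobmanager-pbs" is an assumed job manager name on the remote host.
grid_resource = gt2 head.beng02.com/jobmanager-pbs
Universe = grid
when_to_transfer_output = ON_EXIT
Executable = /bin/hostname
Arguments = -f
Output = cout.$(Cluster).$(Process)
Log = clog.$(Cluster).$(Process)
Queue

With a gt2 resource, Condor-G should contact the same gatekeeper that 
globus-job-submit already reaches successfully.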

Steve Timm



From: [email protected] 
[mailto:[email protected]] On Behalf Of Hameed Alzahrani
Sent: Sunday, April 29, 2012 8:19 PM
To: [email protected]
Subject: [gt-user] Condor-G sumission does not work while globus submit works


Hi,

When I submit the following submit file through Condor, it does not work and 
the job remains idle, while submitting the same job with globus-job-submit 
works without any errors. The log on the remote host shows an authentication 
failure in the Condor-G case, but it does not show any failure when the job is 
submitted with Globus. Has anyone come across this problem, or does anyone 
know how to solve it? Any help will be appreciated.

I use condor 7.6.6 and VDT 2

Submission file and process:

[zhrani@CM Grid]$ cat hostname_submit.jcl
grid_resource = gt4 https://head.beng02.com:2119/wsrf/services/ManagedJobFactoryService PBS
Universe = grid
when_to_transfer_output = ON_EXIT
Executable = /bin/hostname
Arguments = -f
Output = cout.$(Cluster).$(Process)
Log =clog.$(Cluster).$(Process)
Queue

[zhrani@CM Grid]$ condor_submit hostname_submit.jcl
Submitting job(s).
Logging submit event(s).
1 job(s) submitted to cluster 1106.

[zhrani@CM Grid]$ condor_q -globus


-- Submitter: CM.CHPC.hud.ac.uk : <192.168.0.10:21871> : CM.CHPC.hud.ac.uk
 ID      OWNER          STATUS  MANAGER  HOST                EXECUTABLE
1106.0   zhrani        UNSUBMITTED PBS      head.beng02.com     /bin/hostname

[zhrani@CM Grid]$ condor_rm zhrani
User zhrani's job(s) have been marked for removal.

[zhrani@CM Grid]$ globus-job-submit head.beng02.com /bin/hostname -f
https://head.beng02.com:37308/6261/1335746926/
[zhrani@CM Grid]$ globus-job-status https://head.beng02.com:37308/6261/1335746926/
DONE
[zhrani@CM Grid]$ globus-job-get-output https://head.beng02.com:37308/6261/1335746926/
head.beng02.com


Gridmanager LOG:

04/30/12 01:46:29 [25065] resource 
https://head.beng02.com:2119/wsrf/services/ManagedJobFactoryService is now up
04/30/12 01:46:29 [25065] *** checkDelegation()
04/30/12 01:46:29 [25065] (1106.0) doEvaluateState called: gmState 
GM_UNSUBMITTED, globusState
04/30/12 01:47:19 [25065] Received CHECK_LEASES signal
04/30/12 01:47:19 [25065] in doContactSchedd()
04/30/12 01:47:19 [25065] querying for renewed leases
04/30/12 01:47:19 [25065] querying for removed/held jobs
04/30/12 01:47:19 [25065] Using constraint ((Owner=?="zhrani"&&JobUniverse==9)) 
&& ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || 
(JobStatus == 5 && Managed =?= "External"))
04/30/12 01:47:19 [25065] Fetched 0 job ads from schedd
04/30/12 01:47:19 [25065] leaving doContactSchedd()
04/30/12 01:47:22 [25065] GridftpServer: Submitting job for proxy 
'/O=Grid/OU=GlobusTest/OU=simpleCA-head.beng02.com/OU=beng02.com/CN=zahrani'
04/30/12 01:47:22 [25065] entering FileTransfer::SimpleInit
04/30/12 01:47:22 [25065] Input files: 
/tmp/condor_g_scratch.0x19360fd0.25029/grid-mapfile
04/30/12 01:47:22 [25065] entering FileTransfer::UploadFiles (final_transfer=0)
04/30/12 01:47:22 [25065] entering FileTransfer::Upload
04/30/12 01:47:22 [25065] entering FileTransfer::DoUpload
04/30/12 01:47:22 [25065] DoUpload: sending file 
/tmp/condor_g_scratch.0x19360fd0.25029/master_proxy.2
04/30/12 01:47:22 [25065] FILETRANSFER: outgoing file_command is 4 for 
/tmp/condor_g_scratch.0x19360fd0.25029/master_proxy.2
04/30/12 01:47:22 [25065] Received GoAhead from peer to send 
/tmp/condor_g_scratch.0x19360fd0.25029/master_proxy.2 and all further files.
04/30/12 01:47:22 [25065] Sending GoAhead for 192.168.0.10 to receive 
/tmp/condor_g_scratch.0x19360fd0.25029/master_proxy.2 and all further files.
04/30/12 01:47:22 [25065] DoUpload: put_x509_delegation() returned 0
04/30/12 01:47:22 [25065] DoUpload: sending file 
/tmp/condor_g_scratch.0x19360fd0.25029/grid-mapfile
04/30/12 01:47:22 [25065] FILETRANSFER: outgoing file_command is 1 for 
/tmp/condor_g_scratch.0x19360fd0.25029/grid-mapfile
04/30/12 01:47:22 [25065] ReliSock::put_file_with_permissions(): going to send 
permissions 100644
04/30/12 01:47:22 [25065] put_file: going to send from filename 
/tmp/condor_g_scratch.0x19360fd0.25029/grid-mapfile
04/30/12 01:47:22 [25065] put_file: Found file size 84
04/30/12 01:47:22 [25065] put_file: sending 84 bytes
04/30/12 01:47:22 [25065] ReliSock: put_file: sent 84 bytes
04/30/12 01:47:22 [25065] DoUpload: sending file 
/usr/libexec/condor/gridftp_wrapper.sh
04/30/12 01:47:22 [25065] FILETRANSFER: outgoing file_command is 1 for 
/usr/libexec/condor/gridftp_wrapper.sh
04/30/12 01:47:22 [25065] ReliSock::put_file_with_permissions(): going to send 
permissions 100755
04/30/12 01:47:22 [25065] put_file: going to send from filename 
/usr/libexec/condor/gridftp_wrapper.sh
04/30/12 01:47:22 [25065] put_file: Found file size 1057
04/30/12 01:47:22 [25065] put_file: sending 1057 bytes
04/30/12 01:47:22 [25065] ReliSock: put_file: sent 1057 bytes
04/30/12 01:47:22 [25065] DoUpload: exiting at 3003
04/30/12 01:47:25 [25065] GAHP[25071] <- 'RESULTS'
04/30/12 01:47:25 [25065] GAHP[25071] -> 'S' '0'
04/30/12 01:47:25 [25065] in doContactSchedd()
04/30/12 01:47:25 [25065] querying for removed/held jobs
04/30/12 01:47:25 [25065] Using constraint ((Owner=?="zhrani"&&JobUniverse==9)) 
&& ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || 
(JobStatus == 5 && Managed =?= "External"))
04/30/12 01:47:25 [25065] Fetched 0 job ads from schedd
04/30/12 01:47:25 [25065] 1108.0 job status: 4
04/30/12 01:47:25 [25065] leaving doContactSchedd()
04/30/12 01:47:26 [25065] Evaluating staleness of remote job statuses.
04/30/12 01:47:42 [25065] Received REMOVE_JOBS signal
04/30/12 01:47:42 [25065] in doContactSchedd()
04/30/12 01:47:42 [25065] querying for new jobs
04/30/12 01:47:42 [25065] Using constraint ((Owner=?="zhrani"&&JobUniverse==9)) 
&& (Managed =!= "ScheddDone") && (Matched =!= FALSE) && (JobStatus != 5) && 
(Managed =!= "External")
04/30/12 01:47:42 [25065] Fetched 0 new job ads from schedd
04/30/12 01:47:42 [25065] querying for removed/held jobs
04/30/12 01:47:42 [25065] Using constraint ((Owner=?="zhrani"&&JobUniverse==9)) 
&& ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || 
(JobStatus == 5 && Managed =?= "External"))
04/30/12 01:47:42 [25065] Fetched 1 job ads from schedd
04/30/12 01:47:42 [25065] leaving doContactSchedd()
04/30/12 01:47:42 [25065] (1106.0) doEvaluateState called: gmState 
GM_UNSUBMITTED, globusState
04/30/12 01:47:42 [25065] (1106.0) gm state change: GM_UNSUBMITTED -> GM_DELETE
04/30/12 01:47:42 [25065] directory_util::rec_touch_file: Creating directory 
/tmp
04/30/12 01:47:42 [25065] directory_util::rec_touch_file: Creating directory 
/tmp/condorLocks
04/30/12 01:47:42 [25065] directory_util::rec_touch_file: Creating directory 
/tmp/condorLocks/13
04/30/12 01:47:42 [25065] directory_util::rec_touch_file: Creating directory 
/tmp/condorLocks/13/73
04/30/12 01:47:42 [25065] FileLock object is updating timestamp on: 
/tmp/condorLocks/13/73/8341789162039746.lockc
04/30/12 01:47:42 [25065] (1106.0) Writing abort record to user logfile
04/30/12 01:47:42 [25065] FileLock::obtain(1) - @1335746862.880224 lock on 
/tmp/condorLocks/13/73/8341789162039746.lockc now WRITE
04/30/12 01:47:42 [25065] FileLock::obtain(2) - @1335746862.882102 lock on 
/tmp/condorLocks/13/73/8341789162039746.lockc now UNLOCKED
04/30/12 01:47:42 [25065] FileLock::obtain(1) - @1335746862.882247 lock on 
/tmp/condorLocks/13/73/8341789162039746.lockc now WRITE
04/30/12 01:47:42 [25065] directory_util::rec_clean_up: file 
/tmp/condorLocks/13/73/8341789162039746.lockc has been deleted.
04/30/12 01:47:42 [25065] Lock file 
/tmp/condorLocks/13/73/8341789162039746.lockc has been deleted.
04/30/12 01:47:42 [25065] FileLock::obtain(2) - @1335746862.882583 lock on 
/tmp/condorLocks/13/73/8341789162039746.lockc now UNLOCKED
04/30/12 01:47:47 [25065] in doContactSchedd()
04/30/12 01:47:47 [25065] querying for removed/held jobs
04/30/12 01:47:47 [25065] Using constraint ((Owner=?="zhrani"&&JobUniverse==9)) 
&& ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || 
(JobStatus == 5 && Managed =?= "External"))
04/30/12 01:47:47 [25065] Fetched 1 job ads from schedd
04/30/12 01:47:47 [25065] Updating classad values for 1106.0:
04/30/12 01:47:47 [25065]    Managed = "ScheddDone"
04/30/12 01:47:47 [25065] Deleting job 1106.0 from schedd
04/30/12 01:47:47 [25065] GAHP[25071] <- 'UNCACHE_PROXY 1'
04/30/12 01:47:47 [25065] GAHP[25071] -> 'S'
04/30/12 01:47:47 [25065] No jobs left, shutting down
04/30/12 01:47:47 [25065] leaving doContactSchedd()
04/30/12 01:47:47 [25065] Got SIGTERM. Performing graceful shutdown.
04/30/12 01:47:47 [25065] Started timer to call main_shutdown_fast in 1800 
seconds
04/30/12 01:47:47 [25065] **** condor_gridmanager (condor_GRIDMANAGER) pid 
25065 EXITING WITH STATUS 0


Remote Host Log including condor-G submit and globus submit:

TIME: Mon Apr 30 01:46:26 2012
 PID: 6255 -- Notice: 6: globus-gatekeeper pid=6255 starting at Mon Apr 30 
01:46:26 2012

TIME: Mon Apr 30 01:46:26 2012
 PID: 6255 -- Notice: 6: Got connection 10.71.88.93 at Mon Apr 30 01:46:26 2012

GSS authentication failure
GSS Major Status: General failure
GSS Minor Status Error Chain:
globus_gsi_gssapi: Error during delegation: Delegation protocol violation
Failure: GSS failed Major:000d0000 Minor:00000002 Token:00000000

TIME: Mon Apr 30 01:46:26 2012
 PID: 6255 -- Failure: GSS failed Major:000d0000 Minor:00000002 Token:00000000

TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 6: globus-gatekeeper pid=6260 starting at Mon Apr 30 
01:48:46 2012

TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 6: Got connection 10.71.88.93 at Mon Apr 30 01:48:46 2012

TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 5: Authenticated globus user: 
/O=Grid/OU=GlobusTest/OU=simpleCA-head.beng02.com/OU=beng02.com/CN=zahrani
TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=6
TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 5: Requested service: jobmanager
TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 5: Authorized as local user: zhrani
TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 5: Authorized as local uid: 516
TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 5:           and local gid: 516
TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 0: executing 
/usr/local/globus-4.2.0/libexec/globus-job-manager
TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 0: GRID_SECURITY_CONTEXT_FD=9
TIME: Mon Apr 30 01:48:46 2012
 PID: 6260 -- Notice: 0: Child 6261 started
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 6: globus-gatekeeper pid=6275 starting at Mon Apr 30 
01:49:21 2012

TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 6: Got connection 10.71.88.93 at Mon Apr 30 01:49:21 2012

TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 5: Authenticated globus user: 
/O=Grid/OU=GlobusTest/OU=simpleCA-head.beng02.com/OU=beng02.com/CN=zahrani
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=6
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 5: Requested service: jobmanager
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 5: Authorized as local user: zhrani
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 5: Authorized as local uid: 516
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 5:           and local gid: 516
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 0: executing 
/usr/local/globus-4.2.0/libexec/globus-job-manager
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 0: GRID_SECURITY_CONTEXT_FD=9
TIME: Mon Apr 30 01:49:21 2012
 PID: 6275 -- Notice: 0: Child 6276 started


Regards,
