On Jul 27, 2010, at 9:13 AM, Carlos Borrego Iglesias wrote:

> Hello,
> I am trying to define a new globus-job-manager called jobmanager-mmaa:
> 
> when I do a globus-job-run I get the next message:
> 
> #globus-job-run myce.pic.es/jobmanager-mmaa /bin/hostname
> GRAM Job submission failed because the job manager detected an invalid script response (error code 24)
> 
> In the gatekeeper log I see no errors:
> 

Does this mean you are creating a new LRM script?

If so, the first step of debugging is to make sure it runs with

perl -I$GLOBUS_LOCATION/lib/perl $GLOBUS_LOCATION/lib/perl/Globus/GRAM/JobManager/mmaa.pm

(assuming mmaa.pm is your module name)
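
As a rough sketch of that first check, the shell functions below wrap the same perl command and report a missing or unreadable module before perl is invoked. The function names lrm_module_path and check_lrm_module are hypothetical helpers invented for illustration; only the perl command line and the module layout come from the text above, and GLOBUS_LOCATION is assumed to be set in your environment.

```shell
# Hypothetical helper: path of the Perl module for a given LRM name,
# following the directory layout quoted above.
lrm_module_path() {
    printf '%s/lib/perl/Globus/GRAM/JobManager/%s.pm\n' "$GLOBUS_LOCATION" "$1"
}

# Hypothetical helper: load the module with the same perl command as above,
# but report a missing or unreadable file first.
check_lrm_module() {
    module=$(lrm_module_path "$1")
    if [ ! -r "$module" ]; then
        echo "cannot read $module" >&2
        return 1
    fi
    # Any syntax or load error is printed by perl itself.
    perl -I"$GLOBUS_LOCATION/lib/perl" "$module"
}
```

Running check_lrm_module mmaa should then either print perl's own error messages or return quietly if the module loads.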

After that, if you are using 5.0.2, you can add the RSL relation
save_job_description = yes to a job submission. This leaves a Perl file in
your home directory that you can run with the job manager script. The file is
called gram_$unique.pl, where $unique is a string of characters unique to each
job. Pass it to the script to see what's going on:

$GLOBUS_LOCATION/libexec/globus-job-manager-script.pl -m mmaa -f ~/gram_UNIQUE.pl -c submit
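
Because the saved file name differs for every job, it can help to pick the newest one automatically. The following is only a sketch under stated assumptions: latest_job_description and job_manager_command are hypothetical helper names, the saved files are assumed to match gram_*.pl in the directory you pass in (your home directory by default), and only the -m, -f and -c options shown above are used.

```shell
# Hypothetical helper: print the most recently modified gram_*.pl file in a
# directory (defaults to $HOME); returns non-zero if there is none.
latest_job_description() {
    dir="${1:-$HOME}"
    # ls -t sorts by modification time, newest first.
    newest=$(ls -t "$dir"/gram_*.pl 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}

# Hypothetical helper: print the job manager script command for an LRM module
# name ($1), a saved job description file ($2) and an action ($3).
# Assumes GLOBUS_LOCATION is set, as in the command above.
job_manager_command() {
    printf '%s/libexec/globus-job-manager-script.pl -m %s -f %s -c %s\n' \
        "$GLOBUS_LOCATION" "$1" "$2" "$3"
}
```

For example, job_manager_command mmaa "$(latest_job_description)" submit prints the full command line so you can inspect it before running it by hand.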


> Successfull mapping done
> Mapping service "LCMAPS" returned local user "atlas007"
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=9
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 5: Requested service: jobmanager-mmaa 
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 5: Authorized as local user: atlas007
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 5: Authorized as local uid: 31057
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 5:           and local gid: 1307
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 5: "/DC=es/DC=irisgrid/O=ifae/CN=carlos.borrego" 
> mapped to atlas007 (31057/1307)
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 0: executing /opt/globus//libexec/globus-job-manager
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 0: GATEKEEPER_JM_ID 
> 2010-07-27.15:08:22.0000026517.0000000000 for 
> /DC=es/DC=irisgrid/O=ifae/CN=carlos.borrego on 193.109.175.133
> JMA 2010/07/27 15:08:22 GATEKEEPER_JM_ID 
> 2010-07-27.15:08:22.0000026517.0000000000 has EDG_WL_JOBID ''
> GATEKEEPER_DGAS_FD=8 
> (/opt/edg/var/gatekeeper/jobs/2010-07-27.15:08:22.0000014805.0000000014)
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 0: GRID_SECURITY_CONTEXT_FD=12
> TIME: Tue Jul 27 15:08:22 2010
>  PID: 26517 -- Notice: 0: Child 26519 started
> JMA 2010/07/27 15:08:22 GATEKEEPER_JM_ID 
> 2010-07-27.15:08:22.0000026517.0000000000 JM exiting
> 
> In the gram job manager log file from the user which is mapped to I get:
> 
> [atlas...@myce ~]$ tail gram_job_mgr_22248.log
> 7/27 14:33:07 JMI: completed script validation: job manager type is mmaa.
> 7/27 14:33:07 JMI: cmd = cache_cleanup
> 7/27 14:33:07 Job Manager State Machine (entering): 
> GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED_CACHE_CLEAN_UP
> 7/27 14:33:07 Job Manager State Machine (entering): 
> GLOBUS_GRAM_JOB_MANAGER_STATE_EARLY_FAILED_RESPONSE
> 7/27 14:33:07 JM: before sending to client: rc=0 (Success)
> 7/27 14:33:07 Job Manager State Machine (exiting): 
> GLOBUS_GRAM_JOB_MANAGER_STATE_FAILED_DONE
> 7/27 14:33:07 JM: in globus_gram_job_manager_reporting_file_remove()
> 7/27 14:33:07 Job Manager State Machine (entering): 
> GLOBUS_GRAM_JOB_MANAGER_STATE_FAILED_DONE
> 7/27 14:33:07 JM: in globus_gram_job_manager_reporting_file_remove()
> 7/27 14:33:07 JM: exiting globus_gram_job_manager.
> 
> Any ideas where can I get more debug info?
> Thanks so much in advance
> Carlos
> 
> -- 
> =============================
> Carlos Borrego Iglesias
> [email protected]       
> IFAE Institut de Física d'Altes Energies
> Campus UAB Edifici Cn. Facultat Ciències
> E-08193 Bellaterra
> tel: +34 93 581 2822            
> ============================= 
