I'm moving this to gt-user to get more help. gtmanuals-user I think is more for basic web/doc questions.
-Stu

Begin forwarded message:

> From: Yu Huang <[email protected]>
> Date: August 16, 2011 2:38:34 PM CDT
> To: Stuart Martin <[email protected]>
> Cc: [email protected]
> Subject: Re: [gtmanuals-user] restrict GRAM job manager's port to a certain range
>
> Thanks, Stuart. Something strange is going on with the UCLA grid here. The
> gridftp file in the xinetd.d directory has GLOBUS_TCP_PORT_RANGE set to
> 40000,50000, but the gsigatekeeper file does not.
>
> The administrator modified it after I suspected there is a firewall around
> it. My job succeeds if the job contact ID (controlled by the gatekeeper, I
> think) from the server has its port between 40000 and 50000.
>
> But after the administrator modified it and restarted xinetd, ports below
> 40000 or above 50000 still show up. Is restarting xinetd all that's needed,
> or is there something else?
>
> Also, I noticed that after he added GLOBUS_TCP_PORT_RANGE to gsigatekeeper,
> globusrun or globus-job-run just hangs forever after the message "GRAM job
> submission successful". Before, there was no issue with globusrun or
> globus-job-run, just condor-g. Any idea what's going on there?
>
> Thanks,
> yu
>
> On Aug 15, 2011 7:32 AM, "Stuart Martin" <[email protected]> wrote:
> > Hi Yu,
> >
> > Here is information on how to control the ports used by GRAM
> > (gatekeeper/jobmanager)
> >
> > http://dev.globus.org/wiki/FirewallHowTo#Controlling_the_Ephemeral_Port_Range
> >
> > Let me know if this works for you or if you have further troubles.
> >
> > Thanks,
> > -Stu
> >
> > On Aug 12, 2011, at 6:03 PM, Yu Huang wrote:
> >
> >> Hi,
> >>
> >> How do I restrict the job manager on the GRAM server so that the ports in
> >> the job contact URLs stay within a certain range, say 40000 to 50000?
> >>
> >> I'm using condor_g to talk to a server at UCLA,
> >> grid4.hoffman2.idre.ucla.edu, and jobs sometimes fail and sometimes
> >> succeed. The job succeeds when the job contact URL has its port within the
> >> 40000 to 50000 range, for example when the job contact ID is
> >> https://grid4.hoffman2.idre.ucla.edu:41297/16145784839073078036/15589874525209144233/.
> >>
> >> 08/11/11 13:33:44 [25832] Fetched 0 job ads from schedd
> >> 08/11/11 13:33:44 [25832] Updating classad values for 94.0:
> >> 08/11/11 13:33:44 [25832] DelegatedProxyExpiration = 1313134315
> >> 08/11/11 13:33:44 [25832] GridJobId = "gt5 grid4.hoffman2.idre.ucla.edu/jobmanager-fork https://grid4.hoffman2.idre.ucla.edu:41297/16145784839073078036/15589874525209144233/"
> >> 08/11/11 13:33:44 [25832] LastRemoteStatusUpdate = 1313094824
> >> 08/11/11 13:33:44 [25832] leaving doContactSchedd()
> >> 08/11/11 13:33:44 [25832] (94.0) doEvaluateState called: gmState GM_SUBMIT_SAVE, globusState 32
> >> 08/11/11 13:33:44 [25832] (94.0) gm state change: GM_SUBMIT_SAVE -> GM_SUBMIT_COMMIT
> >> 08/11/11 13:33:44 [25832] GAHP[25838] <- 'GRAM_JOB_SIGNAL 6 https://grid4.hoffman2.idre.ucla.edu:41297/16145784839073078036/15589874525209144233/ 5 NULL'
> >> 08/11/11 13:33:44 [25832] GAHP[25838] -> 'S'
> >> 08/11/11 13:33:44 [25832] GAHP[25838] <- 'RESULTS'
> >> 08/11/11 13:33:44 [25832] GAHP[25838] -> 'R'
> >> 08/11/11 13:33:44 [25832] GAHP[25838] -> 'S' '1'
> >> 08/11/11 13:33:44 [25832] GAHP[25838] -> '6' '0' '0' '32'
> >> 08/11/11 13:33:44 [25832] (94.0) doEvaluateState called: gmState GM_SUBMIT_COMMIT, globusState 32
> >> 08/11/11 13:33:44 [25832] (94.0) gm state change: GM_SUBMIT_COMMIT -> GM_SUBMITTED
> >> 08/11/11 13:33:44 [25832] GAHP[25838] <- 'RESULTS'
> >> 08/11/11 13:33:44 [25832] GAHP[25838] -> 'R'
> >> 08/11/11 13:33:44 [25832] GAHP[25838] -> 'S' '1'
> >> 08/11/11 13:33:44 [25832] GAHP[25838] -> '2' 'https://grid4.hoffman2.idre.ucla.edu:41297/16145784839073078036/15589874525209144233/' '2' '0'
> >> 08/11/11 13:33:44 [25832] (94.0) gram callback: state 2, errorcode 0
> >> 08/11/11 13:33:44 [25832] (94.0) doEvaluateState called: gmState GM_SUBMITTED, globusState 32
> >> 08/11/11 13:33:44 [25832] (94.0) globus state change: UNSUBMITTED -> ACTIVE
> >>
> >>
> >> When it fails, it looks like this:
> >>
> >>
> >> 08/12/11 15:57:18 [8911] GAHP[8915] <- 'GRAM_JOB_SIGNAL 15 https://grid4.hoffman2.idre.ucla.edu:34748/16145744163145922676/15589874525209142961/ 5 NULL'
> >> 08/12/11 15:57:18 [8911] GAHP[8915] -> 'S'
> >> 08/12/11 15:57:18 [8911] GAHP[8915] <- 'RESULTS'
> >> 08/12/11 15:57:18 [8911] GAHP[8915] -> 'R'
> >> 08/12/11 15:57:18 [8911] GAHP[8915] -> 'S' '1'
> >> 08/12/11 15:57:18 [8911] GAHP[8915] -> '10' '110' 'https://grid4.hoffman2.idre.ucla.edu:34748/16145745260075977356/15589874525209142961/'
> >> 08/12/11 15:57:18 [8911] (157.0) doEvaluateState called: gmState GM_RESTART, globusState 0
> >> 08/12/11 15:57:18 [8911] (157.0) gm state change: GM_RESTART -> GM_RESTART_SAVE
> >> 08/12/11 15:57:18 [8911] (157.0) gm state change: GM_RESTART_SAVE -> GM_RESTART_COMMIT
> >> 08/12/11 15:57:18 [8911] GAHP[8915] <- 'GRAM_JOB_SIGNAL 16 https://grid4.hoffman2.idre.ucla.edu:34748/16145745260075977356/15589874525209142961/ 5 NULL'
> >> 08/12/11 15:57:18 [8911] GAHP[8915] -> 'S'
> >> 08/12/11 15:57:21 [8911] grid_monitor for grid4.hoffman2.idre.ucla.edu:2119 entering CheckMonitor
> >> 08/12/11 15:57:21 [8911] Disabling grid_monitor for GRAM5 server grid4.hoffman2.idre.ucla.edu:2119
> >> 08/12/11 15:57:21 [8911] GAHP[8915] <- 'RESULTS'
> >> 08/12/11 15:57:21 [8911] GAHP[8915] -> 'R'
> >> 08/12/11 15:57:21 [8911] GAHP[8915] -> 'S' '1'
> >> 08/12/11 15:57:21 [8911] GAHP[8915] -> '15' '79' '0' '0'
> >> 08/12/11 15:57:21 [8911] (154.0) doEvaluateState called: gmState GM_RESTART_COMMIT, globusState 0
> >> 08/12/11 15:57:21 [8911] (154.0) Connection failure (try #1), retrying in 5 secs
> >> 08/12/11 15:57:21 [8911] GAHP[8915] <- 'RESULTS'
> >> 08/12/11 15:57:21 [8911] GAHP[8915] -> 'R'
> >> 08/12/11 15:57:21 [8911] GAHP[8915] -> 'S' '1'
> >> 08/12/11 15:57:21 [8911] GAHP[8915] -> '16' '79' '0' '0'
> >> 08/12/11 15:57:21 [8911] (157.0) doEvaluateState called: gmState GM_RESTART_COMMIT, globusState 0
> >> 08/12/11 15:57:21 [8911] (157.0) Connection failure (try #1), retrying in 5 secs
> >>
> >>
> >> thank you,
> >> yu
> >> --
> >> http://www-scf.usc.edu/~yuhuang/
> >
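
One way to check whether the job manager is honoring the port range is to scan the GridManager log for job contact URLs and flag any whose port falls outside it. The sketch below is not part of Condor or Globus. The function names and the `PORT_RANGE` constant are made up for illustration, the range is simply the 40000,50000 value quoted in the thread, and the regular expression assumes each contact URL appears unbroken on a single log line.

```python
import re

# Assumption: the range quoted in the thread for GLOBUS_TCP_PORT_RANGE.
PORT_RANGE = (40000, 50000)

# Matches the host and port of a job contact URL, for example
# https://grid4.hoffman2.idre.ucla.edu:41297/...
# The gatekeeper address (host:2119, no https:// prefix) is not matched.
CONTACT_RE = re.compile(r"https://([\w.-]+):(\d+)/")


def contact_ports(log_text):
    """Return the set of (host, port) pairs seen in job contact URLs."""
    return {(host, int(port)) for host, port in CONTACT_RE.findall(log_text)}


def out_of_range(log_text, port_range=PORT_RANGE):
    """Return sorted (host, port) pairs whose port is outside port_range."""
    low, high = port_range
    return sorted(
        pair for pair in contact_ports(log_text) if not low <= pair[1] <= high
    )


if __name__ == "__main__":
    sample = (
        "GAHP[25838] <- 'GRAM_JOB_SIGNAL 6 "
        "https://grid4.hoffman2.idre.ucla.edu:41297/1/2/ 5 NULL'\n"
        "GAHP[8915] <- 'GRAM_JOB_SIGNAL 15 "
        "https://grid4.hoffman2.idre.ucla.edu:34748/1/2/ 5 NULL'\n"
    )
    for host, port in out_of_range(sample):
        print(f"{host}: job contact port {port} is outside {PORT_RANGE}")
```

If the list comes back non-empty after the gatekeeper configuration was changed and xinetd restarted, the job managers producing those contacts are probably not seeing the new environment setting.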
