In that case, I'd check for gram_job_mgr* logs in the home directories on both systems, or the globus-gatekeeper.log file under each system's $GLOBUS_LOCATION, for more details.
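
A rough Python sketch for scanning those job manager logs for any https:// URLs they mention. It assumes the logs are plain text files whose names start with gram_job_mgr and that they sit directly in the directory you pass in. The function name find_https_urls and the regular expression are illustrative only, not part of any Globus tool.

```python
import glob
import os
import re

# Loose pattern for an https URL; stops at whitespace, quotes, or angle brackets.
URL_RE = re.compile(r"https://[^\s\"'<>]+")


def find_https_urls(log_dir):
    """Return the sorted, de-duplicated https:// URLs found in gram_job_mgr* files."""
    urls = set()
    for path in glob.glob(os.path.join(log_dir, "gram_job_mgr*")):
        if not os.path.isfile(path):
            continue
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with open(path, errors="replace") as fh:
            for line in fh:
                urls.update(URL_RE.findall(line))
    return sorted(urls)


# Example (the home directory is where the logs are expected to be):
#   for url in find_https_urls(os.path.expanduser("~")):
#       print(url)
```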

You'll likely turn up the offending https:// URL for std{out,err}. Then you can manually run a globus-gass-server (or even just netcat) on that port on that machine, and try contacting it yourself from the other node.
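
If netcat isn't handy for the "contact it yourself" half, a plain TCP connect from Python does roughly the same job. This is only a sketch: the function name can_connect is made up for illustration, and the host and port would come from the URL found in the logs. It checks that something accepts a TCP connection on that port, not that a GASS server is answering correctly.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example, run on the node that fails (host and port are placeholders):
#   print(can_connect("submit-node.example.org", 40001))
```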


Charles

On Sep 5, 2007, at 3:08 PM, leonid glimcher wrote:

Charles,

Thank you for the reply, but there's no firewall running. I'm trying to submit a job from one node in a cluster to another. I've gotten it to work with 6 nodes so far, but the seventh is giving me problems. Is there something else that could be going wrong?

~leo

Charles Bacon wrote:
It's usually a firewall problem. Your client opens a GASS server on a port and passes that URL to the server. The server tries to connect to it to send back stdout/err. If a firewall blocks it, you get this error. Set GLOBUS_TCP_PORT_RANGE to force the client to open a port in a known-good range.
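
A minimal sketch of setting that variable for a single command from Python, assuming the usual "min,max" form for GLOBUS_TCP_PORT_RANGE. The range 40000-40100 is an arbitrary example, the helper names are hypothetical, and the mpirun arguments in the comment are placeholders for whatever you normally run.

```python
import os
import subprocess


def env_with_port_range(low, high, base_env=None):
    """Return a copy of the environment with GLOBUS_TCP_PORT_RANGE set to 'low,high'."""
    if not (0 < low <= high <= 65535):
        raise ValueError("port range must satisfy 0 < low <= high <= 65535")
    env = dict(os.environ if base_env is None else base_env)
    env["GLOBUS_TCP_PORT_RANGE"] = f"{low},{high}"
    return env


def run_with_port_range(cmd, low, high, **kwargs):
    """Run cmd with the port range set only for that child process."""
    return subprocess.run(cmd, env=env_with_port_range(low, high), **kwargs)


# Example (arguments are placeholders, not taken from the original report):
#   run_with_port_range(["mpirun", "-np", "2", "./a.out"], 40000, 40100)
```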

Charles

On Sep 5, 2007, at 2:42 PM, leonid glimcher wrote:

Hi,

I'm trying to run an MPICH-G2 job using GT4.0.4, and here's the error I'm getting when running "mpirun":

Submission of subjob (label = "subjob 0") failed because the job manager failed to open stderr (error code 124)

I'm at a loss about what the problem could be. Does anyone have any ideas?

thanks in advance,

~leo



