In that case, I'd check for gram_job_mgr* logs in the home
directories on either system, or the globus-gatekeeper.log files
under each $GLOBUS_LOCATION, for more details.
You'll likely turn up the offending https:// url for std{out,err}.
Then you can manually run a globus-gass-server (or even just netcat)
on that port on that machine, and try contacting it yourself.
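The manual check above can be sketched in Python. This is only an illustration, not part of any Globus tool: the helper names (find_https_urls, can_connect, probe) are made up here, and it assumes the log text contains the callback URL in plain https://host:port/... form. It only tests whether a TCP connection opens; it does not speak the GASS protocol.

```python
import re
import socket
from urllib.parse import urlsplit

# Hypothetical helpers for illustration; none of these come from Globus.
URL_RE = re.compile(r"https://[^\s\"'<>]+")

def find_https_urls(log_text):
    """Return the distinct https:// URLs in log_text, in order of appearance."""
    seen = []
    for raw in URL_RE.findall(log_text):
        url = raw.rstrip(".,;)")  # drop trailing punctuation from the log line
        if url not in seen:
            seen.append(url)
    return seen

def can_connect(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False

def probe(url, timeout=5.0):
    """Try a TCP connection to the host and port named in an https:// URL."""
    parts = urlsplit(url)
    port = parts.port or 443  # fall back to the https default if no port given
    return can_connect(parts.hostname, port, timeout)
```

Run probe() from the machine that hosts the job manager against each URL that find_https_urls() turns up in its logs. A False result for the std{out,err} URL points at a connectivity problem between the two nodes.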
Charles
On Sep 5, 2007, at 3:08 PM, leonid glimcher wrote:
Charles,
Thank you for the reply, but there's no firewall running. I'm
trying to submit a job from one node in a cluster to another. I've
gotten it to work with 6 nodes so far, but the seventh is giving me
problems. Is there something else that could be going wrong?
~leo
Charles Bacon wrote:
It's usually a firewall problem. Your client opens a GASS server
on a port and passes that URL to the server. The server then tries
to connect back to it to send stdout/stderr. If a firewall blocks
that connection, you get this error. Set GLOBUS_TCP_PORT_RANGE to
force the client to open a port in a known-good range.
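As a rough sketch of the port-range setting in Python: the helper name env_with_port_range is made up for illustration, the ports 40000-40100 are arbitrary example values, and the comma-separated "min,max" form of the variable is an assumption worth checking against the documentation for your GT version.

```python
import os

def env_with_port_range(low, high, base=None):
    """Copy an environment and set GLOBUS_TCP_PORT_RANGE to 'low,high'.

    Assumes the variable takes the comma-separated "min,max" form.
    """
    if not (0 < low <= high < 65536):
        raise ValueError("port range must satisfy 0 < low <= high < 65536")
    # Copy so the caller's environment (or os.environ) is left untouched.
    env = dict(os.environ if base is None else base)
    env["GLOBUS_TCP_PORT_RANGE"] = f"{low},{high}"
    return env

# Example values only: pick a range your site actually allows.
client_env = env_with_port_range(40000, 40100)
```

Pass client_env as the env= argument of subprocess.run when launching the client command, so only that process sees the restricted range.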
Charles
On Sep 5, 2007, at 2:42 PM, leonid glimcher wrote:
Hi,
I'm trying to run an MPICH-G2 job using GT4.0.4, and here's the
error I'm getting when running "mpirun":
Submission of subjob (label = "subjob 0") failed because the
job manager failed to open stderr (error code 124)
I'm at a loss about what the problem could be. Does anyone have
any ideas?
Thanks in advance,
~leo