Langford, Lester wrote:
Hello all,

I am fairly new to cluster operation, but I have built four small clusters (16 to 48 nodes) for other researchers. Now I'm trying to run HPL on our 48-node dual Opteron 246 cluster. I got the xhpl binary to build, but when I try a test run I get the following error:
a48:/local-io/linux_bench/hpl/bin/Linux_Op_246 # mpirun -np 4 xhpl
/local-io/linux_bench/hpl/bin/Linux_Op_246/xhpl: Command not found.
p0_4305: p4_error: Child process exited while making connection to remote process on ath64: 0
p0_4305: (46.070312) net_send: could not write to fd=4, errno = 32

Is the /local-io filesystem shared amongst all nodes, or is it the
local disk?  The executable needs to be on a shared filesystem where
all nodes can see it at the same path.

This is probably why you got the "Command not found" error.
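
One quick way to confirm this is to ask each node whether it can see the
binary. The following is only a rough sketch, not part of HPL or MPICH: it
assumes passwordless ssh to every node (your cluster may use rsh instead),
and the function name missing_on is made up for illustration. The host
names and path are the ones from the error output above.

```python
import shlex
import subprocess


def missing_on(hosts, path, run=subprocess.run):
    """Return the hosts on which `path` is not an executable file.

    Assumption: each host is reachable with passwordless ssh. A host
    that cannot be reached at all is also reported, because ssh then
    exits with a non-zero status.
    """
    missing = []
    for host in hosts:
        # `test -x` exits 0 only if the path exists and is executable
        # on that host. The path is quoted for the remote shell.
        cmd = ["ssh", host, "test", "-x", shlex.quote(path)]
        if run(cmd).returncode != 0:
            missing.append(host)
    return missing


# Example call (not executed here, since it needs a real cluster):
#   xhpl = "/local-io/linux_bench/hpl/bin/Linux_Op_246/xhpl"
#   for host in missing_on(["a48", "ath64"], xhpl):
#       print(host, "cannot see", xhpl)
```

Any host this reports either does not have the file at that path or could
not be reached, and either case would stop mpirun from starting a process
there.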

Craig



ath64 is a workstation I am not using for this task.

Any and all help on this matter would be greatly appreciated.
Thanks,

Les Langford


Lester Langford
Technology Development & Transfer
NASA Test Operations Group
Jacobs Sverdrup/ERC
Bldg 8306
Stennis Space Center, MS 39529

Email   [EMAIL PROTECTED]
Phone   (228) 688-7221
Fax     (228) 688-1106


------------------------------------------------------------------------

_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
