Bernard Li wrote:
Hi Jim:
As that user, can you ssh to 192.168.0.6?

Yes.

[EMAIL PROTECTED] ~]$ ssh node6
[EMAIL PROTECTED] ~]$ ls -al
The error message "No Password Entry for User tmac0501" sounds fishy...

Strangest thing I ever saw.

  does /etc/passwd, /etc/shadow look okay on that node?

Looks fine.
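
One caveat on the "No Password Entry" message quoted above: the flat files can look fine on a node while a lookup through the name service still fails for that user. I am assuming, without having confirmed it in the pbs source, that the message comes from a failed passwd lookup on the node. A minimal sketch of a check to run on each node, where the helper name has_passwd_entry is made up for illustration:

```python
import pwd


def has_passwd_entry(username, lookup=pwd.getpwnam):
    """Return True if the system passwd database knows this user.

    pwd.getpwnam raises KeyError when no entry is found. The lookup
    parameter is only there so the check can be exercised without
    touching the real passwd database.
    """
    try:
        lookup(username)
    except KeyError:
        return False
    return True
```

Calling has_passwd_entry("tmac0501") on a compute node and getting False would point at the node's name-service setup rather than at pbs itself.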

Regarding my local user test, I was wrong. Using a local user that can ssh to all nodes and can run mpirun without any errors, the pbs-submitted jobs now return the following:
----------
p0_4722: (61.563956) Procgroup:
p0_4722: (61.564118) entry 0: node15.oscardomain 0 0 /admin/localpbs/clib/node-test localpbs
p0_4722: (61.564145) entry 1: rhel4.ehpctc.intern 1 1 /admin/localpbs/clib/node-test localpbs
p0_4722: p4_error: Could not gethostbyname for host rhel4.ehpctc.intern; may be invalid name
: 62
----------
This is repeatable even with a varying number of nodes. I am not sure where it is getting the rhel4.ehpctc.intern entry. The machines.LINUX file is fine, and mpirun works. Could there be a problem in the scheduler?
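
To narrow down the gethostbyname failure, the host names from the Procgroup output can be checked for resolution on the node that ran the job. This is only a rough sketch; the function name unresolvable_hosts is hypothetical, and it says nothing about where the rhel4.ehpctc.intern name comes from.

```python
import socket


def unresolvable_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the host names that fail to resolve, in input order.

    The resolve parameter defaults to socket.gethostbyname and exists
    only so the function can be tested without real name lookups.
    """
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            failed.append(name)
    return failed
```

For example, unresolvable_hosts(["node15.oscardomain", "rhel4.ehpctc.intern"]) run on node15 would list whichever of the two names that node cannot resolve.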


Cheers,
Bernard

P.S. Did you run through the "Test OSCAR Setup" step and all tests passed?

Yes, most of the tests passed. One did fail, either ganglia or some of the pvm tests. I have since configured ganglia and it works fine. Since we aren't planning on using pvm, I disregarded it.


in the queue.  I am seeing the following in the mom_logs on the nodes
involved:
=========================
02/15/2006 15:06:22;0008;   pbs_mom;Job;6.master;No Password Entry for
User tmac0501
02/15/2006 15:10:26;0008;   pbs_mom;Job;6.master;ERROR:    received
request 'ABORT_JOB' from 192.168.0.6:1023 for job '6.master' (job does
not exist locally)
========================

Not sure what I have to configure.  I haven't seen anything in the pbs
docs regarding authentication yet.

The ldap users can ssh to each node and their home directories are mounted.


TIA

--
Jim Summers
School of Computer Science-University of Oklahoma
-------------------------------------------------


_______________________________________________
Oscar-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-users
