I did a clean install again and have run into exactly the same problem
at step 8.
The /home mount test scrolls through all 31 nodes and fails.
I get no errors whatsoever in the earlier steps.
This time, however, I am asked for the password of the oscartst user as
many times as there are nodes in the subsequent tests.
It looks like a very minor problem, but we don't know how to fix it.
Unfortunately, we are non-professional Linux users and are trying to
set up the cluster for teaching bioinformatics.
1. I installed Fedora Core 4 as 'Workstation'. Should I go for a
custom install instead?
2. I also checked 'network' in the package selection.
3. I selected MPICH instead of LAM.
Could someone please suggest some tests to check whether NFS is working properly?
Should we change to Fedora Core 5 or 6, or to some other distribution?
By the way, how is the whole cluster shut down?
Currently, I ssh into every node and issue halt -p :)

Still trying and haven't lost hope yet,

lutfullah




On 4/24/07, Michael Edwards <[EMAIL PROTECTED]> wrote:
> The home directories should contain the same files as the head node, which
> to start with is probably just a bunch of .whatever files.
>
> I have had problems with the passwordless ssh when I was fiddling around
> with getting LDAP authentication set up, so you may have changed something
> which fixed one problem but is now causing this one.  Adding things to
> hosts.allow should not be necessary for instance, though as far as I know it
> shouldn't cause problems either.
>
> After you ran into the setup problems, did you go back and start with a
> clean OS install again?  It shouldn't take nearly as long the second time :)
>
>
> On 4/24/07, Dr. Lutfullah <[EMAIL PROTECTED]> wrote:
> > /etc/hosts is like this on all nodes:
> >
> > # Do not remove the following line, or various programs
> > # that require network functionality will fail.
> > 127.0.0.1       localhost.localdomain    localhost
> > 192.168.0.100 cc32.kust.edu.pk cc32 oscar_server nfs_oscar pbs_oscar
> >
> > # These entries are managed by SIS, please don't modify them.
> > 192.168.0.1          oscarnode1.kust.edu.pk     oscarnode1
> > 192.168.0.2           oscarnode2.kust.edu.pk     oscarnode2
> > 192.168.0.3          oscarnode3.kust.edu.pk     oscarnode3
> > 192.168.0.4           oscarnode4.kust.edu.pk     oscarnode4
> > 192.168.0.5          oscarnode5.kust.edu.pk     oscarnode5
> > 192.168.0.6          oscarnode6.kust.edu.pk     oscarnode6
> > 192.168.0.7           oscarnode7.kust.edu.pk     oscarnode7
> > 192.168.0.8          oscarnode8.kust.edu.pk     oscarnode8
> > 192.168.0.9           oscarnode9.kust.edu.pk     oscarnode9
> > 192.168.0.10         oscarnode10.kust.edu.pk     oscarnode10
> > --------------------- rest deleted---------------
> >
> > lutfullah
> >
> > On 4/24/07, Michael Edwards <[EMAIL PROTECTED]> wrote:
> > > Try rebooting the head node, waiting until it is completely booted, and
> > > then rebooting all the client nodes.  If everything is set up right, that
> > > might fix the problem.  You can check if it is working by sshing to the
> > > compute node and doing "ls /home" and comparing it to the head node.
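
A rough sketch of that "ls /home" comparison in Python, in case it helps. It is only an illustration: the function names are made up, the node names follow the oscarnodeN pattern from the /etc/hosts listing in this thread, and it assumes Python 3.7 or later on the head node. ssh is run with BatchMode=yes so a node where passwordless ssh is broken is reported as a failure instead of prompting for a password.

```python
import os
import subprocess


def missing_on_node(head_entries, node_entries):
    """Return entries present in the head node's /home but absent on the node."""
    return sorted(set(head_entries) - set(node_entries))


def remote_home_listing(node):
    """List /home on a compute node over ssh (hypothetical helper)."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", node, "ls", "-A", "/home"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def check_cluster(nodes):
    """Map each node to its list of missing /home entries, or None if ssh/ls failed."""
    head = os.listdir("/home")
    report = {}
    for node in nodes:
        try:
            report[node] = missing_on_node(head, remote_home_listing(node))
        except subprocess.CalledProcessError:
            report[node] = None
    return report
```

Run on the head node, something like check_cluster(["oscarnode%d" % n for n in range(1, 32)]) would give an empty list for every node whose /home matches. A node where /home is not mounted will most likely show every head-node entry as missing, since it is listing its own empty local directory.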
> > >
> > > Check /etc/exports and make sure /home is exported to the internal
> > > network, and the fstab in the image and nodes to make sure it is being
> > > mounted.
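
One possible way to script those two checks, as a sketch only. The parsing is deliberately simple (it ignores quoted paths and backslash line continuations, which exports files may use), the function names are hypothetical, and the sample lines in the usage note are illustrative rather than what OSCAR actually writes.

```python
def exported_clients(exports_text, path="/home"):
    """Return the client specs that `path` is exported to in /etc/exports-style text."""
    clients = []
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if fields[0] == path:
            clients.extend(fields[1:])
    return clients


def fstab_mounts(fstab_text, mount_point="/home"):
    """Return (device, fstype) pairs for fstab lines that mount `mount_point`."""
    mounts = []
    for line in fstab_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            mounts.append((fields[0], fields[2]))
    return mounts
```

For example, exported_clients(open("/etc/exports").read()) on the head node should return at least one entry covering the internal 192.168.0.x network, and fstab_mounts(open("/etc/fstab").read()) on a compute node should return an nfs entry whose device points at the head node (for instance via the nfs_oscar alias). An empty list from either one points at the place to look.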
> > >
> > > Another common problem is an incorrect /etc/hosts file; see if it is
> > > something like
> > > 127.0.0.1 localhost.localdomain localhost
> > > 10.0.0.1 oscarmaster.oscardomain oscarmaster nfs_oscar pbs_oscar
> > >
> > > If the cluster hostname is on the same line as localhost, that will
> > > cause this kind of problem because the nodes will be trying to contact
> > > themselves instead of the head node for nfs mounts.
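
The localhost-line problem described here can be checked mechanically. A minimal sketch, assuming the usual /etc/hosts layout of an address followed by names; the function name is made up for illustration, and anything starting with "localhost" is treated as a legitimate loopback name.

```python
def names_on_loopback(hosts_text):
    """Return hostnames other than localhost variants that sit on a loopback line."""
    offenders = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if address.startswith("127.") or address == "::1":
            offenders.extend(n for n in names if not n.startswith("localhost"))
    return offenders
```

Calling names_on_loopback(open("/etc/hosts").read()) on each node should return an empty list. The /etc/hosts file posted earlier in this thread passes this check, since cc32 and its aliases are on the 192.168.0.100 line and not on the 127.0.0.1 line.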
> > >
> > > Not much will work without nfs working unfortunately.
> > >
> > >
> > > On 4/23/07, Dr. Lutfullah <[EMAIL PROTECTED]> wrote:
> > > > Thanks a lot. There are 15 log files. I am attaching two.
> > > >
> > > > lutfullah
> > > >
> > > > On 4/23/07, Michael Edwards < [EMAIL PROTECTED]> wrote:
> > > > > Please send your oscarinstall.log file.
> > > > >
> > > > >
> > > > > On 4/23/07, Dr. Lutfullah < [EMAIL PROTECTED] > wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > > I am using fedora core 4 and trying to install oscar on a cluster
> > > > > > > with 32 HP P4 computers.
> > > > > > > Everything has gone well except for the last step in which I get
> > > > > > > tests FAILED errors.
> > > > > > Something like:
> > > > > > /home mounts 31 nodes FAILED
> > > > > > SSH ping test PASSED
> > > > > > > SSH server -> node FAILED
> > > > > > SSH node -> server FAILED
> > > > > > TORQUE shell test
> > > > > > goes into a loop and produces
> > > > > > Checking for NFS propagation of MPI preferences     not yet
> > > > > > this message then keeps on appearing.
> > > > > > Could anyone please help.
> > > > > > This is our first experience with a cluster.
> > > > > >
> > > > > > lutfullah
> > > > > >
> > > > > >
> > > > >
> > >
> -------------------------------------------------------------------------
> > > > > > This SF.net email is sponsored by DB2 Express
> > > > > > Download DB2 Express C - the FREE version of DB2 express and take
> > > > > > control of your XML. No limits. Just data. Click to get it now.
> > > > > > http://sourceforge.net/powerbar/db2/
> > > > > > _______________________________________________
> > > > > > Oscar-users mailing list
> > > > > > Oscar-users@lists.sourceforge.net
> > > > > >
> > >
> https://lists.sourceforge.net/lists/listinfo/oscar-users
> > > > > >
> > > > >
> > > > >
> > > > >
> > >
> -------------------------------------------------------------------------
> > > > > This SF.net email is sponsored by DB2 Express
> > > > > Download DB2 Express C - the FREE version of DB2 express and take
> > > > > control of your XML. No limits. Just data. Click to get it now.
> > > > > http://sourceforge.net/powerbar/db2/
> > > > > _______________________________________________
> > > > > Oscar-users mailing list
> > > > > Oscar-users@lists.sourceforge.net
> > > > >
> > >
> https://lists.sourceforge.net/lists/listinfo/oscar-users
> > > > >
> > > > >
> > > >
> > > >
> >
> >
> -------------------------------------------------------------------------
> > This SF.net email is sponsored by DB2 Express
> > Download DB2 Express C - the FREE version of DB2 express and take
> > control of your XML. No limits. Just data. Click to get it now.
> > http://sourceforge.net/powerbar/db2/
> > _______________________________________________
> > Oscar-users mailing list
> > Oscar-users@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/oscar-users
> >
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> Oscar-users mailing list
> Oscar-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/oscar-users
>
>
