A possible workaround is to put a link in rc3.d (S99 or something like
that) to a script that just does a "mount -a".  This was suggested by
a friend I was chatting with about the problem, but it is a total hack
and does nothing to find the root of the problem per se.  Unless it
doesn't work, of course.
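
Something along these lines is what I have in mind.  It is an untested
sketch, not anything that ships with OSCAR: the script name
(nfs-late-mount), the retry count and the delay are all made up, and it
assumes a SysV-style init where /etc/rc3.d/S99nfs-late-mount links to it
and is called with "start".  The retry loop is my addition, in case the
network simply isn't ready the first time.

```shell
#!/bin/sh
# Hypothetical /etc/init.d/nfs-late-mount (name is an assumption),
# meant to be linked as /etc/rc3.d/S99nfs-late-mount.

# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is used up,
# sleeping DELAY seconds between failed attempts.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        # Do not sleep after the final failed attempt.
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

case "${1:-}" in
    start)
        # Mount only the NFS entries from /etc/fstab.
        # 5 tries, 10 seconds apart: guessed values, tune as needed.
        retry 5 10 mount -a -t nfs
        ;;
esac
```

If the mounts only succeed on a later attempt, that at least points at
timing (link negotiation, for example) rather than at fstab or exports.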

A routing problem was the only other suggestion anyone I talked to
came up with.  I know nothing about troubleshooting managed routers
or switches though.

Does the boot log ("dmesg") on the client show anything unusual when
the client attempts to connect?
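
To cut the kernel log down to the interesting lines, a small filter
like the one below might help.  The function name and the pattern are
just my guesses at what is worth looking for (NFS, RPC, portmap, and
NIC link messages); adjust the pattern to whatever your driver prints.

```shell
# Keep only log lines that mention NFS, RPC, portmap, or the NIC link.
# Reads from stdin, so it works on live dmesg output or a saved log.
nfs_boot_lines() {
    grep -i -E 'nfs|rpc|portmap|link (is )?(up|down)|eth[0-9]'
}
```

On a compute node, run it right after boot as: dmesg | nfs_boot_lines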

Something to try "just for fun" might be to turn off pfilter and any
sort of firewall/SELinux on all the machines and try again.  I have
had some strange behavior on occasion from firewall setups.  Also, if
you have more than one switch, make sure nothing is being blocked
between the switches.

On 1/31/06, ANDY SIU <[EMAIL PROTECTED]> wrote:
> Thank you for your suggestion.
> All the nodes have intel on board gigabit NICs and are connected to a
> gigabit switch(Dell PowerConnect 5324). Anyway, this problem is not very
> serious because I still can mount the NFS manually by cexec.
> Andy
>
> Frank Crawford <[EMAIL PROTECTED]> wrote:
> Andy,
> Going back a bit, I have sometimes had this problem when the NIC was
> autonegotiating and it took a while. Do you have gigabit NICs and a
> 100Mbit switch or something like that?
>
> The only way I could get around it was to put in a delay of about 30-40
> sec after the network started and before anything tried to use it.
>
> Frank
>
> On Tue, 2006-01-31 at 18:39 -0500, ANDY SIU wrote:
> > Hi Michael,
> > Yes, the head node is booted completely before the compute nodes. I
> > have also tried to reboot one compute node while the rest of the
> > cluster is running, so the /home on the head node was already in use
> > by the other compute nodes. However, the rebooted compute node still
> > could not mount /home and /nfstmp unless I manually executed
> > "mount -a".
> > Thanks!
> > Andy
> >
> > Michael Edwards wrote:
> > The head node is booted completely before the nodes boot? That's what
> > the problem sounds like. If it was a problem with the fstab, "cexec
> > mount -a" wouldn't work...
> >
> > On 1/30/06, ANDY SIU wrote:
> > > Hi,
> > > Does anybody know why the compute nodes cannot mount NFS file
> > > systems during boot up?
> > > My cluster is Intel P4 EM64T running FC4 x86_64. I am using
> > > OSCAR-4.2.1. The compute nodes are booted after the headnode is
> > > booted up. The error is something like "pbs_oscar NFS server is
> > > down". I have to manually mount the NFS on each compute node by
> > > "cexec mount -a".
> > >
> > > The /etc/fstab file on the compute nodes contains:
> > > /dev/sda6 / ext3 defaults 1 2
> > > /dev/sda5 swap swap defaults 0 0
> > > /dev/sda1 /boot ext3 defaults 1 2
> > > /dev/fd0 /mnt/floppy auto noauto,owner 0 0
> > > none /dev/pts devpts defaults 0 0
> > > none /proc proc defaults 0 0
> > > nfs_oscar:/nfstmp /nfstmp nfs rw 0 0
> > > nfs_oscar:/home /home nfs rw 0 0
> > >
> > > The /etc/exports file on the headnode contains:
> > > /home 192.168.1.1/255.255.0.0(async,rw,no_root_squash)
> > > /nfstmp 192.168.1.1/255.255.0.0(async,rw,no_root_squash)
> > >
> > > Thank you
> > > Andy
> >
> >
> --
> ac3
> Suite G16, Bay 7, Locomotive Workshop Phone: 02 9209 4600
> Australian Technology Park Fax: 02 9209 4611
> Eveleigh NSW 1430
>
> This email is for intended recipients only. If you are not an
> intended recipient, you must not disclose, copy or distribute this
> email to others or act upon the information contained in this email
> in any way. This email may contain privileged or confidential
> information. If you have received this email in error, please
> notify the sender and then delete it from your system.
>
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
> for problems? Stop! Download the new AJAX search engine that makes
> searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
> _______________________________________________
> Oscar-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/oscar-users
>
>
>


