Hi Shiang-Tai:

Good work!  This may be a bug that we will have to look into.

One question though, assuming that 10.0.1.10 is the address of your eth1
interface, have you always been running:

./install_cluster eth1

?  Have you ever run install_cluster on eth0?
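
In case it helps anyone checking their own head node: the mismatch
Shiang-Tai describes below comes down to which hostname is listed first
on the /etc/hosts line that carries the PBS server name.  Here is a rough
Python sketch of that lookup.  The function name resolve_in_hosts and the
two sample strings are only for illustration (the sample lines are copied
from his message), and it only reads hosts-style text; it is not what
post_install itself does.

```python
def resolve_in_hosts(hosts_text, name):
    """Return (address, primary_name) for the first hosts entry that
    lists `name`, or None if no entry mentions it."""
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line or an address with no names
        address, names = fields[0], fields[1:]
        if name in names:
            # The first name after the address is the primary hostname.
            return address, names[0]
    return None


# Entry that worked, as quoted in the message below.
WORKING = """\
10.0.1.10      abc.ntu.edu.tw  abc oscar_server nfs_oscar pbs_oscar
140.112.xx.xx def.ntu.edu.tw def
"""

# Entry that made post_install fail, as quoted in the message below.
FAILING = """\
10.0.1.10      def.ntu.edu.tw  def oscar_server nfs_oscar pbs_oscar
"""

if __name__ == "__main__":
    print(resolve_in_hosts(WORKING, "pbs_oscar"))
    print(resolve_in_hosts(FAILING, "pbs_oscar"))
```

On a real server you would pass in the contents of /etc/hosts and the
name stored in /var/spool/pbs/server_name, then check that the primary
name returned is the one you expect for the cluster interface.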

Thanks,

Bernard 

> -----Original Message-----
> From: Shiang-Tai Lin [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, January 11, 2005 20:53
> To: Yu Chen
> Cc: Bernard Li; [email protected]
> Subject: Re: [Oscar-users] PBS configuration failure during 
> post_install (OSCAR4+FC2)
> 
> Hi,
> I finally figured out the problem with the Torque post_install
> failure.
> The problem (at least for me) was that the hostname on the cluster
> network did not match the name in the PBSserver file.
> That is, the name of the PBSserver,
> pbs_oscar (defined in /var/spool/pbs/server_name), was not
> associated with the right hostname in /etc/hosts. The entry should
> always be correct if only one network card is installed, but it may
> be set up incorrectly if there are additional network cards on the
> server node. To be clearer, here is my /etc/hosts file
> (the xx are numbers I do not wish to disclose for security reasons):
> 
> # Do not remove the following line, or various programs
> # that require network functionality will fail.
> 10.0.1.10      abc.ntu.edu.tw  abc oscar_server nfs_oscar pbs_oscar
> 140.112.xx.xx def.ntu.edu.tw def
> 
> # These entries are managed by SIS, please don't modify them.
> 10.0.1.1        node1.abc.ntu.edu.tw    node1
> 10.0.1.2        node2.abc.ntu.edu.tw    node2
> 
> Originally I had "10.0.1.10      def.ntu.edu.tw  def oscar_server
> nfs_oscar pbs_oscar" in /etc/hosts,
> so post_install always failed.
> 
> I also want to note that it is not necessary to edit the 
> nodes file /var/spool/pbs/server_priv/nodes. The post_install 
> script can still find the nodes without this file.
> 
> Finally, I'd like to point out that I figured this out after 
> reading the following paragraph I found in 
> http://www.mail-archive.com/[email protected]/msg01387.html
> 
> "If the primary name on the interface is not the name in the 
> PBSserver file, you will get an "Unauthorized Request"
> error when you attempt to configure the server with qmgr."
> 
> 
> Thanks to all, especially Bernard, who tried to help me out.
> 
> Shiang-Tai
> 
> Yu Chen wrote:
> 
> > Hello,
> >
> > I can confirm Shiang-Tai's finding; the same thing happened to me,
> > although on a different system. I am using RH-EL-AS-3 update 3 on i386.
> >
> > The error messages are the same. I thought it was the
> > "/opt/pbs/bin/pbsnodes: Server has no node list" problem, so I created
> > the /var/spool/pbs/server_priv/nodes file manually, restarted pbs_server,
> > and then ran "pbs_postinstall". Now there is no
> > "/opt/pbs/bin/pbsnodes: Server has no node list" message, but there are
> > still tons of "qmgr obj=node2.cl.hhmi.umbc.edu svr=default: Unauthorized
> > Request" messages from each node.
> >
> > This is my first time playing with PBS, so if anyone has any ideas on
> > this, please let me know. Maybe something has to be done on the nodes?
> > BTW, I can ssh to any node without a password.
> >
> > Chen
> >
> >
> > On Fri, 7 Jan 2005, Bernard Li wrote:
> >
> >> Hi Shiang-Tai:
> >>
> >> You should not need to select anything in Step 1, since Torque should
> >> be selected by default.  If you need to select it manually, then
> >> something is wrong.
> >>
> >> Can you run the following commands and paste the output here?  Run
> >> them at least twice:
> >>
> >> % cd /opt/oscar/packages/torque/scripts
> >> % ./post_install
> >>
> >> also:
> >>
> >> % qmgr -c "print server"
> >>
> >> You might also want to check the Torque logs to see what is going on:
> >>
> >> /var/spool/pbs/server_logs/pbs_server.log
> >>
> >> Cheers,
> >>
> >> Bernard
> >>
> >
> >
> > ===========================================
> > Yu Chen
> > Howard Hughes Medical Institute
> > Chemistry Building, Rm 182
> > University of Maryland at Baltimore County
> > 1000 Hilltop Circle
> > Baltimore, MD 21250
> >
> > phone:     (410)455-6347 (primary)
> >     (410)455-2718 (secondary)
> > fax:     (410)455-1174
> > email:     [EMAIL PROTECTED]
> > ===========================================
> >
> 

