Thanks Mike for the answer. And BTW, we have a HOWTO related to this: http://gridscheduler.sourceforge.net/howto/multi_intrfcs.html
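As a rough sketch of the host_aliases workaround Mike describes in the quoted message below: each line of that file lists a primary host name followed by its aliases. The helper name make_alias_line is hypothetical, and the example names are simply the two names from Joseph's error message. It has not been verified that this exact line fixes the login node, so treat it as an assumption to test.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): print one host_aliases line.
# Assumed format, per host_aliases(5): primary name first, then its aliases.
make_alias_line() {
    if [ "$#" -lt 2 ]; then
        echo "usage: make_alias_line PRIMARY ALIAS [ALIAS...]" >&2
        return 1
    fi
    # "$*" joins all arguments with single spaces (default IFS).
    printf '%s\n' "$*"
}

# Illustrative names taken from the error message in this thread.
make_alias_line login-1-1.local login-node.xxx.uci.edu
```

Once the printed line looks right, it could be appended to $SGE_ROOT/default/common/host_aliases, the path given in Mike's message.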

Grid Engine is usually quite picky about name resolution. In the past, we have received a few reports related to multi-NIC servers and to 127.0.0.1 & "localhost" resolution issues, and of course we started the work on IPv6 a while ago, so there are a few things in the commlib (aka the Communication Library) that need to be enhanced.

The last time we changed something major in the commlib was in 2005, when Ron & I added poll(2) support for Linux & Solaris to support more than 1024 nodes. Before that, you could not have more than ~1000 nodes on a Linux qmaster, and the workaround was to change a system include file to raise a hard-coded system limit when compiling SGE, which most people did not want to do even if they knew the hack. The poll(2) support was reviewed & enhanced by Christian Reissmann at Sun (now on the Oracle Grid Engine team; interestingly, I sent Christian an email a few days ago, and Andy & Christian are still at Oracle).

In 2009, Ionel emailed the dev list and wanted to add IPv6 support, and we (i.e. Ionel, Christian, and I) exchanged a few emails about it.

Basically, we know the structure of the commlib, and we will get back to it, but for now, just use the method documented by Mike. When we are done with the higher-priority things, we will fix non-critical issues that have known and clean workarounds.

To us, if something works in other mission-critical systems like LSF but doesn't in Grid Engine, then it is a bug. Those are on the list of things that we will add to Open Grid Scheduler/Grid Engine eventually.

Rayson

On Wed, May 9, 2012 at 6:06 PM, Mike Hanby <[email protected]> wrote:
> I have no idea if this is the solution, but we had an issue with Rocks and
> the head node where the daemon wouldn't start properly due to the private
> interface being on eth0. It would spit out a message similar to what you
> posted.
>
> The solution was to create the host_aliases file under default/common:
>
> echo "$(/bin/hostname -s).local $(/bin/hostname --fqdn) $(/bin/hostname -s)" > $SGE_ROOT/default/common/host_aliases
>
> Perhaps something similar needs to be done for the login node since it's
> multihomed.
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Joseph Farran
> Sent: Wednesday, May 09, 2012 4:10 PM
> To: [email protected] Users
> Subject: [gridengine users] Installing OGE on Rocks Login Node
>
> Hello.
>
> I have a cluster running Rocks 5.4.3 that I originally set up with
> Torque/Maui. I am testing Open Grid Scheduler using the ge2011.11.tar
> distribution.
>
> I set up OGE on the master head node and was also able to set up 6 compute
> nodes using "start_gui_installer" on the head node. All 6 compute nodes
> were set up without any issues.
>
> All works except that when I tried to set up our login node, I could not.
> The login node has both private & public network interfaces. I want to set
> up our login node "login-node.xxx.uci.edu" as an Executable and Submit node.
>
> When I try to set up our Rocks login node using the private name of
> login-1-1, it complains with:
>
> The error message was:
> error: commlib error: access denied (client IP resolved to host name
> "login-1-1.local". This is not identical to clients host name
> "login-node.xxx.uci.edu")
> ERROR: unable to contact qmaster using port 6444 on host "headnode.local"
>
> So then I try installing OGE using the public name of
> "login-node.xxx.uci.edu" and it also complains. As soon as I enter
> "login-node.xxx.uci.edu" the state column turns red with "Resolvable" and
> the "Install" GUI button is greyed out so I cannot continue.
>
> It looks like OGE is confused about the actual fully qualified name of our
> login node. The FQDN is "login-node.xxx.uci.edu" but neither name seems to
> work.
>
> What is the correct way to get around this?
>
> Joseph
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
