On 09.07.2011 at 00:36, Carlos Scaloni wrote:

> Next things to check:
> 
> - enough free space on the granted virtual disk space?
> 61% is free
> 
> - no SELinux enabled?
> cat /etc/sysconfig/selinux
> SELINUX=enforcing

Changing this line to

SELINUX=disabled

should turn it off after a reboot of the VM.
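
As a rough sketch of that change: the two helper functions below are hypothetical names, not part of SELinux or SGE. They read and rewrite the `SELINUX=` line in a config file, defaulting to `/etc/selinux/config`, which is an assumption about where your distribution keeps it (the `/etc/sysconfig/selinux` path shown above can be passed instead). The rewrite goes through a backup copy rather than an in-place edit.

```shell
# Print the mode set by the SELINUX= line of a config file.
# The default path is an assumption; pass another file as the first argument.
selinux_configured_mode() {
    conf="${1:-/etc/selinux/config}"
    # "^SELINUX=" does not match the SELINUXTYPE= line.
    sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "$conf" | tail -n 1
}

# Set SELINUX=disabled in the given file, keeping the old version as FILE.bak.
selinux_disable_in_config() {
    conf="${1:-/etc/selinux/config}"
    cp "$conf" "$conf.bak" &&
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$conf.bak" > "$conf"
}
```

Run as root, `selinux_disable_in_config /etc/sysconfig/selinux` followed by a reboot would apply it; `getenforce` reports the mode that is currently active.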

-- Reuti


> SELINUXTYPE=targeted
> 
> 
> 
> - what is the configuration of your virtual switch - is it set to bridge mode 
> or NAT? Both virtual interfaces have a virtual switch on their own?
> eth0 NAT
> eth1 host-only adapter (Host-only networking)
> 2011/7/8 Reuti <[email protected]>
> On 08.07.2011 at 20:39, Carlos Scaloni wrote:
> 
> > I ran the source command! Successful, without errors.
> >
> > I don't have any firewall and there are no relevant messages in /var/log
> >
> > But...
> >
> > /etc/init.d/sgemaster.p6444
> >    starting sge_qmaster
> >
> > sge_qmaster start problem
> >
> > sge_qmaster didn't start!
> >
> >
> > cat /tmp/sge_messages
> > 07/08/2011 20:36:11|  main|proyecto-192|C|abort qmaster startup due to communication errors
> >
> > :(
> 
> Next things to check:
> 
> - enough free space on the granted virtual disk space?
> - no SELinux enabled?
> - what is the configuration of your virtual switch - is it set to bridge mode 
> or NAT? Both virtual interfaces have a virtual switch on their own?
> 
> -- Reuti
> 
> 
> > 2011/7/8 Reuti <[email protected]>
> > On 08.07.2011 at 19:32, Carlos Scaloni wrote:
> >
> > > Thanks again, crack! I have several users. About settings.sh: what do I 
> > > have to do with this file? These are the contents of the file:
> >
> > Source it, in bash by:
> >
> > . /usr/global/sge-6.2u5-bin/default/common/settings.sh
> >
> > (note the space after the dot) or
> >
> > source /usr/global/sge-6.2u5-bin/default/common/settings.sh
> >
> > (check `man bash` for "source filename"). As I don't know where this can be 
> > set up on a global level in your particular distribution, you will have to 
> > investigate that by other means.
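
One possible shape for such a global setup, as a sketch only: many distributions source every `*.sh` file in `/etc/profile.d/` for login shells, so a small file there could load the SGE environment. Whether your distribution does this is an assumption to verify, and `load_sge_settings` is a hypothetical helper name.

```shell
# Source the SGE settings file if it is readable; complain otherwise.
# The default path is the one used in this thread.
load_sge_settings() {
    settings="${1:-/usr/global/sge-6.2u5-bin/default/common/settings.sh}"
    if [ -r "$settings" ]; then
        . "$settings"
    else
        echo "SGE settings file not readable: $settings" >&2
        return 1
    fi
}

# In a profile script this would be followed by a call such as:
#   load_sge_settings && echo "SGE_ROOT is $SGE_ROOT"
```

Guarding with `-r` keeps logins working on machines where SGE is not installed or the path has changed.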
> >
> >
> > > cat /usr/global/sge-6.2u5-bin/default/common/settings.sh
> > > SGE_ROOT=/usr/global/sge-6.2u5-bin; export SGE_ROOT
> > >
> > > ARCH=`$SGE_ROOT/util/arch`
> > > DEFAULTMANPATH=`$SGE_ROOT/util/arch -m`
> > > MANTYPE=`$SGE_ROOT/util/arch -mt`
> > >
> > > SGE_CELL=default; export SGE_CELL
> > > SGE_CLUSTER_NAME=p6444; export SGE_CLUSTER_NAME
> > > SGE_QMASTER_PORT=6444; export SGE_QMASTER_PORT
> > > SGE_EXECD_PORT=6445; export SGE_EXECD_PORT
> > > .... etc.
> > >
> > > I changed the hostname myself when I tried to fix the problems...
> > >
> > > cat /usr/global/sge-6.2u5-bin/default/common/act_qmaster
> > > proyecto-192.local
> >
> > Fine, so it's just running on the second interface. Did you set up any 
> > firewall which blocks the traffic? You can also check the various logfiles 
> > in /var/log to spot any error message pointing to it.
> >
> > -- Reuti
> >
> >
> > > 2011/7/8 Reuti <[email protected]>
> > > On 08.07.2011 at 16:43, Carlos Scaloni wrote:
> > >
> > > > Firstly, "echo $SGE_ROOT" doesn't show anything... This environment 
> > > > variable doesn't exist
> > > >
> > > > But I went to: /usr/global/sge-6.2u5-bin/utilbin/lx24-amd64
> > > >
> > > > I ran the following commands:
> > > >
> > > > ./gethostname -all
> > > > critical error: Please set the environment variable SGE_ROOT.
> > >
> > > To have access to all SGE commands, it's necessary to source 
> > > /usr/global/sge-6.2u5-bin/default/common/settings.sh or similar, 
> > > depending on the location where you installed it. Best is something like 
> > > a global profile for all users (or just your own if you are the only 
> > > user).
> > >
> > >
> > > > hostname
> > > > proyecto-192.local
> > >
> > > This could be the problem: it's accessing the qmaster on the wrong 
> > > interface. Did you define the hostname by hand, or was it chosen 
> > > automatically during installation of the OS?
> > >
> > > What's in the file:
> > >
> > > /usr/global/sge-6.2u5-bin/default/common/act_qmaster
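
A quick way to compare the two values, sketched under assumptions: `check_act_qmaster` is a hypothetical helper, the fallback paths are the ones from this thread, and a plain string comparison is used, so a short name versus a fully qualified name would be reported as a mismatch even if both resolve to the same host.

```shell
# Compare the master host recorded in act_qmaster with a host name.
# Arg 1: act_qmaster file (default built from SGE_ROOT and SGE_CELL).
# Arg 2: name to compare against (default: output of `hostname`).
check_act_qmaster() {
    sge_root="${SGE_ROOT:-/usr/global/sge-6.2u5-bin}"
    sge_cell="${SGE_CELL:-default}"
    act_file="${1:-$sge_root/$sge_cell/common/act_qmaster}"
    local_name="${2:-$(hostname)}"
    master_name=$(head -n 1 "$act_file") || return 2
    if [ "$master_name" = "$local_name" ]; then
        echo "match: $master_name"
    else
        echo "mismatch: act_qmaster=$master_name hostname=$local_name"
        return 1
    fi
}
```

Called without arguments it uses the current environment; a mismatch is only a hint about where to look next.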
> > >
> > > -- Reuti
> > >
> > >
> > > > 2011/7/8 Reuti <[email protected]>
> > > > On 07.07.2011 at 19:44, Carlos Scaloni wrote:
> > > >
> > > > > It's a virtual machine. I want to install the qmaster and the execd 
> > > > > on the same machine. 192.168.56.101 is connected with the outside 
> > > > > world!
> > > > >
> > > > > ifconfig
> > > > > eth0      Link encap:Ethernet  HWaddr 08:00:27:F3:80:43
> > > > >           inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
> > > > >           inet6 addr: fe80::a00:27ff:fef3:8043/64 Scope:Link
> > > > >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> > > > >           RX packets:1635 errors:0 dropped:0 overruns:0 frame:0
> > > > >           TX packets:1506 errors:0 dropped:0 overruns:0 carrier:0
> > > > >           collisions:0 txqueuelen:1000
> > > > >           RX bytes:1390858 (1.3 MiB)  TX bytes:104832 (102.3 KiB)
> > > >
> > > > Okay, so this is the main interface and the one which gives the name 
> > > > under which the qmaster can be accessed. But below it's trying to 
> > > > access it under "|  main|proyecto-192|C|". There are some tools in 
> > > > $SGE_ROOT/utilbin/lx24-amd64:
> > > >
> > > > $ ./gethostname -all
> > > >
> > > > and then:
> > > >
> > > > $ ./gethostbyname <name>
> > > > $ ./gethostbyaddr <addr>
> > > >
> > > > with the names and addresses it reports. Is there any firewall 
> > > > installed which blocks traffic on the machine itself? It needs to be 
> > > > resolved why it is running under eth1. There is a file to map 
> > > > hostnames to interfaces and make an alias to them, but I think in your 
> > > > case we have to look elsewhere, as you want to run it on the main 
> > > > interface. A plain:
> > > >
> > > > $ hostname
> > > >
> > > > gives you the name from eth0?
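
To see which interface owns a given address, the ifconfig listing quoted in this mail can be filtered. This is only a sketch: `iface_for_addr` is a hypothetical helper, and it assumes the older ifconfig layout shown in this thread (`inet addr:10.0.2.15`), which newer versions of the tool no longer print.

```shell
# Read ifconfig-style output on stdin and print the interface
# that holds the IPv4 address given as the first argument.
iface_for_addr() {
    awk -v addr="$1" '
        /^[A-Za-z0-9]/ { iface = $1 }    # unindented line starts an interface block
        $1 == "inet" && $2 == ("addr:" addr) { print iface }
    '
}
```

For example, `ifconfig | iface_for_addr 192.168.56.101` prints the interface carrying that address, which can then be compared with what `hostname` resolves to.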
> > > >
> > > > -- Reuti
> > > >
> > > >
> > > > > eth1      Link encap:Ethernet  HWaddr 08:00:27:14:C4:0C
> > > > >           inet addr:192.168.56.101  Bcast:192.168.56.255  Mask:255.255.255.0
> > > > >           inet6 addr: fe80::a00:27ff:fe14:c40c/64 Scope:Link
> > > > >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> > > > >           RX packets:1805 errors:0 dropped:0 overruns:0 frame:0
> > > > >           TX packets:1152 errors:0 dropped:0 overruns:0 carrier:0
> > > > >           collisions:0 txqueuelen:1000
> > > > >           RX bytes:162007 (158.2 KiB)  TX bytes:186807 (182.4 KiB)
> > > > >
> > > > > lo        Link encap:Local Loopback
> > > > >           inet addr:127.0.0.1  Mask:255.0.0.0
> > > > >           inet6 addr: ::1/128 Scope:Host
> > > > >           UP LOOPBACK RUNNING  MTU:16436  Metric:1
> > > > >           RX packets:92 errors:0 dropped:0 overruns:0 frame:0
> > > > >           TX packets:92 errors:0 dropped:0 overruns:0 carrier:0
> > > > >           collisions:0 txqueuelen:0
> > > > >           RX bytes:4600 (4.4 KiB)  TX bytes:4600 (4.4 KiB)
> > > > >
> > > > >
> > > > > 2011/7/7 Reuti <[email protected]>
> > > > > On 07.07.2011 at 19:27, Carlos Scaloni wrote:
> > > > >
> > > > > > Hi, thanks for answering!
> > > > > >
> > > > > > I have a file in /tmp called sge_messages with this content:
> > > > > >
> > > > > > 07/07/2011 18:59:41|  main|proyecto-192|C|abort qmaster startup due to communication errors
> > > > >
> > > > > Well, you listed the complete /etc/hosts, i.e. there is no 127.0.0.2 
> > > > > entry in it (having no such entry is good)?
> > > > >
> > > > > What is the primary interface in the master node? As I see two 
> > > > > entries I assume you have at least two network interfaces, and one of 
> > > > > them is connected to the outside world, the other to the nodes. Maybe 
> > > > > it's addressing the cluster on the wrong one.
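
Both checks can be scripted against a hosts-format file. The sketch below uses two hypothetical helpers: one lists the addresses a name is mapped to, the other reports whether any of them is a loopback address (127.x.x.x or ::1). It only reads the file and does not consult DNS or any other name service.

```shell
# Print every address that a hosts-format file assigns to the given name.
# Arg 1: host name, arg 2: hosts file (default /etc/hosts).
hosts_addresses_for() {
    awk -v name="$1" '
        { sub(/#.*/, "") }    # strip comments
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "${2:-/etc/hosts}"
}

# Succeed only if the name is not mapped to a loopback address.
name_avoids_loopback() {
    if hosts_addresses_for "$1" "${2:-/etc/hosts}" | grep -Eq '^(127\.|::1$)'; then
        return 1
    fi
    return 0
}
```

With the /etc/hosts quoted further down, `hosts_addresses_for proyecto-192.local` would list 192.168.56.101, i.e. the address on eth1.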
> > > > >
> > > > > -- Reuti
> > > > >
> > > > >
> > > > > > 2011/7/7 Reuti <[email protected]>
> > > > > > Hi,
> > > > > >
> > > > > > On 07.07.2011 at 19:11, Carlos Scaloni wrote:
> > > > > >
> > > > > > > Hi friends! I can't install SGE, I need your help, please. Thanks 
> > > > > > > a lot in advance!
> > > > > > >
> > > > > > > Options I chose:
> > > > > > >
> > > > > > > admin user is sgeadmin
> > > > > > > set network ports with environment
> > > > > > > sge_qmaster port 6444
> > > > > > > sge_execd port 6445
> > > > > > > say no to pkgadd and verify permissions
> > > > > > > classic spooling, not berkeley db
> > > > > > > gid range 20000-21000
> > > > > > > enter list of execution hosts node01 thru node##
> > > > > >
> > > > > > Did you get any error log in /tmp?
> > > > > >
> > > > > > -- Reuti
> > > > > >
> > > > > >
> > > > > > > Error:
> > > > > > >
> > > > > > > Grid Engine qmaster startup
> > > > > > > ---------------------------
> > > > > > >
> > > > > > > Starting qmaster daemon. Please wait ...
> > > > > > >    starting sge_qmaster
> > > > > > >
> > > > > > > sge_qmaster start problem
> > > > > > >
> > > > > > > sge_qmaster didn't start!
> > > > > > > sge_qmaster start problem
> > > > > > >
> > > > > > > cat /etc/hosts
> > > > > > >
> > > > > > > 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
> > > > > > > ::1         localhost
> > > > > > > 192.168.56.101 proyecto-192.local
> > > > > > > 10.0.2.15 proyecto-10.local
> > > > > > >
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > users mailing list
> > > > > > > [email protected]
> > > > > > > https://gridengine.org/mailman/listinfo/users