Re: [gridengine users] Having issues getting sun grid engine running on new frontend

2019-07-25 Thread Reuti
Hi,

On 25.07.2019 at 16:44, Pat Haley wrote:

> 
> Hi All,
> 
> We have been trying to install Rocks 7 on a new frontend machine, using a 
> restore roll from our old front-end (running Rocks 6.2) to bring over our 
> users, groups and various customizations (more details are available in 
> https://marc.info/?l=npaci-rocks-discussion&m=154514980222760&w=2 ).  Our 
> latest issue is that the Sun Grid Engine service does not start.
> 
>  systemctl status -l sgemaster.mseas
> ● sgemaster.mseas.service - LSB: start Grid Engine qmaster, shadowd
>    Loaded: loaded (/etc/rc.d/init.d/sgemaster.mseas; bad; vendor preset: disabled)
>    Active: failed (Result: exit-code) since Fri 2019-07-19 12:26:46 EDT; 32min ago
>      Docs: man:systemd-sysv-generator(8)
>   Process: 355124 ExecStart=/etc/rc.d/init.d/sgemaster.mseas start (code=exited, status=1/FAILURE)
> 
> Jul 19 12:25:44 mseas.mit.edu systemd[1]: Starting LSB: start Grid Engine qmaster, shadowd...
> Jul 19 12:25:45 mseas.mit.edu sgemaster.mseas[355124]: Starting Grid Engine qmaster
> Jul 19 12:26:46 mseas.mit.edu sgemaster.mseas[355124]: sge_qmaster start problem
> Jul 19 12:26:46 mseas.mit.edu sgemaster.mseas[355124]: sge_qmaster didn't start!
> Jul 19 12:26:46 mseas.mit.edu systemd[1]: sgemaster.mseas.service: control process exited, code=exited status=1
> Jul 19 12:26:46 mseas.mit.edu systemd[1]: Failed to start LSB: start Grid Engine qmaster, shadowd.
> Jul 19 12:26:46 mseas.mit.edu systemd[1]: Unit sgemaster.mseas.service entered failed state.
> Jul 19 12:26:46 mseas.mit.edu systemd[1]: sgemaster.mseas.service failed.
> 
> 
> in poking around, we see 2 entries for sge in /etc/passwd on the new system
> 
> grep -in sge /etc/passwd
> 44:sge:x:990:985:GridEngine  System account:/opt/gridengine:/bin/true
> 64:sge:x:400:400:GridEngine:/opt/gridengine:/bin/true

It's definitely wrong to have two entries for one and the same account. First 
remove the first one, which also points to an unknown group. Do you have a group 
with ID 985?
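
Both points can be checked mechanically. The following is a minimal sketch, assuming the usual colon-separated layout of /etc/passwd and /etc/group. The helper names dup_users and missing_gids are made up for this example and are not part of Grid Engine or Rocks; the file arguments are optional and default to the system files.

```shell
#!/bin/sh
# Sketch only: dup_users and missing_gids are hypothetical helper names.

# Print every user name that occurs more than once in a passwd-format file.
dup_users() {
    awk -F: '{ n[$1]++ } END { for (u in n) if (n[u] > 1) print u }' \
        "${1:-/etc/passwd}" | sort
}

# Print "user gid" for each passwd entry whose primary GID has no
# matching entry in the group-format file.
# Arguments: passwd file, then group file.
missing_gids() {
    awk -F: 'FILENAME == ARGV[1] { g[$3] = 1; next }
             !($4 in g)          { print $1, $4 }' \
        "${2:-/etc/group}" "${1:-/etc/passwd}"
}
```

Calling `dup_users` and `missing_gids` without arguments on the new frontend would list any duplicated account names and any account whose primary group ID is not defined.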

Then: are the files in /opt/gridengine owned by this (leftover) user?
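
One way to answer this is to search by numeric UID rather than by name, since the name "sge" is ambiguous while both entries exist. A small sketch follows; the function names are invented for illustration, and the UIDs 990 and 400 in the usage note are taken from the grep output quoted above.

```shell
#!/bin/sh
# Sketch only: owned_by_uid and not_owned_by_uid are hypothetical helpers.

# List everything under directory $1 owned by numeric UID $2.
# find's -user test accepts a numeric ID when no account of that name exists.
owned_by_uid() {
    find "$1" -user "$2" -print
}

# List everything under directory $1 NOT owned by numeric UID $2.
not_owned_by_uid() {
    find "$1" ! -user "$2" -print
}
```

For example, `owned_by_uid /opt/gridengine 990` would show files belonging to the leftover account, and `owned_by_uid /opt/gridengine 400` those belonging to the account carried over from the old frontend.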

But some files inside need to be owned by root with the setuid bit set:

$ find . -perm /u+s
./utilbin/lx24-amd64/testsuidroot
./utilbin/lx24-amd64/rlogin
./utilbin/lx24-amd64/rsh
./utilbin/lx24-amd64/authuser
./bin/lx24-amd64/sgepasswd

There is the script /opt/sge/util/setfileperm.sh to correct this.
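
Before or after running that script, the state of the setuid binaries can be inspected with a read-only check. The sketch below assumes that the binaries listed by the find command above are meant to be owned by root (UID 0); the function names are hypothetical, and `-perm -4000` is used as a portable equivalent of `-perm /u+s` for this single bit.

```shell
#!/bin/sh
# Sketch only: setuid_files and setuid_wrong_owner are hypothetical helpers.
# Neither function changes anything on disk.

# List regular files under directory $1 that have the setuid bit set.
setuid_files() {
    find "$1" -type f -perm -4000 -print
}

# List setuid files under directory $1 whose owner is not UID $2
# (default 0, i.e. root).
setuid_wrong_owner() {
    find "$1" -type f -perm -4000 ! -user "${2:-0}" -print
}
```

If `setuid_wrong_owner /opt/gridengine` prints anything, those files have the setuid bit but are not owned by root. If `setuid_files /opt/gridengine` prints nothing at all, the setuid bits may have been lost during the restore.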


> and only one on the old system
> 
> grep -in sge /etc/passwd
> 37:sge:x:400:400:GridEngine:/opt/gridengine:/bin/true
> 
> looking at /etc/group both systems only show the old group id
> 
> grep -in sge /etc/group
> 49:sge:x:400:
> 
> looking at the qmaster logs in 
> /opt/gridengine/default/spool/qmaster/messages 
>  
> we’ve found the following message:
> error opening file "/opt/gridengine/default/spool/qmaster/./sharetree" for 
> reading: No such file or directory

Did you transfer the old configuration, or does this pop up in a freshly 
installed system?

Unfortunately, the ROCKS distribution may have changed the procedure compared 
to the original sources.

-- Reuti


> However, we do not see that file on the old frontend either. 
> 
> Can anyone suggest what we can do to either correct or debug this issue?
> 
> Pat
> 
> -- 
> 
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Pat Haley                          Email:  pha...@mit.edu
> Center for Ocean Engineering       Phone:  (617) 253-6824
> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
> 77 Massachusetts Avenue
> Cambridge, MA  02139-4301
> 
> ___
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users




[gridengine users] Having issues getting sun grid engine running on new frontend

2019-07-25 Thread Pat Haley


Hi All,

We have been trying to install Rocks 7 on a new frontend machine, using 
a restore roll from our old front-end (running Rocks 6.2) to bring over 
our users, groups and various customizations (more details are available 
in https://marc.info/?l=npaci-rocks-discussion&m=154514980222760&w=2 ).  
Our latest issue is that the Sun Grid Engine service does not start.


 systemctl status -l sgemaster.mseas
● sgemaster.mseas.service - LSB: start Grid Engine qmaster, shadowd
   Loaded: loaded (/etc/rc.d/init.d/sgemaster.mseas; bad; vendor preset: disabled)
   Active: failed (Result: exit-code) since Fri 2019-07-19 12:26:46 EDT; 32min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 355124 ExecStart=/etc/rc.d/init.d/sgemaster.mseas start (code=exited, status=1/FAILURE)


Jul 19 12:25:44 mseas.mit.edu systemd[1]: Starting LSB: start Grid Engine qmaster, shadowd...
Jul 19 12:25:45 mseas.mit.edu sgemaster.mseas[355124]: Starting Grid Engine qmaster
Jul 19 12:26:46 mseas.mit.edu sgemaster.mseas[355124]: sge_qmaster start problem
Jul 19 12:26:46 mseas.mit.edu sgemaster.mseas[355124]: sge_qmaster didn't start!
Jul 19 12:26:46 mseas.mit.edu systemd[1]: sgemaster.mseas.service: control process exited, code=exited status=1
Jul 19 12:26:46 mseas.mit.edu systemd[1]: Failed to start LSB: start Grid Engine qmaster, shadowd.
Jul 19 12:26:46 mseas.mit.edu systemd[1]: Unit sgemaster.mseas.service entered failed state.
Jul 19 12:26:46 mseas.mit.edu systemd[1]: sgemaster.mseas.service failed.


in poking around, we see 2 entries for sge in /etc/passwd on the new system

grep -in sge /etc/passwd
44:sge:x:990:985:GridEngine  System account:/opt/gridengine:/bin/true
64:sge:x:400:400:GridEngine:/opt/gridengine:/bin/true

and only one on the old system

grep -in sge /etc/passwd
37:sge:x:400:400:GridEngine:/opt/gridengine:/bin/true

looking at /etc/group both systems only show the old group id

grep -in sge /etc/group
49:sge:x:400:

looking at the qmaster logs in
/opt/gridengine/default/spool/qmaster/messages

we’ve found the following message:
*error opening file "/opt/gridengine/default/spool/qmaster/./sharetree" 
for reading: No such file or directory*


However, we do not see that file on the old frontend either.

Can anyone suggest what we can do to either correct or debug this issue?

Pat

--

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  pha...@mit.edu
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301
