Gianfranco Sciacca wrote:
> Adrian Sevcenco wrote:
>> Greenseid, Joseph M. wrote:
>>
>>> it says ok for when it is starting up. does it not actually start? is
>>> there a maui process running after you do this?
>>>
>> yes, it has a process but when i try to do any command related to maui i
>> have :
>> [r...@grid01 log]# checkjob 2
>> ERROR: lost connection to server
>> ERROR: cannot request service (status)
>> I attached the log(9) of starting maui.
>> Can somebody see the problem there?
>> Thank you,
>> Adrian
>>
> Adrian, are you running nscd per chance? We have noticed on many of our
> clients and servers that the nscd process tends to go haywire from time
> to time and cause all sort of problems, including the one you mention.
> The tell-tale would be nscd using 100% CPU on your grid01 machine.
> Perhaps not your case, but worth checking.

Hi and thanks for the tip but we don't have nscd on this machine.
Adrian

> cheers,
> Gianfranco
>>
>>>
>>> --Joe
>>>
>>> ------------------------------------------------------------------------
>>> *From:* [email protected] on behalf of Adrian Sevcenco
>>> *Sent:* Mon 12/15/2008 12:56 PM
>>> *To:* [email protected]
>>> *Subject:* [Mauiusers] MAUI not responding - "lost connection to server"
>>>
>>> Hi,
>>> I have a strange situation :
>>> when i try to restart the maui server i have :
>>> [r...@grid01 /]# service maui restart
>>> Shutting down MAUI Scheduler: ERROR: lost connection to server
>>> ERROR: cannot request service (status)
>>> [FAILED]
>>> Starting MAUI Scheduler: [ OK ]
>>>
>>> The same with firewall down.
>>> as configuration i have this :
>>>
>>> [r...@grid01 maui]# cat maui.cfg
>>> # MAUI configuration example
>>>
>>> SERVERHOST grid01.spacescience.ro
>>> ADMIN1 root
>>> ADMIN3 edginfo rgma edguser
>>> ADMINHOSTS grid01.spacescience.ro
>>> RMCFG[base] TYPE=PBS
>>> SERVERPORT 40559
>>> SERVERMODE NORMAL
>>>
>>> # Set PBS server polling interval. If you have short # queues or/and
>>> jobs it is worth to set a short interval. (10 seconds)
>>>
>>> RMPOLLINTERVAL 00:00:10
>>>
>>> # a max. 10 MByte log file in a logical location
>>>
>>> LOGFILE /var/log/maui.log
>>> LOGFILEMAXSIZE 10000000
>>> LOGLEVEL 1
>>>
>>> # Set the delay to 1 minute before Maui tries to run a job again, # in
>>> case it failed to run the first time.
>>> # The default value is 1 hour.
>>>
>>> DEFERTIME 00:01:00
>>>
>>> # Necessary for MPI grid jobs
>>> ENABLEMULTIREQJOBS TRUE
>>>
>>> Any ideas why it is not working? how can i debug this further?
>>> is there a requirement of something to be in /etc/hosts ?
>>> Thank you,
>>> Adrian
>>>
>>>
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> mauiusers mailing list
>> [email protected]
>> http://www.supercluster.org/mailman/listinfo/mauiusers
>>
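For anyone hitting the same "lost connection to server" error, the checks raised in this thread (is a maui process really running, is nscd eating CPU, does SERVERHOST resolve via /etc/hosts or DNS, is anything listening on SERVERPORT) could be sketched roughly as below. This is only a sketch, not a Maui-provided tool: the function names `cfg_value`, `proc_status` and `check_maui` are made up for illustration, it assumes a Linux box with procps, `getent` and `netstat` available, and you have to pass the path to your own maui.cfg since the thread does not say where it lives.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of Maui.

# Print the first value of KEY from a maui.cfg-style file ("KEY VALUE" lines).
# Comment lines are skipped because their first field is "#", not the key.
cfg_value() {
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# Show pid, CPU share and name for a process with this exact name,
# or say that it is not running.
proc_status() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        ps -C "$1" -o pid=,pcpu=,comm=
    else
        echo "$1: not running"
    fi
}

# Run the checks against the maui.cfg given as the first argument.
check_maui() {
    cfg="$1"
    host=$(cfg_value SERVERHOST "$cfg")
    port=$(cfg_value SERVERPORT "$cfg")
    echo "SERVERHOST=$host SERVERPORT=$port"
    # Does SERVERHOST resolve (through /etc/hosts or DNS)?
    getent hosts "$host" || echo "cannot resolve $host"
    # Is any TCP socket listening on SERVERPORT?
    netstat -tln 2>/dev/null | grep -q ":$port " \
        || echo "nothing listening on port $port"
    # Is the scheduler up, and is nscd present and busy?
    proc_status maui
    proc_status nscd
}
```

After sourcing the file, run it as `check_maui /path/to/maui.cfg` (as root if needed). A maui process that shows up in the output while nothing is listening on the configured port would at least match the symptom described above, though it does not by itself say why.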
