On 27.11.2016 at 03:23, Coleman, Marcus [JRDUS Non-J&J] wrote:

> Hi Reuti
>
> I am not sure what I am looking for...but here is the contents of /tmp on the
> rebooting node
> Any outrights you can see?
>
> [root@padme tmp]# ls -l
> total 20
> prw-rw-r-- 1 mcolem19 mcolem19 0 Nov 23 22:09 jmonitor.mcolem19.37995
> prw-rw-r-- 1 mcolem19 mcolem19 0 Nov 23 22:35 jmonitor.mcolem19.38497
> prw-rw-r-- 1 mcolem19 mcolem19 0 Nov 23 22:45 jmonitor.mcolem19.38615
> prw-rw-r-- 1 mcolem19 mcolem19 0 Nov 23 22:45 jmonitor.mcolem19.38624
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:27 jmonitor.schrogpu.28331
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:27 jmonitor.schrogpu.28377
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:40 jmonitor.schrogpu.31781
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:41 jmonitor.schrogpu.31829
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 9 12:17 jmonitor.schrogpu.5042
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 9 12:17 jmonitor.schrogpu.5043
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:08 jmonitor.schrogpu.8041
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:39 jmonitor.schrogpu.8220
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:26 jmonitor.schrogpu.8346
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:39 jmonitor.schrogpu.8557
> prw-rw-r-- 1 schrogpu schrogpu 0 Sep 5 00:27 jmonitor.schrogpu.8740
> drwx------ 2 root root 4096 Nov 4 16:09 keyring-6CWKlB
> drwxrwxrwx 2 mcolem19 mcolem19 4096 Nov 23 11:03 mmjob.lock
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:27 mmjob.schrogpu.28352
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:27 mmjob.schrogpu.28400
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:27 mmjob.schrogpu.28480
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:27 mmjob.schrogpu.28487
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:39 mmjob.schrogpu.31802
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:39 mmjob.schrogpu.31850
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:40 mmjob.schrogpu.31876
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:41 mmjob.schrogpu.31891
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:08 mmjob.schrogpu.8087
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:39 mmjob.schrogpu.8266
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:26 mmjob.schrogpu.8392
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:39 mmjob.schrogpu.8603
> prw------- 1 schrogpu schrogpu 0 Sep 5 00:27 mmjob.schrogpu.8787
> drwx------ 2 gdm gdm 4096 Nov 25 07:42 orbit-gdm
> drwx------. 2 gdm gdm 4096 Nov 25 07:42 pulse-5mlDwNemaGym
> drwx------ 2 root root 4096 Nov 4 16:09 pulse-GAI9xhuCTgeg

Thx, I was looking for a file created by the execd in case it faces problems during startup. Such files are saved in /tmp as a last resort for the logfiles. Unfortunately there are none, so the startup per se was successful.

> [root@padme tmp]#
>
>
> -----Original Message-----
> From: Reuti [mailto:re...@staff.uni-marburg.de]
> Sent: Saturday, November 26, 2016 6:31 AM
> To: Coleman, Marcus [JRDUS Non-J&J]
> Cc: users@gridengine.org
> Subject: [EXTERNAL] Re: [gridengine users] commlib
>
> Hi,
>
> On 26.11.2016 at 06:10, Coleman, Marcus [JRDUS Non-J&J] wrote:
>
>> I am having an issue with a node rebooting. I am running Desmond fep
>> jobs...
>>
>> Thanks for any help in advance!
>>
>> /etc/resolv.conf is the same on all nodes
>> /etc/hosts is the same on all nodes
>> All nodes are connected to the same switch in a server rack.
>>
>> ################### from NODE
>> [root@padme lx-amd64]# ./gethostbyaddr -name 192.168.1.8
>> rndusljpp2.na.jnj.com
>> [root@padme lx-amd64]# ./gethostbyname -name s1
>> rndusljpp2.na.jnj.com
>>
>> ################### from QMASTER
>> [root@rndusljpp2 lx-amd64]# ./gethostbyaddr -name 192.168.1.159
>> padme
>> [root@rndusljpp2 lx-amd64]# ./gethostbyname -name padme
>> padme

What do:

$ ./gethostbyname -all padme
$ ./gethostbyaddr -all 192.168.1.159

show?

-- Reuti

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users