On 15.07.2011 at 01:14, Harry Mangalam wrote:

> On Thursday 14 July 2011 16:07:55 you wrote:
> > On 15.07.2011 at 00:58, Harry Mangalam wrote:
> > > <snip>
> > > ...
> > > hmangala@n103:~
> > > 504 $ telnet bduc-sched 536
> > > Trying 10.255.78.3...
> > > Connected to bduc-sched.nacs.uci.edu (10.255.78.3).
> > > Escape character is '^]'.
> > > ...
> >
> > Fine. Next tool is `qping`:
>
> Thank you - this seems to be resulting in some more info..
>
> > $ qping name_of_your_qmaster_machine 536 qmaster 1
>
> from one of the non-starters: (n103)
> hmangala@n103:~
> 502 $ qping bduc-sched 536 qmaster 1
> endpoint bduc-sched.nacs.uci.edu/qmaster/1 at port 536: can't find connection
> access denied: client IP resolved to host name "". This is not identical to clients host name ""
> endpoint bduc-sched.nacs.uci.edu/qmaster/1 at port 536: can't find connection
>
> > $ qping -info name_of_your_qmaster_machine 536 qmaster 1
>
> hmangala@n103:~
> 503 $ qping -info bduc-sched 536 qmaster 1
> endpoint bduc-sched.nacs.uci.edu/qmaster/1 at port 536: can't find connection
> access denied: client IP resolved to host name "". This is not identical to clients host name ""

Then please check, on the qmaster and on the exec machine(s), the output of the tools in $SGE_ROOT/utilbin/lx24_amd64, like:

$ ./gethostbyaddr -all 10.255.78.3
$ ./gethostbyname -all bduc-sched
$ ./gethostbyname -all n103
$ ./gethostname

Does everything match up for the particular machines? Do you use NIS or the like, or are all hosts recorded in local files? Did you enable or disable honoring the FQDN in SGE (recorded in $SGE_ROOT/default/common/bootstrap)?

-- Reuti

> from one of the 'connected' nodes:
> hmangala@n102:~
> 501 $ qping bduc-sched 536 qmaster 1
>
> 07/14/2011 23:12:16 endpoint bduc-sched.nacs.uci.edu/qmaster/1 at port 536 is up since 191495 seconds
> 07/14/2011 23:12:17 endpoint bduc-sched.nacs.uci.edu/qmaster/1 at port 536 is up since 191496 seconds
> 07/14/2011 23:12:18 endpoint bduc-sched.nacs.uci.edu/qmaster/1 at port 536 is up since 191497 seconds
> hmangala@n102:~
> 502 $ qping -info bduc-sched 536 qmaster 1
>
> 07/14/2011 23:12:59:
> SIRM version: 0.1
> SIRM message id: 1
> start time: 07/12/2011 18:00:41 (1310493641)
> run time [s]: 191538
> messages in read buffer: 0
> messages in write buffer: 0
> nr. of connected clients: 139
> status: 1
> info: MAIN: E (191538.46) | signaler000: E (191537.17) | event_master000: E (0.52) | timer000: E (0.52) | worker000: E (1.09) | worker001: E (0.90) | listener000: E (0.90) | listener001: E (1.25) | scheduler000: E (1.28) | WARNING
> malloc: arena(15892480) |ordblks(3939) | smblks(34) | hblksr(0) | hblhkd(0) usmblks(0) | fsmblks(1232) | uordblks(6483312) | fordblks(9409168) | keepcost(133688)
> Monitor: disabled
> --
> Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
> [ZOT 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
> MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
> --
> Unhappy? Grouchy, pedantic old geezer available to follow you
> relentlessly until your current life seems like paradise.
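
If the SGE utilities are not at hand on a node, the same forward/reverse consistency check can be roughly approximated with the system resolver. The following is only a sketch under assumptions: the names `check_host`, `forward_lookup`, `reverse_lookup` and the `ignore_domain` flag are made up for illustration, the script uses Python's standard `socket` module rather than SGE's own lookup code, and the domain handling is merely a loose analogy to the FQDN setting mentioned above. It may therefore disagree with what `gethostbyname`/`gethostbyaddr` from `utilbin` report.

```python
import socket
import sys
from typing import Callable, Tuple


def forward_lookup(name: str) -> str:
    """Resolve a host name to an IPv4 address via the system resolver."""
    return socket.gethostbyname(name)


def reverse_lookup(addr: str) -> str:
    """Resolve an IPv4 address back to its primary host name."""
    return socket.gethostbyaddr(addr)[0]


def check_host(
    name: str,
    forward: Callable[[str], str] = forward_lookup,
    reverse: Callable[[str], str] = reverse_lookup,
    ignore_domain: bool = False,
) -> Tuple[str, str, bool]:
    """Return (address, reverse name, match flag) for one host name.

    The comparison is case-insensitive. With ignore_domain=True only the
    part before the first dot is compared (hypothetical flag, not SGE's
    actual logic).
    """
    addr = forward(name)
    back = reverse(addr)
    a, b = name.lower(), back.lower()
    if ignore_domain:
        a, b = a.split(".")[0], b.split(".")[0]
    return addr, back, a == b


if __name__ == "__main__":
    # Host names are taken from the command line; nothing is looked up
    # when no arguments are given.
    for host in sys.argv[1:]:
        try:
            addr, back, same = check_host(host)
            print(f"{host} -> {addr} -> {back} : {'match' if same else 'MISMATCH'}")
        except OSError as exc:  # covers socket.gaierror and socket.herror
            print(f"{host}: lookup failed ({exc})")
```

Saved under a name of your choice and run with `bduc-sched n103` as arguments on both the qmaster and the exec nodes, it would show at a glance whether the forward and reverse lookups disagree on any of the machines. The resolvers are passed in as parameters so the comparison can be tried out without touching DNS or NIS.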
