hi all,
i've built maui on a centos 5.1 x86_64 machine with torque 2.3.1 (also
x86_64).
job scheduling with pbs_sched works (so i'm going to guess that torque
is not to blame here).
symptom:
submitted jobs stay queued, showq/checkjob commands fail.
(using LOGLEVEL 9):
in /var/log/maui.log:
07/10 16:35:05 INFO: no PBS sched socket connections ready
07/10 16:35:05 MSUAcceptClient(5,ClientSD,HostName,TCP)
07/10 16:35:05 INFO: accept call failed, errno: 11 (Resource temporarily unavailable)
07/10 16:35:05 INFO: all clients connected. servicing requests
strace -p <pid_of_maui>:
accept(5, 0x7fff08213660, [34359738384]) = -1 EAGAIN (Resource temporarily unavailable)
write(3, "07/10 16:36:02 INFO: accept "..., 90) = 90
write(3, "07/10 16:36:02 INFO: all cli"..., 68) = 68
select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
write(3, "07/10 16:36:02 MRMCheckEvents()\n", 32) = 32
select(1024, [8], NULL, NULL, {0, 10000}) = 0 (Timeout)
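
for reference, a minimal python sketch (not maui's code; the helper name accept_once is made up for illustration) of what errno 11 means here: accept() on a non-blocking listening socket returns EAGAIN whenever no client connection is pending, so by itself that line only says that nothing has reached the listening socket yet.

```python
import errno
import socket


def accept_once(listener):
    """Try one non-blocking accept; return (conn, None) or (None, errno)."""
    try:
        conn, _addr = listener.accept()
        return conn, None
    except BlockingIOError as exc:
        # raised for EAGAIN / EWOULDBLOCK when no connection is pending
        return None, exc.errno


# listening socket on an ephemeral loopback port (illustrative only)
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
listener.setblocking(False)

# no client has connected, so accept cannot return a connection
conn, err = accept_once(listener)
print(conn, errno.errorcode.get(err))
listener.close()
```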
i found this thread with similar issues:
http://www.clusterresources.com/pipermail/mauiusers/2008-January/003085.html
showq
ERROR: lost connection to server
ERROR: cannot request service (status)
except that strace on e.g. showq does not show the 'EACCES (Permission
denied)', but instead 'connect(3, {sa_family=AF_INET, sin_port=htons(40559),
sin_addr=inet_addr("192.168.10.1")}, 16) = -1 EINPROGRESS (Operation now
in progress)'
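
EINPROGRESS is what a non-blocking connect() normally returns while the tcp handshake is still running, so the interesting part is what happens afterwards. a rough python sketch of a probe that waits for the outcome (the function name probe and the 2 second timeout are assumptions for illustration, and the SO_ERROR check assumes linux-like socket behaviour):

```python
import errno
import select
import socket


def probe(host, port, timeout=2.0):
    """Return 0 if a TCP connect to host:port completes, else an errno value."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        rc = sock.connect_ex((host, port))
        if rc == 0:
            return 0  # connected immediately
        if rc not in (errno.EINPROGRESS, errno.EWOULDBLOCK):
            return rc  # immediate failure, e.g. ENETUNREACH
        # the socket becomes writable once the handshake has finished or failed
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            return errno.ETIMEDOUT
        # SO_ERROR holds the final result of the pending connect
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        sock.close()
```

with the address from the strace above this would be called as probe("192.168.10.1", 40559); a result of 0 means the port accepts connections, anything else is the errno of the failure.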
i also ran strace on maui -d, no suspicious things until the
select(1024, [8], NULL, NULL, {0, 10000}) = 0 (Timeout)
accept(5, 0x7fffde67ca70, [16]) = -1 EAGAIN (Resource temporarily unavailable)
i have disabled ipv6 (and iptables) and selinux.
the maui rpms i've built are a rebuild of a working setup for sl4 x86_64
(but with some older snapshot of torque 230).
exact release is:
3.2.6p20 snap.1212617145
configure options from config.log
$ ./configure --build=x86_64-redhat-linux-gnu
--host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu
--program-prefix= --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin
--sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share
--includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec
--localstatedir=/var --sharedstatedir=/usr/com --mandir=/usr/share/man
--infodir=/usr/share/info --with-key=123456 --with-spooldir=/var/spool/maui
all hints welcome.
many thanks,
stijn