Elmar Pruesse wrote:
Load balancing doesn't work anymore since we got the second machine back
from repair. The only thing I can see is lots of "Error obtaining system
capacity" messages in auth_log. Unfortunately a couple of changes were
made at the same time, so I am having trouble determining the exact cause. We
finally got our Barcelonas and upgraded to Hardy at the same time.
utgstatus reports everything as being fine:
host    flags  interface      flags  interface       flags
               172.16.0.0/16         10.234.23.0/24
------  -----  -------------  -----  --------------  -----
castor  T-     172.16.8.1     UA-    10.234.23.1     UA-
pollux  TN     172.16.8.2     UA-    10.234.23.2     UAM
The fact that you have "M" listed as the flag on 10.234.23.2 indicates
that it's been configured with "utadm -a", whereas the lack of "M" on
10.234.23.1 indicates that it was configured with "utadm -A". You need
to bring these into sync. I don't know if that's the root cause of your
issue but it's a place to start.
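
To make that comparison concrete, here is a minimal Python sketch that scans the host rows of the utgstatus output and reports which interface columns carry the "M" flag on some hosts but not on others. The row layout is assumed from the output quoted in this thread, and the function name is made up for illustration; it is not part of the Sun Ray software.

```python
# Host rows copied from the utgstatus output quoted in this thread.
ROWS = """\
castor T- 172.16.8.1 UA- 10.234.23.1 UA-
pollux TN 172.16.8.2 UA- 10.234.23.2 UAM
"""

def mismatched_m_columns(rows):
    """Return the interface column indexes where the 'M' flag differs between hosts.

    Assumed row layout: host, host flags, then (ip, flags) pairs, one pair per subnet.
    """
    per_column = {}
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a host row
        pairs = fields[2:]
        for column, flags in enumerate(pairs[1::2]):
            per_column.setdefault(column, set()).add("M" in flags)
    # A column is out of sync when both True and False were seen for it.
    return sorted(col for col, seen in per_column.items() if len(seen) > 1)

if __name__ == "__main__":
    print(mismatched_m_columns(ROWS))
```

For the two rows above, only the second interface column (10.234.23.0/24) is reported, which matches the mismatch between 10.234.23.1 and 10.234.23.2.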
-Bob
But castor is receiving all sessions although it is turned offline.
Turning it online doesn't make any difference (except changing the 'N').
The only reference to the error message I found is a report of a
similar problem by Stefan Hoellendampf on 02/28/08 on this mailing
list, but no one responded. In his case it was related to the kernel
version used.
The auth_log on pollux with all debug I could find turned on says this
over and over:
09/10/2008 18:14:51 sendall: sent 305 bytes on eth0 to 224.101.101.101
09/10/2008 18:14:51 sendall: sent 305 bytes on eth1 to 224.101.101.101
09/10/2008 18:15:09 Host message from 172.16.8.1 numhosts 2
09/10/2008 18:15:09 host=castor addr=ac100801 time=1221063309 (off=0)
numifs=2 cpus=0 clock=0
09/10/2008 18:15:09 interface=eth0 ip=aea1701 mask=ffffff00
bcast=aea17ff lastpkt=1221063288 flags=1
09/10/2008 18:15:09 interface=eth1 ip=ac100801 mask=ffff0000
bcast=ac10ffff lastpkt=1221063309 flags=1
09/10/2008 18:15:09 Host message from 10.234.23.1 numhosts 2
09/10/2008 18:15:09 host=castor addr=ac100801 time=1221063309 (off=0)
numifs=2 cpus=0 clock=0
09/10/2008 18:15:09 interface=eth0 ip=aea1701 mask=ffffff00
bcast=aea17ff lastpkt=1221063309 flags=1
09/10/2008 18:15:09 interface=eth1 ip=ac100801 mask=ffff0000
bcast=ac10ffff lastpkt=1221063309 flags=1
09/10/2008 18:15:11 Error obtaining system capacity
09/10/2008 18:15:11 Got 3 total interfaces
09/10/2008 18:15:11 Got 2 real interfaces
09/10/2008 18:15:11
Configure:
host=pollux addr=ac100802 time=1221063311 numifs=2 flags=7 cpus=0
clock=0 period=20 starttime=1219133852 load=2.0
interface=eth0 ip=aea1702 mask=ffffff00 bcast=aea17ff flags=0
interface=eth1 ip=ac100802 mask=ffff0000 bcast=ac10ffff flags=1
sessions=2 idles=2
#sig=50a75cfb08a3e4f4df31d700d9cb89e2121065d6
The curious part is the 0 CPUs ... which is most assuredly wrong.
(without debug, as on castor where I can't restart the authd, it only
shows the "Error obtaining..." line)
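
As an aid for reading the log above: the addr, ip, mask and bcast fields appear to be IPv4 addresses printed as hexadecimal integers with leading zeros dropped, which would explain why "aea1701" has only seven digits. A small Python sketch under that assumption (the helper name is made up for illustration):

```python
import ipaddress

def decode_hex_ipv4(value):
    """Turn a hex string such as 'ac100801' into dotted-quad notation.

    Assumes the value is a 32-bit address in network byte order and that
    any missing leading zeros were simply not printed.
    """
    return str(ipaddress.IPv4Address(int(value, 16)))

if __name__ == "__main__":
    for field in ("ac100801", "aea1701", "ffffff00", "aea17ff"):
        print(field, "->", decode_hex_ipv4(field))
```

Decoded this way, the values line up with the addresses utgstatus reports for castor (172.16.8.1 and 10.234.23.1).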
The strace log of the authd thread producing the error message is this:
open("/proc/cpuinfo", O_RDONLY) = 20
fstat64(20, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0x1000) = 0xf227d000
read(20, "processor\t: 0\nvendor_id\t: Authen"..., 10240) = 3870
read(20, "processor\t: 5\nvendor_id\t: Authen"..., 6144) = 3870
read(20, "processor\t: 10\nvendor_id\t: Authe"..., 2048) = 2048
read(20, "f_lm cmp_legacy svm extapic cr8_"..., 1024) = 1024
close(20) = 0
munmap(0xf227d000, 4096) = 0
time(NULL) = 1221060831
write(2, "09/10/2008 17:33:51 Error obtain"..., 52) = 52
Why is the buffer size decremented by 4096 on getting 3870 bytes from
the kernel? Why isn't the file read to the end? The former might be Java
weirdness, but the latter doesn't sound healthy.
Could this be related to having more than 8 cores? With 8 or fewer cores
it would fit in the 10k buffer.
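
To illustrate the difference between reading to end of file and stopping at a fixed budget, here is a rough Python sketch. It is not how authd works internally (that is unknown here); it only shows the general technique of looping until read() returns nothing. The synthetic data assumes about 774 bytes per processor entry, taken from the 3870-byte reads in the strace that seem to cover five entries each, and both function names are made up.

```python
import io

def count_processors(stream, chunk_size=4096):
    """Read a cpuinfo-style byte stream to EOF and count 'processor' lines."""
    data = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read is the only reliable end-of-file signal
            break
        data += chunk
    return sum(1 for line in data.splitlines() if line.startswith(b"processor"))

def fake_cpuinfo(cpus, entry_size=774):
    """Build synthetic cpuinfo-like data with a fixed size per entry (assumption)."""
    out = bytearray()
    for i in range(cpus):
        head = b"processor\t: %d\n" % i
        # Pad each entry with one filler line so it is exactly entry_size bytes.
        filler = b"flags\t\t: " + b"x" * (entry_size - len(head) - 10) + b"\n"
        out += head + filler
    return bytes(out)

if __name__ == "__main__":
    data = fake_cpuinfo(16)
    print(len(data), "bytes in total")
    print(count_processors(io.BytesIO(data)), "processors when read to EOF")
    print(count_processors(io.BytesIO(data[:10240])), "processors in the first 10240 bytes")
```

With 16 synthetic entries the data is larger than 10240 bytes, so a reader that gives up after a fixed 10k budget sees fewer processors than one that reads until EOF.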
Is this a bug? Or am I missing something else completely? Are there any
other logs that might help? Any known workarounds?
regards,
Elmar
------------------------------------------------------------------------
_______________________________________________
SunRay-Users mailing list
[email protected]
http://www.filibeto.org/mailman/listinfo/sunray-users