Hi all,
I have a new cluster (4+1 machines), all running Fedora Core 2 with a 2.6.6
kernel that I compiled myself. On the master machine, eth0 is the internal
network and eth1 faces the university network. I've added a route so that
multicast packets go out on the internal network; the routing table looks
like this:
[EMAIL PROTECTED] /]# /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     *               255.255.255.0   U     0      0        0 eth0
131.220.3.0     *               255.255.255.0   U     0      0        0 eth1
169.254.0.0     *               255.255.0.0     U     0      0        0 eth1
127.0.0.0       *               255.0.0.0       U     0      0        0 lo
224.0.0.0       *               240.0.0.0       U     0      0        0 eth0
default         rhenus-router.i 0.0.0.0         UG    0      0        0 eth1
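
As a sanity check on the table: the 224.0.0.0 / 240.0.0.0 entry should cover the whole multicast range (224.0.0.0/4), including the group address that shows up in the traces further down. Here is a minimal sketch of that check, assuming plain longest-prefix matching and ignoring metrics. The entries are copied by hand from the table, the default route is written as 0.0.0.0/0.0.0.0, and pick_interface is a made-up helper name; the real kernel lookup may treat broadcast differently.

```python
import ipaddress

# Routing entries copied from the table above: (destination, genmask, interface).
# The default route is written as 0.0.0.0 with genmask 0.0.0.0.
ROUTES = [
    ("192.168.1.0", "255.255.255.0", "eth0"),
    ("131.220.3.0", "255.255.255.0", "eth1"),
    ("169.254.0.0", "255.255.0.0", "eth1"),
    ("127.0.0.0", "255.0.0.0", "lo"),
    ("224.0.0.0", "240.0.0.0", "eth0"),
    ("0.0.0.0", "0.0.0.0", "eth1"),
]


def pick_interface(address):
    """Return the interface of the most specific matching route (hypothetical helper)."""
    addr = ipaddress.ip_address(address)
    best = None
    for dest, mask, iface in ROUTES:
        net = ipaddress.ip_network(f"{dest}/{mask}")
        # Keep the match with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, iface)
    return best[1] if best else None


print(pick_interface("224.245.211.234"))  # multicast group from the trace -> eth0
print(pick_interface("255.255.255.255"))  # limited broadcast -> eth1 (default route only)
```

Under this simplified model the multicast packets do leave through eth0 on the master, which suggests the master's table itself is not the obvious culprit.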
Yet I cannot seem to start any clusterServers on the slave machines. When
I start my clusterClient, it seems to hang at the multicasting phase:
[...]
select(1024, [12], [], NULL, {0, 520000}) = 0 (Timeout)
sendto(12, "\0\0\0\34\0\0\0\6right2\0\0\0\nStreamSock", 28, 0,
{sa_family=AF_INET, sin_port=htons(8437),
sin_addr=inet_addr("224.245.211.234")}, 16) = 28
sendto(12, "\0\0\0\34\0\0\0\6right2\0\0\0\nStreamSock", 28, 0,
{sa_family=AF_INET, sin_port=htons(8437),
sin_addr=inet_addr("255.255.255.255")}, 16) = 28
select(1024, [12], [], NULL, {2, 0}) = 0 (Timeout)
sendto(12, "\0\0\0\34\0\0\0\6right2\0\0\0\nStreamSock", 28, 0,
{sa_family=AF_INET, sin_port=htons(8437),
sin_addr=inet_addr("224.245.211.234")}, 16) = 28
sendto(12, "\0\0\0\34\0\0\0\6right2\0\0\0\nStreamSock", 28, 0,
{sa_family=AF_INET, sin_port=htons(8437),
sin_addr=inet_addr("255.255.255.255")}, 16) = 28
select(1024, [12], [], NULL, {2, 0}) = 0 (Timeout)
sendto(12, "\0\0\0\34\0\0\0\6right2\0\0\0\nStreamSock", 28, 0,
{sa_family=AF_INET, sin_port=htons(8437),
sin_addr=inet_addr("224.245.211.234")}, 16) = 28
sendto(12, "\0\0\0\34\0\0\0\6right2\0\0\0\nStreamSock", 28, 0,
{sa_family=AF_INET, sin_port=htons(8437),
sin_addr=inet_addr("255.255.255.255")}, 16) = 28
select(1024, [12], [], NULL, {2, 0} <unfinished ...>
[...]
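
For reference, the 28-byte datagram in the trace looks like a big-endian, length-prefixed message: a 4-byte total length followed by two strings, each preceded by its own 4-byte length. This is only my reading of the bytes shown by strace, not a documented OpenSG wire format, and decode_request is a made-up helper name. A small sketch that decodes it under that assumption:

```python
import struct

# Payload bytes as shown by strace, rewritten with hex escapes
# ("\34" in the trace is octal for 28, "\n" is 10).
PAYLOAD = b"\x00\x00\x00\x1c\x00\x00\x00\x06right2\x00\x00\x00\x0aStreamSock"


def decode_request(data):
    """Split the datagram into its leading length word and the strings that follow.

    Assumed framing (not confirmed by the source): every field is a 4-byte
    big-endian length followed by that many ASCII bytes.
    """
    (total,) = struct.unpack_from(">I", data, 0)
    offset = 4
    fields = []
    while offset < len(data):
        (n,) = struct.unpack_from(">I", data, offset)
        offset += 4
        fields.append(data[offset:offset + n].decode("ascii"))
        offset += n
    return total, fields


total, fields = decode_request(PAYLOAD)
print(total, fields)  # 28 ['right2', 'StreamSock']
```

So the client keeps asking for a server named right2 with connection type StreamSock, alternating between the multicast group and the broadcast address, and never gets an answer.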
I thought this was a routing problem, but I then ran a single server with
OSG_LOG_LEVEL=debug and got the following output:
[...]
Initialized Type Box | 1
init ReflexiveContainerType (1)
init VSCVRMLObjectTypeType Box (1)
Initialized Type Billboard | 1
init ReflexiveContainerType (1)
init VSCVRMLObjectTypeType Billboard (1)
Initialized Type Background | 1
init ReflexiveContainerType (1)
init VSCVRMLObjectTypeType Background (1)
Initialized Type AudioClip | 1
init ReflexiveContainerType (1)
init VSCVRMLObjectTypeType AudioClip (1)
Initialized Type Appearance | 1
init ReflexiveContainerType (1)
init VSCVRMLObjectTypeType Appearance (1)
Initialized Type Anchor | 1
init ReflexiveContainerType (1)
init VSCVRMLObjectTypeType Anchor (1)
(1|0|67)
Got store lock 8168dc8
WARNING: Window::getFunctionByName: Couldn't get function
'glColorTableSGI' for Window 0x8237828.
INFO: Connection bound to port 0
INFO: Waiting for request of right2 StreamSock
INFO: wait for request on group:224.245.211.234
FATAL: SocketLib: setsockopt(IPPROTO_IP,IP_ADD_MEMBERSHIP) 19 No such
device: Server is now unknown
INFO: Stop service thread
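
Error 19 in the FATAL line is ENODEV. As far as I understand it, when a group is joined with the interface left as INADDR_ANY, Linux picks the interface from the routing table, and the call can fail with ENODEV if no suitable interface or route is found on that machine. The sketch below reproduces the same setsockopt call outside OpenSG, once with INADDR_ANY and once with an explicit local address. LOCAL_IF is a hypothetical address for the slave's internal NIC, and build_mreq and try_join are made-up helper names; this is not how OpenSG itself performs the join.

```python
import errno
import socket

GROUP = "224.245.211.234"  # group address from the log above
LOCAL_IF = "192.168.1.2"   # hypothetical address of the slave's internal NIC


def build_mreq(group, interface="0.0.0.0"):
    """Pack a struct ip_mreq: group address followed by the local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def try_join(group, interface="0.0.0.0"):
    """Try to join the group; return None on success or the errno on failure."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        build_mreq(group, interface))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


if __name__ == "__main__":
    # Compare the kernel-chosen interface with an explicitly named one.
    for iface in ("0.0.0.0", LOCAL_IF):
        err = try_join(GROUP, iface)
        status = "ok" if err is None else errno.errorcode.get(err, str(err))
        print(f"join via {iface}: {status}")
```

Running this on a slave would show whether the failure is independent of OpenSG. If the INADDR_ANY join fails there, the slave's own routing table (rather than the master's) would be the first thing to look at.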
I'm unfortunately not a network expert, but to me it looks like the
clusterServer cannot join the multicast group for some reason (the
IP_ADD_MEMBERSHIP call fails). Might it be the too new'n'shiny 2.6.x kernel?
Or is there a newly introduced bug in OpenSG? :) I use the dailybuild from
30.06.2004.
I have to give a demo on the cluster tomorrow (Wednesday) so any quick
hints (including but not limited to voodoo-magic and
sacrifice-at-full-moon suggestions) are more than welcome!
Thanks a lot,
Akos
_______________________________________________
Opensg-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensg-users