I'm sure I'm missing something simple here but dang if I can find it.
One of the servers in our FOG died and was out of service for a month. We
recently returned the machine to service with a fresh OS and Sun Ray 3.1
install and added it back into the FOG. Everything is working normally as
far as load balancing goes, but utfwsync is failing on the replaced server. Log entries
in auth_log on the replaced server show token communication errors to
all other servers in the FOG whenever I try to run a utfwsync from the
primary server. Likewise, the primary server shows the same errors to
the replaced server during utfwsync. Ping works to and from all interfaces
on the dedicated interconnect between all servers, so I know they can see
each other. All servers are at SRSS 3.1 with the same patch level.
Firmware download does work from the replaced server, because DTUs that
reboot and connect to it are getting firmware updates. Running utgstatus
on all FOG members gives identical output and is correct. Also, utpolicy
output is identical and utreplica -l looks okay. I ran a snoop against
the interconnect and I can see the multicast traffic between the servers
coming and going.
The only difference in the replaced server between then and now is
that we used a different physical interface for the LAN connection,
which means the MAC address of the LAN connection is different
from before.
We reboot these machines once a week and run utfwsync to balance the
pseudo session load after all servers have rebooted. The effect now is
that the sync runs on 3 of the 4 servers, causing all DTUs to connect to
the one where the sync fails, which defeats the purpose of the whole thing.
Thanks in advance.