Hello all, I've run into a problem, and I can't yet say whether it needs to be fixed in appia, hedera or sequoia.
My general environment:
debian etch, java 1.5, 1 backend postgresql 8.1.4, sequoia 2.10.4, appia 3.2.4,
hedera 1.5.6-cvs03.01.2007, "base view" setup of appia
RAIDb-1 setup with 2 machines, each running 1 controller with a single backend
Because of the HA setup I need to manage a virtual IP, controlled by heartbeat,
which jumps between the HA members. The problem is that sequoia/appia/hedera
takes this virtual IP as the member name instead of the primary (first) IP of
the network interface. This works without problems at setup time, but the whole
thing goes completely mad as soon as the virtual IP switches over to the other
machine. So I would suggest improving the IP detection so that it always uses
the IP configured in the appia config.
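To make the suggestion concrete, here is a minimal sketch of the behaviour I have in mind: prefer the explicitly configured address over whatever was auto-detected. This is not the actual sequoia/hedera/appia code; the class and method names are hypothetical, and I am only assuming that the configured value has the same "host:port" form as the local_address parameter in appia.xml.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

/** Hypothetical helper, not part of sequoia, hedera or appia. */
public class ConfiguredMemberAddress {

    /** Parses a "host:port" string (e.g. the local_address value) into a socket address. */
    public static InetSocketAddress parse(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon < 1 || colon == hostPort.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got: " + hostPort);
        }
        String host = hostPort.substring(0, colon).trim();
        int port = Integer.parseInt(hostPort.substring(colon + 1).trim());
        try {
            // For an IP literal this only validates the format, no DNS lookup is done.
            return new InetSocketAddress(InetAddress.getByName(host), port);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("cannot resolve host: " + host);
        }
    }

    /**
     * Returns the configured address when one is given and falls back to the
     * auto-detected address only when nothing is configured.
     */
    public static InetSocketAddress memberAddress(String configured, InetSocketAddress detected) {
        if (configured != null && configured.trim().length() > 0) {
            return parse(configured);
        }
        return detected;
    }

    public static void main(String[] args) {
        // Auto-detection picked the virtual IP on eth0 ...
        InetSocketAddress detected = parse("215.66.59.213:21080");
        // ... but the configured local_address should win.
        InetSocketAddress chosen = memberAddress("10.10.10.2:21080", detected);
        System.out.println(chosen.getAddress().getHostAddress() + ":" + chosen.getPort());
    }
}
```

With a rule like this the member name would stay tied to the fixed eth1 address and would not change when heartbeat moves the virtual IP.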
Here my setup:
(for testing: /etc/ha.d/resource.d/IPaddr2 <ip> start/stop)
ip addr machine1:
2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:31:49:84:c6:c0 brd ff:ff:ff:ff:ff:ff
inet 215.66.59.211/28 brd 215.66.59.223 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:30:48:84:c6:c1 brd ff:ff:ff:ff:ff:ff
inet 10.10.10.1/29 brd 10.10.10.7 scope global eth1
ip addr machine2:
2: eth0: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:31:49:84:c6:c0 brd ff:ff:ff:ff:ff:ff
inet 215.66.59.212/28 brd 215.66.59.223 scope global eth0
inet 215.66.59.213/28 brd 215.66.59.223 scope global secondary eth0
3: eth1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:30:48:84:c6:c1 brd ff:ff:ff:ff:ff:ff
inet 10.10.10.2/29 brd 10.10.10.7 scope global eth1
appia.xml:
<channel name="TCP SEQ Channel" template="tcp_sequencer" initialized="yes">
<memorymanagement size="40000000" up_threshold="15000000"
down_threshold="7000000" />
<chsession name="hederalayer">
<parameter name="base_view">10.10.10.1:21080,10.10.10.2:21080</parameter>
<parameter name="base_endpoints">node00304874867C,node00304884C6C0</parameter>
<parameter name="initial_endpoints">node00304884C6C0</parameter>
<parameter name="local_address">10.10.10.2:21080</parameter>
<parameter name="local_endpoint">node00304884C6C0</parameter>
</chsession>
</channel>
log entries:
during start the members are named with IPs from eth0
-----------------------------------------------------
2007-01-12 10:49:01,315 INFO controller.virtualdatabase.botdb
Member(address=/215.66.59.213:21080, uid=215.66.59.213:21080) see
members:[Member(address=/215.66.59.211:21080, uid=215.66.59.211:21080),
Member(address=/215.66.59.213:21080, uid=215.66.59.213:21080)] and has
mapping:{Member(address=/215.66.59.211:21080,
uid=215.66.59.211:21080)=10.10.10.1:21090, Member(address=/215.66.59.213:21080,
uid=215.66.59.213:21080)=10.10.10.2:21090}
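The mismatch can also be checked mechanically. The following is only a small diagnostic sketch (a hypothetical helper, not part of sequoia, hedera or appia) that pulls the uid=... values out of a log excerpt and reports those that are not listed in the configured base_view. It assumes the uid format seen in the log entry above, i.e. an IPv4 address followed by a port.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper, not part of sequoia, hedera or appia. */
public class MemberUidCheck {

    // Matches e.g. "uid=215.66.59.213:21080" and captures the "ip:port" part.
    private static final Pattern UID = Pattern.compile("uid=([0-9.]+:\\d+)");

    /** Returns the distinct uids found in logText that are not listed in baseView. */
    public static List<String> unexpectedUids(String logText, String baseView) {
        Set<String> expected = new LinkedHashSet<String>();
        for (String entry : baseView.split(",")) {
            expected.add(entry.trim());
        }
        // LinkedHashSet keeps first-seen order and drops repeated uids.
        Set<String> unexpected = new LinkedHashSet<String>();
        Matcher m = UID.matcher(logText);
        while (m.find()) {
            if (!expected.contains(m.group(1))) {
                unexpected.add(m.group(1));
            }
        }
        return new ArrayList<String>(unexpected);
    }

    public static void main(String[] args) {
        String log = "Member(address=/215.66.59.213:21080, uid=215.66.59.213:21080) see "
                + "members:[Member(address=/215.66.59.211:21080, uid=215.66.59.211:21080)]";
        String baseView = "10.10.10.1:21080,10.10.10.2:21080";
        // Lists the member uids that do not come from the configured base_view.
        System.out.println(unexpectedUids(log, baseView));
    }
}
```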
after ip takeover of machine1
-----------------------------
2007-01-15 11:18:34,456 INFO continuent.hedera.gms
Member(address=/217.66.59.213:21080, uid=217.66.59.213:21080) failed in
Group(gid=botdb)
2007-01-15 11:18:34,464 WARN controller.virtualdatabase.botdb Controller
Member(address=/217.66.59.213:21080, uid=217.66.59.213:21080) has left the
cluster.
2007-01-15 11:18:34,465 INFO controller.virtualdatabase.botdb 0 requests were
waiting responses from Member(address=/217.66.59.213:21080,
uid=217.66.59.213:21080)
2007-01-15 11:18:34,479 INFO controller.requestmanager.cleanup Waiting 60000ms
for client of controller 562949953421312 to failover
2007-01-15 11:19:34,497 INFO controller.requestmanager.cleanup Cleanup for
controller 562949953421312 failure is completed.
This failure is not a real one. There was just a virtual IP change, and
sequoia should keep working without any action after the IP switch.
Thanx for help and response,
)ngo
