Hi,
A problem occurs when restarting a controller (architecture with 2 controllers
using JGroups).
The restarted controller logged these lines:
12 Feb 2008 09:31:16,717 | LEVEL1 | Using Hedera properties file:
/hedera_jgroups.properties
12 Feb 2008 09:31:17,459 | LEVEL3 | Group communication channel is configured
as follows: JGroups channel wrapper: [EMAIL PROTECTED]
12 Feb 2008 09:31:20,065 | LEVEL3 | Storing checkpoint
Member(address=/172.20.1.77:65201, uid=db) joined group
db-172.20.1.81:28346-20080212093120061+0000 at request id 0
12 Feb 2008 09:31:20,111 | LEVEL3 | Storing checkpoint
Member(address=/172.20.1.77:65201, uid=db) quit group
db-172.20.1.81:28346-20080212093120108+0000 at request id 0
12 Feb 2008 09:31:20,117 | LEVEL3 | Removing controller null
12 Feb 2008 09:31:20,117 | LEVEL3 | Refreshing members
list:[Member(address=/172.20.1.81:65201, uid=db)]
12 Feb 2008 09:31:20,118 | LEVEL1 | 0 requests were waiting responses from
Member(address=/172.20.1.77:65201, uid=db)
12 Feb 2008 09:31:22,073 | LEVEL1 | Group db connected to
Member(address=/172.20.1.81:65201, uid=db)
12 Feb 2008 09:31:22,080 | LEVEL1 | First controller in group db
The controller that stayed up logged these lines:
12 Feb 2008 09:30:56,413 | LEVEL3 | Storing checkpoint
Member(address=/172.20.1.81:65201, uid=db) quit group
db-172.20.1.77:28346-20080212093056405+0000 at request id 0
12 Feb 2008 09:30:56,416 | LEVEL1 | Member(address=/172.20.1.81:65201, uid=db)
has left distributed virtual database db
12 Feb 2008 09:30:56,416 | LEVEL1 | Controller
Member(address=/172.20.1.81:65201, uid=db) has left the cluster.
12 Feb 2008 09:30:56,416 | LEVEL1 | 0 requests were waiting responses from
Member(address=/172.20.1.81:65201, uid=db)
12 Feb 2008 09:30:56,455 | LEVEL3 | handleMessageSingleThreaded (class
org.continuent.sequoia.controller.virtualdatabase.protocol.FlushGroupCommunicationMessages):
[EMAIL PROTECTED]
12 Feb 2008 09:30:56,456 | LEVEL3 | handleMessageMultiThreaded (class
org.continuent.sequoia.controller.virtualdatabase.protocol.FlushGroupCommunicationMessages):
[EMAIL PROTECTED]
12 Feb 2008 09:30:56,456 | LEVEL3 | Removed [EMAIL PROTECTED] from total order
queue
12 Feb 2008 09:30:56,500 | LEVEL1 | Waiting 120000ms for client of controller
562949953421312 to failover
12 Feb 2008 09:31:19,765 | LEVEL3 | Storing checkpoint
Member(address=/172.20.1.81:65201, uid=db) joined group
db-172.20.1.77:28346-20080212093119764+0000 at request id 0
12 Feb 2008 09:31:58,027.....
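In case it helps to read the two logs side by side, here is a minimal sketch that merges them into a single timeline. It only assumes the `date | level | message` layout visible in the excerpts above and that both machines have synchronized clocks; the function names (`parse_entries`, `merge`) are made up for illustration and are not part of Sequoia or JGroups.

```python
from datetime import datetime

# Timestamp layout of the quoted log lines, e.g. "12 Feb 2008 09:31:20,065".
TS_FORMAT = "%d %b %Y %H:%M:%S,%f"


def parse_entries(source, text):
    """Return (timestamp, source, level, message) tuples.

    Lines that do not start with a timestamp are treated as wrapped
    continuations and appended to the previous entry's message.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = [p.strip() for p in line.split("|", 2)]
        ts = None
        if len(parts) == 3:
            try:
                ts = datetime.strptime(parts[0], TS_FORMAT)
            except ValueError:
                ts = None
        if ts is not None:
            entries.append([ts, source, parts[1], parts[2]])
        elif entries:
            entries[-1][3] += " " + line
    return [tuple(e) for e in entries]


def merge(*logs):
    """Merge several parsed logs into one list ordered by timestamp."""
    return sorted((e for log in logs for e in log), key=lambda e: e[0])


# Excerpts copied from the two logs above (join/quit events only).
RESTARTED = """\
12 Feb 2008 09:31:20,065 | LEVEL3 | Storing checkpoint
Member(address=/172.20.1.77:65201, uid=db) joined group
db-172.20.1.81:28346-20080212093120061+0000 at request id 0
12 Feb 2008 09:31:20,111 | LEVEL3 | Storing checkpoint
Member(address=/172.20.1.77:65201, uid=db) quit group
db-172.20.1.81:28346-20080212093120108+0000 at request id 0
"""

ALIVE = """\
12 Feb 2008 09:31:19,765 | LEVEL3 | Storing checkpoint
Member(address=/172.20.1.81:65201, uid=db) joined group
db-172.20.1.77:28346-20080212093119764+0000 at request id 0
"""

timeline = merge(parse_entries("restarted", RESTARTED),
                 parse_entries("alive", ALIVE))

if __name__ == "__main__":
    for ts, source, level, message in timeline:
        print(ts.strftime("%H:%M:%S.%f")[:-3], source.ljust(9), level, message)
```

On these excerpts the merged order is: the surviving controller records the join at 09:31:19,765, the restarted controller records the other member joining at 09:31:20,065, and then records it quitting 46 ms later at 09:31:20,111.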
How could this happen?
Is there anything wrong with my JGroups configuration?
<config>
<UDP mcast_port="%mcast_port%"
mcast_addr="228.8.8.9"
tos="16"
ucast_recv_buf_size="20000000"
ucast_send_buf_size="640000"
mcast_recv_buf_size="25000000"
mcast_send_buf_size="640000"
loopback="false"
discard_incompatible_packets="true"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="false"
ip_ttl="12"
bind_addr="%bind_addr%"
bind_port="%bind_port%"
down_thread="false" up_thread="false"
enable_bundling="true"
diagnostics_port="%diagnostics_port%"/>
<PING timeout="2000"
down_thread="false" up_thread="false" num_initial_members="3"/>
<MERGE2 max_interval="10000"
down_thread="false" up_thread="false" min_interval="5000"/>
<FD timeout="2500" max_tries="3" shun="false"/>
<FD_SOCK down_thread="false" up_thread="false" start_port="%fd_sock_port%"/>
<!--FD_ALL intervall="3000" timeout="10000"/-->
<!--VERIFY_SUSPECT timeout="1500" down_thread="false"/-->
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="100,200,300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<UNICAST timeout="300,600,1200,2400,3600"
down_thread="false" up_thread="false"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<VIEW_SYNC avg_send_interval="60000" down_thread="false" up_thread="false"
/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"
handle_concurrent_startup="true" />
<SEQUENCER down_thread="false" up_thread="false" />
<FC max_credits="2000000" down_thread="false" up_thread="false"
min_threshold="0.10"/>
<!-- FRAG2 frag_size="60000" down_thread="false" up_thread="true"/ -->
<!-- pbcast.STATE_TRANSFER down_thread="false" up_thread="false"/-->
</config>
Any other clues?
Thanks in advance
Pierre