Author: aconway
Date: Tue Mar 27 14:49:47 2012
New Revision: 1305855
URL: http://svn.apache.org/viewvc?rev=1305855&view=rev
Log:
QPID-3603: Update new HA docs with information on rgmanager, more detail about
client connections.
Modified:
qpid/trunk/qpid/doc/book/src/Active-Passive-Cluster.xml
Modified: qpid/trunk/qpid/doc/book/src/Active-Passive-Cluster.xml
URL:
http://svn.apache.org/viewvc/qpid/trunk/qpid/doc/book/src/Active-Passive-Cluster.xml?rev=1305855&r1=1305854&r2=1305855&view=diff
==============================================================================
--- qpid/trunk/qpid/doc/book/src/Active-Passive-Cluster.xml (original)
+++ qpid/trunk/qpid/doc/book/src/Active-Passive-Cluster.xml Tue Mar 27 14:49:47
2012
@@ -27,66 +27,62 @@ under the License.
<section>
<title>Overview</title>
<para>
- This release provides a preview of a new module for High Availability
(HA). The new
- module is not yet complete or ready for production use, it being made
available so
- that users can experiment with the new approach and provide feedback
early in the
- development process. Feedback should go to <ulink
- url="mailto:[email protected]">[email protected]</ulink>.
- </para>
- <para>
- The old cluster module takes an <firstterm>active-active</firstterm>
approach,
- i.e. all the brokers in a cluster are able to handle client requests
- simultaneously. The new HA module takes an
<firstterm>active-passive</firstterm>,
- <firstterm>hot-standby</firstterm> approach.
- </para>
- <para>
- In an active-passive cluster, only one broker, known as the
- <firstterm>primary</firstterm>, is active and serving clients at a time.
The other
- brokers are standing by as <firstterm>backups</firstterm>. Changes on
the primary
- are immediately replicated to all the backups so they are always
up-to-date or
- "hot". If the primary fails, one of the backups is promoted to be the
new
- primary. Clients fail-over to the new primary automatically. If there
are multiple
- backups, the backups also fail-over to become backups of the new primary.
- </para>
- <para>
- The new approach depends on an external <firstterm>cluster resource
- manager</firstterm> to detect failure of the primary and choose the new
primary. The
- first supported resource manager will be <ulink
- url="https://fedorahosted.org/cluster/wiki/RGManager">rgmanager</ulink>,
but it will
- be possible to add integration with other resource managers in the
future. The
- preview version is not integrated with any resource manager, you can use
the
- <command>qpid-ha</command> tool to simulate the actions of a resource
manager or do
- your own integration.
+ This release provides a preview of a new module for High Availability
(HA). The new module is
+ not yet complete or ready for production use. It is being made available so
that users can
+ experiment with the new approach and provide feedback early in the
development process.
+ Feedback should go to <ulink
url="mailto:[email protected]">[email protected]</ulink>.
+ </para>
+ <para>
+ The old cluster module takes an <firstterm>active-active</firstterm>
approach, i.e. all the
+ brokers in a cluster are able to handle client requests simultaneously.
The new HA module
+ takes an <firstterm>active-passive</firstterm>,
<firstterm>hot-standby</firstterm> approach.
+ </para>
+ <para>
+ In an active-passive cluster, only one broker, known as the
<firstterm>primary</firstterm>, is
+ active and serving clients at a time. The other brokers are standing by
as
+ <firstterm>backups</firstterm>. Changes on the primary are immediately
replicated to all the
+ backups so they are always up-to-date or "hot". If the primary fails,
one of the backups is
+ promoted to take over as the new primary. Clients fail-over to the new
primary
+ automatically. If there are multiple backups, the backups also fail-over
to become backups of
+ the new primary. Backup brokers reject connection attempts, to enforce
the requirement that
+ only the primary be active.
+ </para>
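The connection behaviour described above can be sketched in a few lines (an illustration only; `try_connect` and the node names are hypothetical stand-ins, not the real client API):

```python
# Sketch: a client keeps cycling through the known broker addresses until
# one of them (the current primary) accepts. Backups reject connection
# attempts, so only the primary ever accepts.

def connect_to_primary(addresses, try_connect, max_rounds=10):
    """try_connect(addr) returns a connection or raises ConnectionRefusedError."""
    for _ in range(max_rounds):
        for addr in addresses:
            try:
                return try_connect(addr)
            except ConnectionRefusedError:
                continue  # a backup (or dead node): try the next address
    raise RuntimeError("no primary found")

# Simulated cluster: only node2 is currently the primary.
def fake_connect(addr):
    if addr != "node2":
        raise ConnectionRefusedError(addr)
    return ("connected", addr)

print(connect_to_primary(["node1", "node2", "node3"], fake_connect))
# ('connected', 'node2')
```

The same loop also covers fail-over: after a disconnect the client simply re-enters it until the newly promoted primary starts accepting.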
+ <para>
+ This approach depends on an external <firstterm>cluster resource
manager</firstterm> to detect
+ failures and choose the primary. <ulink
+ url="https://fedorahosted.org/cluster/wiki/RGManager">Rgmanager</ulink>
is supported
+ initially, but others may be supported in the future.
</para>
<section>
<title>Why the new approach?</title>
- The new active-passive approach has several advantages compared to the
- existing active-active cluster module.
- <itemizedlist>
- <listitem>
- It does not depend directly on openais or corosync. It does not use
multicast
- which simplifies deployment.
- </listitem>
- <listitem>
- It is more portable: in environments that don't support corosync, it
can be
- integrated with a resource manager available in that environment.
- </listitem>
- <listitem>
- Replication to a <firstterm>disaster recovery</firstterm> site can be
handled as
- simply another node in the cluster, it does not require a separate
replication
- mechanism.
- </listitem>
- <listitem>
- It can take advantage of features provided by the resource manager,
for example
- virtual IP addresses.
- </listitem>
- <listitem>
- Improved performance and scalability due to better use of multiple
CPU s
- </listitem>
- </itemizedlist>
+ <para>
+ The new active-passive approach has several advantages compared to the
+ existing active-active cluster module.
+ <itemizedlist>
+ <listitem>
+ It does not depend directly on openais or corosync. It does not use
multicast,
+ which simplifies deployment.
+ </listitem>
+ <listitem>
+ It is more portable: in environments that don't support corosync,
it can be
+ integrated with a resource manager available in that environment.
+ </listitem>
+ <listitem>
+ Replication to a <firstterm>disaster recovery</firstterm> site can
be handled as
+ simply another node in the cluster; it does not require a separate
replication
+ mechanism.
+ </listitem>
+ <listitem>
+ It can take advantage of features provided by the resource manager,
for example
+ virtual IP addresses.
+ </listitem>
+ <listitem>
+ Improved performance and scalability due to better use of multiple
CPUs.
+ </listitem>
+ </itemizedlist>
+ </para>
</section>
<section>
-
<title>Limitations</title>
<para>
@@ -96,9 +92,9 @@ under the License.
<itemizedlist>
<listitem>
- Transactional changes to queue state are not replicated atomically.
If the
- primary crashes during a transaction, it is possible that the backup
could
- contain only part of the changes introduced by a transaction.
+ Transactional changes to queue state are not replicated atomically.
If the primary crashes
+ during a transaction, it is possible that the backup could contain
only part of the
+ changes introduced by a transaction.
</listitem>
<listitem>
During a fail-over one backup is promoted to primary and any other
backups switch to
@@ -107,14 +103,14 @@ under the License.
switched.
</listitem>
<listitem>
- When used with a persistent store: if the entire cluster fails, there
are no tools
- to help identify the most recent store.
- </listitem>
- <listitem>
Acknowledgments are confirmed to clients before the message has been
dequeued
from replicas or indeed from the local store if that is asynchronous.
</listitem>
<listitem>
+ When used with a persistent store: if the entire cluster fails, there
are no tools to help
+ identify the most recent store.
+ </listitem>
+ <listitem>
A persistent broker must have its store erased before joining an
existing cluster.
In the production version a persistent broker will be able to load
its store and
avoid downloading messages that are in the store from the primary.
@@ -149,18 +145,32 @@ under the License.
</section>
</section>
-
+ <section>
+ <title>Virtual IP Addresses</title>
+ <para>
+ Some resource managers (including <command>rgmanager</command>) support
<firstterm>virtual IP
+ addresses</firstterm>. A virtual IP address is an IP address that can be
relocated to any of
+ the nodes in a cluster. The resource manager associates this address
with the primary node in
+ the cluster, and relocates it to the new primary when there is a
failure. This simplifies
+ configuration as you can publish a single IP address rather than a list.
+ </para>
+ <para>
+ A virtual IP address can be used by clients to connect to the primary,
and also by backup
+ brokers when they connect to the primary. The following sections will
explain how to configure
+ virtual IP addresses for clients or brokers.
+ </para>
+ </section>
<section>
<title>Configuring the Brokers</title>
<para>
- The broker must load the <filename>ha</filename> module, it is loaded by
default
- when you start a broker. The following broker options are available for
the HA module.
+ The broker must load the <filename>ha</filename> module; it is loaded by
default. The
+ following broker options are available for the HA module.
</para>
<table frame="all" id="ha-broker-options">
<title>Options for High Availability Messaging Cluster</title>
<tgroup align="left" cols="2" colsep="1" rowsep="1">
<colspec colname="c1" colwidth="1*"/>
- <colspec colname="c2" colwidth="4*"/>
+ <colspec colname="c2" colwidth="3*"/>
<thead>
<row>
<entry align="center" nameend="c2" namest="c1">
@@ -171,7 +181,7 @@ under the License.
<tbody>
<row>
<entry>
- <command>--ha-cluster <replaceable>yes|no</replaceable></command>
+ <literal>--ha-cluster <replaceable>yes|no</replaceable></literal>
</entry>
<entry>
Set to "yes" to have the broker join a cluster.
@@ -179,7 +189,7 @@ under the License.
</row>
<row>
<entry>
- <command>--ha-brokers <replaceable>URL</replaceable></command>
+ <literal>--ha-brokers <replaceable>URL</replaceable></literal>
</entry>
<entry>
URL used by brokers to connect to each other. The URL lists the
addresses of
@@ -201,19 +211,19 @@ under the License.
</entry>
</row>
<row>
- <entry> <command>--ha-public-brokers
<replaceable>URL</replaceable></command> </entry>
+ <entry> <literal>--ha-public-brokers
<replaceable>URL</replaceable></literal> </entry>
<entry>
URL used by clients to connect to the brokers in the same format
as
- <command>--ha-brokers</command> above. Use this option if you
want client
+ <literal>--ha-brokers</literal> above. Use this option if you
want client
traffic on a different network from broker replication traffic.
If this
option is not set, clients will use the same URL as brokers.
</entry>
</row>
<row>
<entry>
- <para><command>--ha-username
<replaceable>USER</replaceable></command></para>
- <para><command>--ha-password
<replaceable>PASS</replaceable></command></para>
- <para><command>--ha-mechanism
<replaceable>MECH</replaceable></command></para>
+ <para><literal>--ha-username
<replaceable>USER</replaceable></literal></para>
+ <para><literal>--ha-password
<replaceable>PASS</replaceable></literal></para>
+ <para><literal>--ha-mechanism
<replaceable>MECH</replaceable></literal></para>
</entry>
<entry>
Brokers use <replaceable>USER</replaceable>,
@@ -225,16 +235,15 @@ under the License.
</tgroup>
</table>
<para>
- To configure a cluster you must set at least
<command>ha-cluster</command> and <command>ha-brokers</command>
+ To configure a cluster you must set at least
<literal>ha-cluster</literal> and <literal>ha-brokers</literal>.
</para>
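For illustration, the two mandatory options could be combined on a broker command line like this (a sketch that only assembles the argument vector; the node names and the `--ha-cluster=yes` spelling are assumptions):

```python
def qpidd_ha_argv(ha_brokers_url, extra=()):
    """Build a qpidd command line carrying the two mandatory HA options
    from the table above: --ha-cluster and --ha-brokers."""
    return ["qpidd",
            "--ha-cluster=yes",
            "--ha-brokers=" + ha_brokers_url,
            *extra]

argv = qpidd_ha_argv("node1,node2,node3")
print(" ".join(argv))
# qpidd --ha-cluster=yes --ha-brokers=node1,node2,node3
```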
</section>
-
<section>
<title>Creating replicated queues and exchanges</title>
<para>
To create a replicated queue or exchange, pass the argument
- <command>qpid.replicate</command> when creating the queue or exchange.
It should
+ <literal>qpid.replicate</literal> when creating the queue or exchange.
It should
have one of the following three values:
<itemizedlist>
<listitem>
@@ -249,113 +258,313 @@ under the License.
</listitem>
</itemizedlist>
</para>
- Bindings are automatically replicated if the queue and exchange being
bound both have
- replication argument of <command>all</command> or
<command>confguration</command>, they are
- not replicated otherwise.
+ <para>
+ Bindings are automatically replicated if the queue and exchange being
bound both have
+ a replication argument of <literal>all</literal> or
<literal>configuration</literal>; they are
+ not replicated otherwise.
+ </para>
+ <para>
+ You can create replicated queues and exchanges with the
<command>qpid-config</command>
+ management tool like this:
+ <programlisting>
+ qpid-config add queue myqueue --replicate all
+ </programlisting>
+ </para>
+ <para>
+ To create replicated queues and exchanges via the client API, add a
<literal>node</literal> entry to the address like this:
+ <programlisting>
+
"myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}"
+ </programlisting>
+ </para>
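A small helper can assemble such an address for a given replication level (a sketch; the value check assumes the standard levels none, configuration and all):

```python
def replicated_address(queue, level):
    """Build an address string that creates `queue` with the
    qpid.replicate argument, as in the example above."""
    if level not in ("none", "configuration", "all"):
        raise ValueError("bad qpid.replicate value: %s" % level)
    return (queue + ";{create:always,node:{x-declare:"
            "{arguments:{'qpid.replicate':" + level + "}}}}")

print(replicated_address("myqueue", "all"))
# myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}
```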
+ </section>
- You can create replicated queues and exchanges with the
<command>qpid-config</command>
- management tool like this:
- <programlisting>
- qpid-config add queue myqueue --replicate all
- </programlisting>
+ <section>
+ <title>Client Connection and Fail-over</title>
+ <para>
+ Clients can only connect to the primary broker. Backup brokers
automatically reject any
+ connection attempt by a client.
+ </para>
+ <para>
+ Clients are configured with the URL for the cluster. There are two
possibilities
+ <itemizedlist>
+ <listitem> The URL contains multiple addresses, one for each broker in
the cluster.</listitem>
+ <listitem>
+ The URL contains a single <firstterm>virtual IP address</firstterm>
that is assigned to the primary broker by the resource manager.
+ <footnote><para>Only if the resource manager supports virtual IP
addresses</para></footnote>
+ </listitem>
+ </itemizedlist>
+ In the first case, clients will repeatedly re-try each address in the
URL until they
+ successfully connect to the primary. In the second case, the resource
manager will assign the
+ virtual IP address to the primary broker, so clients only need to re-try
on a single address.
+ </para>
+ <para>
+ When the primary broker fails, all clients are disconnected. They go back
to re-trying until
+ they connect to the new primary. Any messages that have been sent by
the client, but not yet
acknowledged as delivered, are resent. Similarly, messages that have been
sent by the broker,
+ but not acknowledged, are re-queued.
+ </para>
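The client-side resend behaviour can be sketched as a sender that keeps every message until it is acknowledged and replays the rest after reconnecting (an illustration only, not the real qpid client internals):

```python
class ResendingSender:
    """Sketch: sent messages are kept until the broker acknowledges them;
    any still-unacknowledged messages are replayed after a fail-over."""

    def __init__(self, transport):
        self.transport = transport   # callable that "sends" a message
        self.unacked = {}            # message id -> message, in send order
        self.next_id = 0

    def send(self, msg):
        msg_id = self.next_id
        self.next_id += 1
        self.unacked[msg_id] = msg
        self.transport(msg)
        return msg_id

    def acknowledged(self, msg_id):
        # Broker confirmed delivery: safe to forget the message.
        self.unacked.pop(msg_id, None)

    def on_reconnect(self):
        # After fail-over, resend everything not yet acknowledged.
        for msg in self.unacked.values():
            self.transport(msg)

sent = []
sender = ResendingSender(sent.append)
a = sender.send("a")
sender.send("b")
sender.acknowledged(a)   # "a" was confirmed before the failure
sender.on_reconnect()    # only "b" is resent
print(sent)              # ['a', 'b', 'b']
```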
+ <para>
+ Suppose your cluster has 3 nodes: <literal>node1</literal>,
<literal>node2</literal>
+ and <literal>node3</literal> all using the default AMQP port. To connect
a client you
+ need to specify the address(es) and set the <literal>reconnect</literal>
property to
+ <literal>true</literal>. Here's how to connect each type of client:
+ </para>
+ <section>
+ <title>C++ clients</title>
+ <para>
+ With the C++ client, you specify multiple cluster addresses in a single
URL
+ <footnote>
+ <para>
+ The full grammar for the URL is:
+ <programlisting>
+ url = ["amqp:"][ user ["/" password] "@" ] addr ("," addr)*
+ addr = tcp_addr / rdma_addr / ssl_addr / ...
+ tcp_addr = ["tcp:"] host [":" port]
+ rdma_addr = "rdma:" host [":" port]
+ ssl_addr = "ssl:" host [":" port]
+ </programlisting>
+ </para>
+ </footnote>. You also
+ need to specify the connection option <literal>reconnect</literal> to
be true. For
+ example:
+ <programlisting>
+ qpid::messaging::Connection
c("node1,node2,node3","{reconnect:true}");
+ </programlisting>
+ </para>
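For illustration, the footnote's URL grammar can be parsed in a few lines (a simplified sketch; the default port 5672 is an assumption based on the default AMQP port, and no validation is done on hosts or ports):

```python
def parse_cluster_url(url):
    """Parse a C++ client URL per the grammar in the footnote above:
    optional "amqp:" prefix, optional user[/password]@ part, then a
    comma-separated list of tcp/rdma/ssl addresses."""
    if url.startswith("amqp:"):
        url = url[len("amqp:"):]
    userinfo = None
    if "@" in url:
        userinfo, url = url.split("@", 1)
    addrs = []
    for addr in url.split(","):
        proto = "tcp"                      # tcp is the default protocol
        for p in ("tcp", "rdma", "ssl"):
            if addr.startswith(p + ":"):
                proto, addr = p, addr[len(p) + 1:]
                break
        host, _, port = addr.partition(":")
        addrs.append((proto, host, int(port) if port else 5672))
    return userinfo, addrs

print(parse_cluster_url("amqp:node1,tcp:node2:5673,ssl:node3"))
# (None, [('tcp', 'node1', 5672), ('tcp', 'node2', 5673), ('ssl', 'node3', 5672)])
```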
+ </section>
+ <section>
+ <title>Python clients</title>
+ <para>
+ With the Python client, you specify <literal>reconnect=True</literal>
and a list of
+ <replaceable>host:port</replaceable> addresses as
<literal>reconnect_urls</literal>
+ when calling <literal>Connection.establish</literal> or
<literal>Connection.open</literal>
+ <programlisting>
+ connection = qpid.messaging.Connection.establish("node1",
reconnect=True, reconnect_urls=["node1", "node2", "node3"])
+ </programlisting>
+ </para>
+ </section>
+ <section>
+ <title>Java JMS Clients</title>
+ <para>
+ In Java JMS clients, client fail-over is handled automatically if it is
enabled in the
+ connection. You can configure a connection to use fail-over using the
+ <command>failover</command> property:
+ </para>
- To create replicated queues and exchangs via the client API, add a
<command>node</command> entry to the address like this:
- <programlisting>
-
"myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}"
- </programlisting>
- </section>
+ <screen>
+ connectionfactory.qpidConnectionfactory =
amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672'&amp;failover='failover_exchange'
+ </screen>
+ <para>
+ This property can take three values:
+ </para>
+ <variablelist>
+ <title>Fail-over Modes</title>
+ <varlistentry>
+ <term>failover_exchange</term>
+ <listitem>
+ <para>
+ If the connection fails, fail over to any other broker in the
cluster.
+ </para>
+
+ </listitem>
+
+ </varlistentry>
+ <varlistentry>
+ <term>roundrobin</term>
+ <listitem>
+ <para>
+ If the connection fails, fail over to one of the brokers
specified in the <command>brokerlist</command>.
+ </para>
+
+ </listitem>
+
+ </varlistentry>
+ <varlistentry>
+ <term>singlebroker</term>
+ <listitem>
+ <para>
+ Fail-over is not supported; the connection is to a single broker
only.
+ </para>
+
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ <para>
+ In a Connection URL, heartbeat is set using the
<command>idle_timeout</command> property, which is an integer corresponding to
the heartbeat period in seconds. For instance, the following line from a JNDI
properties file sets the heartbeat timeout to 3 seconds:
+ </para>
+ <screen>
+ connectionfactory.qpidConnectionfactory =
amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672',idle_timeout=3
+ </screen>
+
+ </section>
+ </section>
<section>
- <title>Client Fail-over</title>
+ <title>The Cluster Resource Manager</title>
<para>
- Clients can only connect to the single primary broker. All other brokers
in the
- cluster are backups, and they automatically reject any attempt by a
client to
- connect.
+ Broker fail-over is managed by a <firstterm>cluster resource
manager</firstterm>. An
+ integration with <ulink
+ url="https://fedorahosted.org/cluster/wiki/RGManager">rgmanager</ulink>
is provided, but it is
+ possible to integrate with other resource managers.
</para>
<para>
- Clients are configured with the addreses of all of the brokers in the
cluster.
- <footnote>
- <para>
- If the resource manager supports virtual IP addresses then the clients
- can be configured with a single virtual IP address.
- </para>
- </footnote>
- When the client tries to connect initially, it will try all of its
addresses until it
- successfully connects to the primary. If the primary fails, clients will
try to
- try to re-connect to all the known brokers until they find the new
primary.
+ The resource manager is responsible for starting an
appropriately-configured broker on each
+ node in the cluster. The resource manager then
<firstterm>promotes</firstterm> one of the
+ brokers to be the primary. The other brokers connect to the primary as
backups, using the URL
+ provided in the <literal>ha-brokers</literal> configuration option.
</para>
<para>
- Suppose your cluster has 3 nodes: <command>node1</command>,
<command>node2</command> and <command>node3</command> all using the default
AMQP port.
+ Once connected, the backup brokers synchronize their state with the
primary. When a backup is
+ synchronized, or "hot", it is ready to take over if the primary fails.
Backup brokers
+ continually receive updates from the primary in order to stay
synchronized.
</para>
<para>
- With the C++ client, you specify all the cluster addresses in a single
URL, for example:
- <programlisting>
- qpid::messaging::Connection c("node1:node2:node3");
- </programlisting>
+ If the primary fails, backup brokers go into fail-over mode. The
resource manager must detect
+ the failure and promote one of the backups to be the new primary. The
other backups connect
+ to the new primary and synchronize their state so they can be backups
for it.
</para>
<para>
- With the python client, you specify <command>reconnect=True</command>
and a list of <replaceable>host:port</replaceable> addresses as
<command>reconnect_urls</command> when calling <command>establish</command> or
<command>open</command>
- <programlisting>
- connection = qpid.messaging.Connection.establish("node1",
reconnect=True, "reconnect_urls=["node1", "node2", "node3"])
- </programlisting>
+ The resource manager is also responsible for protecting the cluster from
+ <firstterm>split-brain</firstterm> conditions resulting from a network
partition.
+ A network partition divides a cluster into two sub-groups which cannot
see each other.
+ Usually a <firstterm>quorum</firstterm> voting algorithm is used that
disables
+ nodes in the inquorate sub-group.
</para>
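The quorum rule is essentially a strict-majority test over the cluster's votes; a minimal sketch (real managers such as cman add per-node vote weights and tie-breakers):

```python
def is_quorate(votes_in_partition, total_votes):
    """A partition keeps running only if it holds a strict majority of
    the cluster's votes; nodes in the inquorate side are disabled, so
    two primaries can never coexist after a network partition."""
    return votes_in_partition > total_votes // 2

# A 3-node cluster split 2/1: only the 2-node side stays active.
print(is_quorate(2, 3), is_quorate(1, 3))  # True False
```

Note that an even split (e.g. 2/2 in a 4-node cluster) leaves neither side quorate, which is why odd node counts are usually preferred.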
</section>
-
<section>
- <title>Broker fail-over</title>
+ <title>Configuring <command>rgmanager</command> as resource manager</title>
<para>
- Broker fail-over is managed by a <firstterm>cluster resource
- manager</firstterm>. The initial preview version of HA is not integrated
with a
- resource manager, the production version will be integrated with <ulink
- url="https://fedorahosted.org/cluster/wiki/RGManager">rgmanager</ulink>
and it may
- be integrated with other resource managers in the future.
+ This section assumes that you are already familiar with setting up and
configuring
+ clustered services using <command>cman</command> and
<command>rgmanager</command>. It
+ will show you how to configure an active-passive, hot-standby
<command>qpidd</command>
+ HA cluster.
</para>
<para>
- The resource manager is responsible for ensuring that there is exactly
one broker
- is acting as primary at all times. It selects the initial primary broker
when the
- cluster is started, detects failure of the primary, and chooses the
backup to
- promote as the new primary.
+ Here is an example <literal>cluster.conf</literal> file for a cluster of
3 nodes named
+ mrg32, mrg34 and mrg35. We will go through the configuration
step-by-step.
</para>
+ <programlisting>
+<![CDATA[
+<?xml version="1.0"?>
+<cluster alias="qpid-hot-standby" config_version="4" name="qpid-hot-standby">
+ <clusternodes>
+ <clusternode name="mrg32" nodeid="1">
+ <fence/>
+ </clusternode>
+ <clusternode name="mrg34" nodeid="2">
+ <fence/>
+ </clusternode>
+ <clusternode name="mrg35" nodeid="3">
+ <fence/>
+ </clusternode>
+ </clusternodes>
+ <cman/>
+ <rm>
+ <failoverdomains>
+ <failoverdomain name="mrg32-domain" restricted="1">
+ <failoverdomainnode name="mrg32"/>
+ </failoverdomain>
+ <failoverdomain name="mrg34-domain" restricted="1">
+ <failoverdomainnode name="mrg34"/>
+ </failoverdomain>
+ <failoverdomain name="mrg35-domain" restricted="1">
+ <failoverdomainnode name="mrg35"/>
+ </failoverdomain>
+ </failoverdomains>
+ <resources>
+ <script file="/etc/init.d/qpidd" name="qpidd"/>
+ <script file="/etc/init.d/qpidd-primary" name="qpidd-primary"/>
+ <ip address="20.0.10.200" monitor_link="1"/>
+ <ip address="20.0.20.200" monitor_link="1"/>
+ </resources>
+
+ <!-- There is a qpidd service on each node, it should be restarted if it
fails. -->
+ <service name="mrg32-qpidd-service" domain="mrg32-domain"
recovery="restart">
+ <script ref="qpidd"/>
+ </service>
+ <service name="mrg34-qpidd-service" domain="mrg34-domain"
recovery="restart">
+ <script ref="qpidd"/>
+ </service>
+ <service name="mrg35-qpidd-service" domain="mrg35-domain"
recovery="restart">
+ <script ref="qpidd"/>
+ </service>
+
+ <!-- There should always be a single qpidd-primary service, it can run on
any node. -->
+ <service name="qpidd-primary-service" autostart="1" exclusive="0"
recovery="relocate">
+ <script ref="qpidd-primary"/>
+ <ip ref="20.0.10.200"/>
+ <ip ref="20.0.20.200"/>
+ </service>
+ </rm>
+ <fencedevices/>
+ <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
+</cluster>
+]]>
+ </programlisting>
<para>
- You can simulate the actions of a resource manager, or indeed do your own
- integration with a resource manager using the <command>qpid-ha</command>
tool. The
- command
- <programlisting>
- qpid-ha promote -b
<replaceable>host</replaceable>:<replaceable>port</replaceable>
- </programlisting>
- will promote the broker listening on
- <replaceable>host</replaceable>:<replaceable>port</replaceable> to be
the primary.
- You should only promote a broker to primary when there is no other
primary in the
- cluster. The brokers will not detect multiple primaries, they rely on
the resource
- manager to do that.
- </para>
- <para>
- A clustered broker always starts initially in
<firstterm>discovery</firstterm>
- mode. It uses the addresses configured in the
<command>ha-brokers</command>
- configuration option and tries to connect to each in turn until it finds
to the
- primary. The resource manager is responsible for choosing on of the
backups to
- promote as the initial primary.
- </para>
- <para>
- If the primary fails, all the backups are disconnected and return to
discovery mode.
- The resource manager chooses one to promote as the new primary. The
other backups
- will eventually discover the new primary and reconnect.
+ There is a <literal>failoverdomain</literal> for each node containing
just that
+ one node. This lets us stipulate that the qpidd service should always
run on all
+ nodes.
+ </para>
+ <para>
+ The <literal>resources</literal> section defines the usual
initialization script to
+ start the <command>qpidd</command> service.
It also
+ defines the <command>qpidd-primary</command> script. Starting this script
does not
+ actually start a new service, rather it promotes the existing
+ <command>qpidd</command> broker to primary status.
+ </para>
+ <para>
+ The <literal>resources</literal> section also defines a pair of virtual
IP
+ addresses on different sub-nets. One will be used for broker-to-broker
+ communication, the other for client-to-broker.
+ </para>
+ <para>
+ The <literal>service</literal> section defines 3
<command>qpidd</command> services,
+ one for each node. Each service is in a restricted fail-over domain
containing just
+ that node, and has the <literal>restart</literal> recovery policy. The
effect of
+ this is that rgmanager will run <command>qpidd</command> on each node,
restarting if
+ it fails.
+ </para>
+ <para>
+ There is a single <literal>qpidd-primary-service</literal> running the
+ <command>qpidd-primary</command> script which is not restricted to a
domain and has
+ the <literal>relocate</literal> recovery policy. This means rgmanager
will start
+ <command>qpidd-primary</command> on one of the nodes when the cluster
starts and
+ will relocate it to another node if the original node fails. Running the
+ <literal>qpidd-primary</literal> script does not actually start a new
process,
+ rather it promotes the existing broker to become the primary.
</para>
</section>
+
<section>
<title>Broker Administration</title>
<para>
- You can connect to a backup broker with the administrative tool
- <command>qpid-ha</command>. You can also connect with the tools
- <command>qpid-config</command>, <command>qpid-route</command> and
- <command>qpid-stat</command> if you pass the flag
<command>--ha-admin</command> on the
- command line. If you do connect to a backup you should not modify any
of the
- replicated queues, as this will disrupt the replication and may result in
- message loss.
+ Normally, clients are not allowed to connect to a backup broker. However,
management tools are
+ allowed to connect to backup brokers. If you use these tools you
<emphasis>must
+ not</emphasis> add or remove messages from replicated queues, or delete
replicated queues or
+ exchanges, as this will corrupt the replication process and may cause
message loss.
+ </para>
+ <para>
+ <command>qpid-ha</command> allows you to view and change HA
configuration settings.
+ </para>
+ <para>
+ The tools <command>qpid-config</command>, <command>qpid-route</command>
and
+ <command>qpid-stat</command> will connect to a backup if you pass the
flag <command>--ha-admin</command> on the
+ command line.
+ </para>
+ <para>
+ To promote a broker to primary use the following command:
+ <programlisting>
+ qpid-ha promote -b
<replaceable>host</replaceable>:<replaceable>port</replaceable>
+ </programlisting>
+ The resource manager must ensure that it does not promote a broker to
primary when
+ there is already a primary in the cluster.
</para>
</section>
</section>
-<!-- LocalWords: scalability rgmanager multicast RGManager mailto LVQ
+
+<!-- LocalWords: scalability rgmanager multicast RGManager mailto LVQ qpidd
IP dequeued Transactional username
-->