http://git-wip-us.apache.org/repos/asf/qpid-site/blob/c3ab60f6/input/releases/qpid-cpp-1.39.0/cpp-broker/book/chapter-Managing-CPP-Broker.html.in ---------------------------------------------------------------------- diff --git a/input/releases/qpid-cpp-1.39.0/cpp-broker/book/chapter-Managing-CPP-Broker.html.in b/input/releases/qpid-cpp-1.39.0/cpp-broker/book/chapter-Managing-CPP-Broker.html.in new file mode 100644 index 0000000..705abfd --- /dev/null +++ b/input/releases/qpid-cpp-1.39.0/cpp-broker/book/chapter-Managing-CPP-Broker.html.in @@ -0,0 +1,459 @@ +<div class="docbook"><div class="navheader"><table summary="Navigation header" width="100%"><tr><th align="center" colspan="3">Chapter 2.  + Managing the AMQP Messaging Broker + </th></tr><tr><td align="left" width="20%"><a accesskey="p" href="ha-queue-replication.html">Prev</a> </td><th align="center" width="60%"> </th><td align="right" width="20%"> <a accesskey="n" href="ch02s02.html">Next</a></td></tr></table><hr /></div><div class="chapter"><div class="titlepage"><div><div><h1 class="title"><a id="chapter-Managing-CPP-Broker"></a>Chapter 2.  + Managing the AMQP Messaging Broker + </h1></div></div></div><div class="toc"><p><strong>Table of Contents</strong></p><dl class="toc"><dt><span class="section"><a href="chapter-Managing-CPP-Broker.html#section-Managing-CPP-Broker">2.1. Managing the C++ Broker </a></span></dt><dd><dl><dt><span class="section"><a href="chapter-Managing-CPP-Broker.html#MgmtC-2B-2B-Usingqpidconfig">2.1.1. + Using qpid-config + </a></span></dt><dt><span class="section"><a href="chapter-Managing-CPP-Broker.html#MgmtC-2B-2B-Usingqpidroute">2.1.2. + Using qpid-route + </a></span></dt><dt><span class="section"><a href="chapter-Managing-CPP-Broker.html#MgmtC-2B-2B-Usingqpidtool">2.1.3. + Using qpid-tool + </a></span></dt><dt><span class="section"><a href="chapter-Managing-CPP-Broker.html#MgmtC-2B-2B-Usingqpidprintevents">2.1.4. 
+ Using + qpid-printevents + </a></span></dt><dt><span class="section"><a href="chapter-Managing-CPP-Broker.html#idm140333888607376">2.1.5. Using qpid-ha</a></span></dt></dl></dd><dt><span class="section"><a href="ch02s02.html">2.2. + Qpid Management Framework + </a></span></dt><dd><dl><dt><span class="section"><a href="ch02s02.html#QpidManagementFramework-WhatIsQMF">2.2.1. + What Is QMF + </a></span></dt><dt><span class="section"><a href="ch02s02.html#QpidManagementFramework-GettingStartedwithQMF">2.2.2. + Getting + Started with QMF + </a></span></dt><dt><span class="section"><a href="ch02s02.html#QpidManagementFramework-QMFConcepts">2.2.3. + QMF Concepts + </a></span></dt><dt><span class="section"><a href="ch02s02.html#QpidManagementFramework-TheQMFProtocol">2.2.4. + The QMF + Protocol + </a></span></dt><dt><span class="section"><a href="ch02s02.html#QpidManagementFramework-HowtoWriteaQMFConsole">2.2.5. + How + to Write a QMF Console + </a></span></dt><dt><span class="section"><a href="ch02s02.html#QpidManagementFramework-HowtoWriteaQMFAgent">2.2.6. + How to + Write a QMF Agent + </a></span></dt></dl></dd><dt><span class="section"><a href="ch02s03.html">2.3. + QMF Python Console Tutorial + </a></span></dt><dd><dl><dt><span class="section"><a href="ch02s03.html#QMFPythonConsoleTutorial-PrerequisiteInstallQpidMessaging">2.3.1. + Prerequisite + - Install Qpid Messaging + </a></span></dt><dt><span class="section"><a href="ch02s03.html#QMFPythonConsoleTutorial-SynchronousConsoleOperations">2.3.2. + Synchronous + Console Operations + </a></span></dt><dt><span class="section"><a href="ch02s03.html#QMFPythonConsoleTutorial-AsynchronousConsoleOperations">2.3.3. + Asynchronous + Console Operations + </a></span></dt><dt><span class="section"><a href="ch02s03.html#QMFPythonConsoleTutorial-DiscoveringwhatKindsofObjectsareAvailable">2.3.4. 
+ Discovering what Kinds of Objects are Available + </a></span></dt></dl></dd></dl></div><div class="section"><div class="titlepage"><div><div><h2 class="title"><a id="section-Managing-CPP-Broker"></a>2.1.  Managing the C++ Broker </h2></div></div></div><p> + There are several ways to interact with the C++ broker. The + command-line tools + include: + </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>qpid-route - used to configure federation (a set of federated + brokers) + </p></li><li class="listitem"><p>qpid-config - used to configure queues, exchanges, and bindings, + and to list them + </p></li><li class="listitem"><p>qpid-tool - used to view management information and statistics + and to invoke management actions on the broker + </p></li><li class="listitem"><p>qpid-printevents - used to receive and print QMF events + </p></li><li class="listitem"><p>qpid-ha - used to interact with the High Availability module + </p></li></ul></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="MgmtC-2B-2B-Usingqpidconfig"></a>2.1.1.  + Using qpid-config + </h3></div></div></div><p> + This utility can be used to create queues, exchanges, and bindings, + both durable and transient. Always check for the latest options by + running the command with --help. 
+ </p><pre class="programlisting"> +$ qpid-config --help +Usage: qpid-config [OPTIONS] + qpid-config [OPTIONS] exchanges [filter-string] + qpid-config [OPTIONS] queues [filter-string] + qpid-config [OPTIONS] add exchange <type> <name> [AddExchangeOptions] + qpid-config [OPTIONS] del exchange <name> + qpid-config [OPTIONS] add queue <name> [AddQueueOptions] + qpid-config [OPTIONS] del queue <name> + qpid-config [OPTIONS] bind <exchange-name> <queue-name> [binding-key] + qpid-config [OPTIONS] unbind <exchange-name> <queue-name> [binding-key] + +Options: + -b [ --bindings ] Show bindings in queue or exchange list + -a [ --broker-addr ] Address (localhost) Address of qpidd broker + broker-addr is in the form: [username/password@] hostname | ip-address [:<port>] + ex: localhost, 10.1.1.7:10000, broker-host:10000, guest/guest@localhost + +Add Queue Options: + --durable Queue is durable + --file-count N (8) Number of files in queue's persistence journal + --file-size N (24) File size in pages (64Kib/page) + --max-queue-size N Maximum in-memory queue size as bytes + --max-queue-count N Maximum in-memory queue size as a number of messages + --limit-policy [none | reject | flow-to-disk | ring | ring-strict] + Action taken when queue limit is reached: + none (default) - Use broker's default policy + reject - Reject enqueued messages + flow-to-disk - Page messages to disk + ring - Replace oldest unacquired message with new + ring-strict - Replace oldest message, reject if oldest is acquired + --order [fifo | lvq | lvq-no-browse] + Set queue ordering policy: + fifo (default) - First in, first out + lvq - Last Value Queue ordering, allows queue browsing + lvq-no-browse - Last Value Queue ordering, browsing clients may lose data + +Add Exchange Options: + --durable Exchange is durable + --sequence Exchange will insert a 'qpid.msg_sequence' field in the message header + with a value that increments for each message forwarded. 
+ --ive Exchange will behave as an 'initial-value-exchange', keeping a reference + to the last message forwarded and enqueuing that message to newly bound + queues. +</pre><p> + Get the summary page + </p><pre class="programlisting"> +$ qpid-config +Total Exchanges: 6 + topic: 2 + headers: 1 + fanout: 1 + direct: 2 + Total Queues: 7 + durable: 0 + non-durable: 7 +</pre><p> + List the queues + </p><pre class="programlisting"> +$ qpid-config queues +Queue Name Attributes +================================================================= +pub_start +pub_done +sub_ready +sub_done +perftest0 --durable +reply-dhcp-100-18-254.bos.redhat.com.20713 auto-del excl +topic-dhcp-100-18-254.bos.redhat.com.20713 auto-del excl + +</pre><p> + List the exchanges with bindings + </p><pre class="programlisting"> +$ ./qpid-config -b exchanges +Exchange '' (direct) + bind pub_start => pub_start + bind pub_done => pub_done + bind sub_ready => sub_ready + bind sub_done => sub_done + bind perftest0 => perftest0 + bind mgmt-3206ff16-fb29-4a30-82ea-e76f50dd7d15 => mgmt-3206ff16-fb29-4a30-82ea-e76f50dd7d15 + bind repl-3206ff16-fb29-4a30-82ea-e76f50dd7d15 => repl-3206ff16-fb29-4a30-82ea-e76f50dd7d15 +Exchange 'amq.direct' (direct) + bind repl-3206ff16-fb29-4a30-82ea-e76f50dd7d15 => repl-3206ff16-fb29-4a30-82ea-e76f50dd7d15 + bind repl-df06c7a6-4ce7-426a-9f66-da91a2a6a837 => repl-df06c7a6-4ce7-426a-9f66-da91a2a6a837 + bind repl-c55915c2-2fda-43ee-9410-b1c1cbb3e4ae => repl-c55915c2-2fda-43ee-9410-b1c1cbb3e4ae +Exchange 'amq.topic' (topic) +Exchange 'amq.fanout' (fanout) +Exchange 'amq.match' (headers) +Exchange 'qpid.management' (topic) + bind mgmt.# => mgmt-3206ff16-fb29-4a30-82ea-e76f50dd7d15 +</pre></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="MgmtC-2B-2B-Usingqpidroute"></a>2.1.2.  + Using qpid-route + </h3></div></div></div><p> + This utility is used to create federated networks of brokers. This + allows you to forward messages between brokers in a network. 
+ Messages can be routed statically (using "qpid-route route add") + where the bindings that control message forwarding are supplied + in the route. Message routing can also be dynamic (using + "qpid-route dynamic add") where the messages are automatically + forwarded to clients based on their bindings to the local broker. + </p><pre class="programlisting"> +$ qpid-route +Usage: qpid-route [OPTIONS] dynamic add <dest-broker> <src-broker> <exchange> [tag] [exclude-list] + qpid-route [OPTIONS] dynamic del <dest-broker> <src-broker> <exchange> + + qpid-route [OPTIONS] route add <dest-broker> <src-broker> <exchange> <routing-key> [tag] [exclude-list] + qpid-route [OPTIONS] route del <dest-broker> <src-broker> <exchange> <routing-key> + qpid-route [OPTIONS] queue add <dest-broker> <src-broker> <exchange> <queue> + qpid-route [OPTIONS] queue del <dest-broker> <src-broker> <exchange> <queue> + qpid-route [OPTIONS] route list [<dest-broker>] + qpid-route [OPTIONS] route flush [<dest-broker>] + qpid-route [OPTIONS] route map [<broker>] + + qpid-route [OPTIONS] link add <dest-broker> <src-broker> + qpid-route [OPTIONS] link del <dest-broker> <src-broker> + qpid-route [OPTIONS] link list [<dest-broker>] + +Options: + -v [ --verbose ] Verbose output + -q [ --quiet ] Quiet output, don't print duplicate warnings + -d [ --durable ] Added configuration shall be durable + -e [ --del-empty-link ] Delete link after deleting last route on the link + -s [ --src-local ] Make connection to source broker (push route) + -t <transport> [ --transport <transport>] + Specify transport to use for links, defaults to tcp + + dest-broker and src-broker are in the form: [username/password@] hostname | ip-address [:<port>] + ex: localhost, 10.1.1.7:10000, broker-host:10000, guest/guest@localhost +</pre><p> + A few examples: + </p><pre class="programlisting"> +qpid-route dynamic add host1 host2 fed.topic +qpid-route dynamic add host2 host1 fed.topic + +qpid-route -v route add host1 host2 hub1.topic 
hub2.topic.stock.buy +qpid-route -v route add host1 host2 hub1.topic hub2.topic.stock.sell +qpid-route -v route add host1 host2 hub1.topic 'hub2.topic.stock.#' +qpid-route -v route add host1 host2 hub1.topic 'hub2.#' +qpid-route -v route add host1 host2 hub1.topic 'hub2.topic.#' +qpid-route -v route add host1 host2 hub1.topic 'hub2.global.#' +</pre><p> + The link map feature can be used to display the entire federated + network configuration by supplying a single broker as an entry + point: + </p><pre class="programlisting"> +$ qpid-route route map localhost:10001 + +Finding Linked Brokers: + localhost:10001... Ok + localhost:10002... Ok + localhost:10003... Ok + localhost:10004... Ok + localhost:10005... Ok + localhost:10006... Ok + localhost:10007... Ok + localhost:10008... Ok + +Dynamic Routes: + + Exchange fed.topic: + localhost:10002 <=> localhost:10001 + localhost:10003 <=> localhost:10002 + localhost:10004 <=> localhost:10002 + localhost:10005 <=> localhost:10002 + localhost:10006 <=> localhost:10005 + localhost:10007 <=> localhost:10006 + localhost:10008 <=> localhost:10006 + + Exchange fed.direct: + localhost:10002 => localhost:10001 + localhost:10004 => localhost:10003 + localhost:10003 => localhost:10002 + localhost:10001 => localhost:10004 + +Static Routes: + + localhost:10003(ex=amq.direct) <= localhost:10005(ex=amq.direct) key=rkey + localhost:10003(ex=amq.direct) <= localhost:10005(ex=amq.direct) key=rkey2 +</pre></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="MgmtC-2B-2B-Usingqpidtool"></a>2.1.3.  + Using qpid-tool + </h3></div></div></div><p> + This utility provides a telnet-style interface for + viewing and listing all statistics and invoking + any management method. A sample session is shown below. The best way to learn it is to experiment, + and to mail the list if you have + questions or want features added. 
+ </p><pre class="programlisting"> +qpid: +qpid: help +Management Tool for QPID +Commands: + list - Print summary of existing objects by class + list <className> - Print list of objects of the specified class + list <className> all - Print contents of all objects of specified class + list <className> active - Print contents of all non-deleted objects of specified class + list <list-of-IDs> - Print contents of one or more objects (infer className) + list <className> <list-of-IDs> - Print contents of one or more objects + list is space-separated, ranges may be specified (i.e. 1004-1010) + call <ID> <methodName> <args> - Invoke a method on an object + schema - Print summary of object classes seen on the target + schema <className> - Print details of an object class + set time-format short - Select short timestamp format (default) + set time-format long - Select long timestamp format + quit or ^D - Exit the program +qpid: list +Management Object Types: + ObjectType Active Deleted + ================================ + qpid.binding 21 0 + qpid.broker 1 0 + qpid.client 1 0 + qpid.exchange 6 0 + qpid.queue 13 0 + qpid.session 4 0 + qpid.system 1 0 + qpid.vhost 1 0 +qpid: list qpid.system +Objects of type qpid.system + ID Created Destroyed Index + ================================== + 1000 21:00:02 - host +qpid: list 1000 +Object of type qpid.system: (last sample time: 21:26:02) + Type Element 1000 + ======================================================= + config sysId host + config osName Linux + config nodeName localhost.localdomain + config release 2.6.24.4-64.fc8 + config version #1 SMP Sat Mar 29 09:15:49 EDT 2008 + config machine x86_64 +qpid: schema queue +Schema for class 'qpid.queue': + Element Type Unit Access Notes Description + =================================================================================================================== + vhostRef reference ReadCreate index + name short-string ReadCreate index + durable boolean ReadCreate + autoDelete 
boolean ReadCreate + exclusive boolean ReadCreate + arguments field-table ReadOnly Arguments supplied in queue.declare + storeRef reference ReadOnly Reference to persistent queue (if durable) + msgTotalEnqueues uint64 message Total messages enqueued + msgTotalDequeues uint64 message Total messages dequeued + msgTxnEnqueues uint64 message Transactional messages enqueued + msgTxnDequeues uint64 message Transactional messages dequeued + msgPersistEnqueues uint64 message Persistent messages enqueued + msgPersistDequeues uint64 message Persistent messages dequeued + msgDepth uint32 message Current size of queue in messages + msgDepthHigh uint32 message Current size of queue in messages (High) + msgDepthLow uint32 message Current size of queue in messages (Low) + byteTotalEnqueues uint64 octet Total messages enqueued + byteTotalDequeues uint64 octet Total messages dequeued + byteTxnEnqueues uint64 octet Transactional messages enqueued + byteTxnDequeues uint64 octet Transactional messages dequeued + bytePersistEnqueues uint64 octet Persistent messages enqueued + bytePersistDequeues uint64 octet Persistent messages dequeued + byteDepth uint32 octet Current size of queue in bytes + byteDepthHigh uint32 octet Current size of queue in bytes (High) + byteDepthLow uint32 octet Current size of queue in bytes (Low) + enqueueTxnStarts uint64 transaction Total enqueue transactions started + enqueueTxnCommits uint64 transaction Total enqueue transactions committed + enqueueTxnRejects uint64 transaction Total enqueue transactions rejected + enqueueTxnCount uint32 transaction Current pending enqueue transactions + enqueueTxnCountHigh uint32 transaction Current pending enqueue transactions (High) + enqueueTxnCountLow uint32 transaction Current pending enqueue transactions (Low) + dequeueTxnStarts uint64 transaction Total dequeue transactions started + dequeueTxnCommits uint64 transaction Total dequeue transactions committed + dequeueTxnRejects uint64 transaction Total dequeue 
transactions rejected + dequeueTxnCount uint32 transaction Current pending dequeue transactions + dequeueTxnCountHigh uint32 transaction Current pending dequeue transactions (High) + dequeueTxnCountLow uint32 transaction Current pending dequeue transactions (Low) + consumers uint32 consumer Current consumers on queue + consumersHigh uint32 consumer Current consumers on queue (High) + consumersLow uint32 consumer Current consumers on queue (Low) + bindings uint32 binding Current bindings + bindingsHigh uint32 binding Current bindings (High) + bindingsLow uint32 binding Current bindings (Low) + unackedMessages uint32 message Messages consumed but not yet acked + unackedMessagesHigh uint32 message Messages consumed but not yet acked (High) + unackedMessagesLow uint32 message Messages consumed but not yet acked (Low) + messageLatencySamples delta-time nanosecond Broker latency through this queue (Samples) + messageLatencyMin delta-time nanosecond Broker latency through this queue (Min) + messageLatencyMax delta-time nanosecond Broker latency through this queue (Max) + messageLatencyAverage delta-time nanosecond Broker latency through this queue (Average) +Method 'purge' Discard all messages on queue +qpid: list queue +Objects of type qpid.queue + ID Created Destroyed Index + =========================================================================== + 1012 21:08:13 - 1002.pub_start + 1014 21:08:13 - 1002.pub_done + 1016 21:08:13 - 1002.sub_ready + 1018 21:08:13 - 1002.sub_done + 1020 21:08:13 - 1002.perftest0 + 1038 21:09:08 - 1002.mgmt-3206ff16-fb29-4a30-82ea-e76f50dd7d15 + 1040 21:09:08 - 1002.repl-3206ff16-fb29-4a30-82ea-e76f50dd7d15 + 1046 21:09:32 - 1002.mgmt-df06c7a6-4ce7-426a-9f66-da91a2a6a837 + 1048 21:09:32 - 1002.repl-df06c7a6-4ce7-426a-9f66-da91a2a6a837 + 1054 21:10:01 - 1002.mgmt-c55915c2-2fda-43ee-9410-b1c1cbb3e4ae + 1056 21:10:01 - 1002.repl-c55915c2-2fda-43ee-9410-b1c1cbb3e4ae + 1063 21:26:00 - 1002.mgmt-8d621997-6356-48c3-acab-76a37081d0f3 + 1065 
21:26:00 - 1002.repl-8d621997-6356-48c3-acab-76a37081d0f3 +qpid: list 1020 +Object of type qpid.queue: (last sample time: 21:26:02) + Type Element 1020 + ========================================================================== + config vhostRef 1002 + config name perftest0 + config durable False + config autoDelete False + config exclusive False + config arguments {'qpid.max_size': 0, 'qpid.max_count': 0} + config storeRef NULL + inst msgTotalEnqueues 500000 messages + inst msgTotalDequeues 500000 + inst msgTxnEnqueues 0 + inst msgTxnDequeues 0 + inst msgPersistEnqueues 0 + inst msgPersistDequeues 0 + inst msgDepth 0 + inst msgDepthHigh 0 + inst msgDepthLow 0 + inst byteTotalEnqueues 512000000 octets + inst byteTotalDequeues 512000000 + inst byteTxnEnqueues 0 + inst byteTxnDequeues 0 + inst bytePersistEnqueues 0 + inst bytePersistDequeues 0 + inst byteDepth 0 + inst byteDepthHigh 0 + inst byteDepthLow 0 + inst enqueueTxnStarts 0 transactions + inst enqueueTxnCommits 0 + inst enqueueTxnRejects 0 + inst enqueueTxnCount 0 + inst enqueueTxnCountHigh 0 + inst enqueueTxnCountLow 0 + inst dequeueTxnStarts 0 + inst dequeueTxnCommits 0 + inst dequeueTxnRejects 0 + inst dequeueTxnCount 0 + inst dequeueTxnCountHigh 0 + inst dequeueTxnCountLow 0 + inst consumers 0 consumers + inst consumersHigh 0 + inst consumersLow 0 + inst bindings 1 binding + inst bindingsHigh 1 + inst bindingsLow 1 + inst unackedMessages 0 messages + inst unackedMessagesHigh 0 + inst unackedMessagesLow 0 + inst messageLatencySamples 0 + inst messageLatencyMin 0 + inst messageLatencyMax 0 + inst messageLatencyAverage 0 +qpid: +</pre></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="MgmtC-2B-2B-Usingqpidprintevents"></a>2.1.4.  + Using + qpid-printevents + </h3></div></div></div><p> + This utility connects to one or more brokers and collects events, + printing out a line per event. 
+ </p><pre class="programlisting"> +$ qpid-printevents --help +Usage: qpid-printevents [options] [broker-addr]... + +Collect and print events from one or more Qpid message brokers. If no broker- +addr is supplied, qpid-printevents will connect to 'localhost:5672'. broker- +addr is of the form: [username/password@] hostname | ip-address [:<port>] ex: +localhost, 10.1.1.7:10000, broker-host:10000, guest/guest@localhost + +Options: + -h, --help show this help message and exit +</pre><p> + You get the idea... have fun! + </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="idm140333888607376"></a>2.1.5. Using qpid-ha</h3></div></div></div><p>This utility lets you monitor and control the activity of the clustering behavior provided by the HA module. + </p><pre class="programlisting"> + +qpid-ha --help +usage: qpid-ha <command> [<arguments>] + +Commands are: + + ready Test if a backup broker is ready. + query Print HA configuration settings. + set Set HA configuration settings. + promote Promote broker from backup to primary. + replicate Set up replication from <queue> on <remote-broker> to <queue> on the current broker. + +For help with a command type: qpid-ha <command> --help + + </pre></div></div></div><div class="navfooter"><hr /><table summary="Navigation footer" width="100%"><tr><td align="left" width="40%"><a accesskey="p" href="ha-queue-replication.html">Prev</a> </td><td align="center" width="20%"> </td><td align="right" width="40%"> <a accesskey="n" href="ch02s02.html">Next</a></td></tr><tr><td align="left" valign="top" width="40%">1.13. Replicating Queues with the HA module </td><td align="center" width="20%"><a accesskey="h" href="index.html">Home</a></td><td align="right" valign="top" width="40%"> 2.2.  + Qpid Management Framework + </td></tr></table></div></div> \ No newline at end of file
http://git-wip-us.apache.org/repos/asf/qpid-site/blob/c3ab60f6/input/releases/qpid-cpp-1.39.0/cpp-broker/book/chapter-ha.html.in ---------------------------------------------------------------------- diff --git a/input/releases/qpid-cpp-1.39.0/cpp-broker/book/chapter-ha.html.in b/input/releases/qpid-cpp-1.39.0/cpp-broker/book/chapter-ha.html.in new file mode 100644 index 0000000..c7975c1 --- /dev/null +++ b/input/releases/qpid-cpp-1.39.0/cpp-broker/book/chapter-ha.html.in @@ -0,0 +1,787 @@ +<div class="docbook"><div class="navheader"><table summary="Navigation header" width="100%"><tr><th align="center" colspan="3">1.12. Active-Passive Messaging Clusters</th></tr><tr><td align="left" width="20%"><a accesskey="p" href="Using-message-groups.html">Prev</a> </td><th align="center" width="60%">Chapter 1.  + Running the AMQP Messaging Broker + </th><td align="right" width="20%"> <a accesskey="n" href="ha-queue-replication.html">Next</a></td></tr></table><hr /></div><div class="section"><div class="titlepage"><div><div><h2 class="title"><a id="chapter-ha"></a>1.12. Active-Passive Messaging Clusters</h2></div></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-overview"></a>1.12.1. Overview</h3></div></div></div><p> + + The High Availability (HA) module provides + <em class="firstterm">active-passive</em>, <em class="firstterm">hot-standby</em> + messaging clusters to provide fault tolerant message delivery. + </p><p> + In an active-passive cluster only one broker, known as the + <em class="firstterm">primary</em>, is active and serving clients at a time. The other + brokers are standing by as <em class="firstterm">backups</em>. Changes on the primary + are replicated to all the backups so they are always up-to-date or "hot". Backup + brokers reject client connection attempts, to enforce the requirement that clients + only connect to the primary. 
+ </p><p> + If the primary fails, one of the backups is promoted to take over as the new + primary. Clients fail over to the new primary automatically. If there are multiple + backups, the other backups also fail over to become backups of the new primary. + </p><p> + This approach relies on an external <em class="firstterm">cluster resource manager</em> + to detect failures, choose the new primary and handle network partitions. <a class="ulink" href="https://fedorahosted.org/cluster/wiki/RGManager" target="_top">rgmanager</a> is supported + initially, but others may be supported in the future. + </p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-at-least-once"></a>1.12.1.1. Avoiding message loss</h4></div></div></div><p> + To avoid message loss, the primary broker <span class="emphasis"><em>delays + acknowledgement</em></span> of messages received from clients until the + message has been replicated and acknowledged by all of the backup + brokers, or has been consumed from the primary queue. + </p><p> + This ensures that all acknowledged messages are safe: they have either + been consumed or backed up to all backup brokers. Messages that are + consumed <span class="emphasis"><em>before</em></span> they are replicated do not need to + be replicated. This reduces the workload when replicating a queue with + active consumers. + </p><p> + Clients keep <span class="emphasis"><em>unacknowledged</em></span> messages in a buffer + <a class="footnote" href="#ftn.idm140333888036112" id="idm140333888036112"><sup class="footnote">[1]</sup></a> + until they are acknowledged by the primary. If the primary fails, clients will + fail over to the new primary and <span class="emphasis"><em>re-send</em></span> all their + unacknowledged messages. 
+ <a class="footnote" href="#ftn.idm140333887650224" id="idm140333887650224"><sup class="footnote">[2]</sup></a> + </p><p> + If the primary crashes, all the <span class="emphasis"><em>acknowledged</em></span> + messages will be available on the backup that takes over as the new + primary. The <span class="emphasis"><em>unacknowledged</em></span> messages will be + re-sent by the clients. Thus no messages are lost. + </p><p> + Note that this means it is possible for messages to be + <span class="emphasis"><em>duplicated</em></span>. In the event of a failure, it is possible for a + message to be received by the backup that becomes the new primary + <span class="emphasis"><em>and</em></span> re-sent by the client. The application must take steps + to identify and eliminate duplicates. + </p><p> + When a new primary is promoted after a fail-over, it is initially in + "recovering" mode. In this mode, it delays acknowledgement of messages + on behalf of all the backups that were connected to the previous + primary. This protects those messages against a failure of the new + primary until the backups have a chance to connect and catch up. + </p><p> + Not all messages need to be replicated to the backup brokers. If a + message is consumed and acknowledged by a regular client before it has + been replicated to a backup, then it doesn't need to be replicated. + </p><div class="variablelist"><a id="ha-broker-states"></a><p class="title"><strong>HA Broker States</strong></p><dl class="variablelist"><dt><span class="term">Stand-alone</span></dt><dd><p> + Broker is not part of an HA cluster. + </p></dd><dt><span class="term">Joining</span></dt><dd><p> + Newly started broker, not yet connected to any existing primary. + </p></dd><dt><span class="term">Catch-up</span></dt><dd><p> + A backup broker that is connected to the primary and downloading + existing state (queues, messages, etc.) 
+ </p></dd><dt><span class="term">Ready</span></dt><dd><p> + A backup broker that is fully caught up and ready to take over as + primary. + </p></dd><dt><span class="term">Recovering</span></dt><dd><p> + Newly promoted primary, waiting for backups to connect and catch up. + Clients can connect but they are stalled until the primary is active. + </p></dd><dt><span class="term">Active</span></dt><dd><p> + The active primary broker with all backups connected and caught up. + </p></dd></dl></div></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="limitations"></a>1.12.1.2. Limitations</h4></div></div></div><p> + There are some known limitations in the current implementation. These + will be fixed in future versions. + </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p> + Transactional changes to queue state are not replicated atomically. If + the primary crashes during a transaction, it is possible that the + backup could contain only part of the changes introduced by a + transaction. + </p></li><li class="listitem"><p> + Configuration changes (creating or deleting queues, exchanges and + bindings) are replicated asynchronously. Management tools used to + make changes will consider the change complete when it is complete + on the primary; it may not yet be replicated to all the backups. + </p></li><li class="listitem"><p> + Federation links <span class="emphasis"><em>to</em></span> the primary will fail over + correctly. Federation links <span class="emphasis"><em>from</em></span> the primary + will be lost in fail-over; they will not be re-connected to the new + primary. It is possible to work around this by replacing the + <code class="literal">qpidd-primary</code> start-up script with a script that + re-creates federation links when the primary is promoted. 
+ </p></li></ul></div></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-virtual-ip"></a>1.12.2. Virtual IP Addresses</h3></div></div></div><p> + Some resource managers (including <span class="command"><strong>rgmanager</strong></span>) support + <em class="firstterm">virtual IP addresses</em>. A virtual IP address is an IP + address that can be relocated to any of the nodes in a cluster. The + resource manager associates this address with the primary node in the + cluster, and relocates it to the new primary when there is a failure. This + simplifies configuration as you can publish a single IP address rather + than a list. + </p><p> + A virtual IP address can be used by clients to connect to the primary. The + following sections will explain how to configure virtual IP addresses for + clients or brokers. + </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-broker-config"></a>1.12.3. Configuring the Brokers</h3></div></div></div><p> + The broker must load the <code class="filename">ha</code> module, which is loaded by + default. The following broker options are available for the HA module. + </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p> + Broker management is required for HA to operate; it is enabled by + default. The option <code class="literal">mgmt-enable</code> must not be set to + "no". + </p></div><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p> + Incorrect security settings are a common cause of problems when + getting started; see <a class="xref" href="chapter-ha.html#ha-security" title="1.12.9. Security and Access Control.">Section 1.12.9, “Security and Access Control.”</a>. + </p></div><div class="table"><a id="ha-broker-options"></a><p class="title"><strong>Table 1.28. 
Broker Options for High Availability Messaging Cluster</strong></p><div class="table-contents"><table border="1" class="table" summary="Broker Options for High Availability Messaging Cluster"><colgroup><col align="left" class="c1" /><col align="left" class="c2" /></colgroup><thead><tr><th align="center" colspan="2"> + Options for High Availability Messaging Cluster + </th></tr></thead><tbody><tr><td align="left"> + <code class="literal">ha-cluster <em class="replaceable"><code>yes|no</code></em></code> + </td><td align="left"> + Set to "yes" to have the broker join a cluster. + </td></tr><tr><td align="left"> + <code class="literal">ha-queue-replication <em class="replaceable"><code>yes|no</code></em></code> + </td><td align="left"> + Enable replication of specific queues without joining a cluster; see <a class="xref" href="ha-queue-replication.html" title="1.13. Replicating Queues with the HA module">Section 1.13, “Replicating Queues with the HA module”</a>. + </td></tr><tr><td align="left"> + <code class="literal">ha-brokers-url <em class="replaceable"><code>URL</code></em></code> + </td><td align="left"> + <p> + The URL + <a class="footnote" href="#ftn.ha-url-grammar" id="ha-url-grammar"><sup class="footnote">[a]</sup></a> + used by cluster brokers to connect to each other. The URL should + contain a comma-separated list of the broker addresses, rather than a + virtual IP address. + </p> + </td></tr><tr><td align="left"><code class="literal">ha-public-url <em class="replaceable"><code>URL</code></em></code> </td><td align="left"> + <p> + This option is only needed for backwards compatibility if you + have been using the <code class="literal">amq.failover</code> exchange. + This exchange is now obsolete; it is recommended to use a + virtual IP address instead. 
+ </p> + <p> + If set, this URL is advertised by the + <code class="literal">amq.failover</code> exchange and overrides the + broker option <code class="literal">known-hosts-url</code>. + </p> + </td></tr><tr><td align="left"><code class="literal">ha-replicate </code><em class="replaceable"><code>VALUE</code></em></td><td align="left"> + <p> + Specifies whether queues and exchanges are replicated by default. + <em class="replaceable"><code>VALUE</code></em> is one of: <code class="literal">none</code>, + <code class="literal">configuration</code>, <code class="literal">all</code>. + For details see <a class="xref" href="chapter-ha.html#ha-replicate-values" title="1.12.7. Controlling replication of queues and exchanges">Section 1.12.7, “Controlling replication of queues and exchanges”</a>. + </p> + </td></tr><tr><td align="left"> + <p><code class="literal">ha-username <em class="replaceable"><code>USER</code></em></code></p> + <p><code class="literal">ha-password <em class="replaceable"><code>PASS</code></em></code></p> + <p><code class="literal">ha-mechanism <em class="replaceable"><code>MECHANISM</code></em></code></p> + </td><td align="left"> + Authentication settings used by HA brokers to connect to each other; + see <a class="xref" href="chapter-ha.html#ha-security" title="1.12.9. Security and Access Control.">Section 1.12.9, “Security and Access Control.”</a> + </td></tr><tr><td align="left"><code class="literal">ha-backup-timeout <em class="replaceable"><code>SECONDS</code></em></code> + <a class="footnote" href="#ftn.ha-seconds-spec" id="ha-seconds-spec"><sup class="footnote">[b]</sup></a> + </td><td align="left"> + <p> + Maximum time that a recovering primary will wait for an expected + backup to connect and become ready. 
+ </p> + </td></tr><tr><td align="left"> + <code class="literal">link-maintenance-interval <em class="replaceable"><code>SECONDS</code></em></code> + <a class="footnoteref" href="chapter-ha.html#ftn.ha-seconds-spec"><sup class="footnoteref">[b]</sup></a> + </td><td align="left"> + <p> + HA uses federation links to connect from backup to primary. + Backup brokers check the link to the primary on this interval + and re-connect if need be. Default 2 seconds. Set lower for + faster failover, e.g. 0.1 seconds. Setting too low will result + in excessive link-checking on the backups. + </p> + </td></tr><tr><td align="left"> + <code class="literal">link-heartbeat-interval <em class="replaceable"><code>SECONDS</code></em></code> + <a class="footnoteref" href="chapter-ha.html#ftn.ha-seconds-spec"><sup class="footnoteref">[b]</sup></a> + </td><td align="left"> + <p> + HA uses federation links to connect from backup to primary. + If no heart-beat is received for twice this interval the primary will consider that + backup dead (e.g. if the backup is hung or partitioned). + This interval is also used as the time-out for broker status checks; + it may take up to this interval for rgmanager to detect a hung or partitioned broker. + Clients sending messages may be held up during this time. + Default 120 seconds: you will probably want to set this to a lower value, e.g. 10. + If set too low, rgmanager may consider a slow broker to have failed and kill it. + </p> + </td></tr></tbody><tbody class="footnotes"><tr><td colspan="2"><div class="footnote" id="ftn.ha-url-grammar"><p><a class="para" href="#ha-url-grammar"><sup class="para">[a] </sup></a> + The full format of the URL is given by this grammar: + </p><pre class="programlisting"> +url = ["amqp:"][ user ["/" password] "@" ] addr ("," addr)* +addr = tcp_addr / rdma_addr / ssl_addr / ... 
+tcp_addr = ["tcp:"] host [":" port] +rdma_addr = "rdma:" host [":" port] +ssl_addr = "ssl:" host [":" port] + </pre><p> + </p></div><div class="footnote" id="ftn.ha-seconds-spec"><p><a class="para" href="#ha-seconds-spec"><sup class="para">[b] </sup></a> + Values specified as <em class="replaceable"><code>SECONDS</code></em> can be a + fraction of a second, e.g. "0.1" for a tenth of a second. + They can also have an explicit unit, + e.g. 10s (seconds), 10ms (milliseconds), 10us (microseconds), 10ns (nanoseconds). + </p></div></td></tr></tbody></table></div></div><br class="table-break" /><p> + To configure a HA cluster you must set at least <code class="literal">ha-cluster</code> and + <code class="literal">ha-brokers-url</code>. + </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-rm"></a>1.12.4. The Cluster Resource Manager</h3></div></div></div><p> + Broker fail-over is managed by a <em class="firstterm">cluster resource + manager</em>. An integration with <a class="ulink" href="https://fedorahosted.org/cluster/wiki/RGManager" target="_top">rgmanager</a> is + provided, but it is possible to integrate with other resource managers. + </p><p> + The resource manager is responsible for starting the <span class="command"><strong>qpidd</strong></span> broker + on each node in the cluster. The resource manager then <em class="firstterm">promotes</em> + one of the brokers to be the primary. The other brokers connect to the primary as + backups, using the URL provided in the <code class="literal">ha-brokers-url</code> configuration + option. + </p><p> + Once connected, the backup brokers synchronize their state with the + primary. When a backup is synchronized, or "hot", it is ready to take + over if the primary fails. Backup brokers continually receive updates + from the primary in order to stay synchronized. + </p><p> + If the primary fails, backup brokers go into fail-over mode. 
The resource + manager must detect the failure and promote one of the backups to be the + new primary. The other backups connect to the new primary and synchronize + their state with it. + </p><p> + The resource manager is also responsible for protecting the cluster from + <em class="firstterm">split-brain</em> conditions resulting from a network partition. A + network partition divides a cluster into two sub-groups which cannot see each other. + Usually a <em class="firstterm">quorum</em> voting algorithm is used that disables nodes + in the inquorate sub-group. + </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-rm-config"></a>1.12.5. Configuring with <span class="command"><strong>rgmanager</strong></span> as resource manager</h3></div></div></div><p> + This section assumes that you are already familiar with setting up and configuring + clustered services using <span class="command"><strong>cman</strong></span> and + <span class="command"><strong>rgmanager</strong></span>. It will show you how to configure an active-passive, + hot-standby <span class="command"><strong>qpidd</strong></span> HA cluster with <span class="command"><strong>rgmanager</strong></span>. + </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p> + Once all components are installed it is important to take the following step: + </p><pre class="programlisting"> +chkconfig rgmanager on +chkconfig cman on +chkconfig qpidd <span class="emphasis"><em>off</em></span> + </pre><p> + </p><p> + The qpidd service must be <span class="emphasis"><em>off</em></span> in + <code class="literal">chkconfig</code> because <code class="literal">rgmanager</code> will + start and stop <code class="literal">qpidd</code>. If the normal system init + process also attempts to start and stop qpidd it can cause rgmanager to + lose track of qpidd processes. 
The symptom when this happens is that + <code class="literal">clustat</code> shows a <code class="literal">qpidd</code> service to + be stopped when in fact there is a <code class="literal">qpidd</code> process + running. The <code class="literal">qpidd</code> log will show errors like this: + </p><pre class="programlisting"> +critical Unexpected error: Daemon startup failed: Cannot lock /var/lib/qpidd/lock: Resource temporarily unavailable + </pre><p> + </p></div><p> + You must provide a <code class="literal">cluster.conf</code> file to configure + <span class="command"><strong>cman</strong></span> and <span class="command"><strong>rgmanager</strong></span>. Here is + an example <code class="literal">cluster.conf</code> file for a cluster of 3 nodes named + node1, node2 and node3. We will go through the configuration step-by-step. + </p><pre class="programlisting"> + +<?xml version="1.0"?> +<!-- +This is an example of a cluster.conf file to run qpidd HA under rgmanager. +This example assumes a 3 node cluster, with nodes named node1, node2 and node3. + +NOTE: fencing is not shown, you must configure fencing appropriately for your cluster. +--> + +<cluster name="qpid-test" config_version="18"> + <!-- The cluster has 3 nodes. Each has a unique nodeid and one vote + for quorum. --> + <clusternodes> + <clusternode name="node1.example.com" nodeid="1"/> + <clusternode name="node2.example.com" nodeid="2"/> + <clusternode name="node3.example.com" nodeid="3"/> + </clusternodes> + + <!-- Resource Manager configuration. + + status_poll_interval is the interval in seconds that the resource manager checks the status + of managed services. This affects how quickly the manager will detect failed services. + --> + <rm status_poll_interval="1"> + <!-- + There is a failoverdomain for each node containing just that node. + This lets us stipulate that the qpidd service should always run on each node. 
+ --> + <failoverdomains> + <failoverdomain name="node1-domain" restricted="1"> + <failoverdomainnode name="node1.example.com"/> + </failoverdomain> + <failoverdomain name="node2-domain" restricted="1"> + <failoverdomainnode name="node2.example.com"/> + </failoverdomain> + <failoverdomain name="node3-domain" restricted="1"> + <failoverdomainnode name="node3.example.com"/> + </failoverdomain> + </failoverdomains> + + <resources> + <!-- This script starts a qpidd broker acting as a backup. --> + <script file="/etc/init.d/qpidd" name="qpidd"/> + + <!-- This script promotes the qpidd broker on this node to primary. --> + <script file="/etc/init.d/qpidd-primary" name="qpidd-primary"/> + + <!-- + This is a virtual IP address for client traffic. + monitor_link="yes" means monitor the health of the NIC used for the VIP. + sleeptime="0" means don't delay when failing over the VIP to a new address. + --> + <ip address="20.0.20.200" monitor_link="yes" sleeptime="0"/> + </resources> + + <!-- There is a qpidd service on each node, it should be restarted if it fails. --> + <service name="node1-qpidd-service" domain="node1-domain" recovery="restart"> + <script ref="qpidd"/> + </service> + <service name="node2-qpidd-service" domain="node2-domain" recovery="restart"> + <script ref="qpidd"/> + </service> + <service name="node3-qpidd-service" domain="node3-domain" recovery="restart"> + <script ref="qpidd"/> + </service> + + <!-- There should always be a single qpidd-primary service, it can run on any node. --> + <service name="qpidd-primary-service" autostart="1" exclusive="0" recovery="relocate"> + <script ref="qpidd-primary"/> + <!-- The primary has the IP addresses for brokers and clients to connect. --> + <ip ref="20.0.20.200"/> + </service> + </rm> +</cluster> + + </pre><p> + There is a <code class="literal">failoverdomain</code> for each node containing just that + one node. This lets us stipulate that the qpidd service should always run on all + nodes. 
+ </p><p> + The <code class="literal">resources</code> section defines the <span class="command"><strong>qpidd</strong></span> + script used to start the <span class="command"><strong>qpidd</strong></span> service. It also defines the + <span class="command"><strong>qpidd-primary</strong></span> script which does not + actually start a new service; rather, it promotes the existing + <span class="command"><strong>qpidd</strong></span> broker to primary status. + </p><p> + The <code class="literal">resources</code> section also defines a virtual IP + address for clients: <code class="literal">20.0.20.200</code>. + </p><p> + <code class="filename">qpidd.conf</code> should contain these lines: + </p><pre class="programlisting"> +ha-cluster=yes +ha-brokers-url=20.0.20.1,20.0.20.2,20.0.20.3 + </pre><p> + The brokers connect to each other directly via the addresses + listed in <span class="command"><strong>ha-brokers-url</strong></span>. Note the client and broker + addresses are on separate sub-nets; this is recommended but not required. + </p><p> + The <code class="literal">service</code> section defines 3 <code class="literal">qpidd</code> + services, one for each node. Each service is in a restricted fail-over + domain containing just that node, and has the <code class="literal">restart</code> + recovery policy. The effect of this is that rgmanager will run + <span class="command"><strong>qpidd</strong></span> on each node, restarting it if it fails. + </p><p> + There is a single <code class="literal">qpidd-primary-service</code> using the + <span class="command"><strong>qpidd-primary</strong></span> script which is not restricted to a + domain and has the <code class="literal">relocate</code> recovery policy. This means + rgmanager will start <span class="command"><strong>qpidd-primary</strong></span> on one of the nodes + when the cluster starts and will relocate it to another node if the + original node fails. 
Running the <code class="literal">qpidd-primary</code> script + does not start a new broker process; it promotes the existing broker to + become the primary. + </p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-rm-shutdown-node"></a>1.12.5.1. Shutting down qpidd on a HA node</h4></div></div></div><p> + As explained above, both the per-node <code class="literal">qpidd</code> service + and the re-locatable <code class="literal">qpidd-primary</code> service are + implemented by the same <code class="literal">qpidd</code> daemon. + </p><p> + As a result, stopping the <code class="literal">qpidd</code> service will not stop + a <code class="literal">qpidd</code> daemon that is acting as primary, and + stopping the <code class="literal">qpidd-primary</code> service will not stop a + <code class="literal">qpidd</code> process that is acting as backup. + </p><p> + To shut down a node that is acting as primary you need to shut down the + <code class="literal">qpidd</code> service <span class="emphasis"><em>and</em></span> relocate the + primary: + </p><p> + </p><pre class="programlisting"> +clusvcadm -d somenode-qpidd-service +clusvcadm -r qpidd-primary-service + </pre><p> + </p><p> + This will shut down the <code class="literal">qpidd</code> daemon on that node and + prevent the primary service from relocating back to the node + because the qpidd service is no longer running there. + </p></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-broker-admin"></a>1.12.6. Broker Administration Tools</h3></div></div></div><p> + Normally, clients are not allowed to connect to a backup broker. However, + management tools are allowed to connect to backup brokers. 
If you use + these tools you <span class="emphasis"><em>must not</em></span> add or remove messages from + replicated queues, nor create or delete replicated queues or exchanges, as + this will disrupt the replication process and may cause message loss. + </p><p> + <span class="command"><strong>qpid-ha</strong></span> allows you to view and change HA configuration settings. + </p><p> + The tools <span class="command"><strong>qpid-config</strong></span>, <span class="command"><strong>qpid-route</strong></span> and + <span class="command"><strong>qpid-stat</strong></span> will connect to a backup if you pass the flag <span class="command"><strong>--ha-admin</strong></span> on the + command line. + </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-replicate-values"></a>1.12.7. Controlling replication of queues and exchanges</h3></div></div></div><p> + By default, queues and exchanges are not replicated automatically. You can change + the default behaviour by setting the <code class="literal">ha-replicate</code> configuration + option. It has one of the following values: + </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p> + <em class="firstterm">all</em>: Replicate everything automatically: queues, + exchanges, bindings and messages. + </p></li><li class="listitem"><p> + <em class="firstterm">configuration</em>: Replicate the existence of queues, + exchanges and bindings but don't replicate messages. + </p></li><li class="listitem"><p> + <em class="firstterm">none</em>: Don't replicate anything. This is the default. + </p></li></ul></div><p> + </p><p> + You can over-ride the default for a particular queue or exchange by passing the + argument <code class="literal">qpid.replicate</code> when creating the queue or exchange. 
It + takes the same values as <code class="literal">ha-replicate</code>. + </p><p> + Bindings are automatically replicated if the queue and exchange being bound both + have replication <code class="literal">all</code> or <code class="literal">configuration</code>; they + are not replicated otherwise. + </p><p> + You can create replicated queues and exchanges with the + <span class="command"><strong>qpid-config</strong></span> management tool like this: + </p><pre class="programlisting"> +qpid-config add queue myqueue --replicate all + </pre><p> + To create replicated queues and exchanges via the client API, add a + <code class="literal">node</code> entry to the address like this: + </p><pre class="programlisting"> +"myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}" + </pre><p> + There are some built-in exchanges created automatically by the broker. These + exchanges are never replicated. The built-in exchanges are the default (nameless) + exchange, the AMQP standard exchanges (<code class="literal">amq.direct, amq.topic, amq.fanout</code> and + <code class="literal">amq.match</code>) and the management exchanges (<code class="literal">qpid.management, qmf.default.direct</code> and + <code class="literal">qmf.default.topic</code>). + </p><p> + Note that if you bind a replicated queue to one of these exchanges, the + binding will <span class="emphasis"><em>not</em></span> be replicated, so the queue will not + have the binding after a fail-over. + </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-failover"></a>1.12.8. Client Connection and Fail-over</h3></div></div></div><p> + Clients can only connect to the primary broker. Backup brokers reject any + connection attempt by a client. Clients rejected by a backup broker will + automatically fail-over until they connect to the primary. + </p><p> + Clients are configured with the URL for the cluster (details below for + each type of client). 
There are two possibilities: + </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p> + The URL contains multiple addresses, one for each broker in the cluster. + </p></li><li class="listitem"><p> + The URL contains a single <em class="firstterm">virtual IP address</em> + that is assigned to the primary broker by the resource manager. + This is the recommended configuration. + </p></li></ul></div><p> + In the first case, clients will repeatedly re-try each address in the URL + until they successfully connect to the primary. In the second case the + resource manager will assign the virtual IP address to the primary broker, + so clients only need to re-try on a single address. + </p><p> + When the primary broker fails, clients re-try all known cluster addresses + until they connect to the new primary. The client re-sends any messages + that were previously sent but not acknowledged by the broker at the time + of the failure. Similarly, messages that have been sent by the broker, but + not acknowledged by the client, are re-queued. + </p><p> + TCP can be slow to detect connection failures. A client can configure a + connection to use a <em class="firstterm">heartbeat</em> to detect connection + failure, and can specify a time interval for the heartbeat. If heartbeats + are in use, failures will be detected no later than twice the heartbeat + interval. The following sections explain how to enable heartbeat in each + client. + </p><p> + Note: the following sections explain how to configure clients with + multiple addresses, but if you are using a virtual IP address you only need + to configure that one address for clients; you don't need to list all the + addresses. + </p><p> + Suppose your cluster has 3 nodes: <code class="literal">node1</code>, + <code class="literal">node2</code> and <code class="literal">node3</code> all using the + default AMQP port, and you are not using a virtual IP address. 
To connect + a client you need to specify the address(es) and set the + <code class="literal">reconnect</code> property to <code class="literal">true</code>. The + following sub-sections show how to connect each type of client. + </p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-clients"></a>1.12.8.1. C++ clients</h4></div></div></div><p> + With the C++ client, you specify multiple cluster addresses in a single URL + <a class="footnote" href="#ftn.idm140333889680320" id="idm140333889680320"><sup class="footnote">[3]</sup></a>. + You also need to specify the connection option + <code class="literal">reconnect</code> to be true. For example: + </p><pre class="programlisting"> +qpid::messaging::Connection c("node1,node2,node3","{reconnect:true}"); + </pre><p> + Heartbeats are disabled by default. You can enable them by specifying a + heartbeat interval (in seconds) for the connection via the + <code class="literal">heartbeat</code> option. For example: + </p><pre class="programlisting"> +qpid::messaging::Connection c("node1,node2,node3","{reconnect:true,heartbeat:10}"); + </pre></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-python-client"></a>1.12.8.2. Python clients</h4></div></div></div><p> + With the Python client, you specify <code class="literal">reconnect=True</code> + and a list of <em class="replaceable"><code>host:port</code></em> addresses as + <code class="literal">reconnect_urls</code> when calling + <code class="literal">Connection.establish</code> or + <code class="literal">Connection.open</code>: + </p><pre class="programlisting"> +connection = qpid.messaging.Connection.establish("node1", reconnect=True, reconnect_urls=["node1", "node2", "node3"]) + </pre><p> + Heartbeats are disabled by default. You can + enable them by specifying a heartbeat interval (in seconds) for the + connection via the 'heartbeat' option. 
For example: + </p><pre class="programlisting"> +connection = qpid.messaging.Connection.establish("node1", reconnect=True, reconnect_urls=["node1", "node2", "node3"], heartbeat=10) + </pre></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-jms-client"></a>1.12.8.3. Java JMS Clients</h4></div></div></div><p> + In Java JMS clients, client fail-over is handled automatically if it is + enabled in the connection. You can configure a connection to use + fail-over using the <span class="command"><strong>failover</strong></span> property: + </p><pre class="screen"> + connectionfactory.qpidConnectionfactory = amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672'&failover='failover_exchange' + </pre><p> + This property can take three values: + </p><div class="variablelist"><p class="title"><strong>Fail-over Modes</strong></p><dl class="variablelist"><dt><span class="term">failover_exchange</span></dt><dd><p> + If the connection fails, fail over to any other broker in the cluster. + </p></dd><dt><span class="term">roundrobin</span></dt><dd><p> + If the connection fails, fail over to one of the brokers specified in the <span class="command"><strong>brokerlist</strong></span>. + </p></dd><dt><span class="term">singlebroker</span></dt><dd><p> + Fail-over is not supported; the connection is to a single broker only. + </p></dd></dl></div><p> + In a Connection URL, heartbeat is set using the <span class="command"><strong>heartbeat</strong></span> property, which is an integer corresponding to the heartbeat period in seconds. For instance, the following line from a JNDI properties file sets the heartbeat time out to 3 seconds: + </p><pre class="screen"> + connectionfactory.qpidConnectionfactory = amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672'&heartbeat='3' + </pre></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-security"></a>1.12.9. 
Security and Access Control.</h3></div></div></div><p> + This section outlines the HA-specific aspects of security configuration. + Please see <a class="xref" href="chap-Messaging_User_Guide-Security.html" title="1.5. Security">Section 1.5, “Security”</a> for + more details on enabling authentication and setting up Access Control Lists. + </p><div class="note" style="margin-left: 0.5in; margin-right: 0.5in;"><h3 class="title">Note</h3><p> + Unless you disable authentication with <code class="literal">auth=no</code> in + your configuration, you <span class="emphasis"><em>must</em></span> set the options below + and you <span class="emphasis"><em>must</em></span> have an ACL file with at least the + entry described below. + </p><p> + Backups will be <span class="emphasis"><em>unable to connect to the primary</em></span> if + the security configuration is incorrect. See also <a class="xref" href="chapter-ha.html#ha-troubleshoot-security" title="1.12.12.2. Authentication and ACL failures">Section 1.12.12.2, “Authentication and ACL failures”</a>. + </p></div><p> + When authentication is enabled you must set the credentials used by HA + brokers with the following options: + </p><div class="table"><a id="ha-security-options"></a><p class="title"><strong>Table 1.29. HA Security Options</strong></p><div class="table-contents"><table border="1" class="table" summary="HA Security Options"><colgroup><col align="left" class="c1" /><col align="left" class="c2" /></colgroup><thead><tr><th align="center" colspan="2"> + HA Security Options + </th></tr></thead><tbody><tr><td align="left"><p><code class="literal">ha-username</code> <em class="replaceable"><code>USER</code></em></p></td><td align="left"><p>User name for HA brokers. 
Note this must <span class="emphasis"><em>not</em></span> include the <code class="literal">@QPID</code> suffix.</p></td></tr><tr><td align="left"><p><code class="literal">ha-password</code> <em class="replaceable"><code>PASS</code></em></p></td><td align="left"><p>Password for HA brokers.</p></td></tr><tr><td align="left"><p><code class="literal">ha-mechanism</code> <em class="replaceable"><code>MECHANISM</code></em></p></td><td align="left"> + <p> + Mechanism for HA brokers. Any mechanism you enable for + broker-to-broker communication can also be used by a client, so + do not use ha-mechanism=ANONYMOUS in a secure environment. + </p> + </td></tr></tbody></table></div></div><br class="table-break" /><p> + This identity is used to authorize federation links from backup to + primary. It is also used to authorize actions on the backup to replicate + primary state, for example creating queues and exchanges. + </p><p> + When authorization is enabled you must have an Access Control List with the + following rule to allow HA replication to function. Suppose + <code class="literal">ha-username</code>=<em class="replaceable"><code>USER</code></em> + </p><pre class="programlisting"> +acl allow <em class="replaceable"><code>USER</code></em>@QPID all all + </pre></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-other-rm"></a>1.12.10. 
Integrating with other Cluster Resource Managers</h3></div></div></div><p> + To integrate with a different resource manager you must configure it to: + </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>Start a qpidd process on each node of the cluster.</p></li><li class="listitem"><p>Restart qpidd if it crashes.</p></li><li class="listitem"><p>Promote exactly one of the brokers to primary.</p></li><li class="listitem"><p>Detect a failure and promote a new primary.</p></li></ul></div><p> + </p><p> + The <span class="command"><strong>qpid-ha</strong></span> command allows you to check if a broker is + primary, and to promote a backup to primary. + </p><p> + To test if a broker is the primary: + </p><pre class="programlisting">qpid-ha -b <em class="replaceable"><code>broker-address</code></em> status --expect=primary</pre><p> + This will return 0 if the broker at <em class="replaceable"><code>broker-address</code></em> is the primary, + non-0 otherwise. + </p><p> + To promote a broker to primary: + </p><pre class="programlisting">qpid-ha --cluster-manager -b <em class="replaceable"><code>broker-address</code></em> promote</pre><p> + </p><p> + Note that <code class="literal">promote</code> is considered a "cluster manager + only" command. Incorrect use of <code class="literal">promote</code> outside of the + cluster manager could create a cluster with multiple primaries. Such a + cluster will malfunction and lose data. "Cluster manager only" commands + are not accessible in <span class="command"><strong>qpid-ha</strong></span> without the + <code class="literal">--cluster-manager</code> option. + </p><p> + To list the full set of commands use: + </p><pre class="programlisting"> +qpid-ha --cluster-manager --help + </pre></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-store"></a>1.12.11. 
Using a message store in a cluster</h3></div></div></div><p> + If you use a persistent store for your messages then each broker in a + cluster will have its own store. If the entire cluster fails and is + restarted, the <span class="emphasis"><em>first</em></span> broker that becomes primary will recover from its + store. All the other brokers will clear their stores and get an update + from the primary to ensure consistency. + </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="ha-troubleshoot"></a>1.12.12. Troubleshooting a cluster</h3></div></div></div><p> + This section applies to clusters that are using rgmanager as the + cluster manager. + </p><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-troubleshoot-no-primary"></a>1.12.12.1. No primary broker</h4></div></div></div><p> + When you initially start a HA cluster, all brokers are in + <code class="literal">joining</code> mode. The brokers do not automatically select + a primary; they rely on the cluster manager <code class="literal">rgmanager</code> + to do so. If <code class="literal">rgmanager</code> is not running or is not + configured correctly, brokers will remain in the + <code class="literal">joining</code> state. See <a class="xref" href="chapter-ha.html#ha-rm-config" title="1.12.5. Configuring with rgmanager as resource manager">Section 1.12.5, “Configuring with <span class="command"><strong>rgmanager</strong></span> as resource manager”</a>. + </p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-troubleshoot-security"></a>1.12.12.2. 
Authentication and ACL failures</h4></div></div></div><p> + If a broker is unable to establish a connection to another broker in the + cluster due to authentication or ACL problems, the logs may contain + errors like the following: + </p><pre class="programlisting"> +info SASL: Authentication failed: SASL(-13): user not found: Password verification failed + </pre><p> + </p><pre class="programlisting"> +warning Client closed connection with 320: User anonymous@QPID federation connection denied. Systems with authentication enabled must specify ACL create link rules. + </pre><p> + </p><pre class="programlisting"> +warning Client closed connection with 320: ACL denied anonymous@QPID creating a federation link. + </pre><p> + </p><p> + Set the HA security configuration and ACL file as described in <a class="xref" href="chapter-ha.html#ha-security" title="1.12.9. Security and Access Control.">Section 1.12.9, “Security and Access Control.”</a>. Once the cluster is running and the primary is + promoted, run: + </p><pre class="programlisting">qpid-ha status --all</pre><p> + to make sure that the brokers are running as one cluster. + </p></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-troubleshoot-slow-recovery"></a>1.12.12.3. Slow recovery times</h4></div></div></div><p> + The following configuration settings affect recovery time. The + values shown are examples that give fast recovery on a lightly + loaded system. You should run tests to determine if the values are + appropriate for your system and load conditions. + </p><div class="section"><div class="titlepage"><div><div><h5 class="title"><a id="ha-troubleshoot-cluster.conf"></a>cluster.conf</h5></div></div></div><pre class="programlisting"> +<rm status_poll_interval=1> + </pre><p> + status_poll_interval is the interval, in seconds, at which the + resource manager checks the status of managed services. This + affects how quickly the manager will detect failed services. 
+ </p><pre class="programlisting"> +<ip address="20.0.20.200" monitor_link="yes" sleeptime="0"/> + </pre><p> + This is a virtual IP address for client traffic. + monitor_link="yes" means monitor the health of the network interface + used for the VIP. sleeptime="0" means don't delay when + failing over the VIP to a new address. + </p></div><div class="section"><div class="titlepage"><div><div><h5 class="title"><a id="ha-troubleshoot-qpidd.conf"></a>qpidd.conf</h5></div></div></div><pre class="programlisting"> +link-maintenance-interval=0.1 + </pre><p> + Interval at which backup brokers check the link to the primary + and re-connect if need be. Default 2 seconds. Can be set lower for + faster fail-over. Setting it too low will result in excessive + link-checking activity on the broker. + </p><pre class="programlisting"> +link-heartbeat-interval=5 + </pre><p> + Heartbeat interval for federation links. The HA cluster uses + federation links between the primary and each backup. The + primary can take up to twice the heartbeat interval to detect a + failed backup. When a sender sends a message, the primary waits + for all backups to acknowledge before acknowledging to the + sender. A disconnected backup may cause the primary to block + senders until it is detected via heartbeat. + </p><p> + This interval is also used as the timeout for broker status + checks by rgmanager. It may take up to this interval for + rgmanager to detect a hung broker. + </p><p> + The default of 120 seconds is very high; you will probably want + to set this to a lower value. If it is set too low, under network + congestion or heavy load, a slow-to-respond broker may be + re-started by rgmanager. + </p></div></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-troubleshoot-total-cluster-failure"></a>1.12.12.4. 
Total cluster failure</h4></div></div></div><p> + Note: for definitions of the broker states <em class="firstterm">joining</em>, + <em class="firstterm">catch-up</em>, <em class="firstterm">ready</em>, + <em class="firstterm">recovering</em> and <em class="firstterm">active</em> see + <a class="xref" href="chapter-ha.html#ha-broker-states" title="HA Broker States">HA Broker States</a>. + </p><p> + The cluster can only guarantee availability as long as there is at + least one active primary broker or ready backup broker left alive. + If all the brokers fail simultaneously, the cluster will fail and + non-persistent data will be lost. + </p><p> + While there is an active primary broker, clients can get service. + If the active primary fails, one of the "ready" backup + brokers will take over, recover and become active. Note that a backup + can only be promoted to primary if it is in the "ready" + state (with the exception of the first primary in a new cluster + where all brokers are in the "joining" state). + </p><p> + Given a stable cluster of N brokers with one active primary and + N-1 ready backups, the system can sustain up to N-1 failures in + rapid succession. The surviving broker will be promoted to active + and continue to give service. + </p><p> + However, at this point the system <span class="emphasis"><em>cannot</em></span> + sustain a failure of the surviving broker until at least one of + the other brokers recovers, catches up and becomes a ready backup. + If the surviving broker fails before that, the cluster will fail in + one of two modes (depending on the exact timing of failures): + </p><div class="section"><div class="titlepage"><div><div><h5 class="title"><a id="ha-troubleshoot-the-cluster-hangs"></a>1. The cluster hangs</h5></div></div></div><p> + All brokers are in joining or catch-up mode. rgmanager tries to + promote a new primary but cannot find any candidates and so + gives up. 
clustat will show that the qpidd services are running + but the qpidd-primary service has stopped, something like + this: + </p><pre class="programlisting"> +Service Name Owner (Last) State +------- ---- ----- ------ ----- +service:mrg33-qpidd-service 20.0.10.33 started +service:mrg34-qpidd-service 20.0.10.34 started +service:mrg35-qpidd-service 20.0.10.35 started +service:qpidd-primary-service (20.0.10.33) stopped + </pre><p> + Eventually all brokers become stuck in "joining" mode, + as shown by: <code class="literal">qpid-ha status --all</code> + </p><p> + At this point you need to restart the cluster in one of the + following ways: + </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p> + Restart the entire cluster: + In <code class="literal">luci:<em class="replaceable"><code>your-cluster</code></em>:Nodes</code> + click reboot to restart the entire cluster. + </p></li><li class="listitem"><p> + Stop and restart the cluster with + <code class="literal">ccs --stopall; ccs --startall</code>. + </p></li><li class="listitem"><p> + Restart just the Qpid services: In <code class="literal">luci:<em class="replaceable"><code>your-cluster</code></em>:Service Groups</code> + </p><div class="orderedlist"><ol class="orderedlist" type="a"><li class="listitem"><p>Select all the qpidd (not qpidd-primary) services, click restart.</p></li><li class="listitem"><p>Select the qpidd-primary service, click restart.</p></li></ol></div><p> + </p></li><li class="listitem"><p> + Stop the <code class="literal">qpidd-primary</code> and + <code class="literal">qpidd</code> services with <code class="literal">clusvcadm</code>, + then restart them (qpidd-primary last). + </p></li></ol></div><p> + </p></div><div class="section"><div class="titlepage"><div><div><h5 class="title"><a id="ha-troubleshoot-the-cluster-reboots"></a>2. 
The cluster reboots</h5></div></div></div><p> + A new primary is promoted and the cluster is functional but all + non-persistent data from before the failure is lost. + </p></div></div><div class="section"><div class="titlepage"><div><div><h4 class="title"><a id="ha-troubleshoot-fencing-and-network-partitions"></a>1.12.12.5. Fencing and network partitions</h4></div></div></div><p> + A network partition is a network failure that divides the + cluster into two or more sub-clusters, where each broker can + communicate with brokers in its own sub-cluster but not with + brokers in other sub-clusters. This condition is also referred to + as a "split brain". + </p><p> + Nodes in one sub-cluster can't tell whether nodes in other + sub-clusters are dead or are still running but disconnected. We + cannot allow each sub-cluster to independently declare its own + qpidd primary and start serving clients, as the cluster will + become inconsistent. We must ensure only one sub-cluster continues + to provide service. + </p><p> + A <span class="emphasis"><em>quorum</em></span> determines which sub-cluster + continues to operate, and <span class="emphasis"><em>power fencing</em></span> + ensures that nodes in non-quorate sub-clusters cannot attempt to + provide service inconsistently. For more information see: + </p><p> + https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/High_Availability_Add-On_Overview/index.html, + chapter 2. Quorum and 4. Fencing. + </p></div></div><div class="footnotes"><br /><hr align="left" width="100" /><div class="footnote" id="ftn.idm140333888036112"><p><a class="para" href="#idm140333888036112"><sup class="para">[1] </sup></a> + You can control the maximum number of messages in the buffer by setting the + client's <code class="literal">capacity</code>. For details of how to set the capacity + in client code see "Using the Qpid Messaging API" in + <em class="citetitle">Programming in Apache Qpid</em>. 
+ </p></div><div class="footnote" id="ftn.idm140333887650224"><p><a class="para" href="#idm140333887650224"><sup class="para">[2] </sup></a> + Clients must use "at-least-once" reliability to enable re-send of unacknowledged + messages. This is the default behaviour; no options need to be set to enable it. For + details of client addressing options see "Using the Qpid Messaging API" + in <em class="citetitle">Programming in Apache Qpid</em>. + </p></div><div class="footnote" id="ftn.idm140333889680320"><p><a class="para" href="#idm140333889680320"><sup class="para">[3] </sup></a> + The full grammar for the URL is: + </p><pre class="programlisting"> +url = ["amqp:"][ user ["/" password] "@" ] addr ("," addr)* +addr = tcp_addr / rdma_addr / ssl_addr / ... +tcp_addr = ["tcp:"] host [":" port] +rdma_addr = "rdma:" host [":" port] +ssl_addr = "ssl:" host [":" port] + </pre></div></div></div><div class="navfooter"><hr /><table summary="Navigation footer" width="100%"><tr><td align="left" width="40%"><a accesskey="p" href="Using-message-groups.html">Prev</a> </td><td align="center" width="20%"><a accesskey="u" href="ch01.html">Up</a></td><td align="right" width="40%"> <a accesskey="n" href="ha-queue-replication.html">Next</a></td></tr><tr><td align="left" valign="top" width="40%">1.11.  + Using Message Groups +  </td><td align="center" width="20%"><a accesskey="h" href="index.html">Home</a></td><td align="right" valign="top" width="40%"> 1.13. Replicating Queues with the HA module</td></tr></table></div></div> \ No newline at end of file
