[ 
https://issues.apache.org/jira/browse/QPID-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marnie McCormack updated QPID-33:
---------------------------------

    Description: 
This task has been created as an initial place holder from which it is 
anticipated many tasks will derive.

We currently have a clustering implementation which provides scalability but 
not high availability, i.e. if a broker in a cluster fails, its clients can 
fail over to another broker in the same cluster, BUT we do not have the 
ability to restart on another node at the last state before failure using the 
saved state (from shared storage).

The other brokers in a cluster will know about each other's queues etc. (via 
broadcasting), but not about any action the failed broker was processing, so 
we could potentially suffer message loss and state disconnect. Also note that 
cluster membership does not currently imply any automatic failover behaviour.

We know that there are users who require HA/fault tolerant clustering with 
99.999% availability.

A holding page for clustering & HA notes exists here: 
http://cwiki.apache.org/confluence/display/qpid/ClusteringHA with use case 
content.

The analysis for this task will involve expanding the design documentation and 
inviting review before work starts on the implementation. It also requires a 
thorough understanding of the protocol.


  was:
This task has been created as an initial place holder from which it is 
anticipated many tasks will derive.

We currently have a clustering implementation which provides scalability but 
not high availability, i.e. if a broker in a cluster fails, its clients can 
fail over to another broker in the same cluster, BUT we do not have the 
ability to restart on another node at the last state before failure using the 
saved state (from shared storage).

The other brokers in a cluster will know about each other's queues etc. (via 
broadcasting), but not about any action the failed broker was processing, so 
we could potentially suffer message loss and state disconnect. Also note that 
cluster membership does not currently imply any automatic failover behaviour.

We know that there are users who require HA/fault tolerant clustering with 
99.999% availability.

A holding page for clustering & HA notes exists here: 
http://wiki.apache.org/qpid/ClusteringHA with use case content.

The analysis for this task will involve expanding the design documentation and 
inviting review before work starts on the implementation. It also requires a 
thorough understanding of the protocol.



> Introduce clustering for high availability & fault tolerance
> ------------------------------------------------------------
>
>                 Key: QPID-33
>                 URL: https://issues.apache.org/jira/browse/QPID-33
>             Project: Qpid
>          Issue Type: New Feature
>          Components: Java Broker
>            Reporter: Marnie McCormack
>            Assignee: Rafael H. Schloming
>
> This task has been created as an initial place holder from which it is 
> anticipated many tasks will derive.
> We currently have a clustering implementation which provides scalability but 
> not high availability, i.e. if a broker in a cluster fails, its clients can 
> fail over to another broker in the same cluster, BUT we do not have the 
> ability to restart on another node at the last state before failure using 
> the saved state (from shared storage).
> The other brokers in a cluster will know about each other's queues etc. (via 
> broadcasting), but not about any action the failed broker was processing, so 
> we could potentially suffer message loss and state disconnect. Also note that 
> cluster membership does not currently imply any automatic failover behaviour.
> We know that there are users who require HA/fault tolerant clustering with 
> 99.999% availability.
> A holding page for clustering & HA notes exists here: 
> http://cwiki.apache.org/confluence/display/qpid/ClusteringHA with use case 
> content.
> The analysis for this task will involve expanding the design documentation 
> and inviting review before work starts on the implementation. It also 
> requires a thorough understanding of the protocol.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
