Re: Data synchronization between 2 running clusters on different availability zone

2014-12-01 Thread Robert Coli
On Thu, Nov 27, 2014 at 1:24 AM, Spico Florin spicoflo...@gmail.com wrote:

   I have another question. What about the following scenario: two
 Cassandra instances installed on different cloud providers (EC2, Flexiant)?
 How do you synchronize them? Can you use some internal tools or do I have
 to implement my own mechanism?


That's what I meant by "if maybe hybrid in the future, use GPFS" (GossipingPropertyFileSnitch):

http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architectureSnitchGossipPF_c.html

"Hybrid" in this case means AWS-and-not-AWS.

=Rob


Re: Data synchronization between 2 running clusters on different availability zone

2014-12-01 Thread Jeremy Jongsma
Here's a snitch we use for this situation - it uses a property file if it
exists, but falls back to EC2 autodiscovery if it is missing.

https://github.com/barchart/cassandra-plugins/blob/master/src/main/java/com/barchart/cassandra/plugins/snitch/GossipingPropertyFileWithEC2FallbackSnitch.java
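
The fallback behaviour described above can be sketched roughly as follows. This is a minimal Python illustration, not the linked Java snitch: the function names `parse_rackdc` and `resolve_location` and the `ec2_lookup` callback are hypothetical, and the only thing assumed about the property file is the `dc=` / `rack=` key format used by cassandra-rackdc.properties.

```python
import os
from typing import Callable, Tuple

Location = Tuple[str, str]  # (datacenter, rack)


def parse_rackdc(text: str) -> Location:
    """Parse 'dc=...' and 'rack=...' lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props["dc"], props["rack"]


def resolve_location(path: str, ec2_lookup: Callable[[], Location]) -> Location:
    """Prefer the property file if it exists; otherwise fall back.

    ec2_lookup is a hypothetical stand-in for EC2 autodiscovery
    (region as the DC, availability zone as the rack).
    """
    if os.path.exists(path):
        with open(path) as handle:
            return parse_rackdc(handle.read())
    return ec2_lookup()
```

With this shape, a node outside AWS only needs the property file, while an EC2 node without the file still gets a location from the lookup.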


Re: Data synchronization between 2 running clusters on different availability zone

2014-11-27 Thread Spico Florin
Hello, Rob!
  Thank you very much for the detailed support.
Regards,
 Florin


Re: Data synchronization between 2 running clusters on different availability zone

2014-11-27 Thread Spico Florin
Hello!
  I have another question. What about the following scenario: two Cassandra
instances installed on different cloud providers (EC2, Flexiant)? How do
you synchronize them? Can you use some internal tools or do I have to
implement my own mechanism?
Thanks.
 Florin



Re: Data synchronization between 2 running clusters on different availability zone

2014-11-27 Thread Eric Stevens
There's no reason you can't run on multiple cloud providers as long as you
treat them as logically distinct DCs.  It should largely work the same way
as running in several AWS regions, but you'll need to use something like
GossipingPropertyFileSnitch because the EC2 snitches are specific to AWS.
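
As a rough illustration of "logically distinct DCs", here is a small Python sketch that renders the per-node cassandra-rackdc.properties content read by GossipingPropertyFileSnitch, plus a keyspace definition that places replicas in each DC. The DC names (`aws`, `flexiant`), the rack name, the keyspace name, and the replica counts are assumptions made up for the example; the helper functions are hypothetical and not part of Cassandra.

```python
from typing import Dict


def rackdc_properties(dc: str, rack: str) -> str:
    """Render the dc/rack lines of cassandra-rackdc.properties for one node."""
    return f"dc={dc}\nrack={rack}\n"


def create_keyspace_cql(keyspace: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# One provider per logical DC (names are illustrative only).
print(rackdc_properties("aws", "rack1"))
print(rackdc_properties("flexiant", "rack1"))
print(create_keyspace_cql("demo", {"aws": 3, "flexiant": 3}))
```

Each node also needs `endpoint_snitch: GossipingPropertyFileSnitch` in cassandra.yaml; replication between the two DCs is then handled by Cassandra itself rather than by a separate synchronization tool.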


Data synchronization between 2 running clusters on different availability zone

2014-11-25 Thread Spico Florin
Hello!
   I have the following scenario:
1. For ensuring high availability I would like to install one Cassandra
cluster on one availability zone
(on Amazon EC2 US-east) and one Cassandra cluster on other AZ (Amazon EC2
US-west).
2. I have a pipeline that is running on Amazon EC2 US-east and is feeding the
Cassandra cluster installed in this AZ.
Here are my questions:
1. Is this scenario feasible?
2. Is the architecture correct regarding the availability of Cassandra?
3. If the architecture is fine, how do you keep data synchronized between
the two instances?

I look forward to your answers.
 Regards,
  Florin


Re: Data synchronization between 2 running clusters on different availability zone

2014-11-25 Thread Robert Coli
On Tue, Nov 25, 2014 at 7:09 AM, Spico Florin spicoflo...@gmail.com wrote:

 1. For ensuring high availability I would like to install one Cassandra
 cluster on one availability zone
 (on Amazon EC2 US-east) and one Cassandra cluster on other AZ (Amazon EC2
 US-west).


One cluster, replication factor of 2, cluster configured with a rack aware
snitch is how this is usually done. Well, more accurately, people usually
deploy with at least RF=3 and across 3 AZs. An RF of at least 3 is also
required for QUORUM Consistency Level to tolerate a replica being down
(at RF=2, QUORUM needs both replicas).
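
The RF and QUORUM point comes down to simple arithmetic: QUORUM needs floor(RF / 2) + 1 replicas to respond. A small Python sketch (the function names are just for illustration) shows why RF=2 leaves no room for a replica to be down while RF=3 does:

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at QUORUM: floor(rf / 2) + 1."""
    if rf < 1:
        raise ValueError("replication factor must be at least 1")
    return rf // 2 + 1


def tolerable_failures(rf: int) -> int:
    """Replicas that can be unavailable while QUORUM still succeeds."""
    return rf - quorum(rf)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: QUORUM={quorum(rf)}, may be down={tolerable_failures(rf)}")
# RF=1: QUORUM=1, may be down=0
# RF=2: QUORUM=2, may be down=0
# RF=3: QUORUM=2, may be down=1
# RF=5: QUORUM=3, may be down=2
```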

If you will always operate only out of EC2, you probably want to look into
the Ec2Snitch. If you plan to ultimately go multi-region, use
Ec2MultiRegionSnitch. If maybe hybrid in the future, use
GossipingPropertyFileSnitch.

http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architectureSnitchEC2_t.html
http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architectureSnitchEC2MultiRegion_c.html
http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architectureSnitchGossipPF_c.html

For some good meta on the internals, see:

https://issues.apache.org/jira/browse/CASSANDRA-3810

=Rob
http://twitter.com/rcolidba