Re: advice for EC2 deployment

2011-06-23 Thread pankaj soni
hey,

I have got my EC2 multi-DC setup across AZ's, but in the same region, us-east.

Now I am trying to deploy cassandra over multiple regions, that is, EC2
us-west, Singapore and us-east. I have edited the config file as per
Sasha's reply below.

However, when I run nodetool in each DC, I only see the nodes from that
region. That is, EC2 us-west shows only the 2 nodes that are up in that
region, but not the other 2 that are in us-east.

Kindly suggest a solution.

-thanks

On Wed, Apr 27, 2011 at 5:45 PM, Sasha Dolgy sdo...@gmail.com wrote:

 Hi,

 If I understand you correctly, you are trying to get a private ip in
 us-east speaking to the private ip in us-west.  to make your life
 easier, configure your nodes to use the hostname of the server.  if it's
 in a different region, it will use the public ip (ec2 dns will handle
 this for you) and if it's in the same region, it will use the private
 ip.  this way you can stop worrying about whether you are using the
 public or private ip to communicate with another node.  let the aws dns
 do the work for you.
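
 For concreteness, that might look like this in cassandra.yaml (a minimal
 0.8-style sketch; the hostnames are hypothetical examples, and each node
 lists its *own* public DNS name as listen_address):

 listen_address: ec2-50-16-0-1.compute-1.amazonaws.com
 seed_provider:
     - class_name: org.apache.cassandra.locator.SimpleSeedProvider
       parameters:
           - seeds: "ec2-50-16-0-1.compute-1.amazonaws.com,ec2-184-72-0-2.us-west-1.compute.amazonaws.com"

 Amazon's split-horizon DNS then resolves each name to the private IP inside
 its own region and to the public IP from anywhere else.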

 just make sure you are using v0.8 with SSL turned on and have the
 appropriate security group definitions ...
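
 (in 0.8 the SSL piece lives in cassandra.yaml under encryption_options -- a
 sketch from memory, so treat the exact keys as an assumption and the
 keystore paths/passwords as placeholders:

 encryption_options:
     internode_encryption: all
     keystore: conf/.keystore
     keystore_password: cassandra
     truststore: conf/.truststore
     truststore_password: cassandra

 and the security groups need the storage port, 7000 by default, open
 between all the nodes, plus 9160 for thrift clients.)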

 -sasha



 On Wed, Apr 27, 2011 at 1:55 PM, pankajsoni0126
 pankajsoni0...@gmail.com wrote:
  I have been trying to deploy a Cassandra cluster across regions, and for
  that I posted the thread "IP address resolution in MultiDC setup."

  But when it comes to getting nodes in different regions, say us-east and
  us-west, talking to each other over the private IPs of the EC2 nodes, I
  am facing problems.

  I am assuming that if Cassandra is built for multi-DC setups, it should
  be easily deployed with node1-of-DC1's public IP listed as a seed in all
  nodes in DC2, so they can gain an idea of the network topology? But I
  have hit a dead end deploying in such a scenario.

  Or is there any way to use private IPs for such a scenario in EC2, as
  public IPs are less secure and costly?



Re: advice for EC2 deployment

2011-06-23 Thread Sasha Dolgy
are you able to open a connection from one of the nodes to a node on
the other side?  us-east to us-west?  could your problem be as simple
as connectivity and/or security group configuration?
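
(a quick way to test, with a hypothetical us-west hostname and the default
storage_port of 7000:

telnet ec2-184-72-0-2.us-west-1.compute.amazonaws.com 7000

if that hangs, it's almost certainly the security group or firewall rather
than cassandra.)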

On Thu, Jun 23, 2011 at 1:51 PM, pankaj soni pankajsoni0...@gmail.com wrote:
 hey,
 I have got my EC2 multi-DC setup across AZ's, but in the same region, us-east.
 Now I am trying to deploy cassandra over multiple regions, that is, EC2
 us-west, Singapore and us-east. I have edited the config file as per
 Sasha's reply below.
 However, when I run nodetool in each DC, I only see the nodes from that
 region. That is, EC2 us-west shows only the 2 nodes that are up in that
 region, but not the other 2 that are in us-east.
 Kindly suggest a solution.
 -thanks
 On Wed, Apr 27, 2011 at 5:45 PM, Sasha Dolgy sdo...@gmail.com wrote:

 Hi,

 If I understand you correctly, you are trying to get a private ip in
 us-east speaking to the private ip in us-west.  to make your life
 easier, configure your nodes to use the hostname of the server.  if it's
 in a different region, it will use the public ip (ec2 dns will handle
 this for you) and if it's in the same region, it will use the private
 ip.  this way you can stop worrying about whether you are using the
 public or private ip to communicate with another node.  let the aws dns
 do the work for you.

 just make sure you are using v0.8 with SSL turned on and have the
 appropriate security group definitions ...

 -sasha



 On Wed, Apr 27, 2011 at 1:55 PM, pankajsoni0126
 pankajsoni0...@gmail.com wrote:
  I have been trying to deploy a Cassandra cluster across regions, and for
  that I posted the thread "IP address resolution in MultiDC setup."

  But when it comes to getting nodes in different regions, say us-east and
  us-west, talking to each other over the private IPs of the EC2 nodes, I
  am facing problems.

  I am assuming that if Cassandra is built for multi-DC setups, it should
  be easily deployed with node1-of-DC1's public IP listed as a seed in all
  nodes in DC2, so they can gain an idea of the network topology? But I
  have hit a dead end deploying in such a scenario.

  Or is there any way to use private IPs for such a scenario in EC2, as
  public IPs are less secure and costly?





-- 
Sasha Dolgy
sasha.do...@gmail.com


Re: advice for EC2 deployment

2011-06-23 Thread pankajsoni0126
No. The nodes within each DC are able to discover each other, but across
the DCs it's not happening.

I double-checked the config parameters required on both the Amazon side and
in cassandra.yaml before posting the query here.

Has anybody got their nodes talking to each other across regions by just
using public DNS?

I am also looking into OpenVPN and how to deploy it.

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/advice-for-EC2-deployment-tp6294613p6508278.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.


Re: advice for EC2 deployment

2011-06-23 Thread Sameer Farooqui
EC2Snitch doesn't currently support multiple Amazon regions.

Tickets to track:
https://issues.apache.org/jira/browse/CASSANDRA-2452
https://issues.apache.org/jira/browse/CASSANDRA-2491
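
Until those land, the usual workaround is the PropertyFileSnitch mentioned
elsewhere in this thread: assign each node's IP to a DC and rack by hand in
conf/cassandra-topology.properties. A sketch, with hypothetical IPs and
DC/rack names:

# node IP = data center : rack
50.16.0.1=us-east:1a
50.16.0.2=us-east:1b
184.72.0.1=us-west:1a
default=us-east:1a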

Let us know if/how you get the OpenVPN connection to work across Regions.


On Thu, Jun 23, 2011 at 6:29 AM, pankajsoni0126 pankajsoni0...@gmail.comwrote:

 No. The nodes within each DC are able to discover each other, but across
 the DCs it's not happening.

 I double-checked the config parameters required on both the Amazon side
 and in cassandra.yaml before posting the query here.

 Has anybody got their nodes talking to each other across regions by just
 using public DNS?

 I am also looking into OpenVPN and how to deploy it.

 --
 View this message in context:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/advice-for-EC2-deployment-tp6294613p6508278.html
 Sent from the cassandra-u...@incubator.apache.org mailing list archive at
 Nabble.com.



Re: advice for EC2 deployment

2011-06-23 Thread Sasha Dolgy
we use a combination of Vyatta & OpenVPN on the nodes that are EC2 and
nodes that aren't EC2 ... works a treat.

On Thu, Jun 23, 2011 at 10:23 PM, Sameer Farooqui
cassandral...@gmail.com wrote:
 EC2Snitch doesn't currently support multiple Amazon regions.
 Tickets to track:
 https://issues.apache.org/jira/browse/CASSANDRA-2452
 https://issues.apache.org/jira/browse/CASSANDRA-2491
 Let us know if/how you get the OpenVPN connection to work across Regions.

 On Thu, Jun 23, 2011 at 6:29 AM, pankajsoni0126 pankajsoni0...@gmail.com
 wrote:

 No, the nodes in the separate DC's are able to discover each other. But
 across the Dc's its not happening.

 I have double checked the config parameters, both require in amazon
 settings
 and cassandra.yaml before posting query here.

 has anybody got there nodes talking to each other across regions by just
 using public-dns?

 I am also looking into open vpn and how to deploy it.


Re: advice for EC2 deployment

2011-04-28 Thread aaron morton
If you are not going to be multi-region straight away but wish to be in the
near future, I would consider:

- 1 region
- 2 AZ's, with the same number of nodes
- Using the EC2Snitch as is, this will map to 1 cassandra DC and 2 cassandra 
Racks
- Using the NetworkTopology strategy 

For background see this excellent discussion from Peter 
http://www.mail-archive.com/user@cassandra.apache.org/msg12092.html and jump 
into NetworkTopologyStrategy.calculateNaturalEndpoints() if you are so inclined.
 
In each DC the NTS will first spread the replicas across each rack (AWS AZ).
It then chooses replicas based on token order.

I think you may be able to spread the replicas evenly across the racks (AWS
AZs), so that a total failure of one AZ means the cassandra DC still has
enough nodes in the other AZ to continue working; this would mean a high RF
in the DC (4 or 6, say). Not 100% sure, but I think that's correct.

The reason for doing so would be to make life easier in the future when you
want to add another DC. You would update the options for the NTS replication
strategy to add the DC with the correct RF, then run repair.
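
For example (a cassandra-cli sketch in 0.8-era syntax; the keyspace name is
hypothetical, and the DC names must match whatever your snitch reports):

update keyspace MyKS
  with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
  and strategy_options = [{us-east:3, us-west:3}];

followed by nodetool repair on each node in the new DC.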

Hope that helps. 
Aaron

 
On 28 Apr 2011, at 01:38, William Oberman wrote:

 I think you're right about changing NetworkTopologyStrategy, but the timing
 isn't working in my favor at this point.  I wonder how bad that will really
 be...
 
 On Wed, Apr 27, 2011 at 9:35 AM, Sasha Dolgy sdo...@gmail.com wrote:
 so can you not simply leverage a strategy that replicates data between
 racks and at some point in the future when you move to multi-dc
 upgrade the replication strategy to maintain the current replication
 and add in some replication between DC's ... ?
 
 i'll go re-read your posts to see if you've already tried this.  I
 vaguely remember Ellis saying it's not a good idea to switch
 NetworkTopologyStrategy ...
 
 On Wed, Apr 27, 2011 at 3:29 PM, William Oberman
 ober...@civicscience.com wrote:
  Thanks Sasha.  Fortunately/unfortunately I did realize the default & current
  behavior of the Ec2Snitch, but my application isn't multi-region capable
  (yet), so I need to get intra-region redundancy.  And having a
  SingleRegionEc2Snitch that did DC=ec2zone and RACK=??? would be much better
  for me (for now).
 
 
 
 -- 
 Will Oberman
 Civic Science, Inc.
 3030 Penn Avenue., First Floor
 Pittsburgh, PA 15201
 (M) 412-480-7835
 (E) ober...@civicscience.com



Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
While I haven't configured it for multi-region yet, Sasha is exactly right
on how Amazon's DNS works (returning the private vs. public IP depending on
whether the machine is local to the region or not).  For extra fun, now that
Route53 exists you can (somewhat trivially) map and dynamically maintain all
EC2 instances to stable DNS names (but make sure to use CNAMEs to get the
DNS magic!).  E.g.
cassandra1.somethinghardtoguess.ec2.yourdomain.com ->
weird.ec2.public.dns.name

I'd drop in the somethinghardtoguess part myself, given Route53 can expose
your internal network topology if someone can guess the DNS name.

will

On Wed, Apr 27, 2011 at 8:15 AM, Sasha Dolgy sdo...@gmail.com wrote:

 Hi,

 If I understand you correctly, you are trying to get a private ip in
 us-east speaking to the private ip in us-west.  to make your life
 easier, configure your nodes to use the hostname of the server.  if it's
 in a different region, it will use the public ip (ec2 dns will handle
 this for you) and if it's in the same region, it will use the private
 ip.  this way you can stop worrying about whether you are using the
 public or private ip to communicate with another node.  let the aws dns
 do the work for you.

 just make sure you are using v0.8 with SSL turned on and have the
 appropriate security group definitions ...

 -sasha



 On Wed, Apr 27, 2011 at 1:55 PM, pankajsoni0126
 pankajsoni0...@gmail.com wrote:
  I have been trying to deploy a Cassandra cluster across regions, and for
  that I posted the thread "IP address resolution in MultiDC setup."

  But when it comes to getting nodes in different regions, say us-east and
  us-west, talking to each other over the private IPs of the EC2 nodes, I
  am facing problems.

  I am assuming that if Cassandra is built for multi-DC setups, it should
  be easily deployed with node1-of-DC1's public IP listed as a seed in all
  nodes in DC2, so they can gain an idea of the network topology? But I
  have hit a dead end deploying in such a scenario.

  Or is there any way to use private IPs for such a scenario in EC2, as
  public IPs are less secure and costly?




-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com


Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
It's great advice, but I'm still torn.  I've never done multi-region work
before, and I'd prefer to wait for 0.8 with built-in inter-node security,
but I'm otherwise ready to roll (and need to roll) cassandra out sooner than
that.

Given how well my system held up with a total single-AZ failure, I'm really
leaning toward starting by treating AZ's as DCs, and racks as... random?  I
don't think that part matters.  My question for today is whether to just use
the property file snitch, or to roll my own version of Ec2Snitch that does
AZ as DC.

I do increase my risk by being single-region to start, so I was going to
figure out how to push snapshots to S3.  One question on that note: is it
better to try and snapshot all nodes at roughly the same point in time, or
is it better to do rolling snapshots?
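
(For reference, the mechanics I have in mind -- a sketch using nodetool plus
the 2011-era s3cmd tool, where the keyspace path and bucket name are made-up
examples and the snapshot directory layout varies by Cassandra version:

nodetool -h localhost -p 7199 snapshot
s3cmd sync /var/lib/cassandra/data/MyKeyspace/snapshots/ s3://my-backup-bucket/$(hostname)/

run per node, one node at a time, if rolling.)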

will

On Wed, Apr 27, 2011 at 7:13 AM, aaron morton aa...@thelastpickle.comwrote:

 Using the EC2Snitch you could have one AZ in us-east-1 and one AZ in
 us-west-1, treat each AZ as a single rack and each region as a DC.  The
 network topology is rack aware, so it will prefer requests that go to the
 same rack (not much of an issue when you have only one rack).

 If possible I would use the same RF in each DC, if you want the fail over
 to be as clean as possible (see earlier comments about the number of failed
 nodes in a DC), i.e. 3 replicas in each DC / region.

 Until you find a reason otherwise use LOCAL_QUORUM, if that proves to be
 too slow or you get more experience and feel comfortable with the trade offs
 then change to a lower CL.

 Dropping the CL level for write bursts does not make the cluster run any
 faster; it lets the client think the cluster is running faster, and can
 result in the client overloading (in a good "this is what it should do" way)
 the cluster.  This can result in more eventual-consistency work to be done
 later during maintenance or read requests.  If that is a reasonable trade
 off, you can write at CL ONE and read at CL ALL to ensure you get consistent
 reads (QUORUM is not good enough in that case).
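
 To make the arithmetic concrete: a read is guaranteed to see the latest
 write when R + W > N (the read and write replica sets must overlap).  With
 RF = N = 3, QUORUM is floor(3/2) + 1 = 2, so QUORUM writes plus QUORUM
 reads give 2 + 2 = 4 > 3.  Writing at ONE gives W = 1, and a QUORUM read
 then yields 1 + 2 = 3, which does not overlap; only ALL (R = 3, so
 1 + 3 = 4 > 3) guarantees consistent reads.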

 Jump in and test it at Quorum, you may find the write performance is good
 enough. There are lots of dials to play with
 http://wiki.apache.org/cassandra/MemtableThresholds

 Hope that helps.
 Aaron


 On 27 Apr 2011, at 09:31, William Oberman wrote:

 I see what you're saying.  I was able to control write latency on mysql
 using insert vs. insert delayed (what I feel is MySQL's poor man's eventual
 consistency option) + the fact that replication was a background
 asynchronous process.  In terms of read latency, I was able to do up to a
 few hundred well indexed mysql queries (across AZs) on a view while keeping
 the overall latency of the page around or less than a second.

 I basically am replacing two use cases, the cases with difficult to scale
 anticipated write volumes.  The first case was previously using insert
 delayed (which I'm doing in cassandra as ONE) as I wasn't getting consistent
 write/read operations before anyways.  The second case was using traditional
 insert (which I was going to replace with some QUORUM-like level, I was
 assuming LOCAL_QUORUM).  But, the latter case uses a write through memory
 cache (memcache), so I don't know how often it really reads data from the
 persistent store.  But I definitely need to make sure it is consistent.

 In any case, it sounds like I'd be best served treating AZs as DCs, but
 then I don't know what to make the racks?  Or do racks not matter in a single
 AZ?  That way I can get an ack from a LOCAL_QUORUM read/write before the
 (slightly) slower read/write to/from the other AZ (for redundancy).  Then
 I'm only screwed if Amazon has a multi-AZ failure (so far, they've kept it
 to only one!) :-)

 will

 On Tue, Apr 26, 2011 at 5:01 PM, aaron morton aa...@thelastpickle.comwrote:

 One difference between Cassandra and MySQL replication may be when the
 network IO happens. Was the MySQL replication synchronous on transaction
 commit ?  I was only aware that it had async replication, which means the
 client is not exposed to the network latency. In cassandra the network
 latency is exposed to the client as it needs to wait for the CL number of
 nodes to respond.

 If you use the PropertyFileSnitch with the NetworkTopologyStrategy you can
 manually assign machines to racks / dc's based on IP.
 See the conf/cassandra-topology.properties file. There is also an Ec2Snitch
 which (from the code):
 /**
  * A snitch that assumes an EC2 region is a DC and an EC2
 availability_zone
  *  is a rack. This information is available in the config for the node.

 Recent discussion on DC aware CL levels:
 http://www.mail-archive.com/user@cassandra.apache.org/msg11414.html

 Hope that helps.
 Aaron


 On 27 Apr 2011, at 01:18, William Oberman wrote:

 Thanks Aaron!

 Unless no one on this list uses EC2, there were a few minor troubles end
 of last week through the weekend which taught me a lot about obscure failure
 modes 

Re: advice for EC2 deployment

2011-04-27 Thread Sasha Dolgy
Hi William,

The default behavior of Ec2Snitch is outlined below:

http://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/locator/Ec2Snitch.java

// Split "us-east-1a" or "asia-1a" into "us-east"/"1a" and "asia"/"1a".
String azone = new String(b, "UTF-8");
String[] splits = azone.split("-");
ec2zone = splits[splits.length - 1];
ec2region = splits.length < 3 ? splits[0] : splits[0] + "-" + splits[1];
logger.info("EC2Snitch using region: " + ec2region + ", zone: "
    + ec2zone + ".");

ApplicationState.DC = ec2region
ApplicationState.RACK = ec2zone

We leverage cassandra instances in APAC, US & Europe ... so it's
important for us to know that we have one data center in each 'region'
and multiple racks per DC ...

-sasha

On Wed, Apr 27, 2011 at 3:06 PM, William Oberman
ober...@civicscience.com wrote:
 It's great advice, but I'm still torn.  I've never done multi-region work
 before, and I'd prefer to wait for 0.8 with built-in inter-node security,
 but I'm otherwise ready to roll (and need to roll) cassandra out sooner than
 that.

 Given how well my system held up with a total single AZ failure, I'm really
 leaning on starting by treating AZ's as DCs, and racks as... random?  I
 don't think that part matters.  My question for today is to just use the
 property file snitch, or to roll my own version of Ec2Snitch that does AZ as
 DC.

 I do increase my risk by being single-region to start, so I was going to figure
 out how to push snapshots to S3.  One question on that note: is it better to
 try and snapshot all nodes at roughly the same point in time, or is it
 better to do rolling snapshots?

 will


Re: advice for EC2 deployment

2011-04-27 Thread Sasha Dolgy
if you migrate the instance, does Route53 automatically re-map all the
information to the new ec2 instance?  another issue is that cassandra
only maintains the IPs of the other nodes, and not the hostnames
(assumed based on the output of nodetool ring) ...

which means, if you migrate the instance and Route53 does do some
auto-magic .. the private ip for the instance will have changed and
you will need to migrate that node back into the ring, while moving
the old referenced IP out ... we've had quite a lot of pain with this
in the past.  rule of thumb: if you want to upgrade / migrate an
instance, you need to remove it from the ring, do your work, and
bootstrap it back into the ring .. i think this could be avoided if
cassandra maintained hostname references and not just IP references
for nodes.
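
the dance, roughly (a sketch; <node> is a placeholder for the host):

nodetool -h <node> decommission   # stream this node's ranges to the others
# ... rebuild / migrate the instance; it comes back with a new private ip ...
# start cassandra, let it bootstrap back in, then verify:
nodetool -h <node> ring
nodetool -h <node> netstats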

-sasha

On Wed, Apr 27, 2011 at 2:56 PM, William Oberman
ober...@civicscience.com wrote:
 While I haven't configured it for multi-region yet, Sasha is exactly right
 on how Amazon's DNS works (returning the private vs. public IP depending on whether
 the machine is local to the region or not).  For extra fun, now that Route53
 exists you can (somewhat trivially) map and dynamically maintain all EC2
 instances to stable DNS names (but make sure to use CNAMEs to get the DNS
 magic!).  E.g.
 cassandra1.somethinghardtoguess.ec2.yourdomain.com ->
 weird.ec2.public.dns.name

 I'd drop in the somethinghardtoguess part myself, given Route53 can expose your
 internal network topology if someone can guess the DNS name.

 will


Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
Thanks Sasha.  Fortunately/unfortunately I did realize the default & current
behavior of the Ec2Snitch, but my application isn't multi-region capable
(yet), so I need to get intra-region redundancy.  And having a
SingleRegionEc2Snitch that did DC=ec2zone and RACK=??? would be much better
for me (for now).

On Wed, Apr 27, 2011 at 9:21 AM, Sasha Dolgy sdo...@gmail.com wrote:

 Hi William,

 The default behavior of Ec2Snitch is outlined below:


 http://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/locator/Ec2Snitch.java

// Split "us-east-1a" or "asia-1a" into "us-east"/"1a" and "asia"/"1a".
String azone = new String(b, "UTF-8");
String[] splits = azone.split("-");
ec2zone = splits[splits.length - 1];
ec2region = splits.length < 3 ? splits[0] : splits[0] + "-" + splits[1];
logger.info("EC2Snitch using region: " + ec2region + ", zone: "
    + ec2zone + ".");

 ApplicationState.DC = ec2region
 ApplicationState.RACK = ec2zone

 We leverage cassandra instances in APAC, US & Europe ... so it's
 important for us to know that we have one data center in each 'region'
 and multiple racks per DC ...

 -sasha

 On Wed, Apr 27, 2011 at 3:06 PM, William Oberman
 ober...@civicscience.com wrote:
  It's great advice, but I'm still torn.  I've never done multi-region work
  before, and I'd prefer to wait for 0.8 with built-in inter-node security,
  but I'm otherwise ready to roll (and need to roll) cassandra out sooner
 than
  that.
 
  Given how well my system held up with a total single AZ failure, I'm
 really
  leaning on starting by treating AZ's as DCs, and racks as... random?  I
  don't think that part matters.  My question for today is to just use the
  property file snitch, or to roll my own version of Ec2Snitch that does AZ
 as
  DC.
 
  I do increase my risk by being single-region to start, so I was going to
 figure
  out how to push snapshots to S3.  One question on that note: is it better
 to
  try and snapshot all nodes at roughly the same point in time, or is it
  better to do rolling snapshots?
 
  will




-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com


Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
I don't think of it as migrating an instance, it's more of a destroy/start
with EC2.  But, I still think it would be very useful to spin up a set of
instances with known hostnames (cassandra1, 2, 3... N) and be able to
quickly SSH to them by doing ssh ec2u...@cassandra1.random.ec2.mydomain.com
.

Also, it makes finding seeds a lot easier, as you don't have to manage IPs
in the config file, just names (cassandra-seed1.random.ec2.mydomain.com).

I should have mentioned it, but people that are already doing this trick
(I'm not... yet) are actually doing: hostname.region.ec2.mydomain.com (as
it's useful to know the region).  I don't think anything cares about AZ, but you
could embed that too if it matters.

will

On Wed, Apr 27, 2011 at 9:26 AM, Sasha Dolgy sdo...@gmail.com wrote:

 if you migrate the instance, does Route53 automatically re-map all the
 information to the new ec2 instance?  another issue is that cassandra
 only maintains the IPs of the other nodes, and not the hostnames
 (assumed based on the output of nodetool ring) ...

 which means, if you migrate the instance and Route53 does do some
 auto-magic .. the private ip for the instance will have changed and
 you will need to migrate that node back into the ring, while moving
 the old referenced IP out ... we've had quite a lot of pain with this
 in the past.  rule of thumb: if you want to upgrade / migrate an
 instance, you need to remove it from the ring, do your work, and
 bootstrap it back into the ring .. i think this could be avoided if
 cassandra maintained hostname references and not just IP references
 for nodes.

 -sasha

 On Wed, Apr 27, 2011 at 2:56 PM, William Oberman
 ober...@civicscience.com wrote:
  While I haven't configured it for multi-region yet, Sasha is exactly right
  on how Amazon's DNS works (returning the private vs. public IP depending on
  whether the machine is local to the region or not).  For extra fun, now that
 Route53
  exists you can (somewhat trivially) map and dynamically maintain all EC2
  instances to stable DNS names (but make sure to use CNAMEs to get the DNS
  magic!).  E.g.
  cassandra1.somethinghardtoguess.ec2.yourdomain.com ->
  weird.ec2.public.dns.name
 
  I'd drop in the somethinghardtoguess part myself, given Route53 can expose your
  internal network topology if someone can guess the DNS name.
 
  will




-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com


Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
Oh, and Route53 doesn't do anything automatically, but there is an API to
manage the DNS.  It's up to you to run a task on instance boot/terminate, or
a cron job, if you want to do this trick (for now; it seems like a solid
future feature for Route53).  Though, I hear geographically aware Route53 is
already in the works (to route EC2 traffic to the closest region).
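
The boot-time task is simple enough to sketch.  The metadata URL is real;
"route53-upsert" is a hypothetical stand-in for whatever client you wrap
around the Route53 REST API (e.g. boto):

PUBLIC_DNS=$(curl -s http://169.254.169.254/latest/meta-data/public-hostname)
route53-upsert CNAME "cassandra1.somethinghardtoguess.ec2.yourdomain.com." "$PUBLIC_DNS"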

will

On Wed, Apr 27, 2011 at 9:33 AM, William Oberman
ober...@civicscience.comwrote:

 I don't think of it as migrating an instance, it's more of a destroy/start
 with EC2.  But, I still think it would be very useful to spin up a set of
 instances with known hostnames (cassandra1, 2, 3... N) and be able to
 quickly SSH to them by doing ssh
 ec2u...@cassandra1.random.ec2.mydomain.com.

 Also, it makes finding seeds a lot easier, as you don't have to manage IPs
 in the config file, just names (cassandra-seed1.random.ec2.mydomain.com).

 I should have mentioned it, but people that are already doing this trick
 (I'm not... yet) are actually doing: hostname.region.ec2.mydomain.com (as
  it's useful to know the region).  I don't think anything cares about AZ, but you
 could embed that too if it matters.

 will


 On Wed, Apr 27, 2011 at 9:26 AM, Sasha Dolgy sdo...@gmail.com wrote:

  if you migrate the instance, does Route53 automatically re-map all the
  information to the new ec2 instance?  another issue is that cassandra
  only maintains the IPs of the other nodes, and not the hostnames
  (assumed based on the output of nodetool ring) ...

  which means, if you migrate the instance and Route53 does do some
  auto-magic .. the private ip for the instance will have changed and
  you will need to migrate that node back into the ring, while moving
  the old referenced IP out ... we've had quite a lot of pain with this
  in the past.  rule of thumb: if you want to upgrade / migrate an
  instance, you need to remove it from the ring, do your work, and
  bootstrap it back into the ring .. i think this could be avoided if
  cassandra maintained hostname references and not just IP references
  for nodes.

 -sasha

 On Wed, Apr 27, 2011 at 2:56 PM, William Oberman
 ober...@civicscience.com wrote:
   While I haven't configured it for multi-region yet, Sasha is exactly right
   on how Amazon's DNS works (returning the private vs. public IP depending on
   whether the machine is local to the region or not).  For extra fun, now that
 Route53
  exists you can (somewhat trivially) map and dynamically maintain all EC2
  instances to stable DNS names (but make sure to use CNAMEs to get the
 DNS
  magic!).  E.g.
   cassandra1.somethinghardtoguess.ec2.yourdomain.com ->
  weird.ec2.public.dns.name
 
   I'd drop in the somethinghardtoguess part myself, given Route53 can expose
  your
  internal network topology if someone can guess the DNS name.
 
  will




 --
 Will Oberman
 Civic Science, Inc.
 3030 Penn Avenue., First Floor
 Pittsburgh, PA 15201
 (M) 412-480-7835
 (E) ober...@civicscience.com




-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com


Re: advice for EC2 deployment

2011-04-27 Thread Sasha Dolgy
so can you not simply leverage a strategy that replicates data between
racks and at some point in the future when you move to multi-dc
upgrade the replication strategy to maintain the current replication
and add in some replication between DC's ... ?

i'll go re-read your posts to see if you've already tried this.  I
vaguely remember Ellis saying it's not a good idea to switch
NetworkTopologyStrategy ...

On Wed, Apr 27, 2011 at 3:29 PM, William Oberman
ober...@civicscience.com wrote:
 Thanks Sasha.  Fortunately/unfortunately I did realize the default & current
 behavior of the Ec2Snitch, but my application isn't multi-region capable
 (yet), so I need to get intra-region redundancy.  And having a
 SingleRegionEc2Snitch that did DC=ec2zone and RACK=??? would be much better
 for me (for now).


Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
I think you're right about changing NetworkTopologyStrategy, but the timing
isn't working in my favor at this point.  I wonder how bad that will really
be...

On Wed, Apr 27, 2011 at 9:35 AM, Sasha Dolgy sdo...@gmail.com wrote:

 so can you not simply leverage a strategy that replicates data between
 racks and at some point in the future when you move to multi-dc
 upgrade the replication strategy to maintain the current replication
 and add in some replication between DC's ... ?

 i'll go re-read your posts to see if you've already tried this.  I
 vaguely remember Ellis saying it's not a good idea to switch
 NetworkTopologyStrategy ...

 On Wed, Apr 27, 2011 at 3:29 PM, William Oberman
 ober...@civicscience.com wrote:
   Thanks Sasha.  Fortunately/unfortunately I did realize the default &
  current
  behavior of the Ec2Snitch, but my application isn't multi-region capable
  (yet), so I need to get intra-region redundancy.  And having a
  SingleRegionEc2Snitch that did DC=ec2zone and RACK=??? would be much
 better
  for me (for now).




-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com


Re: advice for EC2 deployment

2011-04-26 Thread William Oberman
Thanks Aaron!

Unless no one on this list uses EC2, there were a few minor troubles end of
last week through the weekend which taught me a lot about obscure failure
modes in various applications I use :-)  My original post was trying to be
more redundant than fast, which has been my overall goal from even before
moving to Cassandra (my downtime from the EC2 madness was minimal, and due
to only having one single point of failure == the Amazon load balancer).  My
secondary goal was trying to make moving to a second region easier, but if
that is causing problems I can drop the idea.

I might be downplaying the cost of inter-AZ communication, but I've lived
with that for quite some time, for example my current setup of MySQL in
Master-Master replication is split over zones, and my webservers live in yet
different zones.  Maybe Cassandra is chattier than I'm used to?  (again,
I'm fairly new to cassandra)

Based on that article, the discussion, and the recent EC2 issues, it sounds
like it would be better to start with:
-6 nodes split in two AZs 3/3
-Configure replication to do 2 in one AZ and one in the other
(NetworkTopology treats AZs as racks, so does RF=3,us-east=3 make this
happen naturally?)
-What does LOCAL_QUORUM do in this case?  Is there a rack quorum?  Or do
the natural latencies of AZs make LOCAL_QUORUM behave like a rack quorum?

will

On Tue, Apr 26, 2011 at 1:14 AM, aaron morton aa...@thelastpickle.comwrote:

  For background see this article:
  http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers

  And this recent discussion:
  http://www.mail-archive.com/user@cassandra.apache.org/msg12502.html

  Issues that may be a concern:
  - lots of cross-AZ latency in us-east, e.g. LOCAL_QUORUM ops must wait
  cross-AZ.  Also consider it during maintenance tasks: how much of a pain is
  it going to be to have latency between every node.
 - IMHO not having sufficient (by that I mean 3) replicas in a cassandra DC
 to handle a single node failure when working at Quorum reduces the utility
 of the DC. e.g. with a local RF of 2 in the west, the quorum is 2, and if
 you lose one node from the replica set you will not be able to use local
 QUORUM for keys in that range. Or consider a failure mode where the west is
 disconnected from the east.

  Could you start simple with 3 replicas in one AZ in us-east and 3 replicas
  in an AZ in another region?  Then work through some failure scenarios.

 Hope that helps.
 Aaron


 On 22 Apr 2011, at 03:28, William Oberman wrote:

 Hi,

 My service is not yet ready to be fully multi-DC, due to how some of my
 legacy MySQL stuff works.  But, I wanted to get cassandra going ASAP and
 work towards multi-DC.  I have two main cassandra use cases: one where I can
 handle eventual consistency (and all of the writes/reads are currently ONE),
 and one where I can't (writes/reads are currently QUORUM).  My test cluster
  is currently 4 smalls, all in us-east with RF=3 (more to prove I can do
  clustering than to have an exact production replica).  All of my unit
  tests and load tests (again, not to prove true max load, but more to
  tease out concurrency issues) are passing now.

 For production, I was thinking of doing:
  -4 cassandra larges in us-east (where I am now), one in each AZ
 -1 cassandra large in us-west (where I have nothing)
  For now, my data can fit into a single large's 2-disk ephemeral using
  RAID0, and I was then thinking of doing RF=3 with us-east=2 and
  us-west=1 (sketched after this list).  If I do eventual consistency at
  ONE, and consistency at LOCAL_QUORUM, I was hoping:
 -consistent ops would be pretty fast (what does LOCAL_QUORUM do in this
 case?  return after 1 or 2 us-east nodes ack?)
  -us-west would contain a complete copy of my data, so it's a good,
  eventually consistent, close-to-real-time backup (assuming it can keep up
  over long periods of time, but I think it should)
 -eventually, when I'm ready to roll out in us-west I'll be able to change
 the replication settings and that server in us-west could help seed new
 cassandra instances faster than the ones in us-east
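
  The us-east=2 / us-west=1 split mentioned above would be declared
  something like this in cassandra-cli (0.8-era syntax, with a hypothetical
  keyspace name):

  create keyspace MyKS
    with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
    and strategy_options = [{us-east:2, us-west:1}];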

  Or am I missing something really fundamental about how cassandra works,
  making this a terrible plan?  I should have plenty of time to get my
  multi-DC working before the instance in us-west fills up (but even then, I
  should be able to add instances over there to stall that fairly trivially,
  right?).

 Thanks!

 will





-- 
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com


Re: advice for EC2 deployment

2011-04-26 Thread aaron morton
One difference between Cassandra and MySQL replication may be when the network 
IO happens. Was the MySQL replication synchronous on transaction commit?  I
was only aware that it had async replication, which means the client is not 
exposed to the network latency. In cassandra the network latency is exposed to 
the client as it needs to wait for the CL number of nodes to respond. 

If you use the PropertyFileSnitch with the NetworkTopologyStrategy you can
manually assign machines to racks / dc's based on IP.
See the conf/cassandra-topology.properties file. There is also an Ec2Snitch
which (from the code):
/**
 * A snitch that assumes an EC2 region is a DC and an EC2 availability_zone
 *  is a rack. This information is available in the config for the node.

Recent discussion on DC aware CL levels 
http://www.mail-archive.com/user@cassandra.apache.org/msg11414.html

Hope that helps.
Aaron
 

On 27 Apr 2011, at 01:18, William Oberman wrote:

 Thanks Aaron!
 
 Unless no one on this list uses EC2, there were a few minor troubles end of 
 last week through the weekend which taught me a lot about obscure failure 
 modes in various applications I use :-)  My original post was trying to be 
 more redundant than fast, which has been my overall goal from even before 
 moving to Cassandra (my downtime from the EC2 madness was minimal, and due to 
 only having one single point of failure == the amazon load balancer).  My 
 secondary goal was trying to make moving to a second region easier, but if 
 that is causing problems I can drop the idea.
 
 I might be downplaying the cost of inter-AZ communication, but I've lived 
 with that for quite some time, for example my current setup of MySQL in 
 Master-Master replication is split over zones, and my webservers live in yet 
 different zones.  Maybe Cassandra is chattier than I'm used to?  (again, 
 I'm fairly new to cassandra)
 
 Based on that article, the discussion, and the recent EC2 issues, it sounds 
 like it would be better to start with:
 -6 nodes split in two AZs 3/3
 -Configure replication to do 2 in one AZ and one in the other 
 (NetworkTopology treats AZs as racks, so does RF=3,us-east=3 make this happen 
 naturally?)
 -What does LOCAL_QUORUM do in this case?  Is there a rack quorum?  Or do 
 the natural latencies of AZs make LOCAL_QUORUM behave like a rack quorum?
 
 will
 
 On Tue, Apr 26, 2011 at 1:14 AM, aaron morton aa...@thelastpickle.com wrote:
 For background see this article:
 http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
 
 And this recent discussion 
 http://www.mail-archive.com/user@cassandra.apache.org/msg12502.html
 
 Issues that may be a concern:
 - lots of cross-AZ latency in us-east, e.g. LOCAL_QUORUM ops must wait
 cross-AZ.  Also consider it during maintenance tasks: how much of a pain is
 it going to be to have latency between every node.
 - IMHO not having sufficient (by that I mean 3) replicas in a cassandra DC to 
 handle a single node failure when working at Quorum reduces the utility of 
 the DC. e.g. with a local RF of 2 in the west, the quorum is 2, and if you 
 lose one node from the replica set you will not be able to use local QUORUM 
 for keys in that range. Or consider a failure mode where the west is 
 disconnected from the east.
 
 Could you start simple with 3 replicas in one AZ in us-east and 3 replicas
 in an AZ in another region?  Then work through some failure scenarios.
 
 Hope that helps. 
 Aaron
   
 
 On 22 Apr 2011, at 03:28, William Oberman wrote:
 
 Hi,
 
 My service is not yet ready to be fully multi-DC, due to how some of my 
 legacy MySQL stuff works.  But, I wanted to get cassandra going ASAP and 
 work towards multi-DC.  I have two main cassandra use cases: one where I can 
 handle eventual consistency (and all of the writes/reads are currently ONE), 
 and one where I can't (writes/reads are currently QUORUM).  My test cluster 
 is currently 4 smalls, all in us-east with RF=3 (more to prove I can do
 clustering than to have an exact production replica).  All of my unit
 tests and load tests (again, not to prove true max load, but more to
 tease out concurrency issues) are passing now.
 
 For production, I was thinking of doing:
 -4 cassandra larges in us-east (where I am now), one in each AZ
 -1 cassandra large in us-west (where I have nothing)
 For now, my data can fit into a single large's 2 disk ephemeral using RAID0, 
 and I was then thinking of doing a RF=3 with us-east=2 and us-west=1.  If I 
 do eventual consistency at ONE, and consistency at LOCAL_QUORUM, I was 
 hoping:
 -eventual consistency ops would be really fast
 -consistent ops would be pretty fast (what does LOCAL_QUORUM do in this 
 case?  return after 1 or 2 us-east nodes ack?)
 -us-west would contain a complete copy of my data, so it's a good,
 eventually consistent, close-to-real-time backup (assuming it can keep up
 over long periods of time, but I think it should)
 -eventually, when I'm ready to roll 

Re: advice for EC2 deployment

2011-04-26 Thread William Oberman
I see what you're saying.  I was able to control write latency on mysql
using insert vs. insert delayed (what I feel is MySQL's poor man's eventual
consistency option) + the fact that replication was a background
asynchronous process.  In terms of read latency, I was able to do up to a
few hundred well indexed mysql queries (across AZs) on a view while keeping
the overall latency of the page around or less than a second.

I basically am replacing two use cases, the cases with difficult to scale
anticipated write volumes.  The first case was previously using insert
delayed (which I'm doing in cassandra as ONE) as I wasn't getting consistent
write/read operations before anyways.  The second case was using traditional
insert (which I was going to replace with some QUORUM-like level, I was
assuming LOCAL_QUORUM).  But, the latter case uses a write through memory
cache (memcache), so I don't know how often it really reads data from the
persistent store.  But I definitely need to make sure it is consistent.

In any case, it sounds like I'd be best served treating AZs as DCs, but then
I don't know what to make the racks?  Or do racks not matter in a single AZ?
That way I can get an ack from a LOCAL_QUORUM read/write before the
(slightly) slower read/write to/from the other AZ (for redundancy).  Then
I'm only screwed if Amazon has a multi-AZ failure (so far, they've kept it
to only one!) :-)

will

On Tue, Apr 26, 2011 at 5:01 PM, aaron morton aa...@thelastpickle.comwrote:

 One difference between Cassandra and MySQL replication may be when the
 network IO happens. Was the MySQL replication synchronous on transaction
 commit ?  I was only aware that it had async replication, which means the
 client is not exposed to the network latency. In cassandra the network
 latency is exposed to the client as it needs to wait for the CL number of
 nodes to respond.

  If you use the PropertyFileSnitch with the NetworkTopologyStrategy you can
  manually assign machines to racks / dc's based on IP.
  See the conf/cassandra-topology.properties file. There is also an Ec2Snitch
  which (from the code):
 /**
  * A snitch that assumes an EC2 region is a DC and an EC2 availability_zone
  *  is a rack. This information is available in the config for the node.

  Recent discussion on DC aware CL levels:
  http://www.mail-archive.com/user@cassandra.apache.org/msg11414.html

  Hope that helps.
  Aaron


 On 27 Apr 2011, at 01:18, William Oberman wrote:

 Thanks Aaron!

 Unless no one on this list uses EC2, there were a few minor troubles end of
 last week through the weekend which taught me a lot about obscure failure
 modes in various applications I use :-)  My original post was trying to be
  more redundant than fast, which has been my overall goal from even before
 moving to Cassandra (my downtime from the EC2 madness was minimal, and due
 to only having one single point of failure == the amazon load balancer).  My
  secondary goal was trying to make moving to a second region easier, but if
  that is causing problems I can drop the idea.

 I might be downplaying the cost of inter-AZ communication, but I've lived
 with that for quite some time, for example my current setup of MySQL in
 Master-Master replication is split over zones, and my webservers live in yet
 different zones.  Maybe Cassandra is chattier than I'm used to?  (again,
 I'm fairly new to cassandra)

 Based on that article, the discussion, and the recent EC2 issues, it sounds
 like it would be better to start with:
 -6 nodes split in two AZs 3/3
 -Configure replication to do 2 in one AZ and one in the other
 (NetworkTopology treats AZs as racks, so does RF=3,us-east=3 make this
 happen naturally?)
  -What does LOCAL_QUORUM do in this case?  Is there a rack quorum?  Or do
  the natural latencies of AZs make LOCAL_QUORUM behave like a rack quorum?

 will

 On Tue, Apr 26, 2011 at 1:14 AM, aaron morton aa...@thelastpickle.comwrote:

 For background see this article:

  http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers

  And this recent discussion:
  http://www.mail-archive.com/user@cassandra.apache.org/msg12502.html

  Issues that may be a concern:
  - lots of cross-AZ latency in us-east, e.g. LOCAL_QUORUM ops must wait
  cross-AZ.  Also consider it during maintenance tasks: how much of a pain is
  it going to be to have latency between every node.
 - IMHO not having sufficient (by that I mean 3) replicas in a cassandra DC
 to handle a single node failure when working at Quorum reduces the utility
 of the DC. e.g. with a local RF of 2 in the west, the quorum is 2, and if
 you lose one node from the replica set you will not be able to use local
 QUORUM for keys in that range. Or consider a failure mode where the west is
 disconnected from the east.

 Could you