Re: [Openstack-operators] expanding to 2nd location

2015-05-06 Thread Joseph Bajin
Just to add in my $0.02, we run in multiple sites as well.  We are using
regions to do this.  Cells have a lot going for them at this point, but we
thought they weren't there yet.  We also don't have the necessary resources to
make our own changes to them like a few other places do.

With that, we decided the one thing we really had to do was make sure items
such as Tenant and User IDs remain the same across sites. That allows us to do
show-back reporting and makes it easier for users when they want to deploy
from one region to another.  To meet that requirement, we used Galera in the
same manner that many of the others mentioned.  We then deployed Keystone
pointing to that Galera DB.  That is the only DB that is replicated across
sites.  Everything else, such as Nova and Neutron, stays within its own
location.
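
As an illustration of why identical tenant IDs help with show-back, here is a minimal sketch that sums usage per tenant across regions. The region names, tenant IDs, and hours are made up for the example; how the per-region records are collected is not covered by the thread and is left out.

```python
from collections import defaultdict

# Hypothetical per-region usage records: (region, tenant_id, instance_hours).
# None of these names or numbers come from the thread.
USAGE = [
    ("region-one", "tenant-a", 120.0),
    ("region-two", "tenant-a", 30.0),
    ("region-one", "tenant-b", 10.0),
]


def showback(records):
    """Sum instance hours per tenant ID across all regions."""
    totals = defaultdict(float)
    for _region, tenant_id, hours in records:
        # Works only because the same tenant has the same ID in every region.
        totals[tenant_id] += hours
    return dict(totals)


if __name__ == "__main__":
    for tenant_id, hours in sorted(showback(USAGE).items()):
        print(f"{tenant_id}: {hours:.1f} instance-hours")
```

If the IDs differed per site, the report would need a separate mapping table to join the records.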

The only really confusing piece for our users is the dashboard.  When you
first go to the dashboard, there is a dropdown to select a region.  Many
users think it will send them to a particular location and show their
information from that location.  It actually selects which region you
authenticate against.  Once you are in the dashboard, you can select which
Project you want to see.  That has been a major point of confusion. I think
our solution is to just rename that text.
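
For context, the login dropdown entries in Horizon are typically driven by the AVAILABLE_REGIONS setting in local_settings.py, where each entry pairs a Keystone endpoint with a display label. The fragment below is only a sketch of more descriptive labels; the hostnames and label wording are hypothetical, and it may not be the exact text the author had in mind renaming.

```python
# Fragment in the style of Horizon's local_settings.py.
# Each entry is (Keystone endpoint URL, label shown in the login dropdown).
# Hostnames and labels are hypothetical placeholders.
AVAILABLE_REGIONS = [
    ("https://keystone.site-a.example.com:5000/v2.0", "Authenticate via Site A"),
    ("https://keystone.site-b.example.com:5000/v2.0", "Authenticate via Site B"),
]
```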





On Tue, May 5, 2015 at 11:46 AM, Clayton O'Neill clay...@oneill.net wrote:

 On Tue, May 5, 2015 at 11:33 AM, Curtis serverasc...@gmail.com wrote:

 Do people have any comments or strategies on dealing with Galera
 replication across the WAN using regions? Seems like something to try
 to avoid if possible, though might not be possible. Any thoughts on
 that?


 We're doing this with good luck.  Few things I'd recommend being aware of:

 Set galera_group_segment so that each site is in a separate segment.  This
 makes Galera smarter about how it does replication and state transfer.
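
A minimal sketch of per-site segment assignment follows. It assumes that galera_group_segment is a configuration-management parameter that ends up as the Galera provider option gmcast.segment inside wsrep_provider_options; the site names and the helper function are hypothetical.

```python
# Hypothetical site-to-segment mapping; the numbers only need to differ per site.
SITE_SEGMENTS = {"dc1": 0, "dc2": 1, "dc3": 2}


def wsrep_provider_options(site, extra=None):
    """Build a wsrep_provider_options value placing `site` in its own segment."""
    options = {"gmcast.segment": SITE_SEGMENTS[site]}
    # Any other provider options are passed through unchanged.
    options.update(extra or {})
    return "; ".join(f"{key}={value}" for key, value in options.items())


if __name__ == "__main__":
    for site in SITE_SEGMENTS:
        print(site, "->", wsrep_provider_options(site))
```

All nodes in one datacenter would share the same segment number, so cross-site traffic can be relayed rather than sent to every remote node individually.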

 Make sure you look at the timers and tunables in Galera and make sure they
 make sense for your network.  We've got lots of BW and lowish latency
 (37ms), so the defaults have worked pretty well for us.

 Make sure that when you do provisioning in one site, you don't have CM
 tools in the other site breaking things.  We ran into issues like this
 during our first deploy, where Puppet was making a change to a user in one
 site, and Puppet in the other site reverted the change nearly immediately.
 You may have to tweak your deployment process to deal with that sort of
 thing.

 Make sure you're running Galera or Galera Arbitrator in enough sites to
 maintain quorum if you have issues.  We run 3 nodes in one DC, and 3 nodes
 in another DC for Horizon, Keystone and Designate.  We run a Galera
 arbitrator in a third DC to settle ties.

 Lastly, the obvious one is just to stay up to date on patches.  Galera is
 pretty stable, but we have run into bugs that we had to get fixes for.
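
To see why the third-site arbitrator matters, here is a minimal sketch of the majority arithmetic for the 3 + 3 + 1 layout described above. It assumes every node and the arbitrator carry one vote each (weighted votes are ignored), and the site names are placeholders.

```python
def has_quorum(surviving_votes, total_votes):
    """A partition keeps quorum only with a strict majority of the votes."""
    return surviving_votes * 2 > total_votes


def survives_site_loss(votes, lost_site):
    """Check whether the remaining sites still hold a majority."""
    total = sum(votes.values())
    return has_quorum(total - votes[lost_site], total)


# Layout from the message above: 3 nodes, 3 nodes, and 1 arbitrator.
WITH_ARBITRATOR = {"dc1": 3, "dc2": 3, "dc3": 1}
# The same two datacenters without the third-site arbitrator.
WITHOUT_ARBITRATOR = {"dc1": 3, "dc2": 3}

if __name__ == "__main__":
    print("with arbitrator, lose dc1:", survives_site_loss(WITH_ARBITRATOR, "dc1"))
    print("without arbitrator, lose dc1:", survives_site_loss(WITHOUT_ARBITRATOR, "dc1"))
```

With the arbitrator, losing either datacenter leaves 4 of 7 votes; without it, the survivor holds exactly 3 of 6, which is a tie rather than a majority.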

 On Mon, May 4, 2015 at 3:11 PM, Jesse Keating j...@bluebox.net wrote:
  I agree with Subbu. You'll want that to be a region so that the control
  plane is mostly contained. Only Keystone (and swift if you have that)
  would be doing lots of site to site communication to keep databases in
  sync.
 
  http://docs.openstack.org/arch-design/content/multi_site.html is a
  good read on the topic.
 
 
  - jlk
 
  On Mon, May 4, 2015 at 1:58 PM, Allamaraju, Subbu su...@subbu.org wrote:
 
  I suggest building a new AZ (“region” in OpenStack parlance) in the new
  location. In general I would avoid setting up control plane to operate
  across multiple facilities unless the cloud is very large.
 
   On May 4, 2015, at 1:40 PM, Jonathan Proulx j...@jonproulx.com wrote:
  
   Hi All,
  
   We're about to expand our OpenStack Cloud to a second datacenter.
   Anyone have opinions they'd like to share as to what I would and
   should be worrying about or how to structure this?  Should I be
   thinking cells or regions (or maybe both)?  Any obvious or not so
   obvious pitfalls I should try to avoid?
  
   Current scale is about 75 hypervisors.  Running Juno on Ubuntu 14.04
   using Ceph for volume storage, ephemeral block devices, and image
   storage (as well as object store).  Bulk data storage for most (but by
   no means all) of our workloads is at the current location (not that
   that matters I suppose).
  
   Second location is about 150km away and we'll have 10G (at least)
   between sites. The expansion will be approximately the same size as
   the existing cloud, maybe slightly larger, and given site capacities the
   new location is also more likely to be where any future growth goes.
  
   Thanks,
   -Jon
  
   

Re: [Openstack-operators] expanding to 2nd location

2015-05-06 Thread Mike Dorman
+1 to second site = second region.

I would not recommend using cells unless you have a real nova scalability 
problem.  There are a lot of caveats/gotchas.  Cells v2 I think should come as 
an experimental feature in Liberty, and past that point cells will be the 
default mode of operation.  It will probably be much easier to go from no cells 
to cells v2 than cells v1 to v2.

Mike



Re: [Openstack-operators] expanding to 2nd location

2015-05-05 Thread Jonathan Proulx
On Mon, May 4, 2015 at 9:42 PM, Tom Fifield t...@openstack.org wrote:

 Do you need users to be able to see it as one cloud, with a single API
 endpoint?

Define need :)

As many of you know, my cloud is a University system and researchers
are nothing if not lazy, in the best possible sense of course :)  So
having a single API and scheduler so users don't *need* to think about
placement, while using AZs so that they can (as Tim mentions a little
further down the thread), is very attractive.  Managing complexity is
also important since we have about 1 FTE equivalent (split between two or
three actual humans) to manage our cloud.

For partially technical, partially political reasons we will not have
the same IP networks available at the second location.  With a bit of
heavy lifting on my end I could probably change this, but if I did it
would mean all the L3 would need to be routed for one of the sites
(because $reasons, trust me).  So users would need to pick which network
to use, which would in fact be picking which site to launch in, and that
sounds like it would rather be a Region.

So Joe's model where Region2 slaves off Region1 for Keystone and
Glance is looking like the best fit. We could force users to balance
across regions by splitting their quota, using the non-unified quota
model to our advantage.
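
The quota-splitting idea can be sketched as simple arithmetic: since each region keeps its own quota, a project's overall allowance can be divided between them. The function name and the example numbers are hypothetical; applying the result to each region's quota API is not shown.

```python
def split_quota(total, regions):
    """Split an integer quota as evenly as possible across regions.

    Any remainder is handed out one unit at a time to the earliest regions.
    """
    base, remainder = divmod(total, len(regions))
    return {
        region: base + (1 if index < remainder else 0)
        for index, region in enumerate(regions)
    }


if __name__ == "__main__":
    # Hypothetical project allowed 21 instances in total.
    print(split_quota(21, ["Region1", "Region2"]))
```

A user who wants more than their per-region share then has to launch in both sites, which is the balancing effect described above.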

Though I still have a bit of reading to do apparently, since I'd
forgotten about the Architecture Design Guide Jesse pointed out:
http://docs.openstack.org/arch-design/content/multi_site.html

Thanks all,
-Jon

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] expanding to 2nd location

2015-05-05 Thread Matt Fischer
We do it with some of our databases (Horizon, Designate, and Keystone) and
we run an arbitrator process (garbd) in a 3rd DC. We have lots of low
latency bandwidth, which is something you have to be careful about. My
recommendation would be that you need to know your network well and have
good monitoring in place.
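
As one possible shape for that monitoring, the sketch below inspects a dictionary of Galera status values, such as the rows returned by SHOW GLOBAL STATUS LIKE 'wsrep_%'. The function name, the expected cluster size, and the flow-control threshold are assumptions for illustration; fetching the values from the database is left out.

```python
def galera_health_problems(status, expected_size, max_paused=0.1):
    """Return a list of human-readable problems found in wsrep status values.

    `status` maps wsrep status variable names to their (string) values.
    `max_paused` is an arbitrary example threshold, not a recommended value.
    """
    problems = []
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not part of the primary component")
    if int(status.get("wsrep_cluster_size", 0)) < expected_size:
        problems.append("cluster has fewer members than expected")
    if status.get("wsrep_local_state_comment") != "Synced":
        problems.append("node is not synced")
    if float(status.get("wsrep_flow_control_paused", 0.0)) > max_paused:
        problems.append("replication is paused by flow control too often")
    return problems


if __name__ == "__main__":
    sample = {
        "wsrep_cluster_status": "Primary",
        "wsrep_cluster_size": "6",
        "wsrep_local_state_comment": "Synced",
        "wsrep_flow_control_paused": "0.25",
    }
    # Expecting 7 members: 3 + 3 database nodes plus the arbitrator.
    for problem in galera_health_problems(sample, expected_size=7):
        print("WARNING:", problem)
```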

On Tue, May 5, 2015 at 9:33 AM, Curtis serverasc...@gmail.com wrote:

 Do people have any comments or strategies on dealing with Galera
 replication across the WAN using regions? Seems like something to try
 to avoid if possible, though might not be possible. Any thoughts on
 that?

 Thanks,
 Curtis.


Re: [Openstack-operators] expanding to 2nd location

2015-05-04 Thread Tim Bell

CERN runs two data centres, in Geneva (3.5MW) and Budapest (2.7MW), around
1,200 km apart. We have two 100Gb/s links between the two sites and latency of
around 22ms.
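
For a rough sense of how distance bounds latency, here is a back-of-the-envelope sketch. It assumes light travels through fibre at roughly 200,000 km/s (about two thirds of its speed in a vacuum) and that the path is a straight line; real routes are longer and add equipment delay, so measured figures such as the 22ms above will be higher than this floor.

```python
# Approximate speed of light in optical fibre, expressed in km per millisecond.
FIBER_KM_PER_MS = 200.0


def min_rtt_ms(distance_km):
    """Lower bound on round-trip time over a straight fibre path."""
    return 2 * distance_km / FIBER_KM_PER_MS


if __name__ == "__main__":
    # 150 km is the distance in the original post; 1,200 km is the CERN case.
    for distance in (150, 1200):
        print(f"{distance} km: at least {min_rtt_ms(distance):.1f} ms round trip")
```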

We run this as a single cloud with 13 cells. Each cell is only in one data
centre.

We wanted a single API endpoint from the user perspective and thus, we did
not use regions.

There are things to consider such as

- Availability zone set up so that people can choose which centre to place
work in (such as disaster recovery)
- Scheduling of work for projects and localisation of the volumes for
those VMs (we've not found a good solution for this one)

In an ideal world, we'd have a high availability API layer for the cells
across two sites. We've not got that far yet.

Tim

On 5/5/15, 3:42 AM, Tom Fifield t...@openstack.org wrote:



Do you need users to be able to see it as one cloud, with a single API
endpoint?


Re: [Openstack-operators] expanding to 2nd location

2015-05-04 Thread Allamaraju, Subbu
I suggest building a new AZ (“region” in OpenStack parlance) in the new 
location. In general I would avoid setting up control plane to operate across 
multiple facilities unless the cloud is very large.


[Openstack-operators] expanding to 2nd location

2015-05-04 Thread Jonathan Proulx
Hi All,

We're about to expand our OpenStack Cloud to a second datacenter.
Anyone have opinions they'd like to share as to what I would and
should be worrying about or how to structure this?  Should I be
thinking cells or regions (or maybe both)?  Any obvious or not so
obvious pitfalls I should try to avoid?

Current scale is about 75 hypervisors.  Running Juno on Ubuntu 14.04
using Ceph for volume storage, ephemeral block devices, and image
storage (as well as object store).  Bulk data storage for most (but by
no means all) of our workloads is at the current location (not that
that matters I suppose).

Second location is about 150km away and we'll have 10G (at least)
between sites. The expansion will be approximately the same size as
the existing cloud, maybe slightly larger, and given site capacities the
new location is also more likely to be where any future growth goes.

Thanks,
-Jon
