Re: [Openstack-operators] [publicClouds-wg] Public Cloud Working Group

2016-11-15 Thread James Dempsey
Sign me up.

-James

On 15/11/16 23:43, matt Jarvis wrote:
> So after input from a variety of sources over the last few weeks, I'd
> very much like to try and put together a public cloud working group,
> with the very high level goal of representing the interests of the
> public cloud provider community around the globe. 
> 
> I'd like to propose that, in line with the new process for the creation
> of working groups, we set up some initial IRC meetings for all
> interested parties. The goals for these initial meetings would be:
> 
> 1. Define the overall scope and mission statement for the working group
> 2. Set out the constituency for the group - communication methods,
> definitions of public clouds, meeting schedules, chairs etc.
> 3. Identify areas of interest - eg. technical issues, collaboration
> opportunities
> 
> Before I go ahead and schedule the first meetings, I'd very much like
> to gather input from the community on interested parties, and on
> whether this seems like a reasonable first step. My initial thought was
> to schedule a first meeting on #openstack-operators and then decide on
> the best timings, locations and communication methods from there, but
> again I'd welcome input. 
> 
> At this point it seems to me that one of the key metrics of success for
> this working group is the widest possible participation from the public
> cloud provider community, currently approximately 21 companies globally
> according to https://www.openstack.org/marketplace/public-clouds/. If
> we could get representation from all of those companies in any
> potential working group, that would clearly be the best outcome,
> although it may be optimistic!  As has become clear at recent Ops
> events, not all of those companies may be represented on these lists,
> so I'd welcome any input on the best way to reach out to those folks. 
> 
> Matt
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 



Re: [Openstack-operators] Potential updates to networking guide deployment scenarios

2015-12-08 Thread James Dempsey
Hi Matt,

Commentary in-line.

On 05/12/15 14:03, Matt Kassawara wrote:
> The networking guide [1] contains deployment scenarios [2] that describe
> the operation of several common OpenStack Networking (neutron)
> architectures including functional configuration examples.
> 
> Currently, the legacy and L3HA scenarios [3][4][5][6] only support
> attaching VMs to private/internal/project networks (managed by projects)
> with a combination of routers and floating IPs that provide connectivity to
> external networks such as the Internet. However, L3 support regardless of
> architecture adds complexity and can introduce redundancy/performance
> concerns.
> 
> On the other hand, the provider networks scenarios [7][8] only support
> attaching VMs to public/external/provider networks (managed by
> administrators) and exclude components such as private networks, routers,
> and floating IPs.
> 
> Turns out... you can do both. In fact, the installation guide for Liberty
> [9] supports attaching VMs to both public and private networks. No choosing
> between the simplicity of provider networks and the "self-service" nature
> of true cloud networking in your deployment.
> 
> So, I propose that we update the legacy and L3HA scenarios in the
> networking guide to support attaching VMs to both public and private
> networks using one of the following options:
> 
> 1) Add support for attaching VMs to public networks to the existing
> scenarios.
> 2) Create additional scenarios that support attaching VMs to both public
> and private networks.
> 3) Restructure the existing scenarios by starting out with simple provider
> networks architectures for both Open vSwitch and Linux bridge and
> optionally adding L3 support to them. The installation guide for Liberty
> uses this approach.
> 
> Option 1 somewhat increases complexity of scenarios that our audience may
> already find difficult to comprehend. Option 2 proliferates the scenarios
> and makes it more difficult for our audience to choose the best one for a
> particular deployment. In addition, it can lead to duplication of content
> that becomes difficult to keep consistent. Option 3 requires a more complex
> documentation structure that our audience may find difficult to follow. As
> the audience, I would like your input on the usefulness of these potential
> updates and which option works best... or add another option.
> 


I'm not crazy about option 1 because I think it could over-complicate
the simpler scenarios.

With respect to option 2, would you be doubling the number of documented
scenarios?

It sounds like the provider network and "Legacy"/L3HA scenarios are
orthogonal enough that they could be separate from each other.  I don't
think it is too much to ask of operators to read a couple of sections
and compose them, provided the requirements and prerequisites are clear.


While not specifically pertaining to the re-structure, I will make a
couple of comments about the deploy/scenario sections, if they are being
updated...

a. I think these sections are bound to be confusing regardless of how
they are structured or re-structured.  Perhaps there should be a
high-level comparison of the different scenarios to help operators
decide which scenario best fits their use case.  Maybe even a table
comparing them?

b. Does 'Legacy' just mean 'No HA/DVR Routing?'  I think that within the
context of OpenStack Networking, it is risky to call anything aside from
Nova Network 'Legacy.'  It seems like a 'single L3 agent' scenario is a
perfectly valid use case... It reduces complexity and cost while still
letting users create whatever topology they want.  Let me know if I'm
reading this wrong.

Cheers,
James



> Thanks,
> Matt
> 
> [1] http://docs.openstack.org/networking-guide/
> [2] http://docs.openstack.org/networking-guide/deploy.html
> [3] http://docs.openstack.org/networking-guide/scenario_legacy_ovs.html
> [4] http://docs.openstack.org/networking-guide/scenario_legacy_lb.html
> [5] http://docs.openstack.org/networking-guide/scenario_l3ha_ovs.html
> [6] http://docs.openstack.org/networking-guide/scenario_l3ha_lb.html
> [7] http://docs.openstack.org/networking-guide/scenario_provider_ovs.html
> [8] http://docs.openstack.org/networking-guide/scenario_provider_lb.html
> [9] http://docs.openstack.org/liberty/install-guide-ubuntu/
> 
> 
> 
> 


-- 
James Dempsey
Senior Cloud Engineer
Catalyst IT Limited
+64 4 803 2264
--



Re: [Openstack-operators] Galera setup testing

2015-12-07 Thread James Dempsey
On 08/12/15 04:04, Matt Fischer wrote:
> On Mon, Dec 7, 2015 at 3:54 AM, Ajaya Agrawal  wrote:
> 
>> Hi everyone,
>>
>> We are deploying Openstack and planning to run multi-master Galera setup
>> in production. My team is responsible for running a highly available
>> Keystone. I have two questions when it comes to Galera with Keystone.
>>
>> 1. How do you test if a Galera cluster is setup properly?
>> 2. Is there any Galera test specific to Keystone which you have found
>> useful?
>>
>>
> For 1, the clustercheck script that ships with puppet-galera (forked
> from https://github.com/olafz/percona-clustercheck) is a really simple
> check that Galera is up and the cluster is synced. Its main goal,
> however, is to provide status to haproxy.
> 
> One thing you want to check is the turnaround time on operations, for
> example, creating a user on one node and then immediately using it on
> another node. We found that this sometimes (though rarely) fails.
> The solution is two-fold: first, don't store tokens in MySQL; second,
> designate one node as the primary in haproxy.
> 

+1 for designating one node as primary.  This helped us reduce some
deadlocks that we were seeing when balancing sessions between DB hosts.

> Other than that, we've gotten good at reading the wsrep_* cluster
> status info, but to be honest, once we removed tokens from the DB,
> we've been in way better shape.





Re: [Openstack-operators] Two regions and so two metadata servers sharing the same VLAN

2015-11-26 Thread James Dempsey
On 27/11/15 03:49, gilles.mocel...@nuagelibre.org wrote:
> Hello stackers !
> 
> Sorry, I also cross-posted that question here
> https://ask.openstack.org/en/question/85195/two-regions-and-so-two-metadata-servers-sharing-the-same-vlan/
> 
> 
> But I think I can reach a wider audience here.
> 
> So here's my problem.
> 
> I'm facing a non-conventional situation. We're building a two-region
> cloud to separate a VMware backend and a KVM one. But both regions share
> the same 2 VLANs where we connect all our instances.
> 
> We don't use routers, private networks, floating IPs... I've enabled
> enable_isolated_metadata, so the metadata IP is inside the DHCP
> namespace and the created instances get a static route to it via the
> DHCP port's IP. The two DHCP servers could have been a problem, but we
> will use separate IP ranges, and as Neutron sets static leases keyed on
> the instances' MAC addresses, they should not interfere.
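For reference, the isolated-metadata behaviour described above is controlled by a single DHCP agent option (the option name is standard; the file path varies by distribution):

```ini
# /etc/neutron/dhcp_agent.ini (path may vary by distribution)
[DEFAULT]
# Serve 169.254.169.254 from within each DHCP namespace, and push a
# static route to it via the subnet's DHCP port address.
enable_isolated_metadata = True
```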
> 
> The question I've been asked is whether we will have network problems
> with the metadata server IP 169.254.169.254, which will exist in two
> namespaces on two Neutron nodes on the same VLAN. The two namespaces
> will send ARP replies with different MACs, which may disrupt access to
> the metadata URL from the instances.
> 

I think you will see periodic interruptions in service.  ARP tables will
have entries for the metadata service IP that flap back and forth as the
MAC is expired/re-learned.  As is often the case with duplicate
addressing, it will work sometimes and be unhappy sometimes.  This might
not be a huge problem if cloud-init retries enough during boot, but keep
in mind that other pieces of software also poll the metadata service
(puppet/facter, for example).

I think you understand the core issue: you have two instances of Neutron
working in the same L2 broadcast domain... I wouldn't want to support a
configuration like this in production.

> Tcpdump shows nothing wrong, but I can't really test now because we
> haven't got yet the two regions. What do you think ?
> 
> Of course, the question is not about why we chose to have two regions.
> I would have preferred host aggregates to separate VMware and KVM, but
> then Cinder and Glance would have had to be configured the same way,
> and with VMware that's not really feasible.
> 
> Also, if we can, we will try to have separate networks for each region,
> but it involves a lot of bureaucracy here...
> 


-- 
James Dempsey
Senior Cloud Engineer
Catalyst IT Limited
--



[Openstack-operators] Neutron Network Node Sizing

2015-11-24 Thread James Dempsey
Hi Operators,

I'd love to hear from anyone running multiple physical Neutron L3 Agent
nodes...

Specifically: How many routers do you schedule per Neutron L3 Agent and
how much CPU/Memory do you put into an L3 Agent physical host?

Because of a couple of bugs in Linux[1][2], we have a pretty miserable
time when we cram more than 100-ish routers onto an L3 agent (with
VPNaaS enabled).  Previously, we ran L3 agents on servers that also
hosted control plane VMs (API, MQ, etc.), but now we are buying separate
hardware for L3 agent nodes.  I would love to hear about router counts /
hardware specs from other operators.

This is a Juno ML2+OVS cloud at the moment, but it will be Liberty as
soon as possible.

[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1486670
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1403152

Cheers,
James

-- 
James Dempsey
Senior Cloud Engineer
Catalyst IT Limited
--




Re: [Openstack-operators] [openstack-dev] PLEASE READ: VPNaaS API Change - not backward compatible

2015-08-26 Thread James Dempsey
On 26/08/15 23:43, Paul Michali wrote:
 James,
 
 Great stuff! Please see @PCM in-line...
 
 On Tue, Aug 25, 2015 at 6:26 PM James Dempsey jam...@catalyst.net.nz

SNIP

 1) Horizon compatibility

 We run a newer version of horizon than we do neutron.  If Horizon
 version X doesn't work with Neutron version X-1, this is a very big
 problem for us.

 
 @PCM Interesting. I always thought that Horizon updates lagged Neutron
 changes, and this wouldn't be a concern.
 

@JPD
Our installed Neutron typically lags Horizon by zero or one release.  My
concern is how Horizon version X will cope with a point-in-time API
change.  Worded slightly differently: we rarely update Horizon and
Neutron at the same time, so there would need to be a version (or
versions) of Horizon that could detect a Neutron upgrade and start using
the new API.  (I'm fine if there is a Horizon config option to select
old/new VPN API usage.)

 
 

 2) Service interruption

 How much of a service interruption would the 'migration path' cause?
 
 
 @PCM The expectation of the proposal is that the migration would occur as
 part of the normal OpenStack upgrade process (new services installed,
 current services stopped, database migration occurs, new services are
 started).
 
 It would have the same impact as what would happen today, if you update
 from one release to another. I'm sure you folks have a much better handle
 on that impact and how to handle it (maintenance windows, scheduled
 updates, etc).
 

@JPD This seems fine.

 
 We
 all know that IPsec VPNs can be fragile...  How much of a guarantee will
 we have that migration doesn't break a bunch of VPNs all at the same
 time because of some slight difference in the way configurations are
 generated?

 
 @PCM I see the risk as extremely low. With the migration, the end result is
 really just moving/copying fields from one table to another. The underlying
 configuration done to *Swan would be the same.
 
 For example, the subnet ID, which is specified in the VPN service API and
 stored in the vpnservices table, would be stored in a new vpn_endpoints
 table, and the ipsec_site_connections table would reference that entry
 (rather than looking up the subnet in the vpnservices table).
 

@JPD This makes me feel more comfortable; thanks for explaining.

 
 
 3) Heat compatibility

 We don't always run the same version of Heat and Neutron.

 
 @PCM I must admit, I've never used Heat, and am woefully ignorant about it.
 Can you elaborate on Heat concerns as may be related to VPN API differences?
 
 Is Heat being used to setup VPN connections, as part of orchestration?
 

@JPD
My concerns are two-fold:

1) Because Heat makes use of the VPNaaS API, it seems like the same
situation exists as with Horizon.  Some version or versions of Heat will
need to be able to make use of both old and new VPNaaS APIs in order to
cope with a Neutron upgrade.

2) Because we use Heat resource types like
OS::Neutron::IPsecSiteConnection [1], we may lose the ability to
orchestrate VPNs if endpoint groups are not added to Heat at the same time.


Number 1 seems like a real problem that needs a fix.  Number 2 is a fact
of life that I am not excited about, but am prepared to deal with.

Yes, Heat is being used to build VPNs, but I am prepared to make the
decision on behalf of my users... VPN creation via Heat is probably less
important than the new VPNaaS features, but it would be really great if
we could work on the updated heat resource types in parallel.

[1]
http://docs.openstack.org/developer/heat/template_guide/openstack.html#OS::Neutron::IPsecSiteConnection

 
 


 Is there pain for the customers beyond learning about the new API
 changes
 and capabilities (something that would apply whether there is backward
 compatibility or not)?


 See points 1,2, and 3 above.

 

 Another implication of not having backwards compatibility would be that
 end-users would need to immediately switch to using the new API, once the
 migration occurs, versus doing so on their own time frame.  Would this
 be a
 concern for you (customers not having the convenience of delaying their
 switch to the new API)?


 I was thinking that backward incompatible changes would adversely affect
 people who were using client scripts/apps to configure (a large number
 of)
 IPsec connections, where they'd have to have client scripts/apps
 in-place
 to support the new API.


 This is actually less of a concern.  We have found that VPN creation is
 mostly done manually and anyone who is clever enough to make IPsec go is
 clever enough to learn a new API/horizon interface.

 
 @PCM Do you see much reliance on tooling to setup VPN (such that having to
 update the tooling would be a concern for end-users), or is this something
 that could be managed through process/preparation?
 

@JPD  I see very little reliance on tooling to setup VPNs.  We could
manage this through preparation.

 
 


 Which is more of a logistics issue, and could be managed, IMHO.




 Would

Re: [Openstack-operators] [openstack-dev] PLEASE READ: VPNaaS API Change - not backward compatible

2015-08-25 Thread James Dempsey
Oops, sorry about the blank email.  Answers/Questions in-line.

On 26/08/15 07:46, Paul Michali wrote:
 Previous post only went to dev list. Ensuring both and adding a bit more...
 
 
 
 On Tue, Aug 25, 2015 at 8:37 AM Paul Michali p...@michali.net wrote:
 
 Xav,

 The discussion is very important, and hence why both Kyle and I have been
 posting these questions on the operator (and dev) lists. Unfortunately, I
 wasn't subscribed to the operator's list and missed some responses to
 Kyle's message, which were posted only to that list.

 As a result, I had an incomplete picture and posted this thread to see if
 it was OK to do this without backward compatibility, based on the
 (incorrect) assumption that there was no production use. That is corrected
 now, and I'm getting all the messages and thanks to everyone, have input on
 messages I missed.

 So given that, let's try a reset on the discussion, so that I can better
 understand the issues...


Great!  Thanks very much for expanding the scope.  We really appreciate it.

 Do you feel that not having backward compatibility (but having a migration
 path) would seriously affect you or would it be manageable?

Currently, this feels like it would seriously affect us.  I don't feel
confident that the following concerns won't cause us big problems.


As Xav mentioned previously, we have a few major concerns:

1) Horizon compatibility

We run a newer version of horizon than we do neutron.  If Horizon
version X doesn't work with Neutron version X-1, this is a very big
problem for us.

2) Service interruption

How much of a service interruption would the 'migration path' cause? We
all know that IPsec VPNs can be fragile...  How much of a guarantee will
we have that migration doesn't break a bunch of VPNs all at the same
time because of some slight difference in the way configurations are
generated?

3) Heat compatibility

We don't always run the same version of Heat and Neutron.


 Is there pain for the customers beyond learning about the new API changes
 and capabilities (something that would apply whether there is backward
 compatibility or not)?


See points 1,2, and 3 above.

 
 Another implication of not having backwards compatibility would be that
 end-users would need to immediately switch to using the new API, once the
 migration occurs, versus doing so on their own time frame.  Would this be a
 concern for you (customers not having the convenience of delaying their
 switch to the new API)?
 
 
 I was thinking that backward incompatible changes would adversely affect
 people who were using client scripts/apps to configure (a large number of)
 IPsec connections, where they'd have to have client scripts/apps in-place
 to support the new API.


This is actually less of a concern.  We have found that VPN creation is
mostly done manually and anyone who is clever enough to make IPsec go is
clever enough to learn a new API/horizon interface.

 
 Which is more of a logistics issue, and could be managed, IMHO.
 
 
 

 Would there be customers that would fall into that category, or are
 customers manually configuring IPSec connections in that they could just
 use the new API?


Most customers could easily adapt to a new API.


 Are there other adverse effects of not having backward compatibility that
 need to be considered?


As with the dashboard, Heat also needs a bit of consideration.  How
would Heat deal with the API changes?

 
 So far, I'm identifying one effect that is more of a convenience (although
 nice one at that), and one effect that can be avoided by planning for the
 upgrade.  I'd like to know if I'm missing something more important to
 operators.
 
 I'd also like to know if we think there is a user base large enough (and
 how large is large?) that it would warrant going through the complexity
 and risk to support both API versions simultaneously?

This is a bit frustrating...  It implies that only large clouds matter.
There is a further tacit implication that the API is not really a
contract that can be relied upon.

We are operating a multi-region cloud with many clients who depend upon
VPNaaS for business-critical production workloads.

Of course, this is a two-way road!  We all want what is best for
OpenStack, so we should talk about the complexity and risk on your end.
Can you tell us more about that?  I really have no interest in being an
operator who demands the world from developers, but I am worried about
what all this means for my cloud.

Cheers,
James Dempsey

-- 
James Dempsey
Senior Cloud Engineer
Catalyst IT Limited
+64 4 803 2264
--


 
 Regards,
 
 Paul Michali (pc_m)
 
 
 Specifically, we're talking about the VPN service create API no longer
 taking a subnet ID (instead, an endpoint group is created that contains
 the subnet ID), and the IPsec site-to-site connection create API would
 no longer take a list of peer CIDRs, but instead would take a pair of
 endpoint group IDs (one for the local subnet(s) formerly specified by
 the service API
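To make the shape of that change concrete, here are illustrative request bodies.  The endpoint-group field names are my assumption of how the proposal would look, not an authoritative schema:

```python
# Old style: the subnet lives on the VPN service, the peer CIDRs on the
# connection. IDs in caps are placeholders.
old_service = {"vpnservice": {"router_id": "ROUTER_ID",
                              "subnet_id": "LOCAL_SUBNET_ID"}}
old_connection = {"ipsec_site_connection": {
    "vpnservice_id": "SERVICE_ID",
    "peer_cidrs": ["203.0.113.0/24"]}}

# New style: endpoint groups carry the endpoints instead, and the
# connection references a local/peer pair of group IDs.
local_group = {"endpoint_group": {"type": "subnet",
                                  "endpoints": ["LOCAL_SUBNET_ID"]}}
peer_group = {"endpoint_group": {"type": "cidr",
                                 "endpoints": ["203.0.113.0/24"]}}
new_service = {"vpnservice": {"router_id": "ROUTER_ID"}}  # no subnet_id
new_connection = {"ipsec_site_connection": {
    "vpnservice_id": "SERVICE_ID",
    "local_ep_group_id": "LOCAL_GROUP_ID",
    "peer_ep_group_id": "PEER_GROUP_ID"}}
```

This also shows why the change is not backward compatible: a client sending the old `subnet_id`/`peer_cidrs` fields has nothing to fall back on.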

Re: [Openstack-operators] [neutron] Any users of Neutron's VPN advanced service?

2015-08-11 Thread James Dempsey
Hi Kyle,

We deployed VPNaaS (OpenSwan driver) in the Catalyst Cloud just over a
year ago when it was running Havana.  We are in the middle of Icehouse
to Juno upgrades and consider this a must-have feature (we also look
forward to the RFE to enable VPN + HA routers).  Aside from typical
site-to-site tunnel-mode IPsec use cases, we also use it to deliver
multi-region anycast services directly into our corporate WAN.

Cheers,
James

On 06/08/15 10:21, Tamanna Z Sait wrote:
 Hi Kyle
 
 We have been actively using the Neutron VPNaaS code from the Icehouse,
 Juno, and Kilo releases, and we plan to upstream bug fixes as well as
 enhancements in Neutron's VPNaaS area moving forward.
 We have been using the feature for over a year now and plan to continue
 to use and deploy it.
 
 
 
 Kyle Mestery mestery at mestery.com 
 Wed Aug 5 19:56:01 UTC 2015 
 
 Operators:
 
 We (myself, Paul and Doug) are looking to better understand who might be
 using Neutron's VPNaaS code. We're looking for what version you're using,
 how long you're using it, and if you plan to continue deploying it with
 future upgrades. Any information operators can provide here would be
 fantastic!
 
 Thank you!
 Kyle


-- 
James Dempsey
Senior Cloud Engineer
Catalyst IT Limited
+64 4 803 2264
--



Re: [Openstack-operators] Fostering OpenStack Users

2014-12-30 Thread James Dempsey
On 31/12/14 04:40, matt wrote:
 There are several fundamental problems in getting your feet wet with
 OpenStack.
 
 First is that OpenStack expects to be installed into a rack of systems,
 not one system.  While there are workarounds such as DevStack, they fail
 to accurately reproduce a production environment or even a useful
 facsimile of one.  One of the larger problems for many adopters of
 OpenStack, at least initially, is coming to grips with Neutron and
 getting it plugged into their existing network environment.  DevStack
 doesn't help you much here at all.

Sorry, I might not have been very precise with my words.  I was talking
about helping people become effective at using OpenStack Clouds, not
building them.  i.e. How do I make sure that people who use my OpenStack
Cloud are able/excited to build great applications there?

I see this as tremendously important in terms of driving cloud growth.
If my users aren't abuzz with OpenStack enthusiasm, how will I get more
money for bigger and better toys?

Cheers,
James

-- 
James Dempsey
Senior Cloud Engineer
Catalyst IT Limited
https://catalyst.net.nz/cloud
+64 4 803 2264
--



[Openstack-operators] Fostering OpenStack Users

2014-12-29 Thread James Dempsey
Hey Operators,

How do you help new OpenStack users learn to build awesome stuff in your
OpenStack clouds?

I come from a more traditional Systems/Operations background, so when it
comes time to on-board someone who intends to build an application in
the cloud, I'm at a bit of a loss once I have sorted out their API
access.  Sure, I'll give them some cloud-init examples and tell them to
go read about Heat, but infrastructure is my area of expertise, not
cloud applications.
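For what it's worth, the cloud-init examples mentioned above don't need to be elaborate; a tiny cloud-config like this (package name and command are illustrative) is often enough for a first demo:

```yaml
#cloud-config
# Minimal user-data: install a web server and drop a page so a new user
# sees something working on first boot.
packages:
  - nginx
runcmd:
  - echo "Hello from $(hostname)" > /var/www/html/index.html
```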

So, back to my original question:

What information do you give new users to help them be effective in the
cloud?  What is your go-to demo for people who don't quite understand
what OpenStack is offering?  How do you reach out to people in your
organizations who aren't OpenStack users yet, but probably should be?

Cheers,
James

-- 
James Dempsey
Senior Cloud Engineer
Catalyst IT Limited
https://catalyst.net.nz/cloud
+64 4 803 2264
--
