Re: [Openstack-operators] RabbitMQ and SSL

2018-11-12 Thread Sam Morrison
On the off chance that others see this, or there is talk about it in Berlin: I have 
tracked this down to the versions of python-amqp and python-kombu.
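
(If you want to check which versions you have installed, on Ubuntu cloud archive 
hosts something like the following should do it; package names assume the usual 
python-amqp/python-kombu packaging:

dpkg -l python-amqp python-kombu
)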

More information at the bug report 
https://bugs.launchpad.net/oslo.messaging/+bug/1800957 

Sam



> On 1 Nov 2018, at 11:04 am, Sam Morrison  wrote:
> 
> Hi all,
> 
> We’ve been battling an issue after an upgrade to Pike which essentially makes 
> using RabbitMQ with SSL impossible.
> 
> https://bugs.launchpad.net/oslo.messaging/+bug/1800957 
> 
> We use Ubuntu cloud archives so it might not exactly be oslo but a dependent 
> library.
> 
> Anyone else seen similar issues?
> 
> Cheers,
> Sam
> 
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] RabbitMQ and SSL

2018-10-31 Thread Sam Morrison
Hi all,

We’ve been battling an issue after an upgrade to Pike which essentially makes 
using RabbitMQ with SSL impossible.

https://bugs.launchpad.net/oslo.messaging/+bug/1800957 


We use Ubuntu cloud archives so it might not exactly be oslo but a dependent 
library.

Anyone else seen similar issues?

Cheers,
Sam


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [neutron][lbaas][neutron-lbaas][octavia] Update on the previously announced deprecation of neutron-lbaas and neutron-lbaas-dashboard

2018-09-30 Thread Sam Morrison
Hi Michael,

Are all the backends that are supported by lbaas supported by octavia? I can’t 
see a page that lists the supported backends.

E.g. we use lbaas with the midonet driver and I can’t tell if this will still 
work when switching over.


Thanks,
Sam



> On 29 Sep 2018, at 8:07 am, Michael Johnson  wrote:
> 
> During the Queens release cycle we announced the deprecation of
> neutron-lbaas and neutron-lbaas-dashboard[1].
> 
> Today we are announcing the expected end date for the neutron-lbaas
> and neutron-lbaas-dashboard deprecation cycles.  During September 2019
> or the start of the “U” OpenStack release cycle, whichever comes
> first, neutron-lbaas and neutron-lbaas-dashboard will be retired. This
> means the code will be be removed and will not be released as part of
> the "U" OpenStack release per the infrastructure team’s “retiring a
> project” process[2].
> 
> We continue to maintain a Frequently Asked Questions (FAQ) wiki page
> to help answer additional questions you may have about this process:
> https://wiki.openstack.org/wiki/Neutron/LBaaS/Deprecation
> 
> For more information or if you have additional questions, please see
> the following resources:
> 
> The FAQ: https://wiki.openstack.org/wiki/Neutron/LBaaS/Deprecation
> 
> The Octavia documentation: https://docs.openstack.org/octavia/latest/
> 
> Reach out to us via IRC on the Freenode IRC network, channel #openstack-lbaas
> 
> Weekly Meeting: 20:00 UTC on Wednesdays in #openstack-lbaas on the
> Freenode IRC network.
> 
> Sending email to the OpenStack developer mailing list: openstack-dev
> [at] lists [dot] openstack [dot] org. Please prefix the subject with
> '[openstack-dev][Octavia]'
> 
> Thank you for your support and patience during this transition,
> 
> Michael Johnson
> Octavia PTL
> 
> [1] 
> http://lists.openstack.org/pipermail/openstack-dev/2018-January/126836.html
> [2] https://docs.openstack.org/infra/manual/drivers.html#retiring-a-project
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Openstack Kilo - Danger: There was an error submitting the form. Please try again.

2018-08-28 Thread Sam Morrison
Hi Anwar,

The log message you posted below is not an error (it is at level DEBUG and can 
be ignored). I would look in your horizon logs and also the API logs of 
nova/neutron/glance/cinder to see what’s going on.

Good luck!
Sam


> On 28 Aug 2018, at 9:22 pm, Anwar Durrani  wrote:
> 
> Hi Team,
> 
> I am using KILO, i am having issue while launching any instance, its ending 
> up with following error 
> 
> Danger: There was an error submitting the form. Please try again.
> 
> i have managed to capture the log, where i have found as
> 
> tail -f /var/log/nova/nova-conductor.log
> 2018-08-28 16:51:07.298 6187 DEBUG nova.openstack.common.loopingcall 
> [req-4f7a0004-ab9d-431e-8629-b0c4c15617e2 - - - - -] Dynamic looping call 
>  0x470ab90>> sleeping for 60.00 seconds _inner 
> /usr/lib/python2.7/site-packages/nova/openstack/common/loopingcall.py:132
> 
> Any clue why this is happening ?
> 
> --
>   
> Thanks & regards,
> Anwar M. Durrani
> +91-9923205011
>  
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][cinder][neutron] Cross-cell cold migration

2018-08-22 Thread Sam Morrison
I think in our case we’d only migrate between cells if we know the network and 
storage are accessible, and we would never do it if not. 
We’re thinking of moving from old to new hardware at a cell level.

If storage and network aren’t available, ideally it would fail at the API request.

There are also Ceph-backed instances, so this is something else to take into 
account which nova would be responsible for.

I’ll be in Denver so we can discuss more there too.

Cheers,
Sam





> On 23 Aug 2018, at 11:23 am, Matt Riedemann  wrote:
> 
> Hi everyone,
> 
> I have started an etherpad for cells topics at the Stein PTG [1]. The main 
> issue in there right now is dealing with cross-cell cold migration in nova.
> 
> At a high level, I am going off these requirements:
> 
> * Cells can shard across flavors (and hardware type) so operators would like 
> to move users off the old flavors/hardware (old cell) to new flavors in a new 
> cell.
> 
> * There is network isolation between compute hosts in different cells, so no 
> ssh'ing the disk around like we do today. But the image service is global to 
> all cells.
> 
> Based on this, for the initial support for cross-cell cold migration, I am 
> proposing that we leverage something like shelve offload/unshelve 
> masquerading as resize. We shelve offload from the source cell and unshelve 
> in the target cell. This should work for both volume-backed and 
> non-volume-backed servers (we use snapshots for shelved offloaded 
> non-volume-backed servers).
> 
> There are, of course, some complications. The main ones that I need help with 
> right now are what happens with volumes and ports attached to the server. 
> Today we detach from the source and attach at the target, but that's assuming 
> the storage backend and network are available to both hosts involved in the 
> move of the server. Will that be the case across cells? I am assuming that 
> depends on the network topology (are routed networks being used?) and storage 
> backend (routed storage?). If the network and/or storage backend are not 
> available across cells, how do we migrate volumes and ports? Cinder has a 
> volume migrate API for admins but I do not know how nova would know the 
> proper affinity per-cell to migrate the volume to the proper host (cinder 
> does not have a routed storage concept like routed provider networks in 
> neutron, correct?). And as far as I know, there is no such thing as port 
> migration in Neutron.
> 
> Could Placement help with the volume/port migration stuff? Neutron routed 
> provider networks rely on placement aggregates to schedule the VM to a 
> compute host in the same network segment as the port used to create the VM, 
> however, if that segment does not span cells we are kind of stuck, correct?
> 
> To summarize the issues as I see them (today):
> 
> * How to deal with the targeted cell during scheduling? This is so we can 
> even get out of the source cell in nova.
> 
> * How does the API deal with the same instance being in two DBs at the same 
> time during the move?
> 
> * How to handle revert resize?
> 
> * How are volumes and ports handled?
> 
> I can get feedback from my company's operators based on what their deployment 
> will look like for this, but that does not mean it will work for others, so I 
> need as much feedback from operators, especially those running with multiple 
> cells today, as possible. Thanks in advance.
> 
> [1] https://etherpad.openstack.org/p/nova-ptg-stein-cells
> 
> -- 
> 
> Thanks,
> 
> Matt


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] neutron-server memcached connections

2018-07-31 Thread Sam Morrison
Great, yeah we have also seen these issues with nova-api with keystonemiddleware 
in Newton and Ocata.

Thanks for the heads up as I was going to start digging deeper.
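
(For anyone else chasing this, the relevant knobs all live in each service’s 
[keystone_authtoken] section; a rough sketch, hostnames made up:

[keystone_authtoken]
memcached_servers = memcache1:11211,memcache2:11211
memcache_use_advanced_pool = true
memcache_pool_maxsize = 10
)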

Cheers,
Sam




> On 31 Jul 2018, at 10:09 am, iain MacDonnell  
> wrote:
> 
> 
> Following up on my own question, in case it's useful to others
> 
> Turns out that keystonemiddleware uses eventlet, and, by default, creates a 
> connection to memcached from each green thread (and doesn't clean them up), 
> and the green threads are essentially unlimited.
> 
> There is a solution for this, which implements a shared connection pool. It's 
> enabled via the keystone_authtoken.memcache_use_advanced_pool config option.
> 
> Unfortunately it was broken in a few different ways (I guess this means that 
> no one is using it?)
> 
> I've worked with the keystone devs, and we were able to get a fix (in 
> keystonemiddleware) in just in time for the Rocky release. Related fixes have 
> also been backported to Queens (for the next update), and a couple needed for 
> Pike are pending completion.
> 
> With this in place, so-far I have not seen more than one connection to 
> memcached for each neutron-api worker process, and everything seems to be 
> working well.
> 
> Some relevant changes:
> 
> master:
> 
> https://review.openstack.org/#/c/583695/
> 
> 
> Queens:
> 
> https://review.openstack.org/#/c/583698/
> https://review.openstack.org/#/c/583684/
> 
> 
> Pike:
> 
> https://review.openstack.org/#/c/583699/
> https://review.openstack.org/#/c/583835/
> 
> 
> I do wonder how others are managing memcached connections for larger 
> deployments...
> 
>~iain
> 
> 
> 
> On 06/26/2018 12:59 PM, iain MacDonnell wrote:
>> In diagnosing a situation where a Pike deployment was intermittently slower 
>> (in general), I discovered that it was (sometimes) exceeding memcached's 
>> maximum connection limit, which is set to 4096.
>> Looking closer, ~2750 of the connections are from 8 neutron-server process. 
>> neutron-server is configured with 8 API workers, and those 8 processes have 
>> a combined total of ~2750 connections to memcached:
>> # lsof -i TCP:11211 | awk '/^neutron-s/ {print $2}' | sort | uniq -c
>> 245 2611
>> 306 2612
>> 228 2613
>> 406 2614
>> 407 2615
>> 385 2616
>> 369 2617
>> 398 2618
>> #
>> There doesn't seem to be much turnover - comparing samples of the 
>> connections (incl. source port) 15 mins apart, two were dropped, and one new 
>> one added.
>> In neutron.conf, keystone_authtoken.memcached_servers is configured, but 
>> nothing else pertaining to caching, so 
>> keystone_authtoken.memcache_pool_maxsize should default to 10.
>> Am I misunderstanding something, or shouldn't I see a maximum of 10 
>> connections from each of the neutron-server API workers, with this 
>> configuration?
>> Any known issues, or pointers to what I'm missing?
>> TIA,
>> ~iain
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Ubuntu Kernel with Meltdown mitigation SSL issues

2018-01-18 Thread Sam Morrison
We have an F5 doing all the SSL in front of our API servers.

SSL-Session:
Protocol  : TLSv1.2
Cipher: ECDHE-RSA-AES256-GCM-SHA384

The majority of the requests that were failing were glance requests to
/v2/images?limit=20 (around 25% of requests, which is around 1-2 a second).
Glance is on Ocata.

We also saw the same error on heat and designate running Pike, and on other
services.

We thought it was to do with low entropy on the control VMs, as it was indeed
low; however, we tweaked this and increased entropy to over 3000 and still had
issues.
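
(For reference, the available entropy pool on a host can be checked with:

cat /proc/sys/kernel/random/entropy_avail

Values consistently in the low hundreds are generally considered low.)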

The underlying hypervisor is also running Xenial and the 4.4.0-109 kernel
but it hasn't got the intel-microcode package installed.

Let me know if anyone wants more details of our setup and I'll happily
provide them.


Cheers,
Sam




On Fri, Jan 19, 2018 at 7:18 AM, Logan V. <lo...@protiumit.com> wrote:

> We upgraded our control plane to 4.4.0-109 + intel-microcode
> 3.20180108.0~ubuntu16.04.2 several days ago, and are about 1/2 of the
> way thru upgrading our compute hosts with these changes. We use Ocata
> for all services, and no issue like this has been observed yet on our
> env. Control hosts are E5-2600 V2's and the computes are a mix of
> E5-2600 v2/v3/v4's along with some Xeon D1541's.
>
> On Thu, Jan 18, 2018 at 2:42 AM, Adam Heczko <ahec...@mirantis.com> wrote:
> > Hello Sam, thank you for sharing this information.
> > Could you please provide more information related to your specific setup.
> > How is Keystone API endpoint TLS terminated in your setup?
> > AFAIK in our OpenStack labs we haven't observed anything like this
> although
> > we terminate TLS on Nginx or HAProxy.
> >
> >
> > On Thu, Jan 18, 2018 at 4:36 AM, Sam Morrison <sorri...@gmail.com>
> wrote:
> >>
> >> Hi All,
> >>
> >> We updated our control infrastructure to the latest Ubuntu Xenial Kernel
> >> (4.4.0-109) which includes the meltdown fixes.
> >>
> >> We have found this kernel to have issues with SSL connections with
> python
> >> and have since downgraded. We get errors like:
> >>
> >> SSLError: SSL exception connecting to
> >> https://keystone.example.com:35357/v3/auth/tokens: ("bad handshake:
> >> Error([('', 'osrandom_rand_bytes', 'getrandom() initialization
> >> failed.')],)”,)
> >>
> >> Full trace:  http://paste.openstack.org/show/646803/
> >>
> >> This was affecting glance mainly but all API services were having
> issues.
> >>
> >> Our controllers are running inside KVM VMs and the guests see the CPU as
> >> "Intel Xeon E3-12xx v2 (Ivy Bridge)”
> >>
> >> This isn’t an openstack issue specifically but hopefully it helps others
> >> who may be seeing similar issues.
> >>
> >>
> >> Cheers,
> >> Sam
> >>
> >>
> >>
> >>
> >> ___
> >> OpenStack-operators mailing list
> >> OpenStack-operators@lists.openstack.org
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> >
> >
> >
> >
> > --
> > Adam Heczko
> > Security Engineer @ Mirantis Inc.
> >
> > ___
> > OpenStack-operators mailing list
> > OpenStack-operators@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> >
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Ubuntu Kernel with Meltdown mitigation SSL issues

2018-01-17 Thread Sam Morrison
Hi All,

We updated our control infrastructure to the latest Ubuntu Xenial Kernel 
(4.4.0-109) which includes the meltdown fixes.

We have found this kernel to have issues with SSL connections with python and 
have since downgraded. We get errors like:

SSLError: SSL exception connecting to 
https://keystone.example.com:35357/v3/auth/tokens: ("bad handshake: Error([('', 
'osrandom_rand_bytes', 'getrandom() initialization failed.')],)”,)

Full trace:  http://paste.openstack.org/show/646803/

This was affecting glance mainly but all API services were having issues.

Our controllers are running inside KVM VMs and the guests see the CPU as "Intel 
Xeon E3-12xx v2 (Ivy Bridge)”

This isn’t an openstack issue specifically but hopefully it helps others who 
may be seeing similar issues.


Cheers,
Sam




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Mixed service version CI testing

2018-01-01 Thread Sam Morrison
We usually upgrade nova last, so that would be helpful. 

Nectar has been running a mix of versions for a couple of years now and we 
treat each project as its own thing and upgrade everything separately.

You can see what versions we run currently at 
https://trello.com/b/9fkuT1eU/nectar-openstack-versions 


Sam



> On 29 Dec 2017, at 4:28 am, Clint Byrum  wrote:
> 
> Excerpts from Matt Riedemann's message of 2017-12-19 09:58:34 -0600:
>> During discussion in the TC channel today [1], we got talking about how 
>> there is a perception that you must upgrade all of the services together 
>> for anything to work, at least the 'core' services like 
>> keystone/nova/cinder/neutron/glance - although maybe that's really just 
>> nova/cinder/neutron?
>> 
>> Anyway, I posit that the services are not as tightly coupled as some 
>> people assume they are, at least not since kilo era when microversions 
>> started happening in nova.
>> 
>> However, with the way we do CI testing, and release everything together, 
>> the perception is there that all things must go together to work.
>> 
>> In our current upgrade job, we upgrade everything to N except the 
>> nova-compute service, that remains at N-1 to test rolling upgrades of 
>> your computes and to make sure guests are unaffected by the upgrade of 
>> the control plane.
>> 
>> I asked if it would be valuable to our users (mostly ops for this 
>> right?) if we had an upgrade job where everything *except* nova were 
>> upgraded. If that's how the majority of people are doing upgrades anyway 
>> it seems we should make sure that works.
>> 
>> I figure leaving nova at N-1 makes more sense because nova depends on 
>> the other services (keystone/glance/cinder/neutron) and is likely the 
>> harder / slower upgrade if you're going to do rolling upgrades of your 
>> compute nodes.
>> 
>> This type of job would not run on nova changes on the master branch, 
>> since those changes would not be exercised in this type of environment. 
>> So we'd run this on master branch changes to 
>> keystone/cinder/glance/neutron/trove/designate/etc.
>> 
>> Does that make sense? Would this be valuable at all? Or should the 
>> opposite be tested where we upgrade nova to N and leave all of the 
>> dependent services at N-1?
>> 
> 
> It makes sense completely. What would really be awesome would be to test
> the matrix of single upgrades:
> 
> upgrade only keystone
> upgrade only glance
> upgrade only neutron
> upgrade only cinder
> upgrade only nova
> 
> That would have a good chance at catching any co-dependencies that crop
> up.
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org 
> 
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators 
> 
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova] Cinder cross_az_attach=False changes/fixes

2017-06-06 Thread Sam Morrison
Hi Matt,

Just looking into this,

> On 1 Jun 2017, at 9:08 am, Matt Riedemann  wrote:
> 
> This is a request for any operators out there that configure nova to set:
> 
> [cinder]
> cross_az_attach=False
> 
> To check out these two bug fixes:
> 
> 1. https://review.openstack.org/#/c/366724/
> 
> This is a case where nova is creating the volume during boot from volume and 
> providing an AZ to cinder during the volume create request. Today we just 
> pass the instance.availability_zone which is None if the instance was created 
> without an AZ set. It's unclear to me if that causes the volume creation to 
> fail (someone in IRC was showing the volume going into ERROR state while Nova 
> was waiting for it to be available), but I think it will cause the later 
> attach to fail here [1] because the instance AZ (defaults to None) and volume 
> AZ (defaults to nova) may not match. I'm still looking for more details on 
> the actual failure in that one though.
> 
> The proposed fix in this case is pass the AZ associated with any host 
> aggregate that the instance is in.

If cross_az_attach is false, won’t it always result in the instance AZ being 
None, as it won’t be on a host yet?
I haven’t traced back the code fully, so I’m not sure if an instance gets scheduled 
onto a host and then the volume create call happens, or whether they happen in 
parallel etc. (in the case of boot from volume).


When cross_az_attach is false:
If a user does a boot from volume (create new volume) and specifies an AZ then 
I would expect the instance and the volume to be created in the specified AZ.
If the AZ doesn’t exist in cinder or nova I would expect it to fail.

If a user doesn’t specify an AZ I would expect that the instance and the volume 
are in the same AZ.
If there isn’t a common AZ between cinder and nova I would expect it to fail.



> 
> 2. https://review.openstack.org/#/c/469675/
> 
> This is similar, but rather than checking the AZ when we're on the compute 
> and the instance has a host, we're in the API and doing a boot from volume 
> where an existing volume is provided during server create. By default, the 
> volume's AZ is going to be 'nova'. The code doing the check here is getting 
> the AZ for the instance, and since the instance isn't on a host yet, it's not 
> in any aggregate, so the only AZ we can get is from the server create request 
> itself. If an AZ isn't provided during the server create request, then we're 
> comparing instance.availability_zone (None) to volume['availability_zone'] 
> ("nova") and that results in a 400.
> 
> My proposed fix is in the case of BFV checks from the API, we default the AZ 
> if one wasn't requested when comparing against the volume. By default this is 
> going to compare "nova" for nova and "nova" for cinder, since 
> CONF.default_availability_zone is "nova" by default in both projects.
> 

Is this an alternative approach? Just trying to get my head around this all.

Thanks,
Sam


> --
> 
> I'm requesting help from any operators that are setting cross_az_attach=False 
> because I have to imagine your users have run into this and you're patching 
> around it somehow, so I'd like input on how you or your users are dealing 
> with this.
> 
> I'm also trying to recreate these in upstream CI [2] which I was already able 
> to do with the 2nd bug.
> 
> Having said all of this, I really hate cross_az_attach as it's config-driven 
> API behavior which is not interoperable across clouds. Long-term I'd really 
> love to deprecate this option but we need a replacement first, and I'm hoping 
> placement with compute/volume resource providers in a shared aggregate can 
> maybe make that happen.
> 
> [1] 
> https://github.com/openstack/nova/blob/f278784ccb06e16ee12a42a585c5615abe65edfe/nova/virt/block_device.py#L368
> [2] https://review.openstack.org/#/c/467674/
> 
> -- 
> 
> Thanks,
> 
> Matt
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Successful nova-network to Neutron Migration

2017-05-28 Thread Sam Morrison
Awesome! Glad to hear it all went well.

Cheers,
Sam


> On 21 May 2017, at 4:51 am, Joe Topjian  wrote:
> 
> Hi all,
> 
> There probably aren't a lot of people in this situation nowadays, but for 
> those that are, I wanted to report a successful nova-network to Neutron 
> migration.
> 
> We used NeCTAR's migration scripts which can be found here:
> 
> https://github.com/NeCTAR-RC/novanet2neutron 
> 
> 
> These scripts allowed us to do an in-place upgrade with almost no downtime. 
> There was probably an hour or two of network downtime, but all instances 
> stayed up and running. There were also a handful of instances that needed a 
> hard reboot and some that had to give up their Floating IP to Neutron. All 
> acceptable, IMO.
> 
> We modified them to suit our environment, specifically by adding support for 
> IPv6 and Floating IPs. In addition, we leaned on our existing Puppet 
> environment to deploy certain  Nova and Neutron settings in phases. 
> 
> But we wouldn't have been able to do this migration without these scripts, so 
> to Sam and the rest of the NeCTAR crew: thank you all very much!
> 
> Joe
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?

2017-05-01 Thread Sam Morrison

> On 1 May 2017, at 4:24 pm, Sean McGinnis  wrote:
> 
> On Mon, May 01, 2017 at 10:17:43AM -0400, Matthew Treinish wrote:
>>> 
>> 
>> I thought it was just nova too, but it turns out cinder has the same exact
>> option as nova: (I hit this in my devstack patch trying to get glance 
>> deployed
>> as a wsgi app)
>> 
>> https://github.com/openstack/cinder/blob/d47eda3a3ba9971330b27beeeb471e2bc94575ca/cinder/common/config.py#L51-L55
>> 
>> Although from what I can tell you don't have to set it and it will fallback 
>> to
>> using the catalog, assuming you configured the catalog info for cinder:
>> 
>> https://github.com/openstack/cinder/blob/19d07a1f394c905c23f109c1888c019da830b49e/cinder/image/glance.py#L117-L129
>> 
>> 
>> -Matt Treinish
>> 
> 
> FWIW, that came with the original fork out of Nova. I do not have any real
> world data on whether that is used or not.

Yes this is used in cinder.

A lot of the projects let you set the endpoints for them to use. This is extremely 
useful in a large production OpenStack install where you want to control the 
traffic.

I can understand using the catalog in certain situations and feel it’s OK for 
that to be the default but please don’t prevent operators configuring it 
differently.

Glance is the big one, as you want to control the data flow efficiently, but any 
service-to-service configuration should ideally be able to be manually 
configured.
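
(A sketch of the kind of setting I mean, e.g. in nova.conf, with made-up hostnames:

[glance]
api_servers = http://glance01.example.com:9292,http://glance02.example.com:9292

so that image traffic goes to the glance endpoints we choose rather than whatever 
happens to be in the catalog.)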

Cheers,
Sam


> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Trove Shadow Tenant

2017-02-05 Thread Sam Morrison
Hi Sergio,

I’m very interested in this feature too, it might be worth asking in the 
openstack-trove IRC channel or on the openstack-dev mailing list (adding a 
[trove] in the subject should get their attention) to get some answers to this 
question.

Cheers,
Sam


> On 4 Feb 2017, at 1:34 am, Sergio Morales Acuña  wrote:
> 
> Hi.
> 
> I'm looking for information about the "Trove Shadow Tenant" feature.
> 
> There some blogs talking about this but I can't find any information about 
> the configuration.
> 
> I have a working implementation of Trove but the instance is created in the 
> same project as the user requesting the database. This is a problem for me 
> because the user can create a snapshot of the instance and capture the 
> RabbitMQ password.
> 
> Cheers.
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] What would you like in Pike?

2017-01-18 Thread Sam Morrison
I would love it if all the projects’ policy.json files were actually usable. Too many 
times the policy.json isn’t the only place where authorization happens, with lots of 
hard-coded is_admin checks etc.

Just the ability to have a certain role do a certain thing would be amazing. 
It makes it really hard to have read-only users to generate reports with, so that 
we can show our funders how much people use our OpenStack cloud.
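
(As a concrete example, what I’d love to be able to rely on is something like the 
following in nova’s policy.json. This is only a sketch: the target names are from 
nova’s v2.1 policy, and ‘reader’ is a custom role we’d define ourselves.

{
    "admin_or_owner": "is_admin:True or project_id:%(project_id)s",
    "reader_or_owner": "role:reader or rule:admin_or_owner",
    "os_compute_api:servers:index": "rule:reader_or_owner",
    "os_compute_api:servers:detail": "rule:reader_or_owner"
}
)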

Cheers,
Sam
(non-enterprise)



> On 18 Jan 2017, at 6:10 am, Melvin Hillsman  wrote:
> 
> Well said, as a consequence of this thread being on the mailing list, I hope 
> that we can get all operators, end-users, and app-developers to respond. If 
> you are aware of folks who do not fall under the "enterprise" label please 
> encourage them directly to respond; I would encourage everyone to do the same.
> 
> On Tue, Jan 17, 2017 at 11:52 AM, Silence Dogood  > wrote:
> I can see a huge problem with your contributing operators... all of them are 
> enterprise.
> 
> enterprise needs are radically different from small to medium deployers who 
> openstack has traditionally failed to work well for.
> 
> On Tue, Jan 17, 2017 at 12:47 PM, Piet Kruithof  > wrote:
> Sorry for the late reply, but wanted to add a few things.
> 
> OpenStack UX did suggest to the foundation that the community needs a second 
> survey that focuses exclusively on operators.  The rationale was that the 
> user survey is primarily focused on marketing data and there isn't really a 
> ton of space for additional questions that focuses exclusively on operators. 
> We also recommended a second survey called a MaxDiff study that enabled 
> operators to identify areas of improvement and also rate them in order of 
> importance including distance.
> 
> There is also an etherpad that asked operators three priorities for OpenStack:
> 
> https://etherpad.openstack.org/p/mitaka-openstackux-enterprise-goals 
> 
> 
> It was distributed about a year ago, so I'm not sure how much of it was 
> relevant.  The list does include responses from folks at TWC, Walmart, 
> Pacific Northwest Labs, BestBuy, Comcast, NTTi3 and the US government. It 
> might be a good place for the group to add their own improvements as well as 
> "+" other peoples suggestions.
> 
> There is also a list of studies that have been conducted with operators on 
> behalf of the community. The study included quotas, deployment and 
> information needs. Note that the information needs study extended beyond docs 
> to things like the need to easily google solutions and the need for SMEs.
> 
> Hope this is helpful.  
> 
> Piet
> 
> ___
> OPENSTACK USER EXPERIENCE STATUS
> The goal of this presentation is to provide an overview of research that was 
> conducted on behalf of the OpenStack community.  All of the studies conducted 
> on behalf of the OpenStack community were included in this presentation. 
> 
> Why this research matters:
> Consistency across projects has been identified as an issue in the user 
> survey.
> 
> Study design:
> This usability study, conducted at the OpenStack Austin Summit, observed 10 
> operators as they attempted to perform standard tasks in the OpenStack client.
> 
> https://docs.google.com/presentation/d/1hZYCOADJ1gXiFHT1ahwv8-tDIQCSingu7zqSMbKFZ_Y/edit#slide=id.p
>  
> 
>  
> 
> 
> 
> ___
> USER RESEARCH RESULTS: SEARCHLIGHT/HORIZON INTEGRATION
> Why this research matters:
> The Searchlight plug-in for Horizon aims to provide a consistent search API 
> across OpenStack resources. To validate its suitability and ease of use, we 
> evaluated it with cloud operators who use Horizon in their role.
> 
> Study design:
> Five operators performed tasks that explored Searchlight’s filters, full-text 
> capability, and multi-term search.
> 
> https://docs.google.com/presentation/d/1TfF2sm98Iha-bNwBJrCTCp6k49zde1Z8I9Qthx1moIM/edit?usp=sharing
>  
> 
>  
> 
> 
> 
> ___
> CLOUD OPERATOR INTERVIEWS: QUOTA MANAGEMENT AT PRODUCTION SCALE
> Why this research matters:
> The study was initiated following operator feedback identifying quotas as a 
> challenge to manage at scale.
> 
> Study design:
> One-on-one interviews with cloud operators sought to understand their methods 
> for managing quotas at production scale.
> 
> https://docs.google.com/presentation/d/1J6-8MwUGGOwy6-A_w1EaQcZQ1Bq2YWeB-kw4vCFxbwM/edit
>  
> 
> 
> 
> 
> ___
> CLOUD OPERATOR INTERVIEWS: INFORMATION NEEDS
> Why this research matters:
> Documentation has been consistently identified as an issue by operators 
> during the 

Re: [Openstack-operators] RabbitMQ 3.6.x experience?

2017-01-10 Thread Sam Morrison

> On 10 Jan 2017, at 11:04 pm, Tomáš Vondra <von...@homeatcloud.cz> wrote:
> 
> The version is 3.6.2, but the issue that I believe is relevant is still not 
> fixed:
> https://github.com/rabbitmq/rabbitmq-management/issues/41
> Tomas
> 

Yeah we found this version unusable, 3.6.5 hasn’t had any problems for us.

Sam



> -Original Message-
> From: Mike Dorman [mailto:mdor...@godaddy.com] 
> Sent: Monday, January 09, 2017 6:00 PM
> To: Ricardo Rocha; Sam Morrison
> Cc: OpenStack Operators
> Subject: Re: [Openstack-operators] RabbitMQ 3.6.x experience?
> 
> Great info, thanks so much for this.  We, too, have turned off stats 
> collection some time ago (and haven’t really missed it.)
> 
> Tomáš, what minor version of 3.6 are you using?  We would probably go to 
> 3.6.6 if we upgrade.
> 
> Thanks again all!
> Mike
> 
> 
> On 1/9/17, 2:34 AM, "Ricardo Rocha" <rocha.po...@gmail.com> wrote:
> 
>Same here, running 3.6.5 for (some) of the rabbit clusters.
> 
>It's been stable over the last month (fingers crossed!), though:
>* gave up on stats collection (set to 6 which makes it not so useful)
>* can still make it very sick with a couple of misconfigured clients
>(rabbit_retry_interval=1 and rabbit_retry_backoff=60 currently
>everywhere).
> 
>Some data from the neutron rabbit cluster (3 vm nodes, not all infra
>currently talks to neutron):
> 
>* connections: ~8k
>* memory used per node: 2.5GB, 1.7GB, 0.1GB (the last one is less used
>due to a previous net partition i believe)
>* rabbit hiera configuration
>rabbitmq::cluster_partition_handling: 'autoheal'
>rabbitmq::config_kernel_variables:
>  inet_dist_listen_min: 41055
>  inet_dist_listen_max: 41055
>rabbitmq::config_variables:
>  collect_statistics_interval: 6
>  reverse_dns_lookups: true
>  vm_memory_high_watermark: 0.8
>rabbitmq::environment_variables:
>  SERVER_ERL_ARGS: "'+K true +A 128 +P 1048576'"
>rabbitmq::tcp_keepalive: true
>rabbitmq::tcp_backlog: 4096
> 
>* package versions
> 
>erlang-kernel-18.3.4.4-1
>rabbitmq-server-3.6.5-1
> 
>It's stable enough to keep scaling it up in the next couple months and
>see how it goes.
> 
>Cheers,
>  Ricardo
> 
>On Mon, Jan 9, 2017 at 3:54 AM, Sam Morrison <sorri...@gmail.com> wrote:
>> We’ve been running 3.6.5 for sometime now and it’s working well.
>> 
>> 3.6.1 - 3.6.3 are unusable, we had lots of issues with stats DB and other
>> weirdness.
>> 
>> Our setup is a 3 physical node cluster with around 9k connections, average
>> around the 300 messages/sec delivery. We have the stats sample rate set to
>> default and it is working fine.
>> 
>> Yes we did have to restart the cluster to upgrade.
>> 
>> Cheers,
>> Sam
>> 
>> 
>> 
>> On 6 Jan 2017, at 5:26 am, Matt Fischer <m...@mattfischer.com> wrote:
>> 
>> MIke,
>> 
>> I did a bunch of research and experiments on this last fall. We are running
>> Rabbit 3.5.6 on our main cluster and 3.6.5 on our Trove cluster which has
>> significantly less load (and criticality). We were going to upgrade to 3.6.5
>> everywhere but in the end decided not to, mainly because there was little
>> perceived benefit at the time. Our main issue is unchecked memory growth at
>> random times. I ended up making several config changes to the stats
>> collector and then we also restart it after every deploy and that solved it
>> (so far).
>> 
>> I'd say these were my main reasons for not going to 3.6 for our control
>> nodes:
>> 
>> In 3.6.x they re-wrote the stats processor to make it parallel. In every 3.6
>> release since then, Pivotal has fixed bugs in this code. Then finally they
>> threw up their hands and said "we're going to make a complete rewrite in
>> 3.7/4.x" (you need to look through issues on Github to find this discussion)
>> Out of the box with the same configs 3.6.5 used more memory than 3.5.6,
>> since this was our main issue, I consider this a negative.
>> Another issue is the ancient version of erlang we have with Ubuntu Trusty
>> (which we are working on) which made upgrades more complex/impossible
>> depending on the version.
>> 
>> Given those negatives, the main one being that I didn't think there would be
>> too many more fixes to the parallel statsdb collector in 3.6, we decided to
>> stick with 3.5.6. In the end the devil we know is better than the devil we
>> don't and I had no evidence that 3.6.5 would be a

Re: [Openstack-operators] RabbitMQ 3.6.x experience?

2017-01-08 Thread Sam Morrison
We’ve been running 3.6.5 for some time now and it’s working well.

3.6.1 - 3.6.3 are unusable, we had lots of issues with stats DB and other 
weirdness. 

Our setup is a 3 physical node cluster with around 9k connections, average 
around the 300 messages/sec delivery. We have the stats sample rate set to 
default and it is working fine.

Yes we did have to restart the cluster to upgrade.

Cheers,
Sam



> On 6 Jan 2017, at 5:26 am, Matt Fischer  wrote:
> 
> MIke,
> 
> I did a bunch of research and experiments on this last fall. We are running 
> Rabbit 3.5.6 on our main cluster and 3.6.5 on our Trove cluster which has 
> significantly less load (and criticality). We were going to upgrade to 3.6.5 
> everywhere but in the end decided not to, mainly because there was little 
> perceived benefit at the time. Our main issue is unchecked memory growth at 
> random times. I ended up making several config changes to the stats collector 
> and then we also restart it after every deploy and that solved it (so far). 
> 
> I'd say these were my main reasons for not going to 3.6 for our control nodes:
> In 3.6.x they re-wrote the stats processor to make it parallel. In every 3.6 
> release since then, Pivotal has fixed bugs in this code. Then finally they 
> threw up their hands and said "we're going to make a complete rewrite in 
> 3.7/4.x" (you need to look through issues on Github to find this discussion)
> Out of the box with the same configs 3.6.5 used more memory than 3.5.6, since 
> this was our main issue, I consider this a negative.
> Another issue is the ancient version of erlang we have with Ubuntu Trusty 
> (which we are working on) which made upgrades more complex/impossible 
> depending on the version.
> Given those negatives, the main one being that I didn't think there would be 
> too many more fixes to the parallel statsdb collector in 3.6, we decided to 
> stick with 3.5.6. In the end the devil we know is better than the devil we 
> don't and I had no evidence that 3.6.5 would be an improvement.
> 
> I did decide to leave Trove on 3.6.5 because this would give us some bake-in 
> time if 3.5.x became untenable we'd at least have had it up and running in 
> production and some data on it.
> 
> If statsdb is not a concern for you, I think this changes the math and maybe 
> you should use 3.6.x. I would however recommend at least going to 3.5.6, it's 
> been better than 3.3/3.4 was.
> 
> No matter what you do definitely read all the release notes. There are some 
> upgrades which require an entire cluster shutdown. The upgrade to 3.5.6 did 
> not require this IIRC.
> 
> Here's the hiera for our rabbit settings which I assume you can translate:
> 
> rabbitmq::cluster_partition_handling: 'autoheal'
> rabbitmq::config_variables:
>   'vm_memory_high_watermark': '0.6'
>   'collect_statistics_interval': 3
> rabbitmq::config_management_variables:
>   'rates_mode': 'none'
> rabbitmq::file_limit: '65535'
> 
> Finally, if you do upgrade to 3.6.x please report back here with your results 
> at scale!
> 
> 
> On Thu, Jan 5, 2017 at 8:49 AM, Mike Dorman  > wrote:
> We are looking at upgrading to the latest RabbitMQ in an effort to ease some 
> cluster failover issues we’ve been seeing.  (Currently on 3.4.0)
> 
>  
> 
> Anyone been running 3.6.x?  And what has been your experience?  Any gottchas 
> to watch out for?
> 
>  
> 
> Thanks,
> 
> Mike
> 
>  
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org 
> 
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators 
> 
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] need feedback about Glance image 'visibility' migration in Ocata

2016-11-17 Thread Sam Morrison
Hi Brian,

I don't think the user can shoot themselves in the foot here. If they are
adding a member to an image it is pretty clear it means they want to share
it.

Yes I can see the case when you want to disable sharing but I don't think
the 'visibility' attribute is the way to do it.

What if you want to share an image with a few people and then prevent the
sharing of the image with any other people? Do you then change the visibility
to private? Maybe this is what the protected attribute should be for?

Basically I think you're overloading the visibility attribute: in one sense
it means you can see the image, but then you're also making it determine
whether the image can be shared or not.

Cheers,
Sam

On Fri, Nov 18, 2016 at 12:27 AM, Brian Rosmaita <
brian.rosma...@rackspace.com> wrote:

> On 11/17/16, 1:39 AM, "Sam Morrison" <sorri...@gmail.com> wrote:
>
>
> On 17 Nov. 2016, at 3:49 pm, Brian Rosmaita <brian.rosma...@rackspace.com>
> wrote:
>
> Ocata workflow:  (1) create an image with default visibility, (2) change
> its visibility to 'shared', (3) add image members
>
>
> Unsure why this can’t be done in 2 steps, when someone adds an image
> member to a ‘private’ image the visibility changes to ‘shared’
> automatically.
> Just seems an extra step for no reason?
>
>
> Thanks for asking, Sam, I'm sure others have the same question.
>
> Here's what we're thinking.  We want to avoid "magic" visibility
> transitions as a side effect of another action, and we want all means of
> changing visibility to be consistent going forward.  The two-step 1-1
> sharing that automatically takes you from 'private' -> 'shared' is
> dangerous, as it can expose data and doesn't give an end user a way to make
> an image "really" private.  It's true that all an end user has to do under
> the new scheme is make one extra API call and then still shoot him/herself
> in the foot, but at least the end user has to remove the safety first by
> explicitly changing the visibility of the image from 'private' to 'shared'
> before the member-list has any effect.
>
> So basically, the reasons for the extra step are consistency and clarity.
>
>
> Sam
>
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] need feedback about Glance image 'visibility' migration in Ocata

2016-11-16 Thread Sam Morrison

> On 17 Nov. 2016, at 3:49 pm, Brian Rosmaita  
> wrote:
> 
> Ocata workflow:  (1) create an image with default visibility, (2) change
> its visibility to 'shared', (3) add image members

Unsure why this can’t be done in 2 steps: when someone adds an image member to 
a ‘private’ image, the visibility changes to ‘shared’ automatically.
It just seems an extra step for no reason?

Sam

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Audit Logging - Interested? What's missing?

2016-11-16 Thread Sam Morrison
Anybody using http://docs.openstack.org/developer/keystonemiddleware/audit.html ?
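
(For reference, wiring it into a service’s api-paste.ini looks roughly like the 
example in those docs, e.g. for nova:

[filter:audit]
paste.filter_factory = keystonemiddleware.audit:filter_factory
audit_map_file = /etc/nova/api_audit_map.conf

and then adding audit to the pipeline after authtoken.)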




> On 17 Nov. 2016, at 11:51 am, Kris G. Lindgren  wrote:
> 
> I need to do a deeper dive on audit logging. 
> 
> However, we have a requirement for when someone changes a security group that 
> we log what the previous security group was and what the new security group 
> is and who changed it.  I don’’t know if this is specific to our crazy 
> security people or if others security peoples want to have this.  I am sure I 
> can think of others.
> 
> 
> ___
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy
> 
> On 11/16/16, 3:29 PM, "Tom Fifield"  wrote:
> 
>Hi Ops,
> 
>Was chatting with Department of Defense in Australia the other day, and 
>one of their pain points is Audit Logging. Some bits of OpenStack just 
>don't leave enough information for proper audit. So, thought it might be 
>a good idea to gather people who are interested to brainstorm how to get 
>it to a good level for all :)
> 
>Does your cloud need good audit logging? What do you wish was there at 
>the moment, but isn't?
> 
> 
>Regards,
> 
> 
>Tom
> 
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [puppet][fuel][packstack][tripleo] puppet 3 end of life

2016-11-03 Thread Sam Morrison

> On 4 Nov. 2016, at 1:33 pm, Emilien Macchi <emil...@redhat.com> wrote:
> 
> On Thu, Nov 3, 2016 at 9:10 PM, Sam Morrison <sorri...@gmail.com> wrote:
>> Wow I didn’t realise puppet3 was being deprecated, is anyone actually using 
>> puppet4?
>> 
>> I would hope that the openstack puppet modules would support puppet3 for a 
>> while still, at least until the next Ubuntu LTS is out, else we would get to 
>> the stage where the openstack release supports Xenial but the corresponding 
>> puppet module would not? (Xenial has puppet3)
> 
> I'm afraid we made a lot of communications around it but you might
> have missed it, no problem.
> I have 3 questions for you:
> - for what reasons would you not upgrade puppet?

Because I’m a time-poor operator with more important stuff to upgrade :-)
Upgrading puppet *could* be a big task and something we haven’t had time to 
look into. We don’t follow along with Puppet Labs, so we didn’t realise puppet3 was 
being deprecated. Now that this has come to my attention we’ll look into it for 
sure.

> - would it be possible for you to use puppetlabs packaging if you need
> puppet4 on Xenial? (that's what upstream CI is using, and it works
> quite well).

OK, that’s promising; good to know that the CI is using puppet4. It’s all my 
other dodgy puppet code I’m worried about.

> - what version of the modules do you deploy? (and therefore what
> version of OpenStack)

We’re using a mixture of Newton/Mitaka/Liberty/Kilo; sometimes the puppet 
module version is newer than the OpenStack version too, depending on where we’re 
at in the upgrade process of the particular OpenStack project.

I understand progress must go on, but I am interested in how many operators 
use puppet4. We may be in the minority, and then I’ll be quiet :-)

Maybe it should be deprecated in one release and then dropped in the next?


Cheers,
Sam





> 
>> My guess is that this would also be the case for RedHat and other distros 
>> too.
> 
> Fedora is shipping Puppet 4 and we're going to do the same for Red Hat
> and CentOS7.
> 
>> Thoughts?
>> 
>> 
>> 
>>> On 4 Nov. 2016, at 2:58 am, Alex Schultz <aschu...@redhat.com> wrote:
>>> 
>>> Hey everyone,
>>> 
>>> Puppet 3 is reaching it's end of life at the end of this year[0].
>>> Because of this we are planning on dropping official puppet 3 support
>>> as part of the Ocata cycle.  While we currently are not planning on
>>> doing any large scale conversion of code over to puppet 4 only syntax,
>>> we may allow some minor things in that could break backwards
>>> compatibility.  Based on feedback we've received, it seems that most
>>> people who may still be using puppet 3 are using older (< Newton)
>>> versions of the modules.  These modules will continue to be puppet 3.x
>>> compatible but we're using Ocata as the version where Puppet 4 should
>>> be the target version.
>>> 
>>> If anyone has any concerns or issues around this, please let us know.
>>> 
>>> Thanks,
>>> -Alex
>>> 
>>> [0] https://puppet.com/misc/puppet-enterprise-lifecycle
>>> 
>>> ___
>>> OpenStack-operators mailing list
>>> OpenStack-operators@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> 
>> 
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org 
>> <mailto:OpenStack-operators@lists.openstack.org>
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators 
>> <http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators>
> 
> 
> 
> -- 
> Emilien Macchi

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [puppet][fuel][packstack][tripleo] puppet 3 end of life

2016-11-03 Thread Sam Morrison
Wow, I didn’t realise puppet3 was being deprecated; is anyone actually using 
puppet4?

I would hope that the openstack puppet modules would support puppet3 for a 
while still, at least until the next Ubuntu LTS is out, else we would get to the 
stage where the openstack release supports Xenial but the corresponding puppet 
module would not? (Xenial has puppet3)

My guess is that this would also be the case for RedHat and other distros too.

Thoughts?



> On 4 Nov. 2016, at 2:58 am, Alex Schultz  wrote:
> 
> Hey everyone,
> 
> Puppet 3 is reaching it's end of life at the end of this year[0].
> Because of this we are planning on dropping official puppet 3 support
> as part of the Ocata cycle.  While we currently are not planning on
> doing any large scale conversion of code over to puppet 4 only syntax,
> we may allow some minor things in that could break backwards
> compatibility.  Based on feedback we've received, it seems that most
> people who may still be using puppet 3 are using older (< Newton)
> versions of the modules.  These modules will continue to be puppet 3.x
> compatible but we're using Ocata as the version where Puppet 4 should
> be the target version.
> 
> If anyone has any concerns or issues around this, please let us know.
> 
> Thanks,
> -Alex
> 
> [0] https://puppet.com/misc/puppet-enterprise-lifecycle
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Ceilometer/oslo.messaging connect to multiple RMQ endpoints

2016-11-03 Thread Sam Morrison
That was me! And yes, you can do it when consuming notifications with 
ceilometer-agent-notification.

Eg in our ceilometer.conf we have

[notification]
workers=12
disable_non_metric_meters=true
store_events = true
batch_size = 50
batch_timeout = 5
messaging_urls = rabbit://XX:XX@rabbithost1:5671/vhost1
messaging_urls = rabbit://XX:XX@rabbithost2:5671/vhost2
messaging_urls = rabbit://XX:XX@rabbithost3:5671/vhost3


If no messaging_urls are set then it will fall back to the settings in the 
[oslo_messaging_rabbit] config section.
Also, if you set messaging_urls then it won’t consume from the rabbit specified 
in [oslo_messaging_rabbit], so you have to add that one to messaging_urls too.

Cheers,
Sam




> On 4 Nov. 2016, at 10:28 am, Mike Dorman  wrote:
> 
> I heard third hand from the summit that it’s possible to configure 
> Ceilometer/oslo.messaging with multiple rabbitmq_hosts config entries, which 
> will let you connect to multiple RMQ endpoints at the same time.
>  
> The scenario here is we use the Ceilometer notification agent to pipe events 
> from OpenStack services into a Kafka queue for consumption by other team(s) 
> in the company.  We also run Nova cells v1, so we have to run one Ceilometer 
> agent for the API cell, as well as an agent for every compute cell (because 
> they have independent RMQ clusters.)
>  
> Anyway, I tried configuring it this way and it still only connects to a 
> single RMQ server.  We’re running Liberty Ceilometer and oslo.messaging, so 
> I’m wondering if this behavior is only in a later version?  Can anybody shed 
> any light?  I would love to get away from running so many Ceilometer agents.
>  
> Thanks!
> Mike
>  
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org 
> 
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators 
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Murano in Production

2016-09-18 Thread Sam Morrison
We run completely separate clusters. I’m sure vhosts give you acceptable 
security, but it also means sharing disk and RAM, which means that if something 
went awry and generated lots of messages etc. it could take your whole rabbit 
cluster down.
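
(For comparison, vhost-level separation on the existing cluster would only be a 
few commands; the names and password here are just illustrative:

rabbitmqctl add_vhost murano-agent
rabbitmqctl add_user murano-agent SOME_PASSWORD
rabbitmqctl set_permissions -p murano-agent murano-agent ".*" ".*" ".*"

but it still shares the same Erlang VM, disk and RAM as everything else.)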

Sam


> On 17 Sep 2016, at 3:34 PM, Joe Topjian  wrote:
> 
> Hi all,
> 
> We're planning to deploy Murano to one of our OpenStack clouds and I'm 
> debating the RabbitMQ setup.
> 
> For background: the Murano agent that runs on instances requires access to 
> RabbitMQ. Murano is able to be configured with two RabbitMQ services: one for 
> traditional OpenStack communication and one for the Murano/Agent 
> communication.
> 
> From a security/segregation point of view, would vhost separation on our 
> existing RabbitMQ cluster be sufficient? Or is it recommended to have an 
> entirely separate cluster?
> 
> As you can imagine, I'd like to avoid having to manage *two* RabbitMQ 
> clusters. :)
> 
> Thanks,
> Joe
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Designate service

2016-08-28 Thread Sam Morrison
Hi Alexandra,

I don’t think that is supported in designate yet. I think there is/was a 
blueprint floating around somewhere for this, but I’m not sure if anyone has looked 
at implementing it.

If you wanted to code this up yourself you could also do this with a 
notification handler that looks for the instance.boot event and then goes off 
and adds a record to designate.
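
(A custom handler gets wired in via designate.conf much like the built-in 
nova_fixed one does; roughly, and assuming you call your handler reverse_ptr:

[service:sink]
enabled_notification_handlers = reverse_ptr

[handler:reverse_ptr]
notification_topics = notifications
control_exchange = nova

with the PTR-creating logic living in the handler itself.)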

Cheers,
Sam


> On 25 Aug 2016, at 8:55 PM, Alexandra Kisin  wrote:
> 
> Hello, I'm operating a Liberty OpenStack environment and using the Designate 
> service for DNS, based on PowerDNS. 
> Everything is working and syncing fine except one important thing - there are 
> only A records in dns and no PTR records. 
> And we have some applications and services which require reverse naming. 
> My question is how it can be implemented in Liberty environment ?
> How I can define additional reverse domain in /etc/designate/designate.conf 
> file and make reverse naming to be registered automatically once the instance 
> was deployed ? 
> 
> Thank you. 
> 
> Regards,
> 
> Alexandra Kisin
> Servers & Network group, IBM R Labs in Israel
> Unix & Virtualization Team
> Phone: +972-48296172 | Mobile: +972-54-6976172 | Fax: +972-4-8296111
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Blazar? (Reservations and/or scheduled termination)

2016-08-03 Thread Sam Morrison

> On 4 Aug 2016, at 3:12 AM, Kris G. Lindgren  wrote:
> 
> We do something similar.  We give everyone in the company an account on the 
> internal cloud.  By default they have a user- project.  We have a 
> Jenkins job that adds metadata to all vm’s that are in user- projects.  We 
> then have additional jobs that read that metadata and determine when the VM 
> has been alive for x period of time.  At 45 days we send an email saying that 
> we will remove the vm in 15 days, and they can request a 30 day extension 
> (which really just resets some metadata information on the vm).  On day 60 
> the vm is shut down and removed.  For non user- projects, people are allowed 
> to have their vm’s created as long as they want.

What stops a user modifying the metadata? Do you have nova’s policy.json set up
so they can’t?

Sam


> 
> I believe I remember seeing something presented in the paris(?) time frame by 
> overstock(?) that would treat vm’s more as a lease.  IE You get an env for 90 
> days, it goes away at the end of that.
> 
> 
> ___
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy
> 
> On 8/3/16, 10:47 AM, "Jonathan D. Proulx"  wrote:
> 
>Hi All,
> 
>As a private cloud operatior who doesn't charge internal users, I'd
>really like a way to force users to set an exiration time on their
>instances so if they forget about them they go away.
> 
>I'd though Blazar was the thing to look at and Chameleoncloud.org
>seems to be using it (any of you around here?) but it also doesn't
>look like it's seen substantive work in a long time.
> 
>Anyone have operational exprience with blazar to share or other
>solutions?
> 
>-Jon
> 
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [gnocchi] monitoring storage use case inquiry

2016-07-31 Thread Sam Morrison
Hi Gordon,

We are using the influxDB backend and we have our retention policies set to:

Every minute for an hour
Every 10 minutes for a day
Every hour for a year

Currently we hover around 8,000 instances.
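
For what it’s worth, in raw InfluxQL those tiers look roughly like this (a sketch
only; the database name is made up and the driver manages its own schema, with
continuous queries doing the downsampling between tiers):

CREATE RETENTION POLICY "raw_1h"  ON "gnocchi" DURATION 1h   REPLICATION 1 DEFAULT
CREATE RETENTION POLICY "mid_1d"  ON "gnocchi" DURATION 1d   REPLICATION 1
CREATE RETENTION POLICY "long_1y" ON "gnocchi" DURATION 365d REPLICATION 1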

We understand the influxDB driver was taken out of gnocchi; we’re a bit annoyed, as
it wasn’t mentioned on the operators list.
We currently have it working with version 2.1 of gnocchi and are keen to see if the
driver can be added back in.

We already use influxDB for other non-OpenStack stuff and so would rather use that
than add yet another system.

I’d be interested to know which other operators use gnocchi and what backend they use.

Cheers,
Sam


> On 29 Jul 2016, at 11:30 PM, gordon chung  wrote:
> 
> hi folks,
> 
> the Gnocchi dev team is working on pushing out a new serialization 
> format to improve disk footprint and while we're at it, we're looking at 
> other changes as well. to get a bit more insight to help decide what 
> changes we make, one useful metric would be to know what your 
> requirements are for storing data. as you may know Gnocchi does not 
> store raw datapoints but aggregates data to a specified granularity (eg. 
> 5s, 30s, 1min, 1 day, etc...). what we're after is what's the longest 
> timeseries you're capturing or hoping to capture? a datapoint every 
> minute for a day/week/month/year? a datapoint every 10mins for a 
> week/month/year? something else?
> 
> your feedback would be greatly appreciated.
> 
> cheers,
> 
> -- 
> gord
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [oslo] RabbitMQ queue TTL issues moving to Liberty

2016-07-28 Thread Sam Morrison

> On 28 Jul 2016, at 10:17 PM, Dmitry Mescheryakov <dmescherya...@mirantis.com> 
> wrote:
> 
> 
> 
> 2016-07-27 2:20 GMT+03:00 Sam Morrison <sorri...@gmail.com 
> <mailto:sorri...@gmail.com>>:
> 
>> On 27 Jul 2016, at 4:05 AM, Dmitry Mescheryakov <dmescherya...@mirantis.com 
>> <mailto:dmescherya...@mirantis.com>> wrote:
>> 
>> 
>> 
>> 2016-07-26 2:15 GMT+03:00 Sam Morrison <sorri...@gmail.com 
>> <mailto:sorri...@gmail.com>>:
>> The queue TTL happens on reply queues and fanout queues. I don’t think it 
>> should happen on fanout queues. They should auto delete. I can understand 
>> the reason for having them on reply queues though so maybe that would be a 
>> way to forward?
>> 
>> Or am I missing something and it is needed on fanout queues too?
>> 
>> I would say we do need fanout queues to expire for the very same reason we 
>> want reply queues to expire instead of auto delete. In case of broken 
>> connection, the expiration provides client time to reconnect and continue 
>> consuming from the queue. In case of auto-delete queues, it was a frequent 
>> case that RabbitMQ deleted the queue before client reconnects ... along with 
>> all non-consumed messages in it.
> 
> But in the case of fanout queues, if there is a broken connection can’t the 
> service just recreate the queue if it doesn’t exist? I guess that means it 
> needs to store the state of what the queue name is though?
> 
> Yes, they could lose messages directed at them, but all the services I know 
> that consume on fanout queues have resync functionality for this very case.
> 
> If the connection is broken will oslo messaging know how to connect to the 
> same queue again anyway? I would’ve thought it would handle the disconnect 
> and then reconnect, either with the same queue name or a new queue all 
> together?
> 
> oslo.messaging handles reconnect perfectly - on connect it just 
> unconditionally declares the queue and starts consuming from it. If queue 
> already existed, the declaration operation will just be ignored by RabbitMQ.
> 
> For your earlier point that services re sync and hence messages lost in 
> fanout are not that important, I can't comment on that. But after some 
> thinking I do agree that having big expiration time for fanouts is 
> non-adequate for big deployments anyway. How about we split 
> rabbit_transient_queues_ttl into two parameters - one for reply queue and one 
> for fanout ones? In that case people concerned with messages piling up in 
> fanouts might set it to 1, which will virtually make these queues behave like 
> auto-delete ones (though I strongly recommend to leave it at least at 20 
> seconds, to give service a chance to reconnect).

Hi Dmitry,

Splitting out the config options would be great, I think that would solve our 
issues. 

Thanks,
Sam


> 
> Thanks,
> 
> Dmitry
> 
>  
> 
> Sam

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [oslo] RabbitMQ queue TTL issues moving to Liberty

2016-07-26 Thread Sam Morrison

> On 27 Jul 2016, at 4:05 AM, Dmitry Mescheryakov <dmescherya...@mirantis.com> 
> wrote:
> 
> 
> 
> 2016-07-26 2:15 GMT+03:00 Sam Morrison <sorri...@gmail.com 
> <mailto:sorri...@gmail.com>>:
> The queue TTL happens on reply queues and fanout queues. I don’t think it 
> should happen on fanout queues. They should auto delete. I can understand the 
> reason for having them on reply queues though so maybe that would be a way to 
> forward?
> 
> Or am I missing something and it is needed on fanout queues too?
> 
> I would say we do need fanout queues to expire for the very same reason we 
> want reply queues to expire instead of auto delete. In case of broken 
> connection, the expiration provides client time to reconnect and continue 
> consuming from the queue. In case of auto-delete queues, it was a frequent 
> case that RabbitMQ deleted the queue before client reconnects ... along with 
> all non-consumed messages in it.

But in the case of fanout queues, if there is a broken connection can’t the 
service just recreate the queue if it doesn’t exist? I guess that means it 
needs to store the state of what the queue name is though?

Yes, they could lose messages directed at them, but all the services I know that 
consume on fanout queues have resync functionality for this very case.

If the connection is broken will oslo messaging know how to connect to the same 
queue again anyway? I would’ve thought it would handle the disconnect and then 
reconnect, either with the same queue name or a new queue all together?

Sam


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [oslo] RabbitMQ queue TTL issues moving to Liberty

2016-07-25 Thread Sam Morrison
The queue TTL happens on reply queues and fanout queues. I don’t think it 
should happen on fanout queues. They should auto delete. I can understand the 
reason for having them on reply queues though so maybe that would be a way to 
forward?

Or am I missing something and it is needed on fanout queues too?

Cheers,
Sam



> On 25 Jul 2016, at 8:47 PM, Dmitry Mescheryakov <dmescherya...@mirantis.com> 
> wrote:
> 
> Sam,
> 
> For your case I would suggest to lower rabbit_transient_queues_ttl until you 
> are comfortable with volume of messages which comes during that time. Setting 
> the parameter to 1 will essentially replicate bahaviour of auto_delete 
> queues. But I would suggest not to set it that low, as otherwise your 
> OpenStack will suffer from the original bug. Probably a value like 20 seconds 
> should work in most cases.
> 
> I think that there is a space for improvement here - we can delete reply and 
> fanout queues on graceful shutdown. But I am not sure if it will be easy to 
> implement, as it requires services (Nova, Neutron, etc.) to stop RPC server 
> on sigint and I don't know if they do it right now.
> 
> I don't think we can make case with sigkill any better. Other than that, the 
> issue could be investigated on Neutron side, maybe number of messages could 
> be reduced there.
> 
> Thanks,
> 
> Dmitry
> 
> 2016-07-25 9:27 GMT+03:00 Sam Morrison <sorri...@gmail.com 
> <mailto:sorri...@gmail.com>>:
> We recently upgraded to Liberty and have come across some issues with queue 
> build ups.
> 
> This is due to changes in rabbit to set queue expiries as opposed to queue 
> auto delete.
> See https://bugs.launchpad.net/oslo.messaging/+bug/1515278 
> <https://bugs.launchpad.net/oslo.messaging/+bug/1515278> for more information.
> 
> The fix for this bug is in liberty and it does fix an issue however it causes 
> another one.
> 
> Every time you restart something that has a fanout queue. Eg. 
> cinder-scheduler or the neutron agents you will have
> a queue in rabbit that is still bound to the rabbitmq exchange (and so still 
> getting messages in) but no consumers.
> 
> These messages in these queues are basically rubbish and don’t need to exist. 
> Rabbit will delete these queues after 10 mins (although the default in master 
> is now changed to 30 mins)
> 
> During this time the queue will grow and grow with messages. This sets off 
> our nagios alerts and our ops guys have to deal with something that isn’t 
> really an issue. They basically delete the queue.
> 
> A bad scenario is when you make a change to your cloud that means all your 
> 1000 neutron agents are restarted, this causes a couple of dead queues per 
> agent to hang around. (port updates and security group updates) We get around 
> 25 messages / second on these queues and so you can see after 10 minutes we 
> have a ton of messages in these queues.
> 
> 1000 x 2 x 25 x 600 = 30,000,000 messages in 10 minutes to be precise.
> 
> Has anyone else been suffering with this before I raise a bug?
> 
> Cheers,
> Sam
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org 
> <mailto:OpenStack-operators@lists.openstack.org>
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators 
> <http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators>
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [oslo] RabbitMQ queue TTL issues moving to Liberty

2016-07-25 Thread Sam Morrison
We recently upgraded to Liberty and have come across some issues with queue 
build ups.

This is due to changes in rabbit to set queue expiries as opposed to queue auto 
delete. 
See https://bugs.launchpad.net/oslo.messaging/+bug/1515278 for more information.

The fix for this bug is in liberty and it does fix an issue however it causes 
another one.

Every time you restart something that has a fanout queue. Eg. cinder-scheduler 
or the neutron agents you will have 
a queue in rabbit that is still bound to the rabbitmq exchange (and so still 
getting messages in) but no consumers.

These messages in these queues are basically rubbish and don’t need to exist. 
Rabbit will delete these queues after 10 mins (although the default in master 
is now changed to 30 mins)

During this time the queue will grow and grow with messages. This sets off our 
nagios alerts and our ops guys have to deal with something that isn’t really an 
issue. They basically delete the queue.

A bad scenario is when you make a change to your cloud that means all your 1000 
neutron agents are restarted, this causes a couple of dead queues per agent to 
hang around. (port updates and security group updates) We get around 25 
messages / second on these queues and so you can see after 10 minutes we have a 
ton of messages in these queues.

1000 x 2 x 25 x 600 = 30,000,000 messages in 10 minutes to be precise.
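
For now we are looking at winding that TTL right down so the orphaned queues can’t
accumulate much, i.e. something like this (assuming the Liberty option name and
section):

[oslo_messaging_rabbit]
# seconds a reply_/fanout queue with no consumers survives before RabbitMQ expires it
rabbit_transient_queues_ttl = 60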

Has anyone else been suffering with this before I raise a bug?

Cheers,
Sam


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] OpenStack mitaka using swift as backend for glance

2016-07-17 Thread Sam Morrison
Hi Michael,

This would indicate that glance can’t find the swift endpoint in the keystone 
catalog.

You can either add it to the catalog or specify the swift url in the config.
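
If you go the catalog route it is roughly the following (hedged example; the region,
host and the tenant_id vs project_id placeholder will differ per install):

openstack service create --name swift --description "Object Storage" object-store
openstack endpoint create --region RegionOne object-store public \
  'http://swift-proxy.example.com:8080/v1/AUTH_%(tenant_id)s'
openstack endpoint create --region RegionOne object-store internal \
  'http://swift-proxy.example.com:8080/v1/AUTH_%(tenant_id)s'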

Cheers,
Sam


> On 15 Jul 2016, at 9:07 PM, Michael Stang  
> wrote:
> 
> Hi everyone,
>  
> I tried to setup swift as backend for glance in our new mitaka installation. 
> I used this in the glance-api.conf
>  
> [glance_store]
> stores = swift
> default_store = swift
> swift_store_create_container_on_put = True
> swift_store_region = RegionOne
> default_swift_reference = ref1
> swift_store_config_file = /etc/glance/glance-swift-store.conf
> 
>  
> and in the glance-swift-store.conf this
> 
> [ref1]
> auth_version = 3
> project_domain_id = default
> user_domain_id = default
> auth_address = http://controller:35357 
> user = services:swift
> key = x
> 
> When I trie now to upload an image it gets the status "killed" and this is in 
> the glance-api.log
> 
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils 
> [req-0ec16fa5-a605-47f3-99e9-9ab231116f04 de9463239010412d948df4020e9be277 
> 669e037b13874b6c871
> 2b1fd10c219f0 - - -] Failed to upload image 
> 6de45d08-b420-477b-a665-791faa232379
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils Traceback 
> (most recent call last):
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils File 
> "/usr/lib/python2.7/dist-packages/glance/api/v1/upload_utils.py", line 110, 
> in upload_d
> ata_to_store
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils 
> context=req.context)
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils File 
> "/usr/lib/python2.7/dist-packages/glance_store/backend.py", line 344, in 
> store_add_to_b
> ackend
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils 
> verifier=verifier)
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils File 
> "/usr/lib/python2.7/dist-packages/glance_store/capabilities.py", line 226, in 
> op_checke
> r
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils return 
> store_op_fun(store, *args, **kwargs)
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils File 
> "/usr/lib/python2.7/dist-packages/glance_store/_drivers/swift/store.py", line 
> 532, in a
> dd
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils 
> allow_reauth=need_chunks) as manager:
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils File 
> "/usr/lib/python2.7/dist-packages/glance_store/_drivers/swift/store.py", line 
> 1170, in
> get_manager_for_store
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils store, 
> store_location, context, allow_reauth)
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils File 
> "/usr/lib/python2.7/dist-packages/glance_store/_drivers/swift/connection_manager.py",
>  l
> ine 64, in __init__
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils 
> self.storage_url = self._get_storage_url()
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils File 
> "/usr/lib/python2.7/dist-packages/glance_store/_drivers/swift/connection_manager.py",
>  l
> ine 160, in _get_storage_url
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils raise 
> exceptions.BackendException(msg)
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils 
> BackendException: Cannot find swift service endpoint : The resource could not 
> be found. (HTTP
> 404)
> 2016-07-15 12:21:44.379 14230 ERROR glance.api.v1.upload_utils
> 
>  
> anyone an idea what i'm missing in the config file oder what might be the 
> problem?
> 
> Thanks and kind regards,
> Michael
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Kilo ->-> Mitaka anyone have notes :)

2016-06-26 Thread Sam Morrison
I’ve done kilo -> mitaka with Keystone and all worked fine. Nothing special I 
needed to do.

If you’re wanting to do live upgrades with nova you can’t skip a version from 
my understanding.

Sam


> On 25 Jun 2016, at 4:16 AM, Jonathan Proulx  wrote:
> 
> Hi All,
> 
> I about to start testing for our Kilo->Mitaka migration.
> 
> I seem to recall many (well a few at least) people who were looking to
> do a direct Kilo to Mitaka upgrade (skipping Liberty).
> 
> Blue Box apparently just did and I read Stefano's blog[1] about it,
> and while it gives me hope my plan is possible it's not realy a
> technical piece.
> 
> I'm on my 7th version of OpenStack for this cloud now so not my first
> redeo as they say, but other than read Liberty and Mitaka release
> notes carefully and test like crazy wonder if anyone has seen specific
> issues or has specific advice for skiping a step here?
> 
> Thanks,
> -Jon
> 
> -- 
> [1] 
> https://www.dreamhost.com/blog/2016/06/21/dreamcompute-goes-m-for-mitaka-without-unleashing-the-dragons/
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Keystone's DB_SYNC from Kilo to Liberty

2016-06-26 Thread Sam Morrison
That usually means your DB is at version 86 (you can check the migrations table in
the DB to confirm; I think it’s called migrate_version or something similar)
BUT the keystone code you’re running is older and doesn’t know about version 86.

Is it possible the keystone version you’re running is older and doesn’t know
about that migration?
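
A quick way to check what the database thinks it is at (the table name comes from
sqlalchemy-migrate, so treat it as an assumption, and db_version may not exist in
every release):

keystone-manage db_version
mysql keystone -e 'SELECT * FROM migrate_version;'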

Sam






> On 23 Jun 2016, at 9:35 PM, Alvise Dorigo  wrote:
> 
> Hi,
> I've a Kilo installation which I want to migrate to Liberty.
> I've installed the Liberty Keystone's RPMs and configured the minimun to 
> upgrade the DB schema ("connection" parameter in the [database] section of 
> keystone.conf).
> Then, I've tried to run
> 
>su -s /bin/sh -c "keystone-manage db_sync" keystone
> 
> but it's failed with the following error:
> 
>2016-06-23 13:20:50.191 22423 CRITICAL keystone [-] KeyError: 
> 
> which is quite useless.
> 
> Any suggestion ?
> 
> many thanks,
> 
>Alvise
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Openstack-Operators] Keystone cache strategies

2016-06-21 Thread Sam Morrison

> On 22 Jun 2016, at 10:58 AM, Matt Fischer <m...@mattfischer.com> wrote:
> 
> Have you setup token caching at the service level? Meaning a Memcache cluster 
> that glance, Nova etc would talk to directly? That will really cut down the 
> traffic.
> 
Yeah we have that although the default cache time is 10 seconds for revocation 
lists. I might just set that to some large number to limit this traffic a bit.
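
i.e. something like this in each service’s config (option names as I remember them
from keystonemiddleware, so double-check them):

[keystone_authtoken]
memcached_servers = cache01:11211,cache02:11211
# default is 10 seconds; raising it cuts the revocation-list traffic right down
revocation_cache_time = 300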

Sam



> On Jun 21, 2016 5:55 PM, "Sam Morrison" <sorri...@gmail.com 
> <mailto:sorri...@gmail.com>> wrote:
> 
>> On 22 Jun 2016, at 9:42 AM, Matt Fischer <m...@mattfischer.com 
>> <mailto:m...@mattfischer.com>> wrote:
>> 
>> On Tue, Jun 21, 2016 at 4:21 PM, Sam Morrison <sorri...@gmail.com 
>> <mailto:sorri...@gmail.com>> wrote:
>>> 
>>> On 22 Jun 2016, at 1:45 AM, Matt Fischer <m...@mattfischer.com 
>>> <mailto:m...@mattfischer.com>> wrote:
>>> 
>>> I don't have a solution for you, but I will concur that adding revocations 
>>> kills performance especially as that tree grows. I'm curious what you guys 
>>> are doing revocations on, anything other than logging out of Horizon?
>>> 
>> 
>> Is there a way to disable revocations?
>> 
>> Sam
>> 
>> 
>> I don't think so. There is no no-op driver for it that I can see. I've not 
>> tried it but maybe setting the expiration_buffer to a negative value would 
>> cause them to not be retained?
>> 
>> They expire at the rate your tokens expire (plus a buffer of 30 min by 
>> default) and under typical operation are not generated very often, so 
>> usually when you have say 10-20ish in the tree, its not too bad. It gets way 
>> worse when you have say 1000 of them. However, in our cloud anyway we just 
>> don't generate many. The only things that generate them are Horizon log outs 
>> and test suites that add and delete users and groups. If I knew we were 
>> generating anymore I'd probably setup an icinga alarm for them. When the 
>> table gets large after multiple test runs or we want to do perf tests we end 
>> up truncating the table in the DB. However that clearly is not a best 
>> security practice.
> 
> 
> Our token TTLs are very low so I’d be willing to remove revocation. If you take 
> glance images out, the bulk of the data going through our load balancers on API 
> requests is requests to the revocation URL. 
> 
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Openstack-Operators] Keystone cache strategies

2016-06-21 Thread Sam Morrison

> On 22 Jun 2016, at 9:42 AM, Matt Fischer <m...@mattfischer.com> wrote:
> 
> On Tue, Jun 21, 2016 at 4:21 PM, Sam Morrison <sorri...@gmail.com 
> <mailto:sorri...@gmail.com>> wrote:
>> 
>> On 22 Jun 2016, at 1:45 AM, Matt Fischer <m...@mattfischer.com 
>> <mailto:m...@mattfischer.com>> wrote:
>> 
>> I don't have a solution for you, but I will concur that adding revocations 
>> kills performance especially as that tree grows. I'm curious what you guys 
>> are doing revocations on, anything other than logging out of Horizon?
>> 
> 
> Is there a way to disable revocations?
> 
> Sam
> 
> 
> I don't think so. There is no no-op driver for it that I can see. I've not 
> tried it but maybe setting the expiration_buffer to a negative value would 
> cause them to not be retained?
> 
> They expire at the rate your tokens expire (plus a buffer of 30 min by 
> default) and under typical operation are not generated very often, so usually 
> when you have say 10-20ish in the tree, its not too bad. It gets way worse 
> when you have say 1000 of them. However, in our cloud anyway we just don't 
> generate many. The only things that generate them are Horizon log outs and 
> test suites that add and delete users and groups. If I knew we were 
> generating anymore I'd probably setup an icinga alarm for them. When the 
> table gets large after multiple test runs or we want to do perf tests we end 
> up truncating the table in the DB. However that clearly is not a best 
> security practice.


Our token TTLs are very low so I’d be willing to remove revocation. If you take
glance images out, the bulk of the data going through our load balancers on API
requests is requests to the revocation URL.


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Openstack-Operators] Keystone cache strategies

2016-06-21 Thread Sam Morrison
> 
> On 22 Jun 2016, at 1:45 AM, Matt Fischer  wrote:
> 
> I don't have a solution for you, but I will concur that adding revocations 
> kills performance especially as that tree grows. I'm curious what you guys 
> are doing revocations on, anything other than logging out of Horizon?
> 

Is there a way to disable revocations?

Sam




> On Tue, Jun 21, 2016 at 5:45 AM, Jose Castro Leon  > wrote:
> Hi all,
> 
> While doing scale tests on our infrastructure, we observed some increase in 
> the response times of our keystone servers.
> 
> After further investigation we observed that we have a hot key in our cache 
> configuration (this means than all keystone servers are checking this key 
> quite frequently)
> 
> We are using a pool of memcache servers for hosting the cache and the 
> solution does not seem ideal at this scale.
> 
>  
> 
> The key turns out to be the revocation tree, that is evaluated in every token 
> validation.  If the revocation tree object stored is big enough it can kill 
> the network connectivity
> 
> on the cache server affecting the whole infrastructure as the identity 
> servers needs to check the key before validating a token.
> 
>  
> 
> On our scale tests after the cleanup, we have 250 requests/second for an 
> object of 500KB that is a throughput of 1Gbit/sec that saturate the network 
> link of the cache server.
> 
>  
> 
> We are checking other strategies like redis or mongo, but we would like to 
> know if you have already seen this before? If so what you have done?
> 
>  
> 
> Kind regards,
> 
> Jose
> 
>  
> 
> Jose Castro Leon
> 
> CERN IT-CM-RPS   tel:+41.22.76.74272
> 
> mob: +41.75.41.19222
> 
> fax:+41.22.76.67955
> 
> Office: 31-1-026  CH-1211  Geneve 23
> 
> email: jose.castro.l...@cern.ch 
>  
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Duplicates and confusion in nova policy.json files

2016-06-15 Thread Sam Morrison
Now that policy files in nova Liberty apparently work, I’m going through the stock
example one and I see that there are duplicate entries in the policy.json, like:

compute:create:forced_host
os_compute_api:servers:create:forced_host

Which one do I use to change who can do forced_host? both or a specific one?
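
For now I’ve set both entries to the same rule until I know which one is actually
honoured, e.g. (rule name taken from the stock file, adjust to taste):

"compute:create:forced_host": "rule:admin_api",
"os_compute_api:servers:create:forced_host": "rule:admin_api",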

Anyone have any ideas?

Cheers,
Sam



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] oslo_messaging, rabbit, ssl and mitaka and xenial

2016-06-02 Thread Sam Morrison
Hi all,

We’ve been trying out some mitaka packages as well as some Xenial hosts and 
have been having some issues with rabbit and SSL.

If using rabbitMQ 3.6.x on Trusty I can’t get a mitaka host (oslo_messaging 
4.6.1, python-amqp 1.4.9) to connect to rabbit over SSL. 

If I use rabbitMQ 3.6.x on Xenial I can get it to work BUT I need to change 
some settings on rabbit to allow some weaker ciphers.

I had to add the following to rabbitmq.config (found on some random blog; I haven’t
investigated exactly what needed to change, sorry):

{versions, ['tlsv1.2', 'tlsv1.1', tlsv1]},
{ciphers, ["ECDHE-ECDSA-AES256-GCM-SHA384","ECDHE-RSA-AES256-GCM-SHA384",
           "ECDHE-ECDSA-AES256-SHA384","ECDHE-RSA-AES256-SHA384","ECDHE-ECDSA-DES-CBC3-SHA",
           "ECDH-ECDSA-AES256-GCM-SHA384","ECDH-RSA-AES256-GCM-SHA384","ECDH-ECDSA-AES256-SHA384",
           "ECDH-RSA-AES256-SHA384","DHE-DSS-AES256-GCM-SHA384","DHE-DSS-AES256-SHA256",
           "AES256-GCM-SHA384","AES256-SHA256","ECDHE-ECDSA-AES128-GCM-SHA256",
           "ECDHE-RSA-AES128-GCM-SHA256","ECDHE-ECDSA-AES128-SHA256","ECDHE-RSA-AES128-SHA256",
           "ECDH-ECDSA-AES128-GCM-SHA256","ECDH-RSA-AES128-GCM-SHA256","ECDH-ECDSA-AES128-SHA256",
           "ECDH-RSA-AES128-SHA256","DHE-DSS-AES128-GCM-SHA256","DHE-DSS-AES128-SHA256",
           "AES128-GCM-SHA256","AES128-SHA256","ECDHE-ECDSA-AES256-SHA",
           "ECDHE-RSA-AES256-SHA","DHE-DSS-AES256-SHA","ECDH-ECDSA-AES256-SHA",
           "ECDH-RSA-AES256-SHA","AES256-SHA","ECDHE-ECDSA-AES128-SHA",
           "ECDHE-RSA-AES128-SHA","DHE-DSS-AES128-SHA","ECDH-ECDSA-AES128-SHA",
           "ECDH-RSA-AES128-SHA","AES128-SHA"]},
{honor_cipher_order, true},


Has anyone else had a play with this and got it working where a mitaka host can
talk to a rabbitmq server running on trusty?
The version of erlang is the difference here and I’m pretty sure that is where
the change is.

Cheers,
Sam


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Ops Meetup event sizes

2016-06-01 Thread Sam Morrison
As an operator, what should I prioritise now that the main summit is changing: the
thing formerly known as the summit, or the ops mid-cycle?

Will there be operator sessions at the summit still?

Sorry if this has already been mentioned, but I’m still not 100% sure how operators
fit into the new model.

Cheers,
Sam


> On 1 Jun 2016, at 6:12 PM, Matt Jarvis  wrote:
> 
> Hi All
> 
> As part of the work we've been doing on the Ops Meetups Team working group, 
> we've recently had some discussion on the ideal attendee numbers for future 
> Ops Mid Cycles which we'd like as much feedback as possible on from the wider 
> community.
> 
> The general consensus in the discussions we've had, and from the Austin 
> summit sessions and the Manchester feedback session, is that between 150-200 
> attendees should be the maximum size. 
> 
> The thinking behind this has been ( in no particular order ) : 
> 
> 1. The aim of the events is to encourage active participation, and there is 
> an optimal session size for this ( ~ 50 - 100 people )
> 2. Keep the events mainly focused on operators and developers attending
> 3. Bigger events are more difficult to organise and deliver - challenges of 
> wifi, finding appropriate venues, more logistical staff needed, larger sums 
> of sponsorship to be raised etc.
> 
> The Ops Meetups Team would welcome input from as many of you all as possible 
> on this topic, so what are your thoughts ? 
> 
> Matt
> 
> DataCentred Limited registered in England and Wales no. 
> 05611763
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [glance] Proposal for a mid-cycle virtual sync on operator issues

2016-05-25 Thread Sam Morrison
I’m hoping some people from the Large Deployment Team can come along. It’s not 
a good time for me in Australia but hoping someone else can join in.

Sam


> On 26 May 2016, at 2:16 AM, Nikhil Komawar  wrote:
> 
> Hello,
> 
> 
> Firstly, I would like to thank Fei Long for bringing up a few operator
> centric issues to the Glance team. After chatting with him on IRC, we
> realized that there may be more operators who would want to contribute
> to the discussions to help us take some informed decisions.
> 
> 
> So, I would like to call for a 2 hour sync for the Glance team along
> with interested operators on Thursday June 9th, 2016 at 2000UTC. 
> 
> 
> If you are interested in participating please RSVP here [1], and
> participate in the poll for the tool you'd prefer. I've also added a
> section for Topics and provided a template to document the issues clearly.
> 
> 
> Please be mindful of everyone's time and if you are proposing issue(s)
> to be discussed, come prepared with well documented & referenced topic(s).
> 
> 
> If you've feedback that you are not sure if appropriate for the
> etherpad, you can reach me on irc (nick: nikhil).
> 
> 
> [1] https://etherpad.openstack.org/p/newton-glance-and-ops-midcycle-sync
> 
> -- 
> 
> Thanks,
> Nikhil Komawar
> Newton PTL for OpenStack Glance
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [craton] Versions of Python to support for Craton

2016-05-24 Thread Sam Morrison
I’m in favour of using 3.5. We are in the process of moving things to ubuntu 
xenial and 3.5 is native there.

BTW when is Craton planning on getting into openstack gerrit etc?

Sam




> On 25 May 2016, at 6:20 AM, Jim Baker  wrote:
> 
> tl;dr - any reason why Craton should support Python 2.7 for your use case?
> 
> First, some background: Craton is a fleet management tool under active 
> development for standing up and maintaining OpenStack clouds. It does so by 
> supporting inventory and audit/remediation workflows, both at scale and being 
> pluggable. This architecture follows the model used by Rackspace public 
> cloud; you can think of Craton as being the "2.0" version of what we use at 
> Rackspace. Currently most of the developers are part of OSIC (so Rackspace, 
> Intel). Craton is built on top of a variety of Oslo libraries (notably 
> TaskFlow), but otherwise has no dependence on OpenStack components. Craton 
> itself in turn relies on other tooling like OpenStack Ansible to actually do 
> its work - we have no agents. More details here: 
> https://etherpad.openstack.org/p/Fleet_Management 
> 
> 
> We plan to make Craton a big tent OpenStack project.
> 
> Since we are so brand new, we are trying to make the most of being 
> greenfield. Ubuntu policy is to target new Python development only against 
> Python 3. Other distros are similarly favoring Python 3; see 
> https://wiki.openstack.org/wiki/Python3#Status_of_Python_3_in_Linux_distributions
>  
> 
> 
> Currently we run tox tests against both Python 2.7 and Python 3 (specifically 
> 3.4, 3.5). For interested operators, is there a good reason why we should 
> continue supporting 2.7?
> 
> Such change will let us:
> Reduce development effort, because we will have not to use awkward constructs 
> for dual support of Python 2.7 and Python 3.
> Enable use of new functionality without backports (examples: chainmap, 
> futurist, ipaddress, etc).
> Take advantage of new functionality that has no backport support at all. 
> Python 2.7 at this point only gets security updates.
> We may also want to further simplify by requiring a minimum of Python 3.5. 
> Doing so would enable us to take advantage of static type hinting, for higher 
> quality code. Feedback on that is also appreciated.
> 
> - Jim
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Nova cells patches against Liberty

2016-05-19 Thread Sam Morrison
ERROR oslo_messaging.rpc.dispatcher   File 
> "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 864, in 
> obj_load_attr
> 2016-05-19 08:15:07.741 24589 ERROR oslo_messaging.rpc.dispatcher 
> reason='attribute %s not lazy-loadable' % attrname)
> 2016-05-19 08:15:07.741 24589 ERROR oslo_messaging.rpc.dispatcher 
> ObjectActionError: Object action obj_load_attr failed because: attribute id 
> not lazy-loadable
> 2016-05-19 08:15:07.741 24589 ERROR oslo_messaging.rpc.dispatcher
> 
> 
> The problem is that _ensure_cells_system_metadata is referencing self.name 
> before the object is actually loaded from the database, so only uuid and I 
> think vm_state are set at that point.  The name accessor method references 
> self.id, which (for some reason) is not lazy loadable.
> 
> So I tried moving the _ensure_cells_system_metadata later in the save method, 
> after the object is loaded from the database (here: 
> https://github.com/openstack/nova/blob/stable/liberty/nova/objects/instance.py#L676
>  )  That seems to work in practice, but it causes some of the tox tests to 
> fail:  https://gist.github.com/misterdorm/cc7dfd235ebcc2a23009b9115b58e4d5
> 
> Anyways, I’m at a bit of a loss here and curious if anybody might have some 
> better insights.
> 
> Thanks,
> Mike
> 
> 
> 
> From:  Sam Morrison <sorri...@gmail.com>
> Date:  Wednesday, May 4, 2016 at 6:23 PM
> To:  Mike Dorman <mdor...@godaddy.com>
> Cc:  OpenStack Operators <openstack-operators@lists.openstack.org>
> Subject:  Re: [Openstack-operators] Nova cells patches against Liberty
> 
> 
> Hi Mike,
> 
> I’ve also been working on these and have some updated patches at:
> 
> https://github.com/NeCTAR-RC/nova/commits/stable/liberty-cellsv1
> 
> There are a couple of patches that you have in your tree that need updating 
> for Liberty. Mainly around supporting the v2.1 API and more things moved to 
> objects. I have also written some tests for some more of them too. I haven’t 
> tested all of these functionally
> yet but they pass all tox tests.
> 
> Cheers,
> Sam
> 
> 
> 
> 
> On 5 May 2016, at 4:19 AM, Mike Dorman <mdor...@godaddy.com> wrote:
> 
> I went ahead and pulled out the Nova cells patches we’re running against 
> Liberty so that others can use them if so desired.
> 
> https://github.com/godaddy/openstack-nova-patches
> 
> Usual disclaimers apply here, your mileage may vary, these may not work as 
> expected in your environment, etc.  We have tested these at a basic level 
> (unit tests), but are not running these for Liberty in real production yet.
> 
> Mike
> 
> 
> 
> 
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> 
> 
> 
> 
> 


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [glance] glance-registry deprecation: Request for feedback

2016-05-15 Thread Sam Morrison

> On 14 May 2016, at 5:36 AM, Flavio Percoco <fla...@redhat.com> wrote:
> 
>> On 5/12/16 9:20 PM, Sam Morrison wrote:
>> 
>>   We find glance registry quite useful. Have a central glance-registry api 
>> is useful when you have multiple datacenters all with glance-apis and 
>> talking back to a central registry service. I guess they could all talk back 
>> to the central DB server but currently that would be over the public 
>> Internet for us. Not really an issue, we can work around it.
>> 
>>   The major thing that the registry has given us has been rolling upgrades. 
>> We have been able to upgrade our registry first then one by one upgrade our 
>> API servers (we have about 15 glance-apis)
> 
> I'm curious to know how you did this upgrade, though. Did you shutdown your
> registry nodes, upgraded the database and then re-started them? Did you 
> upgraded
> one registry node at a time?
> 
> I'm asking because, as far as I can tell, the strategy you used for upgrading
> the registry nodes is the one you would use to upgrade the glance-api nodes
> today. Shutting down all registry nodes would live you with unusable 
> glance-api
> nodes anyway so I'd assume you did a partial upgrade or something similar to
> that.

Yeah, if glance supported versioned objects then yes this would be great. 

We only have 3 glance-registries and so upgrading these first is a lot easier 
than upgrading all ~15 of our glance-apis at once.

Sam




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [glance] glance-registry deprecation: Request for feedback

2016-05-12 Thread Sam Morrison
We find glance registry quite useful. Having a central glance-registry API is useful
when you have multiple datacenters, all with glance-apis talking back to a central
registry service. I guess they could all talk back to the central DB server, but
currently that would be over the public Internet for us. Not really an issue; we can
work around it.

The major thing the registry has given us has been rolling upgrades. We have been
able to upgrade our registry first, then upgrade our API servers one by one (we have
about 15 glance-apis).

I don’t think we would’ve been able to do that if all the glance-apis were 
talking to the DB, (At least not in glance’s current state)

Sam




> On 12 May 2016, at 1:51 PM, Flavio Percoco  wrote:
> 
> Greetings,
> 
> The Glance team is evaluating the needs and usefulness of the Glance Registry
> service and this email is a request for feedback from the overall community
> before the team moves forward with anything.
> 
> Historically, there have been reasons to create this service. Some deployments
> use it to hide database credentials from Glance public endpoints, others use 
> it
> for scaling purposes and others because v1 depends on it. This is a good time
> for the team to re-evaluate the need of these services since v2 doesn't depend
> on it.
> 
> So, here's the big question:
> 
> Why do you think this service should be kept around?
> 
> Summit etherpad: 
> https://etherpad.openstack.org/p/newton-glance-registry-deprecation
> 
> Flavio
> -- 
> @flaper87
> Flavio Percoco
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Nova cells patches against Liberty

2016-05-04 Thread Sam Morrison
Hi Mike,

I’ve also been working on these and have some updated patches at:

https://github.com/NeCTAR-RC/nova/commits/stable/liberty-cellsv1

There are a couple of patches that you have in your tree that need updating for 
Liberty. Mainly around supporting the v2.1 API and more things moved to 
objects. I have also written some tests for some more of them too. I haven’t 
tested all of these functionally yet but they pass all tox tests.

Cheers,
Sam



> On 5 May 2016, at 4:19 AM, Mike Dorman  wrote:
> 
> I went ahead and pulled out the Nova cells patches we’re running against 
> Liberty so that others can use them if so desired.
> 
> https://github.com/godaddy/openstack-nova-patches 
> 
> 
> Usual disclaimers apply here, your mileage may vary, these may not work as 
> expected in your environment, etc.  We have tested these at a basic level 
> (unit tests), but are not running these for Liberty in real production yet.
> 
> Mike
> 
> 
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Nova-network -> Neutron Migration

2015-12-08 Thread Sam Morrison

> On 9 Dec 2015, at 2:16 PM, Tom Fifield  wrote:
> 
> NeCTAR used this script (https://github.com/NeCTAR-RC/novanet2neutron ) with 
> success to do a live nova-net to neutron using Juno.

That’s correct, except we were on Kilo. I’m not sure I would try to do this on
Icehouse though; neutron was pretty immature back then, so it could be a lot of
pain.

Cheers,
Sam


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [cinder] upgrade juno -> kilo (near live)

2015-12-01 Thread Sam Morrison
Hi All,

We have got some patches that allow a near-live upgrade of Cinder from Juno ->
Kilo. They allow juno cinder-volumes to live with kilo API and schedulers.

You basically need to: ensure your juno hosts all have the patch, run the kilo DB
migrations, then upgrade the API and scheduler to kilo. Then you can upgrade your
cinder-volumes one by one (which is handy if you have lots of them like we do);
a rough sketch of the sequence is below.
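
(Sketch only; package names and restart mechanics depend on your distro.)

# 1. on every juno cinder-volume host: apply the backported patch and restart cinder-volume
# 2. run the kilo schema migrations from a host with the kilo code installed:
cinder-manage db sync
# 3. upgrade and restart cinder-api and cinder-scheduler on kilo
# 4. upgrade the cinder-volume hosts to kilo one at a time, restarting each as you go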

Patch for juno is 
https://github.com/NeCTAR-RC/cinder/commit/e70ad0bcf7f1932e6bea6c893e886a293d973bf5
 


You also need a minor patch to kilo to stop it deleting certain columns.

https://github.com/NeCTAR-RC/cinder/commit/e49ca7d95b5fffa58dc7a2d17f347e550d29f6a0
 


These have worked for us but I stress we haven’t tested all functionality. We 
have tested:
create
delete
attach
detach
snapshot-create
snapshot-delete

Cheers,
Sam

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] How do I install specific versions of openstack/puppet-keystone

2015-11-25 Thread Sam Morrison
Can you get R10k to NOT install dependencies listed in metadata etc.?  
We use puppet-librarian and it can’t, so we have to change every puppet module
(e.g. [1]) as some of the dependencies break things for us.

Sam


[1] 
https://github.com/NeCTAR-RC/puppet-glance/commit/818e876eafe24252d9847474f68300bc0f706b22
 




> On 26 Nov 2015, at 6:46 AM, Kris G. Lindgren  wrote:
> 
> We use R10k as well.
> 
> ___
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy
> 
> From: Matt Fischer >
> Date: Wednesday, November 25, 2015 at 12:16 PM
> To: Saverio Proto >
> Cc: "openstack-operators@lists.openstack.org 
> " 
>  >
> Subject: Re: [Openstack-operators] How do I install specific versions of 
> openstack/puppet-keystone
> 
> I'd second the vote for r10k. You need to do this however otherwise you'll 
> get the master branch:
> 
> mod 'nova',
>   :git => 'https://github.com/openstack/puppet-nova.git',
>   :ref => 'stable/kilo'
> 
> mod 'glance',
>   :git => 'https://github.com/openstack/puppet-glance.git',
>   :ref => 'stable/kilo'
> 
> mod 'cinder',
>   :git => 'https://github.com/openstack/puppet-cinder.git',
>   :ref => 'stable/kilo'
> 
> ...
> 
> 
> On Wed, Nov 25, 2015 at 11:34 AM, Saverio Proto  > wrote:
> Hello,
> 
> you can use r10k
> 
> go in a empty folder, create a file called Puppetfile with this content:
> 
> mod 'openstack-ceilometer'
> mod 'openstack-cinder'
> mod 'openstack-glance'
> mod 'openstack-heat'
> mod 'openstack-horizon'
> mod 'openstack-keystone'
> mod 'openstack-neutron'
> mod 'openstack-nova'
> mod 'openstack-openstack_extras'
> mod 'openstack-openstacklib'
> mod 'openstack-vswitch'
> 
> the type the commands:
> gem install r10k
> r10k puppetfile install -v
> 
> Look at r10k documentation for howto specify a version number of the modules.
> 
> Saverio
> 
> 
> 
> 2015-11-25 18:43 GMT+01:00 Oleksiy Molchanov  >:
> > Hi,
> >
> > You can provide --version parameter to 'puppet module install' or even use
> > puppet-librarian with puppet in standalone mode. This tool is solving all
> > your issues described.
> >
> > BR,
> > Oleksiy.
> >
> > On Wed, Nov 25, 2015 at 6:16 PM, Russell Cecala  > >
> > wrote:
> >>
> >> Hi,
> >>
> >> I am struggling with setting up OpenStack via the OpenStack community
> >> puppet modules.  For example
> >> https://github.com/openstack/puppet-keystone/tree/stable/kilo 
> >> 
> >>
> >> If I do what the README.md file says to do ...
> >>
> >> example% puppet module install puppetlabs/keystone
> >>
> >> What release of the module would I get?  Do I get Liberty, Kilo, Juno?
> >> And what if I needed to be able to install the Liberty version on one
> >> system
> >> but need the Juno version for yet another system?  How can I ensure the
> >> the right dependencies like cprice404-inifile and puppetlabs-mysql get
> >> installed?
> >>
> >> Thanks
> >>
> >> ___
> >> OpenStack-operators mailing list
> >> OpenStack-operators@lists.openstack.org
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> >>
> >
> >
> > ___
> > OpenStack-operators mailing list
> > OpenStack-operators@lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> >
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list

Re: [Openstack-operators] Informal Ops Meetup?

2015-10-29 Thread Sam Morrison
I’ll be there. I talked to Tom too and he said there may be a room we can use;
otherwise there is plenty of space around the dev lounge.

See you tomorrow.

Sam


> On 29 Oct 2015, at 6:02 PM, Xav Paice  wrote:
> 
> Suits me :)
> 
> On 29 October 2015 at 16:39, Kris G. Lindgren  > wrote:
> Hello all,
> 
> I am not sure if you guys have looked at the schedule for Friday… but its all 
> working groups.  I was talking with a few other operators and the idea came 
> up around doing an informal ops meetup tomorrow.  So I wanted to float this 
> idea by the mailing list and see if anyone was interested in trying to do an 
> informal ops meet up tomorrow.
> 
> ___
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Neutron DHCP failover bug

2015-09-30 Thread Sam Morrison

> On 30 Sep 2015, at 3:48 pm, John Dewey  wrote:
> 
> Why not run neutron dhcp agents on both nodes? 

Yeah, have tried this too.

We don’t do this due to metadata: DHCP adds a static route to the metadata
agent (which is on the same network node that runs DHCP). If the network node
is down, the instances lose metadata if they happened to get a lease from that
DHCP server. With failover the IP address transfers to the other network node
and metadata keeps working for all.

Make sense?

Sam


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Neutron DHCP failover bug

2015-09-29 Thread Sam Morrison
Hi All,

We are running Kilo and have come across this bug 
https://bugs.launchpad.net/neutron/+bug/1410067 


Pretty easy to replicate: have two network nodes, shut down one of them, and DHCP
etc. moves over to the new host fine, except that doing a port-show on the DHCP
port shows it still on the old host and in state BUILD.
Everything works but the DB is in the wrong state.

Just wondering if anyone else sees this and if so if they know the associated 
fix in Liberty that addresses this.

Cheers,
Sam

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [cinder] [all] The future of Cinder API v1

2015-09-28 Thread Sam Morrison
Yeah we’re still using v1 as the clients that are packaged with most distros 
don’t support v2 easily.

E.g. Ubuntu Trusty ships version 1.1.1; I just updated our “volume” endpoint to
point to v2 (we have a volumev2 endpoint too) and the client breaks.

$ cinder list
ERROR: OpenStack Block Storage API version is set to 1 but you are accessing a 
2 endpoint. Change its value through --os-volume-api-version or 
env[OS_VOLUME_API_VERSION].
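
For the record, with a client that does actually implement v2 the workaround from
the error message is simply:

export OS_VOLUME_API_VERSION=2
cinder list
# or per command:
cinder --os-volume-api-version 2 list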

Sam


> On 29 Sep 2015, at 8:34 am, Matt Fischer  wrote:
> 
> Yes, people are probably still using it. Last time I tried to use V2 it 
> didn't work because the clients were broken, and then it went back on the 
> bottom of my to do list. Is this mess fixed?
> 
> http://lists.openstack.org/pipermail/openstack-operators/2015-February/006366.html
>  
> 
> 
> On Mon, Sep 28, 2015 at 4:25 PM, Ivan Kolodyazhny  > wrote:
> Hi all,
> 
> As you may know, we've got 2 APIs in Cinder: v1 and v2. Cinder v2 API was 
> introduced in Grizzly and v1 API is deprecated since Juno.
> 
> After [1] is merged, Cinder API v1 is disabled in gates by default. We've got 
> a filed bug [2] to remove Cinder v1 API at all.
> 
> 
> According to Deprecation Policy [3] looks like we are OK to remote it. But I 
> would like to ask Cinder API users if any still use API v1.
> Should we remove it at all Mitaka release or just disable by default in the 
> cinder.conf?
> 
> AFAIR, only Rally doesn't support API v2 now and I'm going to implement it 
> asap.
> 
> [1] https://review.openstack.org/194726  
> [2] https://bugs.launchpad.net/cinder/+bug/1467589 
> 
> [3] 
> http://lists.openstack.org/pipermail/openstack-dev/2015-September/073576.html 
> 
> Regards,
> Ivan Kolodyazhny
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Potential deprecation of cinder.cross_az_attach option in nova

2015-09-23 Thread Sam Morrison
We very much rely on this and I see this is already merged! Great, another patch
I have to manage locally.

I don’t understand what the confusion is. We have multiple availability zones 
in nova and each zone has a corresponding cinder-volume service(s) in the same 
availability zone.

We don’t want people attaching a volume from one zone to another; the network
won’t allow it anyway, as the zones are in different network domains and different
data centres.
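
For reference, the behaviour we rely on is just this in nova.conf (option name as I
understand it on Liberty; older releases spelt it cinder_cross_az_attach under
[DEFAULT]):

[cinder]
cross_az_attach = False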

I will reply in the mailing list post on the dev channel but it seems it’s too 
late.

Sam



> On 24 Sep 2015, at 6:49 am, Matt Riedemann  wrote:
> 
> I wanted to bring this to the attention of the operators mailing list in case 
> someone is relying on the cinder.cross_az_attach.
> 
> There is a -dev thread here [1] that started this discussion.  That led to a 
> change proposed to deprecate the cinder.cross_az_attach option in nova [2].
> 
> This is for deprecation in mitaka and removal in N.  If this affects you, 
> please speak up in the mailing list or in the review.
> 
> [1] 
> http://lists.openstack.org/pipermail/openstack-dev/2015-September/075264.html
> [2] https://review.openstack.org/#/c/226977/
> 
> -- 
> 
> Thanks,
> 
> Matt Riedemann
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Adding v1 LIKE support to python-glanceclient releases 1.x.x

2015-09-09 Thread Sam Morrison
I’d just like things like image-list to work under the new client :-)

https://bugs.launchpad.net/python-glanceclient/+bug/1492887


> On 10 Sep 2015, at 8:41 am, Nikhil Komawar  wrote:
> 
> Hi all,
> 
> We recently release python-glanceclient 1.0.0 and it has the default
> shell version as v2. This may result into some scripts not detecting the
> change by default and discomfort to an extent.
> 
> So, I am reaching out to this list with the hope of getting some
> feedback on the requirements, pros and cons you all think exist for
> adding some support for v1 like calls as hidden command to the default
> python-glanceclient shell API that is v2 centric by default. This should
> unbreak the scripts to an extent and give a warning to users to update
> the scripts in a stipulated time period so that they use the v2 API.
> 
> Here's the proposed patch https://review.openstack.org/#/c/219802/ . We
> are not yet sure if we need to get it merged by tomorrow so that it can
> be in stable/liberty by the end of the week. There has been one request
> to get those in and the feedback we received from the developer
> community was neutral.
> 
> In order to form an opinion on what's best for our users, we need some
> feedback on this topic. Please send us your thoughts as soon as possible
> and we will try to accommodate the same if permissible within the
> technical limitations:
> 
> 1. Whether you would like these commands added as hidden commands so
> that shell API works like before (to extent possible).
> 2. You would like to use v2 shell API of the client by default and don't
> care about this commit.
> 3. You don't care about the change. Your scripts are awesome and can
> adjust to the upgrade of the client easily.
> 4. Anything else.
> 
> -- 
> 
> Thanks,
> Nikhil
> 
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] kilo - neutron - ipset problems?

2015-09-03 Thread Sam Morrison
Hi Kris,

We are moving to neutron in a month and are hitting this in our pre prod 
environment too.

Sam


> On 2 Sep 2015, at 6:32 am, Kris G. Lindgren  wrote:
> 
> Hello,
> 
> We ran into this again today.
> 
> I created bug: https://bugs.launchpad.net/neutron/+bug/1491131 for this.  
> With the log files for ~10 seconds before the issue happened to the first 
> couple ipset delete failures.
> 
> 
> 
> 
> On 8/20/15, 6:37 AM, "Miguel Angel Ajo"  wrote:
> 
>> Hi Kris,
>> 
>>   I'm adding Shi Han Zhang to the thread,
>> 
>>   I was involved in some refactors during kilo and Han Zhang in some 
>> extra fixes during Liberty [1] [2] [3],
>> 
>>   Could you get us some logs of such failures to see what was 
>> happening around the failure time?, as a minimum we should
>> post the log error traces to a bug in https://bugs.launchpad.net/neutron
>> 
>>We will be glad to use such information to make the ipset more 
>> fault tolerant, and try to identify the cause of the
>> possible race conditions.
>> 
>> 
>> [1] https://review.openstack.org/#/c/187483/
>> [2] https://review.openstack.org/190991
>> [3] https://review.openstack.org/#/c/187433/
>> 
>> 
>> 
>> Kris G. Lindgren wrote:
>>> 
>>> We have been using ipsets since juno.  Twice now since our kilo 
>>> upgrade we have had issues with ipsets blowing up on a compute node.
>>> 
>>> The first time, was iptables was referencing an ipset that was either 
>>> no longer there or was not added, and was trying to apply the iptables 
>>> config every second and dumping the full iptables-restore output into 
>>> the log when it failed at TRACE level.
>>> Second time, was that ipsets was failing to remove an element that was 
>>> no longer there.
>>> 
>>> For #1 I solved by restarting the neutron-openvswitch-agent.  For #2 
>>> we just added the entry that ipsets was trying to remove.  It seems 
>>> like we are having some race conditions under kilo that were not 
>>> present under juno (or we managed to run it for 6+ months without it 
>>> biting us).
>>> 
>>> Is anyone else seeing the same problems?  I am noticing some commits 
>>> reverting/re-adding around ipsets in kilo and liberty so trying to 
>>> confirm if I need to open a new bug on this.
>>> 
>>> 
>>> Kris Lindgren
>>> Senior Linux Systems Engineer
>>> GoDaddy, LLC.
>>> 
>>> ___
>>> OpenStack-operators mailing list
>>> OpenStack-operators@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> 
>> 
>> Kris G. Lindgren wrote:
>>> We have been using ipsets since juno.  Twice now since our kilo upgrade we 
>>> have had issues with ipsets blowing up on a compute node.
>>> 
>>> The first time, was iptables was referencing an ipset that was either no 
>>> longer there or was not added, and was trying to apply the iptables config 
>>> every second and dumping the full iptables-restore output into the log 
>>> when it failed at TRACE level.
>>> Second time, was that ipsets was failing to remove an element that was no 
>>> longer there.
>>> 
>>> For #1 I solved by restarting the neutron-openvswitch-agent.  For #2 we 
>>> just added the entry that ipsets was trying to remove.  It seems like we 
>>> are having some race conditions under kilo that were not present under juno 
>>> (or we managed to run it for 6+ months without it biting us).
>>> 
>>> Is anyone else seeing the same problems?  I am noticing some commits 
>>> reverting/re-adding around ipsets in kilo and liberty so trying to confirm 
>>> if I need to open a new bug on this.
>>> 
>>> 
>>> Kris Lindgren
>>> Senior Linux Systems Engineer
>>> GoDaddy, LLC.
>>> 
>>> ___
>>> OpenStack-operators mailing list
>>> OpenStack-operators@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Juno and Kilo Interoperability

2015-08-26 Thread Sam Morrison
Yeah we run a multitude of versions at the same time. We usually run N and N-1 
in the same env but have also done N-2 (eg. Havana, Icehouse and Juno)

Currently we are mainly Juno (keystone, heat, ceilometer, cinder) with a couple 
of icehouse things lying around. We are in the process of upgrading to Kilo so 
some of our nova control is kilo now.

With clients the only issue I’ve had is with the designate client not being 
backwards compatible between icehouse and juno.
Ceilometer changed the way they signed messages between Icehouse and Juno, which 
was a pain, so we had to set up parallel virtual hosts and collectors to push it 
into mongo.

All the APIs are pretty stable so it shouldn’t really matter what version of, 
say, keystone works with what version of nova etc. 
We basically take it for granted now although of course test in your own env.

With nova make sure you set upgrade_levels so your nova control can talk to your 
computes that are on version N-1 etc.
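
For reference that’s just a nova.conf setting on the control nodes; while mid-upgrade it looks roughly like this (a sketch only, use whatever release your computes are still on):

[upgrade_levels]
compute = juno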

Sam


 On 27 Aug 2015, at 3:09 am, David Medberry openst...@medberry.net wrote:
 
 Hi Eren,
 
 I'm pretty sure NECTaR is doing diff versions at different sites in a widely 
 distributed way.
 
 https://www.openstack.org/user-stories/nectar/ 
 https://www.openstack.org/user-stories/nectar/
 
 I've cc'd Sam as well. He's your man.
 
 On Wed, Aug 26, 2015 at 5:24 AM, Eren Türkay er...@skyatlas.com 
 mailto:er...@skyatlas.com wrote:
 Hello operators,
 
 I am wondering if anyone is using different versions of Openstack in different
 sites.
 
 We have our first site which is Juno, and we are now having another site where
 we are planning to deploy Kilo. Does anyone have experience with different
 versions of installation? Particularly, our Horizon and other clients will be
 Juno, but they will talk to secondary site which is Kilo. Inferring from the
 release notes, Kilo API looks backward compatible with Juno, so I'm a little
 optimistic about it but still I'm not sure.
 
 Any help is appreciated,
 Eren
 
 --
 Eren Türkay, System Administrator
 https://skyatlas.com/ https://skyatlas.com/ | +90 850 885 0357 
 tel:%2B90%20850%20885%200357
 
 Yildiz Teknik Universitesi Davutpasa Kampusu
 Teknopark Bolgesi, D2 Blok No:107
 Esenler, Istanbul Pk.34220
 
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org 
 mailto:OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators 
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 
 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [puppet] module dependencies and different openstack versions

2015-07-27 Thread Sam Morrison
We currently use our own custom puppet modules to deploy openstack. I have been 
looking into the official openstack modules and have a few barriers to 
switching.

We are looking at doing this a project at a time but the modules have a lot 
of dependencies. Eg. they all depend on the keystone module and try to do 
things in keystone such as creating users, service endpoints etc.

This is a pain as I don’t want it to mess with keystone (for one we don’t 
support setting endpoints via an API) but also we don’t want to move to the 
official keystone module at the same time. We have some custom keystone stuff 
which means we may never move to the official keystone puppet module.

The neutron module pulls in the vswitch module but we don’t use vswitch and it 
doesn’t seem to be a requirement of the module so maybe doesn’t need to be in 
metadata dependencies?

It looks as if all the openstack puppet modules are designed to be used all at 
once? Does anyone else have these kinds of issues? It would be great if eg. the 
neutron module would just manage neutron and not try and do things in nova, 
keystone, mysql etc.


The other issue we have is that we have different services in openstack running 
different versions. Currently we have Kilo, Juno and Icehouse versions of 
different bits in the same cloud. It seems as if the puppet modules are 
designed just to manage one openstack version? Are there any thoughts on making 
them support different versions at the same time? Does this work?

Thanks,
Sam



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [keystone][all] Deprecating slash ('/') in project names

2015-07-06 Thread Sam Morrison
Do you mean project names or project IDs?

Sam


 On 3 Jul 2015, at 12:12 am, Henrique Truta henriquecostatr...@gmail.com 
 wrote:
 
 Hi everyone,
 
 In Kilo, keystone introduced the concept of Hierarchical Multitenancy[1], 
 which allows cloud operators to organize projects in hierarchies. This 
 concept is evolving in Liberty, with the addition of the Reseller use 
 case[2], where among other features, it’ll have hierarchies of domains by 
 making the domain concept a feature of projects and not a different entity: 
 from now on, every domain will be treated as a project that has the 
 “is_domain” property set to True.
 
 Currently, getting a project scoped token can be made by only passing the 
 project name and the domain it belongs to, once project names are unique 
 between domains. However with those hierarchies of projects, in M we intend 
 to remove this constraint in order to make a project name unique only in its 
 level in the hierarchy (project parent). In other words, it won’t be possible 
 to have sibling projects with the same name. For example, the following 
 hierarchy will be valid:
 
         A       - project with the domain feature
        / \
       B   C     - “pure” projects, children of A
       |   |
       A   B     - “pure” projects, children of B and C respectively
 
 Therefore, the cloud user faces some problems when getting a project scoped 
 token by name to projects A or B, since keystone won’t be able to distinguish 
 them only by their names. The best way to solve this problem is providing the 
 full hierarchy, like “A/B/A”, “A/B”, “A/C/B” and so on.
 
 To achieve this, we intend to deprecate the “/” character in project names in 
 Liberty and prohibit it in M, removing/replacing this character in a database 
 migration**.
 
 Do you have some strong reason to keep using this character in project names? 
 How bad would it be for existing deploys? We’d like to hear from you.
 
 Best regards,
 Henrique
 
 ** LDAP as assignment backend does not support Hierarchical Multitenancy. 
 This change will be only applied to SQL backends.
 [1] 
 http://specs.openstack.org/openstack/keystone-specs/specs/juno/hierarchical_multitenancy.html
  
 http://specs.openstack.org/openstack/keystone-specs/specs/juno/hierarchical_multitenancy.html
 [2] 
 http://specs.openstack.org/openstack/keystone-specs/specs/kilo/reseller.html 
 http://specs.openstack.org/openstack/keystone-specs/specs/kilo/reseller.html
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Help improve Neutron!

2015-06-22 Thread Sam Morrison
Hi Piet,

Is this a UX focused interview?

Sam


 On 23 Jun 2015, at 6:04 am, Kruithof, Piet pieter.c.kruithof...@hp.com 
 wrote:
 
 Hi Folks,
 
 The OpenStack UX team is looking for six people that would be willing to 
 participate in a one hour interview.
 
 We’re specifically looking for folks that are currently using the 
 Nova-Networks in their cloud and have not moved to Neutron.  As always, the 
 results will be shared with the community.
 
 Thanks,
 
 Piet Kruithof
 Sr UX Architect/HP Helion Cloud
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Help improve Neutron!

2015-06-22 Thread Sam Morrison
From our point of view the issues aren’t user experience, they are things 
like these (being discussed in the Large Deployments Team):

[1] - https://etherpad.openstack.org/p/Network_Segmentation_Usecases
[2] - https://bugs.launchpad.net/neutron/+bug/1458890
[3] - https://review.openstack.org/#/c/180803/

Cheers,
Sam


 On 23 Jun 2015, at 10:09 am, Kruithof, Piet pieter.c.kruithof...@hp.com 
 wrote:
 
 Hi Sam!
 
 The study is focused on user experience.  However, rather than show
 mockups we’re trying to understand what is preventing folks from moving
 from Nova-Networks to Neutron.
 
 Piet
 
 On 6/22/15, 5:54 PM, Sam Morrison sorri...@gmail.com wrote:
 
 Hi Piet,
 
 Is this a UX focused interview?
 
 Sam
 
 
 On 23 Jun 2015, at 6:04 am, Kruithof, Piet
 pieter.c.kruithof...@hp.com wrote:
 
 Hi Folks,
 
 The OpenStack UX team is looking for six people that would be willing
 to participate in a one hour interview.
 
 We’re specifically looking for folks that are currently using the
 Nova-Networks in their cloud and have not moved to Neutron.  As always,
 the results will be shared with the community.
 
 Thanks,
 
 Piet Kruithof
 Sr UX Architect/HP Helion Cloud
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 
 


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova] [neutron] Re: How do your end users use networking?

2015-06-17 Thread Sam Morrison

 On 17 Jun 2015, at 8:35 pm, Neil Jerram neil.jer...@metaswitch.com wrote:
 
 Hi Sam,
 
 On 17/06/15 01:31, Sam Morrison wrote:
 We at NeCTAR are starting the transition to neutron from nova-net and 
 neutron almost does what we want.
 
 We have 10 “public” networks and 10 “service” networks and depending on 
 which compute node you land on you get attached to one of them.
 
 In neutron speak we have multiple shared externally routed provider 
 networks. We don’t have any tenant networks or any other fancy stuff yet.
 How I’ve currently got this set up is by creating 10 networks and subsequent 
 subnets eg. public-1, public-2, public-3 … and service-1, service-2, 
 service-3 and so on.
 
 In nova we have made a slight change in allocate_for_instance [1] whereby 
 the compute node has designated hardcoded network_ids for the public and 
 service network it is physically attached to.
 We have also made changes in the nova API so users can’t select a network 
 and the neutron endpoint is not registered in keystone.
 
 That all works fine but ideally I want a user to be able to choose if they 
 want a public and or service network. We can’t let them as we have 10 public 
 networks, we almost need something in neutron like a “network group” or 
 something that allows a user to select “public” and it allocates them a port 
 in one of the underlying public networks.
 
 This begs the question: why have you defined 10 public-N networks, instead of 
 just one public network?

I think this has all been answered but just in case.
There are multiple reasons. We don’t have a single IPv4 range big enough for 
our cloud, don’t want the broadcast domain too be massive, the compute nodes 
are in different data centres etc. etc.
Basically it’s not how our underlying physical network is set up and we can’t 
change that.

Sam


 
 I tried going down the route of having 1 public and 1 service network in 
 neutron then creating 10 subnets under each. That works until you get to 
 things like dhcp-agent and metadata agent although this looks like it could 
 work with a few minor changes. Basically I need a dhcp-agent to be spun up 
 per subnet and ensure they are spun up in the right place.
 
 Why the 10 subnets?  Is it to do with where you actually have real L2 
 segments, in your deployment?
 
 Thanks,
   Neil
 
 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova] [neutron] Re: How do your end users use networking?

2015-06-16 Thread Sam Morrison

 On 17 Jun 2015, at 10:56 am, Armando M. arma...@gmail.com wrote:
 
 
 
 On 16 June 2015 at 17:31, Sam Morrison sorri...@gmail.com 
 mailto:sorri...@gmail.com wrote:
 We at NeCTAR are starting the transition to neutron from nova-net and neutron 
 almost does what we want.
 
 We have 10 “public” networks and 10 “service” networks and depending on which 
 compute node you land on you get attached to one of them.
 
 In neutron speak we have multiple shared externally routed provider networks. 
 We don’t have any tenant networks or any other fancy stuff yet.
 How I’ve currently got this set up is by creating 10 networks and subsequent 
 subnets eg. public-1, public-2, public-3 … and service-1, service-2, 
 service-3 and so on.
 
 In nova we have made a slight change in allocate_for_instance [1] whereby the 
 compute node has designated hardcoded network_ids for the public and 
 service network it is physically attached to.
 We have also made changes in the nova API so users can’t select a network and 
 the neutron endpoint is not registered in keystone.
 
 That all works fine but ideally I want a user to be able to choose if they 
 want a public and or service network. We can’t let them as we have 10 public 
 networks, we almost need something in neutron like a “network group” or 
 something that allows a user to select “public” and it allocates them a port 
 in one of the underlying public networks.
 
 I tried going down the route of having 1 public and 1 service network in 
 neutron then creating 10 subnets under each. That works until you get to 
 things like dhcp-agent and metadata agent although this looks like it could 
 work with a few minor changes. Basically I need a dhcp-agent to be spun up 
 per subnet and ensure they are spun up in the right place.
 
 I’m not sure what the correct way of doing this is. What are other people doing 
 in the interim until this kind of use case can be done in Neutron?
 
 Would something like [1] be adequate to address your use case? If not, I'd 
 suggest you to file an RFE bug (more details in [2]), so that we can keep the 
 discussion focused on this specific case.
 
 HTH
 Armando
 
 [1] https://blueprints.launchpad.net/neutron/+spec/rbac-networks 
 https://blueprints.launchpad.net/neutron/+spec/rbac-networks
That’s not applicable in this case. We don’t care about what the tenants are in 
this case.

 [2] 
 https://github.com/openstack/neutron/blob/master/doc/source/policies/blueprints.rst#neutron-request-for-feature-enhancements
  
 https://github.com/openstack/neutron/blob/master/doc/source/policies/blueprints.rst#neutron-request-for-feature-enhancements
The bug Kris mentioned outlines all I want too I think.

Sam


 
  
 
 Cheers,
 Sam
 
 [1] 
 https://github.com/NeCTAR-RC/nova/commit/1bc2396edc684f83ce471dd9dc9219c4635afb12
  
 https://github.com/NeCTAR-RC/nova/commit/1bc2396edc684f83ce471dd9dc9219c4635afb12
 
 
 
  On 17 Jun 2015, at 12:20 am, Jay Pipes jaypi...@gmail.com 
  mailto:jaypi...@gmail.com wrote:
 
  Adding -dev because of the reference to the Neutron "Get me a network" 
  spec. Also adding [nova] and [neutron] subject markers.
 
  Comments inline, Kris.
 
  On 05/22/2015 09:28 PM, Kris G. Lindgren wrote:
  During the Openstack summit this week I got to talk to a number of other
  operators of large Openstack deployments about how they do networking.
   I was happy, surprised even, to find that a number of us are using a
  similar type of networking strategy.  That we have similar challenges
  around networking and are solving it in our own but very similar way.
   It is always nice to see that other people are doing the same things
  as you or see the same issues as you are and that you are not crazy.
  So in that vein, I wanted to reach out to the rest of the Ops Community
  and ask one pretty simple question.
 
  Would it be accurate to say that most of your end users want almost
  nothing to do with the network?
 
  That was my experience at AT&T, yes. The vast majority of end users could 
  not care less about networking, as long as the connectivity was reliable, 
  performed well, and they could connect to the Internet (and have others 
  connect from the Internet to their VMs) when needed.
 
  In my experience what the majority of them (both internal and external)
  want is to consume from Openstack a compute resource, a property of
  which is it that resource has an IP address.  They, at most, care about
  which network they are on.  Where a network is usually an arbitrary
  definition around a set of real networks, that are constrained to a
  location, in which the company has attached some sort of policy.  For
  example, I want to be in the production network vs's the xyz lab
  network, vs's the backup network, vs's the corp network.  I would say
  for Godaddy, 99% of our use cases would be defined as: I want a compute
  resource in the production network zone, or I want a compute resource in
  this other network zone.  The end user only cares

Re: [Openstack-operators] [nova] [neutron] Re: How do your end users use networking?

2015-06-16 Thread Sam Morrison
We at NeCTAR are starting the transition to neutron from nova-net and neutron 
almost does what we want.

We have 10 “public” networks and 10 “service” networks and depending on which 
compute node you land on you get attached to one of them.

In neutron speak we have multiple shared externally routed provider networks. 
We don’t have any tenant networks or any other fancy stuff yet.
How I’ve currently got this set up is by creating 10 networks and subsequent 
subnets eg. public-1, public-2, public-3 … and service-1, service-2, service-3 
and so on.
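
Each pair is roughly the following (names, physnets and the address range are just placeholders, and flat vs vlan depends on what the physical side looks like):

neutron net-create public-1 --shared --provider:network_type flat --provider:physical_network public1
neutron subnet-create public-1 203.0.113.0/24 --name public-1-subnet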

In nova we have made a slight change in allocate_for_instance [1] whereby the 
compute node has designated hardcoded network_ids for the public and service 
network it is physically attached to.
We have also made changes in the nova API so users can’t select a network and 
the neutron endpoint is not registered in keystone.

That all works fine but ideally I want a user to be able to choose if they want 
a public and or service network. We can’t let them as we have 10 public 
networks, we almost need something in neutron like a “network group” or 
something that allows a user to select “public” and it allocates them a port in 
one of the underlying public networks.

I tried going down the route of having 1 public and 1 service network in 
neutron then creating 10 subnets under each. That works until you get to things 
like dhcp-agent and metadata agent although this looks like it could work with 
a few minor changes. Basically I need a dhcp-agent to be spun up per subnet and 
ensure they are spun up in the right place.

I’m not sure what the correct way of doing this is. What are other people doing in 
the interim until this kind of use case can be done in Neutron?

Cheers,
Sam
 
[1] 
https://github.com/NeCTAR-RC/nova/commit/1bc2396edc684f83ce471dd9dc9219c4635afb12



 On 17 Jun 2015, at 12:20 am, Jay Pipes jaypi...@gmail.com wrote:
 
 Adding -dev because of the reference to the Neutron "Get me a network" spec. 
 Also adding [nova] and [neutron] subject markers.
 
 Comments inline, Kris.
 
 On 05/22/2015 09:28 PM, Kris G. Lindgren wrote:
 During the Openstack summit this week I got to talk to a number of other
 operators of large Openstack deployments about how they do networking.
  I was happy, surprised even, to find that a number of us are using a
 similar type of networking strategy.  That we have similar challenges
 around networking and are solving it in our own but very similar way.
  It is always nice to see that other people are doing the same things
 as you or see the same issues as you are and that you are not crazy.
 So in that vein, I wanted to reach out to the rest of the Ops Community
 and ask one pretty simple question.
 
 Would it be accurate to say that most of your end users want almost
 nothing to do with the network?
 
 That was my experience at AT&T, yes. The vast majority of end users could not 
 care less about networking, as long as the connectivity was reliable, 
 performed well, and they could connect to the Internet (and have others 
 connect from the Internet to their VMs) when needed.
 
 In my experience what the majority of them (both internal and external)
 want is to consume from Openstack a compute resource, a property of
 which is it that resource has an IP address.  They, at most, care about
 which network they are on.  Where a network is usually an arbitrary
 definition around a set of real networks, that are constrained to a
 location, in which the company has attached some sort of policy.  For
 example, I want to be in the production network vs's the xyz lab
 network, vs's the backup network, vs's the corp network.  I would say
 for Godaddy, 99% of our use cases would be defined as: I want a compute
 resource in the production network zone, or I want a compute resource in
 this other network zone.  The end user only cares that the IP the vm
 receives works in that zone, outside of that they don't care any other
 property of that IP.  They do not care what subnet it is in, what vlan
 it is on, what switch it is attached to, what router its attached to, or
 how data flows in/out of that network.  It just needs to work. We have
 also found that by giving the users a floating ip address that can be
 moved between vm's (but still constrained within a network zone) we
 can solve almost all of our users asks.  Typically, the internal need
 for a floating ip is when a compute resource needs to talk to another
 protected internal or external resource. Where it is painful (read:
 slow) to have the acl's on that protected resource updated. The external
 need is from our hosting customers who have a domain name (or many) tied
 to an IP address and changing IP's/DNS is particularly painful.
 
 This is precisely my experience as well.
 
 Since the vast majority of our end users don't care about any of the
 technical network stuff, we spend a large amount of time/effort in
 abstracting or hiding the technical stuff from the users view. Which 

Re: [Openstack-operators] FYI: Rabbit Heartbeat Patch Landed

2015-05-03 Thread Sam Morrison
I’ve found a couple of issues with this:

1. Upgrading the packages in ubuntu doesn’t seem to work, you need to remove 
them all then install fresh. Some conflicts with file paths etc.
2. With juno heat the requirements.txt has upper limits on the versions for 
oslo deps. I just removed these and it seems to work fine.
3. When using amqp_durable_queues it will no longer declare the exchanges with 
this argument set, so this gives errors when declaring the exchange. (I think 
this is a bug, at least an upgrade bug, as this will affect people moving 
juno -> kilo.)
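
If you want to check what the broker has actually got declared before and after the upgrade, something like this on the rabbit node shows it:

rabbitmqctl list_exchanges name durable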




 On 4 May 2015, at 9:08 am, Sam Morrison sorri...@gmail.com wrote:
 
 We’re running:
 
 kombu: 3.0.7
 amqp: 1.4.5
 rabbitmq, 3.3.5
 erlang: R14B04
 
 
 On 2 May 2015, at 1:51 am, Kris G. Lindgren klindg...@godaddy.com wrote:
 
 We are running:
 kombu 3.0.24
 amqp 1.4.6
 rabbitmq 3.4.0
 erlang R16B-03.10
 
 
 Kris Lindgren
 Senior Linux Systems Engineer
 GoDaddy, LLC.
 
 
 
 On 5/1/15, 9:41 AM, Davanum Srinivas dava...@gmail.com wrote:
 
 may i request folks post the versions of rabbitmq and pip versions of
 kombu and amqp libraries?
 
 thanks,
 dims
 
 On Fri, May 1, 2015 at 11:29 AM, Mike Dorman mdor...@godaddy.com wrote:
 We’ve been running the new oslo.messaging under Juno for about the last
 month, and we’ve seen success with it, too.
 
 From: Sam Morrison
 Date: Thursday, April 30, 2015 at 11:02 PM
 To: David Medberry
 Cc: OpenStack Operators
 Subject: Re: [Openstack-operators] FYI: Rabbit Heartbeat Patch Landed
 
 Great, let me know how you get on.
 
 
 On 1 May 2015, at 12:21 pm, David Medberry openst...@medberry.net
 wrote:
 
 Great news Sam. I'll pull those packages into my Juno devel environment
 and
 see if it makes any difference.
 Much appreciated for the rebuilds/links.
 
 Also, good to connect with you at ... Connect AU.
 
 On Thu, Apr 30, 2015 at 7:30 PM, Sam Morrison sorri...@gmail.com
 wrote:
 
 I managed to get a juno environment with oslo.messaging 1.8.1 working
 in
 ubuntu 14.04
 
 I have a debian repo with all the required dependancies at:
 
 deb http://download.rc.nectar.org.au/nectar-ubuntu
 trusty-juno-testing-oslo main
 
 All it includes is ubuntu official packages from vivid.
 
 Have installed in our test environment and all looking good so far
 although haven¹t done much testing yet.
 
 Sam
 
 
 
 On 21 Mar 2015, at 2:35 am, David Medberry openst...@medberry.net
 wrote:
 
 Hi Sam,
 
 I started down the same path yesterday. If I have any success today,
 I'll
 post to this list.
 
 I'm also going to reach out to the Ubuntu Server (aka Cloud) team and
 so
 if they can throw up a PPA with this for Juno quickly (which they will
 likely NOT do but it doesn't hurt to ask.) We need to get the
 stable/juno
 team on board with this backport/regression.
 
 On Fri, Mar 20, 2015 at 4:14 AM, Sam Morrison sorri...@gmail.com
 wrote:
 
 I’ve been trying to build a ubuntu deb of this in a juno environment.
 It’s a bit of a nightmare as they have changed all the module names
 from
 oslo.XXX to oslo_XXX
 
 Have fixed those up with a few sed replaces and had to remove support
 for
 aioeventlet as the dependencies aren’t in the ubuntu cloud archive
 juno.
 
 Still have a couple of tests failing but I think it *should* work in
 on
 our juno hosts.
 
 I have a branch of the 1.8.0 release that I’m trying to build against
 Juno here [1] and I’m hoping that it will be easy to integrate the
 heartbeat
 code.
 I’m sure there is lots of people that would be keen to get a latest
 version of oslo.messaging working against a juno environment. What is
 the
 best way to make that happen though?
 
 Cheers,
 Sam
 
 [1] https://github.com/NeCTAR-RC/oslo.messaging/commits/nectar/1.8.0
 
 
 
 On 20 Mar 2015, at 8:59 am, Davanum Srinivas dava...@gmail.com
 wrote:
 
 So, talking about experiments, here's one:
 https://review.openstack.org/#/c/165981/
 
 Trying to run oslo.messaging trunk against stable/juno of the rest
 of
 the components.
 
 -- dims
 
 On Thu, Mar 19, 2015 at 5:10 PM, Matt Fischer m...@mattfischer.com
 wrote:
 I think everyone is highly interested in running this change or a
 newer OSLO
 messaging in general + this change in Juno rather than waiting for
 Kilo.
 Hopefully everyone could provide updates as they do experiments.
 
 
 On Thu, Mar 19, 2015 at 1:22 PM, Kevin Bringard (kevinbri)
 kevin...@cisco.com wrote:
 
 Can't speak to that concept, but I did try cherry picking the
 commit
 into
 the stable/juno branch of oslo.messaging and there'd definitely be
 some work
 to be done there. I fear that could mean havoc for trying to just
 use
 master
 oslo as well, but a good idea to try for sure.
 
 -- Kevin
 
 On Mar 19, 2015, at 1:13 PM, Jesse Keating j...@bluebox.net
 wrote:
 
 On 3/19/15 10:15 AM, Davanum Srinivas wrote:
 Apologies. i was waiting for one more changeset to merge.
 
 Please try oslo.messaging master branch
 https://github.com/openstack/oslo.messaging/commits/master/
 
 (you need

Re: [Openstack-operators] FYI: Rabbit Heartbeat Patch Landed

2015-05-03 Thread Sam Morrison

 On 4 May 2015, at 11:49 am, Davanum Srinivas dava...@gmail.com wrote:
 
 Sam,
 
 1. Weird, did you pick it up from ubuntu cloud archive? we could raise
 a bug against them

I’m basically upgrading the oslo packages from the juno cloud archive to the 
packages in vivid. So maybe it will work if upgrading to kilo cloud archive 
packages.

 2. yes, not much we can do about that now i guess.
 3. yes, Can you please log a bug for this?

Actually this has just fixed itself so I’m trying to figure out what is going 
on here. Looks fine now.

Sam



 
 Thanks!
 
 On Sun, May 3, 2015 at 9:45 PM, Sam Morrison sorri...@gmail.com wrote:
 I’ve found a couple of issues with this:
 
 1. Upgrading the packages in ubuntu doesn’t seem to work, you need to remove 
 them all then install fresh. Some conflicts with file paths etc.
 2. With juno heat the requirements.txt has upper limits on the versions for 
 oslo deps. I just removed these and it seems to work fine.
 3. When using amqp_durable_queues it will no longer declare the exchanges 
 with this argument set so this will give errors when declaring the exchange. 
 (I think this is a bug, at least an upgrade bug as this will affect people 
 moving juno - kilo)
 
 
 
 
 On 4 May 2015, at 9:08 am, Sam Morrison sorri...@gmail.com wrote:
 
 We’re running:
 
 kombu: 3.0.7
 amqp: 1.4.5
 rabbitmq, 3.3.5
 erlang: R14B04
 
 
 On 2 May 2015, at 1:51 am, Kris G. Lindgren klindg...@godaddy.com wrote:
 
 We are running:
 kombu 3.0.24
 amqp 1.4.6
 rabbitmq 3.4.0
 erlang R16B-03.10
 
 
 Kris Lindgren
 Senior Linux Systems Engineer
 GoDaddy, LLC.
 
 
 
 On 5/1/15, 9:41 AM, Davanum Srinivas dava...@gmail.com wrote:
 
 may i request folks post the versions of rabbitmq and pip versions of
 kombu and amqp libraries?
 
 thanks,
 dims
 
 On Fri, May 1, 2015 at 11:29 AM, Mike Dorman mdor...@godaddy.com wrote:
 We’ve been running the new oslo.messaging under Juno for about the last
 month, and we’ve seen success with it, too.
 
 From: Sam Morrison
 Date: Thursday, April 30, 2015 at 11:02 PM
 To: David Medberry
 Cc: OpenStack Operators
 Subject: Re: [Openstack-operators] FYI: Rabbit Heartbeat Patch Landed
 
 Great, let me know how you get on.
 
 
 On 1 May 2015, at 12:21 pm, David Medberry openst...@medberry.net
 wrote:
 
 Great news Sam. I'll pull those packages into my Juno devel environment
 and
 see if it makes any difference.
 Much appreciated for the rebuilds/links.
 
 Also, good to connect with you at ... Connect AU.
 
 On Thu, Apr 30, 2015 at 7:30 PM, Sam Morrison sorri...@gmail.com
 wrote:
 
 I managed to get a juno environment with oslo.messaging 1.8.1 working
 in
 ubuntu 14.04
 
 I have a debian repo with all the required dependancies at:
 
 deb http://download.rc.nectar.org.au/nectar-ubuntu
 trusty-juno-testing-oslo main
 
 All it includes is ubuntu official packages from vivid.
 
 Have installed in our test environment and all looking good so far
 although haven¹t done much testing yet.
 
 Sam
 
 
 
 On 21 Mar 2015, at 2:35 am, David Medberry openst...@medberry.net
 wrote:
 
 Hi Sam,
 
 I started down the same path yesterday. If I have any success today,
 I'll
 post to this list.
 
 I'm also going to reach out to the Ubuntu Server (aka Cloud) team and
 so
 if they can throw up a PPA with this for Juno quickly (which they will
 likely NOT do but it doesn't hurt to ask.) We need to get the
 stable/juno
 team on board with this backport/regression.
 
 On Fri, Mar 20, 2015 at 4:14 AM, Sam Morrison sorri...@gmail.com
 wrote:
 
 I’ve been trying to build a ubuntu deb of this in a juno environment.
 It’s a bit of a nightmare as they have changed all the module names
 from
 oslo.XXX to oslo_XXX
 
 Have fixed those up with a few sed replaces and had to remove support
 for
 aioeventlet as the dependencies aren’t in the ubuntu cloud archive
 juno.
 
 Still have a couple of tests failing but I think it *should* work in
 on
 our juno hosts.
 
 I have a branch of the 1.8.0 release that I’m trying to build against
 Juno here [1] and I’m hoping that it will be easy to integrate the
 heartbeat
 code.
 I’m sure there is lots of people that would be keen to get a latest
 version of oslo.messaging working against a juno environment. What is
 the
 best way to make that happen though?
 
 Cheers,
 Sam
 
 [1] https://github.com/NeCTAR-RC/oslo.messaging/commits/nectar/1.8.0
 
 
 
 On 20 Mar 2015, at 8:59 am, Davanum Srinivas dava...@gmail.com
 wrote:
 
 So, talking about experiments, here's one:
 https://review.openstack.org/#/c/165981/
 
 Trying to run oslo.messaging trunk against stable/juno of the rest
 of
 the components.
 
 -- dims
 
 On Thu, Mar 19, 2015 at 5:10 PM, Matt Fischer m...@mattfischer.com
 wrote:
 I think everyone is highly interested in running this change or a
 newer OSLO
 messaging in general + this change in Juno rather than waiting for
 Kilo.
 Hopefully everyone could provide updates as they do experiments.
 
 
 On Thu, Mar 19, 2015 at 1:22 PM

Re: [Openstack-operators] FYI: Rabbit Heartbeat Patch Landed

2015-04-30 Thread Sam Morrison
I managed to get a juno environment with oslo.messaging 1.8.1 working in ubuntu 
14.04

I have a debian repo with all the required dependancies at:

deb http://download.rc.nectar.org.au/nectar-ubuntu trusty-juno-testing-oslo main

All it includes is ubuntu official packages from vivid.

Have installed in our test environment and all looking good so far although 
haven’t done much testing yet.

Sam



 On 21 Mar 2015, at 2:35 am, David Medberry openst...@medberry.net wrote:
 
 Hi Sam,
 
 I started down the same path yesterday. If I have any success today, I'll 
 post to this list.
 
 I'm also going to reach out to the Ubuntu Server (aka Cloud) team and so if 
 they can throw up a PPA with this for Juno quickly (which they will likely 
 NOT do but it doesn't hurt to ask.) We need to get the stable/juno team on 
 board with this backport/regression.
 
 On Fri, Mar 20, 2015 at 4:14 AM, Sam Morrison sorri...@gmail.com 
 mailto:sorri...@gmail.com wrote:
 I’ve been trying to build a ubuntu deb of this in a juno environment. It’s a 
 bit of a nightmare as they have changed all the module names from oslo.XXX to 
 oslo_XXX
 
 Have fixed those up with a few sed replaces and had to remove support for 
 aioeventlet as the dependencies aren’t in the ubuntu cloud archive juno.
 
 Still have a couple of tests failing but I think it *should* work in on our 
 juno hosts.
 
 I have a branch of the 1.8.0 release that I’m trying to build against Juno 
 here [1] and I’m hoping that it will be easy to integrate the heartbeat code.
 I’m sure there is lots of people that would be keen to get a latest version 
 of oslo.messaging working against a juno environment. What is the best way to 
 make that happen though?
 
 Cheers,
 Sam
 
 [1] https://github.com/NeCTAR-RC/oslo.messaging/commits/nectar/1.8.0 
 https://github.com/NeCTAR-RC/oslo.messaging/commits/nectar/1.8.0
 
 
 
  On 20 Mar 2015, at 8:59 am, Davanum Srinivas dava...@gmail.com 
  mailto:dava...@gmail.com wrote:
 
  So, talking about experiments, here's one:
  https://review.openstack.org/#/c/165981/ 
  https://review.openstack.org/#/c/165981/
 
  Trying to run oslo.messaging trunk against stable/juno of the rest of
  the components.
 
  -- dims
 
  On Thu, Mar 19, 2015 at 5:10 PM, Matt Fischer m...@mattfischer.com 
  mailto:m...@mattfischer.com wrote:
  I think everyone is highly interested in running this change or a newer 
  OSLO
  messaging in general + this change in Juno rather than waiting for Kilo.
  Hopefully everyone could provide updates as they do experiments.
 
 
  On Thu, Mar 19, 2015 at 1:22 PM, Kevin Bringard (kevinbri)
  kevin...@cisco.com mailto:kevin...@cisco.com wrote:
 
  Can't speak to that concept, but I did try cherry picking the commit into
  the stable/juno branch of oslo.messaging and there'd definitely be some 
  work
  to be done there. I fear that could mean havoc for trying to just use 
  master
  oslo as well, but a good idea to try for sure.
 
  -- Kevin
 
  On Mar 19, 2015, at 1:13 PM, Jesse Keating j...@bluebox.net 
  mailto:j...@bluebox.net wrote:
 
  On 3/19/15 10:15 AM, Davanum Srinivas wrote:
  Apologies. i was waiting for one more changeset to merge.
 
  Please try oslo.messaging master branch
  https://github.com/openstack/oslo.messaging/commits/master/ 
  https://github.com/openstack/oslo.messaging/commits/master/
 
  (you need at least till Change-Id:
  I4b729ed1a6ddad2a0e48102852b2ce7d66423eaa - change id is in the commit
  message)
 
  Please note that these changes are NOT in the kilo branch that has been
  cut already
  https://github.com/openstack/oslo.messaging/commits/stable/kilo 
  https://github.com/openstack/oslo.messaging/commits/stable/kilo
 
  So we need your help with testing to promote it to kilo for you all to
  use it in Kilo :)
 
  Please file reviews or bugs or hop onto #openstack-oslo if you see
  issues etc.
 
  Many thanks to Kris Lindgren to help shake out some issues in his
  environment.
 
  How bad of an idea would it be to run master of oslo.messaging with juno
  code base? Explosions all over the place?
 
  --
  -jlk
 
  ___
  OpenStack-operators mailing list
  OpenStack-operators@lists.openstack.org 
  mailto:OpenStack-operators@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators 
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 
 
  ___
  OpenStack-operators mailing list
  OpenStack-operators@lists.openstack.org 
  mailto:OpenStack-operators@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators 
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 
 
 
  ___
  OpenStack-operators mailing list
  OpenStack-operators@lists.openstack.org 
  mailto:OpenStack-operators@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo

Re: [Openstack-operators] Flat network with linux bridge plugin

2015-04-08 Thread Sam Morrison
I reset neutron (cleared DB etc.) and rebooted the network node and it worked 
fine the second time around.

I think the first time something went wrong talking to the DB initially. I 
fixed up the config but then it was impossible for it to fix itself. eg. 
restarting the neutron agents did nothing.

Sam


 On 9 Apr 2015, at 1:47 am, Daniele Venzano daniele.venz...@eurecom.fr wrote:
 
 I am 99% sure I configured the linuxbridge agent on the network node the same 
 way as on the compute nodes, but it was doing nothing. 
 But I did it a while ago, so I could be wrong. Anyway having the agent 
 constantly running just to create a bridge at boot is a bit of a waste.
 The next maintenance window I will try again, just to understand.
  
  
 From: Kris G. Lindgren [mailto:klindg...@godaddy.com] 
 Sent: Wednesday 08 April 2015 17:01
 To: Daniele Venzano; 'Daniel Comnea'
 Cc: openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] Flat network with linux bridge plugin
  
 We run this exact configuration with the exception that we are using OVS 
 instead of linux bridge agent.  On your Network nodes (those running 
 metadata/dhcp) you need to configure them exactly like you do your compute 
 services from the standpoint of the L2 agent.  Once we did that when the l2 
 agent starts it creates the bridges it cares about and the dhcp agent then 
 gets plugged into those bridges.  We didn't have to specifically create any 
 bridges or manually plug vifs into it to get everything to work.
  
 I would be highly surprised if the linuxbridge agent acted any differently.  
 Mainly because the dhcp agent consumes an IP/port on the network, no 
 different than a vm would.  So the L2 agent should plug it for you 
 automatically.
 
  
 Kris Lindgren
 Senior Linux Systems Engineer
 GoDaddy, LLC.
  
 From: Daniele Venzano daniele.venz...@eurecom.fr 
 mailto:daniele.venz...@eurecom.fr
 Organization: EURECOM
 Date: Wednesday, April 8, 2015 at 4:21 AM
 To: 'Daniel Comnea' comnea.d...@gmail.com mailto:comnea.d...@gmail.com
 Cc: openstack-operators@lists.openstack.org 
 mailto:openstack-operators@lists.openstack.org 
 openstack-operators@lists.openstack.org 
 mailto:openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] Flat network with linux bridge plugin
  
 Juno (from ubuntu cloud), on Ubuntu 14.04
  
 From: daniel.com...@gmail.com mailto:daniel.com...@gmail.com 
 [mailto:daniel.com...@gmail.com mailto:daniel.com...@gmail.com] On Behalf 
 Of Daniel Comnea
 Sent: Wednesday 08 April 2015 11:29
 To: Daniele Venzano
 Cc: Sam Morrison; openstack-operators@lists.openstack.org 
 mailto:openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] Flat network with linux bridge plugin
  
 Which release are you using it, on which OS ?
 
  
 On Wed, Apr 8, 2015 at 9:00 AM, Daniele Venzano daniele.venz...@eurecom.fr 
 mailto:daniele.venz...@eurecom.fr wrote:
 Well, I found a way to make it work.
 Yes, you need a bridge (brctl addbr ...).
 You need to create it by hand and add the interfaces (physical and dnsmasq 
 namespace) to it.
 The linuxbridge agent installed on the network node does not do anything.
  
 The problem with this is that the interface for the namespace is created 
 after an arbitrary amount of time by one of the neutron daemons, so you 
 cannot simply put the bridge creation in one of the boot scripts, but you 
 have to wait for the interface to appear.
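
For anyone else hitting this, the manual workaround described above amounts to something like the following (bridge and interface names are illustrative; the tap device is the host side of the dhcp namespace port that neutron creates):

brctl addbr br-flat
brctl addif br-flat em1
brctl addif br-flat tapXXXXXXXX-XX
ip link set br-flat up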
  
  
 From: Sam Morrison [mailto:sorri...@gmail.com mailto:sorri...@gmail.com] 
 Sent: Wednesday 08 April 2015 05:46
 To: Daniele Venzano
 Cc: openstack-operators@lists.openstack.org 
 mailto:openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] Flat network with linux bridge plugin
  
 Hi Daniele,
  
 I’ve started playing with neutron too and have the exact same issue. Did you 
 find a solution?
  
 Cheers,
 Sam
  
  
  
 On 18 Feb 2015, at 8:47 pm, Daniele Venzano daniele.venz...@eurecom.fr 
 mailto:daniele.venz...@eurecom.fr wrote:
  
 Hello,
  
 I’m trying to configure a very simple Neutron setup.
  
 On the compute nodes I want a linux bridge connected to a physical interface 
 on one side and the VMs on the other side. This I have, by using the linux 
 bridge agent and a physnet1:em1 mapping in the config file.
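
(For reference, that mapping is just the following in the linuxbridge agent’s ini file, section name as in the ml2 linuxbridge agent config, interface name per your box:)

[linux_bridge]
physical_interface_mappings = physnet1:em1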
  
 On the controller side I need the dhcp and metadata agents. I installed and 
 configured them. They start, no errors in logs. I see a namespace with a 
 ns-* interface in it for dhcp. Outside the namespace I see a tap* interface 
 without IP address, not connected to anything.
 I installed the linux bridge agent also on the controller node, hoping it 
 would create the bridge between the physnet interface and the dhcp namespace 
 tap interface, but it just sits there and does nothing.
  
 So: I have VMs sending DHCP requests. I see the requests on the controller 
 node, but the dhcp namespace is not connected to anything.
 I can provide

Re: [Openstack-operators] FYI: Rabbit Heartbeat Patch Landed

2015-03-20 Thread Sam Morrison
I’ve been trying to build a ubuntu deb of this in a juno environment. It’s a 
bit of a nightmare as they have changed all the module names from oslo.XXX to 
oslo_XXX

Have fixed those up with a few sed replaces and had to remove support for 
aioeventlet as the dependencies aren’t in the ubuntu cloud archive juno.

Still have a couple of tests failing but I think it *should* work on our 
juno hosts.

I have a branch of the 1.8.0 release that I’m trying to build against Juno here 
[1] and I’m hoping that it will be easy to integrate the heartbeat code.
I’m sure there are lots of people that would be keen to get a latest version of 
oslo.messaging working against a juno environment. What is the best way to make 
that happen though?

Cheers,
Sam

[1] https://github.com/NeCTAR-RC/oslo.messaging/commits/nectar/1.8.0



 On 20 Mar 2015, at 8:59 am, Davanum Srinivas dava...@gmail.com wrote:
 
 So, talking about experiments, here's one:
 https://review.openstack.org/#/c/165981/
 
 Trying to run oslo.messaging trunk against stable/juno of the rest of
 the components.
 
 -- dims
 
 On Thu, Mar 19, 2015 at 5:10 PM, Matt Fischer m...@mattfischer.com wrote:
 I think everyone is highly interested in running this change or a newer OSLO
 messaging in general + this change in Juno rather than waiting for Kilo.
 Hopefully everyone could provide updates as they do experiments.
 
 
 On Thu, Mar 19, 2015 at 1:22 PM, Kevin Bringard (kevinbri)
 kevin...@cisco.com wrote:
 
 Can't speak to that concept, but I did try cherry picking the commit into
 the stable/juno branch of oslo.messaging and there'd definitely be some work
 to be done there. I fear that could mean havoc for trying to just use master
 oslo as well, but a good idea to try for sure.
 
 -- Kevin
 
 On Mar 19, 2015, at 1:13 PM, Jesse Keating j...@bluebox.net wrote:
 
 On 3/19/15 10:15 AM, Davanum Srinivas wrote:
 Apologies. i was waiting for one more changeset to merge.
 
 Please try oslo.messaging master branch
 https://github.com/openstack/oslo.messaging/commits/master/
 
 (you need at least till Change-Id:
 I4b729ed1a6ddad2a0e48102852b2ce7d66423eaa - change id is in the commit
 message)
 
 Please note that these changes are NOT in the kilo branch that has been
 cut already
 https://github.com/openstack/oslo.messaging/commits/stable/kilo
 
 So we need your help with testing to promote it to kilo for you all to
 use it in Kilo :)
 
 Please file reviews or bugs or hop onto #openstack-oslo if you see
 issues etc.
 
 Many thanks to Kris Lindgren to help shake out some issues in his
 environment.
 
 How bad of an idea would it be to run master of oslo.messaging with juno
 code base? Explosions all over the place?
 
 --
 -jlk
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 
 
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 
 
 
 
 -- 
 Davanum Srinivas :: https://twitter.com/dims
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] max_age and until_refresh for fixing Nova quotas

2015-03-20 Thread Sam Morrison
We’ve had the following for a year or so but it doesn’t help much; we still see 
quotas getting out of sync every 10 mins or so.

max_age = 10
until_refresh = 5
reservation_expire=600

We have a cron job that runs every 10 mins that figures out what projects are 
out of sync and corrects them.
We’ve always been scared of setting these to zero but we probably should.

Sam


 On 15 Mar 2015, at 2:53 pm, Mike Dorman mdor...@godaddy.com wrote:
 
 Yeah the default is just ‘0’ for both, which disables the refresh.
 
 
 
 The one downside is that it may not be 100% transparent to the user.  If 
 the quota is already (incorrectly) too high, and exceeding the quota 
 limit, the reservation that triggers the refresh will still fail.  I.e. 
 the reservation is attempted based on the quota usage values _before_ the 
 refresh.  But then after that the quota should be fixed and it will work 
 again on the next reservation.
 
 But my thinking is that most quota issues happen slowly over time.  If we 
 are correcting them often and automatically, they hopefully never get to 
 the point where they’re bad enough to manifest reservation errors to the 
 user.
 
 I don’t have any information re: db load.  I assume it regenerates based 
 on what’s in the instances or reservations table.  I imagine the load for 
 doing a single refresh is probably comparable to doing a ‘nova list’.
 
 Mike
 
 
 
 On 3/14/15, 2:27 PM, Tim Bell tim.b...@cern.ch wrote:
 
 Interesting... what are the defaults ?
 
 Assuming no massive DB load, getting synced within a day would seem 
 reasonable. Is the default no max age ?
 
 Tim
 
 -Original Message-
 From: Jesse Keating [mailto:j...@bluebox.net]
 Sent: 14 March 2015 16:59
 To: openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] max_age and until_refresh for fixing 
 Nova
 quotas
 
 On 3/14/15 8:11 AM, Mike Dorman wrote:
 I did short write-up here http://t.co/Q5X1hTgJG1 if you are interested
 in the details.
 
 
 Thanks for sharing Matt! That's an excellent write up.
 
 --
 -jlk
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] max_age and until_refresh for fixing Nova quotas

2015-03-20 Thread Sam Morrison
I’d need to go through the logs to see the exact frequency; maybe 10 minutes 
was a bit much. We run the cron job every 10 mins.

The cron job stopped a few weeks back and we had at least half a dozen projects 
with out of sync quotas within a few hours.

Cheers,
Sam


 On 21 Mar 2015, at 4:01 am, Mike Dorman mdor...@godaddy.com wrote:
 
 Hey Sam,
 
 When you say it occurs every 10 minutes, what exactly do you mean?  The 
 quota refresh?  Or a quota getting out of sync?
 
 I am surprised you have max_age set so low.  I would think that would 
 basically trigger a quota refresh on every single reservation for most 
 users, right?
 
 Mike
 
 
 
 
 
 On 3/20/15, 4:18 AM, Sam Morrison sorri...@gmail.com wrote:
 
 We’ve had the following for a year or so but doesn’t help much, we still 
 see it occurring every 10 mins or so.
 
 max_age = 10
 until_refresh = 5
 reservation_expire=600
 
 We have a cron job that runs every 10 mins that figures out what projects 
 are out of sync and corrects them.
 We’ve always been scared of setting these to zero but we probably should.
 
 Sam
 
 
 On 15 Mar 2015, at 2:53 pm, Mike Dorman mdor...@godaddy.com wrote:
 
 Yeah the default is just ‘0’ for both, which disables the refresh.
 
 
 
 The one downside is that it may not be 100% transparent to the user.  If 
 the quota is already (incorrectly) too high, and exceeding the quota 
 limit, the reservation that triggers the refresh will still fail.  I.e. 
 the reservation is attempted based on the quota usage values _before_ the 
 refresh.  But then after that the quota should be fixed and it will work 
 again on the next reservation.
 
 But my thinking is that most quota issues happen slowly over time.  If we 
 are correcting them often and automatically, they hopefully never get to 
 the point where they’re bad enough to manifest reservation errors to the 
 user.
 
 I don’t have any information re: db load.  I assume it regenerates based 
 on what’s in the instances or reservations table.  I imagine the load for 
 doing a single refresh is probably comparable to doing a ‘nova list’.
 
 Mike
 
 
 
 On 3/14/15, 2:27 PM, Tim Bell tim.b...@cern.ch wrote:
 
 Interesting... what are the defaults ?
 
 Assuming no massive DB load, getting synced within a day would seem 
 reasonable. Is the default no max age ?
 
 Tim
 
 -Original Message-
 From: Jesse Keating [mailto:j...@bluebox.net]
 Sent: 14 March 2015 16:59
 To: openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] max_age and until_refresh for 
 fixing 
 Nova
 quotas
 
 On 3/14/15 8:11 AM, Mike Dorman wrote:
 I did short write-up here http://t.co/Q5X1hTgJG1 if you are 
 interested
 in the details.
 
 
 Thanks for sharing Matt! That's an excellent write up.
 
 --
 -jlk
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] max_age and until_refresh for fixing Nova quotas

2015-03-20 Thread Sam Morrison
I put the script up at https://gist.github.com/sorrison/99c8e87295756e0ed787

The cron job that runs is like:

#!/bin/bash
BASEDIR=/tmp
/usr/bin/mysql --defaults-group-suffix=-prod-keystone -B -e 'select target_id, 
actor_id from assignment inner join project on target_id = project.id inner 
join user on user.id = actor_id order by target_id, actor_id;' | tail -n+2 > 
$BASEDIR/keystone_project_users.txt
scp $BASEDIR/keystone_project_users.txt root@nova-controller:.
ssh root@nova-controller nova-manage shell script sync_quotas_main.py


It’s not pretty but it works…..
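The general idea behind a sync check like this is just comparing what nova’s 
quota_usages table records against a live count from the instances table. A rough 
illustration of that comparison is below (not the actual script in the gist; column 
names are from memory and the --defaults-group-suffix is made up to mirror the 
keystone one above):

/usr/bin/mysql --defaults-group-suffix=-prod-nova -B -e "
  SELECT t.project_id, t.user_id, t.recorded, t.actual FROM (
    -- recorded usage for the 'instances' resource vs. a count of live instances
    SELECT qu.project_id, qu.user_id, qu.in_use AS recorded,
           (SELECT COUNT(*) FROM instances i
             WHERE i.project_id = qu.project_id
               AND i.user_id = qu.user_id
               AND i.deleted = 0) AS actual
    FROM quota_usages qu
    WHERE qu.resource = 'instances' AND qu.deleted = 0
  ) t WHERE t.recorded != t.actual;"

Any rows that come back are project/user pairs whose usage has drifted and needs a resync.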

Sam



 On 21 Mar 2015, at 12:35 am, Kris G. Lindgren klindg...@godaddy.com wrote:
 
 Can you post your cronjob/script that you use to correct the quotas?
 
 
 Kris Lindgren
 Senior Linux Systems Engineer
 GoDaddy, LLC.
 
 
 
 On 3/20/15, 4:18 AM, Sam Morrison sorri...@gmail.com wrote:
 
 We’ve had the following for a year or so but doesn’t help much, we still
 see it occurring every 10 mins or so.
 
 max_age = 10
 until_refresh = 5
 reservation_expire=600
 
 We have a cron job that runs every 10 mins that figures out what projects
 are out of sync and corrects them.
 We’ve always been scared of setting these to zero but we probably should.
 
 Sam
 
 
 On 15 Mar 2015, at 2:53 pm, Mike Dorman mdor...@godaddy.com wrote:
 
 Yeah the default is just ‘0’ for both, which disables the refresh.
 
 
 
 The one downside is that it may not be 100% transparent to the user.  If
 the quota is already (incorrectly) too high, and exceeding the quota
 limit, the reservation that triggers the refresh will still fail.  I.e.
 the reservation is attempted based on the quota usage values _before_ the
 refresh.  But then after that the quota should be fixed and it will work
 again on the next reservation.
 
 But my thinking is that most quota issues happen slowly over time.  If we
 are correcting them often and automatically, they hopefully never get to
 the point where they’re bad enough to manifest reservation errors to the
 user.
 
 I don’t have any information re: db load.  I assume it regenerates based
 on what’s in the instances or reservations table.  I imagine the load for
 doing a single refresh is probably comparable to doing a ‘nova list’.
 
 Mike
 
 
 
 On 3/14/15, 2:27 PM, Tim Bell tim.b...@cern.ch wrote:
 
 Interesting... what are the defaults ?
 
 Assuming no massive DB load, getting synced within a day would seem
 reasonable. Is the default no max age ?
 
 Tim
 
 -Original Message-
 From: Jesse Keating [mailto:j...@bluebox.net]
 Sent: 14 March 2015 16:59
 To: openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] max_age and until_refresh for
 fixing 
 Nova
 quotas
 
 On 3/14/15 8:11 AM, Mike Dorman wrote:
 I did short write-up here http://t.co/Q5X1hTgJG1 if you are
 interested
 in the details.
 
 
 Thanks for sharing Matt! That's an excellent write up.
 
 --
 -jlk
 
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Specifying multiple tenants for aggregate_multitenancy_isolation_filter

2015-01-27 Thread Sam Morrison
Hi operators,

I have a review up to fix this filter to allow multiple tenants; there are two 
proposed ways in which this can be specified.

1. Using a comma, e.g. tenantid1,tenantid2
2. Using a JSON list, e.g. [“tenantid1”, “tenantid2”]

Which one do you think is better?

https://review.openstack.org/148807
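To make it concrete, with option 1 the metadata on the aggregate would be set roughly 
like this (the aggregate id and tenant ids are made up; filter_tenant_id is the metadata 
key the filter already reads):

nova aggregate-set-metadata 1 filter_tenant_id=tenantid1,tenantid2

Hosts in that aggregate would then only accept instances from those two tenants.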

Thanks,
Sam
 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Way to check compute - rabbitmq connectivity

2015-01-15 Thread Sam Morrison
We’ve had a lot of issues with Icehouse related to RabbitMQ. Basically the 
change from openstack.rpc to oslo.messaging broke things. These things are now 
fixed in oslo.messaging version 1.5.1; there is still an issue with heartbeats, 
and that patch is making its way through the review process now.

https://review.openstack.org/#/c/146047/
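(If you want to confirm which versions are actually in play on a node, something like 
the line below works; the package names assume Ubuntu’s python-* naming from the cloud 
archive:)

dpkg -l python-oslo.messaging python-kombu python-amqp | awk '/^ii/ {print $2, $3}'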

Cheers,
Sam


 On 16 Jan 2015, at 10:55 am, sridhar basam sridhar.ba...@gmail.com wrote:
 
 
 If you are using HA queues, use a version of rabbitmq 3.3.0 or later. There was a 
 change in that version where consumption on queues was automatically enabled 
 when a master election for a queue happened. Previous versions only informed 
 clients that they had to reconsume on a queue. It was the client's 
 responsibility to start consumption on a queue.
 
 Make sure you enable tcp keepalives at a low enough value in case you have a 
 firewall device in between your rabbit server and its consumers.
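 (One way to do that on Linux is with the kernel keepalive sysctls; illustrative values 
 only, and they only matter if the client library actually sets SO_KEEPALIVE on its 
 sockets:)
 
 # notice a dead peer within about a minute instead of the 2 hour default
 sysctl -w net.ipv4.tcp_keepalive_time=30
 sysctl -w net.ipv4.tcp_keepalive_intvl=10
 sysctl -w net.ipv4.tcp_keepalive_probes=3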
 
 Monitor consumers on your rabbit infrastructure using 'rabbitmqctl 
 list_queues name messages consumers'. The number of consumers on fanout queues is 
 going to depend on the number of services of any type you have in your environment.
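 (A rough way to turn that into a check, assuming it runs on the broker node itself: 
 anything the line below prints is a queue with messages backing up and no consumer 
 attached.)
 
 rabbitmqctl list_queues name messages consumers | awk '$3 == 0 && $2 > 0 {print $1}'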
 
 Sri
 On Jan 15, 2015 6:27 PM, Michael Dorman mdor...@godaddy.com wrote:
 Here is the bug I’ve been tracking related to this for a while.  I haven’t 
 really kept up to speed with it, so I don’t know the current status.
 
 https://bugs.launchpad.net/nova/+bug/856764
 
 
 From: Kris Lindgren klindg...@godaddy.com
 Date: Thursday, January 15, 2015 at 12:10 PM
 To: Gustavo Randich gustavo.rand...@gmail.com, OpenStack Operators 
 openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] Way to check compute - rabbitmq 
 connectivity
 
 During the Atlanta ops meeting this topic came up and I specifically 
 mentioned adding a no-op or healthcheck ping to the rabbitmq stuff in 
 both nova and neutron.  The devs in the room looked at me like I was crazy, 
 but it was so that we could catch exactly the issues you described.  I am also 
 interested if anyone knows of a lightweight call that could be used to 
 verify/confirm rabbitmq connectivity as well.  I haven't been able to devote 
 time to dig into it.  Mainly because if one client is having issues you 
 will notice other clients having similar/silent errors, and a restart of 
 all the things is the easiest way to fix it, for us at least.
 
  
 Kris Lindgren
 Senior Linux Systems Engineer
 GoDaddy, LLC.
 
 
 From: Gustavo Randich gustavo.rand...@gmail.com
 Date: Thursday, January 15, 2015 at 11:53 AM
 To: openstack-operators@lists.openstack.org
 Subject: Re: [Openstack-operators] Way to check compute - rabbitmq 
 connectivity
 
 Just to add one more background scenario, we also had similar problems trying 
 to load balance rabbitmq via F5 Big IP LTM. For that reason we don't use it 
 now. Our installation is a single rabbitmq instance and no intermediaries 
 (albeit network switches). We use Folsom and Icehouse, the problem being 
 perceived more in Icehouse nodes.
 
 We are already monitoring message queue size, but we would like to pinpoint 
 in semi-realtime the specific hosts/racks/network paths experiencing the 
 stale connection before a user complains about an operation being stuck, or 
 even hosts with no such pending operations but already disconnected -- we 
 also could diagnose possible network causes and avoid massive service 
 restarting.
 
 So, for now, if someone knows about a cheap and quick openstack operation 
 that triggers a message interchange between rabbitmq and nova-compute and a 
 way of checking the result it would be great.
 
 
 
 
 On Thu, Jan 15, 2015 at 1:45 PM, Kris G. Lindgren klindg...@godaddy.com wrote:
 We did have an issue using celery  on an internal application that we wrote - 
 but I believe it was fixed after much failover testing and code changes.  We 
 also use logstash via rabbitmq and haven't noticed any issues there either.
 
 So this seems to be just openstack/oslo related.
 
 We have tried a number of different configurations - all of them had their 
 issues.  We started out listing all the members in the cluster on the 
 rabbit_hosts line.  This worked most of the time without issue, until we 
 would restart one of the servers; then it seemed like the clients wouldn't 
 figure out they were disconnected and reconnect to the next host.
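 (That first setup was just the plain comma-separated form of rabbit_hosts in the 
 service configs; hostnames here are made up:)
 
 rabbit_hosts = rabbit01:5672,rabbit02:5672,rabbit03:5672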
 
 In an attempt to solve that we moved to using haproxy to present a VIP that 
 we configured in the rabbit_hosts line.  This created issues with