[Openstack-operators] [cloudkitty] Looking for use-cases to extend Cloudkitty, let's meet in Barcelona!

2016-10-18 Thread Christophe Sauthier

Dear ops!

In the Cloudkitty team (as a reminder, Cloudkitty is used for chargeback 
and rating, and it is in the Big Tent) we are actively looking for 
use-cases that you're experiencing.
We are doing our best with the people we have around us, but I am sure 
we are missing many use-cases.
I am sure many of you will be going to the OpenStack Summit in 
Barcelona, and the good news is: so will we!


I would be really, really interested to meet you in person, especially 
if you have a chargeback or rating project or need, or if you are 
already doing chargeback or rating...


Please drop me an email so that we can arrange a quick chat there, or 
catch me at booth D26 (Objectif Libre), where I should be hanging out 
when I am not attending a talk...


  Christophe Sauthier, PTL of Cloudkitty


Christophe Sauthier       Mail    : christophe.sauth...@objectif-libre.com
CEO                       Mob     : +33 (0) 6 16 98 63 96
Objectif Libre            URL     : www.objectif-libre.com
Au service de votre Cloud Twitter : @objectiflibre

Follow the latest OpenStack news in French by subscribing to la Pause 
OpenStack:

http://olib.re/pause-openstack

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] How do you even test for that?

2016-10-18 Thread Jonathan D. Proulx
On Mon, Oct 17, 2016 at 05:45:07PM -0600, Matt Fischer wrote:
:This does not cover all your issues but after seeing mysql bugs between I
:and J and also J to K we now export and restore production control plane
:data into a dev environment to test the upgrades. If we have issues we
:destroy this environment and run it again.

Yeah, I learned that one the hard way a while back (maybe Havana?);
have you ever tried to revert a production OpenStack upgrade? ;)

A copy of the production DB goes into test pretty much immediately
prior to upgrade tests.
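
For the record, the copy itself doesn't need to be fancy; it's roughly
something like this (an untested sketch; the hosts and database list
are placeholders, and credentials are assumed to come from ~/.my.cnf):

    # Rough sketch of the prod-DB-into-test copy.  Hosts and database
    # list are placeholders.  The whole dump is held in memory, which
    # is fine for a control plane DB of moderate size.
    import subprocess

    DATABASES = ["nova", "neutron", "glance", "keystone", "cinder"]
    PROD = "prod-db.example.com"
    TEST = "test-db.example.com"

    for db in DATABASES:
        # --single-transaction gives a consistent InnoDB snapshot
        # without locking the production tables.
        dump = subprocess.run(
            ["mysqldump", "-h", PROD, "--single-transaction", db],
            check=True, stdout=subprocess.PIPE)
        # Assumes the target database already exists on the test host.
        subprocess.run(["mysql", "-h", TEST, db],
                       input=dump.stdout, check=True)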

:For longer running instances that's tough but we try to catch those in our
:shared dev environment or staging with regression tests. This is also where
:we catch issues with outside hardware interactions like load balancers and
:storage.
:
:For your other issue, was there a warning or deprecation notice in the
:logs for that? That's always at the top of our checklist.

Not that I saw, or could find post facto, in the nova and glance logs
on the controllers or hypervisors.

But it's not so much the specific issue (which is dealt with) as the
class of issue: compatibility of artifacts created under release
Latest-N, where N > 1.

It's entirely possible there isn't a good way to test.  I mean, I can't
think of one, but I know some of you out there are smarter than me, so
hope springs eternal.

Perhaps designing better post-upgrade validation that focuses on the
oldest artifacts, or various generations of them, is the best I can
hope for.  At least then ops would catch them and start working on a
fix ASAP.

-Jon


:On Oct 17, 2016 12:51 PM, "Jonathan Proulx"  wrote:
:
:> Hi All,
:>
:> Just on the other side of a Kilo->Mitaka upgrade (with a very brief
:> transit through Liberty in the middle).
:>
:> As usual I've caught a few problems in production that I have no idea
:> how I could possibly have tested for because they relate to older
:> running instances and some remnants of older package versions on the
:> production side which wouldn't have existed in test unless I'd
:> installed the test server with Havana and done incremental upgrades
:> starting a fairly wide suite of test instances along the way.
:>
:> First thing that bit me was neutron-db-manage being confused because
:> my production system still had migrations from Havana hanging around.
:> I'm calling this a packaging bug
:> https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1633576 but I
:> also feel like remembering release names forever might be a good
:> thing.
:>
:> Later I discovered that during the Juno release (maybe earlier ones
:> too) making a snapshot of a running instance populated the snapshot's
:> metadata with "instance_type_vcpu_weight: none".  Currently (Mitaka)
:> this value must be an integer if it is set, or boot fails.  This has
:> the interesting side effect of putting your instance into
:> shutdown/error state if you try a hard reboot of a formerly working
:> instance.  I 'fixed' this by manually frobbing the DB to delete the
:> rows where instance_type_vcpu_weight was set to none.
:>
:> Does anyone have strategies on how to actually test for problems with
:> "old" artifacts like these?
:>
:> Yes, having things running from 18-24 month old snapshots is "bad",
:> and yes, it would be cleaner to install a fresh control plane at each
:> upgrade and cut over rather than doing an actual in-place upgrade.  But
:> neither of these sub-optimal patterns is going away completely
:> anytime soon.
:>
:> -Jon
:>

-- 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] If you are deploying with or against RDO, we'd like to hear from you

2016-10-18 Thread David Moreau Simard
Hi openstack-operators,

We're currently gathering feedback from RDO users and operators with
the help of a very small and anonymous survey [1].

It's not very in-depth, as we value your time and we all know that
filling out surveys is boring, but it would be very valuable to us.

If you'd like to chat about RDO in more detail at the upcoming
OpenStack Summit (what we're doing right, what we're doing wrong), or
if you have questions about starting to use it or getting involved,
feel free to get in touch with me or join us at the RDO community
meetup [2].

Thanks!

[1]: https://www.redhat.com/archives/rdo-list/2016-October/msg00128.html
[2]: https://www.eventbrite.com/e/an-evening-of-ceph-and-rdo-tickets-28022550202

David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] If you are deploying with or against RDO, we'd like to hear from you

2016-10-18 Thread Kruithof Jr, Pieter
Hi David,

Can you talk a bit about how you intend to share the results and/or data from 
the survey?

Thanks,

Piet


On 10/18/16, 9:48 AM, "David Moreau Simard"  wrote:

> Hi openstack-operators,
>
> We're currently gathering feedback from RDO users and operators with
> the help of a very small and anonymous survey [1].
>
> It's not very in-depth as we value your time and we all know filling
> surveys is boring but it'd be very valuable to us.
>
> If you'd like to chat RDO more in details at the upcoming OpenStack
> summit: what we're doing right, what we're doing wrong or even if you
> have questions about either starting to use it or how to get
> involved...
> Feel free to get in touch with me or join us at the RDO community meetup [2].
>
> Thanks !
>
> [1]: https://www.redhat.com/archives/rdo-list/2016-October/msg00128.html
> [2]: https://www.eventbrite.com/e/an-evening-of-ceph-and-rdo-tickets-28022550202
>
> David Moreau Simard
> Senior Software Engineer | Openstack RDO
>
> dmsimard = [irc, github, twitter]



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [Neutron][LBaaS] Architecture/Wiring for LBaaS service extension

2016-10-18 Thread Adam Lawson
Greetings fellow stackers.

So I found the list of service extensions [1], the modular architecture
[2], and info on the driver API [3]. The architecture diagram [4]
doesn't show up in [3], and it was last updated in 2014, which tells me
it has probably changed since then.

Where is the best/most recent info for Neutron's LBaaS service extension?

[1]
http://docs.openstack.org/developer/neutron/devref/service_extensions.html
[2] https://wiki.openstack.org/wiki/Neutron/LBaaS/Architecture
[3] https://wiki.openstack.org/wiki/Neutron/LBaaS/DriverAPI
[4] https://wiki.openstack.org/wiki/File:Lbaas_arch.JPG

//adam

*Adam Lawson*

Principal Architect, CEO
Office: +1-916-794-5706
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Ops Lightning Talks

2016-10-18 Thread David Medberry
Dear Readers,

We have TWO back-to-back sessions (a double session) of Lightning Talks
for Operators. And by "for Operators" I mean largely that Operators
will be the audience.

If you have an OpenStack problem, thingamajig, technique,
simplification, gadget, or whatzit that readily lends itself to a
lightning talk, please email me and put it into the etherpad here:

https://etherpad.openstack.org/p/BCN-ops-lightning-talks

There are two sessions... but I'd prefer to fill the first one and cancel
the second one. But if your schedule dictates that you are only available
for the second, we'll hold both.

(And in spite of my natural levity, it can be a serious talk, a serious
problem, or something completely frivolous; but there might be tomatoes
in the audience, so watch it.)

-dave

David Medberry
OpenStack Guy and your friendly moderator.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] How do you even test for that?

2016-10-18 Thread Clint Byrum
Excerpts from Jonathan Proulx's message of 2016-10-17 14:49:13 -0400:
> Hi All,
> 
> Just on the other side of a Kilo->Mitaka upgrade (with a very brief
> transit through Liberty in the middle).
> 
> As usual I've caught a few problems in production that I have no idea
> how I could possibly have tested for because they relate to older
> running instances and some remnants of older package versions on the
> production side which wouldn't have existed in test unless I'd
> installed the test server with Havana and done incremental upgrades
> starting a fairly wide suite of test instances along the way.
> 

In general, modifying _anything_ in place is hard to test.

You're much better off with as much immutable content as possible on all
of your nodes. If you've been wondering what this whole Docker nonsense
is about, well, that's what it's about. You run docker build once per
software release, then mount data read/write and configs read-only.
Both openstack-ansible and kolla are deployment projects that try to do
some of this via lxc or docker, IIRC.

This way when you test your container image in test, you copy it out to
prod, start up the new containers, stop the old ones, and you know that
_at least_ you don't have older stuff running anymore. Data and config
are still likely to be the source of issues, but there are other ways
to help test that.
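
To make the shape of it concrete, here's a compressed illustration
using the docker SDK for Python; this is not how kolla actually wires
things up, and the image name and paths are made up:

    # Shape of the immutable-image pattern: the image is built once
    # per release, config is mounted read-only, data read-write.
    # Image name and paths are made up for illustration.
    import docker

    client = docker.from_env()
    client.containers.run(
        "registry.example.com/nova-compute:mitaka-2016.10",
        detach=True,
        name="nova-compute",
        volumes={
            "/etc/nova": {"bind": "/etc/nova", "mode": "ro"},
            "/var/lib/nova": {"bind": "/var/lib/nova", "mode": "rw"},
        })

Rolling forward is then "start the :newton container, stop the :mitaka
one", and the old binaries are provably gone.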

> First thing that bit me was neutron-db-manage being confused because
> my production system still had migrations from Havana hanging around.
> I'm calling this a packaging bug
> https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1633576 but I
> also feel like remembering release names forever might be a good
> thing.
> 

Ouch, indeed one of the first things to do _before_ an upgrade is to run
the migrations of the current version to make sure your schema is up to
date. Also it's best to make sure you have _all_ of the stable updates
before you do that, since it's possible fixes have landed in the
migrations that are meant to smooth the upgrade process.
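
In script form the pre-flight is roughly this (untested; the config
paths are the stock ones, and note that "upgrade heads" is "upgrade
head" on releases before neutron's migration branches split):

    # Pre-upgrade sanity: bring the *current* release's schema fully
    # up to date before installing the new packages.  Config-file
    # paths are the stock ones; adjust to taste.
    import subprocess

    checks = [
        ["nova-manage", "db", "sync"],
        ["glance-manage", "db_sync"],
        ["neutron-db-manage",
         "--config-file", "/etc/neutron/neutron.conf",
         "--config-file", "/etc/neutron/plugins/ml2/ml2_conf.ini",
         "upgrade", "heads"],
    ]
    for cmd in checks:
        subprocess.run(cmd, check=True)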

> Later I discovered that during the Juno release (maybe earlier ones
> too) making a snapshot of a running instance populated the snapshot's
> metadata with "instance_type_vcpu_weight: none".  Currently (Mitaka)
> this value must be an integer if it is set, or boot fails.  This has
> the interesting side effect of putting your instance into
> shutdown/error state if you try a hard reboot of a formerly working
> instance.  I 'fixed' this by manually frobbing the DB to delete the
> rows where instance_type_vcpu_weight was set to none.
> 

This one is tough because it is clearly data- and state-related. It's
hard to say how you got the 'none' values in there instead of ints.
Somebody else suggested making DB snapshots and loading them into a test
control plane. That seems like an easy-ish way to do some surface-level
checking, but the fact is it could also be super dangerous if not
isolated well; and the more isolation, the less of a real simulation it is.
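
For this particular class of bad record, though, you may not even need
a control plane copy; scanning the image properties should flag them.
An untested sketch, assuming openstacksdk exposes the extra glance
properties in image.properties (the cloud name is a placeholder):

    # Untested: flag images whose instance_type_vcpu_weight property
    # is set but not an integer, i.e. the failure mode described above.
    import openstack

    conn = openstack.connect(cloud="test")
    for image in conn.image.images():
        weight = (image.properties or {}).get("instance_type_vcpu_weight")
        if weight is None:
            continue
        try:
            int(weight)
        except (TypeError, ValueError):
            print("suspect image %s (%s): instance_type_vcpu_weight=%r"
                  % (image.id, image.name, weight))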

> Does anyone have strategies on how to actually test for problems with
> "old" artifacts like these?
> 
> Yes, having things running from 18-24 month old snapshots is "bad",
> and yes, it would be cleaner to install a fresh control plane at each
> upgrade and cut over rather than doing an actual in-place upgrade.  But
> neither of these sub-optimal patterns is going away completely
> anytime soon.
>

In-place upgrades must work. If they don't, please file bugs and
complain loudly. :)

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] If you are deploying with or against RDO, we'd like to hear from you

2016-10-18 Thread David Moreau Simard
Hi Pieter,

We are planning to publish a post on the RDO blog [1] with the
aggregated (and further anonymized, if necessary) data once the survey
is closed.
The last question, where we ask for your contact information, will
obviously not be published.

[1]: https://www.rdoproject.org/blog/

David Moreau Simard
Senior Software Engineer | Openstack RDO

dmsimard = [irc, github, twitter]


On Tue, Oct 18, 2016 at 11:56 AM, Kruithof Jr, Pieter
 wrote:
> Hi David,
>
> Can you talk a bit about how you intend to share the results and/or data from 
> the survey?
>
> Thanks,
>
> Piet
>
>
> On 10/18/16, 9:48 AM, "David Moreau Simard"  wrote:
>
> Hi openstack-operators,
>
> We're currently gathering feedback from RDO users and operators with
> the help of a very small and anonymous survey [1].
>
> It's not very in-depth as we value your time and we all know filling
> surveys is boring but it'd be very valuable to us.
>
> If you'd like to chat RDO more in details at the upcoming OpenStack
> summit: what we're doing right, what we're doing wrong or even if you
> have questions about either starting to use it or how to get
> involved...
> Feel free to get in touch with me or join us at the RDO community meetup 
> [2].
>
> Thanks !
>
> [1]: https://www.redhat.com/archives/rdo-list/2016-October/msg00128.html
> [2]: 
> https://www.eventbrite.com/e/an-evening-of-ceph-and-rdo-tickets-28022550202
>
> David Moreau Simard
> Senior Software Engineer | Openstack RDO
>
> dmsimard = [irc, github, twitter]

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] PCI passthrough trying to use busy resource?

2016-10-18 Thread Jonathan D. Proulx
Hi all,

I have a test GPU system that seemed to be working properly under Kilo,
running 1- and 2-GPU instance types on an 8-GPU server.

After the Mitaka upgrade it seems to always try to assign the same
device, which is already in use, rather than pick one of the 5
currently available.


 Build of instance 9542cc63-793c-440e-9a57-cc06eb401839 was
 re-scheduled: Requested operation is not valid: PCI device
 :09:00.0 is in use by driver QEMU, domain instance-000abefa
 _do_build_and_run_instance
 /usr/lib/python2.7/dist-packages/nova/compute/manager.py:1945

It tries to schedule 5 times, but each time it uses the same busy
device.  Since there are currently only 3 of the 8 in use, it would
have succeeded if it had just picked a new device each time.

In trying to debug this I realize I have no idea how devices are
selected. Does OpenStack track which PCI devices are claimed, or is
that a libvirt function?  In either case, where would I look to find
out what it thinks the current state is?
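
For anyone who wants to look along with me: nova does seem to keep its
own claim records in a pci_devices table (at least on my Mitaka
database), so something like this untested snippet should show what
nova thinks is claimed; whether that agrees with libvirt is exactly
the question. Host and credentials are placeholders:

    # Untested: dump nova's view of PCI device claims straight from
    # the DB.  Table and column names are what I see on a Mitaka nova
    # database.
    import pymysql

    conn = pymysql.connect(host="db.example.com", user="nova",
                           password="secret", database="nova")
    with conn.cursor() as cur:
        cur.execute(
            "SELECT address, status, instance_uuid FROM pci_devices "
            "WHERE deleted = 0 ORDER BY address")
        for address, status, instance_uuid in cur.fetchall():
            print("%-14s %-10s %s" % (address, status, instance_uuid))

Comparing that against virsh nodedev-list and the libvirt domain XML
on the hypervisor should show which side has the stale view.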

Thanks,
-Jon
-- 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] User Survey results & DIY survey analysis tool

2016-10-18 Thread Heidi Joy Tretheway
Hello Operators, 
First, a HUGE thank you for contributing to our eighth OpenStack User Survey.  
I wanted to alert you that the results are now available. This cycle, we 
focused on a deployments-only update of the charts most often cited by our 
community.

On openstack.org/user-survey, you’ll be able to download the 12-page 
PDF “highlights” report (as well as past reports), view a 2-minute 
video overview, and learn more about these key findings:
- NPS for deployments continues to tick up, 8 points higher than a year ago.
- The share of deployments in production is 20% higher than a year ago.
- Cost, operational efficiency and innovation are the top three business drivers.
- Interest in NFV and bare metal is significantly higher, and containers leads 
the list of emerging technologies for the third cycle in a row.
- OpenStack is adopted by companies of every size; nearly one-quarter of users 
are companies smaller than 100 people.
Want to dig deeper? We’re unveiling a beta version of our User Survey 
analysis tool at www.openstack.org/analytics that enables you to 
compare 2016 data to prior data, and to apply six global filters to 
evaluate key metrics that matter to you.

We look forward to February, when we’ll ask you to answer the full-length User 
Survey. The report will be available by our Boston Summit, May 8–12, 2017. 
Please feel free to send me any questions or feedback about either the report 
or analysis tool.




Heidi Joy Tretheway
Senior Marketing Manager, OpenStack Foundation
503 816 9769 | Skype: heidi.tretheway




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators