Re: [Openstack-operators] [openstack-dev] [nova] Does anyone rely on PUT /os-services/disable for non-compute services?

2017-06-13 Thread Kris G. Lindgren
I am fine with #2, and I am also fine with calling it a bug, since the 
enabled/disabled state for the other services didn't actually do anything.


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 6/13/17, 8:46 PM, "Dan Smith"  wrote:

> Are we allowed to cheat and say auto-disabling non-nova-compute services
> on startup is a bug and just fix it that way for #2? :) Because (1) it
> doesn't make sense, as far as we know, and (2) it forces the operator to
> have to use the API to enable them later just to fix their nova
> service-list output.

Yes, definitely.

--Dan

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] RFC - Global Request Ids

2017-05-16 Thread Kris G. Lindgren
As long as the request-id is validated to look like a GUID/request id, I 
have no problem trusting the input and passing it around.  In fact, we want to 
be able to take request-id inputs on the API: we have other systems that 
call into our OpenStack APIs, so if there is an issue it would be nice to be 
able to use the same request-id between those external systems and OpenStack.


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 5/16/17, 10:01 AM, "Sean Dague"  wrote:

After the forum session on logging, we came up with what we think is an
approach here for global request ids -
https://review.openstack.org/#/c/464746/ - it would be great if
interested operators would confirm this solves their concerns.

There is also an open question. A long standing concern was "trusting"
the request-id, though I don't really know how that could be exploited
for anything really bad, and this puts in a system for using service
users as a signal for trust.

But the whole system is a lot easier, and comes together quicker, if
we don't have that. For especially public cloud users, are there any
concerns that you have in letting users set Request-Id (assuming you'll
also still have a 2nd request-id that's service local and acts like
request-id today)?

-Sean

-- 
Sean Dague
http://dague.net

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Newton lbaas v2 remote_addr

2017-05-15 Thread Kris G. Lindgren
HAProxy should be adding an X-Forwarded-For header.  You should be able to 
adjust your Apache logs and/or enable mod_remoteip to see this (I believe it is 
also made available to other modules within Apache, or to code that is being run by 
Apache, i.e. PHP).

https://httpd.apache.org/docs/current/mod/mod_remoteip.html
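
For example, the Apache side of that could look something like the sketch below 
(the directives are standard mod_remoteip ones, but the config path and the 
trusted proxy subnet are assumptions you would adjust to your LBaaS haproxy 
address):

# e.g. /etc/httpd/conf.d/remoteip.conf (path is an assumption)
LoadModule remoteip_module modules/mod_remoteip.so
RemoteIPHeader X-Forwarded-For
# only trust X-Forwarded-For when it arrives from the load balancer subnet
RemoteIPInternalProxy 10.0.0.0/24

# log the real client address (%a) instead of the connection peer (%h)
LogFormat "%a %l %u %t \"%r\" %>s %b" common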


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Ignazio Cassano 
Date: Monday, May 15, 2017 at 11:05 AM
To: OpenStack Operators 
Subject: [Openstack-operators] Newton lbaas v2 remote_addr

Hi All, I installed Newton with LBaaS v2 haproxy.
When creating an HTTP load balancer, the remote_addr shown by each balanced Apache 
is always the load balancer IP. Is there any option to show the client address?
Regards
Ignazio


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Openvswitch flat and provider in the same bond

2017-04-19 Thread Kris G. Lindgren
One small change for us recently is that we removed the vlan-splinters piece on 
eth0 and eth2 during the creation of the bond.

Other than that – I am not really sure what to tell you.  We used to do things 
using Linux bonding with bond.<vlan> interfaces, however when we went to production 
no traffic actually flowed.  So we implemented what we have now.

You need to post your ovs-vsctl show output and see if the bridge you mapped (e.g. 
br-flat) is wired into br-int or not.  If it's not, then your bridge_mapping is 
wrong in neutron, since neutron will wire the mapped bridge into br-int.
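
Roughly, the check is something like this (the bridge and physnet names below are 
assumptions based on your br-flat example):

ovs-vsctl show
# look for an int-br-flat / phy-br-flat patch or veth pair between br-flat and br-int
grep -r bridge_mappings /etc/neutron/
# e.g. bridge_mappings = flat:br-flat,provider:br-ex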

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Ignazio Cassano <ignaziocass...@gmail.com>
Date: Wednesday, April 19, 2017 at 2:17 PM
To: "Kris G. Lindgren" <klindg...@godaddy.com>
Cc: Dan Sneddon <dsned...@redhat.com>, OpenStack Operators 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Openvswitch flat and provider in the same 
bond

Thx Kris.
If I map flat to br-flat and I add the bond.567 interface to br-flat, would that 
not work?
Regards
Ignazio


On 19/Apr/2017 21:20, "Kris G. Lindgren" <klindg...@godaddy.com> wrote:
We handle this a different way. I want to look at whether we can redo it so it 
requires less pre-configuration on the host.  However, this is what we 
currently do (and have been doing for ~4 years):
http://www.dorm.org/blog/wp-content/uploads/2015/10/ovs-wiring-676x390.png

From there, inside neutron ml2 you just specify bridge mapping entries of the form 
<physnet>:br<vlan>.

We have the HV vlan being the native vlan.  However, you can simply change the 
config for mgmt0 to have tag=<vlan>.  If I were to change this I would 
think about using vlan networks instead of flat networks for everything VM 
related, so that OVS would create the vlan-specific interfaces, instead of us 
doing it ahead of time and telling neutron to use what we created.

Under redhat we use the following network-scripts to make this happen on boot:
/etc/sysconfig/network-scripts/ifcfg-eth0:
DEVICE=eth0
USERCTL=no
ONBOOT=yes
BOOTPROTO=none
# The next 2 lines are required on first
# boot to work around some kudzu stupidity.
ETHTOOL_OPTS="wol g autoneg on"
HWADDR=F0:4D:A2:0A:E4:26
unset HWADDR

/etc/sysconfig/network-scripts/ifcfg-eth2:
DEVICE=eth2
USERCTL=no
ONBOOT=yes
BOOTPROTO=none
ETHTOOL_OPTS="wol g autoneg on"
# The next 2 lines are required on first
# boot to work around some kudzu stupidity.
HWADDR=F0:4D:A2:0A:E4:2A
unset HWADDR

/etc/sysconfig/network-scripts/ifcfg-bond0:
DEVICE=bond0
TYPE=OVSBond
DEVICETYPE=ovs
USERCTL=no
ONBOOT=yes
BOOTPROTO=none
BOND_IFACES="eth0 eth2"
OVS_BRIDGE=br-ext
OVS_EXTRA="set interface eth0 other-config:enable-vlan-splinters=true -- set 
interface eth2 other-config:enable-vlan-splinters=true"
/etc/sysconfig/network-scripts/ifcfg-br-ext:
DEVICE=br-ext
TYPE=OVSBridge
DEVICETYPE="ovs"
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
BOOTPROTO=static

/etc/sysconfig/network-scripts/ifcfg-mgmt0:
DEVICE=mgmt0
TYPE=OVSIntPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_OPTIONS="vlan_mode=native-untagged"
OVS_EXTRA=""
OVS_BRIDGE=br-ext
IPADDR=
NETMASK=

/etc/sysconfig/network-scripts/ifcfg-ext-vlan-499:
DEVICE=ext-vlan-499
TYPE=OVSPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_OPTIONS="tag=499"
OVS_EXTRA="set interface $DEVICE type=patch -- set interface $DEVICE 
options:peer=br499-ext"
OVS_BRIDGE=br-ext
/etc/sysconfig/network-scripts/ifcfg-br499-ext:
DEVICE=br499-ext
TYPE=OVSIntPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_EXTRA="set interface $DEVICE type=patch -- set interface $DEVICE 
options:peer=ext-vlan-499"
OVS_BRIDGE=br499
/etc/sysconfig/network-scripts/ifcfg-br499:
DEVICE=br499
TYPE=OVSBridge
DEVICETYPE="ovs"
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
/etc/sysconfig/network-scripts/ifcfg-ext-vlan-500:
DEVICE=ext-vlan-500
TYPE=OVSPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_OPTIONS="tag=500"
OVS_EXTRA="set interface $DEVICE type=patch -- set interface $DEVICE 
options:peer=br500-ext"
OVS_BRIDGE=br-ext
/etc/sysconfig/network-scripts/ifcfg-br500-ext:
DEVICE=br500-ext
TYPE=OVSIntPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_EXTRA="set interface $DEVICE type=patch -- set interface $DEVICE 
options:peer=ext-vlan-500"
OVS_BRIDGE=br500
/etc/sysconfig/network-scripts/ifcfg-br500:
DEVICE=br500
TYPE=OVSBridge
DEVICETYPE="ovs"
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
/etc/sysconfig/network-scripts/ifcfg-br-int:
DEVICE=br-int
TYPE=OVSBridge
DEVICETYPE="ovs"
BOOTPROTO=none
ONBOOT=yes
USERCTL=no

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Ignazio Cassano <ignaziocass...@gmail.com>

Re: [Openstack-operators] Openvswitch flat and provider in the same bond

2017-04-19 Thread Kris G. Lindgren
We handle this a different way. I want to look at whether we can redo it so it 
requires less pre-configuration on the host.  However, this is what we 
currently do (and have been doing for ~4 years):
http://www.dorm.org/blog/wp-content/uploads/2015/10/ovs-wiring-676x390.png

From there, inside neutron ml2 you just specify bridge mapping entries of the form 
<physnet>:br<vlan>.

We have the HV vlan being the native vlan.  However, you can simply change the 
config for mgmt0 to have tag=<vlan>.  If I were to change this I would 
think about using vlan networks instead of flat networks for everything VM 
related, so that OVS would create the vlan-specific interfaces, instead of us 
doing it ahead of time and telling neutron to use what we created.


Under redhat we use the following network-scripts to make this happen on boot:
/etc/sysconfig/network-scripts/ifcfg-eth0:
DEVICE=eth0
USERCTL=no
ONBOOT=yes
BOOTPROTO=none
# The next 2 lines are required on first
# boot to work around some kudzu stupidity.
ETHTOOL_OPTS="wol g autoneg on"
HWADDR=F0:4D:A2:0A:E4:26
unset HWADDR

/etc/sysconfig/network-scripts/ifcfg-eth2:
DEVICE=eth2
USERCTL=no
ONBOOT=yes
BOOTPROTO=none
ETHTOOL_OPTS="wol g autoneg on"
# The next 2 lines are required on first
# boot to work around some kudzu stupidity.
HWADDR=F0:4D:A2:0A:E4:2A
unset HWADDR

/etc/sysconfig/network-scripts/ifcfg-bond0:
DEVICE=bond0
TYPE=OVSBond
DEVICETYPE=ovs
USERCTL=no
ONBOOT=yes
BOOTPROTO=none
BOND_IFACES="eth0 eth2"
OVS_BRIDGE=br-ext
OVS_EXTRA="set interface eth0 other-config:enable-vlan-splinters=true -- set 
interface eth2 other-config:enable-vlan-splinters=true"
/etc/sysconfig/network-scripts/ifcfg-br-ext:
DEVICE=br-ext
TYPE=OVSBridge
DEVICETYPE="ovs"
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
BOOTPROTO=static

/etc/sysconfig/network-scripts/ifcfg-mgmt0:
DEVICE=mgmt0
TYPE=OVSIntPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_OPTIONS="vlan_mode=native-untagged"
OVS_EXTRA=""
OVS_BRIDGE=br-ext
IPADDR=
NETMASK=

/etc/sysconfig/network-scripts/ifcfg-ext-vlan-499:
DEVICE=ext-vlan-499
TYPE=OVSPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_OPTIONS="tag=499"
OVS_EXTRA="set interface $DEVICE type=patch -- set interface $DEVICE 
options:peer=br499-ext"
OVS_BRIDGE=br-ext
/etc/sysconfig/network-scripts/ifcfg-br499-ext:
DEVICE=br499-ext
TYPE=OVSIntPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_EXTRA="set interface $DEVICE type=patch -- set interface $DEVICE 
options:peer=ext-vlan-499"
OVS_BRIDGE=br499
/etc/sysconfig/network-scripts/ifcfg-br499:
DEVICE=br499
TYPE=OVSBridge
DEVICETYPE="ovs"
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
/etc/sysconfig/network-scripts/ifcfg-ext-vlan-500:
DEVICE=ext-vlan-500
TYPE=OVSPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_OPTIONS="tag=500"
OVS_EXTRA="set interface $DEVICE type=patch -- set interface $DEVICE 
options:peer=br500-ext"
OVS_BRIDGE=br-ext
/etc/sysconfig/network-scripts/ifcfg-br500-ext:
DEVICE=br500-ext
TYPE=OVSIntPort
DEVICETYPE="ovs"
ONBOOT=yes
USERCTL=no
OVS_EXTRA="set interface $DEVICE type=patch -- set interface $DEVICE 
options:peer=ext-vlan-500"
OVS_BRIDGE=br500
/etc/sysconfig/network-scripts/ifcfg-br500:
DEVICE=br500
TYPE=OVSBridge
DEVICETYPE="ovs"
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
/etc/sysconfig/network-scripts/ifcfg-br-int:
DEVICE=br-int
TYPE=OVSBridge
DEVICETYPE="ovs"
BOOTPROTO=none
ONBOOT=yes
USERCTL=no


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Ignazio Cassano 
Date: Wednesday, April 19, 2017 at 1:06 PM
To: Dan Sneddon 
Cc: OpenStack Operators 
Subject: Re: [Openstack-operators] Openvswitch flat and provider in the same 
bond

Hi Dan, on the physical switch vlan 567 is not the native vlan but is tagged, 
like 555 and 556.
I know I could set 567 as the native vlan to receive it untagged.
But what if I would like more than one flat network?
I am not skilled in networking, but I think only one native vlan can be set on a 
switch port.
Any further solution or suggestion?
Regards
Ignazio

On 19/Apr/2017 20:19, "Dan Sneddon" wrote:
On 04/19/2017 09:02 AM, Ignazio Cassano wrote:
> Dear All, in my openstack Newton installation compute and controller
> nodes have a separate management network nic and a lacp bond0 where
> the provider vlans (555, 556) and the flat vlan (567) are trunked.
> Since I cannot specify the vlan id (567) when I create a flat network, I
> need to know how I can create the bridge for the flat network in openvswitch.
> For provider networks I created a bridge br-ex, added bond0 to that
> bridge and configured the openvswitch agent and ml2 to map br-ex.
> I don't know what I can do for the flat network: must I create another
> bridge? What interface must I add to the bridge for the flat (567) network?
> I configured the same scenario with the linuxbridge mechanism driver and it
> seems easier to do.
> Sorry for my bad english.
> Regards
> 

Re: [Openstack-operators] Help: Liberty installation guide (English).

2017-04-11 Thread Kris G. Lindgren
Hello,

Liberty has been end-of-life for some time now.  Mitaka is also, I believe, EOL'd 
as of today.  Why would you want to deploy Liberty?  If you are deploying 
anything new, you should be deploying Ocata or maybe Newton.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Gaurav Goyal 
Date: Tuesday, April 11, 2017 at 8:51 AM
To: "openstack-operators@lists.openstack.org" 
, 
"openstack-operators-requ...@lists.openstack.org" 

Subject: [Openstack-operators] Help: Liberty installation guide (English).

Dear Openstack Users,

I want to deploy Liberty OpenStack in my environment.
Can you please help to share the link to the Liberty installation guide (English)?

I can find installation guides for Mitaka and later versions, but not for 
Liberty.



Regards
Gaurav Goyal
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Flavors

2017-03-15 Thread Kris G. Lindgren
So how do you bill someone when you have a 24 core, 256GB RAM, 3TB of 
disk machine - and someone creates a 1 core, 512MB RAM, 2.9TB disk flavor?  
Are you going to charge them the same amount as if they created a 24 core, 250GB 
instance with 1TB of disk?  Because both of those flavors make it practically 
impossible to use that hardware for another VM.  Thus, to you they have exactly 
the same cost.

With free-for-all flavor sizes your bin packing goes to shit and you are left 
with inefficiently used hardware.  With free-for-all flavor sizes how can you 
make sure that your large-RAM instances go to SKUs optimized to handle those 
large-RAM VMs?

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Matthew Kaufman 
Date: Wednesday, March 15, 2017 at 5:42 PM
To: "Fox, Kevin M" 
Cc: OpenStack Operators 
Subject: Re: [Openstack-operators] Flavors

Screw the short answer -- that is annoying to read, and it doesn't simplify 
BILLING from a CapEx/OpEx perspective, so please - wtf?
Anyway, Vladimir - I love your question and have always wanted the same thing.

On Wed, Mar 15, 2017 at 6:10 PM, Fox, Kevin M 
> wrote:
I think the really short answer is something like: It greatly simplifies 
scheduling and billing.

From: Vladimir Prokofev [v...@prokofev.me]
Sent: Wednesday, March 15, 2017 2:41 PM
To: OpenStack Operators
Subject: [Openstack-operators] Flavors
A question of curiosity - why do we even need flavors?

I do realise that we need a way to provide instance configuration, but why use 
such a rigid construction? Wouldn't it be more flexible to provide instance 
configuration as a set of parameters(metadata), and if you need some presets - 
well, use a preconfigured set of them as a flavor in your front-end(web/CLI 
client parameters)?

Suppose a commercial customer has an instance with high storage IO load. 
Currently they have only one option - upsize the instance to a flavor that provides 
higher IOPS. But usually the provider has a limited set of flavors for 
purchase, and they upscale everything for a price. So instead of paying only 
for IOPS, customers are pushed to pay for the whole package. This is good from a 
revenue point of view, but bad for the customer's bank account and marketing (i.e. 
product architecture limits).
This applies to every resource - vCPU, RAM, storage, networking, etc - 
everything is controlled by flavor.

This concept has never been questioned anywhere I can search, so I have a 
feeling I'm missing something big here. Maybe other ways are too complicated to 
implement?

So does anyone have any idea - why such a rigid approach as flavors instead of 
something more flexible?

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack] Unable to launch an instance

2017-02-28 Thread Kris G. Lindgren
You should look at the compute node and the scheduler for actual error logs, as 
said below.

But one thing I see is that you are using qcow2 images, but have disk in your 
flavor set to zero.  It seems like the problem is that your image is larger than 
the virtual disk that you are creating via your flavor.  Does booting a VM 
using a default flavor work?
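
For reference, a quick way to compare the two (the image path/ID and flavor name 
below are assumptions for illustration):

qemu-img info /var/lib/glance/images/<image-id>   # check the "virtual size" line
nova flavor-show <flavor-name>                    # check the "disk" field (GB)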


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 2/28/17, 11:26 AM, "Mikhail Medvedev"  wrote:

On Tue, Feb 28, 2017 at 11:49 AM, Amit Kumar  wrote:
> Hi All,
>
> I have installed Openstack Newton using Openstack-Ansible. While creating 
an
> instance, it is failing with following error:
>
> Message: No valid host was found. There are not enough hosts

The "not enough hosts" could be due to any number of reasons. To know
exactly why, check your /var/log/nova/nova-scheduler.log on
controller. The error means that nova scheduler was not able to find
any suitable hosts to boot the VM. It could be for example because you
are using a flavor that does not fit, or because your compute node
appears dead to controller. In any case, nova scheduler log should
make it a bit clearer.

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] pip problems with openstack-ansible deployment

2017-02-17 Thread Kris G. Lindgren
I don't run OSAD, however did you confirm that you can actually download the 
files from your repo server via a curl/wget call, locally and remotely?  I see 
you show the files exist, but I don't see anything confirming that the web server 
is actually serving them.  I have seen things under Apache, at least, that 
prevent the web server from sending the correct info: default config files 
forcing a specific index page, selinux permissions preventing directories from 
being shown.
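
For example, something along these lines, run both from the repo host and from one 
of the failing containers (the URLs are taken from the error quoted further down in 
this thread):

curl -I http://172.21.51.152:8181/os-releases/14.0.7/
curl -O http://172.21.51.152:8181/os-releases/14.0.7/mysql_python-1.2.5-cp27-cp27mu-linux_x86_64.whl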

On Feb 17, 2017, at 1:34 AM, Danil Zhigalin (Europe) wrote:



I noticed one error in my previous explanation. I am running Ubuntu 14.04 LTS, 
not 16.04.



Danil Zhigalin
Technical Consultant
Tel: +49 211 1717 1260
Mob: +49 174 151 8457
danil.zhiga...@dimensiondata.com

Derendorfer Allee 26, Düsseldorf, North Rhine-Westphalia, 40476, Germany.

For more information, please go to 
www.dimensiondata.com

Dimension Data Germany AG & Co.KG, Horexstraße 7, 61352 Bad Homburg
Sitz: Bad Homburg, Amtsgericht Bad Homburg, HRA 3207
Pers. Haftende Ges : Dimension Data Verwaltungs AG, Sitz Bad Homburg.
Amtsgericht Bad Homburg, HRB 6172
Vorstand: Roberto Del Corno
Vors. des Aufsichtsrats: Andrew Coulsen.


-Original Message-
From: Danil Zhigalin (Europe)
Sent: 17 February 2017 09:15
To: 
'openstack-operators@lists.openstack.org'
 
>
Subject: pip problems with openstack-ansible deployment

Hello everyone,

Context:
openstact-ansible: stable/newton
OS: ubuntu 16.04 LTS

I am having trouble completing my deployment due to pip errors.

I have a 2 node setup and one separate deployment node. One of the nodes I am 
using to host all controller, network and storage functions and another as a 
compute. Repo container with the server is also hosted on the controller node. 
I already ran into similar problems as Achi Hamza who already reported pip 
issue on the Thu Nov 17 08:34:14 UTC 2016 in this mailing list.

This is how my openstack_user_config.yml file looks (as in Hamza's case, 
internal and external addresses are the same):

global_overrides:
  internal_lb_vip_address: 172.21.51.152
  external_lb_vip_address: 172.21.51.152
<...>

The recommendations that he got from other users were to set:

openstack_service_publicuri_proto: http
openstack_external_ssl: false
haproxy_ssl: false

in /etc/openstack_deploy/user_variables.yml

These recommendations helped in my case as well and I was able to advance 
further until I faced another pip issue in the same playbook.

My current problem is that none of the containers can install pip packages from 
the repository.

TASK [galera_client : Install pip packages] 
FAILED - RETRYING: TASK: galera_client : Install pip packages (5 retries left).
FAILED - RETRYING: TASK: galera_client : Install pip packages (4 retries left).
FAILED - RETRYING: TASK: galera_client : Install pip packages (3 retries left).
FAILED - RETRYING: TASK: galera_client : Install pip packages (2 retries left).
FAILED - RETRYING: TASK: galera_client : Install pip packages (1 retries left).
fatal: [control1_galera_container-434df170]: FAILED! => {"changed": false, 
"cmd": "/usr/local/bin/pip install -U --constraint 
http://172.21.51.152:8181/os-releases/14.0.7/requirements_absolute_requirements.txt
 MySQL-python", "failed": true, "msg": "stdout: Collecting mysql_python==1.2.5 
(from -c 
http://172.21.51.152:8181/os-releases/14.0.7/requirements_absolute_requirements.txt
 (line 81))\n\n:stderr: Could not find a version that satisfies the requirement 
mysql_python==1.2.5 (from -c 
http://172.21.51.152:8181/os-releases/14.0.7/requirements_absolute_requirements.txt
 (line 81)) (from versions: )\nNo matching distribution found for 
mysql_python==1.2.5 (from -c 
http://172.21.51.152:8181/os-releases/14.0.7/requirements_absolute_requirements.txt
 (line 81))\n"}

I already checked everything related to the HAproxy and tcpdumped on the repo 
side to see what requests are coming when pip install is called.

I found that there was an HTTP GET to the URL 
http://172.21.51.152:8181/os-releases/14.0.7/

I saw that it was forwarded by the proxy to the repo server and that the repo 
server returned index.html from /var/www/repo/os-releases/14.0.7/

ls /var/www/repo/os-releases/14.0.7/ | grep index
index.html
index.html.1
index.html.2

I also checked that MySQL-python is in the repo:

root@control1-repo-container-dad60ff0:~# ls /var/www/repo/os-releases/14.0.7/ | grep mysql_python
mysql_python-1.2.5-cp27-cp27mu-linux_x86_64.whl

But for some reason pip can't figure out it is there.
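
For what it's worth, re-running the failing install with pip's -vvv flag should show 
exactly which index/links URLs pip queries (the command is reproduced from the error 
above; the pip.conf locations are the usual ones and their contents will vary by 
deployment):

/usr/local/bin/pip install -vvv -U --constraint http://172.21.51.152:8181/os-releases/14.0.7/requirements_absolute_requirements.txt MySQL-python
cat /etc/pip.conf ~/.pip/pip.conf 2>/dev/null   # check which index/links pip is pointed at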

I very much appreciate your help in solving this issue.

Best regards,
Danil



Re: [Openstack-operators] query NUMA topology via API

2017-02-01 Thread Kris G. Lindgren
I just confirmed with jaypipes and mriedem that the new placement API will 
provide this information, with NUMA nodes being a "nested resource provider" 
[1].  This work, however, did not make it into Ocata, so this should be 
available in Pike and beyond.

https://blueprints.launchpad.net/nova/+spec/nested-resource-providers

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 2/1/17, 12:52 PM, "Chris Friesen"  wrote:

On 02/01/2017 09:49 AM, Gustavo Randich wrote:
> Hi, is there any way to query via Compute API the NUMA topology of a 
compute
> node, and free ram/cpu of each NUMA cell?

Not that I know of, but might be a useful thing for the admin to have.

Chris


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Encrypted Cinder Volume Deployment

2017-01-23 Thread Kris G. Lindgren
Slightly off topic,

But I remember a discussion involving encrypted volumes and nova(?), and there 
was an issue/bug where nova was using the wrong key – like it got hashed wrong 
and nova was using the badly hashed key/password versus what was configured.


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Joe Topjian 
Date: Monday, January 23, 2017 at 12:41 PM
To: "openstack-operators@lists.openstack.org" 

Subject: [Openstack-operators] Encrypted Cinder Volume Deployment

Hi all,

I'm investigating the options for configuring Cinder with encrypted volumes and 
have a few questions.

The Cinder environment is currently running Kilo which will be upgraded to 
something between M-O later this year. The Kilo release supports the fixed_key 
setting. I see fixed_key is still supported, but has been abstracted into 
Castellan.

Question: If I configure Kilo with a fixed key, will existing volumes still be 
able to work with that same fixed key in an M, N, O release?

Next, fixed_key is discouraged because of it being a single key for all 
tenants. My understanding is that Barbican provides a way for each tenant to 
generate their own key.

Question: If I deploy with fixed_key (either now or in a later release), can I 
move from a master key to Barbican without bricking all existing volumes?

Are there any other issues to be aware of? I've done a bunch of Googling and 
searching on bugs.launchpad.net and am pretty 
satisfied with the current state of support. My intention is to provide users 
with simple native encrypted volume support - not so much supporting uploaded 
volumes, bootable volumes, etc.

But what I want to make sure of is that I'm not in a position where in order to 
upgrade, a bunch of volumes become irrecoverable.

Thanks,
Joe
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [kolla-ansible] [kolla] Am I doing this wrong?

2017-01-20 Thread Kris G. Lindgren
Adding [kolla] tag.


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Kris G. Lindgren" <klindg...@godaddy.com>
Date: Friday, January 20, 2017 at 4:54 PM
To: "openstack-...@lists.openstack.org" <openstack-...@lists.openstack.org>
Cc: "openstack-operators@lists.openstack.org" 
<openstack-operators@lists.openstack.org>
Subject: Re: [kolla-ansible] Am I doing this wrong?

Poke.  Bueller?


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Kris G. Lindgren" <klindg...@godaddy.com>
Date: Tuesday, January 10, 2017 at 5:34 PM
To: "openstack-...@lists.openstack.org" <openstack-...@lists.openstack.org>
Subject: [kolla-ansible] Am I doing this wrong?

Hello Kolla/Kolla-ansible peoples.

I have been trying to take kolla/kolla-ansible and use it to start moving our 
existing openstack deployment into containers, and at the same time trying to 
fix some of the problems that we created with our previous deployment work 
(everything was in puppet), where we had puppet doing *everything*, which 
eventually created a system that effectively performed actions at a distance, 
as we were never really 100% sure what puppet was going to do when we ran it, 
even with NOOP mode enabled.  So, taking the example of building and deploying 
glance via kolla-ansible, I am running into some problems/concerns and wanted 
to reach out to make sure that I am not missing something.

Things that I am noticing:
 * I need to define a number of servers in my inventory outside of the specific 
servers that I want to perform actions against.  I need to define the groups 
baremetal, rabbitmq, memcached, and control (in addition to the glance-specific 
groups); most of these seem to be gathering information for config? (Baremetal 
was needed solely to try to run the bootstrap play.)  Running a change 
specifically against "glance" causes fact gathering on a number of other 
servers, not specifically where glance is running?  My concern here is that I 
want to be able to run kolla-ansible against a specific service and know that 
only those servers are being logged into.

* I want to run a dry-run only, being able to see what will happen before it 
happens, not during; during makes it really hard to see what will happen until 
it happens. Also supporting  `ansible --diff` would really help in 
understanding what will be changed (before it happens).  Ideally, this wouldn’t 
be 100% needed.  But the ability to figure out what a run would *ACTUALLY* do 
on a box is what I was hoping to see.

* Database tasks are run on every deploy and the status of the change-DB-permissions 
task always reports as changed? Even when nothing happens, which makes you wonder 
"what changed"?  Seems like this is because the task either reports a 0 or a 1, 
whereas there are really 3 states: did nothing, updated something, failed 
to do what was required.  Also, can someone tell me why the DB stuff is done on 
a deployment task?  Seems like the db checks/migration work should only be done 
on an upgrade or a bootstrap?

* Database services (at least the ones we have) are not managed by our team, so we 
don't want kolla-ansible touching those (since it won't be able to). Is there no way 
to mark the DB as "externally managed"?  I.e. we don't have permissions to create 
databases or add users, but we do have all other permissions on the databases that 
are created, so normal db-manage tooling works.

* Maintenance-level operations; there doesn't seem to be anything built in to say 
'take a server out of a production state, deploy to it, test it, put it back into 
production'.  Seems like if kolla-ansible is doing haproxy for the APIs, it should 
be managing this?  Or an extension point to allow us to run our own 
maintenance/testing scripts?

* Config must come from kolla-ansible and generated templates.  I know we have 
a patch up for externally managed service configuration.  But if we aren't 
supposed to use kolla-ansible for generating configs (see below), why can't we 
override this piece?

Hard to determine what kolla-ansible *should* be used for:

* Certain parts of it are 'reference only' (the config tasks), some are not 
recommended
  to be used at all (bootstrap?); what is the expected parts of kolla-ansible 
people are
  actually using (and not just as a reference point); if parts of kolla-ansible 
are just
  *reference only* then might as well be really upfront about it and tell 
people how to
  disable/replace those reference pieces?

* Seems like this will cause everyone who needs to make tweaks to fork or 
create an "overlay" to override playbooks/tasks with specific functions?

Other questions:

Is kolla-ansible's design philosophy that every deployment is an upgrade?  Or 
that every deployment should include all the base-level bootstrap tests?

Re: [Openstack-operators] [kolla-ansible] Am I doing this wrong?

2017-01-20 Thread Kris G. Lindgren
Poke.  Bueller?


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Kris G. Lindgren" <klindg...@godaddy.com>
Date: Tuesday, January 10, 2017 at 5:34 PM
To: "openstack-...@lists.openstack.org" <openstack-...@lists.openstack.org>
Subject: [kolla-ansible] Am I doing this wrong?

Hello Kolla/Kolla-ansible peoples.

I have been trying to take kolla/kolla-ansible and use it to start moving our 
existing openstack deployment into containers, and at the same time trying to 
fix some of the problems that we created with our previous deployment work 
(everything was in puppet), where we had puppet doing *everything*, which 
eventually created a system that effectively performed actions at a distance, 
as we were never really 100% sure what puppet was going to do when we ran it, 
even with NOOP mode enabled.  So, taking the example of building and deploying 
glance via kolla-ansible, I am running into some problems/concerns and wanted 
to reach out to make sure that I am not missing something.

Things that I am noticing:
 * I need to define a number of servers in my inventory outside of the specific 
servers that I want to perform actions against.  I need to define the groups 
baremetal, rabbitmq, memcached, and control (in addition to the glance-specific 
groups); most of these seem to be gathering information for config? (Baremetal 
was needed solely to try to run the bootstrap play.)  Running a change 
specifically against "glance" causes fact gathering on a number of other 
servers, not specifically where glance is running?  My concern here is that I 
want to be able to run kolla-ansible against a specific service and know that 
only those servers are being logged into.

* I want to run a dry-run only, being able to see what will happen before it 
happens, not during; during makes it really hard to see what will happen until 
it happens. Also supporting  `ansible --diff` would really help in 
understanding what will be changed (before it happens).  Ideally, this wouldn’t 
be 100% needed.  But the ability to figure out what a run would *ACTUALLY* do 
on a box is what I was hoping to see.

* Database tasks are run on every deploy and the status of the change-DB-permissions 
task always reports as changed? Even when nothing happens, which makes you wonder 
"what changed"?  Seems like this is because the task either reports a 0 or a 1, 
whereas there are really 3 states: did nothing, updated something, failed 
to do what was required.  Also, can someone tell me why the DB stuff is done on 
a deployment task?  Seems like the db checks/migration work should only be done 
on an upgrade or a bootstrap?

* Database services (at least the ones we have) are not managed by our team, so we 
don't want kolla-ansible touching those (since it won't be able to). Is there no way 
to mark the DB as "externally managed"?  I.e. we don't have permissions to create 
databases or add users, but we do have all other permissions on the databases that 
are created, so normal db-manage tooling works.

* Maintenance-level operations; there doesn't seem to be anything built in to say 
'take a server out of a production state, deploy to it, test it, put it back into 
production'.  Seems like if kolla-ansible is doing haproxy for the APIs, it should 
be managing this?  Or an extension point to allow us to run our own 
maintenance/testing scripts?

* Config must come from kolla-ansible and generated templates.  I know we have 
a patch up for externally managed service configuration.  But if we aren't 
supposed to use kolla-ansible for generating configs (see below), why can't we 
override this piece?

Hard to determine what kolla-ansible *should* be used for:

* Certain parts of it are 'reference only' (the config tasks), some are not 
recommended
  to be used at all (bootstrap?); what is the expected parts of kolla-ansible 
people are
  actually using (and not just as a reference point); if parts of kolla-ansible 
are just
  *reference only* then might as well be really upfront about it and tell 
people how to
  disable/replace those reference pieces?

* Seems like this will cause everyone who needs to make tweaks to fork or 
create an "overlay" to override playbooks/tasks with specific functions?

Other questions:

Is kolla-ansible's design philosophy that every deployment is an upgrade?  Or 
that every deployment should include all the base-level bootstrap tests?

Because it seems to me that you have a required set of tasks that should only 
be done once (boot strap).  Another set of tasks that should be done for day to 
day care/feeding: service restarts, config changes, updates to code (new 
container deployments), package updates (new docker container deployment).  And 
a final set of tasks for upgrades where you will need to do things like db 
migrations and other special upgrade things.  It also seems like the day to day 
care and feeding tasks shou

Re: [Openstack-operators] [Glance] [Nova] Multiple backends / Qcow Derived Images

2017-01-04 Thread Kris G. Lindgren
We used raw-backed qcows.  We even have/had an ansible playbook that would 
pre-stage the raw backing files on compute nodes.

This should be controlled by the force_raw_images=True parameter (which also 
happens to be the default).  I don't think it's possible to do this via 
ephemeral volumes.

Re: your image placement, there are scheduler options that cover image/flavor 
metadata and dedicating hosts to those specific images/flavors. 
http://docs.openstack.org/kilo/config-reference/content/section_compute-scheduler.html
 - look for the ImagePropertiesFilter and the ComputeCapabilitiesFilter.  You 
may need to extend the image properties filter to include random key matching: 
https://github.com/openstack/nova/blob/master/nova/scheduler/filters/image_props_filter.py
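
One concrete sketch for the "imageX on ComputeX" part of the question is the 
IsolatedHostsFilter from that same scheduler doc (the image UUIDs and host names 
below are placeholders, the filter has to be added to scheduler_default_filters, 
and note it isolates the whole set of images to the whole set of hosts rather 
than a per-image mapping):

# nova.conf on the scheduler/controller
[DEFAULT]
scheduler_default_filters = ...,IsolatedHostsFilter
isolated_images = <imageX-uuid>,<imageY-uuid>
isolated_hosts = computeX,computeY
restrict_isolated_hosts_to_isolated_images = True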

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Adam Lawson 
Date: Wednesday, January 4, 2017 at 2:48 PM
To: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] [Glance] [Nova] Multiple backends / Qcow 
Derived Images

Just a friendly bump. To clarify, the ideas being tossed around are to host 
QCOW images on each Compute node so the provisioning is faster (i.e. less 
dependency on network connectivity to a shared back-end). I need to know if 
this is possible or not. So far, I've seen nothing that suggests that it is but 
i want to confirm that.

Also, derived images is a QCOW thing[1], I'm wondering if creating these 
dynamically is supported by Nova and/or Glance.

[1] 
http://nairobi-embedded.org/manipulating_disk_images_with_qemu-img.html#creating-derived-images

//adam


Adam Lawson

Principal Architect, CEO
Office: +1-916-794-5706

On Tue, Jan 3, 2017 at 5:01 PM, Adam Lawson 
> wrote:
Greetings fellow Stackers!

Question re Glance/Nova: Does Glance and/or Nova support attaching volumes built 
with derived images (created from a master registered with Glance (ref qcow))?

Glance-only question: Can Glance be configured to place images on separate 
hosts (i.e. imageX on ComputeX and imageY on ComputeY)?

//adam


Adam Lawson

Principal Architect, CEO
Office: +1-916-794-5706

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Analogs of EC2 dedicated instances & dedicated hosts?

2016-12-19 Thread Kris G. Lindgren
Not aware of an easy answer for #1, without creating a flavor or image with 
metadata on it and adding specific hosts to a host_aggregate that has the same 
metadata on it.
http://docs.openstack.org/kilo/config-reference/content/section_compute-scheduler.html
 - Look at the IsolatedHostsFilter or aggregate_instance_extra_specs and the 
config example for specifying a compute host with SSDs.
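
A rough sketch of the aggregate_instance_extra_specs approach (the aggregate, host 
and flavor names are placeholders, and AggregateInstanceExtraSpecsFilter needs to 
be in scheduler_default_filters):

nova aggregate-create tenantX-dedicated
nova aggregate-add-host tenantX-dedicated compute-17
nova aggregate-set-metadata tenantX-dedicated dedicated=tenantX
nova flavor-key m1.large.tenantX set aggregate_instance_extra_specs:dedicated=tenantX

Then make that flavor private to the tenant so only their instances land on those 
hosts.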


#2 just sounds like affinity/anti-affinity rules? Combined with #1.

ServerGroupAffinityFilter
The ServerGroupAffinityFilter ensures that an instance is scheduled on to a 
host from a set of group hosts. To take advantage of this filter, the requester 
must create a server group with an 'affinity' policy, and pass a 
scheduler hint, using 'group' as the key and the server group UUID as 
the value. Using the nova command-line tool, use 
the --hint flag. For example:
$ nova server-group-create --policy affinity group-1
$ nova boot --image IMAGE_ID --flavor 1 --hint group=SERVER_GROUP_UUID server-1
ServerGroupAntiAffinityFilter
The ServerGroupAntiAffinityFilter ensures that each instance in a group is on a 
different host. To take advantage of this filter, the requester must create a 
server group with an 'anti-affinity' policy, and pass a scheduler 
hint, using 'group' as the key and the server group UUID as the value. Using 
the nova command-line tool, use the --hint flag. For 
example:
$ nova server-group-create --policy anti-affinity group-1
$ nova boot --image IMAGE_ID --flavor 1 --hint group=SERVER_GROUP_UUID server-1

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Kimball, Conrad" 
Date: Monday, December 19, 2016 at 3:24 PM
To: "openstack-operators@lists.openstack.org" 

Subject: [Openstack-operators] Analogs of EC2 dedicated instances & dedicated 
hosts?

Hi All,

What mechanisms does OpenStack provide that would enable me to implement 
behaviors analogous to AWS EC2 dedicated instances and dedicated hosts?


· Dedicated instances:  an OpenStack tenant can deploy VM instances 
that are guaranteed to not share a compute host with any other tenant (for 
example, as the tenant I want physical segregation of my compute).


· Dedicated hosts: goes beyond dedicated instances, allowing an 
OpenStack tenant to explicitly place only specific VM instances onto the same 
compute host (for example, as the tenant I want to place VMs foo and bar onto 
the same compute host to share a software license that is licensed per host).

Conrad Kimball
Associate Technical Fellow
Chief Architect, Enterprise Cloud Services
Engineering, Operations & Technology / Information Technology / Core 
Infrastructure Engineering
conrad.kimb...@boeing.com
P.O. Box 3707, Mail Code 7M-TE
Seattle, WA  98124-2207
Bellevue 33-11 bldg, office 3A6-3.9
Mobile:  425-591-7802

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Change 'swap' in a flavor template

2016-12-16 Thread Kris G. Lindgren
FYI, you can provide the flavor ID to use during a flavor create.

So if you wanted to change flavor 5 you can delete it and recreate flavor 5 
with your changes.
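
A quick sketch of that delete/recreate flow (the specs below are illustrative; copy 
the existing flavor's real values from flavor-show first):

nova flavor-show 5
nova flavor-delete 5
nova flavor-create --swap 2048 m1.small 5 2048 20 1   # name, id, ram MB, disk GB, vcpus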
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 12/16/16, 5:47 AM, "William Josefsson"  wrote:

thx George, yes I tried that, but it will replace the ID of an
existing flavor, with a uuid. I didn't want to introduce this change
as it may potentially affect something else, so I went ahead and
updated nova.instance_types, however I agree this ain't ideal. I'm not
sure why the flavor set/update operation doesn't accept updating swap or
other parameters. I assume it may in the worst case cause
inconsistencies.. thx will

On Thu, Dec 15, 2016 at 11:58 PM, George Mihaiescu  
wrote:
> Can you not update the flavour in dashboard?
>
>> On Dec 15, 2016, at 09:34, William Josefsson 
 wrote:
>>
>>> On Thu, Dec 15, 2016 at 9:40 PM, Mikhail Medvedev  
wrote:
>>>
>>> I could not figure out how to set swap on existing flavor fast enough,
>>> so I initially edited nova db directly. There were no side effects in
>>> doing so in Icehouse. I see no reason it would not work in Liberty.
>>>
 Can anyone please advice on how to go about changing the 'swap'
 setting for an existing flavor? Last resort is to add additional
 flavors with swap values, but that would be very ugly. :(
>>>
>>> For a "nicer" way I ended up recreating flavor I needed to edit:
>>> delete old one, create new one with the same id and swap enabled. I
>>> hope there is a better way, but editing db directly, or recreating
>>> flavor was sufficient for me so far.
>>
>> Thanks Mikhail. Appreciate the hint. I thought of deleting the flavor,
>> and add again but was concerned about if that would affect current
>> instances with that flavor-id in use? Maybe the easiest is to just go
>> ahead and update the 'nova' table. I just was concerned that there
>> would be existing relationships that would brake upon e.g. deleting
>> existing instances, however.. I think I should go ahead and try the
>> db-update way first. thanks! will
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] feedback on pymysql

2016-11-17 Thread Kris G. Lindgren
In one of our cells we are running nova with pymysql.  It actually helped when 
we had some db connectivity issues that we were trying to figure out.  This is 
because it wouldn't block the entire green thread for x seconds while it waits 
for the db to time out.  For us what would happen is db connectivity would be 
impacted, we would log an error, then shortly after nova would log a bunch of 
rabbitmq heartbeat failures.  Switching to pymysql stopped the heartbeat 
failures from happening/being logged.
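
For anyone making the same switch, the change itself is just the driver prefix in 
each service's [database] connection string (credentials/host below are 
placeholders):

[database]
# old MySQL-Python driver:
# connection = mysql://nova:PASSWORD@dbhost/nova
# pymysql driver:
connection = mysql+pymysql://nova:PASSWORD@dbhost/nova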


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Matt Fischer 
Date: Thursday, November 17, 2016 at 3:11 PM
To: "openstack-operators@lists.openstack.org" 

Subject: [Openstack-operators] feedback on pymysql

As a part of our upgrades to Newton we are transitioning our services to use 
pymysql rather than the deprecated MySQL-Python [1]. I believe pymysql has been 
the default in devstack and the gate for some time now and that MySQL-Python is 
essentially untested and not updated, hence our desire to switch.

devstack is one thing, but I'm curious if anyone has experience operating in 
production with this, especially if there are issues. I've not seen anything in 
testing but if anyone else has I'd love to know positive or negative.

[1] https://wiki.openstack.org/wiki/PyMySQL_evaluation#MySQL-Connector-Python
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [openstack-ansible] pip issues

2016-11-17 Thread Kris G. Lindgren
Have you tried tcpdumping on the deployment node and trying to connect in from 
the container to see if the packets even get there?

If the packets do get there, have you looked at the webserver logs to see if 
they provide any insight?


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Achi Hamza 
Date: Thursday, November 17, 2016 at 8:07 AM
To: Jesse Pretorius 
Cc: "OpenStack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] [openstack-dev] [openstack-ansible] pip 
issues

It also works on the Public IP of the repo:
root@maas:/opt/openstack-ansible/playbooks# ansible hosts -m shell -a "curl http://172.16.1.222:8181/os-releases/"
Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e 
@/etc/openstack_deploy/user_variables.yml "
node01 | SUCCESS | rc=0 >>

Index of /os-releases/

Index of /os-releases/../
14.0.1/
16-Nov-2016 14:47   -

  % Total% Received % Xferd  Average Speed   TimeTime Time  
Current
 Dload  Upload   Total   SpentLeft  Speed
100   2870   2870 0   381k  0 --:--:-- --:--:-- --:--:--  280k


do you have an explanation to this Jesse ?

Thank you

On 17 November 2016 at 15:53, Achi Hamza 
> wrote:
It also works on the internal interface of the containers, i can fetch from the 
repo container to the host on the internal IP of the container:

root@maas:/opt/openstack-ansible/playbooks# ansible hosts -m shell -a "curl http://10.0.3.92:8181/os-releases/"
Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e 
@/etc/openstack_deploy/user_variables.yml "
node01 | SUCCESS | rc=0 >>

Index of /os-releases/

Index of /os-releases/../
14.0.1/
16-Nov-2016 14:47   -

  % Total% Received % Xferd  Average Speed   TimeTime Time  
Current
 Dload  Upload   Total   SpentLeft  Speed
100   2870   2870 0   405k  0 --:--:-- --:--:-- --:--:--  280k


On 17 November 2016 at 15:26, Achi Hamza 
> wrote:
It works on the repo itself:

root@maas:/opt/openstack-ansible/playbooks# ansible repo_all -m shell -a "curl http://localhost:8181/os-releases/"
Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e 
@/etc/openstack_deploy/user_variables.yml "
node01_repo_container-82b4e1f6 | SUCCESS | rc=0 >>

Index of /os-releases/

Index of /os-releases/../
14.0.1/
16-Nov-2016 14:47   -

  % Total% Received % Xferd  Average Speed   TimeTime Time  
Current
 Dload  Upload   Total   SpentLeft  Speed
100   2870   2870 0  59878  0 --:--:-- --:--:-- --:--:-- 71750


On 17 November 2016 at 15:22, Jesse Pretorius 
> wrote:


From: Achi Hamza >
Date: Thursday, November 17, 2016 at 1:57 PM
To: Jesse Pretorius 
>, 
"OpenStack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] [openstack-dev] [openstack-ansible] pip 
issues

Thank you Jesse, but these iptables rules are just applied on the deployment 
node not the host nodes. do i have to omit these rules even on the deployment 
node ?

Thank you

Ah, then that’s a red herring. As long as your hosts can reach the internet 
through it, then you’re good on that front.

Let’s go back to verifying access to the repo – try checking access from the 
repo server to itself:

ansible repo_all -m uri -a "url=http://localhost:8181/os-releases/"

or

ansible repo_all -m shell -a "curl http://localhost:8181/os-releases/"



Rackspace Limited is a company registered in England & Wales (company 
registered number 03897010) whose registered office is at 5 Millington Road, 
Hyde Park Hayes, Middlesex UB3 4AZ. Rackspace Limited privacy policy can be 
viewed at 
www.rackspace.co.uk/legal/privacy-policy
 - This e-mail message may contain confidential or privileged information 
intended for the recipient. Any dissemination, distribution or copying of the 
enclosed material is prohibited. If you receive this transmission in error, 
please notify us immediately by e-mail at 
ab...@rackspace.com and delete the original 
message. Your cooperation is appreciated.

___

Re: [Openstack-operators] sync power states and externally shut off VMs

2016-11-16 Thread Kris G. Lindgren
As a follow up on this.  You can configure the host to shutdown and start up in 
a way that all the VM’s are shutdown and started up automatically.

To do this you need to do a few things:

1.) Ensure that nova-compute is configured to stop before 
libvirt-guests.  Make sure libvirt-guests is enabled.

2.) Allow libvirt-guests to shut down the VMs that are running (I recommend 
avoiding suspending the VMs, as this will lead to in-VM clock sync issues):  
ON_SHUTDOWN=shutdown

3.) Ensure that libvirt-guests is configured with: ON_BOOT=ignore

4.) Set [DEFAULT] resume_guests_state_on_host_boot=true in nova.conf
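
Condensed, the relevant settings look like this (the file paths are the usual 
RHEL/CentOS ones and the unit-ordering mechanism depends on your packaging, so 
treat the details as assumptions):

/etc/sysconfig/libvirt-guests:
ON_BOOT=ignore
ON_SHUTDOWN=shutdown

/etc/nova/nova.conf:
[DEFAULT]
resume_guests_state_on_host_boot = true

systemctl enable libvirt-guests
# plus unit ordering so nova-compute stops before libvirt-guests on shutdown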

This config will gracefully shut down the running VMs via a normal host 
shutdown, preserving the running state in nova.  This will then cause nova to 
bring the VMs online when nova-compute starts on host start-up.  This also 
works with ungraceful power downs.  The key is that nova needs to be the one 
that starts the VMs, because libvirt-guests will not be able to successfully 
start them: neutron needs to plug the vifs for the VMs.  As long as the state 
of the VM is "running" inside the DB, this config will work.

NB: if you do chassis swaps and for some reason the OS comes up in a config 
that no longer works, all of the VMs will go to error.  You will need to fix 
whatever issue prevented the VMs from starting, then manually reset the state 
and start the VMs.
Some examples that we have seen: VT-x extensions disabled on the new server; 
replacement server has different CPUs and no longer matches the NUMA config, 
or the new processors do not have the same CPU extensions as the old 
processors.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Adam Thurlow 
Date: Wednesday, November 16, 2016 at 9:58 PM
To: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] sync power states and externally shut off VMs


If you are interested in manually mucking around with local virsh domain 
states, and you don't want nova to interfere, you can just stop the local 
nova-compute service and it won't be doing any syncing. Once you get those 
instances back into their desired state, you can restart nova-compute and it 
won't be any wiser.

You can obviously shoot yourself in the foot using this method, but I can 
understand in some cases that large hammers and manual virsh commands are 
necessary.
Cheers!
On 2016-11-16 17:32, Mohammed Naser wrote:
Typically, you should not be managing your VMs by virsh. After a power outage, 
I would recommend sending a start API call to instances that are housed on that 
specific hypervisor

Sent from my iPhone

On Nov 16, 2016, at 4:26 PM, Gustavo Randich 
> wrote:
When a VM is shutdown without using nova API (kvm process down, libvirt failed 
to start instance on host boot, etc.), Openstack "freezes" the shutdown power 
state in the DB, and then re-applies it if the VM is not started via API, e.g.:

# virsh shutdown 

[ sync power states -> stop instance via API ], because hypervisor rules 
("power_state is always updated from hypervisor to db")

# virsh startup 

[ sync power states -> stop instance via API ], because database rules


I understand this behaviour is "by design", but I'm confused about the 
asymmetry: if VM is shutdown without using nova API, should I not be able to 
start it up again without nova API?

This is a common scenario in power outages or failures external to Openstack, 
when VMs fail to start and we need to start them up again using virsh.

Thanks!

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators




___

OpenStack-operators mailing list

OpenStack-operators@lists.openstack.org

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Audit Logging - Interested? What's missing?

2016-11-16 Thread Kris G. Lindgren
I need to do a deeper dive on audit logging. 

However, we have a requirement that when someone changes a security group, we 
log what the previous security group was, what the new security group is, and 
who changed it.  I don't know if this is specific to our crazy security people 
or if other security people want to have this.  I am sure I can think of others.


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 11/16/16, 3:29 PM, "Tom Fifield"  wrote:

Hi Ops,

Was chatting with Department of Defense in Australia the other day, and 
one of their pain points is Audit Logging. Some bits of OpenStack just 
don't leave enough information for proper audit. So, thought it might be 
a good idea to gather people who are interested to brainstorm how to get 
it to a good level for all :)

Does your cloud need good audit logging? What do you wish was there at 
the moment, but isn't?


Regards,


Tom

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Managing quota for Nova local storage?

2016-11-11 Thread Kris G. Lindgren
I don’t mean to hijack your thread a little bit but since you mentioned KSM.  I 
realize that you guys run private cloud, so you don’t have to worry about bad 
actors getting a server from you and doing malicious things with it.  But do 
you have any concerns about the recent research [1] that uses Rowhammer + ksm + 
transparent hugepages + kvm  to change the memory of collocated VM’s?  The 
research showed that they were able to successfully target memory inside other 
VM’s to do things like modify authorized_keys in memory in such a way that they 
could successfully login with their own key.  They also performed other attacks 
like manipulating the update URL for Ubuntu vm’s and modifying the gpg key (in 
memory), so that when an update is performed they install packages from a 
malicious source.  On the SSH attack, they showed that out of 300 attempts they 
were able to successfully change the in memory representation of 
authorized_keys in another vm 252 (84.1%) of the time, most of the time within 
6 minutes, with a max time of 12.6 minutes.

The attack mainly works because of KSM + transparent hugepages.  You obviously 
need Rowhammer-vulnerable memory chips, but let's face it – with the majority 
of them susceptible – you most likely have vulnerable RAM somewhere in the 
machines in your datacenter.

1 - 
https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_razavi.pdf
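
For anyone wanting to hedge against this, a rough mitigation sketch on the 
hypervisor itself (plain kernel interfaces, nothing OpenStack-specific; verify 
the paths against your distro):

# stop KSM and unmerge any pages that were already shared
echo 2 > /sys/kernel/mm/ksm/run
# disable transparent hugepages
echo never > /sys/kernel/mm/transparent_hugepage/enabled
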
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Van Leeuwen, Robert" <rovanleeu...@ebay.com>
Date: Friday, November 11, 2016 at 12:10 AM
To: "Kris G. Lindgren" <klindg...@godaddy.com>, Edmund Rhudy 
<erh...@bloomberg.net>, "war...@wangspeed.com" <war...@wangspeed.com>
Cc: "openstack-operators@lists.openstack.org" 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

Thx for your stories,

I think we are now all doing pretty much the same thing to get around the issue 
but it still looks like a very useful feature.

So to share what we (eBay-ECG) are doing:
We also started out with scaling the flavor disksize to either memory or cpu. 
(so e.g. large disk == large memory)
But our users started asking for flavors with quite different specs.
Not being able to give those would be hugely inefficient.

So now we started giving flavors to specific tenants instead of making them 
public (and let the quota’s sort it out)
e.g. a flavor with 8 cores, 12G and 1TB of local storage will only be available 
for the tenants that really need it.

Looking at our hypervisor stats we either run out of memory or disk before cpu 
cycles so not having a tunable on disk is inconvenient.
Our latest spec hypervisors have 768GB and we run KSM so we will probably run 
out of DISK first there.
We run SSD-only on local storage so that space in the flavor is real $$$.

We started to run on zfs with compression on our latest config/iteration and 
that seems to alleviate the pain a bit.
It is a bit early to tell exactly but it seems to run stable and the 
compression factor will be around 2.0

P.S. I noticed my search for blueprints was not good enough, so I closed mine 
and subscribed to the one that was already there:
https://blueprints.launchpad.net/nova/+spec/nova-disk-quota-tracking

Robert van Leeuwen

From: "Kris G. Lindgren" <klindg...@godaddy.com>
Date: Thursday, November 10, 2016 at 5:18 PM
To: Edmund Rhudy <erh...@bloomberg.net>, "war...@wangspeed.com" 
<war...@wangspeed.com>, Robert Van Leeuwen <rovanleeu...@ebay.com>
Cc: "openstack-operators@lists.openstack.org" 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

This is what we have done as well.

We made our flavors stackable, starting with our average deployed flavor size 
and making everything a multiple of that.  I.e., if our average deployed flavor 
is 8GB RAM / 120GB of disk, our larger flavors are multiples of that, so if 
16GB / 240GB of disk is the average, the next flavor up may be 32GB / 480GB of 
disk.  From there it's easy to say that with 256GB of RAM we will average ~30 
VMs, which means we need ~3.6TB of local storage per node, assuming that you 
don't overallocate disk or RAM.  In practice, though, you can track a running 
average of the amount of disk space actually consumed, work towards that plus a 
bit of a buffer, and run with a disk oversubscription.

We currently have no desire to remove local storage; we want the root disks to 
be on local storage.  That being said, in the future we will most likely give 
smaller root disks and, if people need more space, ask them to provision an RBD 
volume through Cinder.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy


Re: [Openstack-operators] Managing quota for Nova local storage?

2016-11-10 Thread Kris G. Lindgren
This is what we have done as well.

We made our flavors stackable, starting with our average deployed flavor size 
and making everything a multiple of that.  I.e., if our average deployed flavor 
is 8GB RAM / 120GB of disk, our larger flavors are multiples of that, so if 
16GB / 240GB of disk is the average, the next flavor up may be 32GB / 480GB of 
disk.  From there it's easy to say that with 256GB of RAM we will average ~30 
VMs, which means we need ~3.6TB of local storage per node, assuming that you 
don't overallocate disk or RAM.  In practice, though, you can track a running 
average of the amount of disk space actually consumed, work towards that plus a 
bit of a buffer, and run with a disk oversubscription.
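
A rough sketch of that arithmetic (assuming ~16GB reserved for the hypervisor 
itself and no RAM or disk overcommit):

# ~30 VMs per 256GB hypervisor at 8GB of RAM per flavor
echo $(( (256 - 16) / 8 ))
# ~3600GB (~3.6TB) of local disk needed at 120GB of disk per flavor
echo $(( 30 * 120 ))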

We currently have no desire to remove local storage; we want the root disks to 
be on local storage.  That being said, in the future we will most likely give 
smaller root disks and, if people need more space, ask them to provision an RBD 
volume through Cinder.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Edmund Rhudy (BLOOMBERG/ 120 PARK)" 
Reply-To: Edmund Rhudy 
Date: Thursday, November 10, 2016 at 8:47 AM
To: "war...@wangspeed.com" , "rovanleeu...@ebay.com" 

Cc: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

We didn't come up with one. RAM on our HVs is the limiting factor since we 
don't run with memory overcommit, so the ability of people to run an HV out of 
disk space ended up being moot. ¯\_(ツ)_/¯

Long term we would like to switch to being exclusively RBD-backed and get rid 
of local storage entirely, but that is Distant Future at best.

From: rovanleeu...@ebay.com
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?
Hi,

Found this thread in the archive so a bit of a late reaction.
We are hitting the same thing so I created a blueprint:
https://blueprints.launchpad.net/nova/+spec/nova-local-storage-quota

If you guys already found a nice solution to this problem I’d like to hear it :)

Robert van Leeuwen
eBay - ECG

From: Warren Wang 
Date: Wednesday, February 17, 2016 at 8:00 PM
To: Ned Rhudy 
Cc: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

We are in the same boat. We can't get rid of ephemeral because of its speed and 
independence. I get it, but it makes management of all these tiny pools a 
scheduling and capacity nightmare.
Warren @ Walmart

On Wed, Feb 17, 2016 at 1:50 PM, Ned Rhudy (BLOOMBERG/ 731 LEX) 
> wrote:
The subject says it all - does anyone know of a method by which quota can be 
enforced on storage provisioned via Nova rather than Cinder? Googling around 
appears to indicate that this is not possible out of the box (e.g., 
https://ask.openstack.org/en/question/8518/disk-quota-for-projects/).

The rationale is we offer two types of storage, RBD that goes via Cinder and 
LVM that goes directly via the libvirt driver in Nova. Users know they can 
escape the constraints of their volume quotas by using the LVM-backed 
instances, which were designed to provide a fast-but-unreliable RAID 0-backed 
alternative to slower-but-reliable RBD volumes. Eventually users will hit their 
max quota in some other dimension (CPU or memory), but we'd like to be able to 
limit based directly on how much local storage is used in a tenancy.

Does anyone have a solution they've already built to handle this scenario? We 
have a few ideas already for things we could do, but maybe somebody's already 
come up with something. (Social engineering on our user base by occasionally 
destroying a random RAID 0 to remind people of their unsafety, while tempting, 
is probably not a viable candidate solution.)
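
For what it's worth, we can at least watch how much local disk each hypervisor 
has committed; a hedged sketch with the standard nova CLI:

nova hypervisor-list
nova hypervisor-show <hypervisor id> | grep -E 'local_gb|free_disk_gb'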

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] old Juno documentation

2016-11-07 Thread Kris G. Lindgren
I don't have an answer for you; however, I have noticed this EXACT same thing 
happening with the API documentation.  The documentation gets replaced with the 
latest version, and it's impossible to point people to the documentation for 
the older version of the APIs that we are actually running.

I.e., http://developer.openstack.org/api-ref/compute/ is not valid for Liberty, 
and I have no way of specifying a microversion or even getting the v2 API 
documentation (the v2 link points back to this link).

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Cristina Aiftimiei 
Date: Monday, November 7, 2016 at 3:16 PM
To: "openstack-operators@lists.openstack.org" 

Subject: [Openstack-operators] old Juno documentation

Dear all,
over the last few days I have been looking for an old link I had on how to 
configure "Provider networks with Open vSwitch" in the Juno release.
I had the link 
http://docs.openstack.org/kilo/networking-guide/scenario_provider_ovs.html, 
which now gives a nice "404".
Going to the Juno documentation - http://docs.openstack.org/juno/ - one can 
clearly see that under the "Networking Guide" there is a link pointing to the 
"wrong" release - http://docs.openstack.org/kilo/networking-guide/ - from where 
one can reach:
http://docs.openstack.org/kilo/networking-guide/scenario_provider_ovs.html
None of the documents under the "Operations and Administration Guides" point 
to the "juno" version anymore.
As we still have a testbed running the Juno version, I would like to ask 
whether you know of a place where "obsoleted" documentation is moved. As the 
installation guides for this version are still available, I would expect at 
least a trace of the Operations & Administration ones to be kept.
If only as a vintage collection, I would like to be able to read it once more ...

Thank you very much for any information that you can provide,
Cristina


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Neutron Nova Notification and Virtual Interface plugging events

2016-10-31 Thread Kris G. Lindgren
The reason behind this was that in the past it took neutron a long time to plug 
a vif (especially true on large networks with many boot requests happening at 
the same time)... IIRC, it was updating the DHCP server that was the slowest 
operation.  Before the nova/neutron notifications, nova would start a VM and 
assume that the vif plug operation was instant.  This could result in VMs 
booting up without networking.

So now nova will create the VM, pause it, wait for neutron to say it has 
finished plugging the vif, and then unpause the VM, ensuring that all VMs have 
networking by the time the OS boots.

In nova, if you set vif_plugging_is_fatal and a timeout, VMs will fail to boot 
if neutron hasn't plugged the vif within the timeout value.  Setting 
vif_plugging_is_fatal to false restores the old behavior, in that nova will 
just boot the VM; I do not remember whether it still waits for the timeout 
before continuing, or does not wait at all, when is_fatal is not set.
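
For reference, the relevant nova.conf knobs on the compute nodes look like this 
(values are just examples):

[DEFAULT]
# fail the boot if neutron never reports the vif as plugged
vif_plugging_is_fatal = True
# how long to wait for the network-vif-plugged event, in seconds
vif_plugging_timeout = 300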

Sent from my iPad

> On Oct 31, 2016, at 10:36 AM, Ahmed Mostafa  wrote:
> 
> Hello all, 
> 
> I am a bit confused by the following parameters:
> 
> notify_nova_on_port_status_change
> notify_nova_on_port_data_change
> 
> The reason I am confused is that I see that if either of these values is set 
> to true (which is the default), neutron will create a nova notifier and 
> send these events to nova.
> 
> But I cannot understand that part from the nova side, because as I see it, when 
> you create a virtual machine you basically call 
> 
> _do_build_and_run_instance, which calls _create_domain_and_network
> 
> These two methods create a virtual machine and a port, then attach that port 
> to the virtual machine
> 
> but in _create_domain_and_network, I see nova waiting for neutron to create 
> the port, and it has a neutron_callbak_fail if neutron should fail in 
> creating the port.
> 
> Now, if I set these values to false, instance creation will still work, so I 
> do not really understand: are these two values critical to creating a virtual 
> machine? And if not, what exactly do they do?
> 
> Thank you 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Instances failing to launch when rbd backed (ansible Liberty setup)

2016-10-21 Thread Kris G. Lindgren
From the traceback it looks like nova-compute is running from inside a venv.

You need to activate the venv, most likely via: source 
/openstack/venvs/nova-12.0.16/.venv/bin/activate and then run: pip freeze.  If 
you don't see the RBD bindings there, then that is your issue.  You might be 
able to fix it via: pip install rbd.

Venvs are self-contained Python installs, so they do not use the system-level 
Python packages at all.

I would also ask for some help in the #openstack-ansible channel on irc as well.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Grant Morley 
Date: Friday, October 21, 2016 at 6:14 AM
To: OpenStack Operators 
Cc: "ian.ba...@serverchoice.com" 
Subject: [Openstack-operators] Instances failing to launch when rbd backed 
(ansible Liberty setup)


Hi all,

We have an openstack-ansible setup with Ceph installed for the backend. However, 
whenever we try to launch a new instance it fails, and we get 
the following error:

2016-10-21 12:08:06.241 70661 INFO nova.virt.libvirt.driver 
[req-79811c40-8394-4e33-b16d-ff5fa7341b6a 41c60f65ae914681b6a6ca27a42ff780 
324844c815084205995aff10b03a85e1 - - -] [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] Creating image
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager 
[req-79811c40-8394-4e33-b16d-ff5fa7341b6a 41c60f65ae914681b6a6ca27a42ff780 
324844c815084205995aff10b03a85e1 - - -] [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] Instance failed to spawn
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] Traceback (most recent call last):
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f]   File 
"/openstack/venvs/nova-12.0.16/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2156, in _build_resources
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] yield resources
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f]   File 
"/openstack/venvs/nova-12.0.16/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2009, in _build_and_run_instance
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] block_device_info=block_device_info)
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f]   File 
"/openstack/venvs/nova-12.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 2527, in spawn
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] admin_pass=admin_password)
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f]   File 
"/openstack/venvs/nova-12.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 2939, in _create_image
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] backend = image('disk')
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f]   File 
"/openstack/venvs/nova-12.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 2884, in image
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] fname + suffix, image_type)
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f]   File 
"/openstack/venvs/nova-12.0.16/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py",
 line 967, in image
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] return backend(instance=instance, 
disk_name=disk_name)
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f]   File 
"/openstack/venvs/nova-12.0.16/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py",
 line 748, in __init__
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] rbd_user=self.rbd_user)
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f]   File 
"/openstack/venvs/nova-12.0.16/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py",
 line 117, in __init__
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] raise RuntimeError(_('rbd python 
libraries not found'))
2016-10-21 12:08:06.242 70661 ERROR nova.compute.manager [instance: 
5633d98e-5f79-4c13-8d45-7544069f0e6f] RuntimeError: rbd python libraries not 
found

It moans about the rbd python libraries not being found, however all of the 

Re: [Openstack-operators] [Nova][icehouse]Any way to rotating log by size

2016-10-13 Thread Kris G. Lindgren
Add "size 100M" to your logrotate.conf.
It will then only rotate the log once it has grown larger than that size.
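
A minimal stanza as an example (the path and limits are assumptions, adjust to 
taste):

/var/log/nova/*.log {
    size 100M
    rotate 5
    compress
    missingok
    notifempty
}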

Sent from my iPad

On Oct 13, 2016, at 6:03 PM, Zhang, Peng 
> wrote:

Hi guys,

The disk on our Nova controller has filled up with log files several times, 
bringing the node down.
Although the operating system's log rotation runs fine every hour, it is not 
sufficient.
Does anyone have an idea how to rotate log files by size (e.g. 100MB)?

It will take time to add a new log archive server, so a temporary solution is 
needed in the meantime.

Best regards
From: Peng, Zhang

FYI: my own solution (not a good one) is shared here:
Referring to the document : 
http://docs.openstack.org/developer/oslo.log/configfiles/example_nova.html
I have found a way using python logging modules.

I added a configuration file as follows under /etc/nova/ directory:

File name: logging.conf
[DEFAULT]
logfile =/var/log/nova/api.log

[loggers]
keys = root

[handlers]
keys = rotatingfile

[formatters]
keys = context

[logger_root]
level = DEBUG
handlers = rotatingfile

[handler_rotatingfile]
class = handlers.RotatingFileHandler
args = ('%(logfile)s', 'a', 5024000, 5)
formatter = context

[formatter_context]
class = nova.openstack.common.log.ContextFormatter

And I also changed this parameter in nova.conf to make nova use the above 
configuration:
log_config_append=/etc/nova/logging.conf

Everything seems to be going well, except that all nova services such as api, 
scheduler, etc. put their log messages into the same file (api.log)!

So I had to add a line of code to replace the file name defined in the DEFAULT section:
logfile = '%s.log' % (os.path.join('/var/log', 
*os.path.basename(__import__('inspect').stack()[-1][1]).split('-')))

It works but it's also weird.
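
A less weird alternative, sketched on the assumption that each service gets its 
own logging config (file names here are made up): keep one logging conf per 
service, each with its own logfile, and pass it per service, e.g.:

nova-api --config-file /etc/nova/nova.conf --log-config-append /etc/nova/logging-api.conf
nova-scheduler --config-file /etc/nova/nova.conf --log-config-append /etc/nova/logging-scheduler.conf
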
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] glance, nova backed by NFS

2016-10-12 Thread Kris G. Lindgren
Tobias does bring up something that we have ran into before.

With NFSv3, user mapping is done by ID, so you need to ensure that all of your 
servers use the same UID for nova/glance.  If you are using packages/automation 
that run useradd without pinning the userid, it is *VERY* easy to end up with 
mismatched username/UID pairs across multiple boxes.

NFSv4, IIRC, sends the username and the NFS server does the translation of the 
name to UID, so it should not have this issue.  But we have been bitten by that 
more than once on NFSv3.
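
A quick way to sanity-check or avoid this (the UID value is only an example; use 
whatever your packages assign, consistently):

# on every host that mounts the export
id nova
# when creating the user by hand, pin the IDs explicitly
groupadd -g 162 nova
useradd -u 162 -g 162 -d /var/lib/nova -s /sbin/nologin nova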


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 10/12/16, 11:59 AM, "Tobias Schön"  wrote:

Hi,

We have an environment with glance and cinder using NFS.
It's important that they have the correct rights. The shares should be 
owned by nova on the compute nodes if mounted on /var/lib/nova/instances, 
and the same goes for nova and glance on the controller.

It's important that you map the glance and nova shares in fstab.

The cinder one is controlled by the NFS driver.

We are running RHEL OSP 6, OpenStack Juno.

This parameter is used:
nfs_shares_config=/etc/cinder/shares-nfs.conf in the 
/etc/cinder/cinder.conf file and then we have specified the share in 
/etc/cinder/shares-nfs.conf.

chmod 0640 /etc/cinder/shares-nfs.conf

setsebool -P virt_use_nfs on
This one is important to make it work with SELinux
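
For illustration, the pieces fit together roughly like this (server names and 
export paths are made up):

# /etc/fstab on the compute / controller nodes
nfsserver:/export/nova    /var/lib/nova/instances   nfs  defaults,_netdev  0 0
nfsserver:/export/glance  /var/lib/glance/images    nfs  defaults,_netdev  0 0

# /etc/cinder/shares-nfs.conf (one share per line, handled by the NFS driver)
nfsserver:/export/cinder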

How up to date this still is I don't know, to be honest, but it matched the 
Red Hat documentation when we deployed it around 1.5 years ago.

//Tobias

-Ursprungligt meddelande-
Från: Curtis [mailto:serverasc...@gmail.com] 
Skickat: den 12 oktober 2016 19:21
Till: openstack-operators@lists.openstack.org
Ämne: [Openstack-operators] glance, nova backed by NFS

Hi All,

I've never used NFS with OpenStack before. But I am now with a small lab 
deployment with a few compute nodes.

Is there anything special I should do with NFS and glance and nova? I 
remember there was an issue way back when of images being deleted b/c certain 
components weren't aware they are on NFS. I'm guessing that has changed but 
just wanted to check if there is anything specific I should be doing 
configuration-wise.

I can't seem to find many examples of NFS usage...so feel free to point me 
to any documentation, blog posts, etc. I may have just missed it.

Thanks,
Curtis.

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] glance, nova backed by NFS

2016-10-12 Thread Kris G. Lindgren
We don't use shared storage at all, but I do remember what you are talking 
about.  The issue is that compute nodes weren't aware they were on shared 
storage and would nuke the backing image from shared storage after all VMs on 
*that* compute node had stopped using it – not after all VMs had stopped using 
it.

https://bugs.launchpad.net/nova/+bug/1620341 - Looks like some code to address 
that concern has landed, but only in trunk (maybe Mitaka).  The stable releases 
don't appear to be shared-backing-image safe.

You might be able to get around this by setting the compute image manager task 
to not run.  But the issue with that will be one missed compute node, and 
everyone will have a bad day.
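
If you do go that route, the knobs look roughly like this (option names from 
nova; which section they live in varies by release, so double check yours):

[DEFAULT]
# -1 disables the periodic image cache manager task
image_cache_manager_interval = -1
# or leave the task running but never delete unused base images
remove_unused_base_images = False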

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 10/12/16, 11:21 AM, "Curtis"  wrote:

Hi All,

I've never used NFS with OpenStack before. But I am now with a small
lab deployment with a few compute nodes.

Is there anything special I should do with NFS and glance and nova? I
remember there was an issue way back when of images being deleted b/c
certain components weren't aware they are on NFS. I'm guessing that
has changed but just wanted to check if there is anything specific I
should be doing configuration-wise.

I can't seem to find many examples of NFS usage...so feel free to
point me to any documentation, blog posts, etc. I may have just missed
it.

Thanks,
Curtis.

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Murano in Production

2016-09-23 Thread Kris G. Lindgren
How are you getting HAProxy to point to the current primary controller?  Is 
this done automatically, or are you manually setting a server as the master?

Sent from my iPad

> On Sep 23, 2016, at 5:17 AM, Serg Melikyan  wrote:
> 
> Hi Joe,
> 
> I can share some details on how murano is configured as part of the
> default Mirantis OpenStack configuration and try to explain why it's
> done in that way as it's done, I hope it helps you in your case.
> 
> As part of Mirantis OpenStack, a second instance of RabbitMQ is
> deployed specifically for murano, but its configuration is
> different from that of the RabbitMQ instance used by the other OpenStack
> components.
> 
> Why to use separate instance of the RabbitMQ?
> 1. Prevent possibility to get access to the RabbitMQ supporting
> whole cloud infrastructure by limiting access on the networking level
> rather than rely on authentication/authorization
> 2. Prevent possibility of DDoS by limiting access on the
> networking level to the infrastructure RabbitMQ
> 
> Given that second RabbitMQ instance is used only for the murano-agent
> <-> murano-engine communications and murano-agent is running on the
> VMs we had to make couple of changes in the deployment of the RabbitMQ
> (bellow I am referencing RabbitMQ as RabbitMQ instance used by Murano
> for m-agent <-> m-engine communications):
> 
> 1. RabbitMQ is not clustered, just separate instance running on each
> controller node
> 2. RabbitMQ is exposed on the Public VIP where all OpenStack APIs are exposed
> 3. It's has different port number than default
> 4. HAProxy is used, RabbitMQ is hidden behind it and HAProxy is always
> pointing to the RabbitMQ on the current primary controller
> 
> Note: How murano-agent is working? Murano-engine creates queue with
> uniq name and put configuration tasks to that queue which are later
> getting picked up by murano-agent when VM is booted and murano-agent
> is configured to use created queue through cloud-init.
> 
> #1 Clustering
> 
> * Given that per 1 app deployment from we create 1-N VMs and send 1-M
> configuration tasks, where in most of the cases N and M are less than
> 3.
> * Even if app deployment will be failed due to cluster failover it's
> can be always re-deployed by the user.
> * Controller-node failover most probably will lead to limited
> accessibility of the Heat, Nova & Neutron API and application
> deployment will fail regardless of the not executing configuration
> task on the VM.
> 
> #2 Exposure on the Public VIP
> 
> One of the reasons behind choosing RabbitMQ as transport for
> murano-agent communications was connectivity from the VM - it's much
> easier to implement connectivity *from* the VM than *to* VM.
> 
> But even in the case when you are connecting to the broker from the VM
> you should have connectivity and public interface where all other
> OpenStack APIs are exposed is most natural way to do that.
> 
> #3 Different from the default port number
> 
> Just to avoid confusion from the RabbitMQ used for the infrastructure,
> even given that they are on the different networks.
> 
> #4 HAProxy
> 
> In case of the default Mirantis OpenStack configuration is used mostly
> to support non-clustered RabbitMQ setup and exposure on the Public
> VIP, but also helpful in case of more complicated setups.
> 
> P.S. I hope my answers helped, let me know if I can cover something in
> more details.
> -- 
> Serg Melikyan, Development Manager at Mirantis, Inc.
> http://mirantis.com | smelik...@mirantis.com
> 
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Openstack team size vs's deployment size

2016-09-08 Thread Kris G. Lindgren
I completely agree about the general rule of thumb.  I am only looking at the 
team that specifically supports OpenStack.  For us, frontend support for public 
clouds is handled by another team/org altogether.

So what I am looking at is the people who are in charge of the care and feeding 
of the OpenStack system specifically.
That means the people who do the dev work on OpenStack, community 
participation, the integration work, capacity monitoring and server additions, 
responding to alarms/monitoring around OpenStack, and PoCs of new features – 
plus any automation or testing work done specifically against OpenStack.

For us we have both private and public clouds in 4 different regions.
We try to maintain either N or N-1 release cadence.
We build our own packages.
We have lots of existing legacy systems that we need to integrate with.
We have a number of local patches that we carry (the majority around cellsv1).

So, given that, would you be willing to share your compute node to engineer ratio?

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Van Leeuwen, Robert" <rovanleeu...@ebay.com>
Date: Thursday, September 8, 2016 at 1:33 AM
To: "Kris G. Lindgren" <klindg...@godaddy.com>, OpenStack Operators 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Openstack team size vs's deployment size

> I was hoping to poll other operators to see what their average team size vs’s 
> deployment size is,
>  as I am trying to use this in an internal company discussion.
> Right now we are in the order of ~260 Compute servers per Openstack 
> Dev/Engineer.
> So trying to see how we compare with other openstack installs, particularly 
> those running with more than a few hundred compute nodes.

In my opinion it depends on too many things to have a general rule of thumb.
Just a few things that I think would impact the required team size:
* How many regions you have: setting up and managing a region usually takes 
more time than adding computes to an existing region
* How often you want/need to upgrade
* Whether you are offering more than "core IaaS services", e.g. designate/trove/…
* What supporting things you need around your cloud and who manages them, e.g. 
networking, DNS, repositories, authentication systems, etc.
* What kind of SDN you are using and how it needs to be integrated with existing 
networks
* What kind of hardware you are rolling and what the average size of the 
instances is. E.g. hosting 1000 tiny instances on a 768GB / 88-core hypervisor 
will probably create more support tickets than 10 large instances on a low-spec 
hypervisor.
* How you handle storage ceph/san/local?
* Do you need live-migration when doing maintenance or are you allowed to bring 
down an availability zone
* Are you building your own packages / Using vendor packages
* The level of support the users expect and which team is taking care of that

In my private cloud experience, rolling out compute nodes and the controllers 
is not the bulk of the work.
The time goes into all the things that you need around the cloud and the 
customizations.

It might be a bit different for public cloud providers where you might deliver 
as-is and do not need any integrations.
But you might need other things like very reliable billing and good automation 
around misbehaving users.


Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Delete cinder service

2016-09-01 Thread Kris G. Lindgren
Just be careful with LIMIT x on your queries if you have replicated MySQL 
databases.  At least under older versions of MySQL this can lead to broken 
replication, as the results of the query performed on the master and on the 
slave are not guaranteed to be the same.

https://dev.mysql.com/doc/refman/5.7/en/replication-features-limit.html
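
A deterministic alternative is to qualify the exact row instead of leaning on 
LIMIT, e.g. (mirroring the update mentioned below):

UPDATE services SET deleted = 1 WHERE id = <service id>;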

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 9/1/16, 9:51 AM, "Nick Jones"  wrote:


On 1 Sep 2016, at 15:36, Jonathan D. Proulx wrote:

> On Thu, Sep 01, 2016 at 04:25:25PM +0300, Vladimir Prokofev wrote:
> :I've used direct database update to achive this in Mitaka:
> :use cinder;
> :update services set deleted = '1' where ;
>
>
> I believe the official way is:
>
> cinder-manage service remove  
>
> Which probably more or less does the same thing...

Yep.  Both options basically require direct interaction with the 
database as opposed to via a Cinder API call, but at least with 
cinder-manage the scope for making a mistake is far more limited than 
missing some qualifying clause off an UPDATE statement (limit 1 is your 
friend!) ;)

—

-Nick

-- 
DataCentred Limited registered in England and Wales no. 05611763

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [OpenStack-DefCore] [OSOps] Ansible work load test for interop patch set

2016-08-31 Thread Kris G. Lindgren
I originally agreed with you, but then I thought about it more this way: it's a 
tool to test whether clouds are interop compatible (at least that heat works 
the same on the two clouds).  It is not technically a tool to manage OpenStack, 
but it is still something that some operators may want if they are looking at 
doing hybrid cloud, or if they want to ensure that two of their own private 
clouds are interop compatible.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Joseph Bajin 
Date: Wednesday, August 31, 2016 at 1:39 PM
To: "Yih Leong, Sun." 
Cc: OpenStack Operators , 
defcore-committee 
Subject: Re: [Openstack-operators] [OpenStack-DefCore] [OSOps] Ansible work 
load test for interop patch set

This looks like this was merged, but no one really answered my questions about 
an "InterOp Challenge" code base going into the Operators repository.

--Joe

On Wed, Aug 31, 2016 at 12:23 PM, Yih Leong, Sun. 
> wrote:
Can someone from ospos please review the following patch?
https://review.openstack.org/#/c/351799/

The patchset was last updated Aug 11th.
Thanks!



On Tue, Aug 16, 2016 at 7:17 PM, Joseph Bajin 
> wrote:
Sorry about that. I've been a little busy as of late, and was able to get 
around to taking a look.

I have a question about these.   What exactly is the Interop Challenge?  The 
OSOps repos are usually for code that can help Operators maintain and run their 
cloud.   These don't necessarily look like what we normally see submitted.

Can you expand on what the InterOp Challenge is and if it is something that 
Operators would use?

Thanks

Joe

On Tue, Aug 16, 2016 at 3:02 PM, Shamail 
> wrote:


> On Aug 16, 2016, at 1:44 PM, Christopher Aedo 
> > wrote:
>
> Tong Li, I think the best place to ask for a look would be the
> Operators mailing list
> (http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators).
> I've cc'd that list here, though it looks like you've already got a +2
> on it at least.
+1

I had contacted JJ earlier and he told me that the best person to contact would 
be Joseph Bajin (RaginBajin in IRC).  I've also added an OSOps tag to this 
message.
>
> -Christopher
>
>> On Tue, Aug 16, 2016 at 7:59 AM, Tong Li 
>> > wrote:
>> The patch set has been submitted to github for awhile, can some one please
>> review the patch set here?
>>
>> https://review.openstack.org/#/c/354194/
>>
>> Thanks very much!
>>
>> Tong Li
>> IBM Open Technology
>> Building 501/B205
>> liton...@us.ibm.com
>>
>>
>> ___
>> Defcore-committee mailing list
>> defcore-commit...@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/defcore-committee
>
> ___
> Defcore-committee mailing list
> defcore-commit...@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/defcore-committee

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
Defcore-committee mailing list
defcore-commit...@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/defcore-committee


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] ethtool with virtual NIC shows nothing - print the current settings of the NIC in OpenStack VM

2016-08-26 Thread Kris G. Lindgren
Assuming you are using paravirtualized nics, you won’t see anything because 
it’s not a real network device.  Additionally – I think this would fall under 
the leaking of physical implementation details to the cloud user.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Lukas Lehner 
Date: Friday, August 26, 2016 at 2:31 PM
To: "openstack-operators@lists.openstack.org" 

Subject: [Openstack-operators] ethtool with virtual NIC shows nothing - print 
the current settings of the NIC in OpenStack VM

Hi

http://unix.stackexchange.com/questions/305638/ethtool-with-virtual-nic-shows-nothing-print-the-current-settings-of-the-nic-i

Lukas
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Shelving

2016-08-18 Thread Kris G. Lindgren
Does shelving an instance also free up the instance's reservation against that 
node?  If it doesn't, I assume that's why it still counts against their quota – 
i.e., nova is still trying to keep a slot open for them on that server.  So when 
you unshelve, does it go back to the same node or to a new node?

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: David Medberry 
Date: Thursday, August 18, 2016 at 3:49 PM
To: "Jonathan D. Proulx" 
Cc: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] Shelving


On Thu, Aug 18, 2016 at 3:12 PM, Jonathan D. Proulx 
> wrote:

True they do consume IPs.

In my configuration they do not consume any hypervisor disk.  I
*think* this is true of all configurations once the 'shelved' systems
are 'offloaded'.

I concur; that's the intent of shelving as I understand it: to free up the 
hypervisor by moving off of it entirely. So in our case with 
libvirt, there are no "shut off" instances listed with "virsh list --all". 
They truly are "shelved" with just glance storage (once they are 
"shelve-offload"ed).
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] systemd and duplicate logs -- /var/log/syslog and /var/log/nova/nova-compute.log

2016-08-09 Thread Kris G. Lindgren
Systemd logs all Python output by default; if you also have rsyslog pulling 
from the journal, you can get double logging.  I think you need to run the 
service under systemd with StandardOutput=null in the unit file, under the 
[Service] heading.  At least that's what we do.

Disclaimer: we don't run rsyslog but use file output (and we are running 
CentOS 7).  Our problem was that we got debug-level messages in 
/var/log/messages and the correct log level in /var/log/nova/nova-compute.log.  
Telling systemd not to output the logs prevented that from happening.
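
A minimal sketch of that override as a systemd drop-in (the unit name differs 
per distro):

# /etc/systemd/system/openstack-nova-compute.service.d/override.conf
[Service]
StandardOutput=null

# then: systemctl daemon-reload && systemctl restart openstack-nova-compute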

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Gustavo Randich 
Date: Tuesday, August 9, 2016 at 3:52 PM
To: "openstack-operators@lists.openstack.org" 

Subject: [Openstack-operators] systemd and duplicate logs -- /var/log/syslog 
and /var/log/nova/nova-compute.log

Hi guys,

We want to be able to forward nova-compute's log to a central rsyslog but at 
the same time mantain the local "/var/log/nova/nova-compute.log". In Icehouse 
we achieved this with the following configuration in 
"/etc/rsyslog.d/60-nova.conf":

*.*;local0.none,auth,authpriv.none   -/var/log/syslog
local0.* @@10.161.0.1:1024
local0.* /var/log/nova/nova-compute.log

We also had to comment this line in "/etc/rsyslog.d/50-default.conf" 
(reference: https://www.osso.nl/blog/rsyslog-cron-deleting-rules)

*.*;auth,authpriv.none   -/var/log/syslog


Now, in Mitaka / Ubuntu 16 / systemd, with this same configuration, we are 
getting duplicate logs: every line goes to /var/log/syslog and 
/var/log/nova/nova-compute.log

We want to only log in nova-compute.log

Maybe this is because systemd is forwarding everything to rsyslog? By default, 
/etc/systemd/journald.conf has "ForwardToSyslog=yes"

thanks!
Gustavo

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Blazar? (Reservations and/or scheduled termination)

2016-08-03 Thread Kris G. Lindgren
See inline.

Sent from my iPad

> On Aug 3, 2016, at 8:49 PM, Sam Morrison <sorri...@gmail.com> wrote:
> 
> 
>> On 4 Aug 2016, at 3:12 AM, Kris G. Lindgren <klindg...@godaddy.com> wrote:
>> 
>> We do something similar.  We give everyone in the company an account on the 
>> internal cloud.  By default they have a user- project.  We have a 
>> Jenkins job that adds metadata to all vm’s that are in user- projects.  We 
>> then have additional jobs that read that metadata and determine when the VM 
>> has been alive for x period of time.  At 45 days we send an email saying 
>> that we will remove the vm in 15 days, and they can request a 30 day 
>> extension (which really just resets some metadata information on the vm).  
>> On day 60 the vm is shut down and removed.  For non user- projects, people 
>> are allowed to have their vm’s created as long as they want.
> 
> What stops a user modifying the metadata? Do you have novas policy.json set 
> up so they can’t?
> 
> Sam
> 

Technically nothing.  Most users don't know about metadata, and they don't look 
at their VMs very hard.  For us it helps clean up the cruft.  But if someone is 
willing to update the metadata on their VM every 15 days, IMHO they can keep it 
running in their user- project.  We are mainly trying to auto-clean cruft that 
users created for testing and then forgot about.


> 
>> 
>> I believe I remember seeing something presented in the paris(?) time frame 
>> by overstock(?) that would treat vm’s more as a lease.  IE You get an env 
>> for 90 days, it goes away at the end of that.
>> 
>> 
>> ___
>> Kris Lindgren
>> Senior Linux Systems Engineer
>> GoDaddy
>> 
>> On 8/3/16, 10:47 AM, "Jonathan D. Proulx" <j...@csail.mit.edu> wrote:
>> 
>>   Hi All,
>> 
>>   As a private cloud operatior who doesn't charge internal users, I'd
>>   really like a way to force users to set an exiration time on their
>>   instances so if they forget about them they go away.
>> 
>>   I'd though Blazar was the thing to look at and Chameleoncloud.org
>>   seems to be using it (any of you around here?) but it also doesn't
>>   look like it's seen substantive work in a long time.
>> 
>>   Anyone have operational exprience with blazar to share or other
>>   solutions?
>> 
>>   -Jon
>> 
>>   ___
>>   OpenStack-operators mailing list
>>   OpenStack-operators@lists.openstack.org
>>   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> 
>> 
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Blazar? (Reservations and/or scheduled termination)

2016-08-03 Thread Kris G. Lindgren
We do something similar.  We give everyone in the company an account on the 
internal cloud.  By default they have a user- project.  We have a 
Jenkins job that adds metadata to all VMs that are in user- projects.  We then 
have additional jobs that read that metadata and determine when the VM has been 
alive for a given period of time.  At 45 days we send an email saying that we 
will remove the VM in 15 days, and they can request a 30-day extension (which 
really just resets some metadata on the VM).  On day 60 the VM is shut down and 
removed.  For non user- projects, people are allowed to keep their VMs as long 
as they want.
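
A hedged sketch of what the metadata handling can look like from the CLI (the 
key names are invented for illustration):

nova meta <server> set expiry_date=2016-10-01 expiry_notified=false
nova show <server> | grep metadata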

I believe I remember seeing something presented in the Paris(?) timeframe by 
Overstock(?) that would treat VMs more as a lease – i.e., you get an environment 
for 90 days and it goes away at the end of that.


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 8/3/16, 10:47 AM, "Jonathan D. Proulx"  wrote:

Hi All,

As a private cloud operatior who doesn't charge internal users, I'd
really like a way to force users to set an exiration time on their
instances so if they forget about them they go away.

I'd though Blazar was the thing to look at and Chameleoncloud.org
seems to be using it (any of you around here?) but it also doesn't
look like it's seen substantive work in a long time.

Anyone have operational exprience with blazar to share or other
solutions?

-Jon

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [oslo] RabbitMQ queue TTL issues moving to Liberty

2016-07-28 Thread Kris G. Lindgren
We also believe the change from auto-delete queues to 10-minute expiration 
queues was the cause of our rabbit woes a month or so ago, where we had 
rabbitmq servers filling their stats DB and consuming 20+ GB of RAM before 
hitting the rabbitmq memory high watermark.  We ran for 6+ months without issue 
under Kilo, and when we moved to Liberty rabbit consistently started falling on 
its face.  We eventually turned down the stats collection interval, but I would 
imagine that keeping stats around for 10 minutes for queues that were used for 
a single RPC message, when we are pushing 1500+ messages per second, wasn't 
helping anything.  We haven't tried lowering the timeout values to see if that 
makes things better, but we did identify this change as something that could 
contribute to our rabbitmq issues.
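
For anyone in the same spot, the knobs in question look roughly like this 
(values are examples, not recommendations):

# /etc/rabbitmq/rabbitmq.config - emit stats less often (milliseconds)
[{rabbit, [{collect_statistics_interval, 30000}]}].

# nova.conf / neutron.conf etc. - shorten the oslo.messaging transient queue TTL (seconds)
[oslo_messaging_rabbit]
rabbit_transient_queues_ttl = 60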


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Dmitry Mescheryakov 
Date: Thursday, July 28, 2016 at 6:17 AM
To: Sam Morrison 
Cc: OpenStack Operators 
Subject: Re: [Openstack-operators] [oslo] RabbitMQ queue TTL issues moving to 
Liberty



2016-07-27 2:20 GMT+03:00 Sam Morrison 
>:

On 27 Jul 2016, at 4:05 AM, Dmitry Mescheryakov 
> wrote:



2016-07-26 2:15 GMT+03:00 Sam Morrison 
>:
The queue TTL happens on reply queues and fanout queues. I don’t think it 
should happen on fanout queues. They should auto delete. I can understand the 
reason for having them on reply queues though so maybe that would be a way to 
forward?

Or am I missing something and it is needed on fanout queues too?

I would say we do need fanout queues to expire for the very same reason we want 
reply queues to expire instead of auto delete. In case of broken connection, 
the expiration provides client time to reconnect and continue consuming from 
the queue. In case of auto-delete queues, it was a frequent case that RabbitMQ 
deleted the queue before client reconnects ... along with all non-consumed 
messages in it.

But in the case of fanout queues, if there is a broken connection can’t the 
service just recreate the queue if it doesn’t exist? I guess that means it 
needs to store the state of what the queue name is though?

Yes they could loose messages directed at them but all the services I know that 
consume on fanout queues have a re sync functionality for this very case.

If the connection is broken will oslo messaging know how to connect to the same 
queue again anyway? I would’ve thought it would handle the disconnect and then 
reconnect, either with the same queue name or a new queue all together?

oslo.messaging handles reconnect perfectly - on connect it just unconditionally 
declares the queue and starts consuming from it. If queue already existed, the 
declaration operation will just be ignored by RabbitMQ.

For your earlier point that services re sync and hence messages lost in fanout 
are not that important, I can't comment on that. But after some thinking I do 
agree that having big expiration time for fanouts is non-adequate for big 
deployments anyway. How about we split rabbit_transient_queues_ttl into two 
parameters - one for reply queue and one for fanout ones? In that case people 
concerned with messages piling up in fanouts might set it to 1, which will 
virtually make these queues behave like auto-delete ones (though I strongly 
recommend to leave it at least at 20 seconds, to give service a chance to 
reconnect).

Thanks,

Dmitry



Sam



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova] Rabbit-mq 3.4 crashing (anyone else seen this?)

2016-07-05 Thread Kris G. Lindgren
We tried some of these (well, I did last night), but the issue was that 
eventually rabbitmq actually died.  I was trying some of the eval commands to 
get at what was in the mgmt_db, but any get-status call eventually led to a 
timeout error.  Part of the problem is that we can go from a warning to 
completely out of memory in under 2 minutes.  Last night it was taking only 2 
hours to chew through 40GB of RAM.  Messaging rates were in the 150-300/s 
range, which is not all that high (another cell does a constant 1k-2k).

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Matt Fischer >
Date: Tuesday, July 5, 2016 at 11:25 AM
To: Joshua Harlow >
Cc: 
"openstack-...@lists.openstack.org" 
>, 
OpenStack Operators 
>
Subject: Re: [Openstack-operators] [nova] Rabbit-mq 3.4 crashing (anyone else 
seen this?)


Yes! This happens often but I'd not call it a crash, just the mgmt db gets 
behind then eats all the memory. We've started monitoring it and have runbooks 
on how to bounce just the mgmt db. Here are my notes on that:

restart rabbitmq mgmt server - this seems to clear the memory usage.

rabbitmqctl eval 'application:stop(rabbitmq_management).'
rabbitmqctl eval 'application:start(rabbitmq_management).'

run GC on rabbit_mgmt_db:
rabbitmqctl eval '(erlang:garbage_collect(global:whereis_name(rabbit_mgmt_db)))'

status of rabbit_mgmt_db:
rabbitmqctl eval 'sys:get_status(global:whereis_name(rabbit_mgmt_db)).'

Rabbitmq mgmt DB how much memory is used:
/usr/sbin/rabbitmqctl status | grep mgmt_db

Unfortunately I didn't see that an upgrade would fix for sure and any settings 
changes to reduce the number of monitored events also require a restart of the 
cluster. The other issue with an upgrade for us is the ancient version of 
erlang shipped with trusty. When we upgrade to Xenial we'll upgrade erlang and 
rabbit and hope it goes away. I'll also probably tweak the settings on 
retention of events then too.

Also for the record the GC doesn't seem to help at all.

On Jul 5, 2016 11:05 AM, "Joshua Harlow" 
> wrote:
Hi ops and dev-folks,

We over at GoDaddy (running rabbitmq with openstack) have been hitting an issue 
that causes the `rabbit_mgmt_db` to consume nearly all of the process's memory 
(after a given amount of time).

We've been thinking that this bug (or bugs?) may have existed for a while and 
our dual-version-path (where we upgrade the control plane and then 
slowly/eventually upgrade the compute nodes to the same version) has somehow 
triggered this memory leaking bug/issue since it has happened most prominently 
on our cloud which was running nova-compute at kilo and the other services at 
liberty (thus using the versioned objects code path more frequently due to 
needing translations of objects).

The rabbit we are running is 3.4.0 on CentOS Linux release 7.2.1511 with kernel 
3.10.0-327.4.4.el7.x86_64 (do note that upgrading to 3.6.2 seems to make the 
issue go away),

# rpm -qa | grep rabbit

rabbitmq-server-3.4.0-1.noarch

The logs that seem relevant:

```
**
*** Publishers will be blocked until this alarm clears ***
**

=INFO REPORT 1-Jul-2016::16:37:46 ===
accepting AMQP connection <0.23638.342> 
(127.0.0.1:51932 -> 
127.0.0.1:5671)

=INFO REPORT 1-Jul-2016::16:37:47 ===
vm_memory_high_watermark clear. Memory used:29910180640 allowed:47126781542
```

This happens quite often, the crashes have been affecting our cloud over the 
weekend (which made some dev/ops not so happy especially due to the july 4th 
mini-vacation),

Looking to see if anyone else has seen anything similar?

For those interested this is the upstream bug/mail that I'm also seeing about 
getting confirmation from the upstream users/devs (which also has erlang crash 
dumps attached/linked),

https://groups.google.com/forum/#!topic/rabbitmq-users/FeBK7iXUcLg

Thanks,

-Josh

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Re: Quota exceeded for resources:['security_group'].

2016-07-05 Thread Kris G. Lindgren
If you are using neutron, you need to update the quotas for the tenant in 
neutron as well.
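
Something along these lines (tenant ID and numbers are placeholders):

neutron quota-update --tenant-id <tenant id> --security-group 300 --security-group-rule 600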

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: 云淡风轻 <821696...@qq.com>
Date: Tuesday, July 5, 2016 at 1:50 AM
To: "vincent.legoll" 
>, 
openstack-operators 
>
Subject: [Openstack-operators] Re: Quota exceeded for resources:['security_group'].

Thank you.

I tried it a few times, and then configured it to:

nova quota-update --security-groups 300 --security-group-rules 60 
6cb156a82d0f486a9f50132be9438eb6
nova quota-show | grep security_group
| security_groups | 300   |
| security_group_rules| 60|

but it is also



-- Original Message --
From: 
"vincent.legoll";>;
Sent: Tuesday, July 5, 2016, 2:54 PM
To: 
"openstack-operators">;
Subject: Re: [Openstack-operators] Quota exceeded for resources:['security_group'].

Hello,

On 05/07/2016 06:02, 云淡风轻 wrote:
> when i create cluster:
>  openstack dataprocessing cluster create --json 
> my_cluster_create_vmdk.json
[...]
> How to deal it ,thanks !

Look at the actual quotas:

$ nova quota-show | grep security_group
| security_groups | 10|
| security_group_rules| 20|

Then grow them a bit:

$ nova quota-update --security-groups 11 --security-group-rules 21 

There's the equivalent unified client (openstack) commands, should be easy to 
find

Hope this helps

--
Vincent Legoll
EGI FedCloud task force
Cloud Computing at IdGC
France Grilles / CNRS / IPHC

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Bandwidth limitations

2016-06-29 Thread Kris G. Lindgren
I would also look at how it is doing it.  In the past what it did was drop 
packets over a specific threshold, which is really terrible.  We do some 
traffic policing on some of our VMs – but we do it outside of OpenStack via a 
qemu hook, setting up our own qdisc and ifb device for each tap device that we 
want to police.

https://github.com/godaddy/openstack-traffic-shaping
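
The rough shape of that approach, sketched with plain tc (interface names and 
rates are made up; the hook in the repo above does the real work):

TAP=tapXXXXXXXX-XX
modprobe ifb
ip link set ifb0 up
# redirect the tap's ingress traffic to the ifb so it can be shaped by a real qdisc
tc qdisc add dev $TAP handle ffff: ingress
tc filter add dev $TAP parent ffff: protocol all u32 match u32 0 0 \
    action mirred egress redirect dev ifb0
tc qdisc add dev ifb0 root tbf rate 100mbit burst 256k latency 50ms
# shape the other direction directly on the tap
tc qdisc add dev $TAP root tbf rate 100mbit burst 256k latency 50ms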

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Joseph Bajin >
Date: Wednesday, June 29, 2016 at 10:43 AM
To: Daniel Levy >
Cc: OpenStack Operators 
>
Subject: Re: [Openstack-operators] Bandwidth limitations

Hi there,

It looks like QOS is already available within the Mitaka release.   Maybe it 
doesn't have all the features you need, but looks to be a good start.
http://docs.openstack.org/mitaka/networking-guide/adv-config-qos.html

I haven't used it yet, but maybe someone else will pipe up with some experience.

--Joe

On Wed, Jun 29, 2016 at 12:36 PM, Daniel Levy 
> wrote:
Hi all,
I'd like to learn about potential solutions anyone out there is using for 
bandwidth limitations on VMs – potentially applying QoS (quality of service) 
rules on the VM ports in an automated fashion.
If there are no current solutions, I might submit a blueprint to tackle this 
issue.


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators




Re: [Openstack-operators] Packaging Virtualenvs

2016-06-23 Thread Kris G. Lindgren
We did this within CentOS 6 with the Python 2.7 software collection.  When 
nova called into nova-rootwrap, rootwrap was invoked without any of the software 
collection or venv environment activated.  So we had to move rootwrap to rootwrap-real 
and create a shell script that did the needful (activated the software 
collection, activated the venv, then called rootwrap-real).

Other than that we just modified the normal upstart scripts to activate the 
venv before executing the service as usual.  Under systemd one would do the 
same.  We never packaged the openstack clients into venvs, but if we did, one 
could just add the activation to your keystonerc file.
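
A minimal sketch of that kind of rootwrap wrapper (paths and the SCL name are 
illustrative, not our exact script):

#!/bin/bash
# /usr/bin/nova-rootwrap: re-enter the software collection and the venv,
# then hand off to the real rootwrap binary
source scl_source enable python27
source /opt/openstack/nova/bin/activate
exec /usr/bin/nova-rootwrap-real "$@"
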
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 6/23/16, 3:12 PM, "Doug Hellmann"  wrote:

>Excerpts from Silence Dogood's message of 2016-06-23 15:45:34 -0400:
>> I know from conversations that a few folks package their python apps as
>> distributable virtualenvs.   spotify created dh-virtualenv for this.  you
>> can do it pretty simply by hand.
>> 
>> I built a toolchain for building rpms as distributable virtualenvs and that
>> works really well.
>> 
>> What I'd like to do is make it so that every app that's built as a
>> virtualenv gets setup to automatically execute at call time in their
>> virtualenv.
>> 
>> I see two options:
>> 
>> 1)  Dynamically generate a wrapper script during build and put it in the
>> RPM.  Call the wrapper.
>> 
>> 2)  Created a dependency global module ( as an rpm ) set it as a
>> dependency.  And basically it'd be an autoexecuting import that
>> instantiates the virtualenv.  it would probably know all it needs to
>> because I am building all my packages to an internal standard.  Then when
>> building the APP rpm all I need to do is inject an import into the import
>> chain if it's being built as a virtualenv.  Then I have what are
>> effectively statically compiled python apps.
>> 
>> I like 2.  But 1 isn't very bad.  Both are a little hokey.
>> 
>> Was curious if folks might have a preference, or a better idea.
>> 
>> Thanks.
>> 
>> Matt
>
>I'm not sure what you mean by a "wrapper script".  If you run the
>Python console script from within the virtualenv you've packaged,
>you shouldn't need to do anything to "activate" that environment
>separately because it should have the correct shebang line.
>
>Are you seeing different behavior?
>
>Doug
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] DNS searchdomains for your instances?

2016-06-20 Thread Kris G. Lindgren
Yea, we set the DNS servers per network as well.  I am actually asking about the 
search domain, i.e. which domain suffixes should be appended to non-FQDN queries to 
look up the IP of a short name.

E.g.: if searching for the short name somevm, and you had a search domain list of 
domain1.com, domain2.net, and domain3.org, the resolver would try each of those 
domains in order until it resolved somevm in one of them.
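
For reference, the guest-side result we're after is just the resolv.conf search 
line, something like this (domains and nameserver purely illustrative):

search domain1.com domain2.net domain3.org
nameserver 10.0.0.2
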
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: <medbe...@gmail.com> on behalf of David Medberry <openst...@medberry.net>
Date: Monday, June 20, 2016 at 1:19 PM
To: "Kris G. Lindgren" <klindg...@godaddy.com>
Cc: "openstack-oper." <openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] DNS searchdomains for your instances?

and of course that was the WRONG picture

https://www.dropbox.com/s/wxe9e9nu5cqgx9m/Screenshot%202016-06-20%2013.17.02.png?dl=0

On Mon, Jun 20, 2016 at 1:18 PM, David Medberry <openst...@medberry.net> wrote:
Each tenant in our neutron network has their own subnet and each subnet sets 
its own dns rules (see pic).

https://www.dropbox.com/s/dgfgkdqijmrfweo/2016-06-19%2005.15.29.jpg?dl=0

On Mon, Jun 20, 2016 at 1:07 PM, Kris G. Lindgren <klindg...@godaddy.com> wrote:
Hello all,

Wondering how you guys are handling the DNS search domains for your instances in 
your internal cloud.  Currently we are updating the network metadata template, 
on each compute node, to include the dns-search-domains option.  We (Josh 
Harlow) are working on implementing the new network template that nova created (in 
Liberty) and are trying to get this added.  Currently nova/neutron doesn't 
support any option to specify this metadata/information anywhere.  I see some 
work on the neutron side to allow setting of extra dhcp-opts network wide 
(currently only allowed on a port) [1] [2].  But once that gets merged, changes 
still need to be made on the nova side to pull that extra data into the 
network.json.

So that begs the question how are other people handling this?

[1] (spec) - https://review.openstack.org/#/c/247027/
[2] (code) - https://review.openstack.org/#/c/248931/

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] DNS searchdomains for your instances?

2016-06-20 Thread Kris G. Lindgren
Hello all,

Wondering how you guys are handling the DNS search domains for your instances in 
your internal cloud.  Currently we are updating the network metadata template, 
on each compute node, to include the dns-search-domains option.  We (Josh 
Harlow) are working on implementing the new network template that nova created (in 
Liberty) and are trying to get this added.  Currently nova/neutron doesn't 
support any option to specify this metadata/information anywhere.  I see some 
work on the neutron side to allow setting of extra dhcp-opts network wide 
(currently only allowed on a port) [1] [2].  But once that gets merged, changes 
still need to be made on the nova side to pull that extra data into the 
network.json.

So that begs the question how are other people handling this?

[1] (spec) - https://review.openstack.org/#/c/247027/
[2] (code) - https://review.openstack.org/#/c/248931/
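
For anyone who needs the per-port workaround today, the existing extra-dhcp-opt 
mechanism looks roughly like this (an untested sketch; "domain-search" is dnsmasq's 
name for DHCP option 119, and the port UUID and domain are placeholders):

neutron port-update <port-uuid> \
    --extra-dhcp-opt opt_name=domain-search,opt_value=domain1.com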

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Scaling Ceilometer compute agent?

2016-06-14 Thread Kris G. Lindgren
Cern is running ceilometer at scale with many thousands of compute nodes.  I 
think their blog goes into some detail about it [1], but I don’t have a direct 
link to it.


[1] - http://openstack-in-production.blogspot.com/
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Bill Jones >
Date: Tuesday, June 14, 2016 at 9:03 AM
To: "openstack-oper." 
>
Subject: [Openstack-operators] Scaling Ceilometer compute agent?

Has anyone had any experience with scaling ceilometer compute agents?

We're starting to see messages like this in logs for some of our compute agents:

WARNING ceilometer.openstack.common.loopingcall [-] task  run outlasted interval by 293.25 sec

This is an indication that the compute agent failed to execute its pipeline 
processing within the allotted interval (in our case 10 min). The result of 
this is that less instance samples are generated per hour than expected, and 
this causes billing issues for us due to the way we calculate usage.

It looks like we have three options for addressing this: make the pipeline run 
faster, increase the interval time, or scale the compute agents. I'm 
investigating the latter.

I think I read in the ceilometer architecture docs that the agents are designed 
to scale, but I don't see anything in the docs on how to facilitate that. Any 
pointers would be appreciated.

Thanks,
Bill
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Too many connections

2016-05-30 Thread Kris G. Lindgren
You are most likely running db pools with a number of worker processes.  If you 
look at the MySQL connections, most of them will be idle.  If that's the case, 
set the db pool timeout lower and lower the pool size.  Each worker 
opens a connection pool to the database.  If you are running 10 workers with a 
min db pool size of 5 and a max of 10, you will have a minimum of 50 db 
connections and a maximum of 100, per server running that service.


I would be looking at: pool_timeout, min_pool_size, max_pool_size

http://docs.openstack.org/developer/oslo.db/opts.html
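
Concretely, something like this in each service's config is what I mean (values 
are illustrative only; these live in the [database] section that oslo.db reads):

[database]
min_pool_size = 1
max_pool_size = 5
max_overflow = 10
pool_timeout = 30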


On May 30, 2016, at 9:24 AM, Fran Barrera 
> wrote:

Hi,

I'm using Mitaka on Ubuntu 16.04 and I have many problems in Horizon. I can see 
this in the logs of all components: "OperationalError: 
(pymysql.err.OperationalError) (1040, u'Too many connections')". If I increase 
max_connections on MySQL it works well for a few minutes, but then the same error 
returns. Maybe OpenStack doesn't close its connections to MySQL. The version of MySQL is 5.7.

Any suggestions?

Regards,
Fran
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Problems (simple ones) at scale... Invisible VMs.

2016-05-18 Thread Kris G. Lindgren
Nova has a config setting for the maximum number of results to be returned by a 
single call.  You can bump that up so that you can do a nova list --all-tenants 
and still see everything. However, if I am reading the below correctly, I didn't 
realize that --limit -1 apparently bypasses that config option?
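
The setting I'm thinking of is the API max-results cap, i.e. something like this 
in nova.conf (1000 is the default; the value below is just an example):

[DEFAULT]
# maximum number of items returned in a single API response
osapi_max_limit = 5000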

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: David Medberry >
Date: Wednesday, May 18, 2016 at 4:13 PM
To: 
"openstack-operators@lists.openstack.org"
 
>
Subject: [Openstack-operators] Problems (simple ones) at scale... Invisible VMs.

So, we just ran into an "at-sale" issue that shouldn't have been an issue.

Many of the OpenStack CLI tools accept a limit parameter (to limit how much 
data you get back from a single query). However, much less well documented is 
that there is an inherent limit that you will run into at 1000 VMs (not 
counting deleted ones). Many operators have already exceeded that limit and 
likely run into this. With the nova CLI and openstack client, you can simply pass 
in a limit of -1 to get around this (and though it will still make paged 
queries, you won't have "invisible" VMs, which is what I've begun to call the 
ones that don't make it into the first/default page).

I can't really call this a bug for Nova (but it is definitely a bug for Cinder, 
which doesn't have a functional "get me all of them" command; it is also limited 
to 1000 for a single call, and you can never get the rest, at least in our 
Liberty environment).

box:~# nova list  |tail -n +4 |head -n -1 |wc
   1000   16326  416000
box:~# nova list --limit -1  |tail -n +4 |head -n -1 |wc
   1060   17274  440960

(so I recently went over the limit of 1000)

YMMV.

Good luck.

-d
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Moving from distro packages to containers (or virtualenvs...)

2016-05-13 Thread Kris G. Lindgren
Curious how you are using Puppet to handle multi-node orchestration, as this is 
something Puppet by itself does not do.  Are you using ansible/salt to 
orchestrate a puppet run on all the servers?

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 5/12/16, 4:19 PM, "Nick Jones"  wrote:

>Hi.
>
>> I am investigating how to help move godaddy from rpms to a container-like 
>> solution (virtualenvs, lxc, or docker...) and a set of questions that comes 
>> up is the following (and I would think that some folks on this mailing list 
>> may have some useful insight into the answers):
>
>I’ve been mulling this over for a while as well, and although we’re not yet 
>there I figured I might as well chip in with my .2p all the same.
>
>> * Have you done the transition?
>
>Not yet!
>
>> * Was/is kolla used or looked into? or something custom?
>
>We’re looking at deploying Docker containers from images that have been 
>created using Puppet.  We’d also use Puppet to manage the orchestration, i.e 
>to make sure a given container is running in the right place and using the 
>correct image ID.  Containers would comprise discrete OpenStack service 
>‘composables’, i.e a container on a control node running the core nova 
>services (nova-api, nova-scheduler, nova-compute, and so on), one running 
>neutron-server, one for keystone, etc.  Nothing unusual there.
>
>The workflow would be something like:
>
>1. Developer generates / updates configuration via Puppet and builds a new 
>image;
>2. Image is uploaded into a private Docker image registry.  Puppet handles 
>deploying a container from this new image ID;
>3. New container is deployed into a staging environment for testing;
>4. Assuming everything checks out, Puppet again handles deploying an updated 
>container into the production environment on the relevant hosts.
>
>I’m simplifying things a little but essentially that’s how I see this hanging 
>together.
>
>> * What was the roll-out strategy to achieve the final container solution?
>
>We’d do this piecemeal, and so containerise some of the ‘safer’ components 
>first of all (such as Horizon) to make sure this all hangs together.  
>Eventually we’d have all of our core OpenStack services on the control nodes 
>isolated and running in containers, and then work on this approach for the 
>rest of the platform.
>
>Would love to hear from other operators as well as to their experience and 
>conclusions.
>
>— 
>
>-Nick
>-- 
>DataCentred Limited registered in England and Wales no. 05611763
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Nova] Significance of Error Vs Failed status

2016-05-11 Thread Kris G. Lindgren
I am +1 on this response as well.  Seems like having live or cold migrations 
follow the same pattern/states would make sense.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: David Medberry >
Date: Wednesday, May 11, 2016 at 4:05 PM
To: Andrew Laski >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] [Nova] Significance of Error Vs Failed status

So I'm a big ol' -0- don't care on this. We've never used that list before (but 
will now). Seems like it would be useful though to have it the same for l-m and 
cold migration.

On Wed, May 11, 2016 at 9:27 AM, Andrew Laski 
> wrote:



On Wed, May 11, 2016, at 11:10 AM, David Medberry wrote:
Kekane,

Hi,

This setting, how does it display in the "nova show $UUID" or in the "openstack 
server show $UUID"? Ie, I don't want a VM showing ERROR state if the VM itself 
is not in error. A failed migration doesn't leave the VM down (well, not 
always) but error generally implies it is down. If this is more of an internal 
status, then +1. I'll look at the code shortly but wanted to get a reply off 
first.

To clarify, this is only about the state of a migration not an instance. If as 
an admin you list or show your migrations this would affect how that is 
displayed. Nothing about the instance, or how it's displayed, will change.



ALSO: It would have been very very helpful to see "live-migration" in the 
subject line.


-d

On Wed, May 11, 2016 at 12:55 AM, Kekane, Abhishek 
> wrote:

Hi Operators,



Could you please provide your opinion on below mail. I need to discuss this in 
coming nova meeting (12 May, 2016).



Thank you,



Abhishek Kekane



From: Kekane, Abhishek 
[mailto:abhishek.kek...@nttdata.com]
Sent: Monday, May 09, 2016 7:22 AM
To: 
openstack-operators@lists.openstack.org
Subject: [Openstack-operators] [Nova] Significance of Error Vs Failed status



Hi All,

In the Liberty release, we upstreamed [1] a security fix to clean up orphaned 
instance files from compute nodes for the resize operation. To fix this security 
issue, a new periodic task '_cleanup_incomplete_migrations' was introduced that 
runs on each compute node and queries for deleted instances whose migration 
records are in "error" status. If there are any such instances, it simply 
cleans up the instance files on that particular compute node.

A similar issue is reported in LP bug [2] for the live-migration operation, and we 
would like to use the same periodic task to fix it. But in the case of 
live migration, the migration status is set to "failed" instead of "error" 
if the migration fails for any reason. This change was introduced in patch 
[3] when migration object support was added for live migration. Due to this 
inconsistency, the periodic task will not pick up instances to clean up orphaned 
instance files. To fix this problem, we simply want to set the migration status 
to "error" in patch [4], the same as is done for resize, to bring consistency 
to the code.

We discussed this issue in the nova meeting [5] and decided that, to 
the client, migration status 'error' vs. 'failed' should be considered the same 
thing: it's a failure. From an operator's point of view, is there any significance 
to setting the migration status to 'error' vs. 'failed'? If yes, what is it, and what 
impact would it have if the migration status is changed from 'failed' to 'error'? 
Please provide your opinions.



[1] https://review.openstack.org/#/c/219299

[2] : https://bugs.launchpad.net/nova/+bug/1470420

[3] https://review.openstack.org/#/c/183331

[4] https://review.openstack.org/#/c/215483

[5] 
http://eavesdrop.openstack.org/irclogs/%23openstack-meeting/%23openstack-meeting.2016-05-05.log.html#t2016-05-05T14:40:51

Thank You,

Abhishek



__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.




Re: [Openstack-operators] User Survey usage of QEMU (as opposed to KVM) ?

2016-05-11 Thread Kris G. Lindgren
In the next user survey, could we clarify that qemu == full software CPU emulation 
and kvm (qemu/kvm) == hardware-accelerated virtualization, or some similar 
phrasing?  It's entirely possible that people answer "I run both qemu and kvm" 
(thinking that's what qemu/kvm means) when in fact they only run kvm (qemu/kvm).

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 5/11/16, 11:58 AM, "Tim Bell"  wrote:

>Does anyone see a good way to fix this to report KVM or QEMU/KVM ?
>
>I guess the worry is whether this would count as a bug fix or an incompatible 
>change.
>
>Tim
>
>On 11/05/16 17:51, "Kashyap Chamarthy"  wrote:
>
>>On Tue, May 03, 2016 at 02:27:00PM -0500, Sergio Cuellar Valdes wrote:
>>
>>[...]
>>
>>> I'm confused too about the use of KVM or QEMU In the computes the
>>> file​/etc/nova/nova-compute.conf has:
>>> 
>>> virt_type=kvm
>>> 
>>> The output of:
>>> 
>>> nova hypervisor-show  | grep hypervisor_type
>>> 
>>> is:
>>> 
>>> hypervisor_type   | QEMU
>>
>>As Dan noted in his response, it's because it is reporting the libvirt driver
>>name (which is reported as QEMU).
>>
>>Refer below if you want to double-confirm if your instances are using KVM.
>>
>>> 
>>> The virsh dumpxml of the instances shows:
>>> 
>>> 
>>
>>That means, yes, you using KVM.  You can confirm that by checking your QEMU
>>command-line of the Nova instance, you'll see something like "accel=kvm":
>>
>>  # This is on Fedora 23 system
>>  $ ps -ef | grep -i qemu-system-x86_64
>>  [...] /usr/bin/qemu-system-x86_64 -machine accel=kvm [...]
>>
>>> 
>>> /usr/bin/qemu-system-x86_64
>>> 
>>> ​But according to ​this document [1], it is using QEMU emulator instead of
>>> KVM, because it is not using /usr/bin/qemu-kvm
>>>
>>> 
>>> So I really don't know if it's using KVM or QEMU.
>>
>>As noted above, a sure-fire way to know is to see if the instance's QEMU
>>command-line has "accel=kvm".
>>
>>A related useful tool is `virt-host-validate` (which is part of libvirt-client
>>package, at least on Fedora-based systems):
>>
>>   $ virt-host-validate | egrep -i 'kvm'
>>QEMU: Checking if device /dev/kvm exists  
>>  : PASS
>>QEMU: Checking if device /dev/kvm is accessible   
>>  : PASS
>>
>>
>>> [1] https://libvirt.org/drvqemu.html
>>> 
>>
>>
>>-- 
>>/kashyap
>>
>>___
>>OpenStack-operators mailing list
>>OpenStack-operators@lists.openstack.org
>>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Designate keystone auth issue

2016-05-10 Thread Kris G. Lindgren
Which section of the config did you add that to? The [keystone_authtoken] 
section?

Also that section seems to want auth_host: 
https://github.com/openstack/designate/blob/master/etc/designate/designate.conf.sample#L158
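
In other words, something along these lines (a sketch based on the sample config 
linked above; hostnames and credentials are placeholders, and exact option names 
can vary with your keystonemiddleware version):

[keystone_authtoken]
auth_host = controller
auth_port = 35357
auth_protocol = http
admin_tenant_name = service
admin_user = designate
admin_password = <password>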
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: raju >
Date: Tuesday, May 10, 2016 at 11:52 AM
To: 
"openstack-operators@lists.openstack.org"
 
>
Subject: [Openstack-operators] Designate keystone auth issue

Hi All,

I am trying to integrate Designate (DNSaaS) with my existing Kilo environment. I 
deployed Designate on a separate node and configured it to connect to the 
keystone (controller) server, but it still hits localhost when I 
try to make API calls.

designate conf:

auth_uri = http://controller:5000/v2.0
identity_uri = http://controller:35357/
admin_tenant_name = service
admin_user = designate
admin_password = 


error log:

DEBUG keystoneclient.session [-] REQ: curl -g -i -X GET https://127.0.0.1:35357 
-H "Accept: application/json" -H "User-Agent: python-keystoneclient" 
_http_log_request /usr/lib/python2.7/site-packages/keystoneclient/session.py:195
2016-05-10 13:39:38.175 27473 INFO requests.packages.urllib3.connectionpool [-] 
Starting new HTTPS connection (5): 127.0.0.1
2016-05-10 13:39:38.176 27473 WARNING keystonemiddleware.auth_token [-] 
Authorization failed for token
2016-05-10 13:39:38.177 27473 INFO keystonemiddleware.auth_token [-] Invalid 
user token - rejecting request
2016-05-10 13:39:38.177 27473 INFO eventlet.wsgi [-] 127.0.0.1 - - [10/May/2016 
13:39:38] "GET /v1/servers HTTP/1.1" 401 283 0.005058
2016-05-10 13:39:38.363 27473 DEBUG keystoneclient.session [-] REQ: curl -g -i 
-X GET https://127.0.0.1:35357 -H "Accept: application/json" -H "User-Agent: 
python-keystoneclient" _http_log_request 
/usr/lib/python2.7/site-packages/keystoneclient/session.py:195
2016-05-10 13:39:38.365 27473 INFO requests.packages.urllib3.connectionpool [-] 
Starting new HTTPS connection (6): 127.0.0.1
2016-05-10 13:39:38.366 27473 WARNING keystonemiddleware.auth_token [-] 
Authorization failed for token
2016-05-10 13:39:38.366 27473 INFO keystonemiddleware.auth_token [-] Invalid 
user token - rejecting request
2016-05-10 13:39:38.367 27473 INFO eventlet.wsgi [-] 127.0.0.1 - - [10/May/2016 
13:39:38] "GET /v1/servers HTTP/1.1" 401 283 0.004249
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev][barbican]barbican github installation failing

2016-05-10 Thread Kris G. Lindgren
uWSGI is a way to run the API portion of a Python code base.  You most likely 
need to install uwsgi for your operating system.
http://uwsgi-docs.readthedocs.io/en/latest/
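
For example, one of these, depending on your distro and how barbican is deployed 
(illustrative only):

pip install uwsgi          # into the same environment barbican runs from
yum install uwsgi          # RHEL/CentOS with EPEL enabled
apt-get install uwsgi      # Ubuntu/Debian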

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Akshay Kumar Sanghai
Date: Tuesday, May 10, 2016 at 11:15 AM
To: "OpenStack Development Mailing List (not for usage questions)", openstack-operators
Subject: [Openstack-operators] [openstack-dev][barbican]barbican github 
installation failing

Hi,
I have a 4 node working setup of openstack (1 controller, 1 network node, 2 
compute node).
I am trying to use ssl offload feature of lbaas v2. For that I need tls 
containers, hence barbican.
I did a git clone of barbican repo from https://github.com/openstack/barbican
Then ./bin/barbican.sh install
I am getting this error

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/mock/mock.py", line 1305, in patched
return func(*args, **keywargs)
  File "barbican/tests/queue/test_keystone_listener.py", line 327, in 
test_should_wait
msg_server = keystone_listener.MessageServer(self.conf)
  File "barbican/queue/keystone_listener.py", line 156, in __init__
endpoints=[self])
  File "barbican/queue/__init__.py", line 112, in get_notification_server
allow_requeue)
TypeError: __init__() takes exactly 3 arguments (5 given)
Ran 1246 tests in 172.776s (-10.533s)
FAILED (id=1, failures=4, skips=4)
error: testr failed (1)
Starting barbican...
./bin/barbican.sh: line 57: uwsgi: command not found

Please help me.

Thanks
Akshay
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Healthcheck URLs for services

2016-04-29 Thread Kris G. Lindgren
We have been using this since Juno for Glance to do healthchecks against glance 
from haproxy.  It's worked pretty well for the most part.
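
Roughly what that looks like (illustrative snippets only, not our exact configs; 
names and paths are placeholders):

# glance-api-paste.ini: define the healthcheck filter and add it to the pipeline
[filter:healthcheck]
paste.filter_factory = oslo.middleware:Healthcheck.factory
backends = disable_by_file
disable_by_file_path = /etc/glance/healthcheck_disable

# haproxy.cfg: have the load balancer poll it
backend glance_api
    option httpchk GET /healthcheck
    server glance01 10.0.0.11:9292 check inter 2000 rise 2 fall 3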

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Andy Botting >
Date: Friday, April 29, 2016 at 10:49 AM
To: Simon Pasquier >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] Healthcheck URLs for services

Hi Simon,

There's a healthcheck oslo.middleware plugin [1] available. So you could 
possibly configure the service pipeline to include this except it won't 
exercise the db connection, RabbitMQ connection, and so on. But it would help 
if you want to kick out a service instance from the load-balancer without 
stopping the service completely [2].

[1] http://docs.openstack.org/developer/oslo.middleware/healthcheck_plugins.html
[2] 
http://docs.openstack.org/developer/oslo.middleware/healthcheck_plugins.html#disable-by-file

Thanks for this - I didn't find any of this in my Googling.

cheers,
Andy
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Security group rules not working on instances kilo

2016-04-21 Thread Kris G. Lindgren
Make sure that the bridges are being created (1 bridge per VM); they should be 
named similarly to the VM's tap device name.  Then make sure that you have the 
bridge nf-call-* sysctls enabled:

http://wiki.libvirt.org/page/Net.bridge.bridge-nf-call_and_sysctl.conf

Under hybrid mode what happens is a linux bridge (not an ovs bridge (brctl)) is 
created per vm.  The vm's tap device is plugged into this bridge.  A veth is 
created that spans from the vm's linux bridge to br-int and is plugged at both 
ends.  This is done because older versions of OVS did not have support (or 
efficient support) for doing firewalling.  The problem is that in the kernel, 
packets traversing the Openvswitch code paths are unable to be hooked into by 
netfilter.  So the linux bridge is created solely to allow the VM traffic to 
pass through a netfilter hookable location, so security groups work.

At a minimum you need to make sure /proc/sys/net/bridge/bridge-nf-call-iptables 
is set to 1.  If it's not, then when you look at the iptables rules that are 
created, you will see that none of the security group chains are seeing 
traffic.
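
A quick check/fix looks something like this (illustrative; on newer kernels the 
br_netfilter module may need to be loaded first):

sysctl net.bridge.bridge-nf-call-iptables             # verify the current value
modprobe br_netfilter
sysctl -w net.bridge.bridge-nf-call-iptables=1        # enable it now
echo 'net.bridge.bridge-nf-call-iptables = 1' >> /etc/sysctl.conf   # persist it
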
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: raju >
Date: Thursday, April 21, 2016 at 5:26 PM
To: 
"openstack-operators@lists.openstack.org"
 
>
Subject: [Openstack-operators] Security group rules not working on instances 
kilo

Hi,

I am running into an issue where security group rules are not being applied to 
instances. When I create a new security group with default rules, it should reject 
all incoming traffic, but it is allowing everything without blocking anything.

here is my config for nova :

security_group_api = neutron
firewall_driver = nova.virt.firewall.NoopFirewallDriver

and in ml2_conf.ini:

firewall_driver = 
neutron.agent.linux.iptables_firewall.OVSHybridIptablesFirewallDriver

The iptables service is running on all the nodes. Please let me know if I have 
missed anything.


Thanks.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [oslo]nova compute reconnection Issue Kilo

2016-04-21 Thread Kris G. Lindgren
Yea, that only fixes part of the issue.  The other part is getting the 
openstack messaging code itself to figure out that the connection it's using is no 
longer valid.  Heartbeats by themselves solved 90%+ of our issues with rabbitmq and 
nodes being disconnected and never reconnecting.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Ajay Kalambur (akalambu)" <akala...@cisco.com<mailto:akala...@cisco.com>>
Date: Thursday, April 21, 2016 at 12:51 PM
To: "Kris G. Lindgren" <klindg...@godaddy.com<mailto:klindg...@godaddy.com>>, 
"openstack-operators@lists.openstack.org<mailto:openstack-operators@lists.openstack.org>"
 
<openstack-operators@lists.openstack.org<mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] [oslo]nova compute reconnection Issue Kilo

Trying that now. I had aggressive system keepalive timers before

net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 5


From: "Kris G. Lindgren" <klindg...@godaddy.com<mailto:klindg...@godaddy.com>>
Date: Thursday, April 21, 2016 at 11:50 AM
To: Ajay Kalambur <akala...@cisco.com<mailto:akala...@cisco.com>>, 
"openstack-operators@lists.openstack.org<mailto:openstack-operators@lists.openstack.org>"
 
<openstack-operators@lists.openstack.org<mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] [oslo]nova compute reconnection Issue Kilo

Do you have rabbitmq/oslo messaging heartbeats enabled?

If you aren't using heartbeats, it will take a long time for the nova-compute 
agent to figure out that it's actually no longer attached to anything.  The 
heartbeat does periodic checks against rabbitmq and will catch this state and 
reconnect.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Ajay Kalambur (akalambu)" <akala...@cisco.com<mailto:akala...@cisco.com>>
Date: Thursday, April 21, 2016 at 11:43 AM
To: 
"openstack-operators@lists.openstack.org<mailto:openstack-operators@lists.openstack.org>"
 
<openstack-operators@lists.openstack.org<mailto:openstack-operators@lists.openstack.org>>
Subject: [Openstack-operators] [oslo]nova compute reconnection Issue Kilo


Hi
I am seeing on Kilo that if I bring down one controller node, sometimes some 
computes report down forever.
I need to restart the compute service on the compute node to recover. It looks 
like oslo is not reconnecting in nova-compute.
Here is the Trace from nova-compute
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 156, in 
call
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db 
retry=self.retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in 
_send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db 
timeout=timeout, retry=retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
350, in send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db retry=retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
339, in _send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db result = 
self._waiter.wait(msg_id, timeout)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
243, in wait
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db message = 
self.waiters.get(msg_id, timeout=timeout)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
149, in get
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db 'to message ID 
%s' % msg_id)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db MessagingTimeout: 
Timed out waiting for a reply to message ID e064b5f6c8244818afdc5e91fff8ebf1


Any thougths. I am at stable/kilo for oslo

Ajay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [oslo]nova compute reconnection Issue Kilo

2016-04-21 Thread Kris G. Lindgren
Do you have rabbitmq/oslo messaging heartbeats enabled?

If you aren't using heartbeats, it will take a long time for the nova-compute 
agent to figure out that it's actually no longer attached to anything.  The 
heartbeat does periodic checks against rabbitmq and will catch this state and 
reconnect.
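
The relevant oslo.messaging options look something like this (values are 
illustrative; tune to taste):

[oslo_messaging_rabbit]
heartbeat_timeout_threshold = 60
heartbeat_rate = 2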

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Ajay Kalambur (akalambu)" >
Date: Thursday, April 21, 2016 at 11:43 AM
To: 
"openstack-operators@lists.openstack.org"
 
>
Subject: [Openstack-operators] [oslo]nova compute reconnection Issue Kilo


Hi
I am seeing on Kilo that if I bring down one controller node, sometimes some 
computes report down forever.
I need to restart the compute service on the compute node to recover. It looks 
like oslo is not reconnecting in nova-compute.
Here is the Trace from nova-compute
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 156, in 
call
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db 
retry=self.retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 90, in 
_send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db 
timeout=timeout, retry=retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
350, in send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db retry=retry)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
339, in _send
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db result = 
self._waiter.wait(msg_id, timeout)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
243, in wait
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db message = 
self.waiters.get(msg_id, timeout=timeout)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
149, in get
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db 'to message ID 
%s' % msg_id)
2016-04-19 20:25:39.090 6 TRACE nova.servicegroup.drivers.db MessagingTimeout: 
Timed out waiting for a reply to message ID e064b5f6c8244818afdc5e91fff8ebf1


Any thougths. I am at stable/kilo for oslo

Ajay

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Allow to investigate instance actions after instance deletion

2016-04-13 Thread Kris G. Lindgren
Work has already been done on this spec/feature and it is committed:

https://review.openstack.org/#/q/topic:bp/os-instance-actions-read-deleted-instances

It landed in Mitaka.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Dina Belova >
Date: Wednesday, April 13, 2016 at 4:08 AM
To: George Shuklin >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] Allow to investigate instance actions after 
instance deletion

George,

I really believe this can be processed via Ceilometer events. Events about all 
actions happened to instance are coming to Ceilometer.

Cheers,
Dina

On Wed, Apr 13, 2016 at 12:23 PM, George Shuklin 
> wrote:
I filed a bug (feature request) about the ability to see the action list of deleted 
instances: https://bugs.launchpad.net/nova/+bug/1569779

Any ideas?

I really want to see it like this:


+---------------+------------------------------------------+---------+------------------------+
| Action        | Request_ID                               | Message | Start_Time             |
+---------------+------------------------------------------+---------+------------------------+
| create        | req-31f61086-ce71-4e0a-9ef5-3d1bdd386043 | -       | 2015-05-26T12:09:54.00 |
| reboot        | req-4632c799-a83e-489c-bb04-5ed4f47705af | -       | 2015-05-26T14:21:53.00 |
| stop          | req-120635d8-ef53-4237-b95a-7d15f00ab6bf | -       | 2015-06-01T08:46:03.00 |
| migrate       | req-bdd680b3-06d5-48e6-868b-d3e4dc17796a | -       | 2015-06-01T08:48:14.00 |
| confirmResize | req-a9af49d4-833e-404e-86ac-7d8907badd9e | -       | 2015-06-01T08:58:03.00 |
| start         | req-5a2f5295-8b63-4cb7-84d9-dad1c6abf053 | -       | 2015-06-01T08:58:20.00 |
| delete        | req-----                                 | -       | 2016-04-01T00:00:00.00 |
+---------------+------------------------------------------+---------+------------------------+




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators



--

Best regards,

Dina Belova

Software Engineer

Mirantis Inc.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [osops] Finding ways to get operator issues to projects - Starting with NOVA

2016-04-11 Thread Kris G. Lindgren
LDT is the Large Deployment Team; it's a working group for large deployments 
like Rackspace, CERN, NeCTAR, Yahoo, GoDaddy, and Bluebox.  We talk about issues 
with scaling openstack: nova cells, monitoring, all the stuff that becomes hard 
when you have thousands of servers or hundreds of clouds.  The public-cloud 
working group is part of the LDT working group as well, since a large portion 
of us also happen to run public clouds.

Sorry, but your post came off (to me) as: working groups don't do anything 
actionable, at least I have never seen it in neutron.  I was just giving examples 
of actionable work that has come from LDT alone, in neutron.
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 4/11/16, 9:58 AM, "Sean M. Collins" <s...@coreitpro.com> wrote:

>Kris G. Lindgren wrote:
>> You mean outside of the LDT filing an RFE bug with neutron to get
>
>Sorry, I don't know what LDT is. Can you explain?
>
>As for the RFE bug and the contributions that GoDaddy has been involved
>with, my statement is not about "if" operators are contributing, because
>obviously they are. But an RFE bug and coming to the midcycle is part of 
>Neutron's development process. Not a working group.
>
>
>-- 
>Sean M. Collins
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [osops] Finding ways to get operator issues to projects - Starting with NOVA

2016-04-11 Thread Kris G. Lindgren
You mean outside of the LDT filing an RFE bug with neutron to get 
segmented/routed network support added to neutron, complete with an etherpad of 
all the ways we are using that at our companies and our use cases [1]?  Or 
where we (GoDaddy) came to the neutron mid-cycle in Fort Collins to further 
talk about said use case as well as to put feelers out for the ip-usages 
extension, which was committed to Neutron in the Mitaka release [2]?

These are just the things that I am aware of and have been involved in, in 
neutron alone, in the past 6 months; I am sure there are many more.

[1] - https://etherpad.openstack.org/p/Network_Segmentation_Usecases & 
https://bugs.launchpad.net/neutron/+bug/1458890
[2] - 
https://github.com/openstack/neutron/commit/2f741ca5f9545c388270ddab774e9e030b006d8a

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 4/11/16, 9:11 AM, "Sean M. Collins"  wrote:

>To be blunt: Are we ensuring that all this work that people are
>capturing in these working groups is actually getting updated and
>communicated to the developers?
>
>As I become more involved with rolling upgrades, I will try and attend
>meetings and be available from the WG side, but I don't believe I've
>ever seen someone from the WG side come over to Neutron and say "We need
>XYZ and here's a link to what we've captured in our repo to explain what
>we mean"
>
>But then again I'm not on the neutron-drivers team or a core.
>
>Anyway, I updated what I've been involved with in the Mitaka cycle, when
>it comes to Neutron and upgrades (https://review.openstack.org/304181)
>
>-- 
>Sean M. Collins
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [neutron] Liberty - How to install latest version

2016-03-29 Thread Kris G. Lindgren
To be fair, the missing update that he needed was from almost 60 days ago 
(tagged on Jan 23rd).

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 3/29/16, 6:14 AM, "Ihar Hrachyshka"  wrote:

>Christopher Hull  wrote:
>
>> Nevermind!  :-)   Updated using all noarch RPMs.   Liberty Router Pings  
>> for the very first time!   Yes, the fix was in that patch.   CentOS, for  
>> the sake of future stackers, please update your repo.   And thanks all  
>> for the help!!!
>
>Note that while CentOS provides some infra for RDO, RDO is a project that  
>is separate from CentOS. If anything, you should ask RDO folks to manage  
>missing updates.
>
>That said, I note that 22 days without an update does not sound  
>embarrassing to me. RDO folks need some time to deliver an update while  
>making sure it’s not breaking core scenarios for existing installations.
>
>Ihar
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Where can I get current Liberty code updates?... was Re:[neutron] router Found my bug Can't add gateway

2016-03-26 Thread Kris G. Lindgren
Looks like they are taking care of it:


[11:12] number80 klindgren: it's in liberty-testing => 
http://cbs.centos.org/koji/buildinfo?buildID=10149

[11:14] number80 I tagged it into -release since nobody reported issue for two 
weeks should be good

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Christopher Hull <chrishul...@gmail.com>
Date: Saturday, March 26, 2016 at 11:06 AM
To: "Kris G. Lindgren" <klindg...@godaddy.com>
Cc: Kevin Benton <ke...@benton.pub>, OpenStack Operators 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Where can I get current Liberty code 
updates?... was Re:[neutron] router Found my bug Can't add gateway

Wow, thanks Kris;

It's hard to believe that anyone running CentOS 7.2 has had any luck with 
Neutron routers.   I found what seems to be the fix to my problem.   And it's a 
simple 2 line Python change.  I've been coding for decades, and the path of 
least risk as of now seems to be to just edit the code and hope there are no 
cascading issues.  Had hoped to do it "right"  :-).. but after 3 weeks (not 
full time, I have a real coding job too) I'm tired of routers not working.  I 
want to migrate my servers already!  :-)

Will hit the IRC channels you suggest.

Thanks all;
-Chris




- Christopher T. Hull
I am presently seeking a new career opportunity  Please see career page
http://chrishull.com/career
333 Orchard Ave, Sunnyvale CA. 94085
(415) 385 4865
chrishul...@gmail.com
http://chrishull.com



On Sat, Mar 26, 2016 at 9:41 AM, Kris G. Lindgren <klindg...@godaddy.com> wrote:
I believe some Red Hat people hang out in #openstack-rpm-packaging.  But 
per https://www.rdoproject.org/community/ their main points of contact are:

#rdo: Discussion around RDO in general
#rdo-puppet: Discussion around deploying RDO with Packstack and it's puppet 
modules
#openstack: Discussion around OpenStack with the broader OpenStack community
#centos-devel: Discussion around the CentOS Cloud Special Interest Group (SIG)

I pointed out in the openstack-rpm-packaging channel that they are pretty far 
behind on stable releases.  The last stable for neutron is 7.0.1 when 7.0.3 is out.  
Nova is also behind on a stable release that came out 22 days ago.  I would suggest 
that you talk to those guys on IRC via their normal communication paths to let 
them know that they don't appear to be publishing stable releases.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Christopher Hull <chrishul...@gmail.com>
Date: Saturday, March 26, 2016 at 9:57 AM
To: Kevin Benton <ke...@benton.pub>, Christopher Hull <chrishul...@gmail.com>
Cc: OpenStack Operators <openstack-operators@lists.openstack.org>
Subject: [Openstack-operators] Where can I get current Liberty code updates?... 
was Re:[neutron] router Found my bug Can't add gateway

I'm not getting current bug fixes and releases of Liberty via CentOS repo.

So the short version is this  Getting via yum install...

Repo-id  : centos-openstack-liberty/x86_64
Repo-name: CentOS-7 - OpenStack liberty
Repo-updated : Fri Feb  5 15:03:35 2016
Repo-baseurl : http://mirror.centos.org/centos/7/cloud/x86_64/openstack-liberty/

Clearly this repo isn't being updated.  My Neutron and others are several 
versions back, and I don't have a needed Neutron bug fix for CentOS.
Yes, of course yum update -y... NADA.  :-)

Where are you all getting your updates from??   I've been combing the net for a 
repo.   I'm running CentOS 7.2.

Thanks;
-Chris



- Christopher T. Hull
I am presently seeking a new career opportunity  Please see career page
http://chrishull.com/career
333 Orchard Ave, Sunnyvale CA. 94085
(415) 385 4865
chrishul...@gmail.com
http://chrishull.com



On Sat, Mar 26, 2016 at 4:44 AM, Christopher Hull <chrishul...@gmail.com> wrote:
Hi Keven;

"Bug fixed a long time ago.   How do you have old Nuetron version?"

I was wondering that myself.   See the install guide
http://docs.openstack.org/liberty/install-guide-rdo/environment-packages.html

Indeed my Neutron (and likely other parts of my install) seem quite old.   How 
can this be?

The fix is in neutron 7.0.2.
[root@maersk qr]# neutron --version
3.1.0   ???   wow!

For good measure I did a
yum update -y

--

Re: [Openstack-operators] Where can I get current Liberty code updates?... was Re:[neutron] router Found my bug Can't add gateway

2016-03-26 Thread Kris G. Lindgren
I believe some Red Hat people hang out in #openstack-rpm-packaging.  But 
per https://www.rdoproject.org/community/ their main points of contact are:

#rdo: Discussion around RDO in general
#rdo-puppet: Discussion around deploying RDO with Packstack and it's puppet 
modules
#openstack: Discussion around OpenStack with the broader OpenStack community
#centos-devel: Discussion around the CentOS Cloud Special Interest Group (SIG)

I pointed out in the openstack-rpm-packaging channel that they are pretty far 
behind on stable releases.  The last stable for neutron is 7.0.1 when 7.0.3 is out.  
Nova is also behind on a stable release that came out 22 days ago.  I would suggest 
that you talk to those guys on IRC via their normal communication paths to let 
them know that they don't appear to be publishing stable releases.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Christopher Hull
Date: Saturday, March 26, 2016 at 9:57 AM
To: Kevin Benton, Christopher Hull
Cc: OpenStack Operators
Subject: [Openstack-operators] Where can I get current Liberty code updates?... 
was Re:[neutron] router Found my bug Can't add gateway

I'm not getting current bug fixes and releases of Liberty via CentOS repo.

So the short version is this  Getting via yum install...

Repo-id  : centos-openstack-liberty/x86_64
Repo-name: CentOS-7 - OpenStack liberty
Repo-updated : Fri Feb  5 15:03:35 2016
Repo-baseurl : http://mirror.centos.org/centos/7/cloud/x86_64/openstack-liberty/

Clearly this repo isn't being updated.  My Neutron and others are several 
versions back, and I don't have a needed Neutron bug fix for CentOS.
Yes, of course yum update -y... NADA.  :-)

Where are you all getting your updates from??   I've been combing the net for a 
repo.   I'm running CentOS 7.2.

Thanks;
-Chris



- Christopher T. Hull
I am presently seeking a new career opportunity  Please see career page
http://chrishull.com/career
333 Orchard Ave, Sunnyvale CA. 94085
(415) 385 4865
chrishul...@gmail.com
http://chrishull.com



On Sat, Mar 26, 2016 at 4:44 AM, Christopher Hull 
> wrote:
Hi Keven;

"Bug fixed a long time ago.   How do you have old Nuetron version?"

I was wondering that myself.   See the install guide
http://docs.openstack.org/liberty/install-guide-rdo/environment-packages.html

Indeed my Neutron (and likely other parts of my install) seem quite old.   How 
can this be?

The fix is in neutron 7.0.2.
[root@maersk qr]# neutron --version
3.1.0   ???   wow!

For good measure I did a
yum update -y

---
From the Installation Guide
yum remove epel-release


On centos  enable Openstack Repos


yum install centos-release-openstack-liberty -y
yum upgrade -y
yum install python-openstackclient -y
yum install openstack-selinux -y


---
My repolist

[root@maersk qr]# yum -v repolist
Loading "fastestmirror" plugin
Loading "langpacks" plugin
Adding en_US to language list
Config time: 0.005
Yum version: 3.4.3
Loading mirror speeds from cached hostfile
 * base: centos.sonn.com
 * extras: centos.sonn.com
 * updates: centos.sonn.com
Setting up Package Sacks
pkgsack time: 0.002
Repo-id  : base/7/x86_64
Repo-name: CentOS-7 - Base
Repo-revision: 1449700451
Repo-updated : Wed Dec  9 17:35:45 2015
Repo-pkgs: 9,007
Repo-size: 6.5 G
Repo-mirrors : 
http://mirrorlist.centos.org/?release=7=x86_64=os=stock
Repo-baseurl : http://centos.sonn.com/7/os/x86_64/ (9 more)
Repo-expire  : 21,600 second(s) (last: Sat Mar 26 00:29:22 2016)
Repo-filename: /etc/yum.repos.d/CentOS-Base.repo

Repo-id  : centos-openstack-liberty/x86_64
Repo-name: CentOS-7 - OpenStack liberty
Repo-revision: 1454702604
Repo-updated : Fri Feb  5 15:03:35 2016
Repo-pkgs: 976
Repo-size: 485 M
Repo-baseurl : http://mirror.centos.org/centos/7/cloud/x86_64/openstack-liberty/
Repo-expire  : 21,600 second(s) (last: Sat Mar 26 00:29:23 2016)
Repo-filename: /etc/yum.repos.d/CentOS-OpenStack-liberty.repo

Repo-id  : extras/7/x86_64
Repo-name: CentOS-7 - Extras
Repo-revision: 1458849247
Repo-updated : Thu Mar 24 15:54:21 2016
Repo-pkgs: 228
Repo-size: 599 M
Repo-mirrors : 
http://mirrorlist.centos.org/?release=7=x86_64=extras=stock
Repo-baseurl : http://centos.sonn.com/7/extras/x86_64/ (9 more)
Repo-expire  : 21,600 second(s) (last: Sat Mar 26 00:29:23 2016)

Re: [Openstack-operators] nova-conductor scale out

2016-03-15 Thread Kris G. Lindgren
Yes.  Nova-conductor is RPC based, so you can add as many servers as you need 
and they will process messages from the conductor queue on rabbit without any 
problems.  I would also suggest moving rabbitmq onto its own server, since 
rabbitmq also chews up a significant amount of CPU.
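
The conductor worker count itself is just the usual nova.conf knob (the value 
below is only illustrative; people commonly size it to the host's core count):

[conductor]
workers = 32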

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Gustavo Randich <gustavo.rand...@gmail.com>
Date: Tuesday, March 15, 2016 at 9:38 AM
To: "Kris G. Lindgren" <klindg...@godaddy.com>
Cc: David Medberry <openst...@medberry.net>, 
"openstack-operators@lists.openstack.org" <openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] nova-conductor scale out

PD: 32 cores


On Tue, Mar 15, 2016 at 12:37 PM, Gustavo Randich <gustavo.rand...@gmail.com> wrote:
We are melting right now (rpc timeouts, rabbitmq connection timeouts, high load 
on controller, etc.): we are running 375 compute nodes, and only one controller 
(on vmware) on which we run rabbitmq + nova-conductor with 28 workers

So I can seamlessly add more controller nodes with more nova-conductor workers?


On Tue, Mar 15, 2016 at 11:59 AM, Kris G. Lindgren <klindg...@godaddy.com> wrote:
We run cells, but when we reached about 250 hypervisors in a cell we needed to add 
another cell API node (we went from 2 to 3) to help with the CPU load caused by 
nova-conductor.  Nova-conductor was/is constantly crushing the CPU on those 
servers.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: David Medberry <openst...@medberry.net>
Date: Tuesday, March 15, 2016 at 8:54 AM
To: Gustavo Randich <gustavo.rand...@gmail.com>
Cc: "openstack-operators@lists.openstack.org" <openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] nova-conductor scale out

How many compute nodes do you have (that is triggering your controller node 
limitations)?

We run nova-conductor on multiple control nodes. Each control node runs "N" 
conductors where N is basically the HyperThreaded CPU count.

On Tue, Mar 15, 2016 at 8:44 AM, Gustavo Randich <gustavo.rand...@gmail.com> wrote:
Hi,

Simple question: can I deploy nova-conductor across several servers? (Icehouse)

Because we are reaching a limit in our controller node


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] nova-conductor scale out

2016-03-15 Thread Kris G. Lindgren
We run cells, but when we reached about 250 hv in a cell we needed to add 
another cell api (went from 2 to 3) to help with the cpu load caused by 
nova-conductor.  Nova-conductor was/is constantly crushing the cpu on those 
servers.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: David Medberry >
Date: Tuesday, March 15, 2016 at 8:54 AM
To: Gustavo Randich 
>
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] nova-conductor scale out

How many compute nodes do you have (that is triggering your controller node 
limitations)?

We run nova-conductor on multiple control nodes. Each control node runs "N" 
conductors where N is basically the HyperThreaded CPU count.

On Tue, Mar 15, 2016 at 8:44 AM, Gustavo Randich 
> wrote:
Hi,

Simple question: can I deploy nova-conductor across several servers? (Icehouse)

Because we are reaching a limit in our controller node


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Setting affinity based on instance type

2016-03-03 Thread Kris G. Lindgren
CERN actually did a pretty good write-up of this:

http://openstack-in-production.blogspot.com/2014/07/openstack-plays-tetris-stacking-and.html
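
The short version of what that post does, as a sketch (the multiplier value is 
something you'd tune for your own environment): a negative RAMWeigher 
multiplier makes the scheduler stack instead of spread.

# nova.conf on the nova-scheduler hosts
[DEFAULT]
ram_weight_multiplier = -1.0    # negative = stack (fill hosts first), positive = spread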

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Adam Lawson >
Date: Thursday, March 3, 2016 at 4:28 PM
To: Silence Dogood >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] Setting affinity based on instance type

Mathieu,

Blame it on my scattered brain but I'm now curious. How would this be 
approached practically speaking? I.e. how would ram_weight_multiplier enable 
the scenario I mentioned in my earliest post ?

//adam


Adam Lawson

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072

On Thu, Mar 3, 2016 at 10:43 AM, Silence Dogood 
> wrote:
cool!

On Thu, Mar 3, 2016 at 1:39 PM, Mathieu Gagné 
> wrote:
On 2016-03-03 12:50 PM, Silence Dogood wrote:
> We did some early affinity work and discovered some interesting problems
> with affinity and scheduling. =/  by default openstack used to ( may
> still ) deploy nodes across hosts evenly.
>
> Personally, I think this is a bad approach.  Most cloud providers stack
> across a couple racks at a time filling them then moving to the next.
> This allows older equipment to age out instances more easily for removal
> / replacement.
>
> The problem then is, if you have super large capacity instances they can
> never be deployed once you've got enough tiny instances deployed across
> the environment.  So now you are fighting with the scheduler to ensure
> you have deployment targets for specific instance types ( not very
> elastic / ephemeral ).  goes back to the wave scheduling model being
> superior.
>
> Anyways we had the braindead idea of locking whole physical nodes out
> from the scheduler for a super ( full node ) instance type.  And I
> suppose you could do this with AZs or regions if you really needed to.
> But, it's not a great approach.
>
> I would say that you almost need a wave style scheduler to do this sort
> of affinity work.
>

You can already do it with the RAMWeigher using the
ram_weight_multiplier config:

  Multiplier used for weighing ram.  Negative
  numbers mean to stack vs spread.

Default is 1.0 which means spread.

--
Mathieu

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] libvirt cpu type per instance?

2016-03-03 Thread Kris G. Lindgren
I would be curious whether specifying the cpu type would actually restrict 
performance.  As far as I know, this only restricts the cpu features presented 
to a vm.  You can present a vm with the cpu instruction set of a Pentium 3 that 
still runs as fast as a single core of a 2.8GHz hex-core cpu.

Additionally, one would have to change that for the entire HV.  You might have 
better luck also using the flavor extra_specs:

http://docs.openstack.org/admin-guide-cloud/compute-flavors.html

I am pretty sure, though, that you can't set which CPU type/flags to present to 
the vm through them.  (Maybe with custom code.)
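
For what it's worth, the per-host knobs Mike mentions below are a couple of 
lines in nova.conf; a minimal sketch (the model name is only an example and has 
to exist in libvirt's cpu map):

# nova.conf on the compute node
[libvirt]
cpu_mode = custom
cpu_model = pentium3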

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Mike Smith >
Date: Thursday, March 3, 2016 at 2:06 PM
To: Jonathan Proulx >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] libvirt cpu type per instance?

Jonathan -

There are some nova settings (at least for KVM) that you are probably thinking 
of, such as:

cpu_mode
cpu_model

http://docs.openstack.org/liberty/config-reference/content/kvm.html



Mike Smith
Lead Cloud Systems Architect
Overstock.com



On Mar 3, 2016, at 1:52 PM, Jonathan Proulx 
> wrote:


I have a user who wants to specify their libvirt CPU type to restrict
performance because they're modeling embedded systems.

I seem to vaguely recall there is/was a way to specify this either in
the instance type or maybe even in the image metadata, but I can't
seem to find it.

Am I delusional or blind?

-Jon

--

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Workload Management (post-instantiation)

2016-03-02 Thread Kris G. Lindgren
We would love to have something like that as well.

However, to do it in openstack would mean that something would have to 
gather/monitor the health of the HV's and not only disable new provisions but 
also kick off and monitor the migrations off the host and onto the newly chosen 
destinations.  Also, because some migrations may never complete (dirtying pages 
faster than you can copy them), it would have to have some smarts to select the 
vm's that have the highest chance of being migrated successfully.
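
The manual version of that loop looks roughly like this (a sketch only -- the 
angle-bracket placeholders are whatever host/instance you pick):

nova host-describe <hypervisor>                 # per-host cpu/ram usage
nova live-migration <instance-uuid> [<target-host>]
nova migration-list --host <hypervisor>         # watch for migrations stuck in 'running'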

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Edgar Magana >
Date: Wednesday, March 2, 2016 at 4:31 PM
To: Adam Lawson >, 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] Workload Management (post-instantiation)

We have done it with nagios checks and customize ruby code.

Edgar

From: Adam Lawson >
Date: Wednesday, March 2, 2016 at 1:48 PM
To: 
"openstack-operators@lists.openstack.org"
 
>
Subject: [Openstack-operators] Workload Management (post-instantiation)

Hello fellow Ops-minded stackers!

I understand OpenStack uses scheduler logic to place a VM on a host to ensure 
the load is balanced across hosts. My 64 million dollar question is: Has anyone 
identified a way to monitor capacity across all hosts on an ongoing basis and 
automatically live migrate VM's as needed to ensure hosts resource consumption 
is balanced over time?

It seems the scheduler addresses capacity at the time of instantiation but 
there's nothing that addresses optimal usage AFTER the VM is initially placed.

Thoughts/experiences?

//adam

Adam Lawson

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova][neutron] What are your cells networking use cases?

2016-02-25 Thread Kris G. Lindgren
To follow up on the relay idea: in our implementation we have looked at trying 
to enable ip_helper on the switches to forward dhcp to a set of defined neutron 
dhcp servers.  The issue is that this turns the dhcp requests from broadcast 
packets into unicast packets.  With the default way neutron configures each dhcp 
agent (in its own network namespace), I could not think of a way, outside of 
a total hack, to get the unicast packet into the correct namespace.
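
For context, the agent's dnsmasq only listens inside that namespace, which you 
can see on a network node with something like (the network id is a placeholder):

ip netns exec qdhcp-<network-id> netstat -lnup | grep ':67'

so a unicast relay packet arriving on the node's own interfaces never reaches 
it without some kind of forwarding hack.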

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 2/25/16, 3:20 PM, "Carl Baldwin"  wrote:

>(resending with reply-all)
>
>The routed networks work will include a change to the DHCP scheduler
>which will work something like this:
>
>1. Neutron subnets will have optional affinity to a segment
>2. DHCP agents will (somewhat indirectly) report which segments to
>which they are attached*.
>3. Where today, DHCP schedules networks to DHCP agents, tomorrow DHCP
>will schedule each segment to an agent that can reach it.  This will
>be predicated on 'enable_dhcp' being set on the subnets.
>
>There is an implicit assumption here that the operator will deploy a
>DHCP agent in each of the segments.  This will be documented in the
>guide.
>
>Down the road, I really think we should continue to explore other
>possibilities like DHCP relay or a DHCP responder on the compute host.
>But, that should be considered an independent effort.
>
>Carl
>
>* they already do this by reporting physical_network in bridge mappings
>
>On Thu, Feb 25, 2016 at 11:30 AM, Tim Bell  wrote:
>>
>> The CERN guys had some concerns on how dhcp was working in a segment 
>> environment. I’ll leave them to give details.
>>
>> Tim
>>
>>
>>
>>
>>
>> On 25/02/16 14:53, "Andrew Laski"  wrote:
>>
>>>
>>>
>>>On Thu, Feb 25, 2016, at 05:01 AM, Tim Bell wrote:

 CERN info added.. Feel free to come back for more information if needed.
>>>
>>>An additional piece of information we're specifically interested in from
>>>all cellsv1 deployments is around the networking control plane setup. Is
>>>there a single nova-net/Neutron deployment per region that is shared
>>>among cells? It appears that all cells users are splitting the network
>>>data plane into clusters/segments, are similar things being done to the
>>>control plane?
>>>
>>>

 Tim




 On 24/02/16 22:47, "Edgar Magana"  wrote:

 >It will be awesome if we can add this doc into the networking guide  :-)
 >
 >
 >Edgar
 >
 >
 >
 >
 >On 2/24/16, 1:42 PM, "Matt Riedemann"  wrote:
 >
 >>The nova and neutron teams are trying to sort out existing deployment
 >>network scenarios for cells v1 so we can try and document some of that
 >>and get an idea if things change at all with cells v2.
 >>
 >>Therefore we're asking that deployers running cells please document
 >>anything you can in an etherpad [1].
 >>
 >>We'll try to distill that for upstream docs at some point and then use
 >>it as a reference when talking about cells v2 + networking.
 >>
 >>[1] https://etherpad.openstack.org/p/cells-networking-use-cases
 >>
 >>--
 >>
 >>Thanks,
 >>
 >>Matt Riedemann
 >>
 >>
 >>___
 >>OpenStack-operators mailing list
 >>OpenStack-operators@lists.openstack.org
 >>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 >___
 >OpenStack-operators mailing list
 >OpenStack-operators@lists.openstack.org
 >http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 ___
 OpenStack-operators mailing list
 OpenStack-operators@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
 Email had 1 attachment:
 + smime.p7s
   4k (application/pkcs7-signature)
>>>
>>>___
>>>OpenStack-operators mailing list
>>>OpenStack-operators@lists.openstack.org
>>>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova] Do you, or your users, have input on how get-me-a-network should work in Nova?

2016-02-19 Thread Kris G. Lindgren


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 2/19/16, 10:07 AM, "Matt Riedemann"  wrote:

>There is a long contentious dev thread going on here [1] about how Nova 
>should handle the Neutron auto-allocate-topology API (referred to as the 
>'get-me-a-network' effort).
>
>The point is to reduce the complexity for users to simply boot an 
>instance and be able to ssh into it without having to first setup 
>networks/subnets/routers in neutron and then specify a nic when booting 
>the instance. If the planets are aligned, and no nic is provided (or 
>available to the project), then nova would call the new neutron API to 
>auto-allocate the network and use that to create a port to associate 
>with the instance.
>
>There is existing behavior in Nova where you can boot an instance and 
>get no networking with neutron as the backend. You can later add 
>networking by attaching an interface. The nova dev team has no idea how 
>common this use case is though.
>
>There will be a microversion to the nova API with the get-me-a-network 
>support. The debate is what the default behavior should be when using 
>that microversion. The options are basically:
>
>1. If no nic is provided at boot and none are available, don't provide a 
>network (existing behavior). If the user wants a network auto-allocated, 
>they specify something like: --nic=auto

This is my preferred choice - keep the functionality exactly the same as it is 
today.  Users (if this is available) can opt in.  I'm not 100% familiar with 
microversions - but is it possible to opt out of this microversion altogether, 
yet still use other, later, microversions?


>
>In this case the user has to opt into auto-allocating the network.
>
>2. If no nic is provided at boot and none are available, nova will 
>attempt to auto-allocate the network from neutron. If the user 
>specifically doesn't want networking on instance create (for whatever 
>reason), they have to opt into that behavior with something like: --nic=none
>
>This is closer in behavior to how booting an instance works with 
>nova-network, but it is a change in the default behavior for the neutron 
>case, and that is a cause for concern for any users that have written 
>tools to expect that default behavior.


I don't like this but I think other people might.  Really I would like to see a 
config option detailing how the cloud admin wants to handle this behavior.

>
>3. If no nic is provided at boot and none are available, fail the 
>request and force the request to be explicit, i.e. provide a specific 
>nic, or auto, or none. This is a fail-fast scenario to force users to 
>really state what they want.

I don't like this option at all.  You are changing what people must provide on 
the boot line, and as far as I can tell this is a breaking change.

>
>--
>
>As with any microversion change, we hope that users are reading the docs 
>and aware of the changes in each microversion, but we can't guarantee 
>that, so changing default behavior (case 2) requires discussion and 
>input, especially from outside the dev team.
>
>If you or your users have any input on this, please respond in this 
>thread of the one in the -dev list.
>
>[1] 
>http://lists.openstack.org/pipermail/openstack-dev/2016-February/086437.html
>
>-- 
>
>Thanks,
>
>Matt Riedemann
>
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [glance] Image enters "killed" state on upload

2016-02-02 Thread Kris G. Lindgren
Not related to your issue, but something to keep an eye out for: you need to 
keep the uid for glance synced across your glance servers when using an nfsv3 
store, since nfsv3 stores the uid & gid for the file perms.  You can run into 
weird issues if glance is uid/gid 501 on one glance server and 502 on another.  
We had that problem crop up in production when packages were doing "useradd" 
without specifying a uid/gid, so you could end up with servers whose ids don't 
match and permissions that are all screwed up between them.

Related to your actual question: if I remember correctly you need read and 
execute permissions to list the contents of / enter a directory under linux.
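
What "keeping the uid synced" looks like in practice, as a rough sketch (161 is 
just an example id -- pick anything that is free on all of your glance servers):

# on every glance server, before the packages create the user for you
groupadd -g 161 glance
useradd -u 161 -g glance -d /var/lib/glance -s /sbin/nologin glance

# sanity check after mounting the NFS store: numeric owners should match everywhere
ls -ln /var/lib/glance/images | head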

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Liam Haworth 
>
Date: Tuesday, February 2, 2016 at 4:25 PM
To: Abel Lopez >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] [glance] Image enters "killed" state on 
upload

Here is the output from my system instead of my blabbering in a long-winded email

root@ctrl1:~# uname -a
Linux ctrl1 3.19.0-43-generic #49~14.04.1-Ubuntu SMP Thu Dec 31 15:44:49 UTC 
2015 x86_64 x86_64 x86_64 GNU/Linux

root@ctrl1:~# df -h
Filesystem                  Size  Used  Avail  Use%  Mounted on
udev                        7.9G  4.0K  7.9G   1%    /dev
tmpfs                       1.6G  724K  1.6G   1%    /run
/dev/mapper/ctrl1--vg-root  396G  6.4G  370G   2%    /
none                        4.0K  0     4.0K   0%    /sys/fs/cgroup
none                        5.0M  0     5.0M   0%    /run/lock
none                        7.9G  0     7.9G   0%    /run/shm
none                        100M  0     100M   0%    /run/user
/dev/sdc1                   236M  38M   186M   17%   /boot
10.16.16.30:/srv/glance     739G  97G   604G   14%   /var/lib/glance/images

And to save you from the massed output of an ls, every file in /var/lib/glance/images 
is: -rw-r- 1 glance glance

No apparmour installed or configured

On Wed, 3 Feb 2016 at 10:17 Abel Lopez 
> wrote:
Ok, with file store, some of the silly things that crop up are around directory 
permissions, disk space, SELinux/apparmour.

Make sure the glance user and group have ownership (recursively) of the 
/var/lib/glance directory, make sure you're not low on space, if you have 
SELinux set to enforcing, test setting it to permissive (if that is the issue, 
resolve the contexts)

On Feb 2, 2016, at 3:13 PM, Liam Haworth 
> wrote:

Glance is configured to use file store to /var/lib/glance/images

On Wed, 3 Feb 2016 at 10:12 Abel Lopez 
> wrote:
I ran into a similar issue in Havana, but that was because we were doing some 
'behind-the-scenes' modification of the image (format conversion)
Once we stopped that, the issue went away.

What is your glance store configured as?

On Feb 2, 2016, at 3:05 PM, Liam Haworth 
> wrote:

Hey All,

This sounds like an old bug after trying to google it but everything I found 
doesn't really seem to help. I'm trying to upload a 2.5GB QCOW2 image to glance 
to be used by users, the upload goes fine and in the glance registry logs I can 
see that it has successfully saved the image but then it does this

2016-02-03 09:51:49.607 2826 DEBUG glance.registry.api.v1.images 
[req-5ba18ea3-5777-4023-9f85-040aca48dfa7 --trunced-- - - -] Updating image 
03a920ce-7979-4439-ab71-bc3dd34df3d3 with metadata: {u'status': u'killed'} 
update /usr/lib/python2.7/dist-packages/glance/registry/api/v1/images.py:470

What reasons are their for it to do this to an image that just successfully 
uploaded?

Thanks,

Liam Haworth.
--
Liam Haworth | Junior Software Engineer | 
www.bluereef.com.au
_
T: +61 3 9898 8000 | F: +61 3 9898 8055


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

--
Liam Haworth | Junior 

Re: [Openstack-operators] Config Drive has no content/0000

2016-02-02 Thread Kris G. Lindgren
We noticed the same thing.  It's a simple patch in nova/virt/netutils.py (we 
have been running this since icehouse).

Below is our current patch for kilo.

--- a/nova/virt/netutils.py
+++ b/nova/virt/netutils.py

@@ -104,8 +104,9 @@ def get_injected_network_template(network_info, 
use_ipv6=None, template=None,

 ifc_num += 1

-if not network.get_meta('injected'):
-continue
+# GD force network template in config drive on dhcp network
+#if not network.get_meta('injected'):
+#continue

 hwaddress = vif.get('address')
 address = None
 @@ -114,8 +115,8 @@ def get_injected_network_template(network_info, 
use_ipv6=None, template=None,
 broadcast = None
 dns = None
 if subnet_v4:
-if subnet_v4.get_meta('dhcp_server') is not None:
-continue
+#if subnet_v4.get_meta('dhcp_server') is not None:
+#continue

 if subnet_v4['ips']:
 ip = subnet_v4['ips'][0]

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: TAO ZHOU >
Date: Tuesday, February 2, 2016 at 9:35 PM
To: OpenStack Operations Mailing List 
>
Subject: Re: [Openstack-operators] Config Drive has no content/

When you create the network you need to set dhcp to false; otherwise there is 
no static ip in the config drive.

On Thu, Jul 2, 2015 at 9:59 AM, TAO ZHOU 
> wrote:

Hi,

I have an icehouse openstack setup.

I have the following lines in nova.conf:

force_config_drive = always
config_drive_cdrom = True

Whenever I launch an instance, I can see a content directory in the config drive

openstack/content/ contains all network interfaces.

I can simply configure the static IP address from openstack/content/.

Now I have a new openstack cluster setup and I cannot see this content 
directory when I launch a VM.

I checked my configuration files and I can't find any difference with my old 
cluster.

Any ideas?

Thanks

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Config Drive has no content/0000

2016-02-02 Thread Kris G. Lindgren
I just noticed you said icehouse…

Here is our icehouse patch:

@@ -88,8 +88,8 @@ def get_injected_network_template(network_info, 
use_ipv6=CONF.use_ipv6,

 ifc_num += 1

-if not network.get_meta('injected'):
-continue
+#if not network.get_meta('injected'):
+#continue

 address = None
 netmask = None
 @@ -97,8 +97,8 @@ def get_injected_network_template(network_info, 
use_ipv6=CONF.use_ipv6,
 broadcast = None
 dns = None
 if subnet_v4:
-if subnet_v4.get_meta('dhcp_server') is not None:
-continue
+#if subnet_v4.get_meta('dhcp_server') is not None:
+#continue

 if subnet_v4['ips']:
 ip = subnet_v4['ips'][0]
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Kris G. Lindgren" <klindg...@godaddy.com<mailto:klindg...@godaddy.com>>
Date: Tuesday, February 2, 2016 at 9:50 PM
To: TAO ZHOU <angelo...@gmail.com<mailto:angelo...@gmail.com>>, OpenStack 
Operations Mailing List 
<openstack-operators@lists.openstack.org<mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] Config Drive has no content/

We noticed the same thing.  It a simple patch in /nova/virt/netutils.py (we 
have been running this since icehouse).

Below is our current patch for kilo.

--- a/nova/virt/netutils.py
+++ b/nova/virt/netutils.py

@@ -104,8 +104,9 @@ def get_injected_network_template(network_info, 
use_ipv6=None, template=None,

 ifc_num += 1

-if not network.get_meta('injected'):
-continue
+# GD force network template in config drive on dhcp network
+#if not network.get_meta('injected'):
+#continue

 hwaddress = vif.get('address')
 address = None
 @@ -114,8 +115,8 @@ def get_injected_network_template(network_info, 
use_ipv6=None, template=None,
 broadcast = None
 dns = None
 if subnet_v4:
-if subnet_v4.get_meta('dhcp_server') is not None:
-continue
+#if subnet_v4.get_meta('dhcp_server') is not None:
+#continue

 if subnet_v4['ips']:
 ip = subnet_v4['ips'][0]

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: TAO ZHOU <angelo...@gmail.com<mailto:angelo...@gmail.com>>
Date: Tuesday, February 2, 2016 at 9:35 PM
To: OpenStack Operations Mailing List 
<openstack-operators@lists.openstack.org<mailto:openstack-operators@lists.openstack.org>>
Subject: Re: [Openstack-operators] Config Drive has no content/

When you create the network you need to set dhcp to false, otherwise, no static 
ip in config drive.

On Thu, Jul 2, 2015 at 9:59 AM, TAO ZHOU 
<angelo...@gmail.com<mailto:angelo...@gmail.com>> wrote:

Hi,

I have an icehouse openstack setup.

I have the following lines in nova.conf:

force_config_drive = always
config_drive_cdrom = True

Whenever I launch an instance, I can see a content directory in the config drive

openstack/content/ contains all network interfaces.

I can simply configure the static IP address from openstack/content/.

Now I have a new openstack cluster setup and I cannot see this content 
directory when I launch a VM.

I checked my configuration files and I can't find any difference with my old 
cluster.

Any ideas?

Thanks

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] how to get glance images for a specific tenant with the openstack client ?

2016-01-25 Thread Kris G. Lindgren
This doesn't answer your specific question.  However, there are two projects out 
there that are specifically for cleaning up projects and everything associated 
with them before removal.  They are:

The coda project: https://github.com/openstack/osops-coda

Given a tenant ID, it will clean up all resources for the tenant before the 
tenant is removed.  This is a project that came out of HP and has been turned 
over to the Openstack-Operators group.

The second one is: https://github.com/openstack/ospurge

This project works on projects that have already been deleted from keystone but 
still have orphaned resources (you can also use it on active projects as well).

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy






On 1/25/16, 8:49 AM, "Thomas Blank - Hetzner Online AG" 
 wrote:

>Hey there,
>
>have you tried using the --property option of the openstack-client?
>You could filter images by their owner (openstack_project_id or name).
>
>"openstack image list --property owner=[openstack_project_id]"
>
>see:
>
>http://docs.openstack.org/developer/python-openstackclient/command-objects/image.html#cmdoption-image-list--property
>
>thomas blank,
>
>
>
>On 25.01.2016 16:19, Saverio Proto wrote:
>> Hello there,
>> 
>> I need to delete some users  and tenants from my public cloud. Before
>> deleting the users and tenants from keystone, I need to delete all the
>> resources in the tenants.
>> 
>> I am stucked listing the glance images uploaded in a specific tenant.
>> I cannot find the way, I always get either all the images in the
>> system, or just the ones of the active OS_TENANT_NAME
>> 
>> openstack help image list
>> usage: openstack image list [-h] [-f {csv,json,table,value,yaml}] [-c COLUMN]
>> [--max-width ] [--noindent]
>> [--quote {all,minimal,none,nonnumeric}]
>> [--public | --private | --shared]
>> [--property 

Re: [Openstack-operators] I have an installation question and possible bug

2016-01-25 Thread Kris G. Lindgren
In the past we have had issues with glance terminating ssl and downloads 
either not completing or being corrupted.  If you are having glance terminate 
ssl, moving ssl termination to haproxy and running glance as non-ssl fixed that 
issue for us.
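
A minimal sketch of that haproxy setup (names, addresses and the cert path are 
placeholders):

frontend glance-api
    bind *:9292 ssl crt /etc/haproxy/certs/glance.pem
    default_backend glance-api-servers

backend glance-api-servers
    server glance01 10.0.0.11:9292 check
    server glance02 10.0.0.12:9292 check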

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 1/25/16, 11:23 AM, "Clint Byrum"  wrote:

>Excerpts from Christopher Hull's message of 2016-01-25 09:11:59 -0800:
>> Hello all;
>> 
>> I'm an experienced developer and I work at Cisco.  Chances are I've covered
>> the basics here,but just in case, check me.
>> I've followed the Kilo install instructions to the letter so far as I can
>> tell.   I have not installed Swift, but I think everything else, and my
>> installation almost works.   I'm having a little trouble with Glance.
>> 
>> It seems that when I attempt to create a large image (that may or may not
>> be the issue), the checksum that Glance records in its DB is incorrect.
>> Cirros image runs just fine.  CentOS cloud works.  But when I offload and
>> create an image from a big CentOS install (say 100gb), nova says the
>> checksum is wrong when I try to boot it.
>> 
>
>Did you check the file that glance saved to disk to make sure it was
>the same one you uploaded? I kind of wonder if something timed out and
>did not properly report the error, leading to a partially written file.
>
>Also, is there some reason you aren't deploying Liberty?
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova] Recovering instances from old system

2016-01-11 Thread Kris G. Lindgren
Seconding what Matt said.  You are also going to need to spend some time at the 
kilo code level to do the flavor migrations, as that was a requirement for going 
from kilo -> liberty.  I also know that you needed to be on kilo.1 (or .2) to go 
to liberty to pick up a fix for a bug in NUMA node pinning (iirc).

I would also look at the upgrade notes for every version between icehouse and 
liberty, as you are going to need to perform those actions as well.
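
If it helps, the flavor migration step I'm referring to is the nova-manage one 
(from memory, so double-check the kilo release notes; there is an optional 
batch-size flag as well):

nova-manage db migrate_flavor_data

Run it while you are still on kilo, before moving on to liberty.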


___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Matt Fischer >
Date: Monday, January 11, 2016 at 8:29 AM
To: Liam Haworth 
>
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] [nova] Recovering instances from old system

Personally, I'd just try to load the instance images like you said. If you try 
to load Icehouse records onto Liberty code its not going to work. Typically 
you'd do the upgrade one step at a time with database migrations done at every 
step.

On Sun, Jan 10, 2016 at 9:58 PM, Liam Haworth 
> wrote:
Hey Abel,

When I say "wiped everything clean" I mean that we made backups of the 
databases and configuration on all hosts before reinstalling Ubuntu 14.04 LTS 
on top to start with a clean slate.  We did this because we decided to make the 
change to the LinuxBridge agent from the OpenVSwitch agent, plus we were 
upgrading from Icehouse to Liberty, all alongside a major restructure of how 
our network is laid out.

Really I'm just looking for some input on whether I should attempt to extract 
the records of the instances backed up in the database and manually insert 
them, or whether I should take the instance disks, load them in as images and 
spin them back up from there.

Hopefully I made a bit more sense this time, sorry.

On Sat, 9 Jan 2016 at 03:26 Abel Lopez 
> wrote:
I would expect that if you have the databases in place, and they went through 
the proper migrations, that your instances would still be there.
You can check the nova database instances table using the uuid.

How exactly did you 'wipe everything clean'?

On Jan 7, 2016, at 5:35 PM, Liam Haworth 
> wrote:

Hey guys,

I just recently rebuilt the OpenStack infrastructure at my work to upgrade from 
Juno to Liberty.  Before wiping everything clean I made dumps of all the 
databases, and the instances were persisted through the upgrade via NFS.

Now that I have the infrastructure up and going again I was wondering what 
would be the best way to import the instances back in, I understand I may need 
to change some fields in the database dumps to change project ids and such but 
I just wanted to get some input on if this is a good idea or if there is a 
better way to do this?

Kind regards,
--
Liam Haworth | Junior Software Engineer | 
www.bluereef.com.au
_
T: +61 3 9898 8000 | F: +61 3 9898 8055


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

--
Liam Haworth | Junior Software Engineer | 
www.bluereef.com.au
_
T: +61 3 9898 8000 | F: +61 3 9898 8055



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Nova-network -> Neutron Migration

2015-12-09 Thread Kris G. Lindgren
Doesn't this script only solve the case of going from flatdhcp networks in 
nova-network to the same dhcp/provider networks in neutron?  Did anyone test to 
see if it also works for more advanced nova-network configs?

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Edgar Magana >
Date: Wednesday, December 9, 2015 at 9:54 AM
To: Matt Kassawara >, "Kevin 
Bringard (kevinbri)" >
Cc: OpenStack Operators 
>
Subject: Re: [Openstack-operators] Nova-network -> Neutron Migration

Yes! We should but with a huge caveat that is not not supported officially by 
the OpenStack community. At least the author wants to make a move with the 
Neutron team to make it part of the tree.

Edgar

From: Matt Kassawara
Date: Wednesday, December 9, 2015 at 8:52 AM
To: "Kevin Bringard (kevinbri)"
Cc: Edgar Magana, Tom Fifield, OpenStack Operators
Subject: Re: [Openstack-operators] Nova-network -> Neutron Migration

Anyone think we should make this script a bit more "official" ... perhaps in 
the networking guide?

On Wed, Dec 9, 2015 at 9:01 AM, Kevin Bringard (kevinbri) 
> wrote:
Thanks, Tom, Sam, and Edgar, that's really good info. If nothing else it'll 
give me a good blueprint for what to look for and where to start.



On 12/8/15, 10:37 PM, "Edgar Magana" 
> wrote:

>Awesome code! I just did a small testbed test and it worked nicely!
>
>Edgar
>
>
>
>
>On 12/8/15, 7:16 PM, "Tom Fifield" 
>> wrote:
>
>>On 09/12/15 06:32, Kevin Bringard (kevinbri) wrote:
>>> Hey fellow oppers!
>>>
>>> I was wondering if anyone has any experience doing a migration from 
>>> nova-network to neutron. We're looking at an in place swap, on an Icehouse 
>>> deployment. I don't have parallel
>>>
>>> I came across a couple of things in my search:
>>>
>>> https://wiki.openstack.org/wiki/Neutron/MigrationFromNovaNetwork/HowTo
>>> http://docs.openstack.org/networking-guide/migration_nova_network_to_neutron.html
>>>
>>> But neither of them have much in the way of details.
>>>
>>> Looking to disrupt as little as possible, but of course with something like 
>>> this there's going to be an interruption.
>>>
>>> If anyone has any experience, pointers, or thoughts I'd love to hear about 
>>> it.
>>>
>>> Thanks!
>>>
>>> -- Kevin
>>
>>NeCTAR used this script (https://github.com/NeCTAR-RC/novanet2neutron )
>>with success to do a live nova-net to neutron using Juno.
>>
>>
>>
>>___
>>OpenStack-operators mailing list
>>OpenStack-operators@lists.openstack.org
>>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [keystone] RBAC usage at production

2015-12-09 Thread Kris G. Lindgren
In other projects the policy.json file is read on each api request, so changes 
to the file take effect immediately.  I was 90% sure keystone worked the same 
way?
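
For example, a fragment like the following in keystone's policy.json should 
then be picked up on the next request without a restart (assuming keystone 
behaves like the other projects; the custom role name is just an illustration):

"identity:list_projects": "rule:admin_required or role:project_viewer",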

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 12/9/15, 1:39 AM, "Oguz Yarimtepe"  wrote:

>Hi,
>
>I am wondering whether there are people using RBAC at production. The 
>policy.json file has a structure that requires restart of the service 
>each time you edit the file. Is there and on the fly solution or tips 
>about it?
>
>
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Two regions and so two metadata servers sharing the same VLAN

2015-12-03 Thread Kris G. Lindgren
Not sure what you can do on your vmware backed boxes, but on the kvm compute 
nodes you can run nova-api-metadata locally.  We do this by binding 
169.254.169.254 to loopback (technically any non-arping interface would work) 
on each hypervisor.  If I recall correctly, setting the metadata_server to 
127.0.0.1 should add the correct iptables rules when the nova-api-metadata 
service starts up.  You can then block requests for 169.254.169.254 from 
leaving/entering the server on external interfaces.  That should keep all 
metadata requests local to the kvm server.  We do this on all of our 
hypervisors (minus the blocking of metadata from leaving the hypervisor) and 
are running with flat networks in neutron.  Assuming that keeps all the kvm 
metadata requests local, you could then run metadata normally on the network to 
service the vmware clusters, assuming that you can't do something similar on 
those boxes.
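
Roughly what that looks like on each kvm hypervisor (a sketch from memory -- 
double-check the option names against your release):

# bind the metadata address to loopback; lo never ARPs, so nothing conflicts on the VLAN
ip addr add 169.254.169.254/32 dev lo

# nova.conf fragment for the local nova-api-metadata service
[DEFAULT]
metadata_listen = 169.254.169.254
metadata_listen_port = 8775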

I haven't done/tried this… but you could also use the extra dhcp options to 
inject specific and different routes to the metadata service via 
dhcp/config-drive.  Assuming the traffic gets routed to the metadata server for 
169.254.169.254, you could bind the metadata address to a non-arping interface 
and everything should be fine.

I am not sure if vmware supports config drive.  If it does, then you could 
simply not run metadata services and use config-drive with cloud-init instead, 
assuming of course that you are ok with the fact that metadata never changes on 
the config drive once the vm is booted.  You can also, with a fairly small 
patch, make it so config-drive always injects the networking information, even 
for neutron networks with dhcp enabled, and then statically IP your boxes using 
config drive instead of dhcp.  This is what we do (DHCP is for backup only; all 
of our images are configured with cloud-init to statically ip from config drive 
on boot).
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Kevin Benton >
Date: Thursday, December 3, 2015 at 5:29 PM
To: Gilles Mocellin 
>
Cc: OpenStack Operators 
>
Subject: Re: [Openstack-operators] Two regions and so two metadata servers 
sharing the same VLAN

Well if that's the case then the metadata wouldn't work for every instance that 
ARP'ed for the address and got the wrong response first.

On Thu, Dec 3, 2015 at 3:56 PM, Gilles Mocellin 
> wrote:
Hum, I don't think so. Things like hostname must be only known by the neutron 
instance of one region...

On 03/12/2015 00:01, Kevin Benton wrote:
Are both metadata servers able to provide metadata for all instances of both 
sides? If so, why not disable isolated metadata on one of the sides so only one 
of the DHCP agents will respond?


On Thu, Nov 26, 2015 at 6:49 AM, 
 
>> 
wrote:

Hello stackers !

Sorry, I also cross-posted that question here

https://ask.openstack.org/en/question/85195/two-regions-and-so-two-metadata-servers-sharing-the-same-vlan/

But I think I can reach a wider audience here.

So here's my problem.

I'm facing an non-conventional situation. We're building a two
region Cloud to separate a VMware backend and a KVM one. But both
regions share the same 2 VLANs where we connect all our instances.

We don't use routers, private network, floating IPs... I've
enabled enable_isolated_metadata, so the metadata IP is inside the
dhcp namespace and there's a static route in the created instances
to it via the dhcp's IP. The two DHCPs could have been a problem
but we will use separate IP ranges, and as Neutron sets static
leases with the instances MAC address, they should not interfere.

The question I've been asked is whether we will have network
problems with the metadata server IP 169.254.169.254, that will
exist in 2 namepaces on 2 neutron nodes but on the same VLAN. So
they will send ARP packets with different MAC, and will perhaps
perturb access to the metadata URL form the instances.

Tcpdump shows nothing wrong, but I can't really test now because
we haven't got yet the two regions. What do you think ?

Of course, the question is not about why we choose to have two
regions. I would have chosen Host Agregates to separate VMware and
KVM, but cinder glance should have been configure the same way.
And with VMware, it's not so feasible.

Also, if we can, we will try to have separate networks for each
regions, but it involves a lot of bureaucracy here...


[Openstack-operators] [nova] [openstack-operators] Tools to move instances between projects?

2015-12-02 Thread Kris G. Lindgren
Hello,

I was wondering if someone has a set of tools/code that would allow admins to 
move vm's from one tenant to another?  We get asked this fairly frequently in 
our internal cloud (at least once a week, more when we start going through and 
cleaning up resources for people who are no longer with the company).  I have 
searched and was unable to find anything externally.

Matt Riedemann pointed me to an older nova spec: 
https://review.openstack.org/#/c/105367/.  I realize that this will most likely 
need to be a cross-project effort, since vm's consume resources from multiple 
other projects, and moving a VM between projects would also require that those 
other resources get updated as well.

Is anyone aware of a cross-project spec to handle this – or of specs in other 
projects?
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [nova] [openstack-operators] Tools to move instances between projects?

2015-12-02 Thread Kris G. Lindgren
I can describe our specific use cases; not sure the same limitations apply to 
everyone.

Every developer in our company has a project created for them (user-username) 
in which they are allowed to spin up 5 vm's to do dev/test/POC/whatever.  
These projects are not tied into the showback or usage accounting that is done 
internally for orgs.  It's simply done to allow any dev to have immediate 
access to servers so that they can test out ideas/try something etc.  Actual 
applications/teams create projects.  Resources used in those projects go 
through a showback model to allow us to move fake money around to help 
purchase capacity for the cloud.  We are moving to a lease model for the 
user- projects, where we automatically, unless action is taken by the user, 
reclaim those resources after x number of days.  Additionally, every so often 
we clean up projects that are tied to users who are no longer with the 
company.  It's during these actions that we usually find people asking if we 
can transfer vm's from one project to another.  Only the employee has access 
to their user- project within openstack.

For us - we don't allow snapshots in our private cloud.  We encourage all of 
our devs to be able to rebuild any vm that is running in the cloud at any time, 
which is the line we have been touting for these requests.  However, we would 
still like to be able to support their requests.  Additionally, all of our vm's 
are joined to a domain (both linux and windows), and taking a snapshot of a 
server and trying to spin up a replacement from it is problematic for servers 
joined to the domain - specifically windows.  It also doesn't take care of 
floating ip's, applied security group rules, mapped volumes, etc.

Taking a snapshot of the vm, making it public, booting another vm from that 
snapshot, then deleting the old vm and the snapshot is pretty heavy handed... 
when we really just need to update, in nova, which project the vm falls under.
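
To illustrate why this really wants to be a cross-project effort rather than a 
one-liner: the naive "just update nova" version is direct database surgery 
along these lines (a sketch only, not something I'd recommend running as-is; 
every id below is a placeholder):

-- nova
UPDATE instances SET project_id = '<new-project>', user_id = '<new-user>'
    WHERE uuid = '<vm-uuid>';
-- ...and then quota_usages in nova, ports in neutron, volumes in cinder, etc.
-- all have their own project/tenant columns that would need the same treatment.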

We have also had people who didn't pay attention to which project they created 
vm's under and asked us later if we could move the vm from tenant x to tenant y.

We try to have cattle, but people, apparently, really like cows as pets.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy






On 12/2/15, 3:50 PM, "Matt Riedemann" <mrie...@linux.vnet.ibm.com> wrote:

>
>
>On 12/2/2015 2:52 PM, Kris G. Lindgren wrote:
>> Hello,
>>
>> I was wondering if someone has a set of tools/code to work allow admins
>> to move vm's from one tenant to another?  We get asked this fairly
>> frequently in our internal cloud (atleast once a week, more when we
>> start going through and cleaning up resources for people who are no
>> longer with the company).   I have searched and I was able to find
>> anything externally.
>>
>> Matt Riedemann pointed me to an older spec for nova :
>> https://review.openstack.org/#/c/105367/ for nova.  I realize that this
>> will most likely need to be a cross projects effort.  Since vm's consume
>> resources for multiple other projects, and to move a VM between projects
>> would also require that those other resources get updated as well.
>>
>> Is anyone aware of a cross project spec to handle this – or of specs in
>> other projects?
>> ___
>> Kris Lindgren
>> Senior Linux Systems Engineer
>> GoDaddy
>>
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>
>I think we need a good understanding of what the use case is first. I 
>have to assume that these are pets and that's why we can't just snapshot 
>an instance and then the new user/project can boot an instance from that.
>
>Quotas are going to be a big issue here I'd think, along with any 
>orchestration that nova would need to do with other services like 
>cinder/glance/neutron to transfer ownership of volumes or network 
>resources (ports), and those projects also have their own quota frameworks.
>
>-- 
>
>Thanks,
>
>Matt Riedemann
>
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] How do I install specific versions of openstack/puppet-keystone

2015-11-25 Thread Kris G. Lindgren
We use R10k as well.
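
For the original question about specific versions: besides pinning a git ref as 
Matt shows below, the Puppetfile can also pin a Forge release directly, 
something like this (the version number is only an example -- check the Forge 
for the release that matches your OpenStack series):

mod 'openstack-keystone', '6.1.0'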

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Matt Fischer >
Date: Wednesday, November 25, 2015 at 12:16 PM
To: Saverio Proto >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] How do I install specific versions of 
openstack/puppet-keystone

I'd second the vote for r10k. You need to do this however otherwise you'll get 
the master branch:


mod 'nova',
  :git => 'https://github.com/openstack/puppet-nova.git',
  :ref => 'stable/kilo'

mod 'glance',
  :git => 'https://github.com/openstack/puppet-glance.git',
  :ref => 'stable/kilo'

mod 'cinder',
  :git => 'https://github.com/openstack/puppet-cinder.git',
  :ref => 'stable/kilo'


...


On Wed, Nov 25, 2015 at 11:34 AM, Saverio Proto 
> wrote:
Hello,

you can use r10k

go into an empty folder, create a file called Puppetfile with this content:

mod 'openstack-ceilometer'
mod 'openstack-cinder'
mod 'openstack-glance'
mod 'openstack-heat'
mod 'openstack-horizon'
mod 'openstack-keystone'
mod 'openstack-neutron'
mod 'openstack-nova'
mod 'openstack-openstack_extras'
mod 'openstack-openstacklib'
mod 'openstack-vswitch'

then type the commands:
gem install r10k
r10k puppetfile install -v

Look at the r10k documentation for how to specify a version number for the modules.

Saverio



2015-11-25 18:43 GMT+01:00 Oleksiy Molchanov 
>:
> Hi,
>
> You can provide --version parameter to 'puppet module install' or even use
> puppet-librarian with puppet in standalone mode. This tool is solving all
> your issues described.
>
> BR,
> Oleksiy.
>
> On Wed, Nov 25, 2015 at 6:16 PM, Russell Cecala 
> >
> wrote:
>>
>> Hi,
>>
>> I am struggling with setting up OpenStack via the OpenStack community
>> puppet modules.  For example
>> https://github.com/openstack/puppet-keystone/tree/stable/kilo
>>
>> If I do what the README.md file says to do ...
>>
>> example% puppet module install puppetlabs/keystone
>>
>> What release of the module would I get?  Do I get Liberty, Kilo, Juno?
>> And what if I needed to be able to install the Liberty version on one
>> system
>> but need the Juno version for yet another system?  How can I ensure the
>> the right dependencies like cprice404-inifile and puppetlabs-mysql get
>> installed?
>>
>> Thanks
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>
>
> ___
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Nova] Question about starting nova as service versus directly

2015-11-20 Thread Kris G. Lindgren
Upstart is the startup system used by Ubuntu.  It's been phased out "in favor" 
of systemd.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Adam Lawson >
Date: Friday, November 20, 2015 at 11:16 AM
To: Joe Topjian >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] [Nova] Question about starting nova as 
service versus directly

Thanks I will remember this! Unfortunately the image is long gone but very good 
info to keep handy.

What exactly does upstart do by the way (as I check the log on a known working 
image)?

//adam


Adam Lawson

AQORN, Inc.
427 North Tatnall Street
Ste. 58461
Wilmington, Delaware 19801-2230
Toll-free: (844) 4-AQORN-NOW ext. 101
International: +1 302-387-4660
Direct: +1 916-246-2072
[http://www.aqorn.com/images/logo.png]

On Fri, Nov 20, 2015 at 8:27 AM, Joe Topjian 
> wrote:

Yes, most likely is related to permissions. Another good source of
information for troubleshooting is /var/log/upstart/nova-compute.log

Ah yes! Much easier.



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] how to manage multiple openstack regions(Juno)

2015-11-18 Thread Kris G. Lindgren
Jeff,

I was just talking to Yahoo! about this exact same thing.  We both have many 
regions that we would like to manage from a single pane of glass.  From 
GoDaddy's side it is mainly about managing quota for projects across multiple 
regions, i.e. we would like to define a high-level quota for a project and 
allow the end users to say how much of it should be allocated where.

Yahoo! has some tooling that they are looking at revamping/open sourcing.  
Since you have a similar need and tooling, would you like to help out?  Is the 
code for what you have available somewhere?

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: XueSong Ma >
Date: Monday, November 16, 2015 at 7:34 PM
To: openstack-operators 
>
Subject: [Openstack-operators] how to manage multiple openstack regions(Juno)

We have a large number of physical servers to manage, depending on our 
services, and have built multiple openstack environments (regions).  Does 
anyone know how to manage these individual openstacks in one operations 
portal?  We developed our own UI for it (not horizon).
Thanks a lot!
Jeff




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Running mixed stuff Juno & Kilo , Was: cinder-api with rbd driver ignores ceph.conf

2015-11-17 Thread Kris G. Lindgren
If you are doing this on the same server you are going to have many issues 
with oslo.* libs being incompatible between releases (not just juno -> kilo but 
all releases).  I don't have specific knowledge around cinder; however, on 
separate machines/vm's we have run mismatched versions of other openstack 
services without any issues.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 11/17/15, 10:01 AM, "Saverio Proto"  wrote:

>Hello there,
>
>I need to quickly find a workaround to be able to use ceph object map
>features for cinder volumes with rbd backend.
>
>However, upgrading everything from Juno to Kilo will require a lot of
>time for testing and updating all my puppet modules.
>
>Do you think it is feasible to start updating just cinder to Kilo ?
>Will it work with the rest of the Juno components ?
>
>Has someone here experience in running mixed components between Juno and Kilo ?
>
>thanks
>
>Saverio
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-operators][osops] Listing orphaned Neutron objects via listorphans.py

2015-11-11 Thread Kris G. Lindgren
The issue with ospurge is that it only cleans up resources in a project that 
hasn't been deleted yet.  It doesn't detect/clean up resources tied to already 
deleted projects.

As I understand it, it is supposed to be run against a project - before the 
project is removed.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 11/11/15, 10:03 AM, "Assaf Muller"  wrote:

>Are you aware of https://github.com/openstack/ospurge?
>
>On Wed, Nov 11, 2015 at 11:49 AM, Nick Jones
> wrote:
>> A while ago I knocked up a quick-and-dirty Python script to list various
>> ‘orphaned’ Neutron objects.  By orphans I mean objects that OpenStack is
>> aware of but which don’t have a valid project ID any more.  This is
>> something that we as a public cloud operator have to manage on a regular
>> basis, as it’s easy to chew through your public IPv4 address space with
>> orphaned routers that have a gateway set (as an example).
>>
>> I’ve recently updated it so that it’s a lot quicker, but I’ve also changed
>> the output slightly making it easier to pipe directly into something else
>> (i.e some kind of deletion script).  It’s part of the OSOps project and so
>> my change is here for review: https://review.openstack.org/#/c/244160/3
>>
>> I’m posting this on the off chance that someone else is using it and for
>> whom the output change might cause a problem - if you could review and leave
>> a comment where appropriate then that’d be awesome.
>>
>> Thanks!
>>
>> —
>>
>> -Nick
>>
>>
>>
>> DataCentred Limited registered in England and Wales no. 05611763
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [stable][all] Keeping Juno "alive" for longer.

2015-11-09 Thread Kris G. Lindgren
I wonder how many people forgot to update their cloud in the user survey.  I 
almost did this: I noticed it had my cloud pre-defined and almost clicked next 
instead of going in and editing the cloud to make sure the details were correct 
(they weren't).  If I had forgotten to do this – I would have been reporting 
being on icehouse instead of Kilo.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: matt >
Date: Monday, November 9, 2015 at 1:18 PM
To: Tom Cameron >
Cc: 
"openstack-operators@lists.openstack.org"
 
>
Subject: Re: [Openstack-operators] [openstack-dev] [stable][all] Keeping Juno 
"alive" for longer.

Hell.  There's no clear upgrade path, and no guaranteed matched functionality 
just for starters.

Also most enterprise deployments do 3 to 5 year deployment plans.   This ties 
into how equipment / power / resources are budgeted in the project plans.  They 
don't work with this mentality of rapid release cycles.

We assumed early on that the people deploying OpenStack would be more agile 
because of the ephemeral nature of cloud.  That's not really what's happening. 
There are good and bad reasons for that.  One good reason is policy 
certification.  By the time a team has prepped, built, tested an environment 
and is moving to production it's already been an entire release (or two, since 
most ops refuse to use a fresh release for stability reasons).  By the time it 
passes independent security / QA testing and development workflows for 
deploying apps to the environment, it's been 3-4 releases or more.  But more 
often than not the problem is that most of the VM workloads aren't good with 
ephemeral, and mandating downtime on systems is an onerous change control 
process, making the upgrade process for the environment very difficult and 
time consuming.

More than that, vendors that provide extra (sometimes necessary) additions to 
OpenStack, such as switch vendors, take at least a few months to test a new 
release and certify their drivers for deployment.  Most folks aren't even 
beginning to deploy a fresh release of OpenStack, EVEN if they wanted to, until 
it's been out for at least six months.  It's not like they can really test 
pre-RC releases and expect their tests to mean anything.

There's almost no one riding the wave of new deployments.


On Mon, Nov 9, 2015 at 3:06 PM, Tom Cameron wrote:
>I would not call that the extreme minority.
>I would say a good percentage of users are only now getting to Juno.

The survey seems to indicate lots of people are on Havana, Icehouse and Juno in 
production. I would love to see the survey ask _why_ people are on older 
versions because for many operators I suspect they forked when they needed a 
feature or function that didn't yet exist, and they're now stuck in a horrible 
parallel universe where upstream has not only added the missing feature but has 
also massively improved code quality. Meanwhile, they can't spend the person 
hours on either porting their work into the new Big Tent world we live in, or 
can't bear the thought of having to throw away their hard-earned tech debt. For 
more on this, see the myth of the "sunk cost".

If it turns out people really are deploying new clouds with old versions on 
purpose because of a perceived stability benefit, then they aren't reading the 
release schedule pages close enough to see that what they're deploying today 
will be abandoned soon in the future. In my _personal_ opinion which has 
nothing to do with Openstack or my employer, this is really poor operational 
due diligence.

If, however, a deployer has been working on a proof of concept for 18-24 months 
and they're now ready to go live with their cloud running a release from 18-24 
months ago, I have sympathy for them. The bigger the deployment, the harder 
this one is to solve which makes it a prime candidate for the LTS strategy.

Either way, we've lost the original conversation long ago. It sounds like we 
all agree that an LTS release strategy suits most needs but also that it would 
take a lot of work that hasn't yet been thought of or started. Maybe there 
should be a session in Austin for this topic after blueprints are submitted and 
discussed? It would be nice to have the operators and developers input in a 
single place, and to get this idea on the radar of all of the projects.

--
Tom Cameron



From: Maish Saidel-Keesing
Sent: Monday, November 9, 2015 14:29
To: Tom Cameron; Jeremy Stanley; 

[Openstack-operators] [logs] Neutron not logging user information on wsgi requests by default

2015-11-06 Thread Kris G. Lindgren
Hello all,

I noticed the other day that in our OpenStack install (Kilo), Neutron seems to be 
the only project that was not logging the username/tenant information on every 
wsgi request.  Nova/Glance/heat all log a username and/or project on each 
request.  Our wsgi logs from neutron look like the following:

2015-11-05 13:45:24.302 14549 INFO neutron.wsgi 
[req-ab633261-da6d-4ac7-8a35-5d321a8b4a8f ] 10.224.48.132 - - [05/Nov/2015 
13:45:24]
"GET /v2.0/networks.json?id=2d5fe344-4e98-4ccc-8c91-b8064d17c64c HTTP/1.1" 200 
655 0.027550

I did a fair amount of digging and it seems that devstack is, by default, 
overriding the context log format for neutron to add the username/tenant 
information into the logs.  There is active work to remove this 
override from devstack[1].  Regardless, using the devstack approach I was able to 
true up our neutron wsgi logs to be in line with what other services are providing.

If you add:
logging_context_format_string = %(asctime)s.%(msecs)03d %(levelname)s %(name)s 
[%(request_id)s %(user_name)s %(project_name)s] %(instance)s%(message)s

To the [DEFAULT] section of neutron.conf and restart neutron-server.  You will 
now get log output like the following:

 2015-11-05 18:07:31.033 INFO neutron.wsgi 
[req-ebf1d3c9-b556-48a7-b1fa-475dd9df0bf7  ] 10.224.48.132 - - [05/Nov/2015 18:07:31]
"GET /v2.0/networks.json?id=55e1b92a-a2a3-4d64-a2d8-4b0bee46f3bf HTTP/1.1" 200 
617 0.035515

So go forth and check your logs, before you need to use them to debug who 
did what, when, and where, and make sure the information that you need is added to 
the wsgi logs.  If you are not seeing wsgi logs for your projects, try enabling 
verbose=true in the [DEFAULT] section as well.
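
For reference, one way to apply this without hand-editing the file is crudini 
(assuming it is installed); the single quotes just keep the format string as one 
argument:

crudini --set /etc/neutron/neutron.conf DEFAULT logging_context_format_string '%(asctime)s.%(msecs)03d %(levelname)s %(name)s [%(request_id)s %(user_name)s %(project_name)s] %(instance)s%(message)s'
crudini --set /etc/neutron/neutron.conf DEFAULT verbose true
service neutron-server restart   # or systemctl restart neutron-server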

Adding [logs] tag since it would be nice to have all projects logging to a 
standard wsgi format out of the gate.

[1] - https://review.openstack.org/#/c/172508/2/lib/neutron-legacy
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Informal Ops Meetup?

2015-10-29 Thread Kris G. Lindgren
We seem to have enough interest… so meeting time will be at 10am in the Prince 
room (if we get an actual room I will send an update).

Does anyone have any ideas about what they want to talk about?  I am pretty 
much open to anything.  I started: 
https://etherpad.openstack.org/p/TYO-informal-ops-meetup  for tracking of some 
ideas/time/meeting place info.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Sam Morrison <sorri...@gmail.com>
Date: Thursday, October 29, 2015 at 6:14 PM
To: "openstack-operators@lists.openstack.org" <openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Informal Ops Meetup?

I’ll be there, talked to Tom too and he said there may be a room we can use 
else there is plenty of space around the dev lounge to use.

See you tomorrow.

Sam


On 29 Oct 2015, at 6:02 PM, Xav Paice <xavpa...@gmail.com> wrote:

Suits me :)

On 29 October 2015 at 16:39, Kris G. Lindgren <klindg...@godaddy.com> wrote:
Hello all,

I am not sure if you guys have looked at the schedule for Friday… but it's all 
working groups.  I was talking with a few other operators and the idea came up 
of doing an informal ops meetup tomorrow.  So I wanted to float this idea 
by the mailing list and see if anyone is interested.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] How to install magnum?

2015-10-28 Thread Kris G. Lindgren
I also installed magnum - but ran into problems under kilo.  Also, don't use 
CoreOS, as it won't work as well.  I am trying to get magnum working against our 
OpenStack install under liberty, but am running into problems with assumptions 
around what services/features Magnum expects your cloud to provide, which is 
currently not realistic for the majority of clouds.

As for the publickey url error: this is because magnum defaults to 
expecting that barbican is provided by the cloud.  You can change the key 
credential storage to local and only get a warning that local is for "testing 
purposes only".  But it did work, or at least it got past that error.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 10/28/15, 12:35 PM, "JJ Asghar"  wrote:

>-BEGIN PGP SIGNED MESSAGE-
>Hash: SHA512
>
>
>
>
>
>On 10/28/15 10:35 AM, Mike Perez wrote:
>> On 14:09 Oct 16, hittang wrote:
>>> Hello,everynoe. Can anybody help me for installing magnum? I have an
>>> openstack installtion,which has one controller node, one network node, and
>>> server computes node. Now, I want to install magnum, and  to manage docker
>>> containers with. 
>> 
>> I was not able to find this information in the Magnum wiki [1], except for 
>> the
>> developer quick start. Doing a quick search, other related threads point
>> to the dev docs for installation, which is developer centric.
>> 
>> Adrian, is this something missing in documentation, or did we miss it?
>> 
>> [1] - https://wiki.openstack.org/wiki/Magnum
>> 
>
>Yep, this would be awesome. It's neat to see the integrations with
>DevStack, but getting it to work in a "prod" environment seems confusing
>at best.
>
>I've attempted a couple times now, and failed each one. I'm more then
>willing to help debug/QA the docs that yall decide to put together.
>
>Best Regards,
>JJ Asghar
>c: 512.619.0722 t: @jjasghar irc: j^2
>-BEGIN PGP SIGNATURE-
>Version: GnuPG/MacGPG2 v2
>Comment: GPGTools - https://gpgtools.org
>
>iQIcBAEBCgAGBQJWMEKWAAoJEDZbxzMH0+jTAMMQAMOL04sqzQVkiUwsGUzGxUku
>MR1HHM3FWUxKpAqs23mefn7fD5SCMM9joQK5YdgQDIiJllqX+i9dQkNqIGGFOzrK
>A/u6sV4TTqoHR9x1y6yI+OhT4g+12gfZs2A70idyn8NHFBEKjC21XicgL9JDWpmy
>9sWCZt7rSQJmGXnULBAib7Qt6zqxBTmB+0LzvHkUT+Jt0hEHmfLqW6BGk/GKGYwr
>DEfNxxqoXdCXLYCkNOI1k+4MXX6W9p/aUi1NF8TWImRhpX8EkrOpAgh5xYVAMTu5
>UOLEA5N+Ve4JRl13t0sXih+MeADXjmdGpzXaLjsiIdI/8GS6ERI40rnZsMofkQ6q
>2PEd6uD1/UjgZ8mfSCwNZLU/jI3YPlyxCLulpUAGsgUqHWQTcNAxLL/faUxPkuRz
>EM7Ql8CP4o1Fi1Frdjuy5hqd9C7J5go/GRauQvWNsxCxGO4PRTl/bmL0PvGVnyKQ
>o1PA2fz5YOyYzKxc82S9+mF8IQWsG804smpxLwfjxuXq0UH6MDk40W/VTJTfKFnE
>6C+PaO3wEK85pg41ih8vM4FsykT3Dtn2ppLpDPKNUV12XsyzvXx9G6TkW4S3WF7O
>0uzLFMaZlEaA5JaDS+31PICryQG4I9VoyL28TpE0PLeSSJX1QuX5NcF4ssmJerks
>NnKFpRCGI9RZvlX7LwPt
>=02cm
>-END PGP SIGNATURE-
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [nova] Min libvirt for Mitaka is 0.10.2 and suggest Nxxx uses 1.1.1

2015-10-07 Thread Kris G. Lindgren
Please see inline.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 10/7/15, 6:12 AM, "Tim Bell"  wrote:

>
>
>> -Original Message-
>> From: Daniel P. Berrange [mailto:berra...@redhat.com]
>> Sent: 07 October 2015 13:25
>> To: Tim Bell 
>> Cc: Sean Dague ; OpenStack Development Mailing List
>> (not for usage questions) ; openstack-
>> operat...@lists.openstack.org
>> Subject: Re: [Openstack-operators] [openstack-dev] [nova] Min libvirt for
>> Mitaka is 0.10.2 and suggest Nxxx uses 1.1.1
>>
>> On Wed, Oct 07, 2015 at 11:13:12AM +, Tim Bell wrote:
>> >
>> > Although Red Hat is no longer supporting RHEL 6 after Icehouse, a
>> > number of users such as GoDaddy and CERN are using Software
>> > Collections to run the Python 2.7 code.
>>
>> Do you have any educated guess as to when you might switch to deploying
>> new OpenStack version exclusively on RHEL 7 ? I understand such a switch is
>> likely to take a while so you can test its performance and reliability and 
>> so on,
>> but I'm assuming you'll eventually switch ?
>>
>
>I think we'll be all 7 by spring next year (i.e. when we install Liberty). The 
>software collections work is not for the faint hearted and 7 brings lots of 
>good things with it for operations so we want to get there as soon as 
>possible. Thus, I think we'd be fine with a change in Mitaka (especially given 
>the points you mention below).

Like CERN, we don't currently plan on doing the software collections + venv 
trick past kilo.  We plan on having all of our HVs running CentOS 7+ before we 
move to liberty.  That said, Liberty should still technically work under CentOS 
6...

I am ok dropping support for RHEL/CentOS 6 in N.

>
>> > However, since this modification would only take place when Mitaka
>> > gets released, this would realistically give those sites a year to
>> > complete migration to RHEL/CentOS 7 assuming they are running from one
>> > of the community editions.
>> >
>> > What does the 1.1.1 version bring that is the motivation for raising
>> > the limit ?
>>
>> If we require 1.1.1 we could have unconditional support for
>>
>>  - Hot-unplug of PCI devices (needs 1.1.1)
>>  - Live snapshots (needs 1.0.0)
>>  - Live volume snapshotting (needs 1.1.1)
>>  - Disk sector discard support (needs 1.0.6)
>>  - Hyper-V clock tunables (needs 1.0.0 & 1.1.0)
>>
>> If you lack those versions, in case of hotunplug, and live volume snapshots
>> we just refuse the corresponding API call. With live snapshots we fallback 
>> to
>> non-live snapshots. For disk discard and hyperv clock we just run with
>> degraded functionality. The lack of hyperv clock tunables means Windows
>> guests will have unreliable time keeping and are likely to suffer random
>> BSOD, which I think is a particularly important issue.
>>
>> And of course we remove a bunch of conditional logic from Nova which
>> simplifies the code paths and removes code paths which rarely get testing
>> coverage.
>>
>> Regards,
>> Daniel
>> --
>> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ 
>> :|
>> |: http://libvirt.org  -o- http://virt-manager.org 
>> :|
>> |: http://autobuild.org   -o- http://search.cpan.org/~danberr/ 
>> :|
>> |: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc 
>> :|
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Neutron DHCP failover bug

2015-09-30 Thread Kris G. Lindgren
We run nova-metadata on all the compute nodes, then bind 169.254.169.254 to lo 
on each HV.  This usually works with the standard iptables rule that 
nova-metadata adds.  Worst case, you just add it to the default rule set 
for the compute node.  Inside the images, I think all you need to do is make 
sure that zeroconf is turned off, so that the default route for 
169.254.169.254 out eth0 is still there.  I suppose you could also add a route 
via dhcp to always point 169.254.169.254 out eth0.  Worst case, if the 
arp entry for 169.254.169.254 makes it out of the HV, you get automatic HA with 
all the other HVs on the same network, which will respond to the ARP request for 
that vm.  For us, we typically have 43 other servers running metadata on the 
same network, so it's an active/active/active/active config.
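
A rough sketch of the per-hypervisor bits described above (8775 is nova-metadata's 
default listen port; the REDIRECT rule is only needed if the one nova adds is not 
already in place):

ip addr add 169.254.169.254/32 dev lo
iptables -t nat -A PREROUTING -d 169.254.169.254/32 -p tcp --dport 80 -j REDIRECT --to-ports 8775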

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Sam Morrison
Date: Wednesday, September 30, 2015 at 7:24 PM
To: Assaf Muller
Cc: 
"openstack-operators@lists.openstack.org"
Subject: Re: [Openstack-operators] Neutron DHCP failover bug


On 1 Oct 2015, at 10:52 am, Assaf Muller wrote:

That's interesting. Looks like DHCP A/A only works if you use your (HA) routers 
to provide metadata, then.

Yes that’s true, we’re not doing any L3 stuff in neutron yet. These are just 
shared external provider networks.

Sam



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [ops] Operator Local Patches

2015-09-29 Thread Kris G. Lindgren
Hello All,

We have some pretty good contributions of local patches on the etherpad.  We 
are going through right now and trying to group patches that multiple people 
are carrying, and patches that people may not be carrying but that solve a problem 
they are running into.  If you can take some time and either add your own 
local patches to the etherpad or add +1's next to the patches 
that are laid out, it would help us immensely.

The etherpad can be found at: 
https://etherpad.openstack.org/p/operator-local-patches

Thanks for your help!

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Kris G. Lindgren"
Date: Tuesday, September 22, 2015 at 4:21 PM
To: openstack-operators
Subject: Re: Operator Local Patches

Hello all,

Friendly reminder: If you have local patches and haven't yet done so, please 
contribute to the etherpad at: 
https://etherpad.openstack.org/p/operator-local-patches

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Kris G. Lindgren"
Date: Friday, September 18, 2015 at 4:35 PM
To: openstack-operators
Cc: Tom Fifield
Subject: Operator Local Patches

Hello Operators!

During the ops meetup in Palo Alto we were talking about sessions for Tokyo.  A 
session that I proposed, which got a bunch of +1's, was about local patches 
that operators were carrying.  From my experience this is done to either 
implement business logic,  fix assumptions in projects that do not apply to 
your implementation, implement business requirements that are not yet 
implemented in openstack, or fix scale related bugs.  What I would like to do 
is get a working group together to do the following:

1.) Document local patches that operators have (even those that are in gerrit 
right now waiting to be committed upstream)
2.) Figure out commonality in those patches
3.) Either upstream the common fixes to the appropriate projects or figure out 
if a hook can be added to allow people to run their code at that specific point
4.) 
5.) Profit

To start this off, I have documented every patch, along with a description of 
what it does and why we did it (where needed), that GoDaddy is running [1].  
What I am asking is that the operator community please update the etherpad with 
the patches that you are running, so that we have a good starting point for 
discussions in Tokyo and beyond.

[1] - https://etherpad.openstack.org/p/operator-local-patches
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-operators][osops] Something other than NOOP in our jenkins tests

2015-09-29 Thread Kris G. Lindgren
If we are going to be stringent on formatting – I would also like to see us be 
relatively consistent on arguments/env variables that are needed to make a 
script run.  Some pull in env vars, some source an rc file, some just assume 
you have already sourced your rc file, and others accept command-line options.  It 
would be nice if we had a set of curated scripts that all worked in a similar 
fashion.

Also, to Joe's point: it would be nice if we had two places for scripts.  A 
"dumping ground" where people could share what they have, and a curated one, 
where everything within the curated repo follows a standard set of 
conventions/guidelines.
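
For what it's worth, anyone can run the same check the gate would run before 
pushing; a quick sketch (the script path here is just a placeholder):

pip install bashate
bashate osops-tools-generic/example.sh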

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Joe Topjian
Date: Tuesday, September 29, 2015 at 1:43 PM
To: JJ Asghar
Cc: 
"openstack-operators@lists.openstack.org"
Subject: Re: [Openstack-operators] [openstack-operators][osops] Something other 
than NOOP in our jenkins tests

So this will require bash scripts to adhere to bashate before being accepted? 
Is it possible to have the check as non-voting? Does this open the door to 
having other file types be checked?

IMHO, it's more important for the OSOps project to foster collaboration and 
contributions rather than worry about an accepted style.

As an example, yesterday's commits used hard-tabs:

https://review.openstack.org/#/c/228545/
https://review.openstack.org/#/c/228534/

I think we're going to see a lot of variation of styles coming in.

I don't want to come off as sounding ignorant or disrespectful to other 
projects that have guidelines in place -- I fully understand and respect those 
decisions.

Joe

On Tue, Sep 29, 2015 at 12:52 PM, JJ Asghar wrote:
Awesome! That works!

Best Regards,
JJ Asghar
c: 512.619.0722 t: @jjasghar irc: j^2

On 9/29/15 1:27 PM, Christian Berendt wrote:
> On 09/29/2015 07:45 PM, JJ Asghar wrote:
>> So this popped up today[1]. This seems like something that should be
>> leveraged in our gates/validations?
>
> I prepared review requests to enable checks on the gates for
>
> * osops-tools-monitoring: https://review.openstack.org/#/c/229094/
> * osops-tools-generic: https://review.openstack.org/#/c/229043/
>
> Christian.
>


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Migrating an instance to a host with less cores fails

2015-09-25 Thread Kris G. Lindgren
I believe TWC (medberry on irc) was lamenting to me about cpusets, different 
hypervisor HW configs, and unassigned vcpus in NUMA nodes.

The problem is that the migration does not re-define the domain.xml, specifically 
the vcpu mapping, to match what makes sense on the new host.  I believe the 
issue is more pronounced when you go from a compute node with more cores to a 
compute node with fewer cores.  I believe the opposite migration works; the 
vcpu/NUMA mappings are just all wrong.

CC'ing him as well.
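
A quick way to see what actually got written into the domain on the destination 
host after a migration (the libvirt domain name below is just a placeholder):

virsh dumpxml instance-000012ab | grep -i cpuset
virsh vcpuinfo instance-000012ab    # shows per-vcpu CPU affinity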
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy







On 9/25/15, 11:53 AM, "Steve Gordon"  wrote:

>Adding Nikola as he has been working on this.
>
>- Original Message -
>> From: "Aubrey Wells" 
>> To: openstack-operators@lists.openstack.org
>> 
>> Greetings,
>> Trying to decide if this is a bug or just a config option that I can't
>> find. The setup I'm currently testing in my lab with is two compute nodes
>> running Kilo, one has 40 cores (2x 10c with HT) and one has 16 cores (2x 4c
>> + HT). I don't have any CPU pinning enabled in my nova config, which seems
>> to have the effect of setting in libvirt.xml a vcpu cpuset element like (if
>> created on the 40c node):
>> 
>> <vcpu cpuset="1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39">1</vcpu>
>> 
>> And then if I migrate that instance to the 16c node, it will bomb out with
>> an exception:
>> 
>> Live Migration failure: Invalid value
>> '0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38' for 'cpuset.cpus':
>> Invalid argument
>> 
>> Which makes sense, since that node doesn't have any vcpus after 15 (0-15).
>> 
>> I can fix the symptom by commenting out a line in
>> nova/virt/libvirt/config.py (circa line 1831) so it always has an empty
>> cpuset and thus doesn't write that line to libvirt.xml:
>> # vcpu.set("cpuset", hardware.format_cpu_spec(self.cpuset))
>> 
>> And the instance will happily migrate to the host with less CPUs, but this
>> loses some of the benefit of openstack trying to evenly spread out the core
>> usage on the host, at least that's what I think the purpose of that is.
>> 
>> I'd rather fix it the right way if there's a config option I don't see or
>> file a bug if its a bug.
>> 
>> What I think should be happening is that when it creates the libvirt
>> definition on the destination compute node, it write out the correct cpuset
>> per the specs of the hardware its going on to.
>> 
>> If it matters, in my nova-compute.conf file, I also have cpu mode and model
>> defined to allow me to migrate between the two different architectures to
>> begin with (the 40c is Sandybridge and the 16c is Westmere so I set it to
>> the lowest common denominator of Westmere):
>> 
>> cpu_mode=custom
>> cpu_model=Westmere
>> 
>> Any help is appreciated.
>> 
>> -
>> Aubrey
>> 
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> 
>
>-- 
>Steve Gordon, RHCE
>Sr. Technical Product Manager,
>Red Hat Enterprise Linux OpenStack Platform
>
>___
>OpenStack-operators mailing list
>OpenStack-operators@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [Large Deployments Team][Performance Team] New informal working group suggestion

2015-09-23 Thread Kris G. Lindgren
Dina,

Do we have a place (an etherpad) to put things that we are seeing performance 
issues with?  I know we are seeing issues with CPU load under nova-conductor, as 
well as some stuff with the neutron API timing out (it seems like it never 
responds to the request; there is no log entry on the neutron side).

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Matt Van Winkle
Date: Tuesday, September 22, 2015 at 7:46 AM
To: Dina Belova, OpenStack Development Mailing List, 
"openstack-operators@lists.openstack.org"
Subject: Re: [Openstack-operators] [Large Deployments Team][Performance Team] 
New informal working group suggestion

Thanks, Dina!

For context to the rest of the LDT folks, Dina reached out to me about working 
on this under our umbrella for now.  It made sense until we understand if it's 
a large enough thing to live as its own working group because most of us have 
various performance concerns too.  So, like Public Clouds, we'll have to figure 
out how to integrate this sub group.

I suspect the time slot for Tokyo is already packed, so the work for the 
Performance subgroup may have to be informal or in other sessions, but I'll 
start working with Tom and the folks covering the session for me (since I won't 
be able to make it) on what we might be able to do.  I've also asked Dina to 
join the Oct meeting prior to the Summit so we can further discuss the sub team.

Thanks!
VW

From: Dina Belova
Date: Tuesday, September 22, 2015 7:57 AM
To: OpenStack Development Mailing List, "openstack-operators@lists.openstack.org"
Subject: [Large Deployments Team][Performance Team] New informal working group 
suggestion

Hey, OpenStackers!

I'm writing to propose organising a new informal team to work specifically on 
OpenStack performance issues. This will be a sub-team of the already existing 
Large Deployments Team, and I suppose it will be a good idea to gather people 
interested in OpenStack performance in one room, identify what issues are 
worrying contributors and what can be done, and share results of performance 
research :)

So please volunteer to take part in this initiative. I hope many people will be 
interested and we'll be able to use a cross-project session slot to meet in 
Tokyo and hold a kick-off meeting.

I would like to apologise I'm writing to two mailing lists at the same time, 
but I want to make sure that all possibly interested people will notice the 
email.

Thanks and see you in Tokyo :)

Cheers,
Dina

--

Best regards,

Dina Belova

Senior Software Engineer

Mirantis Inc.
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Cells V1 patches

2015-09-18 Thread Kris G. Lindgren
Hello all,

The LDT working group is currently trying to collect a list of patches that 
people are carrying to better support Cells V1.  We currently have a list of 
~30 patches[1] that operators who are using cells are running to fix bugs or 
fix broken functionality under cells v1.  If you are running cells and have 
patches, please please please update the etherpad[1].  We realize that work is 
ongoing to move from cells v1 to cells v2.  However, multiple child cells under 
v2 will not be supported until at least two releases out (1+ years).

What we are trying to do is get a list of patches people have to fix issues/add 
functionality and try to get them either added upstream or added to a common 
repository.  The idea being that we want to make it easier for people who need 
to use Cells v1 until Cells v2 is fully supported.

[1] - https://etherpad.openstack.org/p/PAO-LDT-cells-patches
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] deploy nova cells with neutron network

2015-09-17 Thread Kris G. Lindgren
Sha,

As you noticed, the vif_plug_notification does not work with cells under the 
stock configuration.  In icehouse/juno we simply ran with 
vif_plugging_is_fatal set to false and set the timeout value to 5 or 10 seconds, 
iirc.
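
For anyone who wants the same workaround, a sketch of what that looks like on the 
compute nodes (the option names are the standard nova [DEFAULT] ones; the 10 
second timeout is just the value we happened to use):

crudini --set /etc/nova/nova.conf DEFAULT vif_plugging_is_fatal False
crudini --set /etc/nova/nova.conf DEFAULT vif_plugging_timeout 10
service openstack-nova-compute restart   # service name varies by distro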

Sam Morrison made a patch and Mathieu Gagné helped update it to fix this.  
You can find it here: https://review.openstack.org/#/c/215459/

We are running this patch under kilo without any issues in production.

From experience there are a number of other things that are broken in a cells 
setup:
1.) Flavor creation
2.) Availability zones
3.) Host aggregates creation for the API Cell
4.) Instance-name (doesn't match between api-cell/child-cell/HV)
5.) vif_plug_notification
6.) metadata with x509-keys (x509 key is updated in api cell and not pushed to 
child cell, but metadata tries to reference it from the child cell)

We (LDT) are in the process of documenting the patches that we use with Cells 
and trying to get those upstreamed; we are hoping to have a complete list by 
Tokyo:
https://etherpad.openstack.org/p/PAO-LDT-cells-patches

NeCTAR maintains a large number of patches that fix the above brokenness in 
cells.  You can find most of their patches by looking at the commit history on 
their kilo branch:
https://github.com/NeCTAR-RC/nova/commits/nectar/kilo?page=2 (mainly August 
12/13/17/24) - We are running ~10 of those patches in production without any 
issues.
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: Sha Li
Date: Thursday, September 17, 2015 at 1:38 AM
To: 
"OpenStack-operators@lists.openstack.org"
Cc: "sam.morri...@unimelb.edu.au", 
"belmiro.more...@cern.ch"
Subject: [Openstack-operators] deploy nova cells with neutron network


Hi,

I am trying to test the nova cells function.
My test deployment consists of one api-cell node, one child-cell node and one 
compute node.

api-cell node:  nova-api, nova-cells, nova-cert, nova-condoleauth, 
nova-novncproxy
child-cell node: nova-cells, nova-conductor, nova-scheduler
compute node: nova-compute

I found most deployment examples use nova-network with nova-cells. I want 
to use neutron, so I have keystone, glance, and neutron-server, neutron-dhcp, and 
neutron-l3 shared between all cells, all deployed on the api-cell node.

I encountered a similar problem to the one described in this bug report:
https://bugs.launchpad.net/nova/+bug/1348103

When booting a new instance, nova-compute fails to get the network-vif-plugged 
notification and times out waiting for the callback.
But on the neutron server side, it looks like the notification was 
successfully sent and got a 200 response code from the nova-api server.

I had to set
vif_plugging_is_fatal = False
Then the instance can be spawned normally.

I am wondering how people use neutron with nova-cells; is this going to cause 
any trouble in a large-scale production deployment?


Cheers,
Sha



--- neutron server log file

2015-08-22 00:20:35.464 16812 DEBUG neutron.notifiers.nova [-] Sending events: 
[{'status': 'completed', 'tag': u'2839ca4d-b632-4d64-a174-ecfe34a7a746', 
'name': 'network-vif-plugged', 'server_uuid': 
u'092c8bc4-3643-44c0-b79e-ad5caac18b3d'}] send_events 
/usr/lib/python2.7/site-packages/neutron/notifiers/nova.py:232

2015-08-22 00:20:35.468 16812 INFO urllib3.connectionpool [-] Starting new HTTP 
connection (1): 192.168.81.221

2015-08-22 00:20:35.548 16812 DEBUG urllib3.connectionpool [-] "POST 
/v2/338aad513c604880a6a0dcc58b88b905/os-server-external-events HTTP/1.1" 200 
183 _make_request /usr/lib/python2.7/site-packages/urllib3/connectionpool.py:357

2015-08-22 00:20:35.550 16812 INFO neutron.notifiers.nova [-] Nova event 
response: {u'status': u'completed', u'tag': 
u'2839ca4d-b632-4d64-a174-ecfe34a7a746', u'name': u'network-vif-plugged', 
u'server_uuid': u'092c8bc4-3643-44c0-b79e-ad5caac18b3d', u'code': 200}




___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Please help!!!!Openvswitch attacked by ICMP!!!!!!!

2015-09-17 Thread Kris G. Lindgren
For us, on boot we configure the system's init scripts to bring up br-ex and 
plug the ethernet (or in our case bond) device into the external bridge.  
You should look at your specific distro for guidance here.  Red Hat based 
(RHEL/CentOS/Fedora) systems can use 
http://blog.oddbit.com/2014/05/20/fedora-and-ovs-bridge-interfac/ as a guide.

We do not assign any IP address to the interface attached to the bridge.  If 
you assigned 0.0.0.0 netmask 0.0.0.0, you basically assigned every IPv4 address 
to your interface, so anything that ARPs on your network for an IP 
address, your server is going to respond and say "hey, that's me".
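
For the Red Hat style setup linked above, a rough sketch of the two ifcfg files 
(the device names and the choice of BOOTPROTO are just assumptions for this 
example; adjust to your environment):

# /etc/sysconfig/network-scripts/ifcfg-br-ex
DEVICE=br-ex
DEVICETYPE=ovs
TYPE=OVSBridge
ONBOOT=yes
BOOTPROTO=none

# /etc/sysconfig/network-scripts/ifcfg-eth2
DEVICE=eth2
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ex
ONBOOT=yes
BOOTPROTO=none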
___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: applyhhj
Date: Thursday, September 17, 2015 at 8:55 AM
To: openstack-operators
Subject: [Openstack-operators] Please helpOpenvswitch attacked by 
ICMP!!!

Hi,
I followed The Guidance and tried to configure openvswitch(OVS) service. I 
first created a bridge br-ex and then added eth2 to the bridge. After that I 
set the IP of eth2 to 0.0.0.0 and then reboot the system. However br-ex was not 
up when system launched. So I turned on br-ex manually and then restart the 
network, but br-ex could not get ip from dhcp server. Thus I used “dhclient 
br-ex” to manually acquire IP. Well till then everything worked fine, but in 
the evening the Network Node was continuously attacked by ICMP package. Iptraf 
showed the following messages:

x ICMP time excd (56 bytes) from 4.69.143.125 to 166.111.61.xx on eth2
x ICMP dest unrch (host comm denied) (576 bytes) from 176.32.36.23 to 
166.111.61.xxx on eth2
x ICMP dest unrch (host comm denied) (576 bytes) from 176.32.36.23 to 
166.111.61.xx on eth2
x ICMP dest unrch (host) (100 bytes) from 59.66.96.226 to 166.111.61.xx on eth2
x ICMP time excd (56 bytes) from 4.69.143.125 to 166.111.61.xx on eth2
x ICMP dest unrch (host comm denied) (576 bytes) from 176.32.36.23 to 
166.111.61.xxx on eth2
x ICMP dest unrch (host comm denied) (576 bytes) from 176.32.36.23 to 
166.111.61.xx on eth2
x ICMP dest unrch (host) (100 bytes) from 59.66.96.226 to 166.111.61.x on eth2
x ICMP time excd (56 bytes) from 4.69.143.125 to 166.111.61.63 on eth2
x ICMP dest unrch (host comm denied) (576 bytes) from 176.32.36.23 to 
166.111.61.xx on eth2
x ICMP dest unrch (host comm denied) (576 bytes) from 176.32.36.23 to 
166.111.61.xxx on eth2
x ICMP dest unrch (host) (100 bytes) from 59.66.96.226 to 166.111.61.xx on eth2
x ICMP time excd (56 bytes) from 4.69.143.125 to 166.111.61.x on eth2

My IP is none of the above. The download speed in the system monitor went up 
to 3m/s or even higher, to 8m/s. I tried to use iptables and ebtables to filter 
ICMP packets and also set icmp_echo_ignore_all to drop all ICMP packets. But, 
unfortunately, nothing worked. As soon as I deleted eth2 from br-ex or brought 
down br-ex, the network went back to normal. If you have any idea, please help me. 
I have been stuck here for several days. Thank you very much!!

Regards!
hjh


2015-09-17

applyhhj
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] kilo - neutron - ipset problems?

2015-09-01 Thread Kris G. Lindgren
Hello,

We ran into this again today.

I created bug https://bugs.launchpad.net/neutron/+bug/1491131 for this, with 
the log files covering ~10 seconds before the issue happened through the first 
couple of ipset delete failures.
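
For anyone who hits the same thing, the manual workarounds from the original 
message below amount to the following (the set name and member are placeholders, 
taken from whatever the agent logs when the delete fails):

ipset add NIPv4example-set 10.0.0.5            # re-add the member so the agent's delete can succeed
service neutron-openvswitch-agent restart      # for the stale ipset reference case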




On 8/20/15, 6:37 AM, "Miguel Angel Ajo" <mangel...@redhat.com> wrote:

>Hi Kris,
>
>I'm adding Shi Han Zhang to the thread,
>
>I'm was involved in some refactors during kilo and Han Zhang in some 
>extra fixes during Liberty [1] [2] [3],
>
>Could you get us some logs of such failures to see what was 
>happening around the failure time?, as a minimum we should
>post the log error traces to a bug in https://bugs.launchpad.net/neutron
>
> We will be glad to use such information to make the ipset more 
>fault tolerant, and try to identify the cause of the
>possible race conditions.
>
>
>[1] https://review.openstack.org/#/c/187483/
>[2] https://review.openstack.org/190991
>[3] https://review.openstack.org/#/c/187433/
>
>
>
>Kris G. Lindgren wrote:
>>
>> We have been using ipsets since juno.  Twice now since our kilo 
>> upgrade we have had issues with ipsets blowing up on a compute node.
>>
>> The first time, was iptables was referencing an ipset that was either 
>> no longer there or was not added, and was trying to apply the iptables 
>> config every second and dumping the full iptables-resotore output into 
>> the log when it failed at TRACE level.
>> Second time, was that ipsets was failing to remove an element that was 
>> no longer there.
>>
>> For #1 I solved by restarting the neutron-openvswitch-agent.  For #2 
>> we just added the entry that ipsets was trying to remove.  It seems 
>> like we are having some race conditions under kilo that were not 
>> present under juno (or we managed to run it for 6+ months without it 
>> biting us).
>>
>> Is anyone else seeing the same problems?  I am noticing some commits 
>> reverting/re-adding around ipsets in kilo and liberty so trying to 
>> confirm if I need to open a new bug on this.
>> 
>>
>> Kris Lindgren
>> Senior Linux Systems Engineer
>> GoDaddy, LLC.
>>
>> ___
>> OpenStack-operators mailing list
>> OpenStack-operators@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] IP - availability monitoring

2015-08-26 Thread Kris G. Lindgren
Hello,

As you know, much discussion has been around the naming and the URL pathing for 
the ip-usages extension.  We also discussed this at the neutron mid-cycle.  
Since we are the ones that made the extension, we use it to 
help with scheduling in our layer 3 network design.  We have no preference as 
to the URLs, but the one thing that we want to maintain is the ability to 
query for all subnets at once.  We have a nova scheduling filter that makes a 
call into the ip-usages extension; otherwise, every time we provision a vm we 
would have to make N calls to get all the subnets and their usages.


Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.

From: Salvatore Orlando <salv.orla...@gmail.com>
Date: Tuesday, August 25, 2015 at 4:33 PM
To: Assaf Muller <amul...@redhat.com>
Cc: "openstack-operators@lists.openstack.org" <openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] IP - availability monitoring

As the specification linked by Assaf has without doubt value for operators, I 
think the drivers team might consider it for inclusion in the Liberty release.
Unfortunately the specification and the patch have not received many reviews, 
but can still be sorted, especially considering that the patch's size is 
manageable and its impact contained.

Nevertheless, Daniel in his post referred to providing information about usage 
of IPs in resources like subnets, whereas the patch under review proposes the 
addition of a new read-only resource called 'network_ip_usage'. The only thing 
I'd change is that I'd make this information available in a different way.
For instance:

- through a sub-URL of subnets: GET /v2.0/subnets/<id>/ip_usage
- through a query parameter on subnet: GET /v2.0/subnets/<id>?ip_usage=True
- making IPs a read-only resource: GET /v2.0/ips?subnet_id=<id>&count=True

I think from a user perspective the latter would be the most elegant and simplest 
to use, but it will require additional work to introduce resource counting 
in Neutron APIs; and for this there's an old spec too [1]. Having operators 
provide feedback on how they reckon this information is best consumed 
would be valuable.

[1] https://review.openstack.org/#/c/102199/

Salvatore






On 24 August 2015 at 03:21, Assaf Muller <amul...@redhat.com> wrote:


On Sun, Aug 23, 2015 at 8:23 PM, Daniel Speichert <dan...@speichert.pl> wrote:
On 8/22/2015 23:24, Balaji Narayanan (பாலாஜி நாராயணன்) wrote:
 Hello Operators,

 In the capacity management discussions at the Ops Summit last week, I
 thought there was some discussion on monitoring of fixed / floating
 subnets and availability.

 At Yahoo, we use nova-network and have an API extension available for
 reporting how many IP subnets are configured on a cluster and how much
 of them is used / remaining. We use this to trigger an alert /
 augment the cluster with additional subnets.

 If there is enough interest in this, we can look at pushing this upstream.

 Here is a blue print that vilobh wrote initially for this -
 https://review.openstack.org/#/c/94299/
This sounds like a very useful extension, considering there are really no
quotas for IP addresses and IPs are a scarce resource.
I'm aware of multiple big private cloud operators using custom scripts
to generate reports of available IP addresses.

I'm pretty sure an extension like this would be great for neutron (I'm
not using nova-network). Considering that most networking scenarios
(flat, provider networks, floating IPs with L3) have subnets as a
resource in neutron, with allocation pools, it seems enough to create an
extension that would provide statistics for a subnet or summary
statistics for all subnets within a network if so requested.

I can work on a new blueprint version for neutron.

There already is one.

Code:
https://review.openstack.org/#/c/212955/

RFE bug:
https://bugs.launchpad.net/neutron/+bug/1457986

Blueprint:
https://blueprints.launchpad.net/neutron/+spec/network-ip-usage-api

Spec:
https://review.openstack.org/#/c/180803/5/specs/liberty/network-ip-usage-api.rst


Regards,
Daniel Speichert



___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
