Re: [Openstack-operators] OpenContrail Integration with an existing OpenStack

2018-02-05 Thread Van Leeuwen, Robert
> Do I need to ensure OpenStack network state is clean before integrate 
> OpenContrail with my existing Openstack?
> Any suggestion?

What do you mean by “clean”?

Contrail will ignore all information in the neutron db.
Any incoming neutron api call will be translated to a contrail-api call.
Contrail itself does not use the MySQL database but has its own (Cassandra) 
databases.
So any router/subnet/network/interface etc. object that currently exists in the 
neutron/MySQL database is ignored.
That basically means anything that is using “stock” Neutron will break, since 
it will need to be re-created on Contrail.
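If you want to gauge the impact up front, a rough inventory of what would need 
re-creating on the Contrail side can be pulled from the existing Neutron endpoint 
before switching the plugin (standard CLI calls, nothing Contrail-specific):

  openstack network list
  openstack subnet list
  openstack router list
  openstack port list

Anything in those lists will have to be re-created through the Contrail-backed 
API afterwards.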

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] PCI pass through settings on a flavor without aliases on the API nodes

2017-10-18 Thread Van Leeuwen, Robert
Hi,

Does anyone know if it is possible to set PCI pass through on a flavor without 
also needing to set the alias on the nova API nodes as mentioned here:
https://docs.openstack.org/nova/pike/admin/pci-passthrough.html

E.g. you need to set in nova.conf:
[pci]
alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", 
"name":"a1" }

Then you can set the flavor:
openstack flavor set m1.large --property "pci_passthrough:alias"="a1:2"


E.g. I would be fine with just setting the PCI vendor/product on the flavor 
instead of also needing to set this on the API node.
So something like:
openstack flavor set m1.large --property "pci_passthrough:vendor"="8086" \
  --property "pci_passthrough:device"="154d:1"

Thx,
Robert  van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Case studies on Openstack HA architecture

2017-08-29 Thread Van Leeuwen, Robert
> Thanks Curtis, Robert, David and Mohammed for your responses. 
>As a follow up question, do you use any deployment automation tools for 
> setting up the HA control plane?
>  I can see the value of deploying each service in separate virtual 
> environment or containers but automating such deployment requires developing 
> some new tools. Openstack-ansible is one potential deployment tool that I am 
> aware of but that had limited support for CentOS. 

Here we currently have a physical loadbalancer which provides ALL the HA 
logic. 
The services are installed on VMs managed by Puppet.
IMHO a loadbalancer is the right place to solve HA since you only have to do it 
once. 
(Depending on your Neutron implementation you might also need something extra for 
Neutron.) 

To rant a bit about deployment/automation:
I am not necessarily a fan of management with Puppet, since module dependencies 
can become a nightmare even with things like r10k.
E.g. you want to deploy a newer version of keystone, and this requires a newer 
openstack-common puppet module. 
Now you have a change that affects everything (the other OpenStack puppet modules 
also use openstack-common) and they might now need upgrades as well.

To solve these kinds of things we are looking at containers and investigated two 
possible deployment scenarios:
OpenStack-Helm (+ containers built by Kolla), which is pretty nice and uses k8s.
The problem is that it is still early days for both k8s and Helm. 
Things that stuck out most:
* Helm: Nice for a deployment from scratch. 
   Integration with our environment is a bit of a pain (e.g. if you want to 
start with just one service).
   It would need a lot of work to fit it into our product and not everything 
would be easy to get into upstream.
   Still a very early implementation that needs quite a bit of TLC. 
   If you can live with what comes out of the box it might be a nice solution.
* k8s: It is a relatively complex product and it is still missing some features, 
especially for self-hosted installations.

After some deliberation, we decided to go with the “hashi” stack (with Kolla-built 
containers). 
This stack has more of a Unix philosophy: simple processes that each do one thing well:
* Nomad - scheduling 
* Consul - service discovery and KV store
* Vault - secret management
* Fabio - zero-conf loadbalancer which integrates with Consul
In general this stack is really easy to understand for everyone (just work 
with it half a day and you really understand what is going on under the hood).
There are no overlay networks :)
A lot of the stack can break without much impact. E.g. Nomad is only relevant 
when you want to start/stop containers; it can crash or be turned off the rest of 
the time.
Another pro is that there is a significant amount of in-house knowledge around 
these products.

To give an example of the difference in complexity, compare deploying k8s itself 
with the hashi stack: 
* deployment of k8s with Kargo: you have a very large playbook which takes 30 
minutes to run to set up a cluster.
* deployment of the hashi stack: just one binary per component with one 
config file; you are basically done in a few minutes, if even that.
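To make that concrete, this is roughly what standing up the stack looks like; 
the flags and paths below are illustrative, not our exact setup:

  # one binary + one config file per component
  consul agent -server -bootstrap-expect=3 -data-dir=/var/lib/consul -config-dir=/etc/consul.d
  nomad agent -server -config=/etc/nomad.d/server.hcl
  vault server -config=/etc/vault.d/vault.hcl
  fabio   # picks up services from the local Consul agent by default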


Cheers,
Robert van Leeuwen

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Case studies on Openstack HA architecture

2017-08-28 Thread Van Leeuwen, Robert
>
> Hi Openstack operators,
>
> Most Openstack HA deployment use 3 node database cluster, 3 node rabbitMQ 
> cluster and 3 Controllers.
> I am wondering whether there are any studies done that show the pros and cons 
> of co-locating database and messaging service with the Openstack control 
> services.
> In other words, I am very interested in learning about advantages and 
> disadvantages, in terms of ease of deployment,
> upgrade and overall API performance, of having 3 all-in-one Openstack 
> controller over a more distributed deployment model.

In general, a host with fewer running services is easier to troubleshoot.
However, there is no fundamental issue with sharing mysql/rabbit on a single 
controller, assuming the host is quick enough.

For the OpenStack API services I would highly recommend “splitting” things up.
It does not necessarily need to be on different physical hosts, but at least use 
something to prevent package dependencies between the different OpenStack 
components (virtualenvs/containers/VMs).
If you do not do this, upgrading an individual component on a node becomes 
nearly impossible.
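As a minimal sketch of what I mean (service and version picked arbitrarily), a 
virtualenv per component keeps the Python dependencies apart:

  virtualenv /opt/openstack/keystone
  /opt/openstack/keystone/bin/pip install keystone==<the version you want>

Containers or separate VMs achieve the same thing, just with a bigger isolation 
boundary.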

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] custom build image is slow

2017-08-02 Thread Van Leeuwen, Robert
>> how do we install virtio drivers if its missing? How do I verify it on the 
>> centos cloud image if its there?

>Unless it’s a very, very ancient unsupported version of CentOS, the virtio 
>drivers will be in the kernel package.
>Run lsmod and look for virtio to check whether they are loaded.

Forgot to mention: I do not think this will have anything to do with the upload 
speed to cinder, even if the driver is not loaded.

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] custom build image is slow

2017-08-02 Thread Van Leeuwen, Robert
> how do we install virtio drivers if its missing? How do I verify it on the 
> centos cloud image if its there?

Unless it’s a very, very ancient unsupported version of CentOS, the virtio 
drivers will be in the kernel package.
Run lsmod and look for virtio to check whether they are loaded.

Regarding the slower speed of the custom image:
First check whether they are really the same file format and one is not secretly a 
raw file:
file centos.qcow2

I would expect virt-sparsify to indeed be the way to go.
Although I think that cinder will convert the image to raw for you, and you 
probably want that for ceph (IIRC it is a setting).
It might be that the downloaded CentOS image is actually raw, so it does not need 
to be converted.
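For completeness, qemu-img shows the format plus the actual allocated size, and 
the sparsify step itself would look roughly like this (file names are just 
examples; virt-sparsify comes with libguestfs-tools):

  qemu-img info centos-cloud.qcow2
  qemu-img info custom-build.qcow2

  # shrink the custom image before uploading
  virt-sparsify --compress custom-build.qcow2 custom-build-sparse.qcow2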

You can also check the cinder server debug log / load, or the ceph load, to see if 
something is bottlenecked during the upload, to maybe get a hint at what the 
problem is.

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Experience with Cinder volumes as root disks?

2017-08-02 Thread Van Leeuwen, Robert
>>> Mike Smith 
>>On the plus side, Cinder does allow you to do QOS to limit I/O, whereas I do 
>>not believe that’s an option with Nova ephemeral.

You can specify the IOPS limits in the flavor.
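(As a sketch, using the libvirt quota extra specs; the flavor name and values 
are just an example:

  openstack flavor set m1.large \
    --property quota:disk_read_iops_sec=500 \
    --property quota:disk_write_iops_sec=500
)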
Drawbacks:
* You might end up with a lot of different flavors because of IOPS requirements
* Modifying an existing flavor won’t retroactively apply it to existing 
instances.
   You can hack it directly in the database, but the instances will still either 
need to be rebooted or you need to run a lot of virsh commands.
   (not sure if this is any better for cinder)

>>> Mike Smith 
>> And, again depending on the Cinder solution employed, the disk I/O for this 
>> kind of setup can be significantly better than some other options including 
>> Nova ephemeral with a Ceph backend.
IMHO ceph performance specifically scales out very well (e.g. lots of 100-IOPS 
instances) but scaling up might be an issue (e.g. running a significant 
database with lots of sync writes doing 10K IOPS).
Even with an optimally tuned SSD/NVMe cluster it still might not be as fast as 
you would like it to be.

>>>Kimball, Conrad 
>> and while it is possible to boot an image onto a new volume this is clumsy
As mentioned you can make RBD the default backend for ephemeral so you no 
longer need to specify boot from volume.
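For reference, the nova.conf bit that does this looks roughly like the following 
(pool and user names are just examples):

  [libvirt]
  images_type = rbd
  images_rbd_pool = vms
  images_rbd_ceph_conf = /etc/ceph/ceph.conf
  rbd_user = cinder
  rbd_secret_uuid = <libvirt secret uuid>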
Another option would be to use some other automation tooling to bring up your 
instances.
I recommend looking at e.g. Terraform or some other way to automate deployments.
Running a single command to install a whole environment, and boot from volume 
where necessary, is really great and makes sure things are reproducible.
Our tech-savvy users like it, but if you have people who only understand the 
web interface it might be a challenge ;)
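To illustrate: the “clumsy” manual boot-from-volume flow is roughly the two CLI 
calls below (names, sizes and the image UUID are placeholders), and this is 
exactly the kind of thing a Terraform definition hides behind one command:

  openstack volume create --image <image-uuid> --size 40 --bootable app-1-root
  openstack server create --flavor m1.large --volume app-1-root --network private app-1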

Some more points regarding ephemeral local storage:

Pros ephemeral local storage:
* No SPOF for your cloud (e.g. if a ceph software upgrade goes wrong the whole 
cloud will hang)
* Assuming SSDs: great performance
* Discourages pets, people will get used to instances being down for 
maintenance or unrecoverable due to hardware failure and will build and 
automate accordingly
* No volume storage to manage, assuming you will not offer it anyway

Cons ephemeral local storage:
* IMHO live migration with block migration is not really usable
(the instance will behave a bit slowly for some time and e.g. the whole Cassandra 
or Elasticsearch cluster performance will tank)
* No independent scaling of compute and space. E.g. with ephemeral you might 
have lots of disk left but no mem/cpu on the compute node or the other way 
around.
* Hardware failure will mean loss of that local data for at least a period of 
time, assuming it is recoverable at all. With enough compute nodes these will 
become weekly/daily events.
* Some pets (e.g. Jenkins boxes) are hard to get rid of even if you control the 
application landscape to a great degree.

I think that if you have a lot of “pets” or other constraints (e.g. a 
server/rack/availability zone cannot go down for maintenance) you probably want 
to run from volume storage.
You get your data highly available and can do live-migrations for maintenance.
Note that you still have to do some manual work to boot instances somewhere 
else if a hypervisor goes down but that’s being worked on IIRC.


>>>Kimball, Conrad 
>> Bottom line:  it depends what you need, as both options work well and there 
>> are people doing both out there in the wild.
Totally agree.


Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Can Windows 8, Windows Server 2003 and Windows Server 2008 be deployed on openstack kilo

2017-04-03 Thread Van Leeuwen, Robert
Hello,

It’s not really OpenStack-related but depends on your virtualization stack.
Assuming it’s OpenStack with KVM:
https://www.linux-kvm.org/page/Guest_Support_Status#Windows_Family

Note that you might have some interesting times with the licensing.
From what I have understood (I am not a license specialist so take this with a 
pile of salt):
you must have a Windows (Datacenter?) license for every hypervisor that can 
potentially run Windows instances.
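Unrelated to the licensing, but for the os-variant question: on the OpenStack 
side you usually just tag the image so nova/KVM treats it as a Windows guest. 
The property names below are standard image metadata; the image name is an 
example:

  openstack image set \
    --property os_type=windows \
    --property os_distro=windows \
    windows-2008-r2

os_type=windows makes libvirt use localtime for the RTC, which Windows expects.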

Cheers,
Robert


From: Anwar Durrani 
Date: Monday, April 3, 2017 at 11:35 AM
To: openstack-operators 
Subject: [Openstack-operators] Can Windows 8, Windows Server 2003 and Windows 
Server 2008 be deployed on openstack kilo

Hi Team,

I am curious to know if Windows 8 Pro, Windows Server 2003 Std and Windows 
Server 2008 Std can be deployed to openstack or not ? if Yes then what would be 
os-variant for the same ?

--
Thanks & regards,
Anwar M. Durrani
+91-9923205011

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] VM monitoring suggestions

2016-11-21 Thread Van Leeuwen, Robert
>>I know that ceilometer may be an option, but I believe operators use all kind 
>>of tools for their own ressource usage monitoring. So what do you people use?
>>
>>(For this use case, we're looking for something that can be used without 
>>installing an agent in the VM, which makes it impossible to get a VM's load 
>>metric. I would be satisfied with cpu/memory/network/io metrics though.)


Although I’d like to re-evaluate ceilometer at some point, we currently use 
something very simple built on the infra we already had in place.

We use the collectd libvirt plugin and push the metrics to graphite.
https://collectd.org/wiki/index.php/Plugin:virt

If you use the following format you get the instance uuid in the metric name:
HostnameFormat "hostname uuid"
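The relevant collectd config is small; roughly the following (the interval is 
just an example):

  LoadPlugin virt

  <Plugin virt>
    Connection "qemu:///system"
    RefreshInterval 60
    HostnameFormat "hostname uuid"
  </Plugin>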

The output of those graphite keys is not exactly what we want (IIRC you get 
hostname_uuid instead of hostname.uuid).
We rewrite it a bit with the carbon-(c-)relay daemon into something more usable, 
so you get:
computenode.libvirt.UUID.metrics

We made a grafana dashboard where you can select the uuid and get the stats of 
the instance.

Pros:
* Simple, just needs collectd on compute nodes
* Graphite scales (with the proper setup)

Cons:
* No tenant-id in the metric name  (I guess with some scripting you can make a 
mapping-tree in graphite)
* No metrics in Horizon. (We still have to make some time to integrate these 
metrics into horizon but that should be doable.)
* Just instance metrics, nothing else

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Managing quota for Nova local storage?

2016-11-14 Thread Van Leeuwen, Robert
> I realize that you guys run private cloud, so you don’t have to worry about 
> bad actors getting a server from you and doing malicious things with it.
We might be a bit more forgiving but we are not as trusting as you might expect 
;)

> But do you have any concerns about the recent research [1] that uses 
> Rowhammer + ksm + transparent hugepages + kvm  to change the memory of 
> collocated VM’s?
>The research showed that they were able to successfully target memory inside 
>other VM’s to do things like modify authorized_keys in memory in such a way 
>that they could successfully login with their own key
Yes, the flip-feng-shui.
We ran a pretty big rowhammer cluster doing both normal and 
double_sided_rowhammer against our hardware but we did not notice any bitflips.
So I am still not sure how this actually affects server hardware with DDR4 and 
ECC.
From what I could find on the web regarding vulnerable DDR4 
( http://www.thirdio.com/rowhammer.pdf ) it looks like they tested 
consumer-oriented memory chips, i.e. not DDR4 with ECC.
Tests that were done with ECC-based memory were all based on DDR3 (which we do 
not have).

However, with the expansion of the memory config we will also look at whether 
memory is still the limiting factor and revisit KSM based on the usage.
Not just because of rowhammer but also because of some other KSM issues, e.g. KSM 
does not support hugepages.

Regarding detection: you can look with perf at cache misses of the kvm process, 
but avoiding false positives is hard.
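Roughly something like this per qemu process (pid and duration are placeholders):

  perf stat -e cache-misses,cache-references -p <qemu-pid> -- sleep 30

You then still need a sensible baseline per workload to decide what counts as 
suspicious.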

Cheers,
Robert




From: "Van Leeuwen, Robert" <rovanleeu...@ebay.com>
Date: Friday, November 11, 2016 at 12:10 AM
To: "Kris G. Lindgren" <klindg...@godaddy.com>, Edmund Rhudy 
<erh...@bloomberg.net>, "war...@wangspeed.com" <war...@wangspeed.com>
Cc: "openstack-operators@lists.openstack.org" 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

Thx for your stories,

I think we are now all doing pretty much the same thing to get around the issue 
but it still looks like a very useful feature.

So to share what we (eBay-ECG) are doing:
We also started out with scaling the flavor disksize to either memory or cpu. 
(so e.g. large disk == large memory)
But our users started asking for flavors with quite different specs.
Not being able to give those would be hugely inefficient.

So now we have started giving flavors to specific tenants instead of making them 
public (and let the quotas sort it out).
E.g. a flavor with 8 cores, 12G and 1TB of local storage will only be available 
to the tenants that really need it.

Looking at our hypervisor stats we either run out of memory or disk before cpu 
cycles so not having a tunable on disk is inconvenient.
Our latest spec hypervisors have 768GB and we run KSM so we will probably run 
out of DISK first there.
We run SSD-only on local storage so that space in the flavor is real $$$.

We started to run on zfs with compression on our latest config/iteration and 
that seems to alleviate the pain a bit.
It is a bit early to tell exactly but it seems to run stable and the 
compression factor will be around 2.0

P.S. I noticed my search for blueprints was not good enough, so I closed mine 
and subscribed to the one that was already there:
https://blueprints.launchpad.net/nova/+spec/nova-disk-quota-tracking

Robert van Leeuwen

From: "Kris G. Lindgren" <klindg...@godaddy.com>
Date: Thursday, November 10, 2016 at 5:18 PM
To: Edmund Rhudy <erh...@bloomberg.net>, "war...@wangspeed.com" 
<war...@wangspeed.com>, Robert Van Leeuwen <rovanleeu...@ebay.com>
Cc: "openstack-operators@lists.openstack.org" 
<openstack-operators@lists.openstack.org>
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

This is what we have done as well.

We made our flavors stackable, starting with our average deployed flavor size 
and making things a multiple of that.  IE if our average deployed flavor size 
is 8GB 120GB of disk, our larger flavors are multiple of that.  So if 16GB 
240GB of disk is the average, the next flavor up maybe: 32GB 480GB of disk.  
From there its easy to then say with 256GB of ram we will average:  ~30 VM’s 
which means we need to have ~3.6TB of local storage per node.  Assuming that 
you don’t over allocate disk or ram.  In practice though you can get a running 
average of the amount of disk space consumed and work towards that plus a bit 
of a buffer and run with a disk oversubscription.

We currently have no desire to remove local storage.  We want the root disks to 
be on local storage.  That being said, in the future we will most likely give 
smaller root disks and, if people need more space, ask them to provision an RBD 
volume through cinder.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

Re: [Openstack-operators] Managing quota for Nova local storage?

2016-11-10 Thread Van Leeuwen, Robert
Thx for your stories,

I think we are now all doing pretty much the same thing to get around the issue 
but it still looks like a very useful feature.

So to share what we (eBay-ECG) are doing:
We also started out with scaling the flavor disksize to either memory or cpu. 
(so e.g. large disk == large memory)
But our users started asking for flavors with quite different specs.
Not being able to give those would be hugely inefficient.

So now we have started giving flavors to specific tenants instead of making them 
public (and let the quotas sort it out).
E.g. a flavor with 8 cores, 12G and 1TB of local storage will only be available 
to the tenants that really need it.
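In practice that is just a private flavor plus project access, e.g. (names and 
sizes are an example):

  openstack flavor create --private --vcpus 8 --ram 12288 --disk 1000 big-local-disk
  openstack flavor set --project <tenant-id> big-local-disk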

Looking at our hypervisor stats we either run out of memory or disk before cpu 
cycles so not having a tunable on disk is inconvenient.
Our latest spec hypervisors have 768GB and we run KSM so we will probably run 
out of DISK first there.
We run SSD-only on local storage so that space in the flavor is real $$$.

We started to run on zfs with compression on our latest config/iteration and 
that seems to alleviate the pain a bit.
It is a bit early to tell exactly, but it seems to run stably and the 
compression factor will be around 2.0.
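For the curious, the setup itself is nothing special; a rough sketch (device, 
pool name and mountpoint are examples):

  zpool create -o ashift=12 nova /dev/nvme0n1
  zfs set compression=lz4 nova
  zfs set mountpoint=/var/lib/nova/instances nova
  zfs get compressratio nova   # this is where the ~2.0 figure comes from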

P.S. I noticed my search for blueprints was not good enough, so I closed mine 
and subscribed to the one that was already there:
https://blueprints.launchpad.net/nova/+spec/nova-disk-quota-tracking

Robert van Leeuwen

From: "Kris G. Lindgren" 
Date: Thursday, November 10, 2016 at 5:18 PM
To: Edmund Rhudy , "war...@wangspeed.com" 
, Robert Van Leeuwen 
Cc: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

This is what we have done as well.

We made our flavors stackable, starting with our average deployed flavor size 
and making things a multiple of that.  IE if our average deployed flavor size 
is 8GB 120GB of disk, our larger flavors are multiple of that.  So if 16GB 
240GB of disk is the average, the next flavor up maybe: 32GB 480GB of disk.  
From there its easy to then say with 256GB of ram we will average:  ~30 VM’s 
which means we need to have ~3.6TB of local storage per node.  Assuming that 
you don’t over allocate disk or ram.  In practice though you can get a running 
average of the amount of disk space consumed and work towards that plus a bit 
of a buffer and run with a disk oversubscription.

We currently have no desire to remove local storage.  We want the root disks to 
be on local storage.  That being said, in the future we will most likely give 
smaller root disks and, if people need more space, ask them to provision an RBD 
volume through cinder.

___
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Edmund Rhudy (BLOOMBERG/ 120 PARK)" 
Reply-To: Edmund Rhudy 
Date: Thursday, November 10, 2016 at 8:47 AM
To: "war...@wangspeed.com" , "rovanleeu...@ebay.com" 

Cc: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

We didn't come up with one. RAM on our HVs is the limiting factor since we 
don't run with memory overcommit, so the ability of people to run an HV out of 
disk space ended up being moot. ¯\_(ツ)_/¯

Long term we would like to switch to being exclusively RBD-backed and get rid 
of local storage entirely, but that is Distant Future at best.

From: rovanleeu...@ebay.com
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?
Hi,

Found this thread in the archive so a bit of a late reaction.
We are hitting the same thing so I created a blueprint:
https://blueprints.launchpad.net/nova/+spec/nova-local-storage-quota

If you guys already found a nice solution to this problem I’d like to hear it :)

Robert van Leeuwen
eBay - ECG

From: Warren Wang 
Date: Wednesday, February 17, 2016 at 8:00 PM
To: Ned Rhudy 
Cc: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

We are in the same boat. Can't get rid of ephemeral for its speed and 
independence. I get it, but it makes management of all these tiny pools a 
scheduling and capacity nightmare.
Warren @ Walmart

On Wed, Feb 17, 2016 at 1:50 PM, Ned Rhudy (BLOOMBERG/ 731 LEX) 
> wrote:
The subject says it all - does anyone know of a method by which quota can be 
enforced on storage provisioned via Nova rather than Cinder? Googling around 
appears to indicate that this is not possible out of the box (e.g., 

Re: [Openstack-operators] Managing quota for Nova local storage?

2016-11-08 Thread Van Leeuwen, Robert
Hi,

Found this thread in the archive so a bit of a late reaction.
We are hitting the same thing so I created a blueprint:
https://blueprints.launchpad.net/nova/+spec/nova-local-storage-quota

If you guys already found a nice solution to this problem I’d like to hear it :)

Robert van Leeuwen
eBay - ECG

From: Warren Wang 
Date: Wednesday, February 17, 2016 at 8:00 PM
To: Ned Rhudy 
Cc: "openstack-operators@lists.openstack.org" 

Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

We are in the same boat. Can't get rid of ephemeral for its speed and 
independence. I get it, but it makes management of all these tiny pools a 
scheduling and capacity nightmare.
Warren @ Walmart

On Wed, Feb 17, 2016 at 1:50 PM, Ned Rhudy (BLOOMBERG/ 731 LEX) 
> wrote:
The subject says it all - does anyone know of a method by which quota can be 
enforced on storage provisioned via Nova rather than Cinder? Googling around 
appears to indicate that this is not possible out of the box (e.g., 
https://ask.openstack.org/en/question/8518/disk-quota-for-projects/).

The rationale is we offer two types of storage, RBD that goes via Cinder and 
LVM that goes directly via the libvirt driver in Nova. Users know they can 
escape the constraints of their volume quotas by using the LVM-backed 
instances, which were designed to provide a fast-but-unreliable RAID 0-backed 
alternative to slower-but-reliable RBD volumes. Eventually users will hit their 
max quota in some other dimension (CPU or memory), but we'd like to be able to 
limit based directly on how much local storage is used in a tenancy.

Does anyone have a solution they've already built to handle this scenario? We 
have a few ideas already for things we could do, but maybe somebody's already 
come up with something. (Social engineering on our user base by occasionally 
destroying a random RAID 0 to remind people of their unsafety, while tempting, 
is probably not a viable candidate solution.)

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Nova live-migration failing for RHEL7/CentOS7 VMs

2016-09-28 Thread Van Leeuwen, Robert
> There is a bug in the following:
>
> qemu-kvm-1.5.3-105.el7_2.7
> qemu-img-1.5.3-105.el7_2.7
> qemu-kvm-common-1.5.3-105.el7_2.7

You might be better off using the RHEV qemu packages.
They are more recent (2.3) and have more features compiled into them.
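On CentOS the easiest route is probably the Virt SIG rebuild; something like the 
following (package names may differ per distro/release; on RHEL proper this is 
the qemu-kvm-rhev package from the RHV channels):

  yum install centos-release-qemu-ev
  yum install qemu-kvm-ev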

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Openstack team size vs's deployment size

2016-09-08 Thread Van Leeuwen, Robert
> I was hoping to poll other operators to see what their average team size vs’s 
> deployment size is,
>  as I am trying to use this in an internal company discussion.
> Right now we are in the order of ~260 Compute servers per Openstack 
> Dev/Engineer.
> So trying to see how we compare with other openstack installs, particularly 
> those running with more than a few hundred compute nodes.

In my opinion it depends on too many things to have a general rule of 
thumb.
Just a few things that I think would impact the required team size:
* How many regions you have: setting up and managing a region usually takes 
more time than adding computes to an existing region
* How often you want/need to upgrade
* Are you offering more than “core IaaS services”, e.g. designate/trove/…
* What supporting things do you need around your cloud and who manages them, e.g. 
networking, setting up dns / repositories / authentication systems, etc.
* What kind of SDN are you using / how it needs to be integrated with existing 
networks
* What kind of hardware you are rolling and what the average size of the 
instances is. E.g. hosting 1000 tiny instances on a 768GB / 88-core hypervisor 
will probably create more support tickets than 10 large instances on a low-spec 
hypervisor.
* How do you handle storage: ceph/SAN/local?
* Do you need live-migration when doing maintenance or are you allowed to bring 
down an availability zone
* Are you building your own packages / Using vendor packages
* The level of support the users expect and which team is taking care of that

In my private cloud experience, rolling out compute nodes and the controllers is 
not the bulk of the work.
The time goes into all the things that you need around the cloud and the 
customizations; that is what takes time.

It might be a bit different for public cloud providers, where you might deliver 
as-is and do not need any integrations.
But you might need other things, like very reliable billing and good automation 
around misbehaving users.


Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] ElasticSearch on OpenStack

2016-09-02 Thread Van Leeuwen, Robert
Hi,

I had some “interesting” issues in the past with sparse files on xfs with 
elasticsearch:
http://engineering.spilgames.com/hypervisor-kernel-panics-hit-2014-sl6-5/

If you pre-allocate your files you should be good.
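A quick way to see whether a file is sparse is to compare the apparent size with 
what is actually allocated on disk (the path is a placeholder):

  du -h --apparent-size <datafile>
  du -h <datafile>

A large difference between the two means the file is sparse.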

Cheers,
Robert van Leeuwen

From: Tim Bell 
Date: Friday, September 2, 2016 at 2:36 PM
To: openstack-operators 
Subject: [Openstack-operators] ElasticSearch on OpenStack


Has anyone had experience running ElasticSearch on top of OpenStack VMs ?

Are there any tuning recommendations ?

Thanks
Tim
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] Updating flavor quotas (e.g. disk_iops) on existing instances.

2016-07-13 Thread Van Leeuwen, Robert
>> Since the instance_extra flavor table is a big JSON blob it is a pain to 
>> apply changes there.
>> Anybody found an easy way to do this?
> If You are using virsh, You can apply such limits manually for each
> instance. Check blkiotune command in virsh.

Using virsh only works for running instances and only until the next reboot.
This is because OpenStack will re-create the .xml file when you reboot an 
instance and any “local changes” made via virsh will be gone.

As I mentioned, this info is not re-read from the flavor when the XML is created 
but is stored per instance in the instance_extra.flavor column.
Luckily you can just overwrite instance_extra.flavor with one that has the quota 
applied to it, so you do not need to parse and modify the JSON blob (the 
JSON blob does not seem to contain unique data for the instance).

For future reference, the SQL query will look something like this:
update instance_extra set flavor='BIG JSON BLOB HERE'
  where instance_uuid IN (SELECT uuid FROM instances
    WHERE instances.instance_type_id='10' AND instances.deleted='0');

This will update all active instances with flavor-id 10 (note that this 
flavor-id is an auto-increment id and not the flavor-id you use when creating a 
flavor).
You can get the “JSON BLOB” from an instance which was created with the new 
extra_specs settings applied to it.
The setting will only be applied when you (hard) reboot the instance.
When you want to update instances without rebooting them you will ALSO need to 
do the virsh blkiotune part.
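For the running instances, the per-device throttle is blkdeviotune (blkiotune 
itself covers the cgroup weights); roughly, with domain/device/values as 
placeholders:

  virsh blkdeviotune instance-0000abcd vda --total-iops-sec 500 --live

--live changes only the running domain, which is fine here since the hard reboot 
will pick up the new values from the database anyway.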

I’d gladly hear any suggestions/tools for an easier way to do it.

Cheers,
Robert van Leeuwen

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] Updating flavor quotas (e.g. disk_iops) on existing instances.

2016-07-12 Thread Van Leeuwen, Robert
Hi,

Is there an easy way to update the quotas for flavors and apply them to existing 
instances?
It looks like these settings are tracked in the “instance_extra” table and not 
re-read from the flavor when (hard) rebooting the instances.

Since the flavor column in instance_extra is a big JSON blob it is a pain to apply 
changes there.
Anybody found an easy way to do this?

Thx,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] keystone authentication on public interface

2016-04-15 Thread Van Leeuwen, Robert
>
>Hello folks,
>
>I was wondering if you let me know if enabling keystone to listen on public 
>interface for ports 5000 and 35357 is considered as a normal practice. Example 
>if a customer wants to authenticate not via horizon or some other proxy but 
>setting up OS_AUTH_URL=http://blah  variable to be able to run OpenStack 
>commands in cli.

I think this depends a bit on your user base.
Personally I see horizon more as a getting-started thing for people who are not 
extremely technical and maybe want one or two instances which never change.

You really need the APIs if you want to automate deployments (e.g. using 
Terraform).
If you have e.g. ops teams using it, they will probably want APIs.

Depending on your user base (private/public cloud) you choose to expose the 
APIs on private/public IP space.
Since there are some pretty big OpenStack clouds facing the internet, e.g. 
Rackspace, I think the APIs are battle-tested.

Regarding how & ports:
I would terminate everything on port 443 (so people do not have to mess with 
firewalls) and offload SSL to a loadbalancer.
You can do host-header inspection on the loadbalancer so that e.g. 
keystone.example.com goes to your keystone server on port 5000 and 
keystone-admin.example.com goes to port 35357 (if you choose to expose it).
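A minimal sketch of that with HAProxy (hostnames, IPs and the cert path are 
placeholders):

  frontend openstack-api
      bind *:443 ssl crt /etc/haproxy/certs/api.pem
      mode http
      use_backend keystone-public if { hdr(host) -i keystone.example.com }
      use_backend keystone-admin  if { hdr(host) -i keystone-admin.example.com }

  backend keystone-public
      mode http
      server keystone1 10.0.0.11:5000 check

  backend keystone-admin
      mode http
      server keystone1 10.0.0.11:35357 check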

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [puppet] module dependencies and different openstack versions

2015-07-28 Thread Van Leeuwen, Robert

We currently use our own custom puppet modules to deploy openstack, I
have been looking into the official openstack modules and have a few
barriers to switching.

We are looking at doing this one project at a time but the modules have
a lot of dependencies. E.g. they all depend on the keystone module and try
to do things in keystone such as creating users, service endpoints etc.

This is a pain as I don't want it to mess with keystone (for one, we don't
support setting endpoints via an API) but also we don't want to move to
the official keystone module at the same time. We have some custom
keystone stuff which means we may never move to the official keystone
puppet module.

The neutron module pulls in the vswitch module but we don't use vswitch
and it doesn't seem to be a requirement of the module, so maybe it doesn't
need to be in the metadata dependencies?

It looks as if all the openstack puppet modules are designed to all be
used at once? Does anyone else have these kinds of issues? It would be
great if e.g. the neutron module would just manage neutron and not try to
do things in nova, keystone, mysql etc.


The other issue we have is that we have different services in openstack
running different versions. Currently we have Kilo, Juno and Icehouse
versions of different bits in the same cloud. It seems as if the puppet
modules are designed just to manage one openstack version? Are there any
thoughts on making them support different versions at the same time? Does
this work?



Hi, 

In my experience (I am setting up a new environment) the modules can be
used "stand-alone".
It is the OpenStack module itself that comes with a combined-server
example.
The separate modules (nova, glance, etc.) are very configurable and don't
necessarily need to set up e.g. keystone.

From the OpenStack module you can modify the profiles and it will not do
the keystone stuff / database, etc.
E.g. remove the ":nova::keystone::auth" part in the nova profile.

We use r10k to select which module versions to install and it should be trivial
to use Juno / Kilo stuff together (have not tested this myself).
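For example, in the r10k Puppetfile you can pin each module to the branch
matching the OpenStack release you want (module names and refs below are
illustrative):

  mod 'nova',
    :git => 'https://github.com/openstack/puppet-nova.git',
    :ref => 'stable/kilo'

  mod 'keystone',
    :git => 'https://github.com/openstack/puppet-keystone.git',
    :ref => 'stable/juno'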


Regarding the vswitch module, I *guess* that is regulated by the
following:
neutron/manifests/agents/ml2/ovs.pp: if $::neutron::params::ovs_agent_package
So unsetting that variable should not pull in the package.

Cheers,
Robert van Leeuwen


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


[Openstack-operators] [puppet][keystone] Creating Keystone users with a password in the puppet module (Kilo) throws error at second puppetrun

2015-07-21 Thread Van Leeuwen, Robert
Hi,

I am using the Kilo puppet recipes to set up Kilo on Ubuntu 14.04, to test the 
latest Puppet recipes with Vagrant.
I am creating a keystone admin user from within the puppet recipe.
Creating the keystone user works fine, but the second puppet run gives an error 
whenever you set a password for the user you want to create:
Error: /Stage[main]/Keystone::Roles::Admin/Keystone_user[admin]: Could not 
evaluate: Execution of '/usr/bin/openstack token issue --format value' returned 
1: ERROR: openstack The resource could not be found.

* When you do not pass the password in the keystone_user native type it does 
not throw an error.
* The first run will create the user successfully and set the password.
* After sourcing the credentials file and manually running “/usr/bin/openstack 
token issue --format value” it also does not give an error.
(I could not immediately find where puppet decides this command is run and 
with which credentials.)

Anyone hitting the same issue or knows what could be going wrong?

Example puppet keystone user config which breaks after the second run:
  keystone_user { 'admin':
    password => $::openstack::config::keystone_admin_password, # Removing this line fixes the issue
    email    => 'admin@openstack',
    ensure   => present,
    enabled  => true,
  }

Thx,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [Neutron] Deprecating the use_namespaces option - Now's the time to speak up!

2015-03-23 Thread Van Leeuwen, Robert
I think there are valid reasons to not use namespaces:

  *   Fewer moving parts == less that can potentially fail
  *   Troubleshooting is easier due to fewer places to look / no need for 
familiarity with namespace tools
  *   If I remember correctly, setting up a namespace can get really slow when 
you have a lot of them on a single machine

 IMHO, those shouldn’t be valid reasons anymore, since they were due to iproute 
 or sudo issues that have been corrected long ago, and all distros installing 
 neutron are supporting netns at this point

Well, you exactly made my point:
there is a lot that can and will go wrong with more moving parts.
That they are fixed at the moment does not mean that there will not be a new 
bug in the future…

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [openstack-dev] [Neutron] Deprecating the use_namespaces option - Now's the time to speak up!

2015-03-23 Thread Van Leeuwen, Robert
  Are the setups out there *not* using the use_namespaces option? I'm
  curious as
  to why, and if it would be difficult to migrate such a setup to use
  namespaces.

At my previous employer we did not use namespaces.
This was due to an installation a few years ago on SL6, which did not have 
namespace support at that time.

I think there are valid reasons to not use namespaces:

  *   Fewer moving parts == less that can potentially fail
  *   Troubleshooting is easier due to fewer places to look / no need for 
familiarity with namespace tools
  *   If I remember correctly, setting up a namespace can get really slow when 
you have a lot of them on a single machine

If you have a requirement for all networks to be routable, disabling 
namespaces does make sense.
Since I’m currently in the design phase for such an install I’d surely like to 
know if it is going to be deprecated.
Thanks for letting us know about this :)

Cheers,
Robert van Leeuwen
___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators