Re: [openstack-dev] [Vitrage] About alarms reported by datasource and the alarms generated by vitrage evaluator

2017-01-06 Thread Yujun Zhang
The two questions raised by YinLiYin are actually one, i.e. *how to enrich
the alarm properties* that can be used as a condition in root cause
deduction.

Both 'suspect' and 'datasource' are additional information that may be
referred to as a condition in a general fault model, a.k.a. a scenario in Vitrage.

It seems this could be done as follows:

   1. Introduce a flexible `metadata` dict into the ALARM entity
   2. Allow generating an update event [1] on metadata change
   3. Allow using ALARM metadata in scenario conditions
   4. Allow setting ALARM metadata in scenario actions

This would leave flexibility for continuous development by defining complex
scenario templates, while keeping the Vitrage evaluator simple and generic.
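
To make this concrete, here is a minimal Python sketch of steps 1-4; the
names and structures below are invented for illustration and nothing like
them exists in Vitrage today:

def matches_condition(alarm, condition):
    """Step 3: a scenario condition over the alarm's free-form metadata."""
    metadata = alarm.get('metadata', {})
    return all(metadata.get(key) == value for key, value in condition.items())

def apply_action(alarm, updates):
    """Steps 2 and 4: merge metadata set by a scenario action and report
    whether anything changed, so that an update event can be generated."""
    metadata = alarm.setdefault('metadata', {})          # step 1: the dict
    changed = any(metadata.get(k) != v for k, v in updates.items())
    metadata.update(updates)
    return changed

alarm = {'name': 'host down',
         'metadata': {'datasource': 'zabbix'}}

if matches_condition(alarm, {'datasource': 'zabbix'}):
    if apply_action(alarm, {'suspect': True}):
        pass  # here the evaluator would emit an update event [1]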

My two cents.

[1]:
http://docs.openstack.org/developer/vitrage/scenario-evaluator.html#concepts-and-guidelines


On Sat, Jan 7, 2017 at 2:23 AM Afek, Ifat (Nokia - IL) 
wrote:

> Hi YinLiYin,
>
>
>
> This is an interesting question. Let me divide my answer into two parts.
>
>
>
> First, the case that you described with Nagios and Vitrage. This problem
> depends on the specific Nagios tests that you configure in your system, as
> well as on the Vitrage templates that you use. For example, you can use
> Nagios/Zabbix to monitor the physical layer, and Vitrage to raise deduced
> alarms on the virtual and application layers. This way you will never have
> duplicated alarms. If you want to use Nagios to monitor the other layers as
> well, you can simply modify Vitrage templates so they don’t raise the
> deduced alarms that Nagios may generate, and use the templates to show RCA
> between different Nagios alarms.
>
>
>
> Now let’s talk about the more general case. Vitrage can receive alarms
> from different monitors, including Nagios, Zabbix, collectd and Aodh. If
> you are using more than one monitor, it is possible that the same alarm
> (maybe with a different name) will be raised twice. We need to create a
> mechanism to identify such cases and create a single alarm with the
> properties of both monitors. This has not been designed in detail yet, so
> if you have any suggestions, we will be happy to hear them.
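
A rough sketch of what such a matching mechanism might look like, in
illustrative Python (the names and the normalization table are invented;
nothing like this exists in Vitrage yet):

# Hypothetical mapping of monitor-specific alarm names to a common key.
NORMALIZED_NAMES = {
    ('nagios', 'CPU load'): 'high_cpu_load',
    ('zabbix', 'Processor load is too high'): 'high_cpu_load',
}

def alarm_key(alarm):
    """Identity of the underlying fault: affected resource plus a
    monitor-independent alarm name."""
    name = NORMALIZED_NAMES.get((alarm['monitor'], alarm['name']),
                                alarm['name'])
    return (alarm['resource_id'], name)

def merge(alarms):
    """Collapse duplicates into a single alarm per fault, keeping the
    properties reported by every monitor."""
    merged = {}
    for alarm in alarms:
        entry = merged.setdefault(alarm_key(alarm),
                                  {'monitors': [], 'properties': {}})
        entry['monitors'].append(alarm['monitor'])
        entry['properties'].update(alarm.get('properties', {}))
    return merged

The open question is what to use as the identity of the underlying fault;
the affected resource plus a normalized alarm name is only one option.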
>
>
>
> Best Regards,
>
> Ifat.
>
>
>
>
>
> *From: *"yinli...@zte.com.cn" 
> *Reply-To: *"OpenStack Development Mailing List (not for usage
> questions)" 
> *Date: *Friday, 6 January 2017 at 03:27
> *To: *"openstack-dev@lists.openstack.org" <
> openstack-dev@lists.openstack.org>
> *Cc: *"gong.yah...@zte.com.cn" , "
> han.jin...@zte.com.cn" , "wang.we...@zte.com.cn" <
> wang.we...@zte.com.cn>, "jia.peiy...@zte.com.cn" ,
> "zhang.yuj...@zte.com.cn" 
> *Subject: *[openstack-dev] [Vitrage] About alarms reported by datasource
> and the alarms generated by vitrage evaluator
>
>
>
> Hi all,
>
>    Vitrage generates alarms according to the templates. All the alarms
> raised by Vitrage have the type "vitrage". Suppose Nagios has an alarm A.
> If alarm A is raised by the Vitrage evaluator according to the action part
> of a scenario, the type of alarm A is "vitrage". If Nagios reports alarm A
> later, a new alarm A with type "Nagios" is generated in the entity graph.
>   There would then be two vertices for the same alarm in the graph, and we
> have to define two alarm entities, two relationships and two scenarios in
> the template file to make the alarm propagation procedure work.
>
>    It is inconvenient to describe the fault model of a system with a lot
> of alarms. How can we solve this problem?
>
>
>
> 殷力殷 YinLiYin
>
>
>
>
>
>
> 上海市浦东新区碧波路889号中兴研发大楼D502
> D502, ZTE Corporation R&D Center, 889# Bibo Road,
> Zhangjiang Hi-tech Park, Shanghai, P.R.China, 201203
> T: +86 21 68896229
> M: +86 13641895907
> E: yinli...@zte.com.cn
> www.zte.com.cn
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [acceleration]Team Bi-weekly Meeting 2017.1.4 Agenda

2017-01-06 Thread Zhipeng Huang
I think we just need half a day. I will try to figure out the room.

On Jan 6, 2017 10:38 PM, "Harm Sluiman"  wrote:

> One question regarding PTG,
> Since we don't get a specific room allocated, and the intent is for people
> to not float around meetings...
> What day(s) are you expecting to have Cyborg-specific discussion?
> It seems hotel booking will be at a premium soon
>
> On Wed, Jan 4, 2017 at 11:13 AM, Zhipeng Huang 
> wrote:
>
>> Hi Team,
>>
>> Thanks for a great discussion at today's meeting, please find the minutes
>> at https://wiki.openstack.org/wiki/Cyborg/MeetingLogs#2017-01-04
>>
>> On Wed, Jan 4, 2017 at 10:40 PM, Miroslav Halas 
>> wrote:
>>
>>> Howard and team,
>>>
>>>
>>>
>>> I usually have a conflict at this time, but I am trying to keep up with
>>> meeting logs and etherpads :). Either Scott or I will be at the PTG
>>> representing Lenovo, so we would be happy to participate.
>>>
>>>
>>>
>>> From the last meeting I have added a TODO to the Nasca etherpad to link the
>>> design document and the code being discussed. I cannot seem to locate the
>>> original files the Mellanox team shared with us. Would somebody who knows
>>> where these are shared be able to insert the links into the etherpad?
>>> https://etherpad.openstack.org/p/cyborg-nasca-design
>>>
>>>
>>>
>>> Thank you,
>>>
>>>
>>>
>>> Miro Halas
>>>
>>>
>>>
>>> *From:* Harm Sluiman [mailto:harm.slui...@gmail.com]
>>> *Sent:* Wednesday, January 04, 2017 9:22 AM
>>> *To:* Zhipeng Huang
>>> *Cc:* OpenStack Development Mailing List (not for usage questions);
>>> Miroslav Halas; rodolfo.alonso.hernan...@intel.com; Michele Paolino;
>>> Scott Kelso; Roman Dobosz; Jim Golden; pradeep.jagade...@huawei.com;
>>> michael.ro...@nokia.com; jian-feng.d...@intel.com;
>>> martial.mic...@nist.gov; Moshe Levi; Edan David; Francois Ozog; Fei K
>>> Chen; jack...@huawei.com; li.l...@huawei.com
>>> *Subject:* Re: [acceleration]Team Bi-weekly Meeting 2017.1.4 Agenda
>>>
>>>
>>>
>>> Happy New Year everyone.
>>>
>>> I won't be able to participate in the IRC today due to a conflict, but I
>>> will try to connect and monitor.
>>>
>>> I will also put more comments in the etherpads that are linked
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jan 4, 2017 at 6:28 AM, Zhipeng Huang 
>>> wrote:
>>>
>>> Hi Team,
>>>
>>>
>>>
>>> Please find the agenda at https://wiki.openstack.org/
>>> wiki/Meetings/CyborgTeamMeeting#Agenda_for_next_meeting
>>>
>>>
>>>
>>> our IRC channel is #openstack-cyborg
>>>
>>>
>>>
>>> --
>>>
>>> Zhipeng (Howard) Huang
>>>
>>>
>>>
>>> Standard Engineer
>>>
>>> IT Standard & Patent/IT Product Line
>>>
>>> Huawei Technologies Co., Ltd
>>>
>>> Email: huangzhip...@huawei.com
>>>
>>> Office: Huawei Industrial Base, Longgang, Shenzhen
>>>
>>>
>>>
>>> (Previous)
>>>
>>> Research Assistant
>>>
>>> Mobile Ad-Hoc Network Lab, Calit2
>>>
>>> University of California, Irvine
>>>
>>> Email: zhipe...@uci.edu
>>>
>>> Office: Calit2 Building Room 2402
>>>
>>>
>>>
>>> OpenStack, OPNFV, OpenDaylight, OpenCompute Aficionado
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> 宋慢
>>> Harm Sluiman
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Zhipeng (Howard) Huang
>>
>> Standard Engineer
>> IT Standard & Patent/IT Product Line
>> Huawei Technologies Co., Ltd
>> Email: huangzhip...@huawei.com
>> Office: Huawei Industrial Base, Longgang, Shenzhen
>>
>> (Previous)
>> Research Assistant
>> Mobile Ad-Hoc Network Lab, Calit2
>> University of California, Irvine
>> Email: zhipe...@uci.edu
>> Office: Calit2 Building Room 2402
>>
>> OpenStack, OPNFV, OpenDaylight, OpenCompute Aficionado
>>
>
>
>
> --
> 宋慢
> Harm Sluiman
>
>
>
>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] placement/resource providers update 6

2017-01-06 Thread Alex Xu
2017-01-07 5:55 GMT+08:00 Matt Riedemann :

> On 12/16/2016 6:40 AM, Chris Dent wrote:
>
>>
>> ## Resource Provider Traits
>>
>> There's been some recent activity on the spec for resource provider
>> traits. These are a way of specifying qualitative resource
>> requirements (e.g., "I want my disk to be SSD").
>>
>> https://review.openstack.org/#/c/345138/
>>
>> I'm not clear on whether this is still targeting Ocata or not?
>>
>
> Sorry for the late reply to this older update, but just to be clear,
> traits were never a goal or planned item for Ocata. We have to get the
> quantitative stuff working first before moving onto the qualitative stuff,
> but it's for sure fine to discuss designs/ideas or hack on POC code.
>
>
Yeah, ++, it isn't a goal for Ocata. I'm starting the PoC to make sure I can
finish it before the PTG (and before the holidays in China, which come right
after Ocata-3). Then people will have more material to discuss.


> --
>
> Thanks,
>
> Matt Riedemann
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] Multi-Regions Support

2017-01-06 Thread Matthieu Simonin


- Original Message -
> From: "Jay Pipes" 
> To: openstack-dev@lists.openstack.org
> Sent: Friday, 6 January 2017 21:42:46
> Subject: Re: [openstack-dev] [kolla] Multi-Regions Support
> 
> On 01/06/2017 03:23 PM, Sam Yaple wrote:
> > This should be read as MariaDB+Galera for replication. It is a
> > highly-available database.
> 
> Don't get me wrong. I love me some Galera. :) However, what the poster
> is really working towards is an implementation of the VCPE and eVCPE use
> cases for ETSI NFV. These use cases require a highly distributed compute
> fabric that can withstand long disruptions in network connectivity
> (between POPs/COs and the last mile of network service) while still
> being able to service compute and network functions at the customer premise.
> 
> Galera doesn't tolerate network disruption of any significant length of
> time. At all. If there is a Keystone service running on the customer
> premise that is connecting to a Galera database, and that Galera
> database's connectivity to its peers is disrupted, down goes the whole
> on-premise cloud fabric. And that's exactly what I believe the original
> poster is attempting to avoid. Thus my not understanding the choice here.
> 

Jay, you are thinking too far ;) 

The goal of this thread is to see how Kolla can deploy a multi-region scenario.
Remarks and contributions that help progress in that direction are the goal of
the initial post.

Best,

Matt

> Best,
> -jay
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][tripleo] Heat memory usage in the TripleO gate during Ocata

2017-01-06 Thread Zane Bitter

On 06/01/17 16:35, Thomas Herve wrote:

Thanks a lot for the analysis. It's great that things haven't gotten off track.


I tracked down most of the step changes to identifiable patches:

2016-10-07: 2.44GiB -> 1.64GiB
 - https://review.openstack.org/382068/ merged, making ResourceInfo classes
more memory-efficient. Judging by the stable branch (where this and the
following patch were merged at different times), this was responsible for
dropping the memory usage from 2.44GiB -> 1.83GiB. (Which seems like a
disproportionately large change?)

Without wanting to get the credit, I believe
https://review.openstack.org/377061/ is more likely the reason here.


It *is* possible (and I had, indeed, forgotten about that patch), since 
those two backports merged at around the same time. However, that patch 
merged to master on 30 September, a week before the other two patches, 
and there was no (downwards) change in the memory usage on master until 
the day after the other two merged. So the evidence is definitely not as 
clear-cut as with some of the others.



 - https://review.openstack.org/#/c/382377/ merged, so we no longer create
multiple yaql contexts. (This was responsible for the drop from 1.83GiB ->
1.64GiB.)

2016-10-17: 1.62GiB -> 0.93GiB
 - https://review.openstack.org/#/c/386696/ merged, reducing the number of
engine workers on the undercloud to 2.

2016-10-19: 0.93GiB -> 0.73GiB (variance also seemed to drop after this)
 - https://review.openstack.org/#/c/386247/ merged (on 2016-10-16), avoiding
loading all nested stacks in a single process simultaneously much of the
time.
 - https://review.openstack.org/#/c/383839/ merged (on 2016-10-16),
switching output calculations to RPC to avoid almost all simultaneous
loading of all nested stacks.

2016-11-08: 0.76GiB -> 0.70GiB
 - This one is a bit of a mystery???

Possibly https://review.openstack.org/390064/ ? Reducing the
environment size could have an effect.


Unlikely; stable/newton fell too (a few days later), but that patch was 
never backported. (Also, it merged on master almost a week before the 
change in memory use.)


It's likely a change in another repo, but I checked the obvious 
candidates (heatclient, tripleoclient) without luck.



2016-11-22: 0.69GiB -> 0.50GiB
 - https://review.openstack.org/#/c/398476/ merged, improving the efficiency
of resource listing?

2016-12-01: 0.49GiB -> 0.88GiB
 - https://review.openstack.org/#/c/399619/ merged, returning the number of
engine workers on the undercloud to 4.



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [glance] priorities for the coming week (01/06-01/12)

2017-01-06 Thread Brian Rosmaita
Hopefully everyone's had some rest over the holidays, and is ready to
code & review for O-3.

Please concentrate on the following:

(0) If you're a member of the Glance coresec team, you should be getting
a separate notification about an issue that could use your attention.
Please make some time to take a look.


(1) Patch to enable better request-id tracking:
https://review.openstack.org/#/c/352892/
On-going discussion on the patch.


(2) Port Glance migrations to Alembic
https://review.openstack.org/#/c/382958/
https://review.openstack.org/#/c/392993/


(3) Community images
This effort is stalled while discussions are happening around Steve's
tempest patch: https://review.openstack.org/#/c/414261/

The current patch will need reviews once the tempest situation has been
clarified, so keep it on your radar:
https://review.openstack.org/#/c/369110/


cheers,
brian

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][tripleo] Heat memory usage in the TripleO gate during Ocata

2017-01-06 Thread Zane Bitter

On 06/01/17 16:40, Hugh Brock wrote:

Why would TripleO not move to convergence at the earliest possible point?


We'll need some data to decide when the earliest possible point is :)

Last time Steve (Hardy) tested it I believe convergence was looking far 
worse than legacy in memory usage, at a time when legacy was already 
through the roof. Clearly a lot has changed since then, so now would be 
a good time to retest and re-evaluate where we stand.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][tripleo] Heat memory usage in the TripleO gate during Ocata

2017-01-06 Thread Zane Bitter

On 06/01/17 16:58, Emilien Macchi wrote:

On Fri, Jan 6, 2017 at 4:35 PM, Thomas Herve  wrote:

On Fri, Jan 6, 2017 at 6:12 PM, Zane Bitter  wrote:

It's worth reiterating that TripleO still disables convergence in the
undercloud, so these are all tests of the legacy code path. It would be
great if we could set up a non-voting job on t-h-t with convergence enabled
and start tracking memory use over time there too. As a first step, maybe we
could at least add an experimental job on Heat to give us a baseline?


+1. We haven't made any huge changes into that direction, but having
some info would be great.


+1 too. I volunteer to do it.

Quick question: to enable it, is it just a matter of setting
convergence_engine to true in heat.conf (on the undercloud)?


Yep! Actually, it's even simpler than that: now that true is the default 
(Newton onwards), it's just a matter of _not_ setting it to false :)


- ZB


If not, what else is needed?



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] OpenStack Developer Mailing List Digest December 31 - January 6

2017-01-06 Thread Mike Perez
HTML version: 
http://www.openstack.org/blog/2017/01/openstack-developer-mailing-list-digest-20170106/

SuccessBot Says
===
* Dims - Keystone now has Devstack based functional test with everything
  running under python3.5.
* Tell us yours via OpenStack IRC channels with message “#success "
* All: https://wiki.openstack.org/wiki/Successes 

Time To Retire Nova-docker
==
* nova-docker has lagged behind the last 6 months of nova development.
* No longer passes simple CI unit tests.
  - There are patches to at least get the unit tests working [1].
* If the core team no longer has time for it, perhaps we should just archive
  it.
* People ask about it on #openstack-nova about once or twice a year, but it’s
  not recommended as it’s not maintained.
* It’s believed some people are running and hacking on it outside of the
  community.
* The Zun project provides a lifecycle management interface for containers that
  are started in container orchestration engines provided with Magnum.
* The nova-lxc driver provides the ability to treat containers like your virtual
  machines. [2]
  - Not recommended for production use though, but still better maintained than
nova-docker [3].
* Nova-lxd also provides the ability to treat containers like virtual
  machines.
* Virtuozzo, which is supported in Nova via libvirt, provides both virtual
  machines and OS containers similar to LXC.
  - These containers have been in production for more than 10 years already.
  - Well maintained and actually has CI testing.
* A proposal to remove it [4].
* Full thread: 
http://lists.openstack.org/pipermail/openstack-dev/2016-December/thread.html#109387

Community Goals For Pike

* A few months ago the community started identifying work for OpenStack-wide
  goals to “achieve visible common changes, push for basic levels of
  consistency and user experience, and efficiently improve certain areas where
  technical debt payments have become too high - across all OpenStack projects.”
* First goal defined [5] to remove copies of incubated Oslo code.
* Moving forward in Pike:
  - Collect feedback of our first iteration. What went well and what was
challenging?
  - Etherpad for feedback [6]
* Goals backlog [7]
  - New goals welcome
  - Each goal should be achievable in one cycle. If not, it should be broken
up.
  - Some goals might require documentation for how they could be achieved.
* Choose goals for Pike
  - What is really urgent? What can wait for six months?
  - Who is available and interested in contributing to the goal?
* Feedback was also collected at the Barcelona summit [8]
* Digest of feedback:
  - Most projects achieved the goal for Ocata, and there was interest in doing
it on time.
  - Some confusion on acknowledging a goal and doing the work.
  - Some projects slow on the uptake and reviewing the patches.
  - Each goal should document where the “guides” are, and how to find them for
help.
  - Achieving multiple goals in a single cycle wouldn’t be possible for all
    teams.
* The OpenStack Product Working group is also collecting feedback for goals [9]
* Goals set for Pike:
  - Split out Tempest plugins [10]
  - Python 3 [11]
* TC agreeements from last meeting:
  - 2 goals might be enough for the Pike cycle.
  - The deadline to define Pike goals would be Ocata-3 (Jan 23-27 week).
* Full thread: 
http://lists.openstack.org/pipermail/openstack-dev/2016-December/thread.html#108755

POST /api-wg/news
=
* Guidelines current review:
  - Add guidelines on usage of state vs. status [12]
  - Add guidelines for boolean names [13]
  - Clarify the status values in versions [14]
  - Define pagination guidelines [15]
  - Add API capabilities discovery guideline [16]
* Full thread: 
http://lists.openstack.org/pipermail/openstack-dev/2017-January/109698.html


[1] - 
https://review.openstack.org/#/q/status:open+project:openstack/nova-docker+branch:master+topic:fixes_for_master
 
[2] - http://docs.openstack.org/developer/nova/support-matrix.html
[3] - 
http://docs.openstack.org/newton/config-reference/compute/hypervisor-lxc.html
[4] - http://lists.openstack.org/pipermail/openstack-dev/2016-July/098940.html
[5] - http://governance.openstack.org/goals/index.html
[6] - https://etherpad.openstack.org/p/community-goals-ocata-feedback
[7] - https://etherpad.openstack.org/p/community-goals
[8] - https://etherpad.openstack.org/p/ocata-summit-xp-community-wide-goals
[9] - http://lists.openstack.org/pipermail/product-wg/2016-December/001372.html
[10] - https://review.openstack.org/#/c/369749/
[11] - https://review.openstack.org/349069
[12] - https://review.openstack.org/#/c/411528/
[13] - https://review.openstack.org/#/c/411529/
[14] - https://review.openstack.org/#/c/411849/
[15] - https://review.openstack.org/#/c/390973/
[16] - https://review.openstack.org/#/c/386555/


Re: [openstack-dev] [heat][tripleo] Heat memory usage in the TripleO gate during Ocata

2017-01-06 Thread Emilien Macchi
On Fri, Jan 6, 2017 at 4:35 PM, Thomas Herve  wrote:
> On Fri, Jan 6, 2017 at 6:12 PM, Zane Bitter  wrote:
>> tl;dr everything looks great, and memory usage has dropped by about 64%
>> since the initial Newton release of Heat.
>>
>> I re-ran my analysis of Heat memory usage in the tripleo-heat-templates
>> gate. (This is based on the gate-tripleo-ci-centos-7-ovb-nonha job.) Here's
>> a pretty picture:
>>
>> https://fedorapeople.org/~zaneb/tripleo-memory/20170105/heat_memused.png
>>
>> There is one major caveat here: for the period marked in grey where it says
>> "Only 2 engine workers", the job was configured to use only 2 heat-enginer
>> worker processes instead of 4, so this is not an apples-to-apples
>> comparison. The inital drop at the beginning and the subsequent bounce at
>> the end are artifacts of this change. Note that the stable/newton branch is
>> _still_ using only 2 engine workers.
>>
>> The rapidly increasing usage on the left is due to increases in the
>> complexity of the templates during the Newton cycle. It's clear that if
>> there has been any similar complexity growth during Ocata, it has had a tiny
>> effect on memory consumption in comparison.
>
> Thanks a lot for the analysis. It's great that things haven't gotten off 
> track.
>
>> I tracked down most of the step changes to identifiable patches:
>>
>> 2016-10-07: 2.44GiB -> 1.64GiB
>>  - https://review.openstack.org/382068/ merged, making ResourceInfo classes
>> more memory-efficient. Judging by the stable branch (where this and the
>> following patch were merged at different times), this was responsible for
>> dropping the memory usage from 2.44GiB -> 1.83GiB. (Which seems like a
>> disproportionately large change?)
>
> Without wanting to get the credit, I believe
> https://review.openstack.org/377061/ is more likely the reason here.
>
>>  - https://review.openstack.org/#/c/382377/ merged, so we no longer create
>> multiple yaql contexts. (This was responsible for the drop from 1.83GiB ->
>> 1.64GiB.)
>>
>> 2016-10-17: 1.62GiB -> 0.93GiB
>>  - https://review.openstack.org/#/c/386696/ merged, reducing the number of
>> engine workers on the undercloud to 2.
>>
>> 2016-10-19: 0.93GiB -> 0.73GiB (variance also seemed to drop after this)
>>  - https://review.openstack.org/#/c/386247/ merged (on 2016-10-16), avoiding
>> loading all nested stacks in a single process simultaneously much of the
>> time.
>>  - https://review.openstack.org/#/c/383839/ merged (on 2016-10-16),
>> switching output calculations to RPC to avoid almost all simultaneous
>> loading of all nested stacks.
>>
>> 2016-11-08: 0.76GiB -> 0.70GiB
>>  - This one is a bit of a mystery???
>
> Possibly https://review.openstack.org/390064/ ? Reducing the
> environment size could have an effect.
>
>> 2016-11-22: 0.69GiB -> 0.50GiB
>>  - https://review.openstack.org/#/c/398476/ merged, improving the efficiency
>> of resource listing?
>>
>> 2016-12-01: 0.49GiB -> 0.88GiB
>>  - https://review.openstack.org/#/c/399619/ merged, returning the number of
>> engine workers on the undercloud to 4.
>>
>> It's not an exact science because IIUC there's a delay between a patch
>> merging in Heat and it being used in subsequent t-h-t gate jobs. e.g. the
>> change to getting outputs over RPC landed the day before the
>> instack-undercloud patch that cut the number of engine workers, but the
>> effects don't show up until 2 days after. I'd love to figure out what
>> happened on the 8th of November, but I can't correlate it to anything
>> obvious. The attribution of the change on the 22nd also seems dubious, but
>> the timing adds up (including on stable/newton).
>>
>> It's fair to say that none of the other patches we merged in an attempt to
>> reduce memory usage had any discernible effect :D
>>
>> It's worth reiterating that TripleO still disables convergence in the
>> undercloud, so these are all tests of the legacy code path. It would be
>> great if we could set up a non-voting job on t-h-t with convergence enabled
>> and start tracking memory use over time there too. As a first step, maybe we
>> could at least add an experimental job on Heat to give us a baseline?
>
> +1. We haven't made any huge changes into that direction, but having
> some info would be great.

+1 too. I volunteer to do it.

Quick question: to enable it, is it just a matter of setting
convergence_engine to true in heat.conf (on the undercloud)?
If not, what else is needed?

> --
> Thomas
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [nova] placement/resource providers update 6

2017-01-06 Thread Matt Riedemann

On 12/16/2016 6:40 AM, Chris Dent wrote:


## Resource Provider Traits

There's been some recent activity on the spec for resource provider
traits. These are a way of specifying qualitative resource
requirements (e.g., "I want my disk to be SSD").

https://review.openstack.org/#/c/345138/

I'm not clear on whether this is still targeting Ocata or not?


Sorry for the late reply to this older update, but just to be clear, 
traits were never a goal or planned item for Ocata. We have to get the 
quantitative stuff working first before moving onto the qualitative 
stuff, but it's for sure fine to discuss designs/ideas or hack on POC code.


--

Thanks,

Matt Riedemann


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][tripleo] Heat memory usage in the TripleO gate during Ocata

2017-01-06 Thread Hugh Brock
Why would TripleO not move to convergence at the earliest possible point?

On Jan 6, 2017 10:37 PM, "Thomas Herve"  wrote:

> On Fri, Jan 6, 2017 at 6:12 PM, Zane Bitter  wrote:
> > tl;dr everything looks great, and memory usage has dropped by about 64%
> > since the initial Newton release of Heat.
> >
> > I re-ran my analysis of Heat memory usage in the tripleo-heat-templates
> > gate. (This is based on the gate-tripleo-ci-centos-7-ovb-nonha job.)
> Here's
> > a pretty picture:
> >
> > https://fedorapeople.org/~zaneb/tripleo-memory/20170105/heat_memused.png
> >
> > There is one major caveat here: for the period marked in grey where it
> says
> > "Only 2 engine workers", the job was configured to use only 2
> heat-engine
> > worker processes instead of 4, so this is not an apples-to-apples
> > comparison. The initial drop at the beginning and the subsequent bounce at
> > the end are artifacts of this change. Note that the stable/newton branch
> is
> > _still_ using only 2 engine workers.
> >
> > The rapidly increasing usage on the left is due to increases in the
> > complexity of the templates during the Newton cycle. It's clear that if
> > there has been any similar complexity growth during Ocata, it has had a
> tiny
> > effect on memory consumption in comparison.
>
> Thanks a lot for the analysis. It's great that things haven't gotten off
> track.
>
> > I tracked down most of the step changes to identifiable patches:
> >
> > 2016-10-07: 2.44GiB -> 1.64GiB
> >  - https://review.openstack.org/382068/ merged, making ResourceInfo
> classes
> > more memory-efficient. Judging by the stable branch (where this and the
> > following patch were merged at different times), this was responsible for
> > dropping the memory usage from 2.44GiB -> 1.83GiB. (Which seems like a
> > disproportionately large change?)
>
> Without wanting to get the credit, I believe
> https://review.openstack.org/377061/ is more likely the reason here.
>
> >  - https://review.openstack.org/#/c/382377/ merged, so we no longer
> create
> > multiple yaql contexts. (This was responsible for the drop from 1.83GiB
> ->
> > 1.64GiB.)
> >
> > 2016-10-17: 1.62GiB -> 0.93GiB
> >  - https://review.openstack.org/#/c/386696/ merged, reducing the number
> of
> > engine workers on the undercloud to 2.
> >
> > 2016-10-19: 0.93GiB -> 0.73GiB (variance also seemed to drop after this)
> >  - https://review.openstack.org/#/c/386247/ merged (on 2016-10-16),
> avoiding
> > loading all nested stacks in a single process simultaneously much of the
> > time.
> >  - https://review.openstack.org/#/c/383839/ merged (on 2016-10-16),
> > switching output calculations to RPC to avoid almost all simultaneous
> > loading of all nested stacks.
> >
> > 2016-11-08: 0.76GiB -> 0.70GiB
> >  - This one is a bit of a mystery???
>
> Possibly https://review.openstack.org/390064/ ? Reducing the
> environment size could have an effect.
>
> > 2016-11-22: 0.69GiB -> 0.50GiB
> >  - https://review.openstack.org/#/c/398476/ merged, improving the
> efficiency
> > of resource listing?
> >
> > 2016-12-01: 0.49GiB -> 0.88GiB
> >  - https://review.openstack.org/#/c/399619/ merged, returning the
> number of
> > engine workers on the undercloud to 4.
> >
> > It's not an exact science because IIUC there's a delay between a patch
> > merging in Heat and it being used in subsequent t-h-t gate jobs. e.g. the
> > change to getting outputs over RPC landed the day before the
> > instack-undercloud patch that cut the number of engine workers, but the
> > effects don't show up until 2 days after. I'd love to figure out what
> > happened on the 8th of November, but I can't correlate it to anything
> > obvious. The attribution of the change on the 22nd also seems dubious,
> but
> > the timing adds up (including on stable/newton).
> >
> > It's fair to say that none of the other patches we merged in an attempt
> to
> > reduce memory usage had any discernible effect :D
> >
> > It's worth reiterating that TripleO still disables convergence in the
> > undercloud, so these are all tests of the legacy code path. It would be
> > great if we could set up a non-voting job on t-h-t with convergence
> enabled
> > and start tracking memory use over time there too. As a first step,
> maybe we
> > could at least add an experimental job on Heat to give us a baseline?
>
> +1. We haven't made any huge changes into that direction, but having
> some info would be great.
>
> --
> Thomas
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

Re: [openstack-dev] [heat][tripleo] Heat memory usage in the TripleO gate during Ocata

2017-01-06 Thread Thomas Herve
On Fri, Jan 6, 2017 at 6:12 PM, Zane Bitter  wrote:
> tl;dr everything looks great, and memory usage has dropped by about 64%
> since the initial Newton release of Heat.
>
> I re-ran my analysis of Heat memory usage in the tripleo-heat-templates
> gate. (This is based on the gate-tripleo-ci-centos-7-ovb-nonha job.) Here's
> a pretty picture:
>
> https://fedorapeople.org/~zaneb/tripleo-memory/20170105/heat_memused.png
>
> There is one major caveat here: for the period marked in grey where it says
> "Only 2 engine workers", the job was configured to use only 2 heat-enginer
> worker processes instead of 4, so this is not an apples-to-apples
> comparison. The initial drop at the beginning and the subsequent bounce at
> the end are artifacts of this change. Note that the stable/newton branch is
> _still_ using only 2 engine workers.
>
> The rapidly increasing usage on the left is due to increases in the
> complexity of the templates during the Newton cycle. It's clear that if
> there has been any similar complexity growth during Ocata, it has had a tiny
> effect on memory consumption in comparison.

Thanks a lot for the analysis. It's great that things haven't gotten off track.

> I tracked down most of the step changes to identifiable patches:
>
> 2016-10-07: 2.44GiB -> 1.64GiB
>  - https://review.openstack.org/382068/ merged, making ResourceInfo classes
> more memory-efficient. Judging by the stable branch (where this and the
> following patch were merged at different times), this was responsible for
> dropping the memory usage from 2.44GiB -> 1.83GiB. (Which seems like a
> disproportionately large change?)

Without wanting to get the credit, I believe
https://review.openstack.org/377061/ is more likely the reason here.

>  - https://review.openstack.org/#/c/382377/ merged, so we no longer create
> multiple yaql contexts. (This was responsible for the drop from 1.83GiB ->
> 1.64GiB.)
>
> 2016-10-17: 1.62GiB -> 0.93GiB
>  - https://review.openstack.org/#/c/386696/ merged, reducing the number of
> engine workers on the undercloud to 2.
>
> 2016-10-19: 0.93GiB -> 0.73GiB (variance also seemed to drop after this)
>  - https://review.openstack.org/#/c/386247/ merged (on 2016-10-16), avoiding
> loading all nested stacks in a single process simultaneously much of the
> time.
>  - https://review.openstack.org/#/c/383839/ merged (on 2016-10-16),
> switching output calculations to RPC to avoid almost all simultaneous
> loading of all nested stacks.
>
> 2016-11-08: 0.76GiB -> 0.70GiB
>  - This one is a bit of a mystery???

Possibly https://review.openstack.org/390064/ ? Reducing the
environment size could have an effect.

> 2016-11-22: 0.69GiB -> 0.50GiB
>  - https://review.openstack.org/#/c/398476/ merged, improving the efficiency
> of resource listing?
>
> 2016-12-01: 0.49GiB -> 0.88GiB
>  - https://review.openstack.org/#/c/399619/ merged, returning the number of
> engine workers on the undercloud to 4.
>
> It's not an exact science because IIUC there's a delay between a patch
> merging in Heat and it being used in subsequent t-h-t gate jobs. e.g. the
> change to getting outputs over RPC landed the day before the
> instack-undercloud patch that cut the number of engine workers, but the
> effects don't show up until 2 days after. I'd love to figure out what
> happened on the 8th of November, but I can't correlate it to anything
> obvious. The attribution of the change on the 22nd also seems dubious, but
> the timing adds up (including on stable/newton).
>
> It's fair to say that none of the other patches we merged in an attempt to
> reduce memory usage had any discernible effect :D
>
> It's worth reiterating that TripleO still disables convergence in the
> undercloud, so these are all tests of the legacy code path. It would be
> great if we could set up a non-voting job on t-h-t with convergence enabled
> and start tracking memory use over time there too. As a first step, maybe we
> could at least add an experimental job on Heat to give us a baseline?

+1. We haven't made any huge changes into that direction, but having
some info would be great.

-- 
Thomas

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] Multi-Regions Support

2017-01-06 Thread Fox, Kevin M
I was kind of hoping k2k federation would solve that.

one keystone per region to provide a local keystone to talk to,
and a centeral keystone users authenticate with.

Just waiting for horizon to gain support before trying though.

Thanks,
Kevin

From: Jay Pipes [jaypi...@gmail.com]
Sent: Friday, January 06, 2017 12:42 PM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [kolla] Multi-Regions Support

On 01/06/2017 03:23 PM, Sam Yaple wrote:
> This should be read as MariaDB+Galera for replication. It is a
> highly-available database.

Don't get me wrong. I love me some Galera. :) However, what the poster
is really working towards is an implementation of the VCPE and eVCPE use
cases for ETSI NFV. These use cases require a highly distributed compute
fabric that can withstand long disruptions in network connectivity
(between POPs/COs and the last mile of network service) while still
being able to service compute and network functions at the customer premise.

Galera doesn't tolerate network disruption of any significant length of
time. At all. If there is a Keystone service running on the customer
premise that is connecting to a Galera database, and that Galera
database's connectivity to its peers is disrupted, down goes the whole
on-premise cloud fabric. And that's exactly what I believe the original
poster is attempting to avoid. Thus my not understanding the choice here.

Best,
-jay



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] Multi-Regions Support

2017-01-06 Thread Jay Pipes

On 01/06/2017 03:23 PM, Sam Yaple wrote:

This should be read as MariaDB+Galera for replication. It is a
highly-available database.


Don't get me wrong. I love me some Galera. :) However, what the poster 
is really working towards is an implementation of the VCPE and eVCPE use 
cases for ETSI NFV. These use cases require a highly distributed compute 
fabric that can withstand long disruptions in network connectivity 
(between POPs/COs and the last mile of network service) while still 
being able to service compute and network functions at the customer premise.


Galera doesn't tolerate network disruption of any significant length of 
time. At all. If there is a Keystone service running on the customer 
premise that is connecting to a Galera database, and that Galera 
database's connectivity to its peers is disrupted, down goes the whole 
on-premise cloud fabric. And that's exactly what I believe the original 
poster is attempting to avoid. Thus my not understanding the choice here.


Best,
-jay



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] Multi-Regions Support

2017-01-06 Thread Sam Yaple
On Fri, Jan 6, 2017 at 8:01 PM, Jay Pipes  wrote:

> On 01/05/2017 09:12 AM, Ronan-Alexandre Cherrueau wrote:
>
>> Hello,
>>
>> TL;DR: We made a multi-region deployment with Kolla. It requires patching
>> the code a little bit, and you can find the diff on our
>> GitHub[1]. This patch is just a first attempt to support multi-regions
>> in Kolla and it raises questions. Some modifications are not done in
>> an idiomatic way and we do not expect this to be merged in Kolla. The
>> remainder of this mail explains our patch and states our questions.
>>
>> At Inria/Discovery[2], we evaluate OpenStack at scale for the
>> Performance Working Group. So far, we focus on one single OpenStack
>> region deployment with hundreds of computes and we always go with
>> Kolla for our deployment. Over the last few days, we tried to achieve
>> a multi-regions OpenStack deployment with Kolla. We want to share with
>> you our current deployment workflow, patches we had to apply on Kolla
>> to support multi-regions, and also ask you if we do things correctly.
>>
>> First of all, our multi-regions deployment follows the one described
>> by the OpenStack documentation[3].
>>
>
> I don't see an "Admin Region" as part of the OpenStack documentation for
> multi-region deployment. I also see LDAP mentioned as the recommended
> authentication/IdM store.
>
> > Concretely, the deployment
>
>> considers /one/ Administrative Region (AR) that contains Keystone and
>> Horizon.
>>
>
> That's not a region. Those should be shared resources *across* regions.
>
> > This is a Kolla-based deployment, so Keystone is hidden
>
>> behind an HAProxy, and has MariaDB and memcached as backend.
>>
>
> I thought at Inria, the Nova "MySQL DB has been replaced by the noSQL
> system REDIS"? But here, you're using MariaDB -- a non-distributed database
> -- for the Keystone component which is the very thing that is the most
> highly distributed of all state storage in OpenStack.
>

This should be read as MariaDB+Galera for replication. It is a
highly-available database.


>
> So, you are replacing the Nova DB (which doesn't need to be distributed at
> all, since it's a centralized control plane piece) within the regions with
> a "distributed" NoSQL store (and throwing away transactional safety I might
> add) but you're going with a non-distributed traditional RDBMS for the very
> piece that needs to be shared, distributed, and highly-available across
> OpenStack. I don't understand that.
>
> At the
>
>> same time, /n/ OpenStack Regions (OSR1, ..., OSRn) contain a full
>> OpenStack, except Keystone. We got something as follows at the end of
>> the deployment:
>>
>> Admin Region (AR):
>> - control:
>>   * Horizon
>>   * HAProxy
>>   * Keystone
>>   * MariaDB
>>   * memcached
>>
>
> Again, that's not a region. Those are merely shared services between
> regions.
>
>
> OpenStack Region x (OSRx):
>> - control:
>>   * HAProxy
>>   * nova-api/conductor/scheduler
>>   * neutron-server/l3/dhcp/...
>>   * glance-api/registry
>>   * MariaDB
>>   * RabbitMQ
>>
>> - compute1:
>>   * nova-compute
>>   * neutron-agent
>>
>> - compute2: ...
>>
>> We do the deployment by running Kolla n+1 times. The first run deploys
>> the Administrative Region (AR) and the other runs deploy OpenStack
>> Regions (OSR). For each run, we fix the value of `openstack_region_name'
>> variable to the name of the current region.
>>
>> In the context of multi-regions, Keystone (in the AR) should be
>> available to all OSRs. This means, there are as many Keystone
>> endpoints as regions. For instance, if we consider two OSRs, the
>> result of listing endpoints at the end of the AR deployment looks like
>> this:
>>
>>
>>  $ openstack endpoint list
>>
>>  | Region | Serv Name | Serv Type | Interface | URL                          |
>>  |--------+-----------+-----------+-----------+------------------------------|
>>  | AR     | keystone  | identity  | public    | http://10.24.63.248:5000/v3  |
>>  | AR     | keystone  | identity  | internal  | http://10.24.63.248:5000/v3  |
>>  | AR     | keystone  | identity  | admin     | http://10.24.63.248:35357/v3 |
>>  | OSR1   | keystone  | identity  | public    | http://10.24.63.248:5000/v3  |
>>  | OSR1   | keystone  | identity  | internal  | http://10.24.63.248:5000/v3  |
>>  | OSR1   | keystone  | identity  | admin     | http://10.24.63.248:35357/v3 |
>>  | OSR2   | keystone  | identity  | public    | http://10.24.63.248:5000/v3  |
>>  | OSR2   | keystone  | identity  | internal  | http://10.24.63.248:5000/v3  |
>>  | OSR2   | keystone  | identity  | admin     | http://10.24.63.248:35357/v3 |
>>
>
> There shouldn't be an AR region. If the Keystone authentication domain is
> indeed shared between OpenStack regions, then an administrative user should
> be able to hit any Keystone endpoint in any OpenStack region and add
> users/projects/roles, etc. to the shared Keystone data store (or if using
> LDAP, the admin should be able to add 

Re: [openstack-dev] [release] subscribe to the OpenStack release calendar

2017-01-06 Thread Doug Hellmann

> On Jan 6, 2017, at 1:14 PM, Julien Danjou  wrote:
> 
> On Fri, Jan 06 2017, Doug Hellmann wrote:
> 
> Hi Doug,
> 
>> The link for the Ocata schedule is
>> https://releases.openstack.org/ocata/schedule.ics
>> 
>> We will have a similar Pike calendar available as soon as the
>> schedule is finalized.
> 
> Thank you, this is great. One question: could it be possible to have
> only one ICS for all releases? Maybe having one per release plus an
> "all.ics"?
> 
> I'm lazy I don't want to track and add each calendar every 6 months. :-)
> 
> --
> Julien Danjou
> ;; Free Software hacker
> ;; https://julien.danjou.info

See https://review.openstack.org/417495



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] Multi-Regions Support

2017-01-06 Thread Jay Pipes

On 01/05/2017 09:12 AM, Ronan-Alexandre Cherrueau wrote:

Hello,

TL;DR: We made a multi-region deployment with Kolla. It requires patching
the code a little bit, and you can find the diff on our
GitHub[1]. This patch is just a first attempt to support multi-regions
in Kolla and it raises questions. Some modifications are not done in
an idiomatic way and we do not expect this to be merged in Kolla. The
remainder of this mail explains our patch and states our questions.

At Inria/Discovery[2], we evaluate OpenStack at scale for the
Performance Working Group. So far, we focus on one single OpenStack
region deployment with hundreds of computes and we always go with
Kolla for our deployment. Over the last few days, we tried to achieve
a multi-regions OpenStack deployment with Kolla. We want to share with
you our current deployment workflow, patches we had to apply on Kolla
to support multi-regions, and also ask you if we do things correctly.

First of all, our multi-regions deployment follows the one described
by the OpenStack documentation[3].


I don't see an "Admin Region" as part of the OpenStack documentation for 
multi-region deployment. I also see LDAP mentioned as the recommended 
authentication/IdM store.


> Concretely, the deployment

considers /one/ Administrative Region (AR) that contains Keystone and
Horizon.


That's not a region. Those should be shared resources *across* regions.

> This is a Kolla-based deployment, so Keystone is hidden

behind an HAProxy, and has MariaDB and memcached as backend.


I thought at Inria, the Nova "MySQL DB has been replaced by the noSQL 
system REDIS"? But here, you're using MariaDB -- a non-distributed 
database -- for the Keystone component which is the very thing that is 
the most highly distributed of all state storage in OpenStack.


So, you are replacing the Nova DB (which doesn't need to be distributed 
at all, since it's a centralized control plane piece) within the regions 
with a "distributed" NoSQL store (and throwing away transactional safety 
I might add) but you're going with a non-distributed traditional RDBMS 
for the very piece that needs to be shared, distributed, and 
highly-available across OpenStack. I don't understand that.


> At the

same time, /n/ OpenStack Regions (OSR1, ..., OSRn) contain a full
OpenStack, except Keystone. We got something as follows at the end of
the deployment:

Admin Region (AR):
- control:
  * Horizon
  * HAProxy
  * Keystone
  * MariaDB
  * memcached


Again, that's not a region. Those are merely shared services between 
regions.



OpenStack Region x (OSRx):
- control:
  * HAProxy
  * nova-api/conductor/scheduler
  * neutron-server/l3/dhcp/...
  * glance-api/registry
  * MariaDB
  * RabbitMQ

- compute1:
  * nova-compute
  * neutron-agent

- compute2: ...

We do the deployment by running Kolla n+1 times. The first run deploys
the Administrative Region (AR) and the other runs deploy OpenStack
Regions (OSR). For each run, we fix the value of `openstack_region_name'
variable to the name of the current region.

In the context of multi-regions, Keystone (in the AR) should be
available to all OSRs. This means, there are as many Keystone
endpoints as regions. For instance, if we consider two OSRs, the
result of listing endpoints at the end of the AR deployment looks like
this:


 $ openstack endpoint list

 | Region | Serv Name | Serv Type | Interface | URL                          |
 |--------+-----------+-----------+-----------+------------------------------|
 | AR     | keystone  | identity  | public    | http://10.24.63.248:5000/v3  |
 | AR     | keystone  | identity  | internal  | http://10.24.63.248:5000/v3  |
 | AR     | keystone  | identity  | admin     | http://10.24.63.248:35357/v3 |
 | OSR1   | keystone  | identity  | public    | http://10.24.63.248:5000/v3  |
 | OSR1   | keystone  | identity  | internal  | http://10.24.63.248:5000/v3  |
 | OSR1   | keystone  | identity  | admin     | http://10.24.63.248:35357/v3 |
 | OSR2   | keystone  | identity  | public    | http://10.24.63.248:5000/v3  |
 | OSR2   | keystone  | identity  | internal  | http://10.24.63.248:5000/v3  |
 | OSR2   | keystone  | identity  | admin     | http://10.24.63.248:35357/v3 |


There shouldn't be an AR region. If the Keystone authentication domain 
is indeed shared between OpenStack regions, then an administrative user 
should be able to hit any Keystone endpoint in any OpenStack region and 
add users/projects/roles, etc. to the shared Keystone data store (or if 
using LDAP, the admin should be able to add a user to 
ActiveDirectory/ApacheDS in any OpenStack region and have that user 
information immediately show up in any of the other regions).


Best,
-jay



This requires patching the `keystone/tasks/register.yml' play[4] to
re-execute the `Creating admin project, user, role, service, and
endpoint' task for all regions we consider. An example of such a patch
is given on our GitHub[5]. In this example, the `openstack_regions'
variable 

Re: [openstack-dev] [nova] Let's kill quota classes (again)

2017-01-06 Thread melanie witt

On Fri, 6 Jan 2017 12:15:34 -0500, William M Edmonds wrote:

Why would someone need to change the defaults via REST API calls? I
agree that we should plan for that now if we think that will eventually
be needed, but I'm not seeing why it would be needed.


The REST API already allows people to change quota defaults. There are 
also quota defaults config options. When a quota default is changed via 
the REST API, an entry for the default is made in the DB and is used 
from then on and the config option is never used again. If a quota 
default has never been changed via the REST API, it can be changed via 
the config option. Defaults are looked up in order: 1) DB, 2) config option.


This is confusing and we've been discussing getting rid of one of the 
methods for changing quota defaults. We're thinking to keep the REST API 
and ditch the config options, because the REST API (and thus DB) 
provides a central place for the defaults. It takes one call to the REST 
API to effect a quota default change everywhere. With the config 
options, if you wanted to change a default, you'd have to change the 
configs on all of your API hosts any time you did it, and synchronize that.
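
As a rough sketch of what that single call looks like against the existing
quota class sets API (the endpoint URL and token below are placeholders;
adjust them for your deployment):

import requests

NOVA = 'http://controller:8774/v2.1'      # hypothetical nova API endpoint
HEADERS = {'X-Auth-Token': '<admin-token>',
           'Content-Type': 'application/json'}

# Update the 'default' quota class: every project without a per-project
# override picks this up, with no config edits on any API host.
body = {'quota_class_set': {'instances': 20, 'cores': 40}}
resp = requests.put(NOVA + '/os-quota-class-sets/default',
                    json=body, headers=HEADERS)
resp.raise_for_status()
print(resp.json())    # the updated quota class set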


-melanie

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [api-wg] restarting service-types-authority / service catalog work

2017-01-06 Thread Sean Dague
It's a new year, and in anticipation of having a productive time in
Atlanta around the service catalog, I refreshed the set of patches from
last year around starting to stub out a service-types-authority -
https://review.openstack.org/#/c/286089/

It proposes 4 base types that are pretty non-controversial. Two other
base IaaS services aren't in that list yet because of 2 issues.

Neutron / network - there is a discrepancy between the common use of
'network' as the service type, and 'networking' in the api ref url. Is
there a reason for that difference? In the other services in the list
the service-type and the url name for the api-ref are the same, which
leads to less user confusion about which is the correct term.

Cinder / volume - right now the devstack example of using cinder is to
use volumev2 as the service type (volumev3 is also added), which is kind
of ugly and not a thing that I'd like us to standardize on. How do we
move forward here to not make volumev2 a required type?

Comments welcome in getting us rolling again.

-Sean

-- 
Sean Dague
http://dague.net

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [ironic] Stepping down as PTL after this cycle

2017-01-06 Thread Jim Rollenhagen
Hi friends,

I'll be stepping down as PTL after Ocata is done. I'll still be 100%
dedicated to the project, but I'd like to be able to focus more on writing
code and reinforcing the bridges between us and other projects (Nova,
Neutron, etc).

I'd love to see two or more people step up and run next cycle in a real
election. I'm happy to chat with anyone interested in becoming the Pike
PTL, if you have questions or need a candidacy review or whatever.

It's been an amazing experience leading this project. You're all awesome,
and I'm grateful you chose me to do so. I hope the following cycles are
even better. :)

// jim
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [ironic] PTG planning

2017-01-06 Thread Jim Rollenhagen
Hey folks,

The PTG is about 6 weeks from now, so I figure we should start planning. As
far as how we plan this, I'm trying to treat it similarly to the summit. The
only difference is that we get a room for three days straight instead of
40-minute blocks, so we can be flexible with our time.

Anyway, here's the planning etherpad:
https://etherpad.openstack.org/p/ironic-pike-ptg

Please add the topics you wish to talk about there *with your name and a
description and/or link*. I've added a bunch of stuff that we have in the
pipeline/backlog to kick it off.

Thanks!

// jim
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Vitrage] About introduce suspect state of a alarm

2017-01-06 Thread Afek, Ifat (Nokia - IL)
Hi YinLiYin,

I’m not sure I understood the use case. Are you using a monitor that raises 
alarms A, B, C, D, and E? or does the monitor raise only alarms A and C, and 
the other alarms are deduced alarms created by Vitrage?

In Vitrage templates you can determine causal relationship between different 
alarms. No matter who generated the alarm (Vitrage, Nagios, Zabbix, Aodh…), you 
can model that A is a cause for B etc. You can also model that (B or D) is a 
cause of E.

If you see in Vitrage UI that alarms A, B and E are triggered, you can open the 
Root Cause Analysis view for one of them and see the A->B->E chain. In this use 
case, C and D are not triggered at all, and you will not see them in the UI.

In what use case would you ‘suspect’ A or C? Can you describe it in more detail?

Best Regards,
Ifat.


From: "yinli...@zte.com.cn" 
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 

Date: Friday, 6 January 2017 at 03:15
To: "openstack-dev@lists.openstack.org" 
Cc: "gong.yah...@zte.com.cn" , "han.jin...@zte.com.cn" 
, "wang.we...@zte.com.cn" , 
"jia.peiy...@zte.com.cn" , "zhang.yuj...@zte.com.cn" 

Subject: [openstack-dev] [Vitrage] About introduce suspect state of a alarm


Hi all,

I have a question that came up while learning vitrage:

A, B, C, D, E are alarms, consider the following alarm propagation model:

A --> B

C --> D

B or D --> E



When alarm E is reported by the system, from the above model we know that 
one or both of the following conditions may be true:

   1. A is triggered

   2. C is triggered

That is, alarms A and C are suspected to be triggered. The suspect state 
of an alarm is valuable for the system administrator, because one could 
check the system to find out whether the suspected alarms are really 
triggered according to this information.

In the current vitrage template, we could not describe this situation. An 
alarm has only two states: triggered or not triggered.

Could we introduce a suspect state for alarms?
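
Purely as an illustration of the idea (this is not Vitrage code), here is a small
Python sketch of how 'suspect' root causes could be computed by walking the causal
model backwards from the reported alarms; the alarm names and graph are just the
A/B/C/D/E example above:

    # Illustrative sketch only, not Vitrage code: walk the causal model
    # backwards from the reported alarms and mark every untriggered root
    # cause as "suspect".

    # causes[X] = alarms that can cause X (the model above)
    causes = {
        "B": ["A"],
        "D": ["C"],
        "E": ["B", "D"],   # B or D --> E
    }

    def suspect_roots(triggered):
        """Untriggered root-cause alarms that could explain the triggered ones."""
        suspect, seen, stack = set(), set(), list(triggered)
        while stack:
            alarm = stack.pop()
            if alarm in seen:
                continue
            seen.add(alarm)
            for cause in causes.get(alarm, []):
                stack.append(cause)
                # a root cause has no further causes in the model
                if cause not in triggered and not causes.get(cause):
                    suspect.add(cause)
        return suspect

    print(suspect_roots({"E"}))            # {'A', 'C'}
    print(suspect_roots({"A", "B", "E"}))  # {'C'}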





殷力殷 YinLiYin
上海市浦东新区碧波路889号中兴研发大楼D502
D502, ZTE Corporation R&D Center, 889# Bibo Road,
Zhangjiang Hi-tech Park, Shanghai, P.R.China, 201203
T: +86 21 68896229
M: +86 13641895907
E: yinli...@zte.com.cn
www.zte.com.cn


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Vitrage] About alarms reported by datasource and the alarms generated by vitrage evaluator

2017-01-06 Thread Afek, Ifat (Nokia - IL)
Hi YinLiYin,

This is an interesting question. Let me divide my answer to two parts.

First, the case that you described with Nagios and Vitrage. This problem 
depends on the specific Nagios tests that you configure in your system, as well 
as on the Vitrage templates that you use. For example, you can use 
Nagios/Zabbix to monitor the physical layer, and Vitrage to raise deduced 
alarms on the virtual and application layers. This way you will never have 
duplicated alarms. If you want to use Nagios to monitor the other layers as 
well, you can simply modify Vitrage templates so they don’t raise the deduced 
alarms that Nagios may generate, and use the templates to show RCA between 
different Nagios alarms.

Now let’s talk about the more general case. Vitrage can receive alarms from 
different monitors, including Nagios, Zabbix, collectd and Aodh. If you are 
using more than one monitor, it is possible that the same alarm (maybe with a 
different name) will be raised twice. We need to create a mechanism to identify 
such cases and create a single alarm with the properties of both monitors. This 
has not been designed in details yet, so if you have any suggestion we will be 
happy to hear them.
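
One possible shape for such a mechanism, sketched in Python purely as an
illustration (field names, the alias table and the merge policy are all
hypothetical, not existing Vitrage code): build a normalised key from the
affected resource and a canonical alarm name, and merge any report that
collides with an existing vertex instead of adding a second one.

    # Hypothetical sketch: deduplicating the same alarm reported by different
    # monitors. Field names and the alias table are illustrative only.
    ALIASES = {
        "load_high": "high_cpu_load",      # e.g. a Nagios check name
        "cpu.util.high": "high_cpu_load",  # e.g. a Zabbix trigger name
    }

    def dedup_key(alarm):
        """Normalise (resource, alarm name) so equivalent alarms collide."""
        name = ALIASES.get(alarm["name"], alarm["name"])
        return (alarm["resource_id"], name)

    graph = {}

    def add_alarm(alarm):
        key = dedup_key(alarm)
        if key in graph:
            # a single vertex keeps the properties of both monitors
            graph[key]["reported_by"].add(alarm["monitor"])
            graph[key].setdefault("properties", {}).update(alarm.get("properties", {}))
        else:
            alarm["reported_by"] = {alarm["monitor"]}
            graph[key] = alarm

    add_alarm({"monitor": "vitrage", "name": "high_cpu_load", "resource_id": "host-1"})
    add_alarm({"monitor": "nagios", "name": "load_high", "resource_id": "host-1"})
    print(len(graph))  # 1: both reports end up on a single vertex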

Best Regards,
Ifat.


From: "yinli...@zte.com.cn" 
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 

Date: Friday, 6 January 2017 at 03:27
To: "openstack-dev@lists.openstack.org" 
Cc: "gong.yah...@zte.com.cn" , "han.jin...@zte.com.cn" 
, "wang.we...@zte.com.cn" , 
"jia.peiy...@zte.com.cn" , "zhang.yuj...@zte.com.cn" 

Subject: [openstack-dev] [Vitrage] About alarms reported by datasource and the 
alarms generated by vitrage evaluator


Hi all,

   Vitrage generates alarms according to the templates. All the alarms raised by 
vitrage have the type "vitrage". Suppose Nagios has an alarm A. If alarm A is 
raised by the vitrage evaluator according to the action part of a scenario, the 
type of alarm A is "vitrage". If Nagios reports alarm A later, a new alarm A with 
type "Nagios" would be generated in the entity graph. There would be two 
vertices for the same alarm in the graph. And we have to define two alarm 
entities, two relationships, and two scenarios in the template file to make the 
alarm propagation procedure work.

   It is inconvenient to describe the fault model of a system with a lot of 
alarms. How to solve this problem?



殷力殷 YinLiYin




上海市浦东新区碧波路889号中兴研发大楼D502
D502, ZTE Corporation R&D Center, 889# Bibo Road,
Zhangjiang Hi-tech Park, Shanghai, P.R.China, 201203
T: +86 21 68896229
M: +86 13641895907
E: yinli...@zte.com.cn
www.zte.com.cn



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [release] subscribe to the OpenStack release calendar

2017-01-06 Thread Julien Danjou
On Fri, Jan 06 2017, Doug Hellmann wrote:

Hi Doug,

> The link for the Ocata schedule is
> https://releases.openstack.org/ocata/schedule.ics
>
> We will have a similar Pike calendar available as soon as the
> schedule is finalized.

Thank you, this is great. One question: could it be possible to have
only one ICS for all releases? Maybe having one per release plus a
"all.ics"?

I'm lazy; I don't want to track and add each calendar every 6 months. :-)

-- 
Julien Danjou
;; Free Software hacker
;; https://julien.danjou.info


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [kolla] helm-repository container

2017-01-06 Thread Serguei Bezverkhi (sbezverk)
Hello team,

While researching an operator (a piece of code which runs as a container and 
launches microservices in the correct order to bring up a specific service), I 
came across the need to have helm charts/packages stored and served from a 
centralized location. Since I could not find a ready-to-go solution, I am 
proposing to introduce an internal helm-repository container along with a 
helm-repository service and persistent storage.
It should behave similarly to a Docker private registry.  In this case the 
operator can rely on the service object to locate the helm-repository and fetch 
the microservice packages it requires to bring up its service.

Here is a general overview of bringing up the helm-repository (a rough sketch 
of the init logic follows below):

1. The init container of the helm-repository POD is responsible for initializing 
and populating persistent storage with charts/packages.
 1.1 The init container retrieves the charts/packages bits via one of the 
following (new/better ideas are welcome); the method could be passed as a parameter:
 1.1.1 using git clone of kolla-kubernetes
 1.1.2 using a tar file stored on some internal web server
 1.1.3 from a configmap where the tar file is attached
 1.2 The init container generates the required package index information
2. The main container starts serving packages to operators or other entities.

Here are the reasons for using persistent storage:
1. In case the container gets restarted/killed, when it comes back up it has to 
use exactly the same set of packages as before to preserve consistency. It will 
go through the init process only if the persistent storage is empty.
2. It gives us the possibility in the future to update the repo with a new 
version of the packages without replaying all past updates from the original 
version baked into the image.
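
A rough Python sketch of the init-container flow described in the overview above;
the storage path, source repository and the use of the stock `helm package' /
`helm repo index' commands are assumptions for illustration, not an agreed design:

    # Illustrative init-container sketch for the proposed helm-repository POD.
    # The storage path, source repo and helm commands are assumptions, not an
    # agreed design.
    import os
    import subprocess

    CHARTS_DIR = "/var/lib/helm-repo"  # persistent volume mount (placeholder)
    SOURCE_REPO = "https://git.openstack.org/openstack/kolla-kubernetes"

    def storage_is_empty(path):
        return not os.path.isdir(path) or not os.listdir(path)

    def populate(path):
        """Fetch chart sources, package them and build a servable index."""
        checkout = os.path.join(path, "src")
        subprocess.check_call(["git", "clone", SOURCE_REPO, checkout])
        for root, _dirs, files in os.walk(checkout):
            if "Chart.yaml" in files:
                subprocess.check_call(["helm", "package", root, "-d", path])
        # index.yaml is what the main container will serve to operators
        subprocess.check_call(["helm", "repo", "index", path])

    if __name__ == "__main__":
        if storage_is_empty(CHARTS_DIR):
            os.makedirs(CHARTS_DIR, exist_ok=True)
            populate(CHARTS_DIR)
        # else: reuse the existing packages to preserve consistency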

Appreciate your comments, suggestions and critique ;-)

Thank you

Serguei

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [release] subscribe to the OpenStack release calendar

2017-01-06 Thread Doug Hellmann
The release team has made it possible to subscribe to an ICS version
of the Ocata release schedule. This means you can have the full
schedule of countdown weeks and various cross-project deadlines
visible in your normal calendaring application, and receive updates
automatically.

The link for the Ocata schedule is
https://releases.openstack.org/ocata/schedule.ics

We will have a similar Pike calendar available as soon as the
schedule is finalized.
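
If you would rather script against the schedule than subscribe in a calendar
client, something like the following stdlib-only sketch works; it only handles
the basic SUMMARY/DTSTART fields rather than being a full iCalendar parser:

    # Minimal sketch: print the deadlines from the published schedule.ics.
    # Not a full iCalendar parser (no line unfolding, no time zones).
    from urllib.request import urlopen

    ICS_URL = "https://releases.openstack.org/ocata/schedule.ics"

    def events(url):
        summary = start = None
        for raw in urlopen(url).read().decode("utf-8").splitlines():
            line = raw.strip()
            if line == "BEGIN:VEVENT":
                summary = start = None
            elif line.startswith("SUMMARY"):
                summary = line.split(":", 1)[1]
            elif line.startswith("DTSTART"):
                start = line.split(":", 1)[1]
            elif line == "END:VEVENT" and summary and start:
                yield start, summary

    for start, summary in sorted(events(ICS_URL)):
        print(start, summary)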

Doug

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] Using neutron lib plugin constants

2017-01-06 Thread Ihar Hrachyshka
On Tue, Dec 27, 2016 at 12:03 AM, Gary Kotton  wrote:

> The following decomposed plugins are not gating:
>
> -  openstack/networking-brocade
> 
> (please see https://review.openstack.org/415028)
>
> -  openstack/networking-huawei
> 
> (please see https://review.openstack.org/415029)
>
> -  openstack/astara-neutron
> 
> (please see https://review.openstack.org/405388)
>
> Can the relevant maintainers of those projects please address things if
> possible.
>
>
>
> The openstack/networking-cisco
> 
> does not consume the correct neutron-lib version. So can the maintainers
> also please take a look at that.
>
> Happy holidays
>
> Gary
>
>
>
> *From: *Gary Kotton 
> *Reply-To: *OpenStack List 
> *Date: *Monday, December 26, 2016 at 3:37 PM
> *To: *OpenStack List 
> *Subject: *[openstack-dev] [Neutron] Using neutron lib plugin constants
>
>
>
> Hi,
>
> Please note the following two patches:
>
> 1.  https://review.openstack.org/414902 - use CORE from neutron lib
>
> 2.  https://review.openstack.org/394164 - use L3 (previous known as
> L3_ROUTER_NAT)
>
> Please note that the above will be removed from neutron/plugins/common
> 
> /*constants.py* and neutron-lib will be used.
>
>
>
> For the core change I have posted:
>
> 1.  VPNaaS - https://review.openstack.org/#/c/414915/
>
>
>

I approved the VPNaaS patch. I think we can land CORE patch for neutron too
now (+2d).


> For the L3 change things are a little more complicated (
> http://codesearch.openstack.org/?q=L3_ROUTER_NAT=nope==):
>
> 1.  networking-cisco – https://review.openstack.org/414977
>
> 2.  group-based-policy – https://review.openstack.org/414976
>
> 3.  big switch - https://review.openstack.org/414956
>
> 4.  brocade - https://review.openstack.org/414960
>
> 5.  dragonflow - https://review.openstack.org/414970
>
> 6.  networking-huawei - https://review.openstack.org/414971
>
> 7.  networking-odl = https://review.openstack.org/414972
>
> 8.  astara-neutron - https://review.openstack.org/414973
>
> 9.  networking-arista - https://review.openstack.org/414974
>
> 10.  networking-fortinet - https://review.openstack.org/414980
>
> 11.  networking-midonet - https://review.openstack.org/414981
>
> 12.  networking-nec - https://review.openstack.org/414982
>
> 13.  networking-onos - https://review.openstack.org/414983
>
>
>
I note a lot of those repos do weird things with tox/requirements, pulling
mitaka neutron into master and similar. I don't think we want to wait for
them to fix their gate strategy. I am fine landing the neutron patch any
time, since projects had a working week to review the relevant patches, and
we merged all stadium fixes already. I put WIP on the neutron patch for now
just to give another announcement during the next team meeting, and then we
will proceed landing it.

Ihar
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Let's kill quota classes (again)

2017-01-06 Thread William M Edmonds

On 12/16/2016 05:03 PM, Jay Pipes wrote:
> On 12/16/2016 04:36 PM, Matt Riedemann wrote:
> > On 12/16/2016 2:20 PM, Jay Pipes wrote:
> >>
> >> For problems with placing data like this as configuration options, see
> >> the hassle we went through in making the allocation_ratio options into
> >> fields stored in the DB...
> >>
> >> Better long-term to have all this kind of configuration live in a data
> >> store (not a config file) and be exposed via an HTTP API.
> >>
> >
> > So, we could do that, we already have the quota_classes table and the
> > os-quota-class-sets REST API, and as mentioned the only usable thing
> > that goes in there is overriding global default quota.
> >
> > Would you suggest we drop the global quota limit configuration options
> > and simply populate the quota_classes table with a 'default' quota
> > class
> > and the limits from the config in a DB migration, then drop the config
> > options?
>
> Yeah, I think that's the best long-term strategerization.

Why would someone need to change the defaults via REST API calls? I agree
that we should plan for that now if we think that will eventually be
needed, but I'm not seeing why it would be needed. And if we're not sure
this is needed now, we could still always do this later... at which point
we could do it a lot better than the current implementation being able to
start from scratch. Perhaps as part of the implementation of a new API that
allows you to change *any* mutable config option?
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [heat][tripleo] Heat memory usage in the TripleO gate during Ocata

2017-01-06 Thread Zane Bitter
tl;dr everything looks great, and memory usage has dropped by about 64% 
since the initial Newton release of Heat.


I re-ran my analysis of Heat memory usage in the tripleo-heat-templates 
gate. (This is based on the gate-tripleo-ci-centos-7-ovb-nonha job.) 
Here's a pretty picture:


https://fedorapeople.org/~zaneb/tripleo-memory/20170105/heat_memused.png

There is one major caveat here: for the period marked in grey where it 
says "Only 2 engine workers", the job was configured to use only 2 
heat-engine worker processes instead of 4, so this is not an 
apples-to-apples comparison. The initial drop at the beginning and the 
subsequent bounce at the end are artifacts of this change. Note that the 
stable/newton branch is _still_ using only 2 engine workers.


The rapidly increasing usage on the left is due to increases in the 
complexity of the templates during the Newton cycle. It's clear that if 
there has been any similar complexity growth during Ocata, it has had a 
tiny effect on memory consumption in comparison.


I tracked down most of the step changes to identifiable patches:

2016-10-07: 2.44GiB -> 1.64GiB
 - https://review.openstack.org/382068/ merged, making ResourceInfo 
classes more memory-efficient. Judging by the stable branch (where this 
and the following patch were merged at different times), this was 
responsible for dropping the memory usage from 2.44GiB -> 1.83GiB. 
(Which seems like a disproportionately large change?)
 - https://review.openstack.org/#/c/382377/ merged, so we no longer 
create multiple yaql contexts. (This was responsible for the drop from 
1.83GiB -> 1.64GiB.)


2016-10-17: 1.62GiB -> 0.93GiB
 - https://review.openstack.org/#/c/386696/ merged, reducing the number 
of engine workers on the undercloud to 2.


2016-10-19: 0.93GiB -> 0.73GiB (variance also seemed to drop after this)
 - https://review.openstack.org/#/c/386247/ merged (on 2016-10-16), 
avoiding loading all nested stacks in a single process simultaneously 
much of the time.
 - https://review.openstack.org/#/c/383839/ merged (on 2016-10-16), 
switching output calculations to RPC to avoid almost all simultaneous 
loading of all nested stacks.


2016-11-08: 0.76GiB -> 0.70GiB
 - This one is a bit of a mystery???

2016-11-22: 0.69GiB -> 0.50GiB
 - https://review.openstack.org/#/c/398476/ merged, improving the 
efficiency of resource listing?


2016-12-01: 0.49GiB -> 0.88GiB
 - https://review.openstack.org/#/c/399619/ merged, returning the 
number of engine workers on the undercloud to 4.
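
As a quick sanity check of the headline number, comparing the Newton-release
peak with the latest 4-worker figure above:

    # Rough check of the headline figure: Newton-release peak vs. the latest
    # 4-worker measurement from the graph above.
    newton_peak_gib = 2.44
    current_gib = 0.88
    drop = (newton_peak_gib - current_gib) / newton_peak_gib
    print("memory usage down ~{:.0%}".format(drop))  # ~64%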


It's not an exact science because IIUC there's a delay between a patch 
merging in Heat and it being used in subsequent t-h-t gate jobs. e.g. 
the change to getting outputs over RPC landed the day before the 
instack-undercloud patch that cut the number of engine workers, but the 
effects don't show up until 2 days after. I'd love to figure out what 
happened on the 8th of November, but I can't correlate it to anything 
obvious. The attribution of the change on the 22nd also seems dubious, 
but the timing adds up (including on stable/newton).


It's fair to say that none of the other patches we merged in an attempt 
to reduce memory usage had any discernible effect :D


It's worth reiterating that TripleO still disables convergence in the 
undercloud, so these are all tests of the legacy code path. It would be 
great if we could set up a non-voting job on t-h-t with convergence 
enabled and start tracking memory use over time there too. As a first 
step, maybe we could at least add an experimental job on Heat to give us 
a baseline?


The next big improvement to memory use is likely to come from 
https://review.openstack.org/#/c/407326/ or something like it (though I 
don't think we have a firm decision on whether we'd apply this to 
non-convergence stacks). Hopefully that will deliver a nice speed boost 
for convergence too.


cheers,
Zane.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][diskimage-builder] containers, Containers, CONTAINERS!

2017-01-06 Thread Paul Belanger
On Fri, Jan 06, 2017 at 09:48:31AM +0100, Andre Florath wrote:
> Hello Paul,
> 
> thank you very much for your contribution - it is very appreciated.
> 
> You addressed a topic with your patch set that was IMHO not in a wide
> focus: generating images for containers.  The ideas in the patches are
> good and should be implemented.
> 
> Nevertheless I'm missing the concept behind your patches. What I saw
> are a couple of (independent?) patches - and it looks that there is
> one 'big goal' - but I did not really get it.  My proposal is (as it
> is done for other bigger changes or introducing new concepts) that
> you write a spec for this first [1].  That would help other people
> (see e.g. Matthew) to use the same blueprint also for other
> distributions.
Sure, I can write a spec if needed but the TL;DR is:

Use diskimage-builder to build a debootstrap --variant=minbase chroot, and nothing
else. So I can then take the generated tarball and do something else with
it.

> One possibility would be to classify different element sets and define
> the dependency between them.  E.g. to have a element class 'container'
> which can be referenced by other classes, but is not able to reference
> these (e.g. VM or hardware specific things).
> 
> There are additional two major points:
> 
> * IMHO you addressed only some elements that need adaptations to be
>   able to be used in containers.  One element I stumbled over yesterday
>   is the base element: it is always included until you explicitly
>   exclude it.  This base element depends on a complete init-system -
>   which is for a container unneeded overhead. [2]

Correct, for this I simply pass the -n flag to disk-image-create. This removes
the need to include the base element. If we want to make a future optimization
to remove or keep it, I am okay with that. But the main goal for me is to include
the new ubuntu-rootfs element with as little disruption as possible.
> 
> * Your patches add a lot of complexity and code duplication.
>   This is not the way it should be (see [3], p 110, p 345).
The main reason this was done is, yes, there is some code duplication, but that
is because this is done in the root.d phase.  Moving this logic into another
phase then requires installing python into the chroot, and then dpkg,
dib-python, package-install, etc. This basically contaminates the pristine
debootstrap environment, something I am trying hard not to do. I figure 2 lines
to delete stale data is fine.  However, if there is an objection, we can remove
it.  Keep in mind, by deleting the cache we get the tarball size down to 42MB
(down from 79MB).

>   One reason is, that you do everything twice: once for Debian and
>   once for Ubuntu - and both in a (slightly) different way.
Yes, sadly the debian elements came along after the ubuntu-minimal elements,
with different people writing the code. For the most part, I've been trying to
condense the code path between the 2, but we are slowly getting there.

As you can see, the debian-rootfs element does now work correctly[6] based on
previous patches in the stack.

However, I don't believe this is the stack to make things better between the 2
flavors. We can use the existing ubuntu-minimal and debian-minimal elements and
iterate atop of that.  One of the next steps is to address how we handle the
sources.list file; between ubuntu and debian we do things differently.

[6] https://review.openstack.org/#/c/414765/

>   Please: factor out common code.
>   Please: improve code as you touch it.
> 
> And three minor:
> 
> * Release notes are missing (reno is your friend)
> 
Sure, I can add release notes.

> * Please do not introduce code which 'later on' can / should / will be
>   cleaned up.  Do it correct right from the beginning. [4]
> 
I can rebase code if needed.

> * It looks that this is a bigger patch set - so maybe we should
>   include it in v2?
> 
I'm not sure we need to wait for v2 (but I am biased).  I've recently revamped
our testing infra for diskimage-builder. We now build images, along with
launching them with nodepool and SSHing into them.

Side note, when is v2 landing?  I know there has been issues with tripleo.

> Kind regards
> 
> Andre
> 
> 
> [1] https://review.openstack.org/#/c/414728/
> [2] https://review.openstack.org/#/c/417310/
> [3] "Refactoring - Improving the Design of Existing Code", Martin
> Fowler, Addison Wesley, Boston, 2011
> [4] 
> https://review.openstack.org/#/c/414728/8/elements/debootstrap-minimal/root.d/99-clean-up-cache
> [5] https://review.openstack.org/#/c/413221/
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: 

Re: [openstack-dev] [nova] Let's kill quota classes (again)

2017-01-06 Thread Matt Riedemann

On 12/16/2016 4:00 PM, Jay Pipes wrote:


So, we could do that, we already have the quota_classes table and the
os-quota-class-sets REST API, and as mentioned the only usable thing
that goes in there is overriding global default quota.

Would you suggest we drop the global quota limit configuration options
and simply populate the quota_classes table with a 'default' quota class
and the limits from the config in a DB migration, then drop the config
options?


Yeah, I think that's the best long-term strategerization.

-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



OK, I've noted your point in the spec review, maybe we can discuss the 
options at the PTG while we're all in the same room. I think we have 
good options either way going forward so I'm happy.


--

Thanks,

Matt Riedemann


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo][ui] FYI, the tripleo-ui package is currently broken

2017-01-06 Thread Julie Pichon
Hi folks,

Just a heads-up that the DLRN "current"/dev package for the Tripleo UI
is broken in Ocata and will cause the UI to only show a blank page,
until we resolve some dependency issues within the -deps package.

If I understand correctly, we ended up with an incomplete package
because we were silently ignoring errors during builds [1] - many
thanks to Honza for the debugging work, and the patch!!

In the meantime, if you want to work with the UI package you should
get a version built before December 19th, e.g. [2], or you're probably
better off using the UI from source for the time being [3].

I'll update this thread when this is resolved.

Thanks,

Julie

[1] https://bugs.launchpad.net/tripleo/+bug/1654051
[2] 
https://trunk.rdoproject.org/centos7-master/04/15/0415ee80b5c8354124290ac933a34823f2567800_c211fbe8/openstack-tripleo-ui-2.0.0-0.20161212153814.2dfbb0b.el7.centos.noarch.rpm
[3] 
https://github.com/openstack/tripleo-ui/blob/master/README.md#install-tripleo-ui

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [acceleration]Team Bi-weekly Meeting 2017.1.4 Agenda

2017-01-06 Thread Harm Sluiman
One question regarding PTG,
Since we don't get a specific room allocated, and the intent is for people
not to float around meetings...
What day(s) are you expecting to have Cyborg-specific discussion?
It seems hotel booking will be at a premium soon.

On Wed, Jan 4, 2017 at 11:13 AM, Zhipeng Huang 
wrote:

> Hi Team,
>
> Thanks for a great discussion at today's meeting, please find the minutes
> at https://wiki.openstack.org/wiki/Cyborg/MeetingLogs#2017-01-04
>
> On Wed, Jan 4, 2017 at 10:40 PM, Miroslav Halas  wrote:
>
>> Howard and team,
>>
>>
>>
>> I have usually conflict at this time,  but I am trying to keep up with
>> meeting logs and etherpads J. Either Scott or I will be at PTG
>> representing Lenovo so we would be happy to participate.
>>
>>
>>
>> From the last meeting I have added a TODO to the Nasca etherpad to link the design
>> document and the code being discussed. I cannot seem to locate the original
>> files the Mellanox team shared with us. Would somebody who knows where these are
>> shared be able to insert the links to the etherpad
>> https://etherpad.openstack.org/p/cyborg-nasca-design
>>
>>
>>
>> Thank you,
>>
>>
>>
>> Miro Halas
>>
>>
>>
>> *From:* Harm Sluiman [mailto:harm.slui...@gmail.com]
>> *Sent:* Wednesday, January 04, 2017 9:22 AM
>> *To:* Zhipeng Huang
>> *Cc:* OpenStack Development Mailing List (not for usage questions);
>> Miroslav Halas; rodolfo.alonso.hernan...@intel.com; Michele Paolino;
>> Scott Kelso; Roman Dobosz; Jim Golden; pradeep.jagade...@huawei.com;
>> michael.ro...@nokia.com; jian-feng.d...@intel.com;
>> martial.mic...@nist.gov; Moshe Levi; Edan David; Francois Ozog; Fei K
>> Chen; jack...@huawei.com; li.l...@huawei.com
>> *Subject:* Re: [acceleration]Team Bi-weekly Meeting 2017.1.4 Agenda
>>
>>
>>
>> Happy New Year everyone.
>>
>> I won't be able participate in the IRC today due to a conflict, but I
>> will try to connect and monitor.
>>
>> I will also put more comments in the etherpads that are linked
>>
>>
>>
>>
>>
>> On Wed, Jan 4, 2017 at 6:28 AM, Zhipeng Huang 
>> wrote:
>>
>> Hi Team,
>>
>>
>>
>> Please find the agenda at https://wiki.openstack.org/
>> wiki/Meetings/CyborgTeamMeeting#Agenda_for_next_meeting
>>
>>
>>
>> our IRC channel is #openstack-cyborg
>>
>>
>>
>> --
>>
>> Zhipeng (Howard) Huang
>>
>>
>>
>> Standard Engineer
>>
>> IT Standard & Patent/IT Prooduct Line
>>
>> Huawei Technologies Co,. Ltd
>>
>> Email: huangzhip...@huawei.com
>>
>> Office: Huawei Industrial Base, Longgang, Shenzhen
>>
>>
>>
>> (Previous)
>>
>> Research Assistant
>>
>> Mobile Ad-Hoc Network Lab, Calit2
>>
>> University of California, Irvine
>>
>> Email: zhipe...@uci.edu
>>
>> Office: Calit2 Building Room 2402
>>
>>
>>
>> OpenStack, OPNFV, OpenDaylight, OpenCompute Aficionado
>>
>>
>>
>>
>>
>> --
>>
>> 宋慢
>> Harm Sluiman
>>
>>
>>
>>
>
>
> --
> Zhipeng (Howard) Huang
>
> Standard Engineer
> IT Standard & Patent/IT Prooduct Line
> Huawei Technologies Co,. Ltd
> Email: huangzhip...@huawei.com
> Office: Huawei Industrial Base, Longgang, Shenzhen
>
> (Previous)
> Research Assistant
> Mobile Ad-Hoc Network Lab, Calit2
> University of California, Irvine
> Email: zhipe...@uci.edu
> Office: Calit2 Building Room 2402
>
> OpenStack, OPNFV, OpenDaylight, OpenCompute Aficionado
>



-- 
宋慢
Harm Sluiman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [snaps] alternative distribution approach for OpenStack

2017-01-06 Thread James Page
Hi All

I’ve been working with a few folk over the last few months to see if snaps
(see [0]) might be a good alternative approach to packaging and
distribution of OpenStack.

As OpenStack projects are Python based, producing snaps has been relatively
trivial with the snapcraft python plugin, which supports all of the tooling
that would be used in a deployment from source (pip, constraints etc…).
Thanks goes to Sam Yaple for his work on the Python plugin to support all
of the required features for snapping OpenStack!

The resulting snap for each project is self contained, with all required
dependencies baked into the snap, rather than using system provided
dependencies from the hosting operating system.

This means that the snap is directly aligned with each OpenStack project
from a software component perspective - avoiding the juggling act that
distro’s have to do each cycle to ensure the entire dependency chain for
OpenStack is lined up correctly.

Additionally, we can also include other non-Python dependencies in a snap -
for example the nova-hypervisor snap includes dnsmasq, ipset and
openvswitch tools built from source as part of the snap build process.  I’d
envision extending that list to include libvirt (but that was a bit too
much to bite off in the first few cycles of work).

From an operations perspective, snaps are transactionally applied on
installation, which means that if an upgrade fails, the snap will be rolled
back to the last known good version.

Installs and upgrades are also fast, as the snap internally is a read only
squashfs filesystem which is simply mounted alongside the existing
installed snap, daemons are stopped, pointers switched and daemons
restarted.

Snaps typically run a confined environment, sandboxed using AppArmor and
Seccomp on Ubuntu. Snapd (the management daemon for snaps) provides a
number of interfaces to allow users to grant snaps permissions to perform
different operations on the host OS - for example network and firewall
control (the full interface list is much longer  - see [3]).

We’ve leveraged (and contributed to) a number of these interfaces to
support the nova-hypervisor snap.

Snap confinement means that snaps don’t (by default) have access to /etc -
instead configuration is supplied in a snap specific location on the
filesystem (take a look in the README of a snap for how that works at a
high level).  That location essentially mirrors /etc for the snap, which
should make adoption relatively easy for existing deployment tooling.

Snaps are also by design distro agnostic - so long as snapd has been ported
and the kernel version is sufficient to support the required security
features things should just work (but we’ve not tried that out just yet!).

We’ve snapped a few core components (see [1]) - enough to produce an
all-in-one install which you can try on Ubuntu 16.04 using snap-test [2] to
get a flavor of how things will look as this work develops further.


The source for each snap is being developed on OpenStack infra, however the
final build and publication to the snap store is being done on Launchpad
using git repo mirroring and automatic snap building on each change.   This
includes arm64 and ppc64el architecture builds.

Updates are only pushed to the edge channel on the snapstore today - we’ll
need to figure out a good channel strategy as things mature to include
great CI/CD as well as concurrent support for multiple OpenStack releases.

Anyway - that’s probably enough words for now!

We’re all hanging out in #openstack-snaps on Freenode IRC so come find us
if you have any questions!

Cheers

James

[0] http://snapcraft.io

[1] https://github.com/search?q=org%3Aopenstack+snap

[2] https://github.com/openstack-snaps/snap-test
[3] http://snapcraft.io/docs/reference/interfaces
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [kolla] Multi-Regions Support

2017-01-06 Thread Ronan-Alexandre Cherrueau
Hello,

Thanks for your feedback.

> The original value is this:
> auth_uri = {{ internal_protocol }}://{{ kolla_internal_fqdn }}:{{ 
> keystone_public_port }}
>
> Kolla does SSL via haproxy, not directly at the service itself. So internal 
> traffic is http, external is https. In this case, 'kolla_internal_fqdn' 
> points to the haproxy vip.

Got it, but if you look at the default definition of
`keystone_internal_url', inside `group_vars/all.yml', you have:


  keystone_internal_url: "{{ internal_protocol }}://{{ kolla_internal_fqdn }}:{{ keystone_public_port }}/v3"


As you can see, the definition is close to the original value of
`auth_uri' (except the v3). So, it seems to be a better idea to set
`auth_uri' to `keystone_internal_url' instead of the canonical form.

Especially in the context of multi-regions. In this context, you cannot
change the value of `{{ kolla_internal_fqdn }}' because, as you say, it
has to target the region's HAProxy vip. However, you have to set the
value of `auth_uri' to target the Administrative Region HAProxy vip. So,
by defining in nova/neutron/glance conf:


  [keystone_authtoken]
  auth_uri = {{ keystone_internal_url }}
  auth_url = {{ keystone_admin_url }}


Then you can target the Administrative Region HAProxy vip by redefining
`keystone_*_url' without redefining `kolla_internal_fqdn'.
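
To make that concrete, here is a small illustrative Jinja2 rendering (the vip
addresses are made up): the OSR keeps its own `kolla_internal_fqdn' for
everything else, and only the two `keystone_*_url' overrides point
[keystone_authtoken] at the Administrative Region.

    # Illustrative only: render the proposed [keystone_authtoken] snippet with
    # per-region variables. The vip addresses are made up.
    from jinja2 import Template

    SNIPPET = Template(
        "[keystone_authtoken]\n"
        "auth_uri = {{ keystone_internal_url }}\n"
        "auth_url = {{ keystone_admin_url }}\n"
    )

    osr1_vars = {
        "kolla_internal_fqdn": "10.24.63.10",  # OSR1's own HAProxy vip, untouched
        # overridden to point at the Administrative Region HAProxy vip
        "keystone_internal_url": "http://10.24.63.248:5000/v3",
        "keystone_admin_url": "http://10.24.63.248:35357/v3",
    }

    print(SNIPPET.render(**osr1_vars))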

> It's not legacy code, but I am sure it can be changed to work multiregion in 
> the way you are attempting to do it.

I was talking about legacy code because, in the stable/newton branch,
the `auth_uri' is already set to `{{ keystone_internal_url }}', but
solely for Kubernetes engine[1].

Shall we proceed by pushing to Gerrit with a dedicated page in the
documentation? Also, our patch is based on stable/newton, is it better
if we follow the kolla-ansible master?

[1] 
https://github.com/openstack/kolla/blob/11bfd9eb7518cc46ac441505a267f5cf974216ae/ansible/roles/nova/templates/nova.conf.j2#L151


On Thu, Jan 5, 2017 at 7:01 PM, Sam Yaple  wrote:
> On Thu, Jan 5, 2017 at 2:12 PM, Ronan-Alexandre Cherrueau
>  wrote:
>>
>> Hello,
>>
>>
>> TL;DR: We made a multi-region deployment with Kolla. It requires
>> patching the code a little bit, and you can find the diff on our
>> GitHub[1]. This patch is just a first attempt to support multi-regions
>> in Kolla and it raises questions. Some modifications are not done in
>> an idiomatic way and we do not expect this to be merged in Kolla. The
>> remainder of this mail explains our patch and states our questions.
>>
>>
>> At Inria/Discovery[2], we evaluate OpenStack at scale for the
>> Performance Working Group. So far, we focus on one single OpenStack
>> region deployment with hundreds of computes and we always go with
>> Kolla for our deployment. Over the last few days, we tried to achieve
>> a multi-regions OpenStack deployment with Kolla. We want to share with
>> you our current deployment workflow, patches we had to apply on Kolla
>> to support multi-regions, and also ask you if we do things correctly.
>>
>> First of all, our multi-regions deployment follows the one described
>> by the OpenStack documentation[3]. Concretely, the deployment
>> considers /one/ Administrative Region (AR) that contains Keystone and
>> Horizon. This is a Kolla-based deployment, so Keystone is hidden
>> behind an HAProxy, and has MariaDB and memcached as backend. At the
>> same time, /n/ OpenStack Regions (OSR1, ..., OSRn) contain a full
>> OpenStack, except Keystone. We got something as follows at the end of
>> the deployment:
>>
>>
>> Admin Region (AR):
>> - control:
>>   * Horizon
>>   * HAProxy
>>   * Keystone
>>   * MariaDB
>>   * memcached
>>
>> OpenStack Region x (OSRx):
>> - control:
>>   * HAProxy
>>   * nova-api/conductor/scheduler
>>   * neutron-server/l3/dhcp/...
>>   * glance-api/registry
>>   * MariaDB
>>   * RabbitMQ
>>
>> - compute1:
>>   * nova-compute
>>   * neutron-agent
>>
>> - compute2: ...
>>
>>
>> We do the deployment by running Kolla n+1 times. The first run deploys
>> the Administrative Region (AR) and the other runs deploy OpenStack
>> Regions (OSR). For each run, we fix the value of `openstack_region_name'
>> variable to the name of the current region.
>>
>> In the context of multi-regions, Keystone (in the AR) should be
>> available to all OSRs. This means, there are as many Keystone
>> endpoints as regions. For instance, if we consider two OSRs, the
>> result of listing endpoints at the end of the AR deployment looks like
>> this:
>>
>>
>>  $ openstack endpoint list
>>
>>  | Region | Serv Name | Serv Type | Interface | URL                          |
>>  |--------+-----------+-----------+-----------+------------------------------|
>>  | AR     | keystone  | identity  | public    | http://10.24.63.248:5000/v3  |
>>  | AR     | keystone  | identity  | internal  | http://10.24.63.248:5000/v3  |
>>  | AR     | keystone  | identity  | admin     | http://10.24.63.248:35357/v3 |
>>  | OSR1   | 

Re: [openstack-dev] [oslo][monasca] Can we uncap python-kafka ?

2017-01-06 Thread Mehdi Abaakouk

Any progress ?

On Thu, Dec 08, 2016 at 08:32:54AM +1100, Tony Breeds wrote:

On Mon, Dec 05, 2016 at 04:03:13AM +, Keen, Joe wrote:

I wasn’t able to set a test up on Friday and with all the other work I
have for the next few days I doubt I’ll be able to get to it much before
Wednesday.


It's Wednesday so can we have an update?

Yours Tony.


--
Mehdi Abaakouk
mail: sil...@sileht.net

irc: sileht

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] placement/resource providers update 7

2017-01-06 Thread Chris Dent


Welcome to the first placement/resource providers update for 2017.
There's a lot in progress, including plenty of work from new
contributors. Is great to see that.

# What Matters Most

The main priority remains the same: Getting the scheduler using a
filtered list of resource providers. That work is in progress near
here:

https://review.openstack.org/#/c/392569/

Falling out from that are a variety of bug fixes, and continued work
on aggregate handling and the resource tracker. There are links to
these things below.
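
For anyone who hasn't looked at that review, the rough shape of the request the
scheduler's report client would make once filtering lands is sketched below; the
`resources' query syntax is taken from the in-progress work and may still change,
and the endpoint, port and token handling are simplified placeholders.

    # Sketch: ask placement for resource providers able to satisfy a set of
    # resource requests. The "resources" query parameter follows the proposal
    # under review and is not final; endpoint and token are placeholders.
    import requests

    PLACEMENT = "http://controller:8778"   # placement endpoint (placeholder)
    TOKEN = "a-valid-keystone-token"       # keystone token (placeholder)

    resp = requests.get(
        PLACEMENT + "/resource_providers",
        params={"resources": "VCPU:1,MEMORY_MB:2048,DISK_GB:20"},
        headers={"x-auth-token": TOKEN,
                 # a microversion header will likely be needed once this lands
                 "openstack-api-version": "placement 1.4"},
    )
    resp.raise_for_status()
    for rp in resp.json()["resource_providers"]:
        print(rp["uuid"], rp["name"])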

# Stuff that's Different Now

Placement is running by default in devstack now, and thus running in
nova's gate as a matter of course. This unfortunately led to a brief
period where there was some collision with port handling with senlin,
but that was resolved.

The nova-status command now exists, and has support for indicating
whether the current deployment is ready to start using placement. That
was added (in part) by this:

https://review.openstack.org/#/c/411883/

# Stuff Needing a Plan

## Placement Client

We don't currently have an active plan for when we will implement a
placement client. Though we decided to not use a specific client in
the scheduler's report client (because simple requests+JSON works
just fine) we still have expectations that shared resource providers
will be managed via commands that can be run using the openstackclient.

Miguel Lavalle started work on a client
https://github.com/miguellavalle/python-placementclient/tree/adding-grp-support

## Placement Docs

The initial work for having placement-api-ref documentation has
started at

https://review.openstack.org/#/c/409340/

but sdague has appropriately pointed out that we're going to need gate
automation to draft and publish the results. Having two api-refs in
one repo complicates this a bit.

## can_host, aggregates in filtering

There's still some confusion (from at least me) on whether the
can_host field is relevant when making queries to filter resource
providers. Similarly, when requesting resource providers to satisfy a
set of resources, we don't (unless I've completely missed it) return
resource providers (as compute nodes) that are associated with other
resource providers (by aggregate) that can satisfy a resource
requirement. Feels like we need to work backwards from a test or use
case and see what's missing.

# Pending Planned Work

## Dev Documentation

Work has started on documentation for developers of the placement API.
The first part of that work is published at

http://docs.openstack.org/developer/nova/placement_dev.html

The rest of it is under review starting at

https://review.openstack.org/#/c/411946/

(I'm listing this near the top because the information there is coming
into the world later than it should have, and is proving useful for
people with questions on how to make changes.)

## Resource Tracker Cleanup and Use of Resource Classes For Ironic

The cleanup to the resource tracker so that it can be more effective
and efficient and work correctly with Ironic continues in the
custom-resource-classes topic:


https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/custom-resource-classes

## Handling Aggregates in Placement Server and Clients

There are a few bits of work in place to make sure aggregates can be
used in the placement service. This is important because it's the way
that resource A (a compute node) can use resource B (some shared disk)
but resource C (a compute node in a different rack) cannot.

This means that the client side needs to know which aggregates a
resource provider is associated with:

https://review.openstack.org/#/c/407309/

And the server side needs to be able to report those resource
providers which are members of any of a list of aggregates:

https://review.openstack.org/#/c/407742/

What we haven't got yet is paying attention to aggregates when
filtering available resource providers by resources. From last week:

As this work emerges, we'll need to make sure that both the client
and server sides are aware of aggregate associations as "the
resource providers that the placement service returns will either
have the resources requested or will be associated with aggregates
that have providers that match the requested resources."

Miguel's neutron work (below) found a missing feature in existing
aggregate handling: We need to know uuids!

https://bugs.launchpad.net/nova/+bug/1652642

Jay started a fix:

https://review.openstack.org/#/c/415031/

but it needs a spec:

https://review.openstack.org/#/c/415511/

# Stuff Happening Outside of Nova

* Puppet and TripleO doing Placement
  
https://review.openstack.org/#/q/topic:placement+status:open+owner:%22Emilien+Macchi+%253Cemilien%2540redhat.com%253E%22

* Neutron IPV4 Inventory
  https://review.openstack.org/#/c/358658/

# Bugs, Pick up Work, and Miscellaneous

* Allocation bad input handling and dead code fixing
  

Re: [openstack-dev] [ironic] User survey question

2017-01-06 Thread Dmitry Tantsur

On 01/04/2017 07:43 PM, Mario Villaplana wrote:

Hi,

Here are some questions I've thought of:

TESTING / INTERESTED

- What is your use case for ironic?
- What, if anything, is preventing you from using ironic in a
production environment?
- What alternatives to ironic have you considered?

USING

- What new features would you like to see added to ironic?
- What are some gaps in ironic's documentation you'd like to see fixed?
- What's been your most frustrating or difficult experience with ironic?


I wanted to suggest this ^^^ question as well, thanks Mario.


- Does anyone at your organization contribute to ironic software
development upstream? (may want to rephrase "upstream" in case they're
not familiar with the term)
- What parts of your ironic deployment do you monitor, and how do you
monitor it?

Thanks for asking the community for questions.

Mario




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI status - January 6th

2017-01-06 Thread Emilien Macchi
On Fri, Jan 6, 2017 at 6:57 AM, Emilien Macchi  wrote:
> I found it useful to share a status on what's going on in CI now.
>
> 1) We fixed ovb-updates job: https://review.openstack.org/#/c/416706/
> It should be stable again now. Please don't ignore it anymore (it was
> for a few days until last night).
>
> 2) multinode & ovb-nonha are green and pretty stable.
>
> 3) ovb-ha is having (afaik) 3 problems:
>
> - pingtest not able to finish, floating-ip is not reachable.
> It sounds like a problem in neutron, see the bug report.
> John Schwarz (jschwarz) from Neutron team is actively working on it.
> https://bugs.launchpad.net/tripleo/+bug/1654032
> Note, the rate of failure was pretty high before we temp-reverted the
> "broken" Neutron patch in TripleO CI:
> http://logstash.openstack.org/#dashboard/file/logstash.json?query=build_name%3A%20*tripleo-ci*%20AND%20build_status%3A%20FAILURE%20AND%20message%3A%20%5C%22From%2010.0.0.1%20icmp_seq%3D1%20Destination%20Host%20Unreachable%5C%22
> Since the Neutron patch has been reverted, we shouldn't hit this bug
> anymore in TripleO CI, which increases our chance to have successful
> ovb-ha CI jobs correctly passing.
>
> - pingtest not able to finish: 504 Gateway Time-out when creating a server.
> This one has no launchpad yet. I'll create it today and investigate.
> Note, the rate of failure is not that high:
> http://logstash.openstack.org/#dashboard/file/logstash.json?query=build_name%3A%20*tripleo-ci*%20AND%20build_status%3A%20FAILURE%20AND%20message%3A%20%5C%22504%20Gateway%20Time-out%5C%22

Sounds like something with nova API not able to reach MySQL server.
I reported the bug here:
https://bugs.launchpad.net/tripleo/+bug/1654545

Any help is welcome.

> - pingtest not able to finish: failing to upload an image in Glance
> (with Swift backend)
> https://bugs.launchpad.net/tripleo/+bug/1646750
> Right now, the rate of failure is too low to be able to reproduce it
> easily and have more debug. I'm working on it today, so we can ask
> Swift or Glance folks to have a look when they can.
>
>
> As a conclusion, CI should be pretty stable today:
> http://tripleo.org/cistatus.html
> Except for the ovb-ha job which might fail sometimes. Please be
> careful when merging a patch that is not passing ovb-ha. Please look
> at the logs and see if it hits one of the bugs mentioned here. We
> don't want to inject more regressions.
>
> Also, please do not try CI promotions until we have solved all blockers.
>
> Thanks,
> --
> Emilien Macchi



-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI status - January 6th

2017-01-06 Thread Emilien Macchi
I found it useful to share a status on what's going on in CI now.

1) We fixed ovb-updates job: https://review.openstack.org/#/c/416706/
It should be stable again now. Please don't ignore it anymore (it was
for a few days until last night).

2) multinode & ovb-nonha are green and pretty stable.

3) ovb-ha is having (afaik) 3 problems:

- pingtest not able to finish, floating-ip is not reachable.
It sounds like a problem in neutron, see the bug report.
John Schwarz (jschwarz) from Neutron team is actively working on it.
https://bugs.launchpad.net/tripleo/+bug/1654032
Note, the rate of failure was pretty high before we temp-reverted the
"broken" Neutron patch in TripleO CI:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=build_name%3A%20*tripleo-ci*%20AND%20build_status%3A%20FAILURE%20AND%20message%3A%20%5C%22From%2010.0.0.1%20icmp_seq%3D1%20Destination%20Host%20Unreachable%5C%22
Since the Neutron patch has been reverted, we shouldn't hit this bug
anymore in TripleO CI, which increases our chance to have successful
ovb-ha CI jobs correctly passing.

- pingtest not able to finish: 504 Gateway Time-out when creating a server.
This one has no launchpad yet. I'll create it today and investigate.
Note, the rate of failure is not that high:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=build_name%3A%20*tripleo-ci*%20AND%20build_status%3A%20FAILURE%20AND%20message%3A%20%5C%22504%20Gateway%20Time-out%5C%22

- pingtest not able to finish: failing to upload an image in Glance
(with Swift backend)
https://bugs.launchpad.net/tripleo/+bug/1646750
Right now, the rate of failure is too low to be able to reproduce it
easily and have more debug. I'm working on it today, so we can ask
Swift or Glance folks to have a look when they can.


As a conclusion, CI should be pretty stable today:
http://tripleo.org/cistatus.html
Except for the ovb-ha job which might fail sometimes. Please be
careful when merging a patch that is not passing ovb-ha. Please look
at the logs and see if it hits one of the bugs mentioned here. We
don't want to inject more regressions.

Also, please do not try CI promotions until we have solved all blockers.

Thanks,
-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][diskimage-builder] containers, Containers, CONTAINERS!

2017-01-06 Thread Andre Florath
Hello Paul,

thank you very much for your contribution - it is very appreciated.

You addressed a topic with your patch set that was IMHO not in a wide
focus: generating images for containers.  The ideas in the patches are
good and should be implemented.

Nevertheless I'm missing the concept behind your patches. What I saw
are a couple of (independent?) patches - and it looks like there is
one 'big goal' - but I did not really get it.  My proposal is (as it
is done for other bigger changes or introducing new concepts) that
you write a spec for this first [1].  That would help other people
(see e.g. Matthew) to use the same blueprint also for other
distributions.
One possibility would be to classify different element sets and define
the dependency between them.  E.g. to have an element class 'container'
which can be referenced by other classes, but is not able to reference
these (e.g. VM or hardware specific things).

There are additional two major points:

* IMHO you addressed only some elements that need adaptations to be
  able to be used in containers.  One element I stumbled over yesterday
  is the base element: it is always included until you explicitly
  exclude it.  This base element depends on a complete init-system -
  which is for a container unneeded overhead. [2]

* Your patches add a lot of complexity and code duplication.
  This is not the way it should be (see [3], p 110, p 345).
  One reason is, that you do everything twice: once for Debian and
  once for Ubuntu - and both in a (slightly) different way.
  Please: factor out common code.
  Please: improve code as you touch it.

And three minor:

* Release notes are missing (reno is your friend)

* Please do not introduce code which 'later on' can / should / will be
  cleaned up.  Do it correctly right from the beginning. [4]

* It looks like this is a bigger patch set - so maybe we should
  include it in v2?

Kind regards

Andre


[1] https://review.openstack.org/#/c/414728/
[2] https://review.openstack.org/#/c/417310/
[3] "Refactoring - Improving the Design of Existing Code", Martin
Fowler, Addison Wesley, Boston, 2011
[4] 
https://review.openstack.org/#/c/414728/8/elements/debootstrap-minimal/root.d/99-clean-up-cache
[5] https://review.openstack.org/#/c/413221/



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev