Re: [openstack-dev] [all] The future of the integrated release

2014-08-22 Thread Michael Chapman
On Fri, Aug 22, 2014 at 2:57 AM, Jay Pipes jaypi...@gmail.com wrote:

 On 08/19/2014 11:28 PM, Robert Collins wrote:

 On 20 August 2014 02:37, Jay Pipes jaypi...@gmail.com wrote:
 ...

  I'd like to see more unification of implementations in TripleO - but I
 still believe our basic principle of using OpenStack technologies that
 already exist in preference to third party ones is still sound, and
 offers substantial dogfood and virtuous circle benefits.



 No doubt Triple-O serves a valuable dogfood and virtuous cycle purpose.
 However, I would move that the Deployment Program should welcome the many
 projects currently in the stackforge/ code namespace that do deployment
 of
 OpenStack using traditional configuration management tools like Chef,
 Puppet, and Ansible. It cannot be argued that these configuration
 management
 systems are the de-facto way that OpenStack is deployed outside of HP,
 and
 they belong in the Deployment Program, IMO.


 I think you mean it 'can be argued'... ;).


 No, I definitely mean cannot be argued :) HP is the only company I know
 of that is deploying OpenStack using Triple-O. The vast majority of
 deployers I know of are deploying OpenStack using configuration management
 platforms and various systems or glue code for baremetal provisioning.

 Note that I am not saying that Triple-O is bad in any way! I'm only saying
 that it does not represent the way that the majority of real-world
 deployments are done.


  And I'd be happy if folk in
  those communities want to join in the deployment program and have code
  repositories in openstack/. To date, none have asked.


 My point in this thread has been and continues to be that by having the TC
 bless a certain project as "The OpenStack Way of X", we implicitly are
 saying to other valid alternatives, "Sorry, no need to apply here."


  As a TC member, I would welcome someone from the Chef community proposing
 the Chef cookbooks for inclusion in the Deployment program, to live under
 the openstack/ code namespace. Same for the Puppet modules.


 While you may personally welcome the Chef community to propose joining the
 Deployment Program and living under the openstack/ code namespace, I'm just
 saying that the impression our governance model and policies create is one
 of exclusion, not inclusion. Hope that better clarifies what I've been
 getting at.



(As one of the core reviewers for the Puppet modules)

Without a standardised package build process it's quite difficult to test
trunk Puppet modules against trunk of the official projects. This means we
cut release branches some time after the projects themselves, to give people
a chance to test. Until this changes and the modules can be released with
the same cadence as the integrated release, I believe they should remain on
Stackforge.

In addition, and perhaps as a consequence, there isn't any public
integration testing at this time for the modules, although I know some
parties have developed and maintain their own.

The Chef modules may be in a different state, but it's hard for me to
recommend the Puppet modules become part of an official program at this
stage.



 All the best,
 -jay




Re: [openstack-dev] [all] The future of the integrated release

2014-08-22 Thread Michael Chapman
On Fri, Aug 22, 2014 at 9:51 PM, Sean Dague s...@dague.net wrote:

 On 08/22/2014 01:30 AM, Michael Chapman wrote:

  [Jay and Michael's earlier exchange, quoted in full, snipped; see the
  previous message above.]

 Is the focus of the Puppet modules only stable releases with packages?



We try to target puppet module master at upstream OpenStack master, but
without CI/CD we fall behind. The missing piece is building packages and
creating a local repo before doing the puppet run, which I'm working on
slowly as I want a single system for both deb and rpm that doesn't make my
eyes bleed. fpm and pleaserun are the two key tools here.
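
To sketch the shape of what I'm after (hypothetical paths, names, and
versions - not the actual tooling): build deb and rpm artifacts from one
source tree with fpm, publish them to a local repo, and only then run
Puppet.

    # build_packages.py - illustrative only
    import subprocess

    def build_package(src_dir, name, version, target):
        """Package src_dir as a deb or rpm using fpm."""
        subprocess.check_call([
            'fpm',
            '-s', 'dir',     # input: a plain directory tree
            '-t', target,    # output format: 'deb' or 'rpm'
            '-n', name,
            '-v', version,
            '-C', src_dir,   # chdir here before collecting files
            '.',             # package everything under src_dir
        ])

    for fmt in ('deb', 'rpm'):
        build_package('/opt/build/nova', 'nova-trunk', '2014.2.dev1', fmt)

    # Afterwards (outside this sketch): index the artifacts with
    # createrepo / dpkg-scanpackages, point the node at the local repo,
    # and only then kick off the Puppet run.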


 Puppet + git based deploys would honestly be a really handy thing
 (especially as lots of people end up having custom fixes for their
 site). The lack of CM tools for git based deploys is, I think, one of the
 reasons we see people using DevStack as a generic installer.


It's possible, but it's also straight up a poor thing to do in my opinion.
If you're going to install nova from source, maybe you also want libvirt
from source to test a new feature, then you want some of libvirt's deps, and
so on. Puppet isn't equipped to deal with this effectively. It runs "yum
install x", and that brings in the dependencies.

It's much better to automate the package building process

Re: [openstack-dev] bad default values in conf files

2014-02-15 Thread Michael Chapman
 Have the folks creating our puppet modules and install recommendations
 taken a close look at all the options and determined
 that the defaults are appropriate for deploying RHEL OSP in the
 configurations we are recommending?

If by "our puppet modules" you mean the ones in stackforge, in the vast
majority of cases they follow the defaults provided. I check that this is
the case during review, and the only exceptions should be stuff like the db
and mq locations that have to change for almost every install.

 - Michael



On Sat, Feb 15, 2014 at 10:15 AM, Dirk Müller d...@dmllr.de wrote:

  were not appropriate for real deployment, and our puppet modules were
  not providing better values
  https://bugzilla.redhat.com/show_bug.cgi?id=1064061.

 I'd agree that raising the caching timeout is not a good production
 default choice. I'd also argue that the underlying issue is fixed
 with https://review.openstack.org/#/c/69884/

 In our testing this patch has sped up the revocation retrieval by a factor
 of 120.

  The default probably is too low, but raising it too high will cause
  concern with those who want revoked tokens to take effect immediately
  and are willing to scale the backend to get that result.

 I agree, and changing defaults has a cost as well: every deployment
 solution out there has to detect the value change, update their config
 templates, and potentially also migrate the setting from the old to the
 new default for existing deployments. Being in that situation, we have
 been surprised by default changes that had undesirable side effects,
 just because we chose to override a different default elsewhere.

 I'm totally on board with having production-ready defaults, but that
 also means they should seldom change, and change only for a very
 good, possibly documented reason.


 Greetings,
 Dirk



Re: [openstack-dev] [all] sample config files should be ignored in git...

2014-03-27 Thread Michael Chapman
On Thu, Mar 27, 2014 at 4:10 PM, Robert Collins robe...@robertcollins.net
wrote:

 On 27 March 2014 17:30, Tom Fifield t...@openstack.org wrote:

  Does anyone disagree?
 
  /me raises hand
 
  When I was an operator, I regularly referred to the sample config files
  in the git repository.
 
  If there weren't generated versions of the sample config in the repo, I
  would probably grep the code (not an ideal user experience!). Running
  some random script whose existence I don't know about, and which might
  depend on having something else installed, is probably not something
  that would happen.

 So, I think it's important you have sample configs to refer to.

 Do they need to be in the git repo?

 Note that because libraries now export config options (which is the
 root of this problem!) you cannot ever know from the source all the
 options for a service - you *must* know the library versions you are
 running, to interrogate them for their options.

 We can - and should - have a discussion about the appropriateness of
 the layering leak we have today, but in the meantime this is breaking
 multiple projects every time any shared library that uses oslo.config
 changes any config option... so we need to solve the workflow aspect.

 How about we make a copy of the latest config for each project for
 each series - e.g. trunk of everything, Icehouse of servers with trunk
 of everything else, etc. - and make that easily accessible?

 -Rob

 --
 Robert Collins rbtcoll...@hp.com
 Distinguished Technologist
 HP Converged Cloud

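
To make Robert's point concrete: installed libraries advertise their
options through Python entry points, so a complete sample config can only
be generated against the exact set of library versions installed. A
minimal sketch, assuming the oslo.config.opts entry-point convention that
the sample generators rely on:

    # list_exported_opts.py - illustrative only
    import pkg_resources

    for ep in pkg_resources.iter_entry_points('oslo.config.opts'):
        list_opts = ep.load()   # each entry point is a list_opts() callable
        for group, opts in list_opts():
            for opt in opts:
                print('[%s] %s = %s  (from %s)' % (
                    group or 'DEFAULT', opt.name, opt.default, ep.dist))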



There are already some samples in the 'configuration reference' section of
docs.openstack.org, e.g.:

http://docs.openstack.org/havana/config-reference/content/ch_configuring-openstack-identity.html#sample-configuration-files

However the compute and image sections opt for a formatted table, and the
network section is more like an installation guide.

If the samples are to be removed from github, perhaps our configuration
reference section could be first and foremost the set of sample
configuration files for each project + plugin, rather than the samples
being merely a part of the reference doc as it currently exists.

I fairly consistently refer to the github copies of the samples. They also
allow me to refer to specific lines of the config when explaining concepts
over text. I am not against their removal, but if we were to remove them
I'd be disappointed if I had to search very far on docs.openstack.org to
get to them, and I would want the raw files instead of something formatted.

 - Michael


Re: [openstack-dev] [Zaqar] Comments on the concerns arose during the TC meeting

2014-10-03 Thread Michael Chapman
On Fri, Oct 3, 2014 at 4:05 AM, Soren Hansen so...@linux2go.dk wrote:

 I'm sorry about my slow responses. For some reason, gmail didn't think
 this was an important e-mail :(

 2014-09-30 18:41 GMT+02:00 Jay Pipes jaypi...@gmail.com:
  On 09/30/2014 08:03 AM, Soren Hansen wrote:
  2014-09-12 1:05 GMT+02:00 Jay Pipes jaypi...@gmail.com:
  How would I go about getting the associated fixed IPs for a network?
  The query to get associated fixed IPs for a network [1] in Nova looks
  like this:
 
  SELECT
   fip.address,
   fip.instance_uuid,
 [...]
  AND fip.instance_uuid IS NOT NULL
  AND i.host = :host
 
  would I have a Riak container for virtual_interfaces that would also
  have instance information, network information, fixed_ip information?
  How would I accomplish the query against a derived table that gets the
  minimum virtual interface ID for each instance UUID?

 What's a minimum virtual interface ID?

 Anyway, I think Clint answered this quite well.

  I've said it before, and I'll say it again. In Nova at least, the
  SQL schema is complex because the problem domain is complex. That
  means lots of relations, lots of JOINs, and that means the best way
  to query for that data is via an RDBMS.
 [...]
  I don't think relying on a central data store is in any conceivable
  way appropriate for a project like OpenStack. Least of all Nova.
 
  I don't see how we can build a highly available, distributed service
  on top of a centralized data store like MySQL.
 [...]
  I don't disagree with anything you say above. At all.

 Really? How can you agree that we can't build a highly available,
 distributed service on top of a centralized data store like MySQL while
 also saying that the best way to handle data in Nova is in an RDBMS?

  For complex control plane software like Nova, though, an RDBMS is
  the best tool for the job given the current lay of the land in open
  source data storage solutions matched with Nova's complex query and
  transactional requirements.
  What transactional requirements?
 
 https://github.com/openstack/nova/blob/stable/icehouse/nova/db/sqlalchemy/api.py#L1654
  When you delete an instance, you don't want the delete to just stop
  half-way through the transaction and leave around a bunch of orphaned
  children.  Similarly, when you reserve something, it helps to not have
  a half-finished state change that you need to go clean up if something
  goes boom.

 Looking at that particular example, it's about deleting an instance and
 all its associated metadata. As we established earlier, these are things
 that would just be in the same key as the instance itself, so it'd just
 be a single key that would get deleted. Easy.

 That said, there will certainly be situations where there'll be a need
 for some sort of anti-entropy mechanism. It just so happens that those
 situations already exist. We're dealing with a complex distributed
 system. We're kidding ourselves if we think that any kind of
 consistency is guaranteed, just because our data store favours
 consistency over availability.
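
(To ground the discussion: a sketch of the single-key modelling Soren
describes above, using the Python Riak client. The document shape is
hypothetical, not Nova's actual schema.)

    import riak

    client = riak.RiakClient()
    instances = client.bucket('instances')

    # The instance document carries its metadata and fixed IPs inline,
    # so there is nothing to JOIN against at read time...
    doc = {
        'uuid': 'f47ac10b-58cc-4372-a567-0e02b2c3d479',
        'host': 'compute-01',
        'metadata': {'owner': 'ops'},
        'fixed_ips': [{'address': '10.0.0.5', 'network_id': 'net-1'}],
    }
    instances.new(doc['uuid'], data=doc).store()

    # ...and deleting the instance plus its associated metadata is a
    # single-key operation rather than a multi-table transaction.
    instances.delete(doc['uuid'])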


I apologize if I'm missing something, but doesn't denormalization to add
join support put the same value in many places, such that an update to that
value is no longer a single atomic transaction? This would appear to
counteract the requirement for strong consistency. If updating a single
value is atomic (as in Riak's consistent mode) then it might be possible to
construct a way to make multiple updates appear atomic, but it would add
many more transactions and many more quorum checks, which would reduce
performance to a crawl.
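
A sketch of that fan-out, continuing the hypothetical document model from
above: once the network name is denormalized into every instance document,
renaming the network means N separate writes, each with its own quorum
round trip, and no atomic commit spanning them.

    def rename_network(bucket, instance_keys, network_id, new_name):
        for key in instance_keys:    # N writes where SQL needs one UPDATE
            obj = bucket.get(key)
            for fip in obj.data.get('fixed_ips', []):
                if fip.get('network_id') == network_id:
                    fip['network_name'] = new_name
            obj.store()              # a separate quorum-checked write each time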

I also don't really see how a NoSQL system in strong consistency mode is
any different from running MySQL with galera in its failure modes. The
requirement for quorum makes the addition of nodes increase the potential
latency of writes (and reads in some cases) so having large scale doesn't
grant much benefit, if any. Quorum will also prevent nodes on the wrong
side of a partition from being able to access system state (or it will give
them stale state, which is probably just as bad in our case).

I think your goal of having state management that's able to handle network
partitions is a good one, but I don't think the solution is as simple as
swapping out where the state is stored. Maybe in some cases like
split-racks the system needs to react to a network partition by forming its
own independent cell with its own state storage, and when the network heals
it then merges back into the other cluster cleanly? That would be very
difficult to implement, but fun (for some definition of fun).

As a thought experiment, a while ago I considered what would happen if
instead of using a central store, I put a sqlite database behind every
daemon and allowed them to query each other for the data they needed, and
cluster if needed (using raft). Services like nova-scheduler need strong
consistency and would have to cluster to perform their role, but services
like nova-compute would 

Re: [openstack-dev] [tripleo] Contributing to TripleO is challenging

2016-03-11 Thread Michael Chapman
On Sat, Mar 5, 2016 at 10:31 AM, Giulio Fidente wrote:

> On 03/04/2016 03:23 PM, Emilien Macchi wrote:
>
>> That's not the name of any Summit talk; it's just an e-mail I've wanted
>> to write for a long time.
>>
>> It is an attempt to lay out facts and things I've heard a lot, and bring
>> constructive thoughts about why it's challenging to contribute to the
>> TripleO project.
>>
>
> hi Emilien,
>
> thanks for bringing this up; it's not an easy topic, and yet one of the
> most crucial. As a core contributor I feel, to some extent, responsible
> for the current status of things, and I think it's time for us to reflect
> more on what we can, individually, do.
>
> I have some ideas, but I want to start by commenting on your points.
>
> 1/ "I don't review this patch, we don't have CI coverage."
>>
>> One thing I've noticed in TripleO is that very few people are involved
>> in CI work.
>> In my opinion, the CI system is more critical than any feature in a
>> product. Developing software without tests is a bit like
>> http://goo.gl/OlgFRc
>> All people - especially core - in the project should be involved in CI
>> work. If you are TripleO core and you don't contribute to CI, you might
>> ask yourself why.
>>
>
> Agreed, we need more 'eyes' on our CI to cope with both the infra and the
> unavoidable failures due to changes/bugs in the puppet modules or openstack
> itself.
>
> But there is more hiding behind this problem ... we already have quite a
> number of optional and even pluggable features in TripleO and we're even
> designing an interface to make this easier; testing them all isn't going to
> happen. So we'll always hit something we don't have coverage for.
>
> Let's have a conversation on how we can improve coverage at the summit!
> Maybe we can simply make our CI scenarios more varied/complex in an
> attempt to touch more features?
>
> 2/ "I don't review this patch, CI is broken."
>>
>> Another thing I've noticed in TripleO is that when CI is broken, again,
>> very few people are actually working on fixing failures.
>> My experience over the years has taught me to stop my daily work when
>> CI is broken and fix it asap.
>>
>
> Agreed. More eyes and more coverage to increase its dependability.
>
> 3/ "I don't review it, because this feature / code is not my area".
>>
>> My first thought is "Aren't we supposed to be engineers and learn new
>> areas?"
>> My second thought is that I think we have a problem with TripleO Heat
>> Templates.
>> THT, the TripleO Heat Templates code, is 80% Puppet / Hiera. If
>> TripleO cores say "I'm not familiar with Puppet", we have a problem here,
>> don't we?
>> Maybe we should split this repository? Or revisit the list of people who
>> can +2 patches on THT.
>>
>
> Not sure here, I find that manifests and templates are pretty much "meant
> to go together" so I am worried that a split could solve some problems but
> also cause others.
>

This is pretty much what I proposed last week (
https://blueprints.launchpad.net/tripleo/+spec/refactor-puppet-manifests)
and I noticed Dan approved the blueprint yesterday (cheers). It's
definitely going to cause problems, in that THT defines the data interface
and puppet-tripleo is going to have to keep up with that interface in
lock-step in some cases, so be prepared to deal with that as a patch author.
This isn't really any different to non-tripleo puppet module situations
where a change to the repo holding hiera data will be tied to changes in
modules.

Ideally I'd like to incrementally decouple the puppet-tripleo profiles from
the data heat provides but for the first cut they'll be joined at the hip.

So given a new home (puppet-tripleo) for a large portion of the code
(starting with overcloud controller and controller_pacemaker), hopefully
this paves the way for giving those who know puppet well the opportunity to
take on responsibility for the manifests without necessarily being
intimately familiar with the rest of the system, which I guess helps with
Emilien's original concern that there's a skill split across the tooling
lines.


>
> This said, let's be honest: an effective patch for THT requires a good
> understanding of many different problems, which can be TripleO specific
> (e.g. implications for upgrades), tooling specific (e.g. Heat/Puppet), or
> OpenStack specific (e.g. cooperation with other, optional features), so I
> have myself skipped changes when I didn't feel comfortable with them.
>
> But one problem which I think has more recently been slowing reviews, and
> which is partly a cause of 3), is that we're not dealing too well with code
> duplication in the yamls and with conditional logic in the manifests.
>
> Maybe we could stop and think together about new HOT functionality
> which could help us? Interesting for the summit as well?
>
> 4/ Patches are stalled. Most of the time.
>>
>> Over the last 12 months, I've pushed a lot of patches in TripleO and one
>> thing I've noticed is that if I don't ping people, my patch