Re: [openstack-dev] [heat] Rolling Upgrades

2016-10-24 Thread Grasza, Grzegorz
> From: Crag Wolfe [mailto:cwo...@redhat.com]
> 
> On 10/20/2016 09:30 PM, Rabi Mishra wrote:
> > Thanks, Crag, for starting the thread. A few comments inline.
> >
...
> >
> > For RPC changes, we don't have a great solution right now (looking
> > specifically at heat/engine/service.py). If we add a field, an older
> > running heat-engine will break if it receives a request from a newer
> > running heat-engine. For a relevant example, consider adding
> > "root_id" as an argument
> > (https://review.openstack.org/#/c/354621/13/heat/engine/service.py).
> >
> > Looking for the simplest solution -- if we introduce a mandatory
> > "future_args" arg (a dict) now to all rpc methods (perhaps provide a
> > decorator to do so), then we could follow this pattern post-Ocata:
> >
> > legacy release: accepts the future_args param (but does nothing with
> > it).
> > release+1: accept the new parameter with a default of None,
> >pass the value of the new parameter in future_args.
> > release+2: accept the new parameter, pass the value of the new
> parameter
> >in its proper placeholder, no longer in future_args.
> >
> > This is similar to the approach used by Neutron for its agents, i.e.
> > consistently capturing new/unknown arguments as keyword arguments and
> > ignoring them on the agent side, and not enforcing newer RPC entry
> > point versions on the server side. However, this makes the RPC API
> > less strict, which is not ideal.
> >
> 
> I'm not sure what the definition of ideal is here. But, full disclaimer:
> I've thought about RPC a lot less than the DB side. :-)
> 
> FWIW, we might as well be explicit in stating that we only expect two minor
> versions to be running at once (during a rolling upgrade). That is a simpler
> problem than having to support N minor versions.
> 

For heat, RPC compatibility actually isn't that complicated, because you can 
run multiple heat versions with different AMQP virtual hosts, so they 
physically cannot talk to each other.

You begin the rolling upgrade by starting new instances of heat engine with a 
new virtual host, then upgrade all API nodes (connected with a new virtual 
host). When all jobs on the old part of the cluster are done, you can stop old 
heat engine instances.
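As an aside, the future_args pattern quoted above could be sketched as a decorator along these lines (a hypothetical illustration, not actual Heat code; names like accepts_future_args are invented here):

```python
import functools

def accepts_future_args(func):
    """Hypothetical decorator: collect RPC arguments this release does
    not yet understand, so an older engine tolerates requests sent by a
    newer one."""
    @functools.wraps(func)
    def wrapper(self, ctxt, *args, future_args=None, **kwargs):
        # 'future_args' holds parameters added in a newer release; a
        # legacy engine simply ignores its contents.
        return func(self, ctxt, *args, **kwargs)
    return wrapper

class EngineService(object):
    @accepts_future_args
    def create_stack(self, ctxt, stack_name, template):
        # Legacy release: knows nothing about e.g. 'root_id'.
        return {'stack': stack_name, 'template': template}
```

With this in place, a newer engine can send future_args={'root_id': ...} and the legacy method above still completes without error.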

/ Greg


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone]Liberty->Mitaka upgrade: is it possible without downtime?

2016-04-14 Thread Grasza, Grzegorz
> From: Gyorgy Szombathelyi
> 
> Unknown column 'user.name' in 'field list'
> 
> in some operation when the DB is already upgraded to Mitaka, but some
> keystone instances in a HA setup are still Liberty.

Currently we don't support rolling upgrades in keystone. To do an upgrade, you 
need to upgrade all keystone service instances at once, instead of going 
one-by-one, which means you have to plan for downtime of the keystone API.

> 
> Is this change intentional? Should I ignore the problem and just upgrade
> all instances as fast as possible? Or did I just overlook something?
> 

You are right that there will be an error if you try running
Liberty+Mitaka, since the database schemas are not compatible. We have an
ongoing effort to support online schema migrations, but it didn't make it
into Mitaka. [1] [2]
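As a rough illustration of what "additive-only" means in practice, an expand-phase migration adds columns without dropping or renaming anything, so old code keeps working against the upgraded schema. This is a toy sqlite sketch with hypothetical table and column names, not keystone's actual migration code:

```python
import sqlite3

def expand_migration(conn):
    # Expand phase of an additive online migration: only ADD COLUMN,
    # never DROP or RENAME, so old and new releases can keep using
    # the same schema during the rolling upgrade.
    conn.execute('ALTER TABLE "user" ADD COLUMN display_name TEXT')

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER PRIMARY KEY, name TEXT)')
conn.execute('INSERT INTO "user" (name) VALUES (\'alice\')')  # old code writes
expand_migration(conn)
# Old code still works after the expand phase...
old_row = conn.execute('SELECT name FROM "user"').fetchone()
# ...and new code can start filling in the new column.
conn.execute('UPDATE "user" SET display_name = name')
```

The "Unknown column" error quoted above is what happens when a migration is not additive: the old release selects a column that no longer exists.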

We will have a presentation about Online DB Migrations at the summit (in the 
upstream development track), so if you are interested, you can attend or watch 
the recorded session afterwards [3]. There will also be a discussion about this 
in keystone meetings at the design summit. [4]

[1] 
https://specs.openstack.org/openstack/keystone-specs/specs/mitaka/online-schema-migration.html
[2] https://review.openstack.org/#/c/274079/
[3] https://www.openstack.org/summit/austin-2016/summit-schedule/events/7639
[4] https://etherpad.openstack.org/p/keystone-newton-summit-brainstorm




[openstack-dev] [keystone] Testing schema migrations was RE: [grenade][keystone] Keystone multinode grenade

2016-02-15 Thread Grasza, Grzegorz

> From: Morgan Fainberg [mailto:morgan.fainb...@gmail.com] 
>>
>> Keystone stable working with master db seems like an interesting bit, are
>> there already tests for that?
>
>Not yet. Right now there is only a unit test, checking obvious 
>incompatibilities.
>
> As an FYI, this test was reverted, as we spent significant time at the
> midcycle covering it: it was going to require us to significantly rework
> in-flight code, and it was developed / agreed upon before the new db
> restrictions landed. We will be revisiting this, with a now better
> understanding of the scope and of how to handle the "limited" downtime
> upgrade, first thing in Newton.

In the commit description you mentioned that "the base feature of what this test
encompasses will instead be migrated over to a full separate gate/check job that
will be able to handle the more complex tasks of ensuring schema upgrades make
sense."

As I understand it, a gate test which upgrades the DB to the latest version
and then runs tempest on the old release would cover the cases which the
unit test covered. Is this what you had in mind?

Do you think I can start working on it, or should we first synchronize on
what the final approach should be?

Can you elaborate on the ideas for testing schema changes that came up at
the midcycle?

What especially interests me is whether you discussed any ideas which might
be better than just running tempest on keystone in HA.

I'm sorry I couldn't take part in these discussions.

/ Greg



Re: [openstack-dev] [grenade][keystone] Keystone multinode grenade

2016-02-08 Thread Grasza, Grzegorz

> From: Sean Dague [mailto:s...@dague.net]
> 
> On 02/05/2016 04:44 AM, Grasza, Grzegorz wrote:
> >
> >> From: Sean Dague [mailto:s...@dague.net]
> >>
> >> On 02/04/2016 10:25 AM, Grasza, Grzegorz wrote:
> >>>
> >>> Keystone is just one service, but we want to run a test, in which it
> >>> is setup in HA – two services running at different versions, using
> >>> the same
> >> DB.
> >>
> >> Let me understand the scenario correctly.
> >>
> >> There would be Keystone Liberty and Keystone Mitaka, both talking to
> >> a Liberty DB?
> >>
> >
> > The DB would be upgraded to Mitaka. From Mitaka onwards, we are
> > making only additive schema changes, so that both versions can work
> > simultaneously.
> >
> > Here are the specifics:
> > http://docs.openstack.org/developer/keystone/developing.html#online-migration
> 
> Breaking this down, it seems like there is a simpler test setup here.
> 
> Master keystone is already tested with master db, all over the place: in
> unit tests and all the dsvm jobs. So we can assume pretty hard that that
> works.
>
> Keystone doesn't cross-talk to itself (as there are no workers), so I
> don't think there is anything to test there.
> 
> Keystone stable working with master db seems like an interesting bit, are
> there already tests for that?

Not yet. Right now there is only a unit test, checking obvious 
incompatibilities.

> 
> Also, is there any time where you'd get data from new Keystone, use it in
> a server, and then send it back to old Keystone, and have a validation
> issue? It seems easier to trigger such edge cases at a lower level. Like
> an extra attribute is in a payload in new Keystone, and old Keystone
> faceplants with it.

In the case of keystone, the data that can cause compatibility issues lives
in the DB. There can be issues when data stored or modified by the new
keystone is read by the old service, or the other way around. Such issues
may surface only in certain scenarios, like:

row created by old keystone ->
row modified by new keystone ->
failure reading by old keystone
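A toy sketch of that failure mode (the column names here are entirely hypothetical):

```python
# Row as written by the old keystone.
row = {'id': 1, 'extra': '{"email": "a@example.com"}'}

def new_keystone_modify(row):
    # New release moves data out of 'extra' into a dedicated column and
    # drops the old key: a non-additive change.
    row['email'] = 'a@example.com'
    del row['extra']

def old_keystone_read(row):
    return row['extra']  # old code still expects 'extra'

new_keystone_modify(row)
try:
    old_keystone_read(row)
except KeyError:
    print('old keystone fails on a row touched by new keystone')
```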

I think a CI test in which we have more than one keystone version
accessible at the same time is preferable to testing only one scenario. My
proposed solution with HAProxy probably wouldn't trigger all of them, but
it may catch some instances in which there is no full lower-level test
coverage. I think testing in HA would be helpful, especially at the
beginning, when we are only starting to evaluate rolling upgrades and
discovering new types of issues that we should test for.

> 
> The reality is that standing up an HAProxy Keystone multinode environment
> is going to be a pretty extensive amount of work. And when things fail,
> digging out why is kind of hard. However, it feels like most of the
> interesting edges can be tested well at a lower level. And it is at least
> worth getting those sorted before biting off the bigger thing.

I only proposed multinode grenade because I thought it was the most
complete solution for what I want to achieve, but maybe there is a simpler
way, like running two keystone instances on the same node?

/ Greg



Re: [openstack-dev] [grenade][keystone] Keystone multinode grenade

2016-02-05 Thread Grasza, Grzegorz


> -Original Message-
> From: Sean Dague [mailto:s...@dague.net]
> 
> On 02/04/2016 10:25 AM, Grasza, Grzegorz wrote:
> >
> > Keystone is just one service, but we want to run a test, in which it
> > is setup in HA – two services running at different versions, using the same
> DB.
> 
> Let me understand the scenario correctly.
> 
> There would be Keystone Liberty and Keystone Mitaka, both talking to a
> Liberty DB?
> 

The DB would be upgraded to Mitaka. From Mitaka onwards, we are making only 
additive schema changes, so that both versions can work simultaneously.

Here are the specifics:
http://docs.openstack.org/developer/keystone/developing.html#online-migration

/ Greg



[openstack-dev] [grenade][keystone] Keystone multinode grenade

2016-02-04 Thread Grasza, Grzegorz
Hi Sean,
we are looking into testing online schema migrations in keystone.

The idea is to run grenade with multinode support, but it would be
something different from the current implementation.

Currently, the two nodes which are started run with different roles: one is
a controller, the other is a compute.

Keystone is just one service, but we want to run a test in which it is set
up in HA: two services running at different versions, using the same DB.

They could be joined by running HAProxy in round-robin mode on one of the 
nodes. We could then run tempest against the HAProxy endpoint.

Can you help me with the implementation or give some pointers on where to make 
the change?

Specifically, do you think a new DEVSTACK_GATE_GRENADE or a new 
DEVSTACK_GATE_TOPOLOGY would be needed?

/ Greg



[openstack-dev] [keystone] Online schema migration in one release

2016-01-20 Thread Grasza, Grzegorz
Hi,
I wrote up a POC implementing an example online schema migration in one release 
cycle:
https://review.openstack.org/269693

This is based on the Online Schema Migration blueprint [1] and adds a
configuration option to denote which columns are in use at a given time.
The upgrade scenario for a rolling upgrade is the following:

1. Take one node out of the cluster, so that new HTTP API requests are
   managed by the other nodes

2. Shut the service down on that node

3. Upgrade the keystone application

4. Upgrade the DB schema with keystone-manage db_sync; only new columns
   are added at this point
   [note that some DDL statements may still lock tables]

5. Start the service and join it to the cluster

6. Do the above for the rest of the nodes

7. Once only the new version is running, the rest of the data can be
   migrated by normally operating the cluster or by running migration
   scripts with keystone-manage db_migrate (not yet implemented)

8. After the data is fully migrated, the sql.backward_compatible
   configuration flag can be set to False and the services restarted
   (after this point, downgrade is not possible)

9. Run the "contract" scripts (I provided a working example script for
   this POC, which also migrates the remaining data)
   [again, note that some DDL statements may still lock tables; the
   operator should be aware of this]
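As a sketch of how a backward-compatibility flag like the one in step 8 could gate writes (the model and column names below are hypothetical, not the POC's actual code):

```python
class UserStore(object):
    """Sketch of gating writes on a backward-compatibility flag.
    Model and column names are hypothetical."""

    def __init__(self, backward_compatible=True):
        self.backward_compatible = backward_compatible

    def columns_to_write(self, value):
        if self.backward_compatible:
            # During the rolling upgrade, write both old and new
            # columns so not-yet-upgraded nodes can still read the row.
            return {'name': value, 'display_name': value}
        # After the flag is flipped, only the new column is written;
        # the contract scripts can then drop the old one.
        return {'display_name': value}
```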

I think this is a good moment for a discussion on where online schema
upgrades should go in the future and whether the one-release-cycle upgrade
I'm proposing is a good approach.

[1] https://blueprints.launchpad.net/keystone/+spec/online-schema-migration
I also proposed additional documentation with guidelines at 
https://review.openstack.org/#/c/265252/

/ Greg



[openstack-dev] [magnum] versioned objects changes

2015-08-26 Thread Grasza, Grzegorz
Hi,

I noticed that right now, when we make changes (adding/removing fields) in 
https://github.com/openstack/magnum/tree/master/magnum/objects , we don't 
change object versions.

The idea of versioned objects is that each change to their fields should be
versioned: documentation about the change should be written in a comment
inside the object, and the obj_make_compatible method should be implemented
or updated. See an example here:
https://github.com/openstack/nova/commit/ad6051bb5c2b62a0de6708cd2d7ac1e3cfd8f1d3#diff-7c6fefb09f0e1b446141d4c8f1ac5458L27
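For illustration, the pattern might look roughly like this in plain Python (the field name is hypothetical; the real pattern uses oslo.versionedobjects base classes, which this sketch only imitates):

```python
class Bay(object):
    """Plain-Python sketch of the versioned-object pattern."""
    # Version 1.0: initial version
    # Version 1.1: added 'node_count'
    VERSION = '1.1'

    def obj_make_compatible(self, primitive, target_version):
        # Backport a 1.1 primitive for a service that only knows 1.0:
        # drop the field the older code would not understand.
        if target_version == '1.0':
            primitive.pop('node_count', None)
        return primitive
```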

The question is: do you think magnum should support rolling upgrades from
the next release, or is it still too early?

If yes, I think core reviewers should start checking for these incompatible 
changes.

To clarify, rolling upgrades means support for running magnum services at 
different versions at the same time.
In Nova, there is an RPC call in the conductor to backport objects, which is 
called when older code gets an object it doesn't understand. This patch does 
this in Magnum: https://review.openstack.org/#/c/184791/ .

I can report bugs and propose patches with version changes for this release, to 
get the effort started.

In Mitaka, when Grenade gets multi-node support, it can be used to add CI tests 
for rolling upgrades in Magnum.


/ Greg



Re: [openstack-dev] [Heat] RPC API versioning

2015-08-06 Thread Grasza, Grzegorz


 -Original Message-
 From: Zane Bitter [mailto:zbit...@redhat.com]
 Sent: Thursday, 6 August, 2015 2:57
 To: OpenStack Development Mailing List
 Subject: [openstack-dev] [Heat] RPC API versioning
 
 We've been talking about this since before summit without much consensus.
 I think a large part of the problem is that very few people have deep
 knowledge of both Heat and Versioned Objects. However, I think we are at a
 point where we should be able to settle on an approach at least for the API-
 engine RPC interface. I've been talking to Dan Smith about the Nova team's
 plan for upgrades, which goes something like this:
 
 * Specify a max RPC API version in the config file
 * In the RPC client lib, add code to handle versions as far back as the 
 previous
 release
 * The operator rolls out the updated code, keeping the existing config file
 with the pin to the previous release's RPC version
 * Once all services are upgraded, the operator rolls out a new config file
 shifting the pin
 * The backwards compat code to handle release N-1 is removed in the N+1
 release
 
 This is, I believe, sufficient to solve our entire problem.
 Specifically, we have no need for an indirection API that rebroadcasts
 messages that are too new (since that can't happen with pinning) and no
 need for Versioned Objects in the RPC layer. (Versioned objects for the DB
 are still critical, and we are very much better off for all the hard work that
 Michal and others have put into them. Thanks!)

What is the use of versioned objects outside of RPC?
I've written some documentation for Oslo VO and helped introduce them in
Heat. As I understand it, the only use cases for VO are:
* to serialize objects to dicts with version information when they are sent
over RPC;
* to handle version-dependent code inside the objects (instead of
scattering it around the codebase);
* to provide object-oriented and transparent access, via RPC, to resources
for services which don't have direct access to them (the indirection API).

The last point was not yet discussed in Heat as far as I know, but the
indirection API also contains an interface for backporting objects, which
is currently only used in Nova and, as you say, has no use when version
pinning is in place.
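For illustration, the pinning scheme described above could be sketched like this (version numbers and the method are hypothetical; real code would parse versions with oslo.messaging's version-cap machinery rather than compare strings):

```python
class EngineClient(object):
    """Sketch of an RPC client that caps outgoing messages at a pinned
    version, per the upgrade plan quoted above."""
    RPC_API_VERSION = '1.2'  # current release

    def __init__(self, pinned_version=None):
        # The operator pins to the previous release's version during a
        # rolling upgrade, e.g. pinned_version='1.1'.
        self.version_cap = pinned_version or self.RPC_API_VERSION

    def create_stack(self, stack_name, root_id=None):
        msg = {'stack_name': stack_name}
        # NOTE: naive string comparison; fine for '1.1' vs '1.2' in
        # this sketch, wrong in general (e.g. '1.10').
        if self.version_cap >= '1.2':
            # 'root_id' was added in 1.2; omit it when pinned lower so
            # an N-1 engine still understands the message.
            msg['root_id'] = root_id
        return msg
```

Once all services run the new code, the operator lifts the pin and new fields start flowing.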

 
 The nature of Heat's RPC API is that it is effectively user-facing - the 
 heat-api
 process is essentially a thin proxy between ReST and RPC. We already have a
 translation layer between the internal representation(s) of objects and the
 user-facing representation, in the form of heat.engine.api, and the RPC API is
 firmly on the user-facing side. The requirements for the content of these
 messages are actually much stricter than anything we need for RPC API
 stability, since they need to remain compatible not just with heat-api but
 with heatclient - and we have *zero* control over when that gets upgraded.
 Despite that, we've managed quite nicely for ~3 years without breaking
 changes afaik.
 
 Versioned Objects is a great way of retaining control when you need to share
 internal data structures between processes. Fortunately the architecture of
 Heat makes that unnecessary. That was a good design decision. We are not
 going to reverse that design decision in order to use Versioned Objects. (In
 the interest of making sure everyone uses their time productively, perhaps I
 should clarify that to: your patch is subject to -2 on sight if it introduces
 internal engine data structures to heat-api/heat-cfn-api.)
 
 Hopefully I've convinced you of the sufficiency of this plan for the API-
 engine interface specifically. If anyone disagrees, let them speak now, &c.

I don't understand - what is the distinction between internal and external data 
structures?

From what I understand, versioned objects were introduced in Heat to represent 
objects which are sent over RPC between Heat services.

 
 I think there is still a case that could be made for a different approach to 
 the
 RPC API for convergence, which is engine-engine and
 (probably) doesn't yet have a formal translation layer of the same kind.
 At a minimum, obviously, we should do the same stuff listed above (though I
 don't think we need to declare that interface stable until the first release
 where we enable convergence by default).
 

I agree this could be a good use case for VO.


 There's probably places where versioned objects could benefit us. For
 example, when we trigger a check on a resource we pass it a bundle of data
 containing all the attributes and IDs it might need from resources it depends
 on. It definitely makes sense to me that that bundle would be a Versioned
 Object. (In fact, that data gets stored in the DB - as SyncPoint in the
 prototype - so we wouldn't even need to create a new object type. This
 seems like a clear win.)
 
 What I do NOT want to do is to e.g. replace the resource_id of the resource
 to check with a versioned object 

Re: [openstack-dev] [Heat][Oslo] Versioned objects compatibility mode

2015-07-06 Thread Grasza, Grzegorz
 
  On Mon, Jun 22, 2015 at 5:40 AM, Jastrzebski, Michal
  michal.jastrzeb...@intel.com wrote:
   Hello,
  
   I wanted to start discussion about versioned objects backporting
   for conductor-less projects.
   In Vancouver we discussed compatibility mode, which works like that:
  

Dan's blog post suggests that Nova already requires two restarts:
http://www.danplanet.com/blog/2015/06/26/upgrading-nova-to-kilo-with-minimal-downtime/

I added a bp/spec for this in Heat:
https://review.openstack.org/196670

There is also an approved spec in Cinder:
https://review.openstack.org/192037
which, to avoid restarts, stores the configuration in the DB.
This requires all services to have direct access to the database, so it
can't be used in all projects. Thang Pham also wrote a Cinder POC
implementation: https://review.openstack.org/184404.

/ Greg
