-----Original Message-----
From: Zane Bitter [mailto:zbit...@redhat.com] 
Sent: Tuesday, December 9, 2014 3:50 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Heat] Convergence proof-of-concept showdown

On 08/12/14 07:00, Murugan, Visnusaran wrote:
>
> Hi Zane & Michael,
>
> Please have a look @ 
> https://etherpad.openstack.org/p/execution-stream-and-aggregator-based-convergence
>
> Updated with a combined approach which does not require persisting the graph 
> or removing the backup stack.

Well, we still have to persist the dependencies of each version of a resource 
_somehow_, because otherwise we can't know how to clean them up in the correct 
order. But what I think you meant to say is that this approach doesn't require 
it to be persisted in a separate table where the rows are marked as traversed 
as we work through the graph.

[Murugan, Visnusaran] 
In the case of a rollback, where we have to clean up earlier versions of resources, 
we could get the order from the old template. We'd prefer not to have a graph table.

> This approach reduces DB queries by waiting for completion notifications on a 
> topic. The drawback I see is that the delete-stack stream will be huge, as it 
> will contain the entire graph. We can always dump such data in ResourceLock.data 
> as JSON and pass a simple flag "load_stream_from_db" to the converge RPC call 
> as a workaround for the delete operation.

This seems to be essentially equivalent to my 'SyncPoint' proposal[1], with the 
key difference that the data is stored in-memory in a Heat engine rather than 
the database.

I suspect it's probably a mistake to move it in-memory for similar reasons to 
the argument Clint made against synchronising the marking off of dependencies 
in-memory. The database can handle that and the problem of making the DB robust 
against failures of a single machine has already been solved by someone else. 
If we do it in-memory we are just creating a single point of failure for not 
much gain. (I guess you could argue it doesn't matter, since if any Heat engine 
dies during the traversal then we'll have to kick off another one anyway, but 
it does limit our options if that changes in the future.)

[Murugan, Visnusaran] A resource completes, removes itself from resource_lock and 
notifies the engine. The engine will acquire the parent's lock and initiate the 
parent only if all of its children are satisfied (i.e. no child entry remains in 
resource_lock). This will take the place of the Aggregator.
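
Roughly along these lines (a purely illustrative, in-memory stand-in; the real 
implementation would use DB rows and RPC notifications, and none of these names 
are the actual Heat schema):

# Each pending resource holds an entry in resource_lock; on completion it
# removes its own entry and the engine starts a waiting parent once none
# of that parent's children remain locked.
resource_lock = {"A", "B", "C"}        # resources not yet converged
children_of = {"A": {"B", "C"}}        # parent -> children it waits on
parents_of = {"B": {"A"}, "C": {"A"}}  # child -> parents waiting on it

def converge(name):
    print("converging %s" % name)
    # ... the actual resource action would happen here ...
    resource_complete(name)

def resource_complete(name):
    # Resource finished: drop its own lock entry and notify the engine.
    resource_lock.discard(name)
    for parent in parents_of.get(name, ()):
        # Initiate the parent only when no child entry remains locked.
        if not (children_of.get(parent, set()) & resource_lock):
            converge(parent)

# Kick off the leaves; "A" only starts after both "B" and "C" complete.
converge("B")
converge("C")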

It's not clear to me how the 'streams' differ in practical terms from just 
passing a serialisation of the Dependencies object, other than being 
incomprehensible to me ;). The current Dependencies implementation
(1) is a very generic implementation of a DAG, (2) works and has plenty of unit 
tests, (3) has, with I think one exception, a pretty straightforward API, (4) 
has a very simple serialisation, returned by the edges() method, which can be 
passed back into the constructor to recreate it, and (5) has an API that is to 
some extent relied upon by resources, and so won't likely be removed outright 
in any event. 
Whatever code we need to handle dependencies ought to just build on this 
existing implementation.
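
For example, the whole round-trip is essentially just this (a toy stand-in 
class for the sake of illustration, not the real heat.engine.dependencies 
module):

import json

class Dependencies(object):
    """Toy stand-in: a DAG stored as a list of (requirer, required) edges."""
    def __init__(self, edges=()):
        self._edges = [tuple(e) for e in edges]

    def edges(self):
        return list(self._edges)

deps = Dependencies([("B", "A"), ("C", "B")])     # B requires A, C requires B

wire_format = json.dumps(deps.edges())            # trivially serialisable
restored = Dependencies(json.loads(wire_format))  # recreate from the edge list
assert restored.edges() == deps.edges()
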
[Murugan, Visnusaran] Our thought was to reduce the payload size (template/graph), 
just planning for the worst-case scenario (a million-resource stack). We could 
always dump them in ResourceLock.data to be loaded by the Worker.

I think the difference may be that the streams only include the
*shortest* paths (there will often be more than one) to each resource. i.e.

      A <------- B <------- C
      ^                     |
      |                     |
      +---------------------+

can just be written as:

      A <------- B <------- C

because there's only one order in which that can execute anyway. (If we're 
going to do this though, we should just add a method to the dependencies.Graph 
class to delete redundant edges, not create a whole new data structure.) There 
is a big potential advantage here in that it reduces the theoretical maximum 
number of edges in the graph from O(n^2) to O(n). (Although in practice real 
templates are typically not likely to have such dense graphs.)
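
Such a method could be a fairly small piece of code; a rough sketch against a 
plain dict (not the actual dependencies.Graph API) might look like:

def prune_redundant_edges(graph):
    """graph: {node: set of direct requirements}; returns a pruned copy."""
    def reachable(start, target, skip_edge):
        # Depth-first search that ignores the edge we're testing.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            for nxt in graph[node]:
                if (node, nxt) == skip_edge or nxt in seen:
                    continue
                if nxt == target:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    # Drop an edge when its target is still reachable via a longer path.
    return {node: {req for req in reqs
                   if not reachable(node, req, (node, req))}
            for node, reqs in graph.items()}

# C depends on both B and A, but the direct C -> A edge is redundant.
graph = {"C": {"B", "A"}, "B": {"A"}, "A": set()}
assert prune_redundant_edges(graph) == {"C": {"B"}, "B": {"A"}, "A": set()}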

There's a downside to this too though: say that A in the above diagram is 
replaced during an update. In that case not only B but also C will need to 
figure out what the latest version of A is. One option here is to pass that 
data along via B, but that will become very messy to implement in a non-trivial 
example. The other would be for C to go search in the database for resources 
with the same name as A and the current traversal_id marked as the latest. But 
that not only creates a concurrency problem we didn't have before (A could have 
been updated with a new traversal_id at some point after C had established that 
the current traversal was still valid but before it went looking for A), it 
also eliminates all of the performance gains from removing that edge in the 
first place.

[1]
https://github.com/zaneb/heat-convergence-prototype/blob/distributed-graph/converge/sync_point.py

> To stop the current stack operation, we will use your traversal_id-based approach.

+1 :)
[Murugan, Visnusaran] We had this idea already :)

> If you feel the Aggregator model creates too many queues, then we 
> might have to poll the DB to get resource status. (Which will impact 
> performance adversely :) )

For the reasons given above I would vote for doing this in the DB. I agree 
there will be a performance penalty for doing so, because we'll be paying for 
robustness.
[Murugan, Visnusaran]  +1

> Lock table: name (unique - resource_id), stack_id, engine_id, data 
> (JSON to store the stream dict)

Based on our call on Thursday, I think you're taking the idea of the Lock table 
too literally. The point of referring to locks is that we can use the same 
concepts as the Lock table relies on to do atomic updates on a particular row 
of the database, and we can use those atomic updates to prevent race conditions 
when implementing SyncPoints/Aggregators/whatever you want to call them. It's 
not that we'd actually use the Lock table itself, which implements a mutex and 
therefore offers only a much slower and more stateful way of doing what we want 
(lock mutex, change data, unlock mutex).
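
Roughly, instead of lock/modify/unlock, a single conditional UPDATE does the 
whole thing atomically. Purely as an illustration (sqlite3 here, and none of 
these table or column names are the real Heat schema):

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sync_point "
           "(resource TEXT PRIMARY KEY, traversal_id TEXT, remaining INTEGER)")
db.execute("INSERT INTO sync_point VALUES ('A', 'trav-1', 2)")

def mark_child_done(resource, traversal_id):
    # Atomically decrement the outstanding-children count, but only if the
    # row still belongs to the traversal we think is current.
    cur = db.execute(
        "UPDATE sync_point SET remaining = remaining - 1 "
        "WHERE resource = ? AND traversal_id = ? AND remaining > 0",
        (resource, traversal_id))
    # rowcount == 0 means the traversal was superseded or already satisfied.
    return cur.rowcount == 1

assert mark_child_done('A', 'trav-1')        # first child reports in
assert mark_child_done('A', 'trav-1')        # second child reports in
assert not mark_child_done('A', 'stale-id')  # a stale traversal is ignored
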
[Murugan, Visnusaran] Are you suggesting something like a select-for-update on 
the resource table itself, without having a lock table?

cheers,
Zane.

> Your thoughts.
> Vishnu (irc: ckmvishnu)
> Unmesh (irc: unmeshg)


_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
