Hi,

Thanks for joining the senlin meetup last week at the Tokyo summit. We
know some of you were not able to make it for various reasons. I'm
trying to summarize the things we discussed during the meetup and the
preliminary conclusions we reached. Please feel free to reply to this
email or find the team on the #senlin channel if you have
questions/suggestions.

Short Version
-------------

- Senlin will focus more on two things during the Mitaka cycle:
  1) stability regarding API and engine; 2) Heat resource type support.

- The Senlin engine won't do "convergence" as suggested by some people.
  However, the engine should be responsible for managing the lifecycles
  of the objects it creates on behalf of users.

- The team will revise the APIs according to the recent guidelines from
  api-wg and make the first released version as stable as possible.
  Until a versioning scheme is in place, we won't bump the API versions
  in ad-hoc ways.

- Senlin will NOT introduce complicated monitoring mechanisms into the
  engine, although we will strive to support cluster/node status checks.
  We opt to rely on external monitoring services and leave the choice
  of service to users.

- We will continue working with the TOSCA team to polish policy
  definitions.

- We will document guidelines on how policy decisions are passed from
  one policy to another.

- We are interested in building baremetal clusters, but we will keep it
  in the pipeline unless there are: 1) real requests, and 2) resources
  to get it done.

- As part of the API stabilization effort, we will generalize the
  concept of 'webhook' into 'receiver'.

Long Version (TL;DR)
--------------------

* Stability vs. Features

We had some feature requests like managing container clusters, doing
smart scheduling, running scripts on a cluster of servers, supporting
clusters of non-compute resources, etc. These are all good ideas.
However, Senlin is not aiming to become a service of everything. We
have to resist the temptation of too wide a scope. There are millions
of things we could do, but the first priority at this stage is
stability.

Making it usable and stable before adding fancy features was the
consensus we reached during the meetup. We will stick to that during
the Mitaka cycle.

* Heat Resource Type Support

The team had a discussion with the heat team during a design summit slot.
The basic vision remained the same: let senlin do autoscaling and
deprecate heat autoscaling when senlin is stable. There are quite a few
details to be figured out. The first thing we will do is land the
senlin cluster, node and profile resource types in Heat and build an
end-to-end auto-scaling solution comparable to the existing one. Then
the two teams will decide how to make the transition smooth for both
developers and users.

* Convergence or Not

There were suggestions to define a 'desired' state and an 'observed'
state for clusters and have the senlin engine do the convergence. After
a closer examination of the use case, we decided not to do it.

The 'desired' state of a node is obvious (i.e. ACTIVE). The 'desired'
state of a cluster is a little bit vague. It boils down to whether we
would allow 'partial success' when creating a cluster of 1,000 nodes.
Failures are unavoidable, thus something we have to live with. However,
we are very cautious about making decisions for users. Say we have 90%
of the nodes ACTIVE in a cluster: should we label the cluster as
'ERROR', 'WARNING', or just 'ACTIVE'? We tend to leave this decision to
users, who are smart people too. To avoid putting too much burden on
users, we will add some defaults that can be set by operators.

There are cases where the senlin engine creates objects when enforcing
a policy, e.g. the load-balancing policy. The engine should do a good
job of managing the status of those objects.

* API Design

Senlin already has an API design which is documented. Before doing a
version 1.0 release, we need to hammer on it further. Most of these
revisions would be related to guidelines from api-wg.
For example, the following changes are expected to land during Mitaka:

 - return 202 instead of 200 for asynchronous operations
 - better align with the proposed change to 'action' APIs
 - sorting keys and directions
 - returning 400 or 404 for resources not found
 - add location headers where appropriate

Another change to the current API will be about webhooks. We got
suggestions related to receiving notifications from channels other than
webhooks, e.g. message queues and external monitoring services. To
avoid disruptive changes to the APIs in the future, we decided to
generalize the webhook APIs to 'receivers'. This is important work even
if webhooks are the only type of receiver we support. We don't want to
see webhook APIs provided and then soon replaced/deprecated.

* Relying on External Monitoring

There used to be some interest in doing status polling on cluster nodes
so that the engine would know whether nodes are healthy or not. This
idea was rejected during the meetup.

There are several reasons for this: too much overhead on the backend
services; still being unable to get the latest status of resources;
scalability concerns; etc. We have decided to rely on other
monitoring/alarming services to provide status updates for health
reporting.

When users send a GET request for a node resource, we will allow them
to get the latest resource status. By default, we may just return the
'cached' status in our own database.

* Collaboration with TOSCA

The team has been engaged in the development of TOSCA policy
definitions for a few months. We will continue this collaboration to
make sure senlin's policy definition is well aligned with the standard,
so that a translator can easily translate a TOSCA policy definition
into its senlin version. We will also feed suggestions back to the
standard team.

* More Documentation

Senlin already has some documentation for users and developers. The
APIs are documented using WADL as well.
Going forward, we will need to provide more for developers on policy
development. For example, there will be a chain of policies that will
be checked in sequence when an action is performed. We need a more
explicit protocol for policies to exchange data. A policy has to
document the inputs it can consume and the outputs it will generate.

We need to keep an eye on the recent proposal to rewrite the API docs
in a different format. That will hopefully get done during the Mitaka
cycle as well.

Guys, please fill in things I missed and bomb us with questions or
suggestions. :)

Regards,
  Qiming

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev