Re: [openstack-dev] [nova] Proposal for new metrics-threshold-filter
Hi John,

Thanks for the reply! Originally, I had uploaded the code for review on 2nd Dec [1]. As I did not receive any review/feedback on it, I thought having a blueprint/spec might help people review it, so I put the bp/spec up for review on 14th Dec [2]. As it is a plugin-based, lightweight feature [3], it may not necessarily require the bp/spec, and if we can complete the review, we can get the code in before the feature freeze. I would request you and the community to consider the code for merge ...

[1] - https://review.openstack.org/#/c/254423/
[2] - https://review.openstack.org/#/c/257596/
[3] - http://docs.openstack.org/developer/nova/process.html#how-do-i-get-my-code-merged

Regards,
SURO
irc//freenode: suro-patz

On 1/8/16 4:47 AM, John Garbutt wrote:
On 7 January 2016 at 18:49, SURO wrote:

Hi all,
I have proposed a nova-spec [1] for a new scheduling filter based on metrics thresholds. This is slightly different from the weighted metrics filter. The rationale and use case are explained in detail in the spec [1]. The implementation is also ready for review [2]. The effort is tracked with a blueprint [3]. I request the community to review them and provide valuable feedback.

[1] - https://review.openstack.org/#/c/257596/
[2] - https://review.openstack.org/#/c/254423/
[3] - https://blueprints.launchpad.net/nova/+spec/metrics-threshold-filter

This is very related to the ideas I have written up here:
https://review.openstack.org/#/c/256323/

Please note we have not been accepting new specs for the Mitaka release, mostly because we have feature freeze in a few weeks.
For more information please see:
http://docs.openstack.org/releases/schedules/mitaka.html#m-nova-bp-freeze
http://docs.openstack.org/developer/nova/process.html#how-do-i-get-my-code-merged

Thanks,
johnthetubaguy

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Proposal for new metrics-threshold-filter
Hi all,
I have proposed a nova-spec [1] for a new scheduling filter based on metrics thresholds. This is slightly different from the weighted metrics filter. The rationale and use case are explained in detail in the spec [1]. The implementation is also ready for review [2]. The effort is tracked with a blueprint [3]. I request the community to review them and provide valuable feedback.

[1] - https://review.openstack.org/#/c/257596/
[2] - https://review.openstack.org/#/c/254423/
[3] - https://blueprints.launchpad.net/nova/+spec/metrics-threshold-filter
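[Editor's note] The actual filter lives in the linked spec and review; as a rough, standalone sketch of the threshold idea (all metric names, threshold values, and the function name are hypothetical, and nova's real filter base class is not used here), a host would pass the filter only while every monitored metric stays below its configured limit:

```python
# Hypothetical thresholds, keyed by metric name (assumed config format).
THRESHOLDS = {"cpu.percent": 90.0, "memory.percent": 80.0}

def host_passes(host_metrics, thresholds=THRESHOLDS):
    """Return True if no monitored metric is at or over its threshold."""
    for name, limit in thresholds.items():
        value = host_metrics.get(name)
        if value is not None and value >= limit:
            return False  # host filtered out: metric breached its threshold
    return True

# Toy host metrics: node1 breaches cpu.percent, node2 is under all limits.
hosts = {
    "node1": {"cpu.percent": 95.0, "memory.percent": 40.0},
    "node2": {"cpu.percent": 50.0, "memory.percent": 60.0},
}
eligible = [h for h, m in hosts.items() if host_passes(m)]
```

Unlike a weigher, which ranks all hosts, a filter like this removes hosts outright, which is the distinction the spec draws from the weighted metrics filter.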
Re: [openstack-dev] [magnum] Magnum conductor async container operations
Hongbin,
Very useful pointers! Thanks for bringing up the relevant context!

The proposal to block here for consecutive operations on the same container is the approach to start with. We can have a wait-queue implementation follow; that way the approach is amortized over time. If you feel strongly, I am okay implementing the wait queue on the first go itself. [I felt a step-by-step approach keeps each change small and easier to review.]

By the way, I think the scope of the bay lock and the scope of a per-bay-per-container operation are different too, in terms of blocking. I have a confusion about non-blocking bay operations for horizontal scale [1] - "Heat will be having concurrency support, so we can rely on heat for the concurrency issue for now and drop the baylock implementation." - if a user issues two consecutive updates on a bay, and the updates go through different magnum-conductors, they can arrive at Heat in a different order, resulting in a different state of the bay. How Heat concurrency will prevent that, I am not very clear. [Take the example of 'magnum bay-update k8sbay replace node_count=100' followed by 'magnum bay-update k8sbay replace node_count=10'.]

[1] - https://etherpad.openstack.org/p/liberty-work-magnum-horizontal-scale (Line 33)

Regards,
SURO
irc//freenode: suro-patz

On 12/17/15 8:10 AM, Hongbin Lu wrote:
Suro,
FYI. Previously, we tried a distributed lock implementation for bay operations (here are the patches [1,2,3,4,5]). However, after several discussions online and offline, we decided to drop the blocking implementation for bay operations in favor of a non-blocking implementation (which is not implemented yet). You can find more discussion here [6,7]. For the async container operations, I would suggest considering a non-blocking approach first. If that is impossible and we need a blocking implementation, I suggest using the bay operations patches below as a reference.
[1] https://review.openstack.org/#/c/171921/
[2] https://review.openstack.org/#/c/172603/
[3] https://review.openstack.org/#/c/172772/
[4] https://review.openstack.org/#/c/172773/
[5] https://review.openstack.org/#/c/172774/
[6] https://blueprints.launchpad.net/magnum/+spec/horizontal-scale
[7] https://etherpad.openstack.org/p/liberty-work-magnum-horizontal-scale

Best regards,
Hongbin

-----Original Message-----
From: Adrian Otto [mailto:adrian.o...@rackspace.com]
Sent: December-16-15 10:20 PM
To: OpenStack Development Mailing List (not for usage questions)
Cc: s...@yahoo-inc.com
Subject: Re: [openstack-dev] [magnum] Magnum conductor async container operations

On Dec 16, 2015, at 6:24 PM, Joshua Harlow wrote:
SURO wrote:

Hi all,
Please review and provide feedback on the following design proposal for implementing the blueprint [1] on async-container-operations -

1. Magnum-conductor would have a pool of threads for executing the container operations, viz. executor_threadpool. The size of the executor_threadpool will be configurable. [Phase0]

2. Every time Magnum-conductor (Mcon) receives a container-operation request from Magnum-API (Mapi), it will do the initial validation and housekeeping, and then pick a thread from the executor_threadpool to execute the rest of the operations. Thus Mcon will return from the RPC request context much faster, without blocking the Mapi. If the executor_threadpool is empty, Mcon will execute in the manner it does today, i.e. synchronously - this will be the rate-limiting mechanism, thus relaying the feedback of exhaustion. [Phase0] How often we hit this scenario may indicate to the operator that more workers should be created for Mcon.

3. Blocking class of operations - there will be a class of operations which cannot be made async, as they are supposed to return results/content inline, e.g. 'container-logs'. [Phase0]

4.
Out-of-order considerations for the NonBlocking class of operations - there is a possible race condition around create followed by start/delete of a container, as things would happen in parallel. To solve this, we will maintain a map of container to executing thread for the current execution. If we find a request for an operation on a container-in-execution, we will block until the thread completes the execution. [Phase0]

Does whatever does these operations (mcon?) run in more than one process?

Yes, there may be multiple copies of magnum-conductor running on separate hosts.

Can it be requested to create in one process and then delete in another? If so, is that map some distributed/cross-machine/cross-process map that will be inspected to see what else is manipulating a given container (so that the thread can block until that is not the case... basically the map is acting like an operation-lock?)

That’s how I interpreted it as well. This is a race-prevention technique so that we don’t attempt to act on a resource until it is ready. Another way to deal with this is check the state of the resource, and
Re: [openstack-dev] [magnum] Magnum conductor async container operations
Josh,
Thanks for bringing up this discussion. Modulo hashing introduces the possibility of a 'window of inconsistency', and to address that dynamism, consistent hashing is better. BUT, for the problem at hand, I think modulo hashing is good enough, as the number of conductor worker instances in the OpenStack space is managed through config - a change in which would require a restart of the conductor. If the conductor is restarted, then the 'window of inconsistency' does not occur for the situation we are discussing.

Regards,
SURO
irc//freenode: suro-patz

On 12/16/15 11:39 PM, Joshua Harlow wrote:
SURO wrote:
Please find the reply inline.
Regards,
SURO
irc//freenode: suro-patz

On 12/16/15 7:19 PM, Adrian Otto wrote:
On Dec 16, 2015, at 6:24 PM, Joshua Harlow wrote:
SURO wrote:

Hi all,
Please review and provide feedback on the following design proposal for implementing the blueprint [1] on async-container-operations -

1. Magnum-conductor would have a pool of threads for executing the container operations, viz. executor_threadpool. The size of the executor_threadpool will be configurable. [Phase0]

2. Every time Magnum-conductor (Mcon) receives a container-operation request from Magnum-API (Mapi), it will do the initial validation and housekeeping, and then pick a thread from the executor_threadpool to execute the rest of the operations. Thus Mcon will return from the RPC request context much faster, without blocking the Mapi. If the executor_threadpool is empty, Mcon will execute in the manner it does today, i.e. synchronously - this will be the rate-limiting mechanism, thus relaying the feedback of exhaustion. [Phase0] How often we hit this scenario may indicate to the operator that more workers should be created for Mcon.

3. Blocking class of operations - there will be a class of operations which cannot be made async, as they are supposed to return results/content inline, e.g. 'container-logs'. [Phase0]

4.
Out-of-order considerations for the NonBlocking class of operations - there is a possible race condition around create followed by start/delete of a container, as things would happen in parallel. To solve this, we will maintain a map of container to executing thread for the current execution. If we find a request for an operation on a container-in-execution, we will block until the thread completes the execution. [Phase0]

Does whatever does these operations (mcon?) run in more than one process?

Yes, there may be multiple copies of magnum-conductor running on separate hosts.

Can it be requested to create in one process and then delete in another? If so, is that map some distributed/cross-machine/cross-process map that will be inspected to see what else is manipulating a given container (so that the thread can block until that is not the case... basically the map is acting like an operation-lock?)

Suro> @Josh, just after this, I had mentioned "The approach above puts a prerequisite that operations for a given container on a given Bay would go to the same Magnum-conductor instance.", which suggested multiple instances of magnum-conductors. Also, my idea for implementing this was as follows - magnum-conductors have an 'id' associated, which carries the notion of the [0 - (N-1)]th instance of magnum-conductor. Given a request for a container operation, we would always have the bay-id and container-id. I was planning to use 'hash(bay-id, key-id) modulo N' as the logic to ensure that the right instance picks up the intended request. Let me know if I am missing any nuance of AMQP here.
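[Editor's note] The 'hash(...) modulo N' routing idea above (read here as hashing the bay id and container id together; all names are hypothetical) can be sketched as follows. A stable digest such as md5 is used instead of Python's built-in hash(), which is randomized per interpreter and so would not agree across conductor processes:

```python
import hashlib

N = 4  # number of magnum-conductor instances, taken from config

def conductor_for(bay_id, container_id, n=N):
    """Pick the conductor instance [0, n) that owns this (bay, container)."""
    key = "{}/{}".format(bay_id, container_id).encode()
    digest = int(hashlib.md5(key).hexdigest(), 16)
    return digest % n

# Every operation on the same (bay, container) pair routes to one instance,
# which is the stickiness the per-container lock map relies on.
a = conductor_for("bay-1", "cont-42")
b = conductor_for("bay-1", "cont-42")
```

The known weakness, raised in the reply below this point in the thread, is that changing N remaps most keys; that is what a consistent-hash ring avoids.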
Unsure about the nuance of AMQP (I guess that's an implementation detail of this); but what this sounds like is similar to the hash rings other projects have built (ironic uses one [1]; ceilometer is slightly different afaik, see http://www.slideshare.net/EoghanGlynn/hash-based-central-agent-workload-partitioning-37760440 and https://github.com/openstack/ceilometer/blob/master/ceilometer/coordination.py#L48). The typical issue with modulo hashing is changes in N (whether adding new conductors or deleting them) and what that change in N does to ongoing requests, how you change N in an online manner (and so on); typically with modulo hashing a large number of keys get shuffled around [2]. So, just a thought, but a (consistent) hashing routine/ring might be worthwhile to look into, and/or talk with those other projects to see what they have been up to.

My 2 cents,
[1] https://github.com/openstack/ironic/blob/master/ironic/common/hash_ring.py
[2] https://en.wikipedia.org/wiki/Consistent_hashing

That’s how I interpreted it as well. This is a race-prevention technique so that we don’t attempt to act on a resource until it is ready. Another way to deal with this is check the state of the resource, and return a “not ready” error if it’s not ready yet. If this happens in a part of the system that is unattended by a user, we can
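[Editor's note] For contrast with the modulo scheme, here is a minimal consistent-hash ring in the spirit of the ironic/ceilometer pointers above (a toy sketch, not their actual implementations): each node is placed on the ring at several virtual points, and a key is owned by the first node clockwise from its hash. Adding a node then moves only the keys that fall between the new node's points and their predecessors, rather than reshuffling most keys.

```python
import bisect
import hashlib

def _h(s):
    # Stable integer hash, consistent across processes.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, replicas=64):
        # Each node gets `replicas` virtual points on the ring.
        self._ring = sorted((_h("{}:{}".format(n, i)), n)
                            for n in nodes for i in range(replicas))
        self._keys = [k for k, _ in self._ring]

    def node_for(self, key):
        # First ring point at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._keys, _h(key)) % len(self._ring)
        return self._ring[idx][1]

ring3 = Ring(["mcon-0", "mcon-1", "mcon-2"])
ring4 = Ring(["mcon-0", "mcon-1", "mcon-2", "mcon-3"])
keys = ["bay-1/cont-{}".format(i) for i in range(200)]
moved = sum(ring3.node_for(k) != ring4.node_for(k) for k in keys)
```

With a fourth node added, roughly a quarter of the keys should move, versus around three quarters under `hash % N` routing.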
Re: [openstack-dev] [magnum] Magnum conductor async container operations
Josh,
You pointed it out correctly! magnum-conductor has monkey-patched code, so the underlying thread module is actually using greenthreads.
- I would use eventlet.greenthread explicitly, as that would enhance readability.
- A greenthread has the potential of not yielding by itself if no i/o or blocking call is made. But in the present scenario that is not much of a concern, as the container-operation execution is light on the client side, and mostly blocks for the response from the server after issuing the request.
I will update the proposal with this change.

Regards,
SURO
irc//freenode: suro-patz

On 12/16/15 11:57 PM, Joshua Harlow wrote:
SURO wrote:
Josh,
Please find my reply inline.
Regards,
SURO
irc//freenode: suro-patz

On 12/16/15 6:37 PM, Joshua Harlow wrote:
SURO wrote:

Hi all,
Please review and provide feedback on the following design proposal for implementing the blueprint [1] on async-container-operations -

1. Magnum-conductor would have a pool of threads for executing the container operations, viz. executor_threadpool. The size of the executor_threadpool will be configurable. [Phase0]

2. Every time Magnum-conductor (Mcon) receives a container-operation request from Magnum-API (Mapi), it will do the initial validation and housekeeping, and then pick a thread from the executor_threadpool to execute the rest of the operations. Thus Mcon will return from the RPC request context much faster, without blocking the Mapi. If the executor_threadpool is empty, Mcon will execute in the manner it does today, i.e. synchronously - this will be the rate-limiting mechanism, thus relaying the feedback of exhaustion. [Phase0] How often we hit this scenario may indicate to the operator that more workers should be created for Mcon.

3. Blocking class of operations - there will be a class of operations which cannot be made async, as they are supposed to return results/content inline, e.g. 'container-logs'. [Phase0]

4.
Out-of-order considerations for the NonBlocking class of operations - there is a possible race condition around create followed by start/delete of a container, as things would happen in parallel. To solve this, we will maintain a map of container to executing thread for the current execution. If we find a request for an operation on a container-in-execution, we will block until the thread completes the execution. [Phase0] This mechanism can be further refined to achieve more asynchronous behavior. [Phase2] The approach above puts a prerequisite that operations for a given container on a given Bay would go to the same Magnum-conductor instance. [Phase0]

5. The hand-off between Mcon and a thread from the executor_threadpool can be reflected through new states on the 'container' object. These states can be helpful for recovery/audit in case of an Mcon restart. [Phase1]

Other considerations -
1. Using eventlet.greenthread instead of real threads => This approach would require further refactoring of the execution code to embed yield logic; otherwise a single greenthread would block others from progressing. Given that we will extend the mechanism to multiple COEs, and to keep the approach straightforward to begin with, we will use 'threading.Thread' instead of 'eventlet.greenthread'.

Also unsure about the above; not quite sure I see how greenthread usage requires more yield logic (I'm assuming you mean the yield statement here)? Btw, if magnum is running with all things monkey patched (which it seems like https://github.com/openstack/magnum/blob/master/magnum/common/rpc_service.py#L33 does), then magnum's usage of 'threading.Thread' is an 'eventlet.greenthread' underneath the covers, just fyi.

SURO> Let's consider this -

    function A() {
        block B;  // validation
        block C;  // blocking op
    }

Now, if we make C a greenthread, as it is, would it not block the entire thread that runs through all the greenthreads?
I assumed it would, and that's why we would have to incorporate finer-grained yields into C to leverage greenthreads. If the answer is no, then we can use greenthreads. I will validate which version of threading.Thread was getting used.

Unsure how to answer this one. If all things are monkey patched, then any time a blocking operation (i/o, lock acquisition...) is triggered, the internals of eventlet go through a bunch of jumping around to then switch to another green thread (http://eventlet.net/doc/hubs.html). Once you start partially using greenthreads and mixing in real threads, then you have to start trying to reason about yielding in certain places (and at that point you might as well go to py3.4+, since it has syntax made just for this kind of thinking).

Pointers for the thread monkey patching, btw:
https://github.com/eventlet/eventlet/blob/master/eventlet/patcher.py#L346
https://github.com/eventlet/eventlet/blob/master/eventlet/patcher.py#L212

Easy way to see this:
>>> import eventlet
>>> even
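[Editor's note] The interpreter demonstration above is cut off in the archive. One simple heuristic for "which Thread class am I actually getting" (a sketch, not an eventlet-documented API: it just inspects the module path of whatever class is bound to threading.Thread) looks like this:

```python
import threading

def looks_monkey_patched():
    # In a plain interpreter, threading.Thread comes from the stdlib
    # 'threading' module. After eventlet.monkey_patch() (not performed
    # here), the name is rebound to eventlet's green replacement, whose
    # module path mentions eventlet.
    return "eventlet" in threading.Thread.__module__

# False in an unpatched process; inside magnum-conductor, which
# monkey-patches at import time via rpc_service, it should differ.
patched = looks_monkey_patched()
```

This is the quick way to settle the "which version of threading.Thread was getting used" question raised above.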
Re: [openstack-dev] [magnum] Magnum conductor async container operations
On 12/16/15 11:39 PM, Joshua Harlow wrote:
SURO wrote:
Please find the reply inline.
Regards,
SURO
irc//freenode: suro-patz

On 12/16/15 7:19 PM, Adrian Otto wrote:
On Dec 16, 2015, at 6:24 PM, Joshua Harlow wrote:
SURO wrote:

Hi all,
Please review and provide feedback on the following design proposal for implementing the blueprint [1] on async-container-operations -

1. Magnum-conductor would have a pool of threads for executing the container operations, viz. executor_threadpool. The size of the executor_threadpool will be configurable. [Phase0]

2. Every time Magnum-conductor (Mcon) receives a container-operation request from Magnum-API (Mapi), it will do the initial validation and housekeeping, and then pick a thread from the executor_threadpool to execute the rest of the operations. Thus Mcon will return from the RPC request context much faster, without blocking the Mapi. If the executor_threadpool is empty, Mcon will execute in the manner it does today, i.e. synchronously - this will be the rate-limiting mechanism, thus relaying the feedback of exhaustion. [Phase0] How often we hit this scenario may indicate to the operator that more workers should be created for Mcon.

3. Blocking class of operations - there will be a class of operations which cannot be made async, as they are supposed to return results/content inline, e.g. 'container-logs'. [Phase0]

4. Out-of-order considerations for the NonBlocking class of operations - there is a possible race condition around create followed by start/delete of a container, as things would happen in parallel. To solve this, we will maintain a map of container to executing thread for the current execution. If we find a request for an operation on a container-in-execution, we will block until the thread completes the execution. [Phase0]

Does whatever does these operations (mcon?) run in more than one process?

Yes, there may be multiple copies of magnum-conductor running on separate hosts.
Can it be requested to create in one process and then delete in another? If so, is that map some distributed/cross-machine/cross-process map that will be inspected to see what else is manipulating a given container (so that the thread can block until that is not the case... basically the map is acting like an operation-lock?)

Suro> @Josh, just after this, I had mentioned "The approach above puts a prerequisite that operations for a given container on a given Bay would go to the same Magnum-conductor instance.", which suggested multiple instances of magnum-conductors. Also, my idea for implementing this was as follows - magnum-conductors have an 'id' associated, which carries the notion of the [0 - (N-1)]th instance of magnum-conductor. Given a request for a container operation, we would always have the bay-id and container-id. I was planning to use 'hash(bay-id, key-id) modulo N' as the logic to ensure that the right instance picks up the intended request. Let me know if I am missing any nuance of AMQP here.

Unsure about the nuance of AMQP (I guess that's an implementation detail of this); but what this sounds like is similar to the hash rings other projects have built (ironic uses one [1]; ceilometer is slightly different afaik, see http://www.slideshare.net/EoghanGlynn/hash-based-central-agent-workload-partitioning-37760440 and https://github.com/openstack/ceilometer/blob/master/ceilometer/coordination.py#L48). The typical issue with modulo hashing is changes in N (whether adding new conductors or deleting them) and what that change in N does to ongoing requests, how you change N in an online manner (and so on); typically with modulo hashing a large number of keys get shuffled around [2]. So, just a thought, but a (consistent) hashing routine/ring might be worthwhile to look into, and/or talk with those other projects to see what they have been up to.

Suro> When a new worker instance is added, I guess it is done by restarting the magnum-conductor service.
So, the sequencing would get reset altogether and the stickiness would resume afresh. I will go through your pointers and make sure I am not missing anything here.

My 2 cents,
[1] https://github.com/openstack/ironic/blob/master/ironic/common/hash_ring.py
[2] https://en.wikipedia.org/wiki/Consistent_hashing

That’s how I interpreted it as well. This is a race-prevention technique so that we don’t attempt to act on a resource until it is ready. Another way to deal with this is check the state of the resource, and return a “not ready” error if it’s not ready yet. If this happens in a part of the system that is unattended by a user, we can re-queue the call to retry after a minimum delay so that it proceeds only when the ready state is reached in the resource, or is terminated after a maximum number of attempts, or if the resource enters an error state. This would allow other work to proceed while the retry waits in the queue.

Suro> @Adrian, I think asy
Re: [openstack-dev] [magnum] Magnum conductor async container operations
Josh,
Please find my reply inline.

Regards,
SURO
irc//freenode: suro-patz

On 12/16/15 6:37 PM, Joshua Harlow wrote:
SURO wrote:

Hi all,
Please review and provide feedback on the following design proposal for implementing the blueprint [1] on async-container-operations -

1. Magnum-conductor would have a pool of threads for executing the container operations, viz. executor_threadpool. The size of the executor_threadpool will be configurable. [Phase0]

2. Every time Magnum-conductor (Mcon) receives a container-operation request from Magnum-API (Mapi), it will do the initial validation and housekeeping, and then pick a thread from the executor_threadpool to execute the rest of the operations. Thus Mcon will return from the RPC request context much faster, without blocking the Mapi. If the executor_threadpool is empty, Mcon will execute in the manner it does today, i.e. synchronously - this will be the rate-limiting mechanism, thus relaying the feedback of exhaustion. [Phase0] How often we hit this scenario may indicate to the operator that more workers should be created for Mcon.

3. Blocking class of operations - there will be a class of operations which cannot be made async, as they are supposed to return results/content inline, e.g. 'container-logs'. [Phase0]

4. Out-of-order considerations for the NonBlocking class of operations - there is a possible race condition around create followed by start/delete of a container, as things would happen in parallel. To solve this, we will maintain a map of container to executing thread for the current execution. If we find a request for an operation on a container-in-execution, we will block until the thread completes the execution. [Phase0] This mechanism can be further refined to achieve more asynchronous behavior. [Phase2] The approach above puts a prerequisite that operations for a given container on a given Bay would go to the same Magnum-conductor instance. [Phase0]

5.
The hand-off between Mcon and a thread from the executor_threadpool can be reflected through new states on the 'container' object. These states can be helpful for recovery/audit in case of an Mcon restart. [Phase1]

Other considerations -
1. Using eventlet.greenthread instead of real threads => This approach would require further refactoring of the execution code to embed yield logic; otherwise a single greenthread would block others from progressing. Given that we will extend the mechanism to multiple COEs, and to keep the approach straightforward to begin with, we will use 'threading.Thread' instead of 'eventlet.greenthread'.

Also unsure about the above; not quite sure I see how greenthread usage requires more yield logic (I'm assuming you mean the yield statement here)? Btw, if magnum is running with all things monkey patched (which it seems like https://github.com/openstack/magnum/blob/master/magnum/common/rpc_service.py#L33 does), then magnum's usage of 'threading.Thread' is an 'eventlet.greenthread' underneath the covers, just fyi.

SURO> Let's consider this -

    function A() {
        block B;  // validation
        block C;  // blocking op
    }

Now, if we make C a greenthread, as it is, would it not block the entire thread that runs through all the greenthreads? I assumed it would, and that's why we would have to incorporate finer-grained yields into C to leverage greenthreads. If the answer is no, then we can use greenthreads. I will validate which version of threading.Thread was getting used. In that case, keeping the code using threading.Thread is portable, as it would work as desired even if we remove the monkey patching, right?
Refs:-
[1] - https://blueprints.launchpad.net/magnum/+spec/async-container-operations
Re: [openstack-dev] [magnum] Magnum conductor async container operations
Please find the reply inline.

Regards,
SURO
irc//freenode: suro-patz

On 12/16/15 7:19 PM, Adrian Otto wrote:
On Dec 16, 2015, at 6:24 PM, Joshua Harlow wrote:
SURO wrote:

Hi all,
Please review and provide feedback on the following design proposal for implementing the blueprint [1] on async-container-operations -

1. Magnum-conductor would have a pool of threads for executing the container operations, viz. executor_threadpool. The size of the executor_threadpool will be configurable. [Phase0]

2. Every time Magnum-conductor (Mcon) receives a container-operation request from Magnum-API (Mapi), it will do the initial validation and housekeeping, and then pick a thread from the executor_threadpool to execute the rest of the operations. Thus Mcon will return from the RPC request context much faster, without blocking the Mapi. If the executor_threadpool is empty, Mcon will execute in the manner it does today, i.e. synchronously - this will be the rate-limiting mechanism, thus relaying the feedback of exhaustion. [Phase0] How often we hit this scenario may indicate to the operator that more workers should be created for Mcon.

3. Blocking class of operations - there will be a class of operations which cannot be made async, as they are supposed to return results/content inline, e.g. 'container-logs'. [Phase0]

4. Out-of-order considerations for the NonBlocking class of operations - there is a possible race condition around create followed by start/delete of a container, as things would happen in parallel. To solve this, we will maintain a map of container to executing thread for the current execution. If we find a request for an operation on a container-in-execution, we will block until the thread completes the execution. [Phase0]

Does whatever does these operations (mcon?) run in more than one process?

Yes, there may be multiple copies of magnum-conductor running on separate hosts.

Can it be requested to create in one process and then delete in another?
If so, is that map some distributed/cross-machine/cross-process map that will be inspected to see what else is manipulating a given container (so that the thread can block until that is not the case... basically the map is acting like an operation-lock?)

Suro> @Josh, just after this, I had mentioned "The approach above puts a prerequisite that operations for a given container on a given Bay would go to the same Magnum-conductor instance.", which suggested multiple instances of magnum-conductors. Also, my idea for implementing this was as follows - magnum-conductors have an 'id' associated, which carries the notion of the [0 - (N-1)]th instance of magnum-conductor. Given a request for a container operation, we would always have the bay-id and container-id. I was planning to use 'hash(bay-id, key-id) modulo N' as the logic to ensure that the right instance picks up the intended request. Let me know if I am missing any nuance of AMQP here.

That’s how I interpreted it as well. This is a race-prevention technique so that we don’t attempt to act on a resource until it is ready. Another way to deal with this is check the state of the resource, and return a “not ready” error if it’s not ready yet. If this happens in a part of the system that is unattended by a user, we can re-queue the call to retry after a minimum delay so that it proceeds only when the ready state is reached in the resource, or is terminated after a maximum number of attempts, or if the resource enters an error state. This would allow other work to proceed while the retry waits in the queue.

Suro> @Adrian, I think the async model is to let the user issue a sequence of operations, which might be causally ordered. I suggest we should honor the causal ordering rather than implementing the implicit retry model. As per my proposal above, if we can arbitrate operations per bay and per container, we should be able to achieve this ordering.
If it's just local in one process, then I have a library for you that can solve the problem of correctly ordering parallel operations ;) What we are aiming for is a bit more distributed.

Suro> +1 Adrian

This mechanism can be further refined to achieve more asynchronous behavior. [Phase2] The approach above puts a prerequisite that operations for a given container on a given Bay would go to the same Magnum-conductor instance. [Phase0]

5. The hand-off between Mcon and a thread from the executor_threadpool can be reflected through new states on the 'container' object. These states can be helpful to recover/audit, in case of Mcon restart. [Phase1]

Other considerations -
1. Using eventlet.greenthread instead of real threads => This approach would require further refactoring the execution code and embed yield logic, otherwise a single greenthread would block others to progress. Given, we will extend the mechanism for multiple COEs, and to keep the approach straight forward to begin with, we wi
[openstack-dev] [magnum] Magnum conductor async container operations
Hi all,

Please review and provide feedback on the following design proposal for implementing the blueprint[1] on async-container-operations -

1. Magnum-conductor would have a pool of threads for executing the container operations, viz. executor_threadpool. The size of the executor_threadpool will be configurable. [Phase0]

2. Every time Magnum-conductor (Mcon) receives a container-operation request from Magnum-API (Mapi), it will do the initial validation and housekeeping, and then pick a thread from the executor_threadpool to execute the rest of the operation. Thus Mcon will return from the RPC request context much faster, without blocking Mapi. If the executor_threadpool is exhausted, Mcon will execute the operation the way it does today, i.e. synchronously - this will be the rate-limiting mechanism, relaying the feedback of exhaustion. [Phase0] How often we hit this scenario may indicate to the operator that more Mcon workers should be created.

3. Blocking class of operations - there will be a class of operations which cannot be made async, as they are supposed to return results/content inline, e.g. 'container-logs'. [Phase0]

4. Out-of-order considerations for the non-blocking class of operations - there is a possible race condition around a create followed by a start/delete of a container, as things would happen in parallel. To solve this, we will maintain a map from container to executing thread for the current execution. If we find a request for an operation on a container-in-execution, we will block until that thread completes the execution. [Phase0] This mechanism can be further refined to achieve more asynchronous behavior. [Phase2] The approach above puts a prerequisite that operations for a given container on a given Bay would go to the same Magnum-conductor instance. [Phase0]

5. The hand-off between Mcon and a thread from the executor_threadpool can be reflected through new states on the 'container' object. These states can be helpful for recovery/audit in case of an Mcon restart. [Phase1]

Other considerations -

1. Using eventlet.greenthread instead of real threads => this approach would require further refactoring of the execution code to embed yield logic; otherwise a single greenthread would block the others from progressing. Given that we will extend the mechanism to multiple COEs, and to keep the approach straightforward to begin with, we will use 'threading.Thread' instead of 'eventlet.greenthread'.

Refs:-
[1] - https://blueprints.launchpad.net/magnum/+spec/async-container-operations

--
Regards,
SURO
irc//freenode: suro-patz

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
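Points 1, 2 and 4 of the proposal above can be sketched as follows. This is an illustrative toy, not the actual Magnum implementation - the class and method names are invented, and a semaphore plus per-container locks stand in for the proposed thread map:

```python
# Illustrative sketch of the proposal: a bounded executor pool,
# synchronous fallback when the pool is exhausted (the rate-limiting
# mechanism), and per-container serialization of operations.
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Conductor:
    def __init__(self, pool_size=2):
        self._pool = ThreadPoolExecutor(max_workers=pool_size)
        self._free = threading.Semaphore(pool_size)  # free-thread count
        self._locks = defaultdict(threading.Lock)    # container-id -> lock

    def _run(self, container_id, op):
        # Point 4: block while another op on this container is in flight,
        # so create/start/delete on one container never race.
        with self._locks[container_id]:
            return op()

    def submit(self, container_id, op):
        # Point 2: run async when a pool thread is free, otherwise fall
        # back to synchronous execution in the caller's context.
        if self._free.acquire(blocking=False):
            def task():
                try:
                    return self._run(container_id, op)
                finally:
                    self._free.release()
            return self._pool.submit(task)
        return self._run(container_id, op)  # synchronous fallback


c = Conductor(pool_size=2)
f = c.submit("c1", lambda: "created")
assert f.result() == "created"
```

Note that the per-container lock only serializes operations within one conductor process, which is why the proposal pairs it with the prerequisite that all operations for a given container on a given Bay are routed to the same Magnum-conductor instance.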
Re: [openstack-dev] [magnum][blueprint] magnum-service-list
Thanks Jay/Kennan/Adrian for chiming in! From this, I conclude that we have enough consensus to keep 'magnum service-list' and 'magnum coe-service-list' segregated. I will capture an extract of this discussion in the blueprint and start implementation. Kennan, I would request you to submit a separate bp/bug to address the staleness of the state of pod/rc.

Regards,
SURO
irc//freenode: suro-patz

On 8/3/15 5:33 PM, Kai Qiang Wu wrote:
Hi Suro and Jay,

I checked the discussion below, and I do believe we also need service-list (for just magnum-api and magnum-conductor), though it is not an urgent requirement. I also think service-list should not be bound to k8s or swarm etc. (we can use coe-service etc.) But I have more on the points below:
1) For k8s or swarm or mesos, I think magnum can expose them through coe-service-list. But right now we fetch the pods/rcs status from the DB; it seems improper to do that, as the DB has stale data. We need to fetch it through the k8s/swarm API endpoints.
2) It can also be exposed through the k8s/swarm/mesos client tools, if users like that.

Thanks
Best Wishes,
Kai Qiang Wu (吴开强 Kennan)
IBM China System and Technology Lab, Beijing
E-mail: wk...@cn.ibm.com
Tel: 86-10-82451647
Address: Building 28 (Ring Building), ZhongGuanCun Software Park, No.8 Dong Bei Wang West Road, Haidian District, Beijing, P.R.China 100193
Follow your heart. You are miracle!

From: Jay Lau
To: "OpenStack Development Mailing List (not for usage questions)"
Date: 08/04/2015 05:51 AM
Subject: Re: [openstack-dev] [magnum][blueprint] magnum-service-list
----
Hi Suro,

Yes, I did not see a strong reason for adding "service-list" to show all of magnum's system services, but it is nice to have.
But I did see a strong reason to rename "service-list" to "coe-service-list" or something else more meaningful, as I was often asked why "magnum service-list" shows some services in kubernetes but nothing about the magnum system itself. This command always confuses people. Thanks!
Re: [openstack-dev] [magnum][blueprint] magnum-service-list
Hi Jay,

Thanks for clarifying the requirements further. I do agree with the idea of having 'magnum service-list' and 'magnum coe-service-list' to distinguish coe-service as a different concept. BUT, in the OpenStack space, I do not see 'service-list' as a standardized function across the other APIs -
1. 'nova service-list' => lists services like api, conductor, etc.
2. neutron does not have this option.
3. 'heat service-list' => lists available engines.
4. 'keystone service-list' => lists the services/APIs that consult keystone.

Now in magnum, we may choose to model it after nova, but nova really has a bunch of backend services, viz. nova-conductor, nova-cert, nova-scheduler, nova-consoleauth, nova-compute [x N], whereas magnum does not. For magnum, at this point, do you see a strong need for a 'service-list' covering api/conductor only?

Regards,
SURO
irc//freenode: suro-patz

On 8/3/15 12:00 PM, Jay Lau wrote:
Hi Suro and others, comments on this? Thanks.

2015-07-30 5:40 GMT-04:00 Jay Lau <jay.lau@gmail.com>:
Hi Suro,

In my understanding, even though other COEs might have service/pod/rc concepts in the future, we may still want to distinguish "magnum service-list" from "magnum coe-service-list". service-list is mainly for magnum's native services, such as magnum-api, magnum-conductor, etc.; coe-service-list is mainly for the services running in the COEs under magnum. Thoughts? Thanks.
Re: [openstack-dev] [magnum][blueprint] magnum-service-list
Hi Hongbin,

What would be the value of having a COE-specific magnum command go and talk to the DB? In that case, the user may as well use the native client to fetch the data from the COE, which will even have the latest state. In a pluggable architecture there is always scope for a common abstraction and driver implementations. I think it is too early to declare service/rc/pod as specific to k8s, as the other COEs may very well converge onto similar/same concepts.

Regards,
SURO
irc//freenode: suro-patz

On 7/29/15 2:21 PM, Hongbin Lu wrote:
Suro,

I think service/pod/rc are k8s-specific. +1 for Jay's suggestion about renaming the COE-specific commands, since the new naming style is consistent with other OpenStack projects. In addition, it will eliminate name collisions between different COEs. Also, if we are going to support pluggable COEs, adding a prefix to COE-specific commands is unavoidable.

Best regards,
Hongbin
Re: [openstack-dev] [magnum][blueprint] magnum-service-list
Hi Jay,

'service'/'pod'/'rc' are conceptual abstractions at the magnum level. Yes, the abstraction was inspired by the same in kubernetes, but the data stored in the DB about a 'service' is properly abstracted and is not k8s-specific at the top level. If we rename this to 'k8s-service-list', the same applies to creation and the other actions too. This will give rise to COE-specific commands and concepts, which may proliferate further. Instead, we can abstract swarm's service concept under the umbrella of magnum's 'service' concept, without creating k8s-service and swarm-service. I suggest we keep the concept/abstraction at the Magnum level, as it is.

Regards,
SURO
irc//freenode: suro-patz

On 7/28/15 7:59 PM, Jay Lau wrote:
Hi Suro,

Sorry for the late reply. IMHO, even though "magnum service-list" is getting data from the DB, the DB is actually persisting data for Kubernetes services, so my thinking is: would it be possible to change "magnum service-list" to "magnum k8s-service-list", and the same for pod and rc? I know this might bring some backward-compatibility trouble, and I am not sure if it is good to make such a modification at this time. Comments? Thanks
Re: [openstack-dev] [magnum][blueprint] magnum-service-list
Hi all,

As we did not hear back further on the requirement of this blueprint, I propose to keep the existing behavior without any modification. We will discuss and decide on this blueprint at our next weekly IRC meeting[1].

Regards,
SURO
irc//freenode: suro-patz

[1] - https://wiki.openstack.org/wiki/Meetings/Containers 2015-07-28 UTC 2200 Tuesday
[openstack-dev] [magnum][blueprint] magnum-service-list
Hi all, [special attention: Jay Lau]

The registered bp[1] asks for the following implementation -
* 'magnum service-list' should be similar to 'nova service-list'
* 'magnum service-list' should be moved to 'magnum k8s-service-list'; the same holds for 'pod-list'/'rc-list'

As I dug into the details, I find -
* 'magnum service-list' fetches data from the OpenStack DB[2], not from the COE endpoint. So technically it is not k8s-specific: magnum is serving data for objects modeled as 'service', just the way we cater for 'magnum container-list' in the case of a swarm bay.
* If magnum provides a way to get the COE endpoint details, users can use native tools to fetch the status of the COE-specific objects, viz. 'kubectl get services' here.
* nova has a lot more backend services, e.g. cert, scheduler, consoleauth, compute, etc., compared to magnum's conductor only. Also, not all the APIs have this 'service-list' available.

With these arguments in view, can we have some more explanation/clarification in favor of the ask in the blueprint?

[1] - https://blueprints.launchpad.net/magnum/+spec/magnum-service-list
[2] - https://github.com/openstack/magnum/blob/master/magnum/objects/service.py#L114

--
Regards,
SURO
irc//freenode: suro-patz