Re: [openstack-dev] [Mistral] Refine engine <-> executor protocol

2014-06-17 Thread Renat Akhmerov
On 13 Jun 2014, at 07:03, W Chan  wrote:

> Design proposal for blueprint 
> https://blueprints.launchpad.net/mistral/+spec/mistral-engine-executor-protocol
> Rename Executor to Worker.
I’d be OK with Worker, but I’d prefer ActionRunner: it reflects the purpose a 
little better, although it’s more verbose.
> Continue to use RPC server-client model in oslo.messaging for Engine and 
> Worker protocol.
Sure.
> Use asynchronous call (cast) between Worker and Engine where appropriate.
I would even emphasize: only async calls make sense.
> Remove any DB access from Worker.  DB IO will only be done by Engine.
I still doubt that’s actually possible. This is part of the issue I mentioned 
in the previous email. I’ll post a more detailed email on that separately.
> Worker updates Engine that it's going to start running action now.  If 
> execution is not RUNNING or task is not IDLE, Engine tells Worker to halt at 
> this point.  Worker cannot assume execution state is RUNNING and task state 
> is IDLE because the handle_task message could have been sitting in the 
> message queue for a while.  This call between Worker and Engine is 
> synchronous, meaning Worker will wait for a response from the Engine.  
> Currently, Executor checks state and updates task state directly to the DB 
> before running the action.
Yes, that’s how it works now. First of all, as I said before, we can’t afford 
to make any sync calls between engine and executor because that would lead to 
problems with scalability and fault tolerance. For that reason we make DB 
calls directly to make sure that the execution and the task itself are in a 
suitable state. This only works reliably if both engine and executor use 
READ_COMMITTED transactions, which I believe currently isn’t the case since we 
use sqlite (it doesn’t seem to support them, right?). With mysql it should be 
fine.

So the whole initial idea was to use the DB whenever we need to make sure that 
something is in the right state. That’s why all reads should see only 
committed data. We use the queue just to notify executors about new tasks. 
Basically, we could even have skipped the queue and polled the DB instead, but 
the queue looked more elegant.
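
To make the "queue only notifies, DB decides" idea concrete, here is a minimal 
sketch. It is not Mistral code: the schema and the names make_demo_db, 
is_runnable and drain_notifications are made up for illustration, and 
queue.Queue plus an in-memory sqlite database stand in for the real message 
queue and DB.

```python
import queue
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory DB with a hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE executions (id TEXT PRIMARY KEY, state TEXT)")
    db.execute(
        "CREATE TABLE tasks (id TEXT PRIMARY KEY, execution_id TEXT, state TEXT)"
    )
    return db


def is_runnable(db, task_id):
    """Return True only if the DB says execution is RUNNING and task is IDLE."""
    row = db.execute(
        "SELECT e.state, t.state FROM tasks t "
        "JOIN executions e ON e.id = t.execution_id WHERE t.id = ?",
        (task_id,),
    ).fetchone()
    # An unknown task yields None, which is treated as not runnable.
    return row == ("RUNNING", "IDLE")


def drain_notifications(db, task_queue):
    """Treat each queued message as a hint and re-check state in the DB."""
    started, skipped = [], []
    while True:
        try:
            task_id = task_queue.get_nowait()
        except queue.Empty:
            break
        if is_runnable(db, task_id):
            started.append(task_id)   # a real executor would run the action
        else:
            skipped.append(task_id)   # stale or invalid notification
    return started, skipped
```

A message that sat in the queue while its execution was stopped is simply 
skipped, because the decision is based on what the DB says at handling time.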

It’s all part of one problem. Let’s try to figure out options to simplify the 
protocol and make it more reliable.
> Worker communicates result (success or failure) to Engine.  Currently, 
> Executor is inconsistent and calls Engine.convey_task_result on success and 
> writes directly to DB on failure.
Yes, that probably needs to be fixed.

> Sequence
> Engine -> Worker.handle_task
> Worker converts action spec to Action instance
Yes, it uses the action spec if it’s an ad-hoc action. If not, it just gets 
the action class from the factory and instantiates it.
> Worker -> Engine.confirm_task_execution. Engine returns an exception if 
> execution state is not RUNNING or task state is not IDLE.
Maybe I don’t entirely follow your thought, but I don’t think it’s going to 
work. After the engine confirms everything’s OK, we’ll have a concurrency 
window again, and after that we’ll have to confirm the states again. That’s why 
I was talking about READ_COMMITTED DB transactions: we need to eliminate 
concurrency windows.
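
One common way to close such a window (not necessarily what Mistral does or 
will do) is to fold the check and the state change into a single conditional 
UPDATE, so there is no gap between "confirm" and "start". A rough sketch, 
assuming a hypothetical schema and a hypothetical helper named claim_task:

```python
import sqlite3


def make_claim_db():
    """Create an in-memory DB with a hypothetical executions/tasks schema."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE executions (id TEXT PRIMARY KEY, state TEXT)")
    db.execute(
        "CREATE TABLE tasks (id TEXT PRIMARY KEY, execution_id TEXT, state TEXT)"
    )
    return db


def claim_task(db, task_id):
    """Move a task IDLE -> RUNNING only if its execution is RUNNING.

    The check and the write happen in one statement, so two callers
    cannot both see IDLE and both start the action.
    Returns True if this caller won the claim.
    """
    with db:  # commits on success, rolls back on an exception
        cur = db.execute(
            "UPDATE tasks SET state = 'RUNNING' "
            "WHERE id = ? AND state = 'IDLE' "
            "AND execution_id IN "
            "(SELECT id FROM executions WHERE state = 'RUNNING')",
            (task_id,),
        )
    # rowcount is 1 only when the row matched every condition.
    return cur.rowcount == 1
```

The executor would run the action only when claim_task returns True; a second 
delivery of the same message, or a task whose execution was stopped, gets 
False and is dropped.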
> Worker runs action
> Worker -> Engine.convey_task_result
That looks fine (it’s as it is now). Maybe the only thing we need to pay 
attention to is how we communicate errors back to the engine. It seems logical 
that “convey_task_result()” can also be used to pass information about errors, 
so that an error is considered a special case of a regular result. Need to 
think it over though...
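
As a rough illustration of "an error is a special case of a result", something 
along these lines could work. Only the name convey_task_result comes from the 
existing code; TaskResult, run_action and the state names here are 
hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class TaskResult:
    """Hypothetical result envelope: either data or an error description."""
    data: Any = None
    error: Optional[str] = None

    @property
    def is_error(self):
        return self.error is not None


def run_action(action):
    """Run a callable and wrap its outcome, success or failure alike."""
    try:
        return TaskResult(data=action())
    except Exception as exc:
        return TaskResult(error=f"{type(exc).__name__}: {exc}")


class Engine:
    """Toy engine that records task state from the conveyed result."""

    def __init__(self):
        self.task_states = {}

    def convey_task_result(self, task_id, result):
        # One entry point for both outcomes; the executor never touches the DB.
        self.task_states[task_id] = "ERROR" if result.is_error else "SUCCESS"
```

With this shape the executor always makes the same call, for example 
engine.convey_task_result(task_id, run_action(action)), and the engine alone 
decides what a failure means for the task.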


Renat Akhmerov
@ Mirantis Inc.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Mistral] Refine engine <-> executor protocol

2014-06-17 Thread Renat Akhmerov
Hi Winson,

Sorry I haven’t responded until now; I was on vacation. Getting back to you 
now.

It’s my fault that the notes in the BP are not very clear.

1. By “worker parallelism” we meant that one worker (which is called “executor” 
now) can poll and handle more than one task from the task queue (it’s not 
abstracted from the notion of a queue, but anyway). It would be a nice feature 
because it would allow us to tune system performance much more precisely.

2. What “engine-executor parallelism” means I honestly don’t remember :) I 
guess this is a note made by Dmitri, so he may know better. Dmitri? As for 
engine<->executor interaction, we now have an issue with it that we need to 
fix, but it’s not related to parallelism. The protocol itself is not 100% 
complete in terms of reliability.
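
To illustrate “worker parallelism” from point 1, here is a minimal sketch of a 
worker that handles several tasks at once with a tunable limit. It uses stdlib 
threads purely as a stand-in (Mistral runs on eventlet green threads), and 
run_worker and max_parallel_tasks are made-up names.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def run_worker(task_queue, handle_task, max_parallel_tasks=4):
    """Drain task_queue, running at most max_parallel_tasks handlers at once.

    Returns a dict mapping each task id to the handler's return value.
    """
    futures = {}
    with ThreadPoolExecutor(max_workers=max_parallel_tasks) as pool:
        while True:
            try:
                task_id = task_queue.get_nowait()
            except queue.Empty:
                break
            # submit() returns immediately; the pool enforces the limit.
            futures[task_id] = pool.submit(handle_task, task_id)
        # Collect results before the pool shuts down.
        return {task_id: fut.result() for task_id, fut in futures.items()}
```

The max_parallel_tasks knob is the kind of tuning parameter meant above: one 
worker process can be sized to the cost of the actions it runs.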

Thanks

Renat Akhmerov
@ Mirantis Inc.


On 06 Jun 2014, at 23:12, W Chan  wrote:

> Regarding blueprint 
> https://blueprints.launchpad.net/mistral/+spec/mistral-engine-executor-protocol,
>  can you clarify what it means by worker parallelism and engine-executor 
> parallelism?  Currently, the engine and executor are launched with the 
> eventlet driver in oslo.messaging.  Once a message arrives over transport, a 
> new green thread is spawned and passed to the dispatcher.  In the case of 
> executor, the function being dispatched to is handle_task.  I'm unclear what 
> additional parallelism this blueprint is referring to.  The context isn't 
> clear from the summit notes.
> 



Re: [openstack-dev] [Mistral] Refine engine <-> executor protocol

2014-06-12 Thread W Chan
Design proposal for blueprint
https://blueprints.launchpad.net/mistral/+spec/mistral-engine-executor-protocol


   - Rename Executor to Worker.
   - Continue to use RPC server-client model in oslo.messaging for Engine
   and Worker protocol.
   - Use asynchronous call (cast) between Worker and Engine where
   appropriate.
   - Remove any DB access from Worker.  DB IO will only be done by Engine.
   - Worker updates Engine that it's going to start running action now.  If
   execution is not RUNNING or task is not IDLE, Engine tells Worker to halt
   at this point.  Worker cannot assume execution state is RUNNING and task
   state is IDLE because the handle_task message could have been sitting in
   the message queue for a while.  This call between Worker and Engine is
   synchronous, meaning Worker will wait for a response from the Engine.
   Currently, Executor checks state and updates task state directly to the DB
   before running the action.
   - Worker communicates result (success or failure) to Engine.  Currently,
   Executor is inconsistent: it calls Engine.convey_task_result on success but
   writes directly to DB on failure.

Sequence


   1. Engine -> Worker.handle_task
   2. Worker converts action spec to Action instance
   3. Worker -> Engine.confirm_task_execution. Engine returns an exception
   if execution state is not RUNNING or task state is not IDLE.
   4. Worker runs action
   5. Worker -> Engine.convey_task_result
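
The sequence above can be sketched in-process as follows. This is only an 
illustration: direct method calls stand in for the oslo.messaging RPC calls, 
and TaskHalted, ACTIONS, build_action and the action spec layout are 
hypothetical. Only handle_task, confirm_task_execution and convey_task_result 
are names from the proposal.

```python
class TaskHalted(Exception):
    """Raised by the engine when a task must not start."""


# Hypothetical action registry: action name -> callable.
ACTIONS = {"echo": lambda output: output}


def build_action(action_spec):
    """Step 2: turn an action spec into something runnable."""
    func = ACTIONS[action_spec["name"]]
    params = action_spec.get("params", {})
    return lambda: func(**params)


class Engine:
    def __init__(self):
        self.execution_state = "RUNNING"
        self.task_states = {}
        self.results = {}

    def confirm_task_execution(self, task_id):
        # Step 3: refuse unless execution is RUNNING and task is IDLE.
        if (self.execution_state != "RUNNING"
                or self.task_states.get(task_id) != "IDLE"):
            raise TaskHalted(task_id)
        self.task_states[task_id] = "RUNNING"

    def convey_task_result(self, task_id, result):
        # Step 5: the engine, not the worker, records the outcome.
        self.results[task_id] = result
        self.task_states[task_id] = "SUCCESS"


class Worker:
    def __init__(self, engine):
        self.engine = engine

    def handle_task(self, task_id, action_spec):
        # Step 1 is the engine calling this method.
        action = build_action(action_spec)
        try:
            self.engine.confirm_task_execution(task_id)
        except TaskHalted:
            return False  # stale message: do not run the action
        result = action()  # Step 4
        self.engine.convey_task_result(task_id, result)
        return True
```

Note that the worker holds no state of its own here and performs no DB IO; 
everything it learns or reports goes through the engine.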

Please provide feedback.

Thanks.
Winson




On Fri, Jun 6, 2014 at 9:12 AM, W Chan  wrote:

> Renat,
>
> Regarding blueprint
> https://blueprints.launchpad.net/mistral/+spec/mistral-engine-executor-protocol,
> can you clarify what it means by worker parallelism and engine-executor 
> parallelism?
>  Currently, the engine and executor are launched with the eventlet driver
> in oslo.messaging.  Once a message arrives over transport, a new green
> thread is spawned and passed to the dispatcher.  In the case of executor,
> the function being dispatched to is handle_task.  I'm unclear what
> additional parallelism this blueprint is referring to.  The context isn't
> clear from the summit notes.
>
> Thanks.
> Winson
>


[openstack-dev] [Mistral] Refine engine <-> executor protocol

2014-06-06 Thread W Chan
Renat,

Regarding blueprint
https://blueprints.launchpad.net/mistral/+spec/mistral-engine-executor-protocol,
can you clarify what it means by worker parallelism and
engine-executor parallelism?
Currently, the engine and executor are launched with the eventlet driver
in oslo.messaging.  Once a message arrives over transport, a new green
thread is spawned and passed to the dispatcher.  In the case of executor,
the function being dispatched to is handle_task.  I'm unclear what
additional parallelism this blueprint is referring to.  The context isn't
clear from the summit notes.

Thanks.
Winson