I'm sorry, but your email client messed up the formatting and I can't figure out what your comments are. Could you please use text-only emails?
----- Original Message -----
> From: "ybronhei" <[email protected]>
> To: "Saggi Mizrahi" <[email protected]>
> Cc: "Adam Litke" <[email protected]>, "engine-devel" <[email protected]>, "VDSM Project Development" <[email protected]>
> Sent: Wednesday, December 5, 2012 8:37:23 AM
> Subject: Re: [vdsm] VDSM tasks, the future
>
> On 12/05/2012 12:20 AM, Saggi Mizrahi wrote:
>
> As the only subsystem to use asynchronous tasks until now is the storage
> subsystem, I suggest going over how I think we should tackle task creation,
> task stop, task removal and task recovery.
> Other subsystems can create similar mechanisms depending on their needs.
>
> There is no way of avoiding it; different types of tasks need different ways
> of tracking\recovering from them.
> Network should always auto-recover because it can't get a "please fix"
> command if the network is down.
> Storage, on the other hand, should never start operations on its own because
> it might take up valuable resources from the host.
> Tasks that need to be tracked on a single host, 2 hosts, or the entire
> cluster need to have their own APIs.
> VM configuration never persists across reboots, networking sometimes
> persists, and storage always persists.
> This means that recovery procedures (from the manager's point of view) need
> to be vastly different.
> Add policy, resource allocation, and error flows, and you see that VDSM
> doesn't have nearly as much information to deal with these tasks.
>
> ----- Original Message -----
> From: "Adam Litke" <[email protected]>
> To: "Saggi Mizrahi" <[email protected]>
> Cc: "VDSM Project Development" <[email protected]>, "engine-devel" <[email protected]>, "Ayal Baron" <[email protected]>, "Barak Azulay" <[email protected]>, "Shireesh Anjal" <[email protected]>
> Sent: Tuesday, December 4, 2012 3:50:28 PM
> Subject: Re: VDSM tasks, the future
>
> On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote:
> Because I started hinting about how VDSM tasks are going to look going
> forward, I thought it would be better to just write everything in an email so
> we can talk about it in context. This is not set in stone and I'm still
> debating things myself, but it's very close to being done.
>
> Don't debate them yourself, debate them here! Even better, propose your idea
> in schema form to show how a command might work exactly.
>
> I don't like throwing ideas in the air. It can be much easier to understand
> the flow of a task in vdsm and outside vdsm with a small schema, mainly for
> each task's states.
> To define the flow of a task you can separate between types of tasks
> (network, storage, vms, or else); we should have task states that clarify
> whether the task can be recovered or not, canceled or not, and so on.
>
> Canceling\Aborting\Reverting states should be clarified further, and not
> every state can lead to all other states.
> I tried to figure out how task flow works today in vdsm, and this is what
> I've got - http://wiki.ovirt.org/Vdsm_tasks
>
> - Everything is asynchronous. The nature of message-based communication is
> that you can't have synchronous operations. This is not really debatable
> because it's just how TCP\AMQP\<messaging> works.
>
> Can you show how a traditionally synchronous command might work? Let's take
> Host.getVmList as an example.
>
> The same as it works today; it's all a matter of how you wrap the transport
> layer. You will send a json-rpc request and wait for a response with the
> same id.
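
To make that last point concrete, here is a minimal Python sketch (not the
actual client code; send_line/recv_line stand in for whatever framing the real
transport provides) of simulating a synchronous call by matching the response
id to the caller-chosen request id:

    # Sketch only, not VDSM client code: simulate a synchronous call on top of
    # an asynchronous JSON-RPC transport by waiting for the response whose id
    # matches the request id picked by the caller.
    import json
    import uuid

    def call_sync(send_line, recv_line, method, params=None):
        req_id = str(uuid.uuid4())              # id chosen by the caller
        send_line(json.dumps({
            "jsonrpc": "2.0",
            "id": req_id,
            "method": method,
            "params": params or {},
        }))
        while True:                             # skip unrelated messages/events
            response = json.loads(recv_line())
            if response.get("id") == req_id:
                if "error" in response:
                    raise RuntimeError(response["error"])
                return response["result"]

    # e.g. vm_list = call_sync(send, recv, "Host.getVmList")
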
> As for the bindings, there are a lot of ways we can tackle that.
> Always wait for the response and simulate synchronous behavior.
> Make every method return an object to track the task.
>
>     task = host.getVmList()
>     if not task.wait(1):
>         task.cancel()
>     else:
>         res = task.result()
>
> It looks like a traditional timeout.. why not split blocking actions and
> non-blocking actions? A non-blocking action will supply a callback function
> to return to when the task fails or succeeds, for example:
>
>     createAsyncTask(host.getVmList, params, timeout=30, callbackGetVmList)
>
> Instead of using the dispatcher? Do you want to keep the dispatcher concept?
>
> Have it both ways (it's auto-generated anyway) and have
>
>     list = host.getVmList()
>     task = host.getVmList_async()
>
> Have high-level and low-level interfaces.
>
>     host = host()
>     host.connect("tcp://host:3233")
>     req = host.sendRequest("123213", "getVmList", [])
>     if not req.wait(1):
>         ....
>
>     shost = SynchHost(host)
>     shost.getVmList()        # Actually wraps a request object
>     ahost = AsyncHost(host)
>     task = ahost.getVmList() # Actually wraps a request object
>
> - Task IDs will be decided by the caller. This is how json-rpc works and also
> makes sense because now the engine can track the task without needing a stage
> where we give it the task ID back. IDs are reusable as long as no one else is
> using them at the time, so they can be used for synchronizing operations
> between clients (making sure a command is only executed once on a specific
> host without locking).
>
> - Tasks are transient. If VDSM restarts it forgets all the task information.
> There are 2 ways to have persistent tasks:
> 1. The task creates an object that you can continue to work on in VDSM. The
> new storage does that by the fact that copyImage() returns once the target
> volume has been created but before the data has been fully copied. From that
> moment on the state of the copy can be queried from any host using
> getImageStatus(), and the specific copy operation can be queried with
> getTaskStatus() on the host performing it. After VDSM crashes, depending on
> policy, either VDSM will create a new task to continue the copy or someone
> else will send a command to continue the operation, and that will be a new
> task.
> 2. VDSM tasks just start other operations that are trackable not through the
> task interface. For example Gluster: gluster.startVolumeRebalance() will
> return once it has been registered with Gluster.
> gluster.getOperationStatuses() will return the state of the operation from
> any host. Each call is a task in itself.
>
> I worry about this approach because every command has a different semantic
> for checking progress. For migration, we have to check VM status on the src
> and dest hosts. For image copy we need to use a special status call on the
> dest image. It would be nice if there was a unified method for checking on an
> operation. Maybe that can be completion events.
>
>     Client:                 vdsm:
>     -------                 -----
>     Image.copy(...)  -->
>                      <--    Operation Started
>     Wait for event ...
>                      <--    Event: Operation <id> done <code>
>
> For an early error:
>
>     Client:                 vdsm:
>     -------                 -----
>     Image.copy(...)  -->
>                      <--    Error: <code>
>
> The thing is that a lot of things need a different way of tracking their
> progress. Storage has completely different semantics from network or VM
> operations. This is the reason why we can use the implementation of task as
> something generic for all processes that we have.
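
As a rough illustration of the request-object idea above (Request, SyncHost
and the other names here are hypothetical, not the generated bindings): every
call returns an object the caller can wait on and query, and a blocking facade
can be wrapped around it:

    # Rough sketch, not the actual bindings: a request/task wrapper plus a
    # blocking facade generated on top of an asynchronous host object.
    import threading

    class Request:
        def __init__(self, req_id):
            self.id = req_id                    # caller-chosen task/request id
            self._done = threading.Event()
            self._result = None
            self._error = None

        def complete(self, result=None, error=None):
            # Called by the transport layer when the matching response arrives.
            self._result, self._error = result, error
            self._done.set()

        def wait(self, timeout=None):
            return self._done.wait(timeout)     # True if finished in time

        def result(self):
            if self._error is not None:
                raise RuntimeError(self._error)
            return self._result

    class SyncHost:
        """Blocking facade over an async host whose methods return Request objects."""

        def __init__(self, async_host, timeout=30):
            self._host = async_host
            self._timeout = timeout

        def __getattr__(self, name):
            def call(*args, **kwargs):
                req = getattr(self._host, name)(*args, **kwargs)
                if not req.wait(self._timeout):
                    raise TimeoutError(name)
                return req.result()
            return call

    # vms = SyncHost(ahost).getVmList()         # blocking
    # task = ahost.getVmList()                  # async; task.wait() / task.result()
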
> Of course things need different ways of tracking their progress...
> That's why we need to use task states with meaning, and split
> storageTaskStates and networkTaskStates that inherit from TaskStates and add
> their own parts, as in the new bootstrap implementation.
> Also we can add hooks for each state, as alonbl did in his otopi code (not
> sure if we need that).
>
> Like for instance: general states can be - starting, started, finishing,
> finished - and each specific implementation adds middle states, like
> waitForResource, processing, recovering and so on.
> For each one you can add levels (pre state, post state) that can add more
> flexibility.
>
> That way the Task object will be a general way to implement a specific
> process; you will have a NetworkTask and a StorageTask, and the
> infrastructure will be the interface and implementation of the generic
> parts.
>
> So here is how vdsm can work that way:
>
>     client:                vdsm:
>     --------               ---------
>     image.copy()  --->     copyImage::starting  (same starting code - keep the id and move forward to the next state)
>                            copyImage::started   (write to the recovery file that the task has started)
>                            copyImage::part1     (whatever you want to do)
>                            copyImage::part2     (whatever you want to do)
>                            copyImage::part3     (whatever you want to do) -- for each process the programmers will add their states as they want, in a sequential flow
>     result       <------   copyImage::finishing (send the client a success and clean the recovery file)
>                            copyImage::finished  (mark the task id as succeeded)
>
> If somewhere in the middle an error occurred, it is easier to start over and
> remember where we were.
> The problem with that is that we need to modify the current implementation
> for each process, and I'm not sure if we want to get there.. but if we do, it
> won't be so hard.
> We can split the logic of each process to define the logic of each state,
> then arrange the state flow for each process and clarify what can be
> recovered or not, what signals corruption or errors, and how the returned
> result can point to the current process status (\state).
>
> - No task tags. They are silly and the caller can mangle whatever he wants
> into the task ID if he really wants to tag tasks.
>
> Yes. Agreed.
>
> - No explicit recovery stage. VDSM will be crash-only; there should be
> efforts to make everything crash-safe. If that is problematic, as in the case
> of networking, VDSM will recover on start without having a task for it.
>
> How does this work in practice for something like creating a new image from
> a template?
>
> - No clean Task: Tasks can be started by any number of hosts; this means that
> there is no way to own all tasks. There could be cases where VDSM starts
> tasks on its own and thus they have no owner at all. The caller needs to
> continually track the state of VDSM. We will have broadcast events to
> mitigate polling.
>
> If a disconnected client might have missed a completion event, it will need
> to check state. This means each async operation that changes state must
> document a procedure for checking progress of a potentially ongoing
> operation. For Image.copy, that procedure would be to look up the new image
> and check its state.
>
> - No revert. Impossible to implement safely.
>
> How do the engine folks feel about this? I am ok with it :)
>
> I don't care; unless they find a way to change the way the logic works they
> can't have it.
> The whole concept of recovery (as it is defined now) doesn't work in an HA
> cluster.
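
A hedged sketch of that generic-Task-with-subsystem-states idea (class, state
and hook names here are illustrative only, not an agreed schema): the base
class owns the shared states and the per-state hooks, and each subsystem
subclass inserts its own intermediate states:

    # Illustrative only: a generic Task with shared states and hooks, plus
    # subsystem subclasses that add their own middle states.
    class Task:
        STATES = ["starting", "started", "finishing", "finished"]

        def __init__(self, task_id):
            self.id = task_id
            self.state = self.STATES[0]

        def advance(self):
            """Move to the next state in this task's flow, firing the hooks."""
            states = self.STATES
            if self.state == states[-1]:
                return                          # already finished
            self.on_leave(self.state)           # "post state" hook
            self.state = states[states.index(self.state) + 1]
            self.on_enter(self.state)           # "pre state" hook

        # Per-state hooks, in the spirit of the otopi-style hooks mentioned above.
        def on_enter(self, state):
            pass

        def on_leave(self, state):
            pass

    class StorageTask(Task):
        # Storage adds its own middle states; entering/leaving them is where a
        # recovery file could be written and cleaned up.
        STATES = ["starting", "started", "copying", "verifying",
                  "finishing", "finished"]

    class NetworkTask(Task):
        STATES = ["starting", "started", "waitForResource", "applying",
                  "recovering", "finishing", "finished"]
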
> - No SPM\HSM tasks. SPM\SDM is no longer necessary for all domain types (only
> for one type). What used to be SPM tasks, or tasks that persist and can be
> restarted on other hosts, is talked about in the previous bullet points.
>
> A nice simplification.
>
> --
> Adam Litke <[email protected]> IBM Linux Technology Center
>
> --
> Yaniv Bronhaim.
> RedHat, Israel
> 09-7692289
> 054-7744187

_______________________________________________
vdsm-devel mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
