On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote:
> Because I started hinting about how VDSM tasks are going to look going forward
> I thought it's better I'll just write everything in an email so we can talk
> about it in context.  This is not set in stone and I'm still debating things
> myself but it's very close to being done.

Don't debate them yourself, debate them here!  Even better, propose your idea in
schema form to show exactly how a command might work.

> - Everything is asynchronous.  The nature of message based communication is
> that you can't have synchronous operations.  This is not really debatable
> because it's just how TCP\AMQP\<messaging> works.

Can you show how a traditionally synchronous command might work?  Let's take
Host.getVmList as an example.

> - Task IDs will be decided by the caller.  This is how json-rpc works and also
> makes sense because now the engine can track the task without needing to have a
> stage where we give it the task ID back.  IDs are reusable as long as no one
> else is using them at the time so they can be used for synchronizing
> operations between clients (making sure a command is only executed once on a
> specific host without locking).
>
> - Tasks are transient.  If VDSM restarts it forgets all the task information.
> There are 2 ways to have persistent tasks:
> 1. The task creates an object that you can continue work on in VDSM.  The new
> storage does that by the fact that copyImage() returns once the target volume
> has been created but before the data has been fully copied.  From that moment
> on the state of the copy can be queried from any host using getImageStatus()
> and the specific copy operation can be queried with getTaskStatus() on the
> host performing it.  After VDSM crashes, depending on policy, either VDSM will
> create a new task to continue the copy or someone else will send a command to
> continue the operation and that will be a new task.
> 2. VDSM tasks just start other operations track-able not through the task
> interface.  For example Gluster.  gluster.startVolumeRebalance() will return
> once it has been registered with Gluster.  gluster.getOperationStatuses() will
> return the state of the operation from any host.  Each call is a task in
> itself.

I worry about this approach because every command has different semantics for
checking progress.  For migration, we have to check VM status on the src and
dest hosts.  For image copy we need to use a special status call on the dest
image.  It would be nice if there was a unified method for checking on an
operation.  Maybe that can be completion events.

    Client:                         vdsm:
    -------                         -----
    Image.copy(...)         -->
                            <--     Operation Started
    Wait for event ...
                            <--     Event: Operation <id> done <code>

For an early error:

    Client:                         vdsm:
    -------                         -----
    Image.copy(...)         -->
                            <--     Error: <code>

> - No task tags.  They are silly and the caller can mangle whatever in the task
> ID if he really wants to tag tasks.

Yes.  Agreed.

> - No explicit recovery stage.  VDSM will be crash-only, there should be
> efforts to make everything crash-safe.  If that is problematic, in case of
> networking, VDSM will recover on start without having a task for it.

How does this work in practice for something like creating a new image from a
template?

> - No clean Task: Tasks can be started by any number of hosts this means that
> there is no way to own all tasks.  There could be cases where VDSM starts
> tasks on its own and thus they have no owner at all.  The caller needs to
> continually track the state of VDSM.  We will have broadcast events to
> mitigate polling.

If a disconnected client might have missed a completion event, it will need to
check state.  This means each async operation that changes state must document a
procedure for checking the progress of a potentially ongoing operation.  For
Image.copy, that procedure would be to look up the new image and check its
state.

> - No revert.  Impossible to implement safely.

How do the engine folks feel about this?  I am ok with it :)

> - No SPM\HSM tasks.  SPM\SDM is no longer necessary for all domain types (only
> for type).  What used to be SPM tasks, or tasks that persist and can be
> restarted on other hosts is talked about in previous bullet points.
>

A nice simplification.

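To make the completion-event idea a bit more concrete, here is a rough Python
sketch of what the client side might look like, including the fallback to a
state check when the event never arrives.  This is only an illustration under
my own assumptions: FakeVdsm, image_copy(), finish_copy(), image_status() and
the shape of the reply and event dictionaries are all made up for the example
and are not the real VDSM API.

```python
import queue
import uuid


class FakeVdsm:
    """Hypothetical stand-in for a host; NOT the real VDSM API."""

    def __init__(self, drop_events=False):
        self.events = queue.Queue()     # broadcast events visible to the client
        self.images = {}                # image name -> state, always queryable
        self.drop_events = drop_events  # simulate a client that missed events

    def image_copy(self, task_id, src, dst):
        if src not in self.images:
            # Early error: reported in the immediate reply, no event follows.
            return {"id": task_id, "error": "NoSuchImage"}
        self.images[dst] = "copying"
        return {"id": task_id, "result": "started"}

    def finish_copy(self, task_id, dst):
        # Demo helper that simulates the copy completing on the host.
        self.images[dst] = "ready"
        if not self.drop_events:
            self.events.put({"event": "done", "id": task_id, "code": 0})

    def image_status(self, image):
        return self.images.get(image, "missing")


def start_copy(vdsm, src, dst):
    """Send the command with a caller-chosen task ID; raise on early error."""
    task_id = str(uuid.uuid4())
    reply = vdsm.image_copy(task_id, src, dst)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return task_id


def wait_for_copy(vdsm, task_id, dst, timeout=0.1):
    """Wait for the completion event; if none arrives, check image state."""
    try:
        while True:
            event = vdsm.events.get(timeout=timeout)
            if event["id"] == task_id:
                return ("event", event["code"])
            # Events for other tasks are simply skipped in this sketch.
    except queue.Empty:
        # Event missed (or not sent yet): use the documented state check.
        return ("state", vdsm.image_status(dst))
```

The point of the sketch is that the event is only an optimization: the client
must always be able to fall back to the per-operation state check, which for
Image.copy means looking up the destination image.
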
-- 
Adam Litke <a...@us.ibm.com>
IBM Linux Technology Center