----- Original Message -----
> From: "Adam Litke" <a...@us.ibm.com>
> To: "Saggi Mizrahi" <smizr...@redhat.com>
> Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg" <dan...@redhat.com>,
> "Federico Simoncelli"
> <fsimo...@redhat.com>, "Ayal Baron" <aba...@redhat.com>,
> Sent: Monday, December 3, 2012 3:30:21 PM
> Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
> On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
> > ----- Original Message -----
> > > From: "Adam Litke" <a...@us.ibm.com> To: "Saggi Mizrahi"
> > > <smizr...@redhat.com> Cc: engine-de...@linode01.ovirt.org, "Dan
> > > Kenigsberg"
> > > <dan...@redhat.com>, "Federico Simoncelli" <fsimo...@redhat.com>,
> > > "Ayal
> > > Baron" <aba...@redhat.com>, email@example.com
> > > Sent:
> > > Thursday, November 29, 2012 5:22:43 PM Subject: Re: RFD: API:
> > > Identifying
> > > vdsm objects in the next-gen API
> > >
> > > On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> > > > They are not future proof as the paradigm is completely
> > > > different.
> > > > Storage domain IDs are not static any more (and are not
> > > > guaranteed to be
> > > > unique or the same across the cluster. Image IDs represent the
> > > > ID of the
> > > > projected data and not the actual unique path. Just as an
> > > > example, to run
> > > > a VM you give a list of domains that might contain the needed
> > > > images in
> > > > the chain and the image ID of the tip. The paradigm is changed
> > > > to and
> > > > most calls get non synchronous number of images and domains.
> > > > Further
> > > > more, the APIs themselves are completely different. So future
> > > > proofing is
> > > > not really an issue.
> > >
> > > I don't understand this at all. Perhaps we could all use some
> > > education on
> > > the architecture of the planned architectural changes. If I can
> > > pass an
> > > arbitrary list of domainIDs that _might_ contain the data, why
> > > wouldn't I
> > > just pass all of them every time? In that case, why are they
> > > even required
> > > since vdsm would have to search anyway?
> > It's for optimization mostly, the engine usually has a good idea of
> > where
> > stuff are, having it give hints to VDSM can speed up the search
> > process.
> > also, then engines knows how transient some storage pieces are. If
> > you have a
> > domain that is only there for backup or "owned" by another manager
> > sharing the
> > host, you don't want you VMs using the disks that are on that
> > storage
> > effectively preventing it from being removed (though we do have
> > plans to have
> > qemu switch base snapshots at runtime for just that).
> This is not a clean design. If the search is slow, then maybe we
> need to
> improve caching internally. Making a client cache a bunch of
> internal IDs to
> pass around sounds like a complete layering violation to me.
You can't cache this, if the same template exists on an 2 different NFS domains
only the engine has enough information to know which you should use.
We only have the engine give us thing information when starting a VM or
merging\copying an image that resides on multiple domains.
It is also completely optional. I didn't like it either.
> > >
> > > > As to making the current API a bit simpler. As I said, making
> > > > them opaque
> > > > is problematic as currently the engine is responsible for
> > > > creating the
> > > > IDs.
> > >
> > > As I mentioned in my last post, engine still can specify the ID's
> > > when the
> > > object is first created. From that point forward the ID never
> > > changes so it
> > > can be baked into the identifier.
> > Where will this identifier be persisted?
> > >
> > > > Further more, some calls require you to play with these (making
> > > > a template
> > > > instead of a snapshot). Also, the full chain and topology
> > > > needs to be
> > > > completely visible to the engine.
> > >
> > > Please provide a specific example of how you play with the IDs.
> > > I can guess
> > > where you are going, but I don't want to divert the thread.
> > The relationship between volumes and images is deceptive at the
> > moment. IMG
> > is the chain and volume is a member, IMGUUID is only used to for
> > verification
> > and to detect when we hit a template going up the chain. When you
> > do
> > operation on images assumptions are being guaranteed about the
> > resulting IDs.
> > When you copy an image, you assume to know all the new IDs as they
> > remain the
> > same. With your method I can't tell what the new "opaque" result
> > is going to
> > be. Preview mode (another abomination being deprecated) relies on
> > the
> > disconnect between imgUUID and volUUID. Live migration currently
> > moves a lot
> > of the responsibility to the engine.
> No client should need to know about all of these internal details. I
> that's the way it is today, and that's one of the main reasons that
> the API is a
> complete pain to use.
You are correct but this is how this API was designed you can't get away from
> > >
> > > > These things, as you said, are problematic. But this is the way
> > > > things are
> > > > today.
> > >
> > > We are changing them.
> > Any intermediary step is needlessly problematic for existing
> > clients. Work is
> > already in progress for fixing the API properly, making some calls
> > a bit nicer
> > isn't an excuse to start making more compatibility code in the
> > engine.
> The engine won't need compatibility code. This only would impact the
> bindings which aren't used by engine yet. When engine switches over,
> then yes
> it would need to adapt.
This means that you put it as a condition to adapting the json-rpc base API
which will postpone engine adoption significantly.
> > >
> > > > As for task IDs. Currently task IDs are only used for storage
> > > > and they
> > > > get persisted to disk. This is WRONG and is not the case with
> > > > the new
> > > > storage API. Because we moved to an asynchronous message based
> > > > protocol
> > > > (json-rpc over TCP\AMQP) there is no need to generate a task
> > > > ID. it is
> > > > built in to json-rpc. json-rpc specifies that the IDs have to
> > > > be unique
> > > > for a client as long as the request is still active. This is
> > > > good enough
> > > > as internally we can have a verb for a client to query it's own
> > > > running
> > > > tasks and a verb to query other host tasks by mangling in the
> > > > client
> > > > before the ID. Because the protocol is
> > >
> > > So this would rely on the client keeping the connection open and
> > > as soon as
> > > it disconnects it would lose the ability to query tasks from
> > > before the
> > > connection went down? I don't know if it's a good idea to
> > > conflate message
> > > ID's with task ID's. While the protocol can operate
> > > asynchronously, some
> > > calls have synchronous semantics and others have asynchronous
> > > semantics. I
> > > would expect sync calls to return their data immediately and
> > > async calls to
> > > return immediately with either: an error code, or an 'operation
> > > started'
> > > message and associated ID for querying the status of the
> > > operation.
> > Upon reflection I agree that having the request ID unique per
> > client is
> > problematic and we need to make sure they are unique per host at
> > every point
> > in time.
> > >
> > > > asynchronous all calls are asynchronous by nature well. Tasks
> > > > will no
> > > > longer be persisted or expected to be persisted. It's the
> > > > callers
> > > > responsibility to query the state and see if the operation
> > > > succeeded or
> > > > failed if the caller or VDSM died in the middle of the call.
> > > > The current
> > > > "cleanTask()" system can't be used when more then one client is
> > > > using VDSM
> > > > and will not be used for anything other then legacy storage.
> > >
> > > I agree about not persisting tasks in the future. Although I
> > > think finished
> > > tasks should remain in memory for some time so they can be
> > > queried by a
> > > client who must reconnect.
> > I am completely against keeping the task for a nominal amount of
> > time, it just
> > makes another flow. You need to have code that makes up in case
> > you missed
> > that window any way then just have one recovery code path, when
> > VDSM looses
> > you task or you lose VDSM recover immediately. Also, because task
> > IDs can be
> > reused once they expire assuming that the task you encountered is
> > the same
> > task you originally sent is problematic.
> > If you expect intermittent connections use the AMQP backend (which
> > will
> > support broker-less p2p communication as well)
> HOw will you tell the difference between a completed task and an
> invalid (no
> such task) task? Do all completed tasks just issue a task completed
When task complete they return a value (and might even send a task completed
If you lost that task you will just have to handle that by inspecting the
As I said, you will have to do that anyway if you put a "sleep" on finished
task and you missed the window or you are tracking a task that is owned by
another VDSM user.
If we have to handle crash situations might as well be crash only.
> > >
> > > > AFAIK Apart from storage all objects IDs are constructed with a
> > > > single ID,
> > > > name or alias. VMs, storageConnections, network interfaces. So
> > > > it's not a
> > > > real issue. I agree that in the future we should keep the
> > > > idiom of pass
> > > > configuration once, name it, and keep using the name to
> > > > reference the
> > > > object.
> > >
> > > Yes, storage is the major problem here.
> > And, as I said, changing the API is problematic for migration of
> > current
> > users.
> > >
> Adam Litke <a...@us.ibm.com>
> IBM Linux Technology Center
vdsm-devel mailing list