Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-12-03 Thread Adam Litke
On Mon, Dec 03, 2012 at 03:57:42PM -0500, Saggi Mizrahi wrote:
> 
> 
> - Original Message -
> > From: "Adam Litke"  To: "Saggi Mizrahi"
> >  Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg"
> > , "Federico Simoncelli" , "Ayal
> > Baron" , vdsm-devel@lists.fedorahosted.org Sent: Monday,
> > December 3, 2012 3:30:21 PM Subject: Re: RFD: API: Identifying vdsm objects
> > in the next-gen API
> > 
> > On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
> > > 
> > > 
> > > - Original Message -
> > > > From: "Adam Litke"  To: "Saggi Mizrahi"
> > > >  Cc: engine-de...@linode01.ovirt.org, "Dan
> > > > Kenigsberg" , "Federico Simoncelli"
> > > > , "Ayal Baron" ,
> > > > vdsm-devel@lists.fedorahosted.org Sent: Thursday, November 29, 2012
> > > > 5:22:43 PM Subject: Re: RFD: API: Identifying vdsm objects in the
> > > > next-gen API
> > > > 
> > > > On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> > > > > They are not future proof as the paradigm is completely different.
> > > > > Storage domain IDs are not static any more (and are not guaranteed to
> > > > > be unique or the same across the cluster.  Image IDs represent the ID
> > > > > of the projected data and not the actual unique path.  Just as an
> > > > > example, to run a VM you give a list of domains that might contain the
> > > > > needed images in the chain and the image ID of the tip.  The paradigm
> > > > > is changed to and most calls get non synchronous number of images and
> > > > > domains.  Further more, the APIs themselves are completely different.
> > > > > So future proofing is not really an issue.
> > > > 
> > > > I don't understand this at all.  Perhaps we could all use some education
> > > > on the architecture of the planned architectural changes.  If I can pass
> > > > an arbitrary list of domainIDs that _might_ contain the data, why
> > > > wouldn't I just pass all of them every time?  In that case, why are they
> > > > even required since vdsm would have to search anyway?
> > > It's for optimization mostly, the engine usually has a good idea of where
> > > stuff are, having it give hints to VDSM can speed up the search process.
> > > also, then engines knows how transient some storage pieces are. If you
> > > have a domain that is only there for backup or "owned" by another manager
> > > sharing the host, you don't want you VMs using the disks that are on that
> > > storage effectively preventing it from being removed (though we do have
> > > plans to have qemu switch base snapshots at runtime for just that).
> > 
> > This is not a clean design.  If the search is slow, then maybe we need to
> > improve caching internally.  Making a client cache a bunch of internal IDs
> > to pass around sounds like a complete layering violation to me.
> You can't cache this, if the same template exists on an 2 different NFS
> domains only the engine has enough information to know which you should use.
> We only have the engine give us thing information when starting a VM or
> merging\copying an image that resides on multiple domains.  It is also
> completely optional. I didn't like it either.

Is it even valid for the same template (with identical uuids) to exist in two
places?  I thought uuids aren't supposed to collide.  I can envision some
scenario where a cached storagedomain/storagepool relationship becomes invalid
because another user detached the storagedomain.  In that case, the API just
returns the normal error about "sd XXX is not attached to sp XXX".  So I don't
see any problem here.

> > 
> > > > 
> > > > > As to making the current API a bit simpler. As I said, making them
> > > > > opaque is problematic as currently the engine is responsible for
> > > > > creating the IDs.
> > > > 
> > > > As I mentioned in my last post, engine still can specify the ID's when
> > > > the object is first created.  From that point forward the ID never
> > > > changes so it can be baked into the identifier.
> > > Where will this identifier be persisted?
> > > > 
> > > > > Further more, some calls require you to play with these (making a
> > > > > template instead of a snapshot).  Also, the full chain and topology
> > > > > needs to be completely visible to the engine.
> > > > 
> > > > Please provide a specific example of how you play with the IDs.  I can
> > > > guess where you are going, but I don't want to divert the thread.
> > > The relationship between volumes and images is deceptive at the moment.
> > > IMG is the chain and volume is a member, IMGUUID is only used to for
> > > verification and to detect when we hit a template going up the chain.
> > > When you do operation on images assumptions are being guaranteed about the
> > > resulting IDs.  When you copy an image, you assume to know all the new IDs
> > > as they remain the same.  With your method I can't tell what the new
> > > "opaque" result is going to be.  Preview mode (another abomination being
> > > deprecated) relies on the disconnect between imgUUI

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-12-03 Thread Saggi Mizrahi


- Original Message -
> From: "Adam Litke" 
> To: "Saggi Mizrahi" 
> Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg" , 
> "Federico Simoncelli"
> , "Ayal Baron" , 
> vdsm-devel@lists.fedorahosted.org
> Sent: Monday, December 3, 2012 3:30:21 PM
> Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
> 
> On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
> > 
> > 
> > - Original Message -
> > > From: "Adam Litke"  To: "Saggi Mizrahi"
> > >  Cc: engine-de...@linode01.ovirt.org, "Dan
> > > Kenigsberg"
> > > , "Federico Simoncelli" ,
> > > "Ayal
> > > Baron" , vdsm-devel@lists.fedorahosted.org
> > > Sent:
> > > Thursday, November 29, 2012 5:22:43 PM Subject: Re: RFD: API:
> > > Identifying
> > > vdsm objects in the next-gen API
> > > 
> > > On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> > > > They are not future proof as the paradigm is completely
> > > > different.
> > > > Storage domain IDs are not static any more (and are not
> > > > guaranteed to be
> > > > unique or the same across the cluster.  Image IDs represent the
> > > > ID of the
> > > > projected data and not the actual unique path.  Just as an
> > > > example, to run
> > > > a VM you give a list of domains that might contain the needed
> > > > images in
> > > > the chain and the image ID of the tip.  The paradigm is changed
> > > > to and
> > > > most calls get non synchronous number of images and domains.
> > > >  Further
> > > > more, the APIs themselves are completely different. So future
> > > > proofing is
> > > > not really an issue.
> > > 
> > > I don't understand this at all.  Perhaps we could all use some
> > > education on
> > > the architecture of the planned architectural changes.  If I can
> > > pass an
> > > arbitrary list of domainIDs that _might_ contain the data, why
> > > wouldn't I
> > > just pass all of them every time?  In that case, why are they
> > > even required
> > > since vdsm would have to search anyway?
> > It's for optimization mostly, the engine usually has a good idea of
> > where
> > stuff are, having it give hints to VDSM can speed up the search
> > process.
> > also, then engines knows how transient some storage pieces are. If
> > you have a
> > domain that is only there for backup or "owned" by another manager
> > sharing the
> > host, you don't want you VMs using the disks that are on that
> > storage
> > effectively preventing it from being removed (though we do have
> > plans to have
> > qemu switch base snapshots at runtime for just that).
> 
> This is not a clean design.  If the search is slow, then maybe we
> need to
> improve caching internally.  Making a client cache a bunch of
> internal IDs to
> pass around sounds like a complete layering violation to me.
You can't cache this, if the same template exists on an 2 different NFS domains 
only the engine has enough information to know which you should use.
We only have the engine give us thing information when starting a VM or 
merging\copying an image that resides on multiple domains.
It is also completely optional. I didn't like it either.
> 
> > > 
> > > > As to making the current API a bit simpler. As I said, making
> > > > them opaque
> > > > is problematic as currently the engine is responsible for
> > > > creating the
> > > > IDs.
> > > 
> > > As I mentioned in my last post, engine still can specify the ID's
> > > when the
> > > object is first created.  From that point forward the ID never
> > > changes so it
> > > can be baked into the identifier.
> > Where will this identifier be persisted?
> > > 
> > > > Further more, some calls require you to play with these (making
> > > > a template
> > > > instead of a snapshot).  Also, the full chain and topology
> > > > needs to be
> > > > completely visible to the engine.
> > > 
> > > Please provide a specific example of how you play with the IDs.
> > >  I can guess
> > > where you are going, but I don't want to divert the thread.
> > The relationship between volumes and images is deceptive at the
> > moment.  IMG
> > is the chain and volume is a member, IMGUUID is only used to for
> > verification
> > and to detect when we hit a template going up the chain.  When you
> > do
> > operation on images assumptions are being guaranteed about the
> > resulting IDs.
> > When you copy an image, you assume to know all the new IDs as they
> > remain the
> > same.  With your method I can't tell what the new "opaque" result
> > is going to
> > be.  Preview mode (another abomination being deprecated) relies on
> > the
> > disconnect between imgUUID and volUUID.  Live migration currently
> > moves a lot
> > of the responsibility to the engine.
> 
> No client should need to know about all of these internal details.  I
> understand
> that's the way it is today, and that's one of the main reasons that
> the API is a
> complete pain to use.
You are correct but this is how this API was designed you can't get away from 
that.
> 
> > > 
> > > > These things, 

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-12-03 Thread Adam Litke
On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
> 
> 
> - Original Message -
> > From: "Adam Litke"  To: "Saggi Mizrahi"
> >  Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg"
> > , "Federico Simoncelli" , "Ayal
> > Baron" , vdsm-devel@lists.fedorahosted.org Sent:
> > Thursday, November 29, 2012 5:22:43 PM Subject: Re: RFD: API: Identifying
> > vdsm objects in the next-gen API
> > 
> > On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> > > They are not future proof as the paradigm is completely different.
> > > Storage domain IDs are not static any more (and are not guaranteed to be
> > > unique or the same across the cluster.  Image IDs represent the ID of the
> > > projected data and not the actual unique path.  Just as an example, to run
> > > a VM you give a list of domains that might contain the needed images in
> > > the chain and the image ID of the tip.  The paradigm is changed to and
> > > most calls get non synchronous number of images and domains.  Further
> > > more, the APIs themselves are completely different. So future proofing is
> > > not really an issue.
> > 
> > I don't understand this at all.  Perhaps we could all use some education on
> > the architecture of the planned architectural changes.  If I can pass an
> > arbitrary list of domainIDs that _might_ contain the data, why wouldn't I
> > just pass all of them every time?  In that case, why are they even required
> > since vdsm would have to search anyway?
> It's for optimization mostly, the engine usually has a good idea of where
> stuff are, having it give hints to VDSM can speed up the search process.
> also, then engines knows how transient some storage pieces are. If you have a
> domain that is only there for backup or "owned" by another manager sharing the
> host, you don't want you VMs using the disks that are on that storage
> effectively preventing it from being removed (though we do have plans to have
> qemu switch base snapshots at runtime for just that).

This is not a clean design.  If the search is slow, then maybe we need to
improve caching internally.  Making a client cache a bunch of internal IDs to
pass around sounds like a complete layering violation to me.

> > 
> > > As to making the current API a bit simpler. As I said, making them opaque
> > > is problematic as currently the engine is responsible for creating the
> > > IDs.
> > 
> > As I mentioned in my last post, engine still can specify the ID's when the
> > object is first created.  From that point forward the ID never changes so it
> > can be baked into the identifier.
> Where will this identifier be persisted?
> > 
> > > Further more, some calls require you to play with these (making a template
> > > instead of a snapshot).  Also, the full chain and topology needs to be
> > > completely visible to the engine.
> > 
> > Please provide a specific example of how you play with the IDs.  I can guess
> > where you are going, but I don't want to divert the thread.
> The relationship between volumes and images is deceptive at the moment.  IMG
> is the chain and volume is a member, IMGUUID is only used to for verification
> and to detect when we hit a template going up the chain.  When you do
> operation on images assumptions are being guaranteed about the resulting IDs.
> When you copy an image, you assume to know all the new IDs as they remain the
> same.  With your method I can't tell what the new "opaque" result is going to
> be.  Preview mode (another abomination being deprecated) relies on the
> disconnect between imgUUID and volUUID.  Live migration currently moves a lot
> of the responsibility to the engine.

No client should need to know about all of these internal details.  I understand
that's the way it is today, and that's one of the main reasons that the API is a
complete pain to use.

> > 
> > > These things, as you said, are problematic. But this is the way things are
> > > today.
> > 
> > We are changing them.
> Any intermediary step is needlessly problematic for existing clients.  Work is
> already in progress for fixing the API properly, making some calls a bit nicer
> isn't an excuse to start making more compatibility code in the engine.

The engine won't need compatibility code.  This only would impact the jsonrpc
bindings which aren't used by engine yet.  When engine switches over, then yes
it would need to adapt.

> > 
> > > As for task IDs.  Currently task IDs are only used for storage and they
> > > get persisted to disk. This is WRONG and is not the case with the new
> > > storage API.  Because we moved to an asynchronous message based protocol
> > > (json-rpc over TCP\AMQP) there is no need to generate a task ID. it is
> > > built in to json-rpc.  json-rpc specifies that the IDs have to be unique
> > > for a client as long as the request is still active.  This is good enough
> > > as internally we can have a verb for a client to query it's own running
> > > tasks and a verb to query other host tasks 

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Saggi Mizrahi


- Original Message -
> From: "Adam Litke" 
> To: "Saggi Mizrahi" 
> Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg" , 
> "Federico Simoncelli"
> , "Ayal Baron" , 
> vdsm-devel@lists.fedorahosted.org
> Sent: Thursday, November 29, 2012 5:22:43 PM
> Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
> 
> On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> > They are not future proof as the paradigm is completely different.
> >  Storage
> > domain IDs are not static any more (and are not guaranteed to be
> > unique or the
> > same across the cluster.  Image IDs represent the ID of the
> > projected data and
> > not the actual unique path.  Just as an example, to run a VM you
> > give a list
> > of domains that might contain the needed images in the chain and
> > the image ID
> > of the tip.  The paradigm is changed to and most calls get non
> > synchronous
> > number of images and domains.  Further more, the APIs themselves
> > are
> > completely different. So future proofing is not really an issue.
> 
> I don't understand this at all.  Perhaps we could all use some
> education on the
> architecture of the planned architectural changes.  If I can pass an
> arbitrary
> list of domainIDs that _might_ contain the data, why wouldn't I just
> pass all
> of them every time?  In that case, why are they even required since
> vdsm would
> have to search anyway?
It's for optimization mostly, the engine usually has a good idea of where stuff 
are, having it give hints to VDSM can speed up the search process.
also, then engines knows how transient some storage pieces are. If you have a 
domain that is only there for backup or "owned" by another manager sharing the 
host, you don't want you VMs using the disks that are on that storage 
effectively preventing it from being removed (though we do have plans to have 
qemu switch base snapshots at runtime for just that).
> 
> > As to making the current API a bit simpler. As I said, making them
> > opaque is
> > problematic as currently the engine is responsible for creating the
> > IDs.
> 
> As I mentioned in my last post, engine still can specify the ID's
> when the
> object is first created.  From that point forward the ID never
> changes so it can
> be baked into the identifier.
Where will this identifier be persisted?
> 
> > Further more, some calls require you to play with these (making a
> > template
> > instead of a snapshot).  Also, the full chain and topology needs to
> > be
> > completely visible to the engine.
> 
> Please provide a specific example of how you play with the IDs.  I
> can guess
> where you are going, but I don't want to divert the thread.
The relationship between volumes and images is deceptive at the moment.
IMG is the chain and volume is a member, IMGUUID is only used to for 
verification and to detect when we hit a template going up the chain.
When you do operation on images assumptions are being guaranteed about the 
resulting IDs. When you copy an image, you assume to know all the new IDs as 
they remain the same.
With your method I can't tell what the new "opaque" result is going to be.
Preview mode (another abomination being deprecated) relies on the disconnect 
between imgUUID and volUUID.
Live migration currently moves a lot of the responsibility to the engine.
> 
> > These things, as you said, are problematic. But this is the way
> > things are
> > today.
> 
> We are changing them.
Any intermediary step is needlessly problematic for existing clients.
Work is already in progress for fixing the API properly, making some calls a 
bit nicer isn't an excuse to start making more compatibility code in the engine.
> 
> > As for task IDs.  Currently task IDs are only used for storage and
> > they get
> > persisted to disk. This is WRONG and is not the case with the new
> > storage API.
> > Because we moved to an asynchronous message based protocol
> > (json-rpc over
> > TCP\AMQP) there is no need to generate a task ID. it is built in to
> > json-rpc.
> > json-rpc specifies that the IDs have to be unique for a client as
> > long as the
> > request is still active.  This is good enough as internally we can
> > have a verb
> > for a client to query it's own running tasks and a verb to query
> > other host
> > tasks by mangling in the client before the ID.  Because the
> > protocol is
> 
> So this would rely on the client keeping the connection open and as
> soon as it
> disconnects it would lose the ability to query tasks from before the
> connection
> went down?  I don't know if it's a good idea to conflate message ID's
> with task
> ID's.  While the protocol can operate asynchronously, some calls have
> synchronous semantics and others have asynchronous semantics.  I
> would expect
> sync calls to return their data immediately and async calls to return
> immediately with either: an error code, or an 'operation started'
> message and
> associated ID for querying the status of the operation.
Upon reflection 

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Adam Litke
On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
> They are not future proof as the paradigm is completely different.  Storage
> domain IDs are not static any more (and are not guaranteed to be unique or the
> same across the cluster.  Image IDs represent the ID of the projected data and
> not the actual unique path.  Just as an example, to run a VM you give a list
> of domains that might contain the needed images in the chain and the image ID
> of the tip.  The paradigm is changed to and most calls get non synchronous
> number of images and domains.  Further more, the APIs themselves are
> completely different. So future proofing is not really an issue.

I don't understand this at all.  Perhaps we could all use some education on the
architecture of the planned architectural changes.  If I can pass an arbitrary
list of domainIDs that _might_ contain the data, why wouldn't I just pass all
of them every time?  In that case, why are they even required since vdsm would
have to search anyway?

> As to making the current API a bit simpler. As I said, making them opaque is
> problematic as currently the engine is responsible for creating the IDs.

As I mentioned in my last post, engine still can specify the ID's when the
object is first created.  From that point forward the ID never changes so it can
be baked into the identifier.

> Further more, some calls require you to play with these (making a template
> instead of a snapshot).  Also, the full chain and topology needs to be
> completely visible to the engine.

Please provide a specific example of how you play with the IDs.  I can guess
where you are going, but I don't want to divert the thread.

> These things, as you said, are problematic. But this is the way things are
> today.

We are changing them.

> As for task IDs.  Currently task IDs are only used for storage and they get
> persisted to disk. This is WRONG and is not the case with the new storage API.
> Because we moved to an asynchronous message based protocol (json-rpc over
> TCP\AMQP) there is no need to generate a task ID. it is built in to json-rpc.
> json-rpc specifies that the IDs have to be unique for a client as long as the
> request is still active.  This is good enough as internally we can have a verb
> for a client to query it's own running tasks and a verb to query other host
> tasks by mangling in the client before the ID.  Because the protocol is

So this would rely on the client keeping the connection open and as soon as it
disconnects it would lose the ability to query tasks from before the connection
went down?  I don't know if it's a good idea to conflate message ID's with task
ID's.  While the protocol can operate asynchronously, some calls have
synchronous semantics and others have asynchronous semantics.  I would expect
sync calls to return their data immediately and async calls to return
immediately with either: an error code, or an 'operation started' message and
associated ID for querying the status of the operation.

> asynchronous all calls are asynchronous by nature well.  Tasks will no longer
> be persisted or expected to be persisted. It's the callers responsibility to
> query the state and see if the operation succeeded or failed if the caller or
> VDSM died in the middle of the call. The current "cleanTask()" system can't be
> used when more then one client is using VDSM and will not be used for anything
> other then legacy storage.

I agree about not persisting tasks in the future.  Although I think finished
tasks should remain in memory for some time so they can be queried by a client
who must reconnect.

> AFAIK Apart from storage all objects IDs are constructed with a single ID,
> name or alias. VMs, storageConnections, network interfaces. So it's not a real
> issue.  I agree that in the future we should keep the idiom of pass
> configuration once, name it, and keep using the name to reference the object.

Yes, storage is the major problem here.

> - Original Message -
> > From: "Adam Litke" 
> > To: "Saggi Mizrahi" 
> > Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg" , 
> > "Federico Simoncelli"
> > , "Ayal Baron" , 
> > vdsm-devel@lists.fedorahosted.org
> > Sent: Thursday, November 29, 2012 4:18:40 PM
> > Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
> > 
> > On Thu, Nov 29, 2012 at 02:16:42PM -0500, Saggi Mizrahi wrote:
> > > This is all only valid for the current storage API the new one
> > > doesn't have
> > > pools or volumes. Only domains and images.  Also, images and
> > > domains are more
> > > loosely coupled and make this method problematic.
> > 
> > I am looking for an incremental way to bridge the differences.  It's
> > been 2
> > years and we still don't have the revamped storage API so I am
> > planning on what
> > we have being around for awhile :)  I think that defining object
> > identifiers as
> > opaque structured types is also future proof.  In the future an
> > Image-ng object
> > we can drop 's

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Saggi Mizrahi
They are not future proof as the paradigm is completely different.
Storage domain IDs are not static any more (and are not guaranteed to be unique 
or the same across the cluster.
Image IDs represent the ID of the projected data and not the actual unique path.
Just as an example, to run a VM you give a list of domains that might contain 
the needed images in the chain and the image ID of the tip.
The paradigm is changed to and most calls get non synchronous number of images 
and domains.
Further more, the APIs themselves are completely different. So future proofing 
is not really an issue.

As to making the current API a bit simpler. As I said, making them opaque is 
problematic as currently the engine is responsible for creating the IDs.
Further more, some calls require you to play with these (making a template 
instead of a snapshot).
Also, the full chain and topology needs to be completely visible to the engine.

These things, as you said, are problematic. But this is the way things are 
today.

As for task IDs.
Currently task IDs are only used for storage and they get persisted to disk. 
This is WRONG and is not the case with the new storage API.
Because we moved to an asynchronous message based protocol (json-rpc over 
TCP\AMQP) there is no need to generate a task ID. it is built in to json-rpc.
json-rpc specifies that the IDs have to be unique for a client as long as the 
request is still active.
This is good enough as internally we can have a verb for a client to query it's 
own running tasks and a verb to query other host tasks by mangling in the 
client before the ID.
Because the protocol is asynchronous all calls are asynchronous by nature well.
Tasks will no longer be persisted or expected to be persisted. It's the callers 
responsibility to query the state and see if the operation succeeded or failed 
if the caller or VDSM died in the middle of the call. The current "cleanTask()" 
system can't be used when more then one client is using VDSM and will not be 
used for anything other then legacy storage.

AFAIK Apart from storage all objects IDs are constructed with a single ID, name 
or alias. VMs, storageConnections, network interfaces. So it's not a real issue.
I agree that in the future we should keep the idiom of pass configuration once, 
name it, and keep using the name to reference the object.

- Original Message -
> From: "Adam Litke" 
> To: "Saggi Mizrahi" 
> Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg" , 
> "Federico Simoncelli"
> , "Ayal Baron" , 
> vdsm-devel@lists.fedorahosted.org
> Sent: Thursday, November 29, 2012 4:18:40 PM
> Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
> 
> On Thu, Nov 29, 2012 at 02:16:42PM -0500, Saggi Mizrahi wrote:
> > This is all only valid for the current storage API the new one
> > doesn't have
> > pools or volumes. Only domains and images.  Also, images and
> > domains are more
> > loosely coupled and make this method problematic.
> 
> I am looking for an incremental way to bridge the differences.  It's
> been 2
> years and we still don't have the revamped storage API so I am
> planning on what
> we have being around for awhile :)  I think that defining object
> identifiers as
> opaque structured types is also future proof.  In the future an
> Image-ng object
> we can drop 'storagepoolID' from the identifier and, if it makes
> sense, remove
> the hard association with a storageDomain as well.  The point behind
> this
> refactoring is to give us the option of coupling multiple UUID's (or
> other data)
> to form a single, opaque identifier.
> 
> > That being said, if we do choose to make the current storage API
> > officially
> > supported I do agree that it looks a bit simpler but for the price
> > of forcing
> > the user to construct these objects before sending the request. I
> > know for a
> > fact that the engine will just create these objects on the fly
> > because they
> > use their own objects to group things logically. This means adding
> > more work
> > instead of removing it.  Most clients will do that anyway as they
> > will use
> > their own DAL to store these relationships.
> > 
> 
> Thanks for bringing up some of these points.  All deserve attention
> so I will
> address each one individually:
> 
> The current API does not yet make an official statement of support
> for anything.
> I want to model the current storage API so that the node level API
> can have the
> same level of functionality as is currently supported.  I am all for
> removing
> deprecated functions and redesigning in-place for a reasonable amount
> of time
> going forward.  In a perfect world, libvdsm-1.0 would release with no
> mention of
> storage pools at all.
> 
> If properly designed, the end-user (including engine) would never be
> constructing these objects itself.  Object identifiers are
> essentially opaque
> structures.  In order to make this possible, we need to make sure
> that the API
> provides all of the functions needed t

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Adam Litke
On Thu, Nov 29, 2012 at 02:16:42PM -0500, Saggi Mizrahi wrote:
> This is all only valid for the current storage API the new one doesn't have
> pools or volumes. Only domains and images.  Also, images and domains are more
> loosely coupled and make this method problematic.

I am looking for an incremental way to bridge the differences.  It's been 2
years and we still don't have the revamped storage API so I am planning on what
we have being around for awhile :)  I think that defining object identifiers as
opaque structured types is also future proof.  In the future an Image-ng object
we can drop 'storagepoolID' from the identifier and, if it makes sense, remove
the hard association with a storageDomain as well.  The point behind this
refactoring is to give us the option of coupling multiple UUID's (or other data)
to form a single, opaque identifier.

> That being said, if we do choose to make the current storage API officially
> supported I do agree that it looks a bit simpler but for the price of forcing
> the user to construct these objects before sending the request. I know for a
> fact that the engine will just create these objects on the fly because they
> use their own objects to group things logically. This means adding more work
> instead of removing it.  Most clients will do that anyway as they will use
> their own DAL to store these relationships. 
> 

Thanks for bringing up some of these points.  All deserve attention so I will
address each one individually:

The current API does not yet make an official statement of support for anything.
I want to model the current storage API so that the node level API can have the
same level of functionality as is currently supported.  I am all for removing
deprecated functions and redesigning in-place for a reasonable amount of time
going forward.  In a perfect world, libvdsm-1.0 would release with no mention of
storage pools at all.

If properly designed, the end-user (including engine) would never be
constructing these objects itself.  Object identifiers are essentially opaque
structures.  In order to make this possible, we need to make sure that the API
provides all of the functions needed to lookup objects.  So far these are:

StoragePool:Host.getConnectedStoragePools
StorageDomain:  Host.getStorageDomains
Image:  StorageDomain.getImages
Volume: StorageDomain.getVolumes / Image.getVolumes
VM: Host.getVMList
Task:   Host.getAllTasks

All of the above would return object identifiers.

The other case is for creation of new resources.  In that case, the create
method needs to move to the owning object.  The key example is VM.create which
should move to Host.createVM.  Functions such as Host.createVM could still
accept a vmUUID (because I assume engine does want the ability to set this
explicitly).  However, they should be changed to return either a TaskIdentifier
(if the creation is asynchronous) or a *Identifier (eg. VmIdentifier) if the
object was created synchronously.

All told, I think the net is less work for clients.  They will no longer need to
model the object associations and relationships because the API will take care
of that automatically.

> - Original Message -
> > From: "Adam Litke" 
> > To: vdsm-devel@lists.fedorahosted.org
> > Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg" , 
> > "Federico Simoncelli"
> > , "Saggi Mizrahi" , "Ayal Baron" 
> > 
> > Sent: Thursday, November 29, 2012 12:19:06 PM
> > Subject: RFD: API: Identifying vdsm objects in the next-gen API
> > 
> > Today in vdsm, every object (StoragePool, StorageDomain, VM, Volume,
> > etc) is
> > identified by a single UUID.  On the surface, it seems like this is
> > enough info
> > to properly identify a resource but in practice it's not.  For
> > example, when you
> > look at the API's dealing with Volumes, almost all of them require an
> > sdUUID,
> > spUUID, and imgUUID in order to provide proper context for the
> > operation.
> > 
> > Needing to provide these extra UUIDs is a burden on the API user
> > because knowing
> > which values to pass requires internal knowledge of the API.  For
> > example, the
> > spUUID parameter is almost always just the connected storage pool.
> >  Since we
> > know there can currently be only one connected pool, the value is
> > known.
> > 
> > I would like to move away from needing to understand all of these
> > relationships
> > from the end user perspective by encapsulating the extra context into
> > new object
> > identifier types as follows:
> > 
> > StoragePoolIdentifier:
> > { 'storagepoolID': 'UUID' }
> > StorageDomainIdentifier:
> > { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID' }
> > ImageIdentifier:
> > { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID', 'imageID':
> > 'UUID' }
> > VolumeIdentifier:
> > { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID',
> >   'imageID': 'UUID', 'volumeID': 'UUID' }
> > TaskIdentifier:
> > { 'taskID': 'UUID' }
> 

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Saggi Mizrahi
This is all only valid for the current storage API
the new one doesn't have pools or volumes. Only domains and images.
Also, images and domains are more loosely coupled and make this method 
problematic.

That being said, if we do choose to make the current storage API officially 
supported I do agree that it looks a bit simpler but for the price of forcing 
the user to construct these objects before sending the request. I know for a 
fact that the engine will just create these objects on the fly because they use 
their own objects to group things logically. This means adding more work 
instead of removing it.
Most clients will do that anyway as they will use their own DAL to store these 
relationships. 

- Original Message -
> From: "Adam Litke" 
> To: vdsm-devel@lists.fedorahosted.org
> Cc: engine-de...@linode01.ovirt.org, "Dan Kenigsberg" , 
> "Federico Simoncelli"
> , "Saggi Mizrahi" , "Ayal Baron" 
> 
> Sent: Thursday, November 29, 2012 12:19:06 PM
> Subject: RFD: API: Identifying vdsm objects in the next-gen API
> 
> Today in vdsm, every object (StoragePool, StorageDomain, VM, Volume,
> etc) is
> identified by a single UUID.  On the surface, it seems like this is
> enough info
> to properly identify a resource but in practice it's not.  For
> example, when you
> look at the API's dealing with Volumes, almost all of them require an
> sdUUID,
> spUUID, and imgUUID in order to provide proper context for the
> operation.
> 
> Needing to provide these extra UUIDs is a burden on the API user
> because knowing
> which values to pass requires internal knowledge of the API.  For
> example, the
> spUUID parameter is almost always just the connected storage pool.
>  Since we
> know there can currently be only one connected pool, the value is
> known.
> 
> I would like to move away from needing to understand all of these
> relationships
> from the end user perspective by encapsulating the extra context into
> new object
> identifier types as follows:
> 
> StoragePoolIdentifier:
> { 'storagepoolID': 'UUID' }
> StorageDomainIdentifier:
> { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID' }
> ImageIdentifier:
> { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID', 'imageID':
> 'UUID' }
> VolumeIdentifier:
> { 'storagepoolID*': 'UUID', 'storagedomainID': 'UUID',
>   'imageID': 'UUID', 'volumeID': 'UUID' }
> TaskIdentifier:
> { 'taskID': 'UUID' }
> VMIdentifier:
> { 'vmID': 'UUID' }
> 
> In the new API, anytime a reference to an object is required, one of
> the above
> structures must be passed in place of today's single UUID.  In many
> cases, this
> will allow us to reduce the number of parameters to the function
> since the
> needed contextual parameters (spUUID, etc) will be part of the
> object's
> identifier.  Similarly, any time the API returns an object reference
> it would
> return a *Identifier instead of a bare UUID.
> 
> These identifier types are basically opaque blobs to the API users
> and are only
> ever generated by vdsm itself.  Because of this, we can change the
> internal
> structure of the identifier to require new information or (before
> freezing the
> API) remove fields that no longer make sense.
> 
> I would greatly appreciate your comments on this proposal.  If it
> seems
> reasonable, I will revamp the current schema to make the necessary
> changes and
> provide the Bridge patch functions to convert between the current
> implementation
> and the new schema.
> 
> --- sample schema patch ---
> 
> commit 48f6b0f0a111dd0b372d211a4e566ce87f375cee
> Author: Adam Litke 
> Date:   Tue Nov 27 14:14:06 2012 -0600
> 
> schema: Introduce class identifier types
> 
> When calling API methods that belong to a particular class, a
> class instance
> must be indicated by passing a set of identifiers in the request.
>  The location
> of these parameters within the request is: 'params' -> '__obj__'.
>  Since this
> set of identifiers must be used together to correctly instantiate
> an object, it
> makes sense to define these as proper types within the API.
>  Then, functions
> that return an object (or list of objects) can refer to the
> correct type.
> 
> Signed-off-by: Adam Litke 
> 
> diff --git a/vdsm_api/vdsmapi-schema.json
> b/vdsm_api/vdsmapi-schema.json
> index 0418e6e..7e2e851 100644
> --- a/vdsm_api/vdsmapi-schema.json
> +++ b/vdsm_api/vdsmapi-schema.json
> @@ -937,7 +937,7 @@
>  # Since: 4.10.0
>  ##
>  {'command': {'class': 'Host', 'name': 'getConnectedStoragePools'},
> - 'returns': ['StoragePool']}
> + 'returns': ['StoragePoolIdentifier']}
>  
>  ##
>  # @BlockDeviceType:
> @@ -1572,7 +1572,7 @@
>  {'command': {'class': 'Host', 'name': 'getStorageDomains'},
>   'data': {'*storagepoolID': 'UUID', '*domainClass':
>   'StorageDomainImageClass',
>'*storageType': 'StorageDomainType', '*remotePath':
>'str'},
> - 'returns': ['StorageDomain']}
> + 'returns': ['StorageDomainId