[vdsm] Fedora, udev and nic renaming

2012-12-04 Thread Antoni Segura Puimedon
Hi,

We are currently working on stabilizing the networking part of vdsm in Fedora
18 and, to achieve that purpose, we decided to test it both on physical hosts
and, for extra convenience and better support, in VMs.

Due to the move of Fedora 17 and 18 to systemd and newer udev versions, we
encountered some issues that should be noted and worked on to provide our users
with a hassle-free experience.

First of all, let me state what happens (in renaming) in RHEL-6.x when a new
ethernet device is handled by udev:
a) One or more udev rules match the characteristics of the interface: The last
matching rule is applied.
b) No rule matches: /lib/udev/write-net-rules writes a permanent
rule using the MAC address of the interface to a udev rules file, so the
interface name will be permanent and stay in the ethX namespace.
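
For illustration, the kind of rule that write-net-rules generates (typically
in /etc/udev/rules.d/70-persistent-net.rules) looks roughly like this; the MAC
address and resulting name are made up and some match keys are omitted:

SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="52:54:00:aa:bb:cc", KERNEL=="eth*", NAME="eth0"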

In Fedora 17 (and even more so in F18), with the move to a newer version of
udev and, especially, with the change from SysV init to systemd, the mechanism
changed. Since systemd parallelizes the boot process, some changes had to be
made in udev to keep the renaming working:

- To avoid naming collisions, it was decided to use Dell's biosdevname software
  to retrieve a device name for the network interfaces: typically emX for
  onboard NICs and pXpY for PCI-connected NICs.
- Devices for which biosdevname could not provide any information are assigned
  a name in the ethX namespace in a first-come, first-served fashion.
- Optionally, one can define the interface MAC address in an ifcfg file
  and /lib/udev/rename-device will look into the ifcfg file and assign the
  device name set there (I have not yet succeeded in making that work, I guess
  I have to investigate more; see the example below).
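
For reference, this is roughly the kind of ifcfg file I have been trying (the
interface name and MAC address are made up), assuming rename-device matches
the interface by HWADDR and renames it to DEVICE:

# /etc/sysconfig/network-scripts/ifcfg-net0
DEVICE=net0
HWADDR=52:54:00:aa:bb:cc
ONBOOT=yes
BOOTPROTO=none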

As you can see, biosdevname never reports names in the ethX namespace; this
avoids colliding with an interface it cannot recognize, discovered in parallel,
to which the kernel could already have assigned a name in that namespace.

For physical machines this approach works fine. However, for virtual machines
with more than one NIC, the automatic process described above presents some
issues. Due to the different ways the virtualization hypervisors report the
vNICs, biosdevname dropped support for VMs in 0.3.7 (F18 uses 0.4.1-2) and
decided that on VMs it would just return 4, telling udev to fall back to the
kernel's first-come, first-served naming for those interfaces (the ethX
namespace).

The issue with first-come, first-served is that, due to the now highly
parallelized boot, it is very common for the names of your devices (as
identified by MAC address) to be permuted on each reboot. Here is an example:

NOTE: The libvirt dump of the VM reports the same PCI address for each
interface across reboots.

Boot 0 (Nov 13th 14:59)
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:54:85:57  txqueuelen 1000  (Ethernet)
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:77:45:6b  txqueuelen 1000  (Ethernet)
eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:ca:41:c7  txqueuelen 1000  (Ethernet)
eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:f5:3d:c8  txqueuelen 1000  (Ethernet)
eth4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:5e:10:76  txqueuelen 1000  (Ethernet)
eth5: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:95:00:93  txqueuelen 1000  (Ethernet)

Boot 1 (Nov 13th 15:01)
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:ca:41:c7  txqueuelen 1000  (Ethernet)
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:54:85:57  txqueuelen 1000  (Ethernet)
eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:77:45:6b  txqueuelen 1000  (Ethernet)
eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:f5:3d:c8  txqueuelen 1000  (Ethernet)
eth4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:5e:10:76  txqueuelen 1000  (Ethernet)
eth5: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
ether 52:54:00:95:00:93  txqueuelen 1000  (Ethernet)

As you can see, after rebooting:
eth0 -> eth1
eth1 -> eth2
eth2 -> eth0

This is an issue if different vNICs are connected to different networks or for
whatever reason require distinct configuration. To solve this issue, on the
guest there are three options:

- Assign somebody with BIOS knowledge to add KVM guest support to biosdevname
  so we can use the emX/pXpY namespace and maintain a native-like experience in
  the VMs. My intuition is that it could just report pXpY, where X is the bus
  and Y is the slot. This is the preferred option.
- Use libguestfs for setting udev rules using the MAC addresses we know from
  the VM definition in the netX namespace (I have been told that it is not
  

Re: [vdsm] Fedora, udev and nic renaming

2012-12-04 Thread Alon Bar-Lev

Thanks for this verbose description.

I don't think using libguestfs is the solution for this.

Fixing qemu to accept a BIOS interface name in the -net parameter is
preferable. I don't think we should expose the interface as a PCI device, as
that will have some drawbacks, but rather attempt to use the onboard naming
convention.

Alon

- Original Message -
 From: Antoni Segura Puimedon asegu...@redhat.com
 To: vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, December 4, 2012 11:08:31 AM
 Subject: [vdsm] Fedora, udev and nic renaming
 
 Hi,
 
 We are currently working on stabilizing the networking part of vdsm
 in Fedora
 18 and, to achieve that purpose, we decided to test in in both
 physical hosts
 and, for extra convenience and better support, also in VMs.
 
 Due to the move of Fedora 17 and 18 to systemd and newer udev
 versions, we
 encountered some issues that should be noted and worked on to provide
 our users
 with a hassle-free experience.
 
 First of all, let me state what happens (in renaming) in RHEL-6.x
 when a new
 ethernet device is handled by udev:
 a) One or more udev rules match the characteristics of the interface:
 The last
 matching rule is applied.
 b) No rule is matching: /lib/udev/write-net-rules writes a permanent
 rule using the MAC address of the interface in a udev rules file, so
 the
 interface name will be permanent and in the ethX namespace.
 
 In Fedora 17 (but even more so in F18), with the move to a newer
 version of
 udev and, specially, with the change from sysV init to systemd, the
 mechanism
 changed. Since systemd is making the boot happen in a parallelized
 way, some
 changes had to be enforced in udev to keep the renaming working:
 
 - To avoid naming collisions, it was decided to use Dell's
 biosdevname software
   to retrieve a device name for the network interfaces. Typically emX
   for
   onboard nics and pXpY for pci connected nics.
 - For devices which biosdevname could not provide any information, it
 was
   agreed to assign them a name in the ethX space in a first-come,
   first-served
   fashion.
 - Optionally, one could define the interace MAC addr in an ifcfg file
   and /lib/udev/rename-device would look into the ifcfg file and
   assign
   the device name there set (I have not yet succeeded in that part, I
   have to
   investigate more, I guess).
 
 As you can see, biosdevname, never reports names in the eth space to
 avoid
 collision with a potential parallel discovery of an interface not
 recognizable
 by it, to which the kernel could have assigned already a bios
 reported name.
 
 For physical machines this approach works fine. However, for Virtual
 machines
 with more than one nic, the automatic process described above
 presents some
 issues. Biosdevname, due to the different ways the virtualization
 hypervisors
 report the vnics, dropped support for VMs in 0.3.7 (F18 uses 0.4.1-2)
 and
 decided that on VMs, it would just return 4 to indicate to udev to
 use
 kernel first-come, first-served for those interfaces (ethX
 namespace).
 
 The issue with using first-come first-served, is that due to the
 highly
 parallelized boot there is now, it is very common to encounter that
 the names
 of your devices (as identified by MAC address) suffer a permutation
 upon each
 reboot. Here you can see an example:
 
 NOTE: The libvirt dump of the VM reports the same PCI address for
 each
 interface across reboots.
 
 Boot 0 (Nov 13th 14:59)
 eth0: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:54:85:57  txqueuelen 1000  (Ethernet)
 eth1: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:77:45:6b  txqueuelen 1000  (Ethernet)
 eth2: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:ca:41:c7  txqueuelen 1000  (Ethernet)
 eth3: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:f5:3d:c8  txqueuelen 1000  (Ethernet)
 eth4: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:5e:10:76  txqueuelen 1000  (Ethernet)
 eth5: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:95:00:93  txqueuelen 1000  (Ethernet)
 
 Boot 1 (Nov 13th 15:01)
 eth0: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:ca:41:c7  txqueuelen 1000  (Ethernet)
 eth1: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:54:85:57  txqueuelen 1000  (Ethernet)
 eth2: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:77:45:6b  txqueuelen 1000  (Ethernet)
 eth3: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:f5:3d:c8  txqueuelen 1000  (Ethernet)
 eth4: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:5e:10:76  txqueuelen 1000  (Ethernet)
 eth5: flags=4163UP,BROADCAST,RUNNING,MULTICAST  mtu 1500
 ether 52:54:00:95:00:93  txqueuelen 1000  (Ethernet)
 
 As you can see, after rebooting:
 eth0 - eth1
 

[vdsm] API.py validation

2012-12-04 Thread Antoni Segura Puimedon
Hi all,

I am currently working on adding a new feature to vdsm which requires a new
entry point, thus requiring:
- Parameter definitions in vdsm_api/vdsmapi-schema.json
- Implementation and checks in vdsm/API.py and other modules.

Typically, we check for the presence/absence of required/optional parameters in
API.py using utils.validateMinimalKeySet or just if/else clauses. I think this
process could benefit from a more automatic and less duplicated effort, i.e.,
parsing vdsmapi-schema.json in a similar way to what process-schema.py does, to
build a memoized method that can check whether an API call is correct according
to the API definitions (see the sketch below). A very good side effect would be
that this would really keep us from forgetting to update the schema.
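
To make the idea concrete, here is a rough sketch (not actual vdsm code) of the
kind of check I have in mind. It assumes the schema has already been parsed
into a dict mapping each command name to its parameter spec, much like
process-schema.py builds its symbol table, and that optional parameters are the
ones prefixed with '*':

def build_validator(commands):
    """commands maps command names to parameter specs, e.g.
    {'VM.create': {'vmID': 'UUID', '*memSize': 'uint'}}; the command and
    parameter names here are only illustrative."""
    def validate(command, args):
        spec = commands.get(command)
        if spec is None:
            raise ValueError('unknown API command: %s' % command)
        required = set(k for k in spec if not k.startswith('*'))
        allowed = required | set(k.lstrip('*') for k in spec)
        missing = required - set(args)
        extra = set(args) - allowed
        if missing or extra:
            raise ValueError('%s: missing %s, unexpected %s' %
                             (command, sorted(missing), sorted(extra)))
    return validate

Hooking something like that into API.py (or into the json-rpc bridge) would
give us the schema check for free on every call.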

Best regards,

Toni


[vdsm] VDSM tasks, the future

2012-12-04 Thread Saggi Mizrahi
Because I started hinting about how VDSM tasks are going to look going forward,
I thought it's better to just write everything in an email so we can talk about
it in context.
This is not set in stone and I'm still debating things myself, but it's very
close to being done.

- Everything is asynchronous.
  The nature of message based communication is that you can't have synchronous 
operations.
  This is not really debatable because it's just how TCP\AMQP\messaging works.

- Task IDs will be decided by the caller.
  This is how json-rpc works and it also makes sense because now the engine can
track the task without needing to have a stage where we give it the task ID
back (see the sketch below).
  IDs are reusable as long as no one else is using them at the time, so they can
be used for synchronizing operations between clients (making sure a command is
  only executed once on a specific host without locking).
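
  For illustration, a call might look roughly like this on the wire (the method
name and id value here are made up; the important part is that the client picks
the id):

  {"jsonrpc": "2.0",
   "id": "engine-1:copyImage:57a3f1",
   "method": "Image.copy",
   "params": {...}}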

- Tasks are transient
  If VDSM restarts it forgets all the task information.
  There are 2 ways to have persistent tasks:
  1. The task creates an object that you can continue working on in VDSM.
 The new storage does that by the fact that copyImage() returns once the
target volume has been created but before the data has been fully copied.
 From that moment on the state of the copy can be queried from any host
using getImageStatus() and the specific copy operation can be queried with
getTaskStatus() on the host performing it.
 After VDSM crashes, depending on policy, either VDSM will create a new
task to continue the copy or someone else will send a command to continue the
operation and that will be a new task.
  2. VDSM tasks just start other operations that are trackable outside the task
interface. For example Gluster:
 gluster.startVolumeRebalance() will return once it has been registered
with Gluster.
 gluster.getOperationStatuses() will return the state of the operation from
any host.
 Each call is a task in itself.
  
- No task tags.
  They are silly and the caller can mangle whatever in the task ID if he really 
wants to tag tasks.

- No explicit recovery stage.
  VDSM will be crash-only, there should be efforts to make everything 
crash-safe.
  If that is problematic, in case of networking, VDSM will recover on start 
without having a task for it.

- No clean Task:
  Tasks can be started by any number of hosts; this means that there is no way
to own all tasks.
  There could be cases where VDSM starts tasks on its own and thus they have
no owner at all.
  The caller needs to continually track the state of VDSM. We will have
broadcast events to mitigate polling.

- No revert
  Impossible to implement safely.

- No SPM\HSM tasks
  SPM\SDM is no longer necessary for all domain types (only for some types).
  What used to be SPM tasks, or tasks that persist and can be restarted on
other hosts, is covered in the previous bullet points.


[vdsm] link state semantics

2012-12-04 Thread Antoni Segura Puimedon
Hi list!

We are working on the new 3.2 feature for adding support for updating VM
devices, more specifically, at the moment, network devices.

There is one point of the design on which there is not yet consensus, and we
need to agree on a proper and clean design that satisfies us all:

My current proposal, as reflected by patch:
   http://gerrit.ovirt.org/#/c/9560/5/vdsm_api/vdsmapi-schema.json
and its parent is to have a linkActive boolean that is true for link
status 'up' and false for link status 'down'.

We want to support a none (dummy) network that is used to dissociate vnics
from any real network. The semantics, as you can see in the patch, are that
unless you specify a network, updateDevice will place the interface on that
dummy network. However, Adam Litke argues that not specifying a network should
keep the vnic on the network it currently is on, since network is an optional
parameter and 'linkActive' is also optional and has these preserve-current-state
semantics.

I can certainly see the merit of what Adam proposes, and the implementation
would be that linkActive becomes an enum, like so:

{'enum': 'linkState' /* or linkActive */, 'data': ['up', 'down', 'disconnected']}

With this change, the network would only be changed if one different from the
current one is specified, and the vnic would be moved to the dummy bridge when
linkState is set to 'disconnected' (see the sketch below).
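
To make the semantics concrete, calls under that proposal could look roughly
like this (parameter names other than network/linkState are illustrative, not
the final schema):

# cut the link but stay on the current network:
updateDevice(vmId, {'deviceType': 'interface',
                    'macAddr': '52:54:00:aa:bb:cc',
                    'linkState': 'down'})

# detach from any real network (dummy bridge):
updateDevice(vmId, {'deviceType': 'interface',
                    'macAddr': '52:54:00:aa:bb:cc',
                    'linkState': 'disconnected'})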

There is also an objection, raised by Adam, about the semantics of portMirroring.
The current behavior from my patch is:

portMirroring is None or is not set -> No action taken.
portMirroring = [] -> No action taken.
portMirroring = [a,b,z] -> Set port mirroring for nets a, b and z to the
specified vnic.

His proposal is:
portMirroring is None or is not set -> No action taken.
portMirroring = [] -> Unset port mirroring from the vnic on which it is currently set.
portMirroring = [a,b,z] -> Set port mirroring for nets a, b and z to the
specified vnic.

I would really welcome comments on this so that we can finally reach agreement
on the API for this feature.

Best,

Toni


Re: [vdsm] Back to future of vdsm network configuration

2012-12-04 Thread Simon Grinberg


- Original Message -
 From: Itamar Heim ih...@redhat.com
 To: Dan Kenigsberg dan...@redhat.com
 Cc: Alon Bar-Lev alo...@redhat.com, VDSM Project Development 
 vdsm-devel@lists.fedorahosted.org, Simon
 Grinberg si...@redhat.com, Andrew Cathrow acath...@redhat.com
 Sent: Monday, December 3, 2012 10:56:53 PM
 Subject: Re: [vdsm] Back to future of vdsm network configuration
 
 On 12/03/2012 06:54 PM, Dan Kenigsberg wrote:
  On Mon, Dec 03, 2012 at 04:28:16PM +0200, Itamar Heim wrote:
  On 12/03/2012 04:25 PM, Dan Kenigsberg wrote:
  On Mon, Dec 03, 2012 at 04:35:34AM -0500, Alon Bar-Lev wrote:
 
 
  - Original Message -
  From: Mark Wu wu...@linux.vnet.ibm.com
  To: VDSM Project Development
  vdsm-devel@lists.fedorahosted.org
  Cc: Alon Bar-Lev alo...@redhat.com, Dan Kenigsberg
  dan...@redhat.com, Simon Grinberg si...@redhat.com,
  Antoni Segura Puimedon asegu...@redhat.com, Igor Lvovsky
  ilvov...@redhat.com, Daniel P. Berrange
  berra...@redhat.com
  Sent: Monday, December 3, 2012 7:39:49 AM
  Subject: Re: [vdsm] Back to future of vdsm network
  configuration
 
  On 11/29/2012 04:24 AM, Alon Bar-Lev wrote:
 
  - Original Message -
  From: Dan Kenigsberg dan...@redhat.com
  To: Alon Bar-Lev alo...@redhat.com
  Cc: Simon Grinberg si...@redhat.com, VDSM Project
  Development vdsm-devel@lists.fedorahosted.org
  Sent: Wednesday, November 28, 2012 10:20:11 PM
  Subject: Re: [vdsm] MTU setting according to ifcfg files.
 
  On Wed, Nov 28, 2012 at 12:49:10PM -0500, Alon Bar-Lev wrote:
Itamar threw a bomb that we should co-exist on a generic host; this
is something I do not know how to compute. I am still waiting for a
response on where this requirement came from and whether it is
mandatory.
 
This bomb has been ticking forever. We have ovirt-node images for
pure hypervisor nodes, but we support plain Linux nodes, where local
admins are free to `yum upgrade` at the least convenient moment. The
latter mode can be the stuff that nightmares are made of, but it also
allows the flexibility and bleeding-edgeness we all cherish.
 
There is a difference between having a generic OS and having a
generic setup, running your email server, file server and LDAP on a
node that is running VMs.

I have no problem with having a generic OS (as opposed to ovirt-node),
but I want full control over that.

Alon.
  Can I say we have got agreement on oVirt should cover two kinds
  of
  hypervisors?  Stateless slave is good for pure and normal
  virtualization
  workload, while generic host can keep the flexibility of
  customization.
  In my opinion, it's good for the oVirt community to provide
  choices
  for
  users.  They could customize it in production, building and
  even
  source
  code according to their requirements and skills.
 
I also think it will be good to support both modes! It will also be
good if we can rule the world! :)

Now seriously... :)

If we ever want to have a working solution we need to focus,
dropping wishful requirements in favour of the minimum required
that will allow us to reach a stable milestone.
 
Having a good, clean interface for vdsm networking within the
stateless mode will allow a persistent implementation to exist even
if the whole implementation of master and vdsm assumes stateless.
This kind of implementation will get a new state from master,
compare it to whatever exists on the host, and sync.

I, of course, will be against investing resources in such a network
management plugin approach... but it is doable, and my vote is not
something that you cannot safely ignore.
 
I cannot say that I do not fail to parse English sentences with
double or triple negations...

I'd like to see an API that lets us define a persistent initial
management interface, and create volatile network devices during
runtime. I'd love to see a define/create distinction, as libvirt has.
 
How about keeping our current setupNetwork API, with a minor change
to its semantics - it would not persist anything. A new
persistNetwork API would be added, intended to persist the management
network after it has been tested.

On boot, only the management definitions would show up, and Engine
(or a small local service on top of vdsm) would push the complete
configuration.
 
 
  how does this benefit over loading the last config, and then have
  engine refresh (always/if needed)?
 
  It's clearer for the local admin: if it's on the file system, it
  would
  be there after boot; he can do his worst to them, and we'd try to
  manage.
 
  Also, it is easier to recover from utterly-horrible remote
  commands,
  which had rendered our host incommunicado: the management interface
  used
  to send these commands -- and only it -- would show up after boot.
  This
  increases the probability that after fencing, we'd see the host
  again.
 
 i think we mentioned this before, but this will kill any way to have
 hosts come back to 

Re: [vdsm] object instancing in the new VDSM API

2012-12-04 Thread Adam Litke
Thanks for your detailed response...

On Mon, Dec 03, 2012 at 09:26:34PM -0500, Saggi Mizrahi wrote:
 So from what I gather the only thing that is bothering you is that storage
 operations require a lot of IDs.  I get that, I hate that to. It doesn't
 change the point that it was designed that way.  Even if you deem some use
 cases irrelevant it wouldn't change the fact that this is how people use it
 now.  And because we are going to throw it away as soon as we can there is no
 reason to shape out API around that.

In that case, I want to throw away the bad architecture along with the bad API.
In the future I would like to see:

1) Objects can be uniquely identified by a single UUID.  This means you would
not be able to reuse the same UUID on a different host/domain unless you are
talking about the same object (i.e. moving an image).  If we think this is going
to be a problem, let's discuss the specific use cases.

2) Verbs should not have non-obvious preconditions or overloaded semantics
(basically, we need to get rid of the issues with storage pools and images that
you explain below).

 So from what I gather we agree on instancing.

Sure.  I am willing to adopt namespaces instead of instances as long as the
above is adhered to in the new design.

I do have to ask again, when do you think the new storage stuff will be ready
for serious review, testing and consideration for merging?  I would be happy to
spend a significant amount of time helping out with this if the end result has
us closing on this 2+ year endeavor :)

 
 ---
 
 From this moment on I'm going to try my best to explain how VDSM storage
 currently works.  It is filled with misdirection and bad design. I hope that
 after that you will understand why you can't pack all the IDs together.
 
 Let's start with the storage pool. Because it was simpler to have all
 metadata-changing operations run on the same host, someone needed to find a way
 to make cross-domain operations work on the same host.  The solution was to
 bind them all to a single entity called the storage pool and have a single
 lock. The point was to have a host be able to connect to multiple pools at a
 time.  Due to bad code (that could easily have not been so bad) the multiple
 pools feature was never implemented. Because the single lock to rule them all
 doesn't really work when you want to secure a domain, we had to add more locks,
 making the pool concept obsolete.
 
 This means that you can trust VDSM to only be connected to a single pool at a
 time, which means that if you want to change anything you can just remove the
 pool arg.

Is there a reason that vdsm doesn't automatically connect to the pool noted in
the master storage domain?  It's fine if it doesn't become spm, but it would be
nice to reduce the number of steps required to bring storage back up after a
reboot.

Also, do you see significant changes in the storage domain related verbs?  I
guess we will remove the attach/detach/activate/deactivate verbs since storage
pools are going away.

 Let's go to volumes and images. Contrary to what its name suggests, imgUUID
 does not represent an image. It's actually a tag given to part of a chain. This
 is commonly used to differentiate between parts of the chain responsible for VM
 images and templates. Due to bad code a lot of the possible combinations are
 not supported, but that is the intention.  imgUUID being a tag means that it
 serves 3 purposes depending on the verb that uses it.  1) In some verbs it is
 used as a useless sanity check to make sure the volume is tagged with this
 sdUUID.  This I imagine was done because someone didn't fully comprehend how
 and why you do sanity checks.  This means that in some verbs you can just
 remove it (if you are actually changing anything). 2) In some verbs it's meant
 to distinguish the volume from its original chain (creating a template). At
 that point it's actually being invented by the caller.  3) In operations that
 act on the whole chain, if volUUID is there it is for the same useless sanity
 check and can be removed.
 
 What you need to get out of this is that most of the time you can use fewer IDs
 just by removing useless imgUUID or volUUID args.  Furthermore, you need to
 understand that they are not hierarchical. imgUUID is a tag on the volume,
 similar to the user of a file.
 
 As for domain IDs: the caller can choose to reuse imgUUIDs and volUUIDs on
 different domains, and some flows actually depend on that.  To make things
 simpler, some verbs should be split up so that how you specify the target
 volID doesn't affect the actual command.
 
 This means that copyImage() and createTemplate() should be split into:
 copyImage(dstDomain, srcDomain, imgUUID)
 createTemplate(dstDomain, dstImgUUID, srcDomain, srcImgUUID)
 
 That being said, I'm personally still against an indeterminate storage API
 because of engine adoption problems.  But if you want to fix the current
 interface, packing up the IDs into a single ID wouldn't work and is 

Re: [vdsm] Back to future of vdsm network configuration

2012-12-04 Thread Itamar Heim

On 12/04/2012 07:49 PM, Simon Grinberg wrote:



- Original Message -

From: Itamar Heim ih...@redhat.com
To: Dan Kenigsberg dan...@redhat.com
Cc: Alon Bar-Lev alo...@redhat.com, VDSM Project Development 
vdsm-devel@lists.fedorahosted.org, Simon
Grinberg si...@redhat.com, Andrew Cathrow acath...@redhat.com
Sent: Monday, December 3, 2012 10:56:53 PM
Subject: Re: [vdsm] Back to future of vdsm network configuration

On 12/03/2012 06:54 PM, Dan Kenigsberg wrote:

On Mon, Dec 03, 2012 at 04:28:16PM +0200, Itamar Heim wrote:

On 12/03/2012 04:25 PM, Dan Kenigsberg wrote:

On Mon, Dec 03, 2012 at 04:35:34AM -0500, Alon Bar-Lev wrote:



- Original Message -

From: Mark Wu wu...@linux.vnet.ibm.com
To: VDSM Project Development
vdsm-devel@lists.fedorahosted.org
Cc: Alon Bar-Lev alo...@redhat.com, Dan Kenigsberg
dan...@redhat.com, Simon Grinberg si...@redhat.com,
Antoni Segura Puimedon asegu...@redhat.com, Igor Lvovsky
ilvov...@redhat.com, Daniel P. Berrange
berra...@redhat.com
Sent: Monday, December 3, 2012 7:39:49 AM
Subject: Re: [vdsm] Back to future of vdsm network
configuration

On 11/29/2012 04:24 AM, Alon Bar-Lev wrote:


- Original Message -

From: Dan Kenigsberg dan...@redhat.com
To: Alon Bar-Lev alo...@redhat.com
Cc: Simon Grinberg si...@redhat.com, VDSM Project
Development vdsm-devel@lists.fedorahosted.org
Sent: Wednesday, November 28, 2012 10:20:11 PM
Subject: Re: [vdsm] MTU setting according to ifcfg files.

On Wed, Nov 28, 2012 at 12:49:10PM -0500, Alon Bar-Lev wrote:

Itamar threw a bomb that we should co-exist on a generic host; this
is something I do not know how to compute. I am still waiting for a
response on where this requirement came from and whether it is
mandatory.


This bomb has been ticking forever. We have ovirt-node images for
pure hypervisor nodes, but we support plain Linux nodes, where local
admins are free to `yum upgrade` at the least convenient moment. The
latter mode can be the stuff that nightmares are made of, but it also
allows the flexibility and bleeding-edgeness we all cherish.


There is a difference between having a generic OS and having a
generic setup, running your email server, file server and LDAP on a
node that is running VMs.

I have no problem with having a generic OS (as opposed to ovirt-node),
but I want full control over that.

Alon.

Can I say we have got agreement on oVirt should cover two kinds
of
hypervisors?  Stateless slave is good for pure and normal
virtualization
workload, while generic host can keep the flexibility of
customization.
In my opinion, it's good for the oVirt community to provide
choices
for
users.  They could customize it in production, building and
even
source
code according to their requirements and skills.


I also think it will be good to support both modes! It will also be
good if we can rule the world! :)

Now seriously... :)

If we ever want to have a working solution we need to focus,
dropping wishful requirements in favour of the minimum required
that will allow us to reach a stable milestone.

Having a good, clean interface for vdsm networking within the
stateless mode will allow a persistent implementation to exist even
if the whole implementation of master and vdsm assumes stateless.
This kind of implementation will get a new state from master,
compare it to whatever exists on the host, and sync.

I, of course, will be against investing resources in such a network
management plugin approach... but it is doable, and my vote is not
something that you cannot safely ignore.


I cannot say that I do not fail to parse English sentences with
double or triple negations...

I'd like to see an API that lets us define a persistent initial
management interface, and create volatile network devices during
runtime. I'd love to see a define/create distinction, as libvirt has.

How about keeping our current setupNetwork API, with a minor change
to its semantics - it would not persist anything. A new
persistNetwork API would be added, intended to persist the management
network after it has been tested.

On boot, only the management definitions would show up, and Engine
(or a small local service on top of vdsm) would push the complete
configuration.



how does this benefit over loading the last config, and then have
engine refresh (always/if needed)?


It's clearer for the local admin: if it's on the file system, it
would
be there after boot; he can do his worst to them, and we'd try to
manage.

Also, it is easier to recover from utterly-horrible remote
commands,
which had rendered our host incommunicado: the management interface
used
to send these commands -- and only it -- would show up after boot.
This
increases the probability that after fencing, we'd see the host
again.


I think we mentioned this before, but this will kill any way to have
hosts come back to life, or to have a policy on connecting to storage,
even if engine is still down.
(one of these use cases is for the engine itself to be hosted on the
hosts as well)


For this use case you'll need much 

Re: [vdsm] VDSM tasks, the future

2012-12-04 Thread Adam Litke
On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote:
 Because I started hinting about how VDSM tasks are going to look going forward
 I thought it's better I'll just write everything in an email so we can talk
 about it in context.  This is not set in stone and I'm still debating things
 myself but it's very close to being done.

Don't debate them yourself, debate them here!  Even better, propose your idea in
schema form to show how a command might work exactly.

 - Everything is asynchronous.  The nature of message based communication is
 that you can't have synchronous operations.  This is not really debatable
 because it's just how TCP\AMQP\messaging works.

Can you show how a traditionally synchronous command might work?  Let's take
Host.getVmList as an example.

 - Task IDs will be decided by the caller.  This is how json-rpc works and also
 makes sense because now the engine can track the task without needing to have a
 stage where we give it the task ID back.  IDs are reusable as long as no one
 else is using them at the time so they can be used for synchronizing
 operations between clients (making sure a command is only executed once on a
 specific host without locking).
 
 - Tasks are transient If VDSM restarts it forgets all the task information.
 There are 2 ways to have persistent tasks: 1. The task creates an object that
 you can continue work on in VDSM.  The new storage does that by the fact that
 copyImage() returns once the target volume has been created but before the data
 has been fully copied.  From that moment on the state of the copy can be
 queried from any host using getImageStatus() and the specific copy operation
 can be queried with getTaskStatus() on the host performing it.  After VDSM
 crashes, depending on policy, either VDSM will create a new task to continue
 the copy or someone else will send a command to continue the operation and
 that will be a new task.  2. VDSM tasks just start other operations track-able
 not through the task interface. For example Gluster.
 gluster.startVolumeRebalance() will return once it has been registered with
 Gluster.  gluster.getOperationStatuses() will return the state of the operation
 from any host.  Each call is a task in itself.

I worry about this approach because every command has a different semantic for
checking progress.  For migration, we have to check VM status on the src and
dest hosts.  For image copy we need to use a special status call on the dest
image.  It would be nice if there was a unified method for checking on an
operation.  Maybe that can be completion events.

Client:   vdsm:
---   -

Image.copy(...)  -->
 <--  Operation Started
Wait for event   ...
 <--  Event: Operation <id> done <code>

For an early error:

Client:   vdsm:
---   -

Image.copy(...)  -->
 <--  Error: <code>
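
On the wire, such a completion event could be a json-rpc notification along
these lines (the method and field names are only a sketch, nothing settled):

{"jsonrpc": "2.0",
 "method": "Image.copy.completion",
 "params": {"id": "engine-1:copyImage:57a3f1", "code": 0}}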


 - No task tags.  They are silly and the caller can mangle whatever in the task
 ID if he really wants to tag tasks.

Yes.  Agreed.

 - No explicit recovery stage.  VDSM will be crash-only, there should be
 efforts to make everything crash-safe.  If that is problematic, in case of
 networking, VDSM will recover on start without having a task for it.

How does this work in practice for something like creating a new image from a
template?

 - No clean Task: Tasks can be started by any number of hosts this means that
 there is no way to own all tasks.  There could be cases where VDSM starts
 tasks on it's own and thus they have no owner at all.  The caller needs to
 continually track the state of VDSM. We will have brodcasted events to
 mitigate polling.

If a disconnected client might have missed a completion event, it will need to
check state.  This means each async operation that changes state must document a
procedure for checking progress of a potentially ongoing operation.  For
Image.copy, that procedure would be to look up the new image and check its state.

 - No revert Impossible to implement safely.

How do the engine folks feel about this?  I am ok with it :)

 - No SPM\HSM tasks SPM\SDM is no longer necessary for all domain types (only
 for type).  What used to be SPM tasks, or tasks that persist and can be
 restarted on other hosts is talked about in previous bullet points.
 
A nice simplification.


-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] API.py validation

2012-12-04 Thread Adam Litke
On Tue, Dec 04, 2012 at 08:43:11AM -0500, Antoni Segura Puimedon wrote:
 Hi all,
 
 I am currently working in adding a new feature to vdsm which requires a new
 entry point in vdsm, thus requiring:
 - Parameter definitions in vdsm_api/vdsmapi-schema.json
 - Implementation and checks in vdsm/API.py and other modules.
 
 Typically, we check for presence absence of required/optional parameters in
 API.py using utils.validateMinimalKeySet or just if else clauses. I think this
 process could benefit from a more automatic and less duplicated effort, i.e.,
 parsing vdsmapi-schema.json in a similar way as process-schema.py does to make
 a memoized method that is able to check whether the api call is correct
 according to the API definitions. A very good side effect would be that this
 would really avoid us from forgetting to update the schema.

Yes, this is a good idea.  I do want to add some checking.  For now, the best
place to add it would probably be in the DynamicBridge class which dispatches
json-rpc calls to the correct internal methods.  Unfortunately this would
exclude the xmlrpc api from the automatic checking.  I guess that's ok since
xmlrpc will be going away.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] link state semantics

2012-12-04 Thread Adam Litke
On Tue, Dec 04, 2012 at 12:32:34PM -0500, Antoni Segura Puimedon wrote:
 Hi list!
 
 We are working on the new 3.2 feature for adding support for updating VM
 devices, more specifically at the moment network devices.
 
 There is one point of the design which is not yet consensual and we'd 
 need to agree on a proper and clean design that would satisfy us all:
 
 My current proposal, as reflected by patch:
http://gerrit.ovirt.org/#/c/9560/5/vdsm_api/vdsmapi-schema.json
 and its parent is to have a linkActive boolean that is true for link
 status 'up' and false for link status 'down'.
 
 We want to support a none (dummy) network that is used to dissociate vnics
 from any real network. The semantics, as you can see in the patch are that
 unless you specify a network, updateDevice will place the interface on that
 network. However, Adam Litke argues that not specifying a network should
 keep the vnic on the network it currently is, as network is an optional
 parameter and 'linkActive' is also optional and has this preserve current
 state semantics.
 
 I can certainly see the merit of what Adam proposes, and the implementation
 would be that linkActive becomes an enum like so:
 
 {'enum': 'linkState'/* or linkActive */ , 'data': ['up', 'down', 
 'disconnected']}
 
 With this change, network would only be changed if one different than the 
 current
 one is specified and the vnic would be taken to the dummy bridge when the 
 linkState
 would be set to 'disconnected'.
 
 There is also an objection, raised by Adam about the semantics of 
 portMirroring.
 The current behavior from my patch is:
 
 portMirroring is None or is not set - No action taken.
 portMirroring = [] - No action taken.
 portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the 
 specified vnic.
 
 His proposal is:
 portMirroring is None or is not set - No action taken.
 portMirroring = [] - Unset port mirroring to the vnic that is currently set.
 portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the 
 specified vnic.
 
 I would really welcome comments on this to have finally an agreement to the 
 api for this
 feature.

+1 to the updated proposal.  Is there any better way to do it?

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] RFC: New Storage API

2012-12-04 Thread Saggi Mizrahi
I've been throwing a lot of bits out about the new storage API and I think it's
time to talk a bit.
I will purposefully try and keep implementation details away and concentrate
on how the API looks and how you use it.

The first major change is in terminology: there is no longer a storage domain but a
storage repository.
This change is done because so many things are already called domain in the
system, and this will make things less confusing for newcomers with a libvirt
background.

One other changes is that repositories no longer have a UUID.
The UUID was only used in the pool members manifest and is no longer needed.


connectStorageRepository(repoId, repoFormat, connectionParameters={}):
repoId - a transient name that will be used to refer to the connected
domain; it is not persisted and doesn't have to be the same across the cluster.
repoFormat - similar to what used to be type (eg. localfs-1.0, nfs-3.4,
clvm-1.2).
connectionParameters - this is format specific and will be used to tell VDSM how
to connect to the repo.

disconnectStorageRepository(self, repoId):


In the new API there are only images; some images are mutable and some are not.
Mutable images are also called VirtualDisks.
Immutable images are also called Snapshots.

There are no explicit templates; you can create as many images as you want from
any snapshot.

There are 4 major image operations:


createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
  userData={}, options={}):

targetRepoId - ID of a connected repo where the disk will be created
size - The size of the image you wish to create
baseSnapshotId - the ID of the snapshot you want to base the new virtual disk
on
userData - optional data that will be attached to the new VD, could be anything 
that the user desires.
options - options to modify VDSMs default behavior

returns the id of the new VD

createSnapshot(targetRepoId, baseVirtualDiskId,
   userData={}, options={}):
targetRepoId - The ID of a connected repo where the new snapshot will be 
created and the original image exists as well.
size - The size of the image you wish to create
baseVirtualDisk - the ID of a mutable image (Virtual Disk) you want to snapshot
userData - optional data that will be attached to the new Snapshot, could be 
anything that the user desires.
options - options to modify VDSMs default behavior

returns the id of the new Snapshot

copyImage(targetRepoId, imageId, baseImageId=None, userData={}, options={})
targetRepoId - The ID of a connected repo where the new image will be created
imageId - The image you wish to copy
baseImageId - if specified, the new image will contain only the diff between
imageId and baseImageId.
  If None, the new image will contain all the bits of imageId. This
can be used to copy partial images for export.
userData - optional data that will be attached to the new image, could be 
anything that the user desires.
options - options to modify VDSMs default behavior

return the Id of the new image. In case of copying an immutable image the ID 
will be identical to the original image as they contain the same data. However 
the user should not assume that and always use the value returned from the 
method.

removeImage(repositoryId, imageId, options={}):
repositoryId - The ID of a connected repo where the image to delete resides
imageId - The id of the image you wish to delete.



getImageStatus(repositoryId, imageId)
repositoryId - The ID of a connected repo where the image to check resides
imageId - The id of the image you wish to check.

All operations return once the operation has been committed to disk, NOT when
the operation actually completes.
This is done so that:
- operations come to a stable state as quickly as possible.
- in cases where there is an SDM, only a small portion of the operation actually
needs to be performed on the SDM host.
- no matter how many times the operation fails and on how many hosts, you can
always resume the operation and choose when to do it.
- you can stop an operation at any time and remove the resulting object, making
a distinction between stop because the host is overloaded and I don't want
that image.

This means that after calling any operation that creates a new image, the user
must then call getImageStatus() to check what the status of the image is
(see the sketch below).
The status of the image can be either optimized, degraded, or broken.
Optimized means that the image is available and you can run VMs off it.
Degraded means that the image is available and will run VMs, but there might be
a better way for VDSM to represent the underlying data.
Broken means that the image can't be used at the moment, probably because not
all the data has been set up on the volume.
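
To make the flow concrete, a caller would do something roughly like this
(Python-ish sketch only; the connection parameters are made up and I assume
getImageStatus() returns a mapping with a 'status' field):

import time

repoId = 'nfs-repo-1'
connectStorageRepository(repoId, 'nfs-3.4',
                         {'server': 'filer.example.com',
                          'path': '/export/images'})

diskId = createVirtualDisk(repoId, size=20 * 2**30,
                           baseSnapshotId=None,
                           userData={'note': 'demo disk'})

# the call above returned as soon as the operation was committed,
# so poll until the image is usable (optimized or degraded):
while getImageStatus(repoId, diskId)['status'] == 'broken':
    time.sleep(1)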

Apart from that, VDSM will also return the last persisted status information,
which will contain
hostID - the last host to try and optimize or fix the image
stage - X/Y (eg. 1/10) the last persisted stage of the fix.
percent_complete - -1 or 0-100, the 

Re: [vdsm] RFC: New Storage API

2012-12-04 Thread Adam Litke
Thanks for sharing this.  It's nice to have something a little more concrete to
think about.  Just a few comments and questions inline to get some discussion
flowing.

On Tue, Dec 04, 2012 at 04:52:40PM -0500, Saggi Mizrahi wrote:
 I've been throwing a lot of bits out about the new storage API and I think 
 it's time to talk a bit.
 I will purposefully try and keep implementation details away and concentrate 
 about how the API looks and how you use it.
 
 First major change is in terminology, there is no long a storage domain but a 
 storage repository.
 This change is done because so many things are already called domain in the 
 system and this will make things less confusing for new-commers with a 
 libvirt background.
 
 One other changes is that repositories no longer have a UUID.
 The UUID was only used in the pool members manifest and is no longer needed.
 
 
 connectStorageRepository(repoId, repoFormat, connectionParameters={}):

We should probably add an options/flags parameter for extension of all new
APIs.

 repoId - is a transient name that will be used to refer to the connected 
 domain, it is not persisted and doesn't have to be the same across the 
 cluster.
 repoFormat - Similar to what used to be type (eg. localfs-1.0, nfs-3.4, 
 clvm-1.2).
 connectionParameters - This is format specific and will used to tell VDSM how 
 to connect to the repo.
 
 disconnectStorageRepository(self, repoId):

I assume 'self' is a mistake here.  Just want to clarify given all of the recent
talk about instances vs. namespaces.

 In the new API there are only images, some images are mutable and some are 
 not.
 mutable images are also called VirtualDisks
 immutable images are also called Snapshots

By mutable you mean writable right?  Or does the word mutable imply more than
that?

 There are no explicit templates, you can create as many images as you want 
 from any snapshot.
 
 There are 4 major image operations:
 
 
 createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
   userData={}, options={}):

Is userdata a 'StringMap'?

I will reopen the argument about an options dict vs a flags parameter.  I oppose
the dict for expansion because I think it causes APIs to devolve into a mess
where lots of arbitrary and not well thought out overrides are packed into the
dict over time.  A flags argument (in json and python it can be an enum array)
limits us to really switching flags on and off instead of passing arbitrary
data.

 targetRepoId - ID of a connected repo where the disk will be created
 size - The size of the image you wish to create
 baseSnapshotId - the ID of the snapshot you want the base the new virtual 
 disk on
 userData - optional data that will be attached to the new VD, could be 
 anything that the user desires.
 options - options to modify VDSMs default behavior
 
 returns the id of the new VD
 
 createSnapshot(targetRepoId, baseVirtualDiskId,
userData={}, options={}):
 targetRepoId - The ID of a connected repo where the new sanpshot will be 
 created and the original image exists as well.
 size - The size of the image you wish to create

Why is this needed?  Doesn't the size of a snapshot have to be equal to its
base image?

 baseVirtualDisk - the ID of a mutable image (Virtual Disk) you want to 
 snapshot

Can you snapshot a snapshot?  In that case, this parameter should be called
baseImage.

 userData - optional data that will be attached to the new Snapshot, could be 
 anything that the user desires.
 options - options to modify VDSMs default behavior
 
 returns the id of the new Snapshot
 
 copyImage(targetRepoId, imageId, baseImageId=None, userData={}, options={})
 targetRepoId - The ID of a connected repo where the new image will be created
 imageId - The image you wish to copy

Do we locate the sourceRepoId automatically based on the imageId?

 baseImageId - if specified, the new image will contain only the diff between 
 image and Id.
   If None the new image will contain all the bits of image Id. 
 This can be used to copy partial parts of images for export.
 userData - optional data that will be attached to the new image, could be 
 anything that the user desires.
 options - options to modify VDSMs default behavior
 
 return the Id of the new image. In case of copying an immutable image the ID 
 will be identical to the original image as they contain the same data. 
 However the user should not assume that and always use the value returned 
 from the method.
 
 removeImage(repositoryId, imageId, options={}):
 repositoryId - The ID of a connected repo where the image to delete resides
 imageId - The id of the image you wish to delete.
 
 
 getImageStatus(repositoryId, imageId)
 repositoryId - The ID of a connected repo where the image to check resides
 imageId - The id of the image you wish to check.

What is in this return value?  Is it a single enum indicating whether the image
is locked (being copied, etc.) or a list of detailed 

Re: [vdsm] RFC: New Storage API

2012-12-04 Thread Saggi Mizrahi


- Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, 
 engine-devel engine-de...@ovirt.org
 Sent: Tuesday, December 4, 2012 6:08:25 PM
 Subject: Re: [vdsm] RFC: New Storage API
 
 Thanks for sharing this.  It's nice to have something a little more
 concrete to
 think about.  Just a few comments and questions inline to get some
 discussion
 flowing.
 
 On Tue, Dec 04, 2012 at 04:52:40PM -0500, Saggi Mizrahi wrote:
  I've been throwing a lot of bits out about the new storage API and
  I think it's time to talk a bit.
  I will purposefully try and keep implementation details away and
  concentrate about how the API looks and how you use it.
  
  First major change is in terminology, there is no long a storage
  domain but a storage repository.
  This change is done because so many things are already called
  domain in the system and this will make things less confusing for
  new-commers with a libvirt background.
  
  One other changes is that repositories no longer have a UUID.
  The UUID was only used in the pool members manifest and is no
  longer needed.
  
  
  connectStorageRepository(repoId, repoFormat,
  connectionParameters={}):
 
 We should probably add an options/flags parameter for extension of
 all new
 APIs.
Usually I agree but connectionParameters is already generic enough :)
 
  repoId - is a transient name that will be used to refer to the
  connected domain, it is not persisted and doesn't have to be the
  same across the cluster.
  repoFormat - Similar to what used to be type (eg. localfs-1.0,
  nfs-3.4, clvm-1.2).
  connectionParameters - This is format specific and will used to
  tell VDSM how to connect to the repo.
  
  disconnectStorageRepository(self, repoId):
 
 I assume 'self' is a mistake here.  Just want to clarify given all of
 the recent
 talk about instances vs. namespaces.
Yea, it's just pasted from my code
 
  In the new API there are only images, some images are mutable and
  some are not.
  mutable images are also called VirtualDisks
  immutable images are also called Snapshots
 
 By mutable you mean writable right?  Or does the word mutable imply
 more than
 that?
It's a semantic distinction due to implementation details, in general terms, 
yes.
 
  There are no explicit templates, you can create as many images as
  you want from any snapshot.
  
  There are 4 major image operations:
  
  
  createVirtualDisk(targetRepoId, size, baseSnapshotId=None,
userData={}, options={}):
 
 Is userdata a 'StringMap'?
Currently it's a json object. We could limit it to a string map and trust the
client to parse types.
We can have it be a string\blob and trust the user to serialize the data.
It's a pass-through object either way.
 
 I will reopen the argument about an options dict vs a flags
 parameter.  I oppose
 the dict for expansion because I think it causes APIs to devolve into
 a mess
 where lots of arbitrary and not well thought out overrides are packed
 into the
 dict over time.  A flags argument (in json and python it can be an
 enum array)
 limits us to really switching flags on and off instead of passing
 arbitrary
 data.
We already have 'strategy', and we know we want to have several options.
Other stuff that has been suggested is being able to override the img format
(qcow2\qed).

The way I envision it is having a class
opts = CommandOptions()
to which you add options:
opts.addStringOption(key, value)
opts.addIntOption(key, 3)
opts.addBoolOption(key, True)

I know you could just as well have
strategy_space_flag and strategy_performance_flag
and fail the operation if they both exist.
Since it is a matter of personal taste I think it should be decided by a vote.
 
  targetRepoId - ID of a connected repo where the disk will be
  created
  size - The size of the image you wish to create
  baseSnapshotId - the ID of the snapshot you want the base the new
  virtual disk on
  userData - optional data that will be attached to the new VD, could
  be anything that the user desires.
  options - options to modify VDSMs default behavior
  
  returns the id of the new VD
  
  createSnapshot(targetRepoId, baseVirtualDiskId,
 userData={}, options={}):
  targetRepoId - The ID of a connected repo where the new sanpshot
  will be created and the original image exists as well.
  size - The size of the image you wish to create
 
 Why is this needed?  Doesn't the size of a snapshot have to be equal
 to its
 base image?
Oops, another copy\paste error; you can see this arg doesn't exist in the
method signature.
My proofreading does need more work.
 
  baseVirtualDisk - the ID of a mutable image (Virtual Disk) you want
  to snapshot
 
 Can you snapshot a snapshot?  In that case, this parameter should be
 called
 baseImage.
You can't snapshot a snapshot; it makes no sense, as it can't change and you
will get the same object.
 
  userData - optional data that will be