Re: [vdsm] pep8 issue
On 13/05/14 17:22 -0300, Amador Pahim wrote:

Building vdsm/master in F20, I've got:

./vdsm/virt/migration.py:223:19: E225 missing whitespace around operator

In vdsm/virt/migration.py:

218 e.err = (libvirt.VIR_ERR_OPERATION_ABORTED, # error code$
219 libvirt.VIR_FROM_QEMU, # error domain$
220 'operation aborted',# error message$
221 libvirt.VIR_ERR_WARNING,# error level$
222 '', '', '', # str1, str2, str3,$
223 -1, -1) # int1, int2$
224 raise e$

pep8 is not accepting the negative integer. Instead, it is treating the minus sign as an operator. A quick workaround is to change -1 to int(-1). Is this a known issue?

I found this one too and am planning to submit the same workaround. Actually you can just do (-1) without int().

--
Adam Litke
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
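For reference, the workarounds discussed above all yield the same value; only the way the style checker reads the minus sign differs. The following is a minimal sketch that uses placeholder constants in place of the real libvirt ones (the numbers are made up for illustration and are not libvirt's), so it runs without libvirt installed.

```python
# Placeholder values standing in for the libvirt constants in migration.py.
# They are illustrative only and are NOT the real libvirt numbers.
VIR_ERR_OPERATION_ABORTED = 1
VIR_FROM_QEMU = 2
VIR_ERR_WARNING = 3


def build_err(int1, int2):
    """Build an error tuple laid out like the one in migration.py."""
    return (VIR_ERR_OPERATION_ABORTED,  # error code
            VIR_FROM_QEMU,              # error domain
            'operation aborted',        # error message
            VIR_ERR_WARNING,            # error level
            '', '', '',                 # str1, str2, str3
            int1, int2)                 # int1, int2


# The literal as originally written, then the two proposed workarounds.
plain = build_err(-1, -1)
wrapped = build_err(int(-1), int(-1))
parenthesized = build_err((-1), (-1))
```

Since `(-1)` is just a parenthesized literal, it avoids the extra `int()` call while still keeping the checker from reading the minus sign as a binary operator.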
Re: [vdsm] FW: Fwd: Question about MOM
On 26/03/14 03:50 -0700, Chegu Vinod wrote:

removing the email alias

Restoring the email alias. Please keep discussions as public as possible to allow others to contribute to the design and planning.

Jason. Please see below...

On 3/26/2014 1:38 AM, Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC) wrote:

Hi All,

Following the discussion below, I got these points:

1. How MOM gathers NUMA information (topology, statistics...) will change in the future. (One side uses the VDSM API, the other side uses libvirt and the system API.)

I didn't follow your sentence. Please work with Adam/Martin and provide the needed APIs on the VDSM side so that the MOM thread can use them to extract what it needs about NUMA topology and cpu/memory usage. As I see it, this is probably the only piece that needs to be made available at the earliest (preferably in oVirt 3.5), and that would enable MOM to pursue next steps as they see fit. Beyond that, at this point (for oVirt 3.5), let us not spend more time on MOM internals please. Let us leave that to Adam and Martin to pursue as/when they see fit.

2. Martin and Adam will take a look at the MOM policy in the oVirt scheduler when the NUMA feature is turned on.

Yes please.

3. oVirt engine will have a NUMA-aware placement algorithm to make the VM run within NUMA nodes in the best way.

The algorithm here is decided by user-specified pinning requests and/or by the oVirt scheduler. In the case of a user request (upon approval from the oVirt scheduler), VDSM -> libvirt will be explicitly told what to do via numatune/cputune etc. In the absence of a user-specified pinning request, I don't know if the oVirt scheduler intends to convey numatune/cputune type requests to libvirt...

4. oVirt engine will have an algorithm to automatically configure virtual NUMA when a big VM (large memory or many vcpus) is created.

This is a good suggestion but in my view it should be taken up after oVirt 3.5. For now just accept and process the user-specified requests...

5. Investigate whether KSM and memory ballooning have the right tuning parameters when the NUMA feature is turned on.

That is for Adam/Martin et al., not for your specific project. We just need to ensure that they have the basic NUMA info they need (via the VDSM API I mentioned above) so that they can work on their part independently as/when they see fit.

6. Investigate whether Automatic NUMA balancing is keeping the processes reasonably balanced, and notify oVirt engine.

Not sure I follow what you are saying... Here is what I have in mind: check if the target host has Automatic NUMA balancing (you can use sysctl -a | grep numa_balancing or a similar underlying mechanism to determine this). If it's present, then check whether it's enabled or not (a value of 1 is enabled and 0 is disabled), and convey this information to the oVirt engine GUI for display (this is a hint for users, if they wish, to skip manual pinning). This in my view is the minimum at this point (and it would be great if we can make it happen for oVirt 3.5).

I think since we have vdsm you can choose to enable autonuma always (when it is present). Are there any drawbacks to enabling it always?

We can discuss (at some later point, i.e. post oVirt 3.5) whether we should really provide a way for the user to disable Automatic NUMA balancing. Changing the other numa balancing tunables is just not going to happen as far as I can see at this point (so let us not worry about that right now).

7. Investigate whether libvirt has any NUMA tuning APIs.

No. There is nothing to investigate here, IMO. libvirt should not be playing with the host-wide NUMA settings. Please feel free to correct me if I am missing something.

See above.

BTW, I think there is no point in the oVirt 3.5 release, am I right?

If you are referring to just the MOM stuff, then with the exception of my comment about having an appropriate API on the VDSM side for enabling MOM, there is nothing else.
Vinod

Best Regards,
Jason Liao

-----Original Message-----
From: Vinod, Chegu
Sent: March 21, 2014 21:32
To: Adam Litke
Cc: Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC); vdsm-devel; Martin Sivak; Gilad Chaplik; Liang, Shang-Chun (David Liang, HPservers-Core-OE-PSC); Shi, Xiao-Lei (Bruce, HP Servers-PSC-CQ); Doron Fediuck
Subject: Re: FW: Fwd: Question about MOM

On 3/21/2014 6:13 AM, Adam Litke wrote:

On 20/03/14 18:03 -0700, Chegu Vinod wrote:

On 3/19/2014 11:01 PM, Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC) wrote:

Add Vinod in this thread.

Best Regards,
Jason Liao

-----Original Message-----
From: Adam Litke [mailto:ali...@redhat.com]
Sent: March 19, 2014 21:23
To: Doron Fediuck
Cc: vdsm-devel; Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC); Martin Sivak; Gilad Chaplik; Liang, Shang-Chun (David Liang, HPservers-Core-OE-PSC); Shi, Xiao-Lei (Bruce, HP Servers-PSC-CQ)
Subject: Re: Fwd: Question about MOM

On 19/03/14 05:50 -0400, Doron Fediuck wrote
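The Automatic NUMA balancing check suggested in this message (is the knob present, and if so is it 1 or 0) might be sketched as below. This is only an illustration: it assumes the sysctl is exposed as the file /proc/sys/kernel/numa_balancing, and the function name is hypothetical rather than anything in vdsm.

```python
import os

# Assumed location of the kernel.numa_balancing sysctl on the host.
NUMA_BALANCING_PATH = "/proc/sys/kernel/numa_balancing"


def numa_balancing_state(path=NUMA_BALANCING_PATH):
    """Report the Automatic NUMA balancing state of this host.

    Returns None when the knob does not exist (feature not present),
    True when the value is 1 (enabled), and False otherwise.
    """
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip() == "1"
```

A caller on the vdsm side could report the three-way result to the engine so the UI can tell "not supported" apart from "supported but disabled".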
Re: [vdsm] FW: Fwd: Question about MOM
On 20/03/14 18:03 -0700, Chegu Vinod wrote: On 3/19/2014 11:01 PM, Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC) wrote: Add Vinod in this thread. Best Regards, Jason Liao -Original Message- From: Adam Litke [mailto:ali...@redhat.com] Sent: 2014年3月19日 21:23 To: Doron Fediuck Cc: vdsm-devel; Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC); Martin Sivak; Gilad Chaplik; Liang, Shang-Chun (David Liang, HPservers-Core-OE-PSC); Shi, Xiao-Lei (Bruce, HP Servers-PSC-CQ) Subject: Re: Fwd: Question about MOM On 19/03/14 05:50 -0400, Doron Fediuck wrote: Moving this to the vdsm list. - Forwarded Message - From: Chuan Liao (Jason Liao, HPservers-Core-OE-PSC) chuan.l...@hp.com To: Martin Sivak msi...@redhat.com, ali...@redhat.com, Doron Fediuck dfedi...@redhat.com, Gilad Chaplik gchap...@redhat.com Cc: Shang-Chun Liang (David Liang, HPservers-Core-OE-PSC) shangchun.li...@hp.com, Xiao-Lei Shi (Bruce, HP Servers-PSC-CQ) xiao-lei@hp.com Sent: Wednesday, March 19, 2014 11:28:01 AM Subject: Question about MOM Hi All, I am a new with MOM feature. In my understanding, MOM is the collector both from host and guest and set the right policy to KSM and memory ballooning get better performance. Yes this is correct. In oVirt, MOM runs as another vdsm thread and uses the vdsm API to collect host and guest statistics. Those statistics are fed into a policy file which can create some outputs (such as ksm tuning parameters and guest balloon sizes). MOM then uses the vdsm API to apply those outputs to the system. Ok..Understood about the statistics gathering part and then initiating policy driven inputs for the ksm and balloning on the host etc. Perhaps this was already discussed earlier ? Does the MOM thread in vdsm intend to gather the NUMA topology of the host from the VDSM (using some new TBD or some enhanced existing API) or does it intend to collect this directly from the host using libvirt/libnuma etc ? 
When MOM is using the VDSM HypervisorInterface, it must get all of its information from vdsm. It is considered an API layering violation for MOM to access the system or libvirt connection directly. When running with the Libvirt HypervisorInterface, it should use libvirt and the system directly as necessary. Your new features should consider this and make use of the HypervisorInterface abstraction to provide both implementations. I am not sure how it has relationship with NUMA, does anyone can explain it to me? Jason, Here is my understanding (and I believe I am just paraphrasing/echoing Adam's comments ). MOM's NUMA related enhancements are independent of what the oVirt UI/oVirt scheduler does. It is likely that MOM's vdsm thread may choose to extract information about NUMA topology (includes dynamic stuff like cpu usage or free memory) from the VDSM (i.e. if they choose to not get it directly from libvirt/libnuma or /proc etc). How MOM interprets that NUMA information along with other statistics that it gathers (along side with user requested SLA requirements for each guest etc) should be left to MOM to decide and direct KSM/ballooning related actions. I don't believe we need to intervene in the MOM related internals. Once we decide to have NUMA-aware MOM policies there will need to be some infrastructure enhancements to enable it. I think Martin and I will take the lead on it since we have been thinking about these kinds of issues for some time now. I guess we need to start by examining the currently planned use cases. Please feel free to correct me if I am missing something or over-simplifying something: 1) NUMA-aware placement - Try to schedule VMs to run on hosts where the guest will not have to span multiple NUMA nodes. I guess you are referring to the case where the user (and/or the oVirt scheduler) has not explicitly directed libvirt on the host to schedule the VM in some specific way... 
In those cases the decision is left to the smarts of the host OS scheduler to take care of it (that includes the future/smarter Automatic NUMA balancing enabled scheduler). Yes. For this one, we need a numa-aware placement algorithm on engine, and the autonuma feature available and configured on all virt hosts. In the first phase I don't anticipate any changes to MOM internals. I would prefer to observe the performance characteristics of this and tweak MOM in the future to address actual performance problems we see. 2) Virtual NUMA topology - Emulate a NUMA topology inside the VM. Yes. Irrespective of any NUMA specified for the backing resources of a guest...when the guest size increases it is a required practice to have virtual NUMA topology enabled. (This helps the OS running inside the guest to scale/perform much by making NUMA aware decisions etc. Also it helps the applications running in the OS to scale/perform better). Agreed. One point I might make then... Should the VM creation process on engine automatically configure virtual NUMA (even if the user doesn't
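The layering rule stated earlier in this message (MOM reaches NUMA data only through its HypervisorInterface, with one implementation backed by vdsm and another by libvirt and the host) could be sketched as follows. The class and method names here, including getNumaTopology, are hypothetical and are not taken from the MOM or vdsm sources.

```python
from abc import ABC, abstractmethod


class HypervisorInterface(ABC):
    """Hypothetical abstraction; the method name is illustrative only."""

    @abstractmethod
    def get_numa_topology(self):
        """Return {node_id: {"cpus": [...], "mem_free_kb": int}}."""


class VdsmInterface(HypervisorInterface):
    """Gets everything from the vdsm API, never from the host directly."""

    def __init__(self, vdsm_api):
        self._api = vdsm_api

    def get_numa_topology(self):
        # Assumed vdsm verb; no such call is confirmed by the thread.
        return self._api.getNumaTopology()


class LibvirtInterface(HypervisorInterface):
    """Standalone mode: free to read libvirt or the host directly."""

    def __init__(self, read_host_numa):
        # read_host_numa is any callable that inspects the host.
        self._read_host_numa = read_host_numa

    def get_numa_topology(self):
        return self._read_host_numa()


def nodes_with_free_memory(hypervisor, min_free_kb):
    """Policy-side helper that only ever sees the abstraction."""
    topology = hypervisor.get_numa_topology()
    return sorted(node for node, info in topology.items()
                  if info["mem_free_kb"] >= min_free_kb)
```

Policy code written against the abstract class stays the same whichever backend is in use, which is the point of providing both implementations.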
Re: [vdsm] Fwd: Question about MOM
On 19/03/14 05:50 -0400, Doron Fediuck wrote:

Moving this to the vdsm list.

- Forwarded Message -
From: Chuan Liao (Jason Liao, HPservers-Core-OE-PSC) chuan.l...@hp.com
To: Martin Sivak msi...@redhat.com, ali...@redhat.com, Doron Fediuck dfedi...@redhat.com, Gilad Chaplik gchap...@redhat.com
Cc: Shang-Chun Liang (David Liang, HPservers-Core-OE-PSC) shangchun.li...@hp.com, Xiao-Lei Shi (Bruce, HP Servers-PSC-CQ) xiao-lei@hp.com
Sent: Wednesday, March 19, 2014 11:28:01 AM
Subject: Question about MOM

Hi All,

I am new to the MOM feature. In my understanding, MOM is the collector for both host and guest, and sets the right policy for KSM and memory ballooning to get better performance.

Yes this is correct. In oVirt, MOM runs as another vdsm thread and uses the vdsm API to collect host and guest statistics. Those statistics are fed into a policy file which can create some outputs (such as ksm tuning parameters and guest balloon sizes). MOM then uses the vdsm API to apply those outputs to the system.

I am not sure how it relates to NUMA. Can anyone explain it to me?

I guess we need to start by examining the currently planned use cases. Please feel free to correct me if I am missing something or over-simplifying something:

1) NUMA-aware placement - Try to schedule VMs to run on hosts where the guest will not have to span multiple NUMA nodes.

2) Virtual NUMA topology - Emulate a NUMA topology inside the VM.

These two use cases are intertwined because VMs with NUMA can be scheduled with more flexibility (albeit with more sophistication) since the scheduler can fit the VM onto hosts where the memory can be split across multiple Host NUMA nodes.

3) Manual NUMA pinning - Allow advanced admins to schedule a VM to run on a specific host with a manual pinning strategy.

Most of these use cases involve the engine scheduler and engine UI. There is not much for MOM to do to support their direct implementation. We should focus on managing interactions with other SLA features that MOM does implement:

- How should KSM be adjusted when NUMA is in effect? In a NUMA host, are there numa-aware KSM tunables that we should use?
- When ballooning VMs, should we take into account how much memory we need to reclaim from VMs on a node by node basis?

Lastly, let's see if MOM needs to manage the existing NUMA utilities in place on the system. I don't know much about AutoNUMA. Does it have tunables that should be adjusted or is it completely autonomous? Does libvirt have any NUMA tuning APIs that MOM may want to call to enhance performance in certain situations?

One of the main questions I ask when trying to decide if MOM should manage a particular setting is: Is this something that is set once and stays the same or is it something that must change dynamically in accordance with current system conditions? In the former case, it is probably best managed by engine or vdsm directly. In the latter case, it fits the MOM model.

Hope this was helpful! Please feel free to continue engaging this list with any additional questions that you might have.

On the engine side, there is only one button for this feature: Sync MoM Policy, right? On the vdsm side, I saw that momIF is working for this, right?

Best Regards,
Jason Liao

--
Adam Litke
Re: [vdsm] mom RPMs for 3.4
On 01/02/14 22:48 +, Dan Kenigsberg wrote:

On Fri, Jan 31, 2014 at 04:56:12PM -0500, Adam Litke wrote:

On 31/01/14 08:36 +0100, Sandro Bonazzola wrote:

On 30/01/2014 19:30, Adam Litke wrote:

On 30/01/14 18:13 +, Dan Kenigsberg wrote:

On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote:

Hi Sandro,

After updating the MOM project's build system, I have used jenkins to produce a set of RPMs that I would like to tag into the oVirt 3.4 release. Please see the jenkins job [1] for the relevant artifacts for EL6 [2], F19 [3], and F20 [4].

Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0? I want to be careful to not break people's environments this late in the 3.4 release cycle. What is the best way to minimize that damage?

Hey, we're in beta. I prefer making this requirement explicit now over having users' supervdsmd.log rotate due to log spam.

In that case, Sandro, can you let me know when those RPMs hit the ovirt repos (for master and 3.4) and then I will submit a patch to vdsm to require the new version.

mom 0.4.0 was built in last night's nightly job [1] and published to nightly by the publisher job [2], so it's already available on nightly [3]. For 3.4.0, a beta 2 release has been planned [4] for 2014-02-06, so we'll include your builds in that release.

I presume the scripting for 3.4 release rpms will produce a version without the git-rev based suffix, i.e. mom-0.4.0-1.rpm?

I need to figure out how to handle a problem that might be a bit unique to mom. MOM is used by non-oVirt users who install it from the main Fedora repository. I think it's fine that we are producing our own rpms in oVirt (that may have additional patches applied and may resync to upstream mom code more frequently than would be desired for the main Fedora repository). Given this, I think it makes sense to tag the oVirt RPMs with a special version suffix to indicate that these are oVirt produced and not upstream Fedora.

For example:
The next Fedora update will be mom-0.4.0-1.f20.rpm.
The next oVirt update will be mom-0.4.0-1ovirt.f20.rpm.

Is this the best practice for accomplishing my goals? One other thing I'd like to have the option of doing is to make vdsm depend on an ovirt distribution of mom so that the upstream Fedora version will not satisfy the dependency for vdsm.

What is the motivation for this? You would not like to bother Fedora users with updates that are required only for oVirt?

Yes, that was my thinking. It seems that oVirt requires updates more frequently than users that use MOM with libvirt directly, and the Fedora update process is a bit heavier than oVirt's at the moment.

Vdsm itself is built, signed, and distributed via Fedora. It is also copied into the ovirt repo, for completeness' sake. Could MoM do the same?

If vdsm is finding this to work well then surely I can do the same with MOM. The 0.4.0 build is in updates-testing right now and should be able to be tagged stable in a day or two.
Re: [vdsm] mom RPMs for 3.4
On 31/01/14 08:36 +0100, Sandro Bonazzola wrote:

On 30/01/2014 19:30, Adam Litke wrote:

On 30/01/14 18:13 +, Dan Kenigsberg wrote:

On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote:

Hi Sandro,

After updating the MOM project's build system, I have used jenkins to produce a set of RPMs that I would like to tag into the oVirt 3.4 release. Please see the jenkins job [1] for the relevant artifacts for EL6 [2], F19 [3], and F20 [4].

Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0? I want to be careful to not break people's environments this late in the 3.4 release cycle. What is the best way to minimize that damage?

Hey, we're in beta. I prefer making this requirement explicit now over having users' supervdsmd.log rotate due to log spam.

In that case, Sandro, can you let me know when those RPMs hit the ovirt repos (for master and 3.4) and then I will submit a patch to vdsm to require the new version.

mom 0.4.0 was built in last night's nightly job [1] and published to nightly by the publisher job [2], so it's already available on nightly [3]. For 3.4.0, a beta 2 release has been planned [4] for 2014-02-06, so we'll include your builds in that release.

I presume the scripting for 3.4 release rpms will produce a version without the git-rev based suffix, i.e. mom-0.4.0-1.rpm?

I need to figure out how to handle a problem that might be a bit unique to mom. MOM is used by non-oVirt users who install it from the main Fedora repository. I think it's fine that we are producing our own rpms in oVirt (that may have additional patches applied and may resync to upstream mom code more frequently than would be desired for the main Fedora repository). Given this, I think it makes sense to tag the oVirt RPMs with a special version suffix to indicate that these are oVirt produced and not upstream Fedora.

For example:
The next Fedora update will be mom-0.4.0-1.f20.rpm.
The next oVirt update will be mom-0.4.0-1ovirt.f20.rpm.

Is this the best practice for accomplishing my goals? One other thing I'd like to have the option of doing is to make vdsm depend on an ovirt distribution of mom so that the upstream Fedora version will not satisfy the dependency for vdsm.

Thoughts?
[vdsm] mom RPMs for 3.4
Hi Sandro,

After updating the MOM project's build system, I have used jenkins to produce a set of RPMs that I would like to tag into the oVirt 3.4 release. Please see the jenkins job [1] for the relevant artifacts for EL6 [2], F19 [3], and F20 [4].

Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0? I want to be careful to not break people's environments this late in the 3.4 release cycle. What is the best way to minimize that damage?

[1] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/
[2] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=centos6-host/artifact/exported-artifacts/mom-0.4.0-1.el6.noarch.rpm
[3] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=fedora19-host/artifact/exported-artifacts/mom-0.4.0-1.fc19.noarch.rpm
[4] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=fedora20-host/artifact/exported-artifacts/mom-0.4.0-1.fc20.noarch.rpm
Re: [vdsm] [Engine-devel] Copy reviewer scores on trivial rebase/commit msg changes
On 18/01/14 01:48 +0200, Itamar Heim wrote:

I'd like to enable these - comments welcome:

1. label.Label-Name.copyAllScoresOnTrivialRebase

If true, all scores for the label are copied forward when a new patch set is uploaded that is a trivial rebase. A new patch set is considered a trivial rebase if the commit message is the same as in the previous patch set and if it has the same code delta as the previous patch set. This is the case if the change was rebased onto a different parent. This can be used to enable sticky approvals, reducing turn-around for trivial rebases prior to submitting a change. Defaults to false.

2. label.Label-Name.copyAllScoresIfNoCodeChange

If true, all scores for the label are copied forward when a new patch set is uploaded that has the same parent commit as the previous patch set and the same code delta as the previous patch set. This means only the commit message is different. This can be used to enable sticky approvals on labels that only depend on the code, reducing turn-around if only the commit message is changed prior to submitting a change. Defaults to false.

I am a bit late to the party but +1 from me for trying both. I guess it will be quite rare that something bad happens here. So rare that the time saved on all the other patches will far outweigh the time lost fixing the corner cases.
Re: [vdsm] oVirt 3.4.0 alpha repository closure failure
On 10/01/14 10:01 +, Dan Kenigsberg wrote:

On Fri, Jan 10, 2014 at 08:48:52AM +0100, Sandro Bonazzola wrote:

Hi,
oVirt 3.4.0 alpha repository has been composed but alpha has not been announced due to repository closure failures:

on CentOS 6.5:

# repoclosure -r ovirt-3.4.0-alpha -l ovirt-3.3.2 -l base -l epel -l glusterfs-epel -l updates -l extra -l glusterfs-noarch-epel -l ovirt-stable -n
Reading in repository metadata - please wait
Checking Dependencies
Repos looked at: 8
   base
   epel
   glusterfs-epel
   glusterfs-noarch-epel
   ovirt-3.3.2
   ovirt-3.4.0-alpha
   ovirt-stable
   updates
Num Packages in Repos: 16581
package: mom-0.3.2-20140101.git2691f25.el6.noarch from ovirt-3.4.0-alpha
  unresolved deps:
     procps-ng

Adam, this seems like a real bug in http://gerrit.ovirt.org/#/c/22087/ : el6 still carries the older procps (which is, btw, provided by procps-ng).

Done. http://gerrit.ovirt.org/23137

package: vdsm-hook-vhostmd-4.14.0-1.git6fdd55f.el6.noarch from ovirt-3.4.0-alpha
  unresolved deps:
     vhostmd

Douglas, could you add a with_vhostmd option to the spec, and have it default to 0 on el*, and to 1 on fedoras?

Thanks,
Dan.
Re: [vdsm] Smarter network_setup hooks
On 03/01/14 12:20 +, Dan Kenigsberg wrote:

Recently, Miguel Angel Ajo (CCed) has added a nice piece of functionality to the implementation of setupNetworks in Vdsm: two hook points were added, before and after the setupNetworks verb takes place.

This is useful because sometimes, Vdsm's configuration is not good enough for the user. For example, someone may need to set various ETHTOOL_OPTS on a nic. Now, they can put a script under /usr/libexec/vdsm/after_network_setup/ that tweaks their ifcfg-eth* files after they have been written by Vdsm.

However, the hook script only knows that *a* change of network configuration took place. It does not know which change took place, and has to figure this out on its own.

Enter http://gerrit.ovirt.org/20330 (allow hooks to pass down dictionaries in json format). I'd like to discuss it here, as it introduces a new Vdsm/Hook API that is quite different from what we have for other hooks.

Unlike with Vm and VmDevice creation, where Vdsm uses libvirt's xml definition internally as well as to communicate with the hooks, before/after_network_setup have to define their own means of communication.

I would like to suggest using the same information passed on the Engine/Vdsm API, and extending its reach into the hook script. The three arguments to setupNetworks(networks, bondings, options) would be dumped as json strings, to be read by the hook script.

This option is very simple to use and implement, it gives the hook all the information that Vdsm-proper has, and allows for the greatest flexibility for hook writers. This is also the downside of this idea: hook scripts may do all kinds of things with this information, some of them unsupportable, and they should be notified when the Engine/Vdsm API changes. In my opinion, it is a small price to pay: hooks have always had the China Store Rule - if you break something, you own it. Hook users must know what they're doing, and take care not to use deprecated bits of the API.

What is your opinion? Comments and suggestions are most welcome!

Seems like a logical thing to do. What specific mechanism do you suggest for passing the JSON strings to the hook script? If passed as arguments to the hook script we would need to consider shell escaping and argv length restrictions. What about writing these out to a special file and adding a new getContext() call to the hooking module? A script that is unconcerned with the context would not require any changes. But a script that wants access would simply do:

    ctx = hooking.getContext()

and ctx would be the contents of the special file already decoded into a native Python object for easy consumption. This could easily be extended to any hook which may want to provide some context to implementors.

One more question comes to mind: Are there any pieces of information that we would need to redact from the context (passwords or other sensitive information)?
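The file-based getContext() idea proposed in this reply might look roughly like the sketch below. It assumes the file path is handed to the hook through an environment variable; the variable name, the write_context helper, and the JSON layout are all hypothetical and are not part of the hooking module discussed in the thread.

```python
import json
import os
import tempfile

# Hypothetical name of the environment variable carrying the file path.
CONTEXT_ENV = "_hook_context"


def write_context(networks, bondings, options):
    """Vdsm side: dump the setupNetworks arguments to a private temp file.

    mkstemp creates the file readable by its owner only, which matters
    if the context ever carries sensitive values.
    """
    fd, path = tempfile.mkstemp(prefix="hook_ctx_", suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump({"networks": networks,
                   "bondings": bondings,
                   "options": options}, f)
    return path


def getContext(environ=None):
    """Hook side: return the decoded context, or None if none was passed."""
    if environ is None:
        environ = os.environ
    path = environ.get(CONTEXT_ENV)
    if not path:
        return None
    with open(path) as f:
        return json.load(f)
```

Passing only a path keeps the data off the command line, so shell escaping and argv length limits do not come into play, and hooks that ignore the variable keep working unchanged.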
Re: [vdsm] VDSM - top 10 with patches with no activity for more than 30 days
On Thu, 2013-02-28 at 12:51 -0500, Doron Fediuck wrote:

- Original Message -
From: Itamar Heim ih...@redhat.com
To: vdsm-devel@lists.fedorahosted.org
Sent: Wednesday, February 20, 2013 5:39:21 PM
Subject: [vdsm] VDSM - top 10 with patches with no activity for more than 30 days

thoughts on how to trim these? (in openstack gerrit they auto-abandon patches with no activity for a couple of weeks - author can revive them back when they are relevant)

 preferred_email             | count
-----------------------------+-------
 fsimo...@redhat.com         |    34
 smizr...@redhat.com         |    23
 lvro...@linux.vnet.ibm.com  |    13
 ewars...@redhat.com         |    12
 wu...@linux.vnet.ibm.com    |    12
 x...@linux.vnet.ibm.com     |    11
 shao...@linux.vnet.ibm.com  |     6
 li...@linux.vnet.ibm.com    |     6
 zhshz...@linux.vnet.ibm.com |     6
 shum...@linux.vnet.ibm.com  |     5

Review day? Does anyone think a monthly review day will help?

We've discussed this in the past and part of the reason for the backlog is that folks like Saggi and Federico like to use gerrit to store work-in-progress patches that don't need review. They may not be working on those patches at the moment but want them in gerrit so they can come back to them. If we want to allow this use of gerrit then we will always have some stale patches lying around.

--
Adam Litke a...@linux.vnet.ibm.com
IBM Linux Technology Center
Re: [vdsm] [yajsonrpc]questions about json rpc
On Thu, Feb 21, 2013 at 06:10:35PM +0800, ShaoHe Feng wrote:

Hi, Adam

An error arises when I call the json rpc server through AsyncoreReactor. I can call the json rpc server successfully through a simple TCPReactor that I wrote myself.

How can I call json rpc through AsyncoreReactor correctly?

    address = ("127.0.0.1", 4044)
    clientsReactor = asyncoreReactor.AsyncoreReactor()
    reactor = TestClientWrapper(clientsReactor.createClient(address))
    jsonAPI = JsonRpcClient(reactor)
    jsonAPI.connect()
    jsonAPI.callMethod("Host.ping", [], 1, 10)

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib64/python2.7/site-packages/yajsonrpc/client.py", line 39, in callMethod
        resp = self._transport.recv(timeout=timeout)
      File "/usr/share/vdsm/tests/jsonRpcUtils.py", line 100, in recv
        return self._queue.get(timeout=timeout)[1]
      File "/usr/lib64/python2.7/Queue.py", line 176, in get
        raise Empty
    Queue.Empty

Sheldon,

You and I resolved this problem but I will answer it here as well for the benefit of everyone. When using the Asyncore framework, there is a reactor on the server but also on the client. Asyncore is multi-threaded and an event loop must be started for the client reactor in order to process the server responses. See tests/jsonRpcUtils.py:43 for the call to initialize the event loop thread in the client reactor.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
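Why the missing event loop ends in Queue.Empty can be shown with a toy model. The sketch below is not the yajsonrpc or asyncoreReactor API: ToyReactor and start_event_loop are hypothetical stand-ins, written for Python 3, that only mimic the shape of the problem (responses reach the caller's queue only while a loop thread is running).

```python
import queue
import threading


class ToyReactor(object):
    """Stand-in for a client reactor; not the yajsonrpc API."""

    def __init__(self):
        self._incoming = queue.Queue()   # raw events "from the network"
        self.responses = queue.Queue()   # what a callMethod-style caller waits on
        self._stop = threading.Event()

    def deliver(self, message):
        """Simulate a server response arriving on the socket."""
        self._incoming.put(message)

    def process_requests(self):
        """Event loop: hand incoming events over to the response queue."""
        while not self._stop.is_set():
            try:
                message = self._incoming.get(timeout=0.05)
            except queue.Empty:
                continue
            self.responses.put(message)

    def stop(self):
        self._stop.set()


def start_event_loop(reactor):
    """Run the reactor's event loop in a background daemon thread."""
    thread = threading.Thread(target=reactor.process_requests)
    thread.daemon = True
    thread.start()
    return thread
```

Without a call to start_event_loop, a `responses.get(timeout=...)` raises `queue.Empty` even though the reply has already been delivered, which mirrors the traceback above.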
Re: [vdsm] VDSM Repository Reorganization
On Tue, Feb 19, 2013 at 03:53:46PM -0500, Saggi Mizrahi wrote:

I'm not sure what's the purpose of having different versions of the client/server on the same machine. The software repository is one and it should provide both (as they're built from the same source). This is the standard way of delivering client/server applications in all the distributions. We can change that but we must have a good reason.

There isn't really a reason. But, as I said, you don't want them to depend on each other or have the schema in its own rpm. This means that you have to distribute them separately.

I also want to allow updating the client on a host without updating the server. This is because you may want to have a script that works across the cluster but not update all the hosts.

Now, even though you will use only old methods, the schema itself might become unparsable by old implementations.

This should never happen. Right now each symbol in the schema is represented by a single OrderedDict and the parsing code just loads the schema file into a list of these dicts. Once loaded, the vdsmapi module categorizes symbols according to the top-level keys. Unrecognized symbol types are simply skipped.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
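The forward-compatibility argument above (symbols are categorized by their top-level key and unknown kinds are skipped) can be illustrated with a short sketch. This is not the vdsmapi module itself: the set of known kinds and the symbol contents below are assumptions chosen for the example.

```python
from collections import OrderedDict

# Assumed subset of symbol kinds that an older parser knows about.
KNOWN_KINDS = ("command", "type", "enum")


def categorize(symbols, known_kinds=KNOWN_KINDS):
    """Group schema symbols by kind, keyed by the symbol's name.

    Each symbol is a dict whose identifying top-level key names its kind.
    A symbol with no recognized kind is skipped rather than rejected, so a
    newer schema still loads in an older parser.
    """
    by_kind = dict((kind, OrderedDict()) for kind in known_kinds)
    for symbol in symbols:
        for kind in known_kinds:
            if kind in symbol:
                by_kind[kind][symbol[kind]] = symbol
                break
    return by_kind
```

An old client reading a newer schema therefore sees every command and type it already understands and ignores the rest.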
Re: [vdsm] VDSM Repository Reorganization
In order to make progress on the file reorg, I want to summarize the discussion and propose that a consensus has been reached regarding placement of the schema file.

The current code has a routine find_schema() that can locate the schema file in the development source tree or in an installed location. Therefore, it only needs to appear in the source tree in a single location and we will not need any symlinks for this purpose.

Recently, the API handling code (schema and parsing module) has been split into its own rpm. This should solve the installation problem since any entity that needs access to the schema and parser should simply install the vdsm-api rpm.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
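A lookup of the kind described above, trying the source tree first and then an installed location, might be sketched like this. It is an illustration only and not the actual find_schema() from vdsm: the signature, the default file name, and the example directories in the usage note are assumptions.

```python
import os


def find_schema(search_dirs, filename="vdsmapi-schema.json"):
    """Return the first existing schema path among search_dirs.

    search_dirs is ordered by preference, e.g. the development source
    tree first and the installed location second.
    """
    tried = []
    for directory in search_dirs:
        path = os.path.join(directory, filename)
        if os.path.exists(path):
            return path
        tried.append(path)
    raise IOError("schema file not found, tried: %s" % ", ".join(tried))
```

A caller might pass the directory of the running module followed by an installed data directory (both hypothetical here), so the same code works from a checkout and from the rpm without any symlinks.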
Re: [vdsm] May I apply for a user account on jenkins.ovirt.org to run VDSM functional tests?
On Tue, Jan 29, 2013 at 12:21:46PM +0100, Ewoud Kohl van Wijngaarden wrote: On Tue, Jan 29, 2013 at 06:15:08AM -0500, Eyal Edri wrote: - Original Message - From: Zhou Zheng Sheng zhshz...@linux.vnet.ibm.com To: in...@ovirt.org Cc: ShaoHe Feng shao...@linux.vnet.ibm.com Sent: Tuesday, January 29, 2013 12:24:27 PM Subject: May I apply for a user account on jenkins.ovirt.org to run VDSM functional tests? Hi all, I notice there are no VDSM functional tests running in oVirt Jenkins. Currently in VDSM we have some XML-RPC functional test cases for iSCSI, localfs and glusterfs storage as well as creating and destroying VMs on that storage. Functional tests through JSON-RPC are under review. I also submitted a patch to Gerrit for running the tests easily (http://gerrit.ovirt.org/#/c/11238/). More test cases will be added to improve test coverage and reduce the chance of regression. Some bugs that cannot be covered by unit tests can be caught by functional tests. I think it would be helpful to run these functional tests continuously. We can also configure the Gerrit trigger in Jenkins to run functional tests when someone verifies the patch or when it gets approved but not merged. This may be helpful to the maintainer. I've set up a Jenkins job for VDSM functional tests on my lab server. You can refer to the job configuration of my current setup (https://github.com/edwardbadboy/vdsm-jenkins/blob/master/config.xml). After my patch in Gerrit is accepted, the job configuration will be simpler and the hacks can be removed. May I apply for a user account for creating jobs in the oVirt Jenkins? Hi Zhou, Basically there shouldn't be any problem with that. We have an option for giving 'power-user' permissions to certain users on oVirt misc projects to add and configure jobs for their project.
It requires knowledge of Jenkins, which it seems that you have, and recognition from the team/other developers from the relevant project (in this case, VDSM) that you are an active member of the project (just a formality, essentially). I've added engine-devel list to this thread so anyone from vdsm team can vote +1 for adding you as a power user for jenkins. Once we receive a few +1s and no objections I'll create a user for you and send you the details. I think vdsm-devel is more relevant here. Also a big +1 from me. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] [Engine-devel] RFC: New Storage API
On Tue, Jan 22, 2013 at 11:36:57PM +0800, Shu Ming wrote: 2013-1-15 5:34, Ayal Baron: image and volume are overused everywhere and it would be extremely confusing to have multiple meanings to the same terms in the same system (we have image today which means virtual disk and volume which means a part of a virtual disk). Personally I don't like the distinction between image and volume done in ec2/openstack/etc seeing as they're treated as different types of entities there while the only real difference is mutability (images are read-only, volumes are read-write). To move to the industry terminology we would need to first change all references we have today to image and volume in the system (I would say also in ovirt-engine side) to align with the new meaning. Despite my personal dislike of the terms, I definitely see the value in converging on the same terminology as the rest of the industry but to do so would be an arduous task which is out of scope of this discussion imo (patches welcome though ;) Another distinction between Openstack and oVirt is how Nova and ovirt-engine look upon storage systems. In Openstack, a stand-alone storage service (Cinder) exports the raw storage block device to Nova. On the other hand, in oVirt, the storage system is tightly bound to the cluster scheduling system, which integrates the storage sub-system, the VM dispatching sub-system and the ISO image sub-system. This combination integrates all of the sub-systems into a whole which is easy to deploy, but it makes the sub-systems more opaque and harder to reuse and maintain. This new storage API proposal gives us an opportunity to separate these sub-systems into new components which export better, loosely coupled APIs to VDSM. A very good point and an important goal in my opinion. I'd like to see ovirt-engine become more of a GUI for configuring the storage component (like it does for Gluster) rather than the centralized manager of storage.
The clustered storage should be able to take care of itself as long as the peer hosts can negotiate the SDM role. It would be cool if someone could actually dedicate a non-virtualization host where its only job is to handle SDM operations. Such a host could choose to only deploy the standalone HSM service and not the complete vdsm package. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] API Documentation Since tag
On Mon, Jan 14, 2013 at 05:45:45PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: vdsm-devel@lists.fedorahosted.org, Vinzenz Feenstra vfeen...@redhat.com Sent: Monday, January 14, 2013 5:21:41 PM Subject: Re: [vdsm] API Documentation Since tag On Mon, Jan 14, 2013 at 12:37:57PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Vinzenz Feenstra vfeen...@redhat.com Cc: vdsm-devel@lists.fedorahosted.org Sent: Friday, January 11, 2013 9:03:19 AM Subject: Re: [vdsm] API Documentation Since tag On Fri, Jan 11, 2013 at 10:19:45AM +0100, Vinzenz Feenstra wrote: Hi everyone, We are currently documenting the API in vdsmapi-schema.json I noticed that we have there documented when a certain element newly is introduced using the 'Since' tag. However I also noticed that we are not documenting when a field was newly added, nor do we update the 'since' tag. We should start documenting in what version we've introduced a field. A suggestion by saggi was to add to the comment for example: @since: 4.10.3 What is your point of view on this? I do think it's a good idea to add this information. How about supporting multiple Since lines in the comment like the following made up example: ## # @FenceNodePowerStatus: # # Indicates the power state of a remote host. # # @on:The remote host is powered on # # @off: The remote host is powered off # # @unknown: The power status is not known # # @sentient: The host is alive and powered by its own metabolism # # Since: 4.10.0 - @FenceNodePowerStatus # Since: 10.2.0 - @sentient ## I don't like the fact that both lines don't point to the same type of token. I also don't like that it's a repeat of the type names and field names. I prefer Vinzenz original suggestion (on IRC) of moving the Since token up and then have it be a state. 
It also makes discerning what entities you can use up to a certain version easier if you make sure to keep them sorted. We can do this because the order of the fields and availability is undetermined (unlike real structs). That is not correct. These structures are parsed into an OrderedDict and the ordering is important (especially for languages like C which might use real structs). The wire format, json, ignores the ordering, further more, for languages like C we can't use actual structs because then we have to bump a major version every time we add a field as the sizeof(struct Foo) changed ## # @FenceNodePowerStatus: # # Indicates the power state of a remote host. # # Since: 4.10.0 # # @on:The remote host is powered on # # @off: The remote host is powered off # # @unknown: The power status is not known # # Since: 10.2.0 # # @sentient: The host is alive and powered by its own metabolism # ## The problem though is that it makes since a property of the fields and not of the struct. This isn't that much of a problem as we can assume the earliest version is the time when the struct was introduced. I don't like this any better than my suggestion. Aside from the fact that field ordering is important (in the data structure itself), this spreads the since information throughout the comment rather than concentrating it in a single place. Well, thinking about it, I don't understand why structs need to have a Since property anyway. Only verbs should have it. Structs are available (by inference) since the earliest call that produces them. All fields in a struct are optional anyway. Old versions wouldn't try and access them, new clients should always assume these fields may not be returned anyway. All _newly_added_ fields must be optional. Fields that are part of the original definition of the type can be required fields. This reminds me that we will need to audit the schema for fields that can be made optional. 
For example, when creating Vm*Device objects, the VmDeviceAddress member can be omitted. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
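The rule that all newly added fields must be optional lends itself to a mechanical check when auditing the schema. The sketch below is one way that could look; it assumes the QAPI-style convention that an optional field name is written with a leading '*', which is my assumption about the schema rather than something stated in this thread, and added_required_fields is a made-up helper name.

```python
def added_required_fields(old_fields, new_fields):
    """Return names of fields added in new_fields that are not optional.

    Assumes an optional field is spelled with a leading '*' (an
    assumption about the schema convention, not a confirmed rule).
    """
    old_names = {name.lstrip('*') for name in old_fields}
    return [name for name in new_fields
            if name.lstrip('*') not in old_names
            and not name.startswith('*')]
```

An empty result means the new definition of a type only adds optional fields relative to the old one, so old clients and servers can keep interoperating.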
Re: [vdsm] API Documentation Since tag
On Mon, Jan 14, 2013 at 12:37:57PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Vinzenz Feenstra vfeen...@redhat.com Cc: vdsm-devel@lists.fedorahosted.org Sent: Friday, January 11, 2013 9:03:19 AM Subject: Re: [vdsm] API Documentation Since tag On Fri, Jan 11, 2013 at 10:19:45AM +0100, Vinzenz Feenstra wrote: Hi everyone, We are currently documenting the API in vdsmapi-schema.json I noticed that we have there documented when a certain element newly is introduced using the 'Since' tag. However I also noticed that we are not documenting when a field was newly added, nor do we update the 'since' tag. We should start documenting in what version we've introduced a field. A suggestion by saggi was to add to the comment for example: @since: 4.10.3 What is your point of view on this? I do think it's a good idea to add this information. How about supporting multiple Since lines in the comment like the following made up example: ## # @FenceNodePowerStatus: # # Indicates the power state of a remote host. # # @on:The remote host is powered on # # @off: The remote host is powered off # # @unknown: The power status is not known # # @sentient: The host is alive and powered by its own metabolism # # Since: 4.10.0 - @FenceNodePowerStatus # Since: 10.2.0 - @sentient ## I don't like the fact that both lines don't point to the same type of token. I also don't like that it's a repeat of the type names and field names. I prefer Vinzenz original suggestion (on IRC) of moving the Since token up and then have it be a state. It also makes discerning what entities you can use up to a certain version easier if you make sure to keep them sorted. We can do this because the order of the fields and availability is undetermined (unlike real structs). That is not correct. These structures are parsed into an OrderedDict and the ordering is important (especially for languages like C which might use real structs). 
##
# @FenceNodePowerStatus:
#
# Indicates the power state of a remote host.
#
# Since: 4.10.0
#
# @on: The remote host is powered on
#
# @off: The remote host is powered off
#
# @unknown: The power status is not known
#
# Since: 10.2.0
#
# @sentient: The host is alive and powered by its own metabolism
#
##
The problem though is that it makes since a property of the fields and not of the struct. This isn't that much of a problem as we can assume the earliest version is the time when the struct was introduced. I don't like this any better than my suggestion. Aside from the fact that field ordering is important (in the data structure itself), this spreads the since information throughout the comment rather than concentrating it in a single place. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] API Documentation Since tag
On Fri, Jan 11, 2013 at 10:19:45AM +0100, Vinzenz Feenstra wrote: Hi everyone, We are currently documenting the API in vdsmapi-schema.json I noticed that we have there documented when a certain element newly is introduced using the 'Since' tag. However I also noticed that we are not documenting when a field was newly added, nor do we update the 'since' tag. We should start documenting in what version we've introduced a field. A suggestion by saggi was to add to the comment for example: @since: 4.10.3 What is your point of view on this? I do think it's a good idea to add this information. How about supporting multiple Since lines in the comment like the following made up example:
##
# @FenceNodePowerStatus:
#
# Indicates the power state of a remote host.
#
# @on: The remote host is powered on
#
# @off: The remote host is powered off
#
# @unknown: The power status is not known
#
# @sentient: The host is alive and powered by its own metabolism
#
# Since: 4.10.0 - @FenceNodePowerStatus
# Since: 10.2.0 - @sentient
##
Remember that any patch to change the schema format will require changes to process-schema.py as well. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
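To give a feel for the parsing change the multiple-Since proposal would imply, here is a small sketch. It is not taken from process-schema.py; parse_since and SINCE_RE are hypothetical names, and the accepted line format is just my reading of the made-up example above.

```python
import re

# Matches "# Since: 4.10.0" and "# Since: 10.2.0 - @sentient".
SINCE_RE = re.compile(r'^#\s*Since:\s*(\S+)(?:\s*-\s*@(\w+))?\s*$')


def parse_since(comment):
    """Map each tagged name to its version; an untagged line is keyed by None."""
    versions = {}
    for line in comment.splitlines():
        match = SINCE_RE.match(line.strip())
        if match:
            version, name = match.groups()
            versions[name] = version
    return versions
```

Keeping every Since line in one place at the end of the comment means the parser only needs one extra pattern, and field documentation lines are left untouched.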
Re: [vdsm] Managing async tasks
On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: vdsm-devel@lists.fedorahosted.org Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, Saggi Mizrahi smizr...@redhat.com, Federico Simoncelli fsimo...@redhat.com, engine-de...@ovirt.org Sent: Monday, December 17, 2012 12:00:49 PM Subject: Managing async tasks On today's vdsm call we had a lively discussion around how asynchronous operations should be handled in the future. In an effort to include more people in the discussion and to better capture the resulting conversation I would like to continue that discussion here on the mailing list. A lot of ideas were thrown around about how 'tasks' should be handled in the future. There are a lot of ways that it can be done. To determine how we should implement it, it's probably best if we start with a set of requirements. If we can first agree on these, it should be easy to find a solution that meets them. I'll take a stab at identifying a first set of POSSIBLE requirements: - Standardized method for determining the result of an operation This is a big one for me because it directly affects the consumability of the API. If each verb has different semantics for discovering whether it has completed successfully, then the API will be nearly impossible to use easily. Since there is no way to assure if of some tasks completed successfully or failed, especially around the murky waters of storage, I say this requirement should be removed. At least not in the context of a task. I don't agree. Please feel free to convince me with some exampled. If we cannot provide feedback to a user as to whether their request has been satisfied or not, then we have some bigger problems to solve. Sorry. That's my list :) Hopefully others will be willing to add other requirements for consideration. 
From my understanding, task recovery (stop, abort, rollback, etc) will not be generally supported and should not be a requirement. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] Managing async tasks
On Mon, Dec 17, 2012 at 03:12:34PM -0500, Saggi Mizrahi wrote: This is an addendum to my previous email. - Original Message - From: Saggi Mizrahi smizr...@redhat.com To: Adam Litke a...@us.ibm.com Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, Federico Simoncelli fsimo...@redhat.com, engine-de...@ovirt.org, vdsm-devel@lists.fedorahosted.org Sent: Monday, December 17, 2012 2:52:06 PM Subject: Re: Managing async tasks - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, Federico Simoncelli fsimo...@redhat.com, engine-de...@ovirt.org, vdsm-devel@lists.fedorahosted.org Sent: Monday, December 17, 2012 2:16:25 PM Subject: Re: Managing async tasks On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: vdsm-devel@lists.fedorahosted.org Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, Saggi Mizrahi smizr...@redhat.com, Federico Simoncelli fsimo...@redhat.com, engine-de...@ovirt.org Sent: Monday, December 17, 2012 12:00:49 PM Subject: Managing async tasks On today's vdsm call we had a lively discussion around how asynchronous operations should be handled in the future. In an effort to include more people in the discussion and to better capture the resulting conversation I would like to continue that discussion here on the mailing list. A lot of ideas were thrown around about how 'tasks' should be handled in the future. There are a lot of ways that it can be done. To determine how we should implement it, it's probably best if we start with a set of requirements. If we can first agree on these, it should be easy to find a solution that meets them. I'll take a stab at identifying a first set of POSSIBLE requirements: - Standardized method for determining the result of an operation This is a big one for me because it directly affects the consumability of the API. 
If each verb has different semantics for discovering whether it has completed successfully, then the API will be nearly impossible to use easily. Since there is no way to be sure whether some tasks completed successfully or failed, especially around the murky waters of storage, I say this requirement should be removed. At least not in the context of a task. I don't agree. Please feel free to convince me with some examples. If we cannot provide feedback to a user as to whether their request has been satisfied or not, then we have some bigger problems to solve. Suppose VDSM sends a write command to a storage server and the connection hangs up before the ACK has returned. The operation has been committed, but VDSM has no way of knowing that; as far as VDSM is concerned it got an ETIMEDOUT or EIO. This is the same problem that the engine has with VDSM. If VDSM creates an image\VM\network\repo but the connection hangs up before the response can be sent back, then as far as the engine is concerned the operation times out. This is an inherent issue with clustering. This is why I want to move away from tasks being *the* trackable objects. Tasks should be short. As short as possible. Run VM should just persist the VM information on the VDSM host and return. The rest of the tracking should be done using the VM ID. Create image should return once VDSM has persisted the information about the request on the repository and created the metadata files. Tracking should be done on the repo or the imageId. The thing is that I know how long a VM object should live (or an Image object). So tracking it is straightforward. How long a task should live is very problematic and quite context specific. It depends on what the task is. I think it's quite confusing from an API standpoint to have every task have a different scope, id requirement and life-cycle.
VDSM has two types of APIs: CRUD objects (VM, Image, Repository, Bridge, Storage Connections) and general transient methods (getBiosInfo(), getDeviceList()). The latter are quite simple to manage. They don't need any special handling. If you lost a getBiosInfo() call you just send another one, no harm done. The same is even true with things that change the host like getDeviceList(). What we are really arguing about is fitting the CRUD objects to some generic task oriented scheme. I'm saying it's a waste of time as you can quite easily have flows to recover from each operation. Create - Check if the object exists Read - Read again Update - either update again or read and update if update
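The "Create - check if the object exists" recovery flow described in this message can be sketched as below. This is only an illustration of the idea, not VDSM or engine code: create and exists are hypothetical callables standing in for API verbs, and Python's ConnectionError stands in for a timed-out or dropped call.

```python
def create_with_recovery(create, exists, object_id, attempts=3):
    """Create an object, treating a lost response as 'check, then retry'.

    `create` and `exists` are hypothetical stand-ins for API verbs.
    """
    for _ in range(attempts):
        try:
            create(object_id)
            return True
        except ConnectionError:
            # The request may have been committed even though the reply
            # was lost, so look the object up by ID before trying again.
            try:
                if exists(object_id):
                    return True
            except ConnectionError:
                continue
    return False
```

The caller tracks the object by its ID rather than by a task handle, so no task state has to outlive the call itself.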
Re: [vdsm] RFC: New Storage API
operation it will tell it to value one over the other. For example, whether to copy all the data or just create a qcow based off a snapshot. The default is space. You might have also noticed that it is never explicitly specified where to look for existing images. This is done purposefully, VDSM will always look in all connected repositories for existing objects. For very large setups this might be problematic. To mitigate the problem you have these options: participatingRepositories=[repoId, ...] which tells VDSM to narrow the search to just these repositories and imageHints={imgId: repoId} which will force VDSM to look for those image IDs just in those repositories and fail if it doesn't find them there. -- --- 舒明 Shu Ming Open Virtualization Engineerning; CSTL, IBM Corp. Tel: 86-10-82451626 Tieline: 9051626 E-mail: shum...@cn.ibm.com or shum...@linux.vnet.ibm.com Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District, Beijing 100193, PRC -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
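As a rough illustration of how the two search options could interact, consider the sketch below. It is an assumed reading of the proposal, not VDSM code: find_image is a made-up function, and repositories is a plain dict mapping repoId to a set of image IDs that stands in for the connected repositories.

```python
def find_image(image_id, repositories, participating=None, image_hints=None):
    """Return the ID of the repository holding image_id.

    `repositories` maps repoId -> set of image IDs (a stand-in for the
    connected repositories). The lookup order is an assumption.
    """
    hints = image_hints or {}
    if image_id in hints:
        # A hint pins the search to one repository and fails otherwise.
        repo_id = hints[image_id]
        if image_id in repositories.get(repo_id, ()):
            return repo_id
        raise LookupError('%s not in hinted repository %s' % (image_id, repo_id))
    candidates = participating if participating is not None else list(repositories)
    for repo_id in candidates:
        if image_id in repositories.get(repo_id, ()):
            return repo_id
    raise LookupError('%s not found' % image_id)
```

With neither option given, every connected repository is searched, which matches the default behaviour described above.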
Re: [vdsm] RFC: New Storage API
On Mon, Dec 10, 2012 at 02:03:09PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: Deepak C Shetty deepa...@linux.vnet.ibm.com, engine-devel engine-de...@ovirt.org, VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Monday, December 10, 2012 1:49:31 PM Subject: Re: [vdsm] RFC: New Storage API On Fri, Dec 07, 2012 at 02:53:41PM -0500, Saggi Mizrahi wrote: snip 1) Can you provide more info on why there is an exception for 'lvm based block domain'? It's not coming out clearly. File based domains are responsible for syncing up object manipulation (creation\deletion). The backend is responsible for making sure it all works, either by having a single writer (NFS) or having its own locking mechanism (gluster). In our LVM based domains VDSM is responsible for basic object manipulation. The current design uses an approach where there is a single host responsible for object creation\deletion; it is the SRM\SDM\SPM\S?M. If we ever find a way to make it fully clustered without a big hit in performance the S?M requirement will be removed from that type of domain. I would like to see us maintain a LOCALFS domain as well. For this, we would also need SRM, correct? No, why? Sorry, never mind. I was thinking of a scenario with multiple clients talking to a single vdsm and making sure they don't stomp on one another. This is probably not something we are going to care about though. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] RFC: New Storage API
On Mon, Dec 10, 2012 at 03:36:23PM -0500, Saggi Mizrahi wrote: Statements like this make me start to worry about your userData concept. It's a sign of a bad API if the user needs to invent a custom metadata scheme for itself. This reminds me of the abomination that is the 'custom' property in the vm definition today. In one sentence: If VDSM doesn't care about it, VDSM doesn't manage it. userData being a void* is quite common and I don't understand why you would think it's a sign of a bad API. Furthermore, giving the user choice about how to represent its own metadata and what fields it wants to keep seems reasonable to me. Especially given the fact that VDSM never reads it. The reason we are pulling away from the current system of VDSM understanding the extra data is that it makes that data tied to VDSM's on-disk format. VDSM's on-disk format has to be very stable because of clusters with multiple VDSM versions. Furthermore, since this is actually manager data it has to be tied to the manager backward compatibility lifetime as well. Having it be opaque to VDSM ties it to only one, simpler, support lifetime instead of two. I guess you are implying that it will make it problematic for multiple users to read userData left by another user because the formats might not be compatible. The solution is that all parties interested in using VDSM storage agree on format, and common fields, and supportability, and all the other things that choosing a supporting *something* entails. This is, however, out of the scope of VDSM. When the time comes I think how the userData blob is actually parsed and what fields it keeps should be discussed on ovirt-devel or engine-devel. The crux of the issue is that VDSM manages only what it cares about and the user can't modify directly. This is done because everything we expose we commit to.
If you want any information persisted like: - Human readable name (in whatever encoding) - Is this a template or a snapshot - What user owns this image You can just put it in the userData. VDSM is not going to impose what encoding you use. It's not going to decide if you represent your users as IDs or names or ldap queries or Public Keys. It's not going to decide if you have explicit templates or not. It's not going to decide if you care what the logical image chain is. It's not going to decide anything that is out of its scope. No format is future proof, no selection of fields will be good for any situation. I'd much rather it be someone else's problem when any of them need to be changed. They have currently been VDSM's problem and it has been hell to maintain. In general, I actually agree with most of this. What I want to avoid is pushing things that should actually be a part of the API into this userData blob. We do want to keep the API as simple as possible to give vdsm flexibility. If, over time, we find that users are always using userData to work around something missing in the API, this could be a really good sign that the API needs extension. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
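To make the "VDSM never reads it" point concrete, here is a minimal sketch of an opaque userData round trip. Everything in it is hypothetical: OpaqueStore merely stands in for a storage layer, and the JSON layout and field names are one format a manager might choose, not anything proposed in this thread.

```python
import json


class OpaqueStore:
    """Stand-in for a storage layer that keeps userData without reading it."""

    def __init__(self):
        self._blobs = {}

    def set_user_data(self, image_id, blob):
        if not isinstance(blob, bytes):
            raise TypeError('userData must be bytes')
        self._blobs[image_id] = blob  # stored as-is, never parsed

    def get_user_data(self, image_id):
        return self._blobs[image_id]


def encode_metadata(name, is_template, owner):
    """One possible client-side format; the field names are made up."""
    record = {'name': name, 'template': is_template, 'owner': owner}
    return json.dumps(record, sort_keys=True).encode('utf-8')


def decode_metadata(blob):
    """Inverse of encode_metadata; only the client ever calls this."""
    return json.loads(blob.decode('utf-8'))
```

The store never calls decode_metadata, so changing the client format does not touch the storage side, which is the decoupling argued for above.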
Re: [vdsm] moving the collection of statistics to external process
On Thu, Dec 06, 2012 at 11:19:34PM +0800, Shu Ming wrote: 于 2012-12-6 4:51, Itamar Heim 写道: On 12/05/2012 10:33 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 10:21:39PM +0200, Itamar Heim wrote: On 12/05/2012 10:16 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote: On 12/05/2012 08:57 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote: On 12/05/2012 04:42 PM, Adam Litke wrote: I wanted to know what do you think about it and if you have better solution to avoid initiate so many threads? And if splitting vdsm is a good idea here? In first look, my opinion is that it can help and would be nice to have vmStatisticService that runs and writes to separate log the vms status. Vdsm recently started requiring the MOM package. MOM also performs some host and guest statistics collection as part of the policy framework. I think it would be a really good idea to consolidate all stats collection into MOM. Then, all stats become usable within the policy and by vdsm for its own internal purposes. Today, MOM has one stats collection thread per VM and one thread for the host stats. It has an API for gathering the most recently collected stats which vdsm can use. isn't this what collectd (and its libvirt plugin) or pcp are already doing? Lot's of things collect statistics, but as of right now, we're using MOM and we're not yet using collectd on the host, right? I think we should have a single stats collection service and clients for it. I think mom and vdsm should get their stats from that service, rather than have either beholden to any new stats something needs to collect. How would this work for collecting guest statistics? Would we require collectd to be installed in all guests running under oVirt? my understanding is collectd is installed on the host, and uses collects libvirt plugin to collect guests statistics? Yes, but some statistics can only be collected by making a call to the oVirt guest agent (eg. 
guest memory statistics). The logical next step would be to write a collectd plugin for ovirt-guest-agent, but vdsm owns the connections to the guest agents and probably does not want to multiplex those connections for many reasons (security being the main one). and some will come from qemu-ga which libvirt will support? maybe a collectd vdsm plugin for the guest agent stats? I am thinking of having collectd as a stand-alone service to collect the statistics from both ovirt-guest and qemu-ga. Then collectd can export the information to the host proc file system in a layered architecture. Then mom or other vdsm services can get the information from the proc file system like other OS statistics exported on the host. You wouldn't use the host /proc filesystem for this purpose. /proc is an interface between userspace and the kernel. It is not for direct application use. The problem I see with hooking collectd up to ovirt-ga is that vdsm still needs a connection to ovirt-ga for things like shutdown and desktopLogin. Today, vdsm owns the connection to the guest agent and there is not a nice way to multiplex that connection for use by multiple clients simultaneously. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] link state semantics
On Wed, Dec 05, 2012 at 04:25:48AM -0500, Antoni Segura Puimedon wrote: - Original Message - From: Igor Lvovsky ilvov...@redhat.com To: Antoni Segura Puimedon asegu...@redhat.com Cc: Alona Kaplan alkap...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent: Wednesday, December 5, 2012 10:17:50 AM Subject: Re: [vdsm] link state semantics - Original Message - From: Antoni Segura Puimedon asegu...@redhat.com To: vdsm-devel@lists.fedorahosted.org Cc: Alona Kaplan alkap...@redhat.com Sent: Tuesday, December 4, 2012 7:32:34 PM Subject: [vdsm] link state semantics Hi list! We are working on the new 3.2 feature for adding support for updating VM devices, more specifically at the moment network devices. There is one point of the design which is not yet consensual and we'd need to agree on a proper and clean design that would satisfy us all: My current proposal, as reflected by patch: http://gerrit.ovirt.org/#/c/9560/5/vdsm_api/vdsmapi-schema.json and its parent is to have a linkActive boolean that is true for link status 'up' and false for link status 'down'. We want to support a none (dummy) network that is used to dissociate vnics from any real network. The semantics, as you can see in the patch are that unless you specify a network, updateDevice will place the interface on that network. However, Adam Litke argues that not specifying a network should keep the vnic on the network it currently is, as network is an optional parameter and 'linkActive' is also optional and has this preserve current state semantics. I can certainly see the merit of what Adam proposes, and the implementation would be that linkActive becomes an enum like so: {'enum': 'linkState'/* or linkActive */ , 'data': ['up', 'down', 'disconnected']} If you are going for this use 'linkState' With this change, network would only be changed if one different than the current one is specified and the vnic would be taken to the dummy bridge when the linkState would be set to 'disconnected'. 
In general +1 for new one, with a little doubt. It looks a bit inconsistent that we leave the network as is if it is omitted from input, but if linkState is 'disconnected' we will move it to dummy bridge. But I can live with it. Yes, the 'disconnected' overrules the network and that, as you point out, can be a source of confusion. I propose to add a warning to the return dictionary that tells the user that setting disconnected overrules any network setting. There is also an objection, raised by Adam about the semantics of portMirroring. The current behavior from my patch is: portMirroring is None or is not set - No action taken. portMirroring = [] - No action taken. portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the specified vnic. His proposal is: portMirroring is None or is not set - No action taken. portMirroring = [] - Unset port mirroring to the vnic that is currently set. portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the specified vnic. +1 for Adam's approach, just don't forget to unset portMirroring from all nets set before if they are not in the new portMirroring = [a,b,z] So you're saying: portMirroring is None or is not set - No action taken. portMirroring = [] - Unset port mirroring to the vnic that is currently set. portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the specified vnic AND unset any other mirroring. I'm fine with it, I think it is even more complete and correct. Yes, +1. I would really welcome comments on this to have finally an agreement to the api for this feature. Best, Toni -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
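The portMirroring semantics agreed on at the end of this message can be written down as a small function, which may help when reviewing the patch. This is a sketch of the agreed behaviour only; apply_port_mirroring is a made-up name and the function just computes what would change rather than touching any vnic.

```python
def apply_port_mirroring(current, requested):
    """Return (to_set, to_unset) network lists for a vnic.

    `current` is the list of networks mirrored to the vnic now and
    `requested` is the portMirroring argument (None when omitted).
    """
    if requested is None:
        return [], []  # omitted: leave mirroring untouched
    to_set = [net for net in requested if net not in current]
    to_unset = [net for net in current if net not in requested]
    return to_set, to_unset
```

Note that an empty list falls out of the general case: nothing is requested, so every currently mirrored network lands in to_unset.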
Re: [vdsm] Host bios information
On Wed, Dec 05, 2012 at 11:05:21AM +0200, ybronhei wrote: Today in the API we display general information about the host that vdsm exports through the getCapabilities API. We decided to add bios information as part of the information that is displayed in the UI under the host's general sub-tab. To summarize the feature - We'll modify the General tab to Software Information and add another tab for Hardware Information which will include all the bios data that we'll decide to gather from the host and display. See this feature page: http://www.ovirt.org/Features/Design/HostBiosInfo for more details. All the parameters that can be displayed are mentioned in the wiki. I would greatly appreciate your comments and questions. Seems good to me but I would like to throw out one suggestion. getVdsCapabilities is already a huge command that does a lot of time consuming things. As part of the vdsm API refactoring, we are going to start favoring small and concise APIs over bag APIs. Perhaps we should just add a new verb: Host.getVdsBiosInfo() that returns only this information. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] moving the collection of statistics to external process
On Wed, Dec 05, 2012 at 04:23:16PM +0200, ybronhei wrote: As part of an issue where starting 200 vms at the same time takes hours for an undetermined reason, we thought about moving the collection of statistics outside vdsm. Thanks for bringing up this issue. I think this could be a good idea on its own merits (better modularity, etc). It can help because the stat collection is done by internal threads of vdsm that can take more than a little time. I'm not sure if it would help with the issue of starting many vms simultaneously, but it might improve vdsm response. In general, threads should be really cheap to create so I expect there is another cause for the performance bottleneck. That being said, I think we should still look at this feature. Currently we start a thread for each vm and then collect stats on them at constant intervals, and it must affect vdsm if we have 200 threads like this that can take some time. For example, if we have connection errors to storage and we can't receive its response, all the 200 threads can get stuck and lock other threads (gil issue). I wanted to know what you think about it and if you have a better solution to avoid initiating so many threads? And if splitting vdsm is a good idea here? At first look, my opinion is that it can help and it would be nice to have a vmStatisticService that runs and writes the vms status to a separate log. Vdsm recently started requiring the MOM package. MOM also performs some host and guest statistics collection as part of the policy framework. I think it would be a really good idea to consolidate all stats collection into MOM. Then, all stats become usable within the policy and by vdsm for its own internal purposes. Today, MOM has one stats collection thread per VM and one thread for the host stats. It has an API for gathering the most recently collected stats which vdsm can use. 
The problem with this solution is that if those interval functions need to communicate with internal parts of vdsm to set values or start internal processes when something has changed, it depends on the stat function.. and I'm not sure that the stat function should control internal flows. Today we count on this method to recognize connectivity errors, but we can add a polling mechanism for those issues (which can raise the same problems we are trying to deal with..) I agree. Any cases where the stats collection threads are triggering internal vdsm logic need to be cleaned up. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] Host bios information
On Wed, Dec 05, 2012 at 05:25:10PM +0200, ybronhei wrote: On 12/05/2012 04:32 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 11:05:21AM +0200, ybronhei wrote: Today in the API we display general information about the host that vdsm exports through the getCapabilities API. We decided to add bios information as part of the information that is displayed in the UI under the host's general sub-tab. To summarize the feature - We'll modify the General tab to Software Information and add another tab for Hardware Information which will include all the bios data that we'll decide to gather from the host and display. See this feature page: http://www.ovirt.org/Features/Design/HostBiosInfo for more details. All the parameters that can be displayed are mentioned in the wiki. I would greatly appreciate your comments and questions. Seems good to me but I would like to throw out one suggestion. getVdsCapabilities is already a huge command that does a lot of time consuming things. As part of the vdsm API refactoring, we are going to start favoring small and concise APIs over bag APIs. Perhaps we should just add a new verb: Host.getVdsBiosInfo() that returns only this information. It also leads to a modification in how the engine collects the parameters with the new api request and I'm not sure if we should get into this.. Now we have a specific known way in which the engine requests capabilities, when it does so, and how that affects the status of the host that is shown via the UI. To simplify this feature I prefer to use the current way of gathering and providing the host's information. If we decide to split the host's capabilities api, it needs an RFC mail of its own because it changes the engine's internal flows and it turns this feature into something much more influential. I don't understand. Why can't you just call both APIs, one after the other? 
-- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] moving the collection of statistics to external process
On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote: On 12/05/2012 04:42 PM, Adam Litke wrote: I wanted to know what do you think about it and if you have better solution to avoid initiate so many threads? And if splitting vdsm is a good idea here? In first look, my opinion is that it can help and would be nice to have vmStatisticService that runs and writes to separate log the vms status. Vdsm recently started requiring the MOM package. MOM also performs some host and guest statistics collection as part of the policy framework. I think it would be a really good idea to consolidate all stats collection into MOM. Then, all stats become usable within the policy and by vdsm for its own internal purposes. Today, MOM has one stats collection thread per VM and one thread for the host stats. It has an API for gathering the most recently collected stats which vdsm can use. isn't this what collectd (and its libvirt plugin) or pcp are already doing? Lot's of things collect statistics, but as of right now, we're using MOM and we're not yet using collectd on the host, right? -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] moving the collection of statistics to external process
On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote: On 12/05/2012 08:57 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote: On 12/05/2012 04:42 PM, Adam Litke wrote: I wanted to know what do you think about it and if you have better solution to avoid initiate so many threads? And if splitting vdsm is a good idea here? In first look, my opinion is that it can help and would be nice to have vmStatisticService that runs and writes to separate log the vms status. Vdsm recently started requiring the MOM package. MOM also performs some host and guest statistics collection as part of the policy framework. I think it would be a really good idea to consolidate all stats collection into MOM. Then, all stats become usable within the policy and by vdsm for its own internal purposes. Today, MOM has one stats collection thread per VM and one thread for the host stats. It has an API for gathering the most recently collected stats which vdsm can use. isn't this what collectd (and its libvirt plugin) or pcp are already doing? Lot's of things collect statistics, but as of right now, we're using MOM and we're not yet using collectd on the host, right? I think we should have a single stats collection service and clients for it. I think mom and vdsm should get their stats from that service, rather than have either beholden to any new stats something needs to collect. How would this work for collecting guest statistics? Would we require collectd to be installed in all guests running under oVirt? -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] moving the collection of statistics to external process
On Wed, Dec 05, 2012 at 10:21:39PM +0200, Itamar Heim wrote: On 12/05/2012 10:16 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote: On 12/05/2012 08:57 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote: On 12/05/2012 04:42 PM, Adam Litke wrote: I wanted to know what do you think about it and if you have better solution to avoid initiate so many threads? And if splitting vdsm is a good idea here? In first look, my opinion is that it can help and would be nice to have vmStatisticService that runs and writes to separate log the vms status. Vdsm recently started requiring the MOM package. MOM also performs some host and guest statistics collection as part of the policy framework. I think it would be a really good idea to consolidate all stats collection into MOM. Then, all stats become usable within the policy and by vdsm for its own internal purposes. Today, MOM has one stats collection thread per VM and one thread for the host stats. It has an API for gathering the most recently collected stats which vdsm can use. isn't this what collectd (and its libvirt plugin) or pcp are already doing? Lot's of things collect statistics, but as of right now, we're using MOM and we're not yet using collectd on the host, right? I think we should have a single stats collection service and clients for it. I think mom and vdsm should get their stats from that service, rather than have either beholden to any new stats something needs to collect. How would this work for collecting guest statistics? Would we require collectd to be installed in all guests running under oVirt? my understanding is collectd is installed on the host, and uses collects libvirt plugin to collect guests statistics? Yes, but some statistics can only be collected by making a call to the oVirt guest agent (eg. guest memory statistics). 
The logical next step would be to write a collectd plugin for ovirt-guest-agent, but vdsm owns the connections to the guest agents and probably does not want to multiplex those connections for many reasons (security being the main one). -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] moving the collection of statistics to external process
On Wed, Dec 05, 2012 at 10:51:23PM +0200, Itamar Heim wrote: On 12/05/2012 10:33 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 10:21:39PM +0200, Itamar Heim wrote: On 12/05/2012 10:16 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote: On 12/05/2012 08:57 PM, Adam Litke wrote: On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote: On 12/05/2012 04:42 PM, Adam Litke wrote: I wanted to know what do you think about it and if you have better solution to avoid initiate so many threads? And if splitting vdsm is a good idea here? In first look, my opinion is that it can help and would be nice to have vmStatisticService that runs and writes to separate log the vms status. Vdsm recently started requiring the MOM package. MOM also performs some host and guest statistics collection as part of the policy framework. I think it would be a really good idea to consolidate all stats collection into MOM. Then, all stats become usable within the policy and by vdsm for its own internal purposes. Today, MOM has one stats collection thread per VM and one thread for the host stats. It has an API for gathering the most recently collected stats which vdsm can use. isn't this what collectd (and its libvirt plugin) or pcp are already doing? Lot's of things collect statistics, but as of right now, we're using MOM and we're not yet using collectd on the host, right? I think we should have a single stats collection service and clients for it. I think mom and vdsm should get their stats from that service, rather than have either beholden to any new stats something needs to collect. How would this work for collecting guest statistics? Would we require collectd to be installed in all guests running under oVirt? my understanding is collectd is installed on the host, and uses collects libvirt plugin to collect guests statistics? Yes, but some statistics can only be collected by making a call to the oVirt guest agent (eg. guest memory statistics). 
The logical next step would be to write a collectd plugin for ovirt-guest-agent, but vdsm owns the connections to the guest agents and probably does not want to multiplex those connections for many reasons (security being the main one). and some will come from qemu-ga which libvirt will support? maybe a collectd vdsm plugin for the guest agent stats? Then you still have vdsm plus one other entity in the business of stats collection. I don't see how that's any better than what we have today. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] object instancing in the new VDSM API
and is logically wrong. What you need to do is remove redundant arguments and split up verbs that do more then one thing. - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: vdsm-devel vdsm-de...@fedorahosted.org, Ayal Baron aba...@redhat.com, Barak Azulay bazu...@redhat.com, ybronhei ybron...@redhat.com Sent: Monday, December 3, 2012 5:46:31 PM Subject: Re: object instancing in the new VDSM API On Mon, Dec 03, 2012 at 04:34:28PM -0500, Saggi Mizrahi wrote: Currently the suggested scheme treats everything as instances and object have methods. This puts instancing as the responsibility of the API bindings. I suggest changing it to the way json was designed with namespaces and methods. For example instead for the api being: vm = host.getVMsList()[0] vm.getInfo() the API should be: vmID = host.getVMsList()[0] api.VMsManager.getVMInfo(vmID) And it should be up to decide how to wrap everything in objects. For VMs, your example looks nice, but for today's Volumes it's not so nice. To properly identify a Volume, we must pass the storage pool id, storage domain id, image id, and volume id. If we are working with two Volumes, we would need 8 parameters unless we optimize for context and assume that the storage pool uuid is the same for both volumes, etc. The problem with that optimization is that we require clients to understand internal implementation details. How should the StorageDomain.getVolumes API return a list of Volumes? A list of Volume ids is not enough information for most commands that involve a Volume. The problem with the API bindings controlling the instancing is that: 1) We have to *have* and *pass* implicit api obj which is problematic to maintain. For example, you have to have the api object as a member of instance for the method calls to work. This means that you can't recreate or pool API objects easily. 
You effectively need to add a move method to move the object to another API object to use it on a different host. You already make assumptions like this when passing around bare UUIDs. For example, you know that a Storage Domain cannot be associated with multiple Storage Pools at the same time. With instantiated objects, all of those associations are baked into the objects. A client never constructs objects. It only receives pre-instantiated objects by calling other APIs. 2) Because the objects are opaque it might be hard to know what fields of the instance to persist to get the same object. No. You just persist the whole object identifier the way it was given to you. In the case of Volumes, it may be an object containing 4 string uuids. It could also be a string in the form /spuuid/sduuid/imguuid/voluuid. In the end it doesn't really matter which form it's in because the client will not manipulate it. Perhaps some flattened string is best in order to enable easy database storage. 3) It breaks the distinction between by-value and by-reference objects. The distinction is made in the schema. Reference objects have methods and are called 'class' in the schema. Value objects have only fields and are called 'type'. 4) Any serious user will make its own instance classes that conform to its design and flow so they don't really add any convenience to anything apart from tests. You will create your own VM object, and because it's in the manager scope it will be the same instance across all hosts. Instead of being able to pass the same ID to any host (as the vmID remains the same) you will have to create an instance object to use either before every call for simplicity or cache for each host for performance benefits. This is a pretty good argument for using namespacing instead of instances, however... I still think that all object references need to be an opaque type and it should not be legal to roll your own object reference from a set of other objects (eg. 
create a volume reference from imgUUID, sdUUID, and spUUID). The API should be explicit about the relationships between objects. If you want to write your own instance classes you still can. Just pass the vdsm-generated identifier into your object's constructor to use for later API calls. 5) It makes us pass a weird __obj__ parameter to each call that symbolizes self and makes it hard for a user that chooses to use its own bindings to understand what it does. Fair. '__obj__' is a terrible name. I would be okay with changing the semantics so that all API calls take an 'id' parameter as their first argument. I guess this could always be a string with an unspecified format. For Volumes, we can decide how we want to encode the 4 uuids. Vdsm would then need to parse this value on the server side to pull out the relevant IDs. 6) It's syntactic
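The flattened identifier discussed in this message (a string of the form /spuuid/sduuid/imguuid/voluuid that the client stores but never manipulates, and that vdsm parses on the server side) could look roughly like the sketch below. The names VolumeRef, encode_volume_id and decode_volume_id are assumptions made for illustration; the thread does not settle on an encoding.

```python
from collections import namedtuple

# Hypothetical server-side view of a Volume reference (names are assumptions).
VolumeRef = namedtuple('VolumeRef', 'sp_uuid sd_uuid img_uuid vol_uuid')


def encode_volume_id(ref):
    """Flatten the four UUIDs into one opaque string for the client."""
    return '/' + '/'.join(ref)


def decode_volume_id(identifier):
    """Parse the opaque string back into its four UUIDs (server side only)."""
    parts = identifier.split('/')
    # A leading '/' yields an empty first element, followed by four UUIDs.
    if len(parts) != 5 or parts[0] != '' or not all(parts[1:]):
        raise ValueError('malformed volume identifier: %r' % (identifier,))
    return VolumeRef(*parts[1:])
```

A client would persist the returned string as-is (for example in a database column) and pass it back as the 'id' argument of later calls.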
Re: [vdsm] VDSM tasks, the future
On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote: Because I started hinting about how VDSM tasks are going to look going forward I thought it's better I'll just write everything in an email so we can talk about it in context. This is not set in stone and I'm still debating things myself but it's very close to being done. Don't debate them yourself, debate them here! Even better, propose your idea in schema form to show how a command might work exactly. - Everything is asynchronous. The nature of message based communication is that you can't have synchronous operations. This is not really debatable because it's just how TCP\AMQP\messaging works. Can you show how a traditionally synchronous command might work? Let's take Host.getVmList as an example. - Task IDs will be decided by the caller. This is how json-rpc works and also makes sense because now the engine can track the task without needing to have a stage where we give it the task ID back. IDs are reusable as long as no one else is using them at the time so they can be used for synchronizing operations between clients (making sure a command is only executed once on a specific host without locking). - Tasks are transient. If VDSM restarts it forgets all the task information. There are 2 ways to have persistent tasks: 1. The task creates an object that you can continue work on in VDSM. The new storage does that by the fact that copyImage() returns once the target volume has been created but before the data has been fully copied. From that moment on the state of the copy can be queried from any host using getImageStatus() and the specific copy operation can be queried with getTaskStatus() on the host performing it. After VDSM crashes, depending on policy, either VDSM will create a new task to continue the copy or someone else will send a command to continue the operation and that will be a new task. 2. VDSM tasks just start other operations track-able not through the task interface. For example Gluster. 
gluster.startVolumeRebalance() will return once it has been registered with Gluster. gluster.getOperationStatuses() will return the state of the operation from any host. Each call is a task in itself. I worry about this approach because every command has a different semantic for checking progress. For migration, we have to check VM status on the src and dest hosts. For image copy we need to use a special status call on the dest image. It would be nice if there was a unified method for checking on an operation. Maybe that can be completion events.

    Client:                      vdsm:
    -------                      -----
    Image.copy(...)       -->
                          <--    Operation Started
    Wait for event ...
                          <--    Event: Operation <id> done <code>

For an early error:

    Client:                      vdsm:
    -------                      -----
    Image.copy(...)       -->
                          <--    Error: <code>

- No task tags. They are silly and the caller can mangle whatever in the task ID if he really wants to tag tasks. Yes. Agreed. - No explicit recovery stage. VDSM will be crash-only, there should be efforts to make everything crash-safe. If that is problematic, in case of networking, VDSM will recover on start without having a task for it. How does this work in practice for something like creating a new image from a template? - No clean Task: Tasks can be started by any number of hosts, which means that there is no way to own all tasks. There could be cases where VDSM starts tasks on its own and thus they have no owner at all. The caller needs to continually track the state of VDSM. We will have broadcast events to mitigate polling. If a disconnected client might have missed a completion event, it will need to check state. This means each async operation that changes state must document a procedure for checking progress of a potentially ongoing operation. For Image.copy, that process would be to look up the new image and check its state. - No revert. Impossible to implement safely. How do the engine folks feel about this? I am ok with it :) - No SPM\HSM tasks. SPM\SDM is no longer necessary for all domain types (only for type). 
What used to be SPM tasks, or tasks that persist and can be restarted on other hosts is talked about in previous bullet points. A nice simplification. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
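To make the task model in this thread more concrete, here is a small client-side sketch in Python of caller-chosen task IDs combined with completion events. Everything in it is an assumption for illustration: the class OperationTracker and its methods are hypothetical and are not part of vdsm or its json-rpc bindings. It only models the bookkeeping a client would do, not the transport.

```python
import uuid


class OperationTracker:
    """Client-side bookkeeping for asynchronous operations (hypothetical)."""

    def __init__(self):
        self._pending = {}  # task ID -> verb name

    def start(self, verb, task_id=None):
        """Register an operation under a caller-chosen (or generated) ID."""
        task_id = task_id or str(uuid.uuid4())
        if task_id in self._pending:
            # IDs are reusable only once nobody is using them any more.
            raise ValueError('task ID still in use: %s' % task_id)
        self._pending[task_id] = verb
        return task_id

    def on_event(self, task_id, code):
        """Handle a completion event; returns (verb, code) or None if unknown."""
        verb = self._pending.pop(task_id, None)
        if verb is None:
            return None
        return verb, code

    def needs_state_check(self):
        """After a reconnect, these operations may have missed their event."""
        return sorted(self._pending.items())
```

After a disconnect, a client would walk needs_state_check() and, for each entry, follow the documented procedure for that verb (for Image.copy, look up the new image and check its state).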
Re: [vdsm] API.py validation
On Tue, Dec 04, 2012 at 08:43:11AM -0500, Antoni Segura Puimedon wrote: Hi all, I am currently working on adding a new feature to vdsm which requires a new entry point in vdsm, thus requiring: - Parameter definitions in vdsm_api/vdsmapi-schema.json - Implementation and checks in vdsm/API.py and other modules. Typically, we check for the presence or absence of required/optional parameters in API.py using utils.validateMinimalKeySet or just if/else clauses. I think this process could benefit from a more automatic and less duplicated effort, i.e., parsing vdsmapi-schema.json in a similar way as process-schema.py does to make a memoized method that is able to check whether the api call is correct according to the API definitions. A very good side effect would be that this would keep us from forgetting to update the schema. Yes, this is a good idea. I do want to add some checking. For now, the best place to add it would probably be in the DynamicBridge class which dispatches json-rpc calls to the correct internal methods. Unfortunately this would exclude the xmlrpc api from the automatic checking. I guess that's ok since xmlrpc will be going away. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
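The memoized, schema-driven check proposed in this message might be sketched as follows. This is not how vdsmapi-schema.json is actually laid out: the SCHEMA dictionary below is a simplified stand-in, and the verb name 'VM.updateDevice' with its parameter names is a hypothetical example. Only functools.lru_cache is a real library call.

```python
import functools

# Simplified stand-in for a parsed schema (an assumption, not the real format).
SCHEMA = {
    'VM.updateDevice': {
        'required': ['vmId', 'deviceType'],
        'optional': ['network', 'linkActive', 'portMirroring'],
    },
}


@functools.lru_cache(maxsize=None)
def _param_sets(verb):
    """Build the required/optional sets once per verb and cache them."""
    entry = SCHEMA[verb]
    return frozenset(entry['required']), frozenset(entry['optional'])


def validate_call(verb, params):
    """Return (missing, unknown) parameter names for one API call."""
    required, optional = _param_sets(verb)
    given = set(params)
    missing = sorted(required - given)
    unknown = sorted(given - required - optional)
    return missing, unknown
```

A dispatcher such as the DynamicBridge mentioned above could call validate_call() before invoking the internal method and reject the request when either list is non-empty.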
Re: [vdsm] link state semantics
On Tue, Dec 04, 2012 at 12:32:34PM -0500, Antoni Segura Puimedon wrote: Hi list! We are working on the new 3.2 feature for adding support for updating VM devices, more specifically at the moment network devices. There is one point of the design which is not yet consensual and we'd need to agree on a proper and clean design that would satisfy us all: My current proposal, as reflected by patch: http://gerrit.ovirt.org/#/c/9560/5/vdsm_api/vdsmapi-schema.json and its parent is to have a linkActive boolean that is true for link status 'up' and false for link status 'down'. We want to support a none (dummy) network that is used to dissociate vnics from any real network. The semantics, as you can see in the patch are that unless you specify a network, updateDevice will place the interface on that network. However, Adam Litke argues that not specifying a network should keep the vnic on the network it currently is, as network is an optional parameter and 'linkActive' is also optional and has this preserve current state semantics. I can certainly see the merit of what Adam proposes, and the implementation would be that linkActive becomes an enum like so: {'enum': 'linkState'/* or linkActive */ , 'data': ['up', 'down', 'disconnected']} With this change, network would only be changed if one different than the current one is specified and the vnic would be taken to the dummy bridge when the linkState would be set to 'disconnected'. There is also an objection, raised by Adam about the semantics of portMirroring. The current behavior from my patch is: portMirroring is None or is not set - No action taken. portMirroring = [] - No action taken. portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the specified vnic. His proposal is: portMirroring is None or is not set - No action taken. portMirroring = [] - Unset port mirroring to the vnic that is currently set. portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the specified vnic. 
I would really welcome comments on this to have finally an agreement to the api for this feature. +1 to the updated proposal. Is there any better way to do it? -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
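The updated linkState proposal can be illustrated with a short Python sketch. The helper resolve_vnic_update and the constant DUMMY_NETWORK are assumptions made for this example (the real dummy bridge name is not given in the thread); the three enum values come from the proposal above.

```python
LINK_STATES = ('up', 'down', 'disconnected')
DUMMY_NETWORK = 'dummy-network'  # placeholder name, an assumption


def resolve_vnic_update(current_network, current_state,
                        network=None, link_state=None):
    """Return the (network, link_state) a vnic should end up with.

    Omitted (None) arguments preserve the current value; 'disconnected'
    overrules any network and moves the vnic to the dummy network.
    """
    if link_state is not None and link_state not in LINK_STATES:
        raise ValueError('unknown linkState: %r' % (link_state,))
    target_network = current_network if network is None else network
    target_state = current_state if link_state is None else link_state
    if target_state == 'disconnected':
        target_network = DUMMY_NETWORK
    return target_network, target_state
```

Because 'disconnected' silently wins over an explicit network, a caller-facing warning of the kind suggested elsewhere in this thread would be a reasonable addition.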
Re: [vdsm] RFC: New Storage API
information (like Volume.getInfo)? (I see some more info below...) All operations return once the operation has been committed to disk, NOT when the operation actually completes. This is done so that: - operations come to a stable state as quickly as possible. - In case where there is an SDM, only a small portion of the operation actually needs to be performed on the SDM host. - No matter how many times the operation fails and on how many hosts, you can always resume the operation and choose when to do it. - You can stop an operation at any time and remove the resulting object, making a distinction between "stop because the host is overloaded" and "I don't want that image". This means that after calling any operation that creates a new image the user must then call getImageStatus() to check what the status of the image is. The status of the image can be either optimized, degraded, or broken. Optimized means that the image is available and you can run VMs off it. Degraded means that the image is available and will run VMs but there might be a better way VDSM can represent the underlying data. Broken means that the image can't be used at the moment, probably because not all the data has been set up on the volume. Apart from that VDSM will also return the last persisted status information which will contain hostID - the last host to try and optimize or fix the image stage - X/Y (eg. 1/10) the last persisted stage of the fix. Do you have some examples of what the stages would be? I think these should be defined in enums so that the user can check on what the individual stages mean. What happens when the low level implementation of an operation changes? The meaning of the stages will change completely. percent_complete - -1 or 0-100, the last persisted completion percentage of the aforementioned stage. -1 means that no progress is available for that operation. 
last_error - This will only be filled if the operation failed because of something other than IO or a VDSM crash, for obvious reasons. It will usually be set if the task was manually stopped. The user can either be satisfied with that information or ask the host specified in hostID if it is still working on that image by checking its running tasks. checkStorageRepository(self, repositoryId, options={}): A method to go over a storage repository and scan for any existing problems. This includes degraded\broken images and deleted images that have not yet been physically deleted\merged. It returns a list of Fix objects. Fix objects come in 4 types: clean - cleans data, run them to get more space. optimize - run them to optimize a degraded image What is an example of a degraded image? merge - Merges two images together. Doing this sometimes makes more images ready for optimizing or cleaning. The reason it is different from optimize is that unmerged images are considered optimized. mend - mends a broken image What does this mean? The user can read these types and prioritize fixes. Fixes also contain opaque FIX data and they should be sent as received to fixStorageRepository(self, repositoryId, fix, options={}): That will start a fix operation. Could we have an automatic fix mode where vdsm just does the right thing (for most things)? All major operations automatically start the appropriate Fix to bring the created object to an optimized\degraded state (the one that is quicker) unless one of the options is AutoFix=False. This is only useful for repos that might not be able to create volumes on all hosts (SDM) but would like to have the actual IO distributed in the cluster. Another common option is the strategy option: It currently has 2 possible values, space and performance - In case VDSM has 2 ways of completing the same operation it will tell it to value one over the other. For example, whether to copy all the data or just create a qcow based on a snapshot. 
The default is space. I like this a lot. You might have also noticed that it is never explicitly specified where to look for existing images. This is done purposefully, VDSM will always look in all connected repositories for existing objects. For very large setups this might be problematic. To mitigate the problem you have these options: participatingRepositories=[repoId, ...] which tells VDSM to narrow the search to just these repositories and imageHints={imgId: repoId} which will force VDSM to look for those image IDs just in those repositories and fail if it doesn't find them there. I would like to have a better way of specifying these optional parameters without burying them in an options structure. I will think a little more about this. Strategy can just be two optional flags in a 'flags' argument. For the participatingRepositories and imageHints options, I think we need to use real parameters. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
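One way to read the participatingRepositories and imageHints options as "real parameters", as suggested at the end of this message, is sketched below in Python. The function repositories_to_search and its signature are assumptions for illustration only; it selects which repositories would be searched for an image and leaves the actual lookup (and the failure when a hinted image is not found there) to the caller.

```python
def repositories_to_search(image_id, connected,
                           participating=None, image_hints=None):
    """Pick the repositories to search for image_id (hypothetical helper).

    connected     -- all currently connected repository IDs
    participating -- optional list narrowing the search
    image_hints   -- optional {imgId: repoId} mapping forcing a single repo
    """
    if image_hints and image_id in image_hints:
        repo = image_hints[image_id]
        if repo not in connected:
            raise LookupError('hinted repository %s is not connected' % repo)
        return [repo]
    if participating is not None:
        # Keep the order of the connected list, drop everything else.
        return [repo for repo in connected if repo in participating]
    # Default behaviour described above: look in all connected repositories.
    return list(connected)
```

Passing these as named arguments rather than inside an options dictionary makes it possible to describe and validate them in the schema like any other parameter.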
Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API
On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Ayal Baron aba...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent: Thursday, November 29, 2012 5:22:43 PM Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote: They are not future proof as the paradigm is completely different. Storage domain IDs are not static any more (and are not guaranteed to be unique or the same across the cluster. Image IDs represent the ID of the projected data and not the actual unique path. Just as an example, to run a VM you give a list of domains that might contain the needed images in the chain and the image ID of the tip. The paradigm is changed to and most calls get non synchronous number of images and domains. Further more, the APIs themselves are completely different. So future proofing is not really an issue. I don't understand this at all. Perhaps we could all use some education on the architecture of the planned architectural changes. If I can pass an arbitrary list of domainIDs that _might_ contain the data, why wouldn't I just pass all of them every time? In that case, why are they even required since vdsm would have to search anyway? It's for optimization mostly, the engine usually has a good idea of where stuff are, having it give hints to VDSM can speed up the search process. also, then engines knows how transient some storage pieces are. If you have a domain that is only there for backup or owned by another manager sharing the host, you don't want you VMs using the disks that are on that storage effectively preventing it from being removed (though we do have plans to have qemu switch base snapshots at runtime for just that). This is not a clean design. 
If the search is slow, then maybe we need to improve caching internally. Making a client cache a bunch of internal IDs to pass around sounds like a complete layering violation to me. As to making the current API a bit simpler. As I said, making them opaque is problematic as currently the engine is responsible for creating the IDs. As I mentioned in my last post, engine still can specify the ID's when the object is first created. From that point forward the ID never changes so it can be baked into the identifier. Where will this identifier be persisted? Further more, some calls require you to play with these (making a template instead of a snapshot). Also, the full chain and topology needs to be completely visible to the engine. Please provide a specific example of how you play with the IDs. I can guess where you are going, but I don't want to divert the thread. The relationship between volumes and images is deceptive at the moment. IMG is the chain and volume is a member, IMGUUID is only used to for verification and to detect when we hit a template going up the chain. When you do operation on images assumptions are being guaranteed about the resulting IDs. When you copy an image, you assume to know all the new IDs as they remain the same. With your method I can't tell what the new opaque result is going to be. Preview mode (another abomination being deprecated) relies on the disconnect between imgUUID and volUUID. Live migration currently moves a lot of the responsibility to the engine. No client should need to know about all of these internal details. I understand that's the way it is today, and that's one of the main reasons that the API is a complete pain to use. These things, as you said, are problematic. But this is the way things are today. We are changing them. Any intermediary step is needlessly problematic for existing clients. 
Work is already in progress for fixing the API properly, making some calls a bit nicer isn't an excuse to start making more compatibility code in the engine. The engine won't need compatibility code. This only would impact the jsonrpc bindings which aren't used by engine yet. When engine switches over, then yes it would need to adapt. As for task IDs. Currently task IDs are only used for storage and they get persisted to disk. This is WRONG and is not the case with the new storage API. Because we moved to an asynchronous message based protocol (json-rpc over TCP\AMQP) there is no need to generate a task ID. it is built in to json-rpc. json-rpc specifies that the IDs have to be unique for a client as long as the request is still active. This is good enough as internally we can have a verb for a client to query it's own running tasks and a verb to query other host tasks by mangling in the client before the ID. Because the protocol is So
Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API
On Mon, Dec 03, 2012 at 03:57:42PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Ayal Baron aba...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent: Monday, December 3, 2012 3:30:21 PM Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Ayal Baron aba...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent: Thursday, November 29, 2012 5:22:43 PM Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote: They are not future proof as the paradigm is completely different. Storage domain IDs are not static any more (and are not guaranteed to be unique or the same across the cluster. Image IDs represent the ID of the projected data and not the actual unique path. Just as an example, to run a VM you give a list of domains that might contain the needed images in the chain and the image ID of the tip. The paradigm is changed to and most calls get non synchronous number of images and domains. Further more, the APIs themselves are completely different. So future proofing is not really an issue. I don't understand this at all. Perhaps we could all use some education on the architecture of the planned architectural changes. If I can pass an arbitrary list of domainIDs that _might_ contain the data, why wouldn't I just pass all of them every time? In that case, why are they even required since vdsm would have to search anyway? 
It's for optimization mostly, the engine usually has a good idea of where stuff are, having it give hints to VDSM can speed up the search process. also, then engines knows how transient some storage pieces are. If you have a domain that is only there for backup or owned by another manager sharing the host, you don't want you VMs using the disks that are on that storage effectively preventing it from being removed (though we do have plans to have qemu switch base snapshots at runtime for just that). This is not a clean design. If the search is slow, then maybe we need to improve caching internally. Making a client cache a bunch of internal IDs to pass around sounds like a complete layering violation to me. You can't cache this, if the same template exists on an 2 different NFS domains only the engine has enough information to know which you should use. We only have the engine give us thing information when starting a VM or merging\copying an image that resides on multiple domains. It is also completely optional. I didn't like it either. Is it even valid for the same template (with identical uuids) to exist in two places? I thought uuids aren't supposed to collide. I can envision some scenario where a cached storagedomain/storagepool relationship becomes invalid because another user detached the storagedomain. In that case, the API just returns the normal error about sd XXX is not attached to sp XXX. So I don't see any problem here. As to making the current API a bit simpler. As I said, making them opaque is problematic as currently the engine is responsible for creating the IDs. As I mentioned in my last post, engine still can specify the ID's when the object is first created. From that point forward the ID never changes so it can be baked into the identifier. Where will this identifier be persisted? Further more, some calls require you to play with these (making a template instead of a snapshot). 
Also, the full chain and topology needs to be completely visible to the engine. Please provide a specific example of how you play with the IDs. I can guess where you are going, but I don't want to divert the thread. The relationship between volumes and images is deceptive at the moment. IMG is the chain and volume is a member, IMGUUID is only used to for verification and to detect when we hit a template going up the chain. When you do operation on images assumptions are being guaranteed about the resulting IDs. When you copy an image, you assume to know all the new IDs as they remain the same. With your method I can't tell what the new opaque result is going to be. Preview mode (another abomination being deprecated) relies on the disconnect between imgUUID and volUUID. Live migration currently moves a lot of the responsibility to the engine. No client
Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Thu, Nov 29, 2012 at 10:00:12AM +0200, Dan Kenigsberg wrote:
On Wed, Nov 28, 2012 at 03:29:35PM -0600, Adam Litke wrote:
On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:

- Original Message -
From: Dan Kenigsberg dan...@redhat.com
To: Alon Bar-Lev alo...@redhat.com
Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, engine-devel engine-de...@ovirt.org, users us...@ovirt.org
Sent: Wednesday, November 28, 2012 10:39:42 PM
Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:

No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.

Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? us...@ovirt.org, please chime in!

I tried to find such, but the more I dig I find that we need to support old legacy.

Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an unupgradable F16). Should we be any better than our (currently single) platform?

We should start and detach from specific distro procedures.

* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern

Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.

It does not mean it should be at /var/lib/vdsm ... :)

I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?

I usually do not make any jokes... A global system setting should not go into a package-specific location.
Usually core dumps are off by default. I like this approach, as an unattended system may quickly consume all disk space because of dumps.

If a host fills up with dumps so quickly, it's a sign that it should not be used for production, and that someone should look into the cores. (P.S. we have a logrotate rule for them in vdsm)

There should be a vdsm-debug-aids (or similar) to perform such changes. Again, I don't think vdsm should (by default) modify any system-wide parameter such as this. But I will be happy to hear more views.

I agree with your statement above that a single package should not override a global system setting. We should really work to remove as many of these from vdsm as we possibly can. It will help to make vdsm a much safer/well-behaved package.

I'm fine with dropping these from vdsm, but I think they are good for ovirt - we would like to (be able to) enforce policy on our nodes. If configuring core dumps is removed from vdsm, it should go somewhere else, or our log-collector users would miss their beloved dumps.

Yes, I agree. From my point of view the plan was to do the following:

1. Remove unnecessary system configuration changes. This includes things like Royce's supervdsm startup process patch (and accompanying sudo-supervdsm conversions) which allows us to remove some of the sudo configuration.
2. Isolate the remaining tweaks into vdsm-tool.
3. Provide a service/program that can be run to configure a system to work in an ovirt-engine controlled cluster.

Doing this allows vdsm to be safely installed on any system as a basic prerequisite for other software.

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API
On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote: They are not future proof as the paradigm is completely different. Storage domain IDs are not static any more (and are not guaranteed to be unique or the same across the cluster. Image IDs represent the ID of the projected data and not the actual unique path. Just as an example, to run a VM you give a list of domains that might contain the needed images in the chain and the image ID of the tip. The paradigm is changed to and most calls get non synchronous number of images and domains. Further more, the APIs themselves are completely different. So future proofing is not really an issue. I don't understand this at all. Perhaps we could all use some education on the architecture of the planned architectural changes. If I can pass an arbitrary list of domainIDs that _might_ contain the data, why wouldn't I just pass all of them every time? In that case, why are they even required since vdsm would have to search anyway? As to making the current API a bit simpler. As I said, making them opaque is problematic as currently the engine is responsible for creating the IDs. As I mentioned in my last post, engine still can specify the ID's when the object is first created. From that point forward the ID never changes so it can be baked into the identifier. Further more, some calls require you to play with these (making a template instead of a snapshot). Also, the full chain and topology needs to be completely visible to the engine. Please provide a specific example of how you play with the IDs. I can guess where you are going, but I don't want to divert the thread. These things, as you said, are problematic. But this is the way things are today. We are changing them. As for task IDs. Currently task IDs are only used for storage and they get persisted to disk. This is WRONG and is not the case with the new storage API. 
Because we moved to an asynchronous message based protocol (json-rpc over TCP\AMQP) there is no need to generate a task ID. It is built into json-rpc. json-rpc specifies that the IDs have to be unique for a client as long as the request is still active. This is good enough as internally we can have a verb for a client to query its own running tasks and a verb to query other host tasks by mangling in the client before the ID. Because the protocol is

So this would rely on the client keeping the connection open and as soon as it disconnects it would lose the ability to query tasks from before the connection went down? I don't know if it's a good idea to conflate message ID's with task ID's. While the protocol can operate asynchronously, some calls have synchronous semantics and others have asynchronous semantics. I would expect sync calls to return their data immediately and async calls to return immediately with either: an error code, or an 'operation started' message and associated ID for querying the status of the operation.

asynchronous all calls are asynchronous by nature as well. Tasks will no longer be persisted or expected to be persisted. It's the caller's responsibility to query the state and see if the operation succeeded or failed if the caller or VDSM died in the middle of the call. The current cleanTask() system can't be used when more than one client is using VDSM and will not be used for anything other than legacy storage.

I agree about not persisting tasks in the future. Although I think finished tasks should remain in memory for some time so they can be queried by a client who must reconnect.

AFAIK, apart from storage, all object IDs are constructed with a single ID, name or alias. VMs, storageConnections, network interfaces. So it's not a real issue. I agree that in the future we should keep the idiom of pass configuration once, name it, and keep using the name to reference the object.

Yes, storage is the major problem here.
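To make the task-ID exchange above concrete, here is a minimal sketch of the two ideas being weighed: tasks keyed by the client plus its request ID ("mangling in the client before the ID"), and finished tasks kept in memory for a while so a reconnecting client can still query them. Everything here is an assumption for illustration only: the `TaskRegistry` class, its method names, and the retention window are hypothetical and are not part of vdsm or of any json-rpc library.

```python
import time


class TaskRegistry:
    """Hypothetical in-memory task table keyed by (client_id, request_id)."""

    def __init__(self, retention_seconds=300.0, clock=time.monotonic):
        self._tasks = {}
        self._retention = retention_seconds  # how long finished tasks stay queryable
        self._clock = clock

    def start(self, client_id, request_id):
        # The request ID only has to be unique per client while the
        # request is active, so the client is part of the key.
        key = (client_id, request_id)
        if key in self._tasks and self._tasks[key]["state"] == "running":
            raise ValueError("request ID already active for this client")
        self._tasks[key] = {"state": "running", "result": None, "finished_at": None}

    def finish(self, client_id, request_id, result):
        task = self._tasks[(client_id, request_id)]
        task.update(state="done", result=result, finished_at=self._clock())

    def query(self, client_id, request_id):
        """Return 'running', 'done', or None if unknown or already expired."""
        self._expire()
        task = self._tasks.get((client_id, request_id))
        return None if task is None else task["state"]

    def tasks_for(self, client_id):
        """List the request IDs still known for one client."""
        self._expire()
        return sorted(rid for (cid, rid) in self._tasks if cid == client_id)

    def _expire(self):
        # Nothing is persisted: finished tasks simply age out of memory.
        now = self._clock()
        stale = [key for key, task in self._tasks.items()
                 if task["state"] == "done"
                 and now - task["finished_at"] > self._retention]
        for key in stale:
            del self._tasks[key]
```

In this sketch a client that reconnects within the retention window can still ask about a finished request, while two clients may reuse the same request ID without colliding. Whether message IDs should double as task IDs at all is exactly the open question in the thread; the sketch does not settle it.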
- Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Ayal Baron aba...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent: Thursday, November 29, 2012 4:18:40 PM Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API On Thu, Nov 29, 2012 at 02:16:42PM -0500, Saggi Mizrahi wrote: This is all only valid for the current storage API the new one doesn't have pools or volumes. Only domains and images. Also, images and domains are more loosely coupled and make this method problematic. I am looking for an incremental way to bridge the differences. It's been 2 years and we still don't have the revamped storage API so I am planning on what we have being around for awhile :) I think that defining object identifiers as opaque structured types is also future proof. In the future an Image-ng object we can drop 'storagepoolID
Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:

- Original Message -
From: Dan Kenigsberg dan...@redhat.com
To: Alon Bar-Lev alo...@redhat.com
Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, engine-devel engine-de...@ovirt.org, users us...@ovirt.org
Sent: Wednesday, November 28, 2012 10:39:42 PM
Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:

No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.

Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh to our vast install base? us...@ovirt.org, please chime in!

I tried to find such, but the more I dig I find that we need to support old legacy.

Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an unupgradable F16). Should we be any better than our (currently single) platform?

We should start and detach from specific distro procedures.

* legacy-removed: change machine-wide core file
  # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern

Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.

It does not mean it should be at /var/lib/vdsm ... :)

I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?

I usually do not make any jokes... A global system setting should not go into a package-specific location. Usually core dumps are off by default. I like this approach, as an unattended system may quickly consume all disk space because of dumps.
If a host fills up with dumps so quickly, it's a sign that it should not be used for production, and that someone should look into the cores. (P.S. we have a logrotate rule for them in vdsm)

There should be a vdsm-debug-aids (or similar) to perform such changes. Again, I don't think vdsm should (by default) modify any system-wide parameter such as this. But I will be happy to hear more views.

I agree with your statement above that a single package should not override a global system setting. We should really work to remove as many of these from vdsm as we possibly can. It will help to make vdsm a much safer/well-behaved package.

If a sysadmin manually enables dumps, he may do this at a location of his own choice.

Note that we've just swapped hats: you're arguing for letting a local admin log in and mess with system configuration, and I'm for keeping a centralized feature for storing and collecting core dumps.

As problems like crashes are investigated per case and reproduction scenario. But again, I may be wrong and we should have a VDSM API command to start/stop storing dumps and manage this via its master...

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary
On Tue, Nov 27, 2012 at 10:42:00AM +0200, Livnat Peer wrote: On 26/11/12 16:59, Adam Litke wrote: On Mon, Nov 26, 2012 at 02:57:19PM +0200, Livnat Peer wrote: On 26/11/12 03:15, Shu Ming wrote: Livnat, Thanks for your summary. I got comments below. 2012-11-25 18:53, Livnat Peer: Hi All, We have been discussing $subject for a while and I'd like to summarized what we agreed and disagreed on thus far. The way I see it there are two related discussions: 1. Getting VDSM networking stack to be distribution agnostic. - We are all in agreement that VDSM API should be generic enough to incorporate multiple implementation. (discussed on this thread: Alon's suggestion, Mark's patch for adding support for netcf etc.) - We would like to maintain at least one implementation as the working/up-to-date implementation for our users, this implementation should be distribution agnostic (as we all acknowledge this is an important goal for VDSM). I also think that with the agreement of this community we can choose to change our focus, from time to time, from one implementation to another as we see fit (today it can be OVS+netcf and in a few months we'll use the quantum based implementation if we agree it is better) 2. The second discussion is about persisting the network configuration on the host vs. dynamically retrieving it from a centralized location like the engine. Danken raised a concern that even if going with the dynamic approach the host should persist the management network configuration. About dynamical retrieving from a centralized location, when will the retrieving start? Just in the very early stage of host booting before network functions? Or after the host startup and in the normal running state of the host? Before retrieving the configuration, how does the host network connecting to the engine? I think we need a basic well known network between hosts and the engine first. Then after the retrieving, hosts should reconfigure the network for later management. 
However, the timing to retrieve and reconfigure are challenging. We did not discuss the dynamic approach in details on the list so far and I think this is a good opportunity to start this discussion... From what was discussed previously I can say that the need for a well known network was raised by danken, it was referred to as the management network, this network would be used for pulling the full host network configuration from the centralized location, at this point the engine. About the timing for retrieving the configuration, there are several approaches. One of them was described by Alon, and I think he'll join this discussion and maybe put it in his own words, but the idea was to 'keep' the network synchronized at all times. When the host have communication channel to the engine and the engine detects there is a mismatch in the host configuration, the engine initiates 'apply network configuration' action on the host. Using this approach we'll have a single path of code to maintain and that would reduce code complexity and bugs - That's quoting Alon Bar Lev (Alon I hope I did not twisted your words/idea). On the other hand the above approach makes local tweaks on the host (done manually by the administrator) much harder. I worry a lot about the above if we take the dynamic approach. It seems we'd need to introduce before/after 'apply network configuration' hooks where the admin could add custom config commands that aren't yet modeled by engine. yes, and I'm not sure the administrators would like the fact that we are 'forcing' them to write everything in a script and getting familiar with VDSM hooking mechanism (which in some cases require the use of custom properties on the engine level) instead of running a simple command line. Any other approaches ? Static configuration has the advantage of allowing a host to bring itself back online independent of the engine. This is also useful for anyone who may want to deploy a vdsm node in standalone mode. 
I think it would be possible to easily support a quasi-static configuration mode simply by extending the design of the dynamic approach slightly. In dynamic mode, the network configuration is passed down as a well-defined data structure. When a particular configuration has been committed, vdsm could write a copy of that configuration data structure to /var/run/vdsm/network-config.json. During a subsequent boot, if the engine cannot be contacted after activating the management network, the cached configuration can be applied using the same code as for dynamic mode. We'd have to flesh out the circumstances under which this would happen. I like this approach a lot but we need to consider that network configuration is an accumulated state, for example - 1. The engine sends a setup
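The quasi-static mode proposed above (cache the last committed configuration, and fall back to it at boot if the engine cannot be reached) can be sketched roughly as follows. This is only an illustration under stated assumptions: `save_committed_config`, `choose_boot_config`, and the `fetch_from_engine` callable are hypothetical names, not vdsm functions, and the configuration is treated as an opaque JSON-serializable structure. The cache path is left as a parameter; the thread suggests /var/run/vdsm/network-config.json, but on distributions where /var/run is a tmpfs that file would not survive a reboot, so a persistent location may be needed.

```python
import json
import os
import tempfile


def save_committed_config(config, path):
    """Atomically write the committed configuration to `path` (hypothetical helper)."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(config, f, sort_keys=True)
        # Rename over the old file so a crash never leaves a half-written cache.
        os.replace(tmp_path, path)
    except Exception:
        os.unlink(tmp_path)
        raise


def choose_boot_config(fetch_from_engine, path):
    """Prefer the engine's configuration; fall back to the cached copy.

    `fetch_from_engine` is an assumed callable that raises OSError
    (for example ConnectionError) when the engine cannot be contacted.
    Returns a (config, source) pair, with source in {"engine", "cache", "none"}.
    """
    try:
        return fetch_from_engine(), "engine"
    except OSError:
        pass
    try:
        with open(path) as f:
            return json.load(f), "cache"
    except (OSError, ValueError):
        # No cache, or an unreadable one: only the management network is up.
        return None, "none"
```

Either result would then go through the same apply code as in dynamic mode, which is the point of the proposal. The sketch caches a single snapshot and does not address the accumulated-state concern raised in the reply.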
Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary
On Mon, Nov 26, 2012 at 06:13:01PM -0500, Alon Bar-Lev wrote: Hello, - Original Message - From: Adam Litke a...@us.ibm.com To: Alon Bar-Lev alo...@redhat.com Cc: Livnat Peer lp...@redhat.com, VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Tuesday, November 27, 2012 12:51:36 AM Subject: Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary Nice writeup! I like where this is going but see my comments inline below. On Mon, Nov 26, 2012 at 03:18:22PM -0500, Alon Bar-Lev wrote: - Original Message - From: Livnat Peer lp...@redhat.com To: Shu Ming shum...@linux.vnet.ibm.com Cc: Alon Bar-Lev abar...@redhat.com, VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Monday, November 26, 2012 2:57:19 PM Subject: Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary On 26/11/12 03:15, Shu Ming wrote: Livnat, Thanks for your summary. I got comments below. 2012-11-25 18:53, Livnat Peer: Hi All, We have been discussing $subject for a while and I'd like to summarized what we agreed and disagreed on thus far. The way I see it there are two related discussions: 1. Getting VDSM networking stack to be distribution agnostic. - We are all in agreement that VDSM API should be generic enough to incorporate multiple implementation. (discussed on this thread: Alon's suggestion, Mark's patch for adding support for netcf etc.) - We would like to maintain at least one implementation as the working/up-to-date implementation for our users, this implementation should be distribution agnostic (as we all acknowledge this is an important goal for VDSM). I also think that with the agreement of this community we can choose to change our focus, from time to time, from one implementation to another as we see fit (today it can be OVS+netcf and in a few months we'll use the quantum based implementation if we agree it is better) 2. The second discussion is about persisting the network configuration on the host vs. 
dynamically retrieving it from a centralized location like the engine. Danken raised a concern that even if going with the dynamic approach the host should persist the management network configuration. About dynamical retrieving from a centralized location, when will the retrieving start? Just in the very early stage of host booting before network functions? Or after the host startup and in the normal running state of the host? Before retrieving the configuration, how does the host network connecting to the engine? I think we need a basic well known network between hosts and the engine first. Then after the retrieving, hosts should reconfigure the network for later management. However, the timing to retrieve and reconfigure are challenging. We did not discuss the dynamic approach in details on the list so far and I think this is a good opportunity to start this discussion... From what was discussed previously I can say that the need for a well known network was raised by danken, it was referred to as the management network, this network would be used for pulling the full host network configuration from the centralized location, at this point the engine. About the timing for retrieving the configuration, there are several approaches. One of them was described by Alon, and I think he'll join this discussion and maybe put it in his own words, but the idea was to 'keep' the network synchronized at all times. When the host have communication channel to the engine and the engine detects there is a mismatch in the host configuration, the engine initiates 'apply network configuration' action on the host. Using this approach we'll have a single path of code to maintain and that would reduce code complexity and bugs - That's quoting Alon Bar Lev (Alon I hope I did not twisted your words/idea). On the other hand the above approach makes local tweaks on the host (done manually by the administrator) much harder. Any other approaches ? 
I'd like to add a more general question to the discussion what are the advantages of taking the dynamic approach? So far I collected two reasons: -It is a 'cleaner' design, removes complexity on VDSM code, easier to maintain going forward, and less bug prone (I agree with that one, as long as we keep the retrieving configuration mechanism/algorithm simple). -It adheres to the idea of having a stateless hypervisor - some more input on this point would be appreciated Any other advantages? discussing the benefits of having the persisted Livnat Sorry for the delay. Some more
Re: [vdsm] [RFC]about the implement of text-based console
-starter for me. 3. Extend Spice to support console Is it possible to implement a spice client can be run in pure text mode without GUI environment? If we extend the protocol to support console stream but the client must be run in GUI, it will be less useful. pros No new VMs and server process, easy for maintenance. cons Must wait for Spice developers to commit the support. Need special client program in CLI, the user may prefer existing client program like ssh. It not a big problem because this feature can be put in to oVirt shell. Can someone familiar with spice weigh in on whether a console connection as described here could survive a live migration? In general, I really like this approach if it can be done cleanly. Spice is already oVirt's primary end-user application so in a deployed environment, we'd expect users to already have this program. If a scripted interface is required, I am sure that I/O redirection could be added either to the existing spice client or as part of a new spice-console program. This approach also works with a vdsm that is connected to ovirt-engine or running in standalone mode. This seems like the best approach to me so long as the spice team agrees that it can and should be done. 4. oVirt shell - Engine - libvirtd This is the current workaround described in http://wiki.ovirt.org/wiki/Features/Serial_Console_in_CLI#Currently_operational_workaround The design is good but I do not like Engine talking to libvirtd directly, thus comes the VDSM console streaming API below. Work to do Provide console streaming API from Engine to be invoked in oVirt shell. Implement the serial-console command in oVirt shell. pros Support migration. Engine can reconnect to the guest automatically after migration while keeping the connection from oVirt shell. Fit well in the current oVirt architecture: no new server process introduced, no new VM introduced, easy to maintain and manage. cons Engine talking to libvirtd directly breaks the encapsulation of VDSM. 
Users only can get the console stream from Engine, they can not directly connect to the host as VNC and the above two sshd solutions do.

I agree that this is a layering violation and should not be pursued as the long-term solution. We do not want to expose the libvirt connection outside of the host.

5. VDSM console streaming API

Implement new APIs in VDSM to forward the raw data from console. It exposes getConsoleReadStream() and getConsoleWriteStream() via XMLRPC binding. Then Engine can get the console data stream via API instead of directly connecting to libvirtd. Other things will be the same as solution 4.

Work to do
- Implement getConsoleReadStream() and getConsoleWriteStream() in VDSM.
- Provide console streaming API from Engine to be invoked in oVirt shell.
- Implement the serial-console command in oVirt shell.
- Optional: Implement a client program in vdsClient to consume the stream API.

pros
Same as solution 4

cons
We can not allow an ordinary user to directly connect to VDSM and invoke the stream API, because there is no ACL in VDSM. Once a client cert is set up for the ordinary user, he can call all the APIs in VDSM and get total control. So the ordinary user can only get the stream from Engine, and we leave Engine to do the ACL.

One issue that was raised is console buffering. What happens if a client does not call getConsoleReadStream() fast enough? Will characters be dropped? This could create a reliability problem and would make scripting against this interface risky at best.

I like solution 4 best.

I will note again for others that you mentioned you like #5 (console streaming API) best. I think the spice approach is best based on weighing the following requirements:

1. Simple and easy to maintain
2. Can access via the host or ovirt-engine
3. Scripting mode is possible
4. Reliable

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
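The buffering concern raised above can be illustrated with a small sketch. It assumes, purely for illustration, that the host side keeps a fixed-size buffer of console output between client reads; the thread does not say how getConsoleReadStream() would actually buffer, and `ConsoleBuffer` is a hypothetical class, not a vdsm API.

```python
class ConsoleBuffer:
    """Hypothetical bounded buffer: oldest bytes are dropped on overflow."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._data = bytearray()
        self._dropped = 0

    def write(self, data):
        """Called whenever the guest emits console output."""
        self._data.extend(data)
        overflow = len(self._data) - self._capacity
        if overflow > 0:
            # The client did not read fast enough: discard the oldest bytes
            # but remember how many were lost.
            del self._data[:overflow]
            self._dropped += overflow

    def read(self):
        """Return (pending_bytes, bytes_dropped_since_last_read) and reset."""
        data, dropped = bytes(self._data), self._dropped
        self._data.clear()
        self._dropped = 0
        return data, dropped
```

With a capacity of 8 bytes, writing `b"login: "` and then `b"root\n"` before any read leaves only the last 8 bytes, and a script waiting for the `login:` prompt would never see it. Reporting the drop count at least lets a client detect the loss, which is one possible way to address the reliability requirement listed above.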
Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary
of persistent configuration is: - To allow the host to operate independently of the engine in either a failure scenario or in a standalone configuration. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary
on hosts serving as hypervisors has the flexibility argument. However, in a mass deployment, a large data center or a dynamic environment, this flexibility becomes a liability.

Today oVirt plays in the small data center realm so I do think it's important to give appropriate weight to the flexibility argument. It should be possible to build different environments based on the needs of the deployment.

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] Future of Vdsm network configuration
On Wed, Nov 14, 2012 at 11:53:06AM +0200, Livnat Peer wrote: On 14/11/12 00:28, Adam Litke wrote: On Sun, Nov 11, 2012 at 09:46:43AM -0500, Alon Bar-Lev wrote: - Original Message - From: Dan Kenigsberg dan...@redhat.com To: vdsm-de...@fedorahosted.org Sent: Sunday, November 11, 2012 4:07:30 PM Subject: [vdsm] Future of Vdsm network configuration

Hi, Nowadays, when vdsm receives the setupNetwork verb, it mangles /etc/sysconfig/network-scripts/ifcfg-* files and restarts the network service, so they are read by the responsible SysV service. This is very much Fedora-oriented, and not up to date with the new themes in Linux network configuration. Since we want oVirt and Vdsm to be distribution agnostic, and support new features, we have to change.

setupNetwork is responsible for two different things: (1) configure the host networking interfaces, and (2) create virtual networks for guests and connect them to the world over (1).

Functionality (2) is provided by building Linux software bridges, and vlan devices. I'd like to explore moving it to Open vSwitch, which would enable a host of functionalities that we currently lack (e.g. tunneling). One thing that worries me is the need to reimplement our config snapshot/recovery on ovs's database.

As far as I know, ovs is unable to maintain host level parameters of interfaces (e.g. eth0's IPv4 address), so we need another tool for functionality (1): either speak to NetworkManager directly, or use NetCF, via its libvirt virInterface* wrapper. I have minor worries about NetCF's breadth of testing and usage; I know it is intended to be cross-platform, but unlike ovs, I am not aware of a wide Debian usage thereof. On the other hand, its API has been ready for vdsm's usage for quite a while. NetworkManager has become ubiquitous, and we'd better integrate with it better than our current setting of NM_CONTROLLED=no.
But as DPB tells us, https://lists.fedorahosted.org/pipermail/vdsm-devel/2012-November/001677.html we'd better offload integration with NM to libvirt.

We would like to take network configuration in VDSM to the next level and make it distribution agnostic, in addition to setting up the infrastructure for more advanced features to be used going forward. The path we think of taking is to integrate with OVS and, for feature completeness, use NetCF via its libvirt virInterface* wrapper. Any comments or feedback on this proposal are welcome. Thanks to the oVirt net team members whose input has helped in writing this email.

Hi, As far as I see this, network manager is a monster that is a huge dependency to have just to create bridges or configure network interfaces... It is true that on a host where network manager lives it would not be polite to define network resources other than via its interface; however, I don't like that we force network manager. libvirt has long been used not as a virtualization library but as a system management agent, and I am not sure this is the best system agent I would have chosen. I think that all the terms and building blocks got lost in time... and the resulting integration became more and more complex. Stabilizing such a multi-layered component environment is much harder than a monolithic environment. I would really want to see vdsm as a monolithic component with full control over its resources; I believe this is the only way vdsm can be stable enough to be production grade. The hypervisor should be a total slave of the manager (or cluster), so I have no problem in bypassing/disabling any distribution specific tool in favour of atoms (brctl, iproute), in non-persistent mode. I know this entails some more work, but I don't think it is that complex to implement and maintain. Just my 2 cents...

I couldn't disagree more. What you are suggesting requires that we reimplement every single networking feature in oVirt by ourselves.
If we want to support the (absolutely critical) goal of being distro agnostic, then we need to implement the same functionality across multiple distros too. This is more work than we will ever be able to keep up with. If you think it's hard to stabilize the integration of an external networking library, imagine how hard it will be to stabilize our own rewritten and buggy version. This is not how open source is supposed to work. We should be assembling distinct, modular, pre-existing components together when they are available. If NetworkManager has integration problems, let's work upstream to fix them. If its dependencies are too great, let's modularize it so we don't need to ship the parts that we don't need.

I agree with Adam on this one: reimplementing the networking management layer by ourselves using only atoms seems like duplication of work that was already done and available for our use both by NM
Re: [vdsm] Review needed: 3.2 release feature -- libvdsm
On Tue, Nov 06, 2012 at 05:49:22PM +0200, Dan Kenigsberg wrote: On Mon, Oct 29, 2012 at 10:20:04AM -0500, Adam Litke wrote:

Hi everyone, libvdsm is listed as a release feature for 3.2 (preview only)[1][2]. There is a set of patches up in gerrit that could use a wide review from the community. The plan is to merge the new json-rpc server[3] first so if you could concentrate your reviews there it would yield the greatest benefit. Thanks!
[1] http://wiki.ovirt.org/wiki/OVirt_3.2_release-management
[2] http://wiki.ovirt.org/wiki/Features/libvdsm
[3] http://gerrit.ovirt.org/#/c/8614/

[3] defines the format of each message as <size><json-data> where size is a binary value, used to split a (tcp) stream into messages. I would like to consider another splitting scheme, which I find better suited to the textual nature of jsonrpc: terminate each message with the newline character. It makes the protocol easier to sniff and debug (in case you've missed part of a message). The downside is that we would need to require clients to escape literal newlines, and unescape them in responses (both are done by python's json module, and the latter is part of the json standard).

Thanks for bringing up this point. I would like to make this protocol compatible with existing clients. Is there a standard for segmenting messages over the channel? I suppose it depends on the transport layer.

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] Review needed: 3.2 release feature -- libvdsm
On Tue, Nov 06, 2012 at 11:40:53AM -0600, Tony Asleson wrote: On 11/06/2012 09:49 AM, Dan Kenigsberg wrote: On Mon, Oct 29, 2012 at 10:20:04AM -0500, Adam Litke wrote:

Hi everyone, libvdsm is listed as a release feature for 3.2 (preview only)[1][2]. There is a set of patches up in gerrit that could use a wide review from the community. The plan is to merge the new json-rpc server[3] first so if you could concentrate your reviews there it would yield the greatest benefit. Thanks!
[1] http://wiki.ovirt.org/wiki/OVirt_3.2_release-management
[2] http://wiki.ovirt.org/wiki/Features/libvdsm
[3] http://gerrit.ovirt.org/#/c/8614/

[3] defines the format of each message as <size><json-data> where size is a binary value, used to split a (tcp) stream into messages. I would like to consider another splitting scheme, which I find better suited to the textual nature of jsonrpc: terminate each message with the newline character. It makes the protocol easier to sniff and debug (in case you've missed part of a message). The downside is that we would need to require clients to escape literal newlines, and unescape them in responses (both are done by python's json module, and the latter is part of the json standard).

I use json-rpc for IPC in libStoragemgmt (out of process plug-ins) with unix domain sockets. I adopted the <size><json-data> model as well*. I chose this because it allows the use of non-stream capable json parsers. I wanted to ensure that the transport and protocol would be language and parser agnostic. You could achieve the message separation with newlines as you suggest, but then you may have to parse the message stream twice: once to find the message delimiter and once again to parse the json itself, depending on the json parser. Having the size at the beginning of the message is incredibly convenient from a coding efficiency standpoint. As for debug, I just log the message payload if needed.
I haven't had the need to use a packet trace, but I'm not sure having a single newline separating messages would be obvious in a single frame capture? Would it be possible to compromise and leave the length and add the newline at the end? So <size><payload><new line>? You could then pass the message payload to the parser without having to escape the newlines?

Thanks for weighing in on this! If you use the size and newline, how do you account for the newline char in the size value? It seems unnecessary to include this character to me since you can use a combination of logfile analysis and scanning the data stream for 'id': to find method calls and responses.

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
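The two framing schemes discussed in this thread can be compared with a short Python sketch. It is only an illustration: the thread says the size is "a binary value" without giving its width, so the 4-byte big-endian prefix used here is an assumption, as are the function names and the "ping" method in the example.

```python
import json
import struct


def frame_sized(msg):
    """Size-prefixed framing: assumed 4-byte big-endian length, then the JSON payload."""
    payload = json.dumps(msg).encode("utf-8")
    return struct.pack("!I", len(payload)) + payload


def split_sized(stream):
    """Return (complete messages, unconsumed bytes) for a size-prefixed stream."""
    messages, offset = [], 0
    while offset + 4 <= len(stream):
        (size,) = struct.unpack_from("!I", stream, offset)
        start = offset + 4
        if start + size > len(stream):
            break  # the last message has not fully arrived yet
        messages.append(json.loads(stream[start:start + size].decode("utf-8")))
        offset = start + size
    return messages, stream[offset:]


def frame_newline(msg):
    """Newline framing: json.dumps escapes newlines inside strings, so b'\\n' is a safe delimiter."""
    return json.dumps(msg).encode("utf-8") + b"\n"


def split_newline(stream):
    """Return (complete messages, unconsumed bytes) for a newline-delimited stream."""
    *complete, rest = stream.split(b"\n")
    return [json.loads(line.decode("utf-8")) for line in complete if line], rest


calls = [{"id": 1, "method": "ping", "params": ["line1\nline2"]},
         {"id": 2, "method": "ping", "params": []}]
sized = b"".join(frame_sized(c) for c in calls)
lined = b"".join(frame_newline(c) for c in calls)
print(split_sized(sized)[0] == calls, split_newline(lined)[0] == calls)
```

The size-prefixed reader never inspects the payload before handing it to the JSON parser, which is the convenience Tony describes; the newline reader has to scan for the delimiter first but produces a stream that is readable in a capture.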
[vdsm] Review needed: 3.2 release feature -- libvdsm
Hi everyone, libvdsm is listed as a release feature for 3.2 (preview only)[1][2]. There is a set of patches up in gerrit that could use a wide review from the community. The plan is to merge the new json-rpc server[3] first so if you could concentrate your reviews there it would yield the greatest benefit. Thanks! [1] http://wiki.ovirt.org/wiki/OVirt_3.2_release-management [2] http://wiki.ovirt.org/wiki/Features/libvdsm [3] http://gerrit.ovirt.org/#/c/8614/ -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
[vdsm] New component 'mom' added to Bugzilla
Hi all, MOM is becoming a bigger part of oVirt and unfortunately it may have bugs at some point :( Thanks to Yaniv we have a new 'mom' component in oVirt's bugzilla where you can report these. To file a new bug against MOM: https://bugzilla.redhat.com/enter_bug.cgi?product=oVirt;component=mom Thanks! -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] vdsm API schema
On Sun, Oct 21, 2012 at 09:09:52AM +0200, Itamar Heim wrote: On 07/15/2012 03:12 AM, Adam Litke wrote:

For the past few weeks I have been working on creating a schema that fully describes the vdsm API. I am mostly finished with that effort and I wanted to share the results with the team. Attached are two files: the raw schema and an html document with cross-linked type information. This should already be useful in its current form, but I have bigger plans. I would first like to get help to correct errors in the schema. Then, I will start the process of writing a code generator that will create C/gObject code that we can compile into a libvdsm with language bindings for python, java, etc. Please take a look at the attached files and let me know what you think. P.S. I tried to attach these to the oVirt Wiki, but they are not permitted file types.

Hi Adam, that's quite a big schema to review. Have you thought about ways to solicit input for it? (Maybe schedule per-topic reviews of the new schema/api for VM operations (virt), network (host level, vm level), storage, sla policy, etc.)

For the first pass, we are trying to replicate the current API as much as possible. For subsequent refactoring, I'd expect the discussions to occur in the community around the patches that are implementing the proposed changes.

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] Change in vdsm[master]: schema: New type VmParameters
On Thu, Oct 18, 2012 at 02:12:58AM -0400, lvro...@linux.vnet.ibm.com wrote: Royce Lv has posted comments on this change. Change subject: schema: New type VmParameters .. Patch Set 4:

Adam, I see three kinds of VM-related descriptions spread through the code:
(1) vm.conf: queried via vm.status() (vm.py), and used as the return value when the VM's conf is changed (changeCD, hotplugDisk).
(2) vm.stats: queried via vm.getStats() (vm.py); these are the VM's live stats, used when calling getVmStats (and also used by MOM).
(3) vm.parameter: the parameters passed to vm.create() (API.py).
You are trying to split (3) from (1), but in my view the live info to split from (1) should be (2).

To me, VmDefinition contains the hardware properties of the VM (things like devices, amount of memory, number of cpus). It also contains things that can only be known at runtime (VNC display port, device bus information (if not specified in advance), current cdrom disk, etc). VmStatistics are different because they are measured (network activity, cpu usage, etc). VmParameters is like a streamlined VmDefinition where we remove items that cannot be specified at create time.

-- To view, visit http://gerrit.ovirt.org/7839 To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I00d1b9aed55cbfc2210c1a4091bce17d45b90e67
Gerrit-PatchSet: 4
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Adam Litke a...@us.ibm.com
Gerrit-Reviewer: Adam Litke a...@us.ibm.com
Gerrit-Reviewer: Federico Simoncelli fsimo...@redhat.com
Gerrit-Reviewer: Royce Lv lvro...@linux.vnet.ibm.com
Gerrit-Reviewer: Saggi Mizrahi smizr...@redhat.com

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
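The relationship between VmDefinition and VmParameters described above can be illustrated with a small Python sketch. The key names (memSize, smp, displayPort, cdrom) and the helper are hypothetical stand-ins chosen for this example; the real schema defines its own field names.

```python
# Hypothetical keys that are only known once the VM is running.
RUNTIME_ONLY_KEYS = frozenset({"displayPort", "cdrom"})


def to_vm_parameters(vm_definition):
    """Return a VmParameters-like dict: the definition minus runtime-only items."""
    return {key: value for key, value in vm_definition.items()
            if key not in RUNTIME_ONLY_KEYS}


definition = {
    "memSize": 4096,                  # hardware property, valid at create time
    "smp": 2,                         # hardware property, valid at create time
    "displayPort": 5900,              # assigned at runtime
    "cdrom": "/path/to/current.iso",  # current media, known at runtime
}
print(to_vm_parameters(definition))
```

Measured values such as CPU usage would live in a separate VmStatistics structure and never appear in either dict.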
Re: [vdsm] vdsClient error after reinstalling vdsm
I have found that I cannot run vdsClient from within the vdsm source tree. Is it possible that this is the problem you are seeing as well? Perhaps after rebooting you logged in and were in a different directory?

On Wed, Oct 17, 2012 at 10:53:04AM -0400, Laszlo Hornyak wrote:
Hi! This is a low priority problem. Each time I reinstall vdsm from rpm, I get this error when running vdsClient:

Traceback (most recent call last):
  File "/usr/lib64/python2.6/runpy.py", line 122, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.6/runpy.py", line 34, in _run_code
    exec code in run_globals
  File "/usr/share/vdsm/vdsClient.py", line 28, in <module>
    from vdsm import vdscli
ImportError: cannot import name vdscli

And after a reboot it works fine again. Very strange behavior. Does anyone know how to make it work without a reboot? Thx, Laszlo

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
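One way to check the source-tree theory is to ask Python where it would load a package from before importing it. The sketch below uses a modern Python 3 API (importlib.util.find_spec), not the Python 2.6 shown in the traceback, and the module_origin helper is an illustrative name rather than anything shipped with vdsm.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module or package would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run this from the directory where vdsClient fails. If the path printed for
# "vdsm" points into the current directory rather than site-packages, a local
# "vdsm" directory is shadowing the installed package.
print(module_origin("json"))
print(module_origin("vdsm"))
```

If the origin is a local directory that lacks vdscli, changing to another directory before running vdsClient should be enough, with no reboot needed.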
Re: [vdsm] [RFC]about the implement of text-based console
On Mon, Oct 15, 2012 at 04:40:00AM -0400, Dan Yasny wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Zhou Zheng Sheng zhshz...@linux.vnet.ibm.com Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Friday, 12 October, 2012 3:10:57 PM Subject: Re: [vdsm] [RFC]about the implement of text-based console On Fri, Oct 12, 2012 at 04:55:20PM +0800, Zhou Zheng Sheng wrote: on 09/04/2012 22:19, Ryan Harper wrote: * Dan Kenigsberg dan...@redhat.com [2012-09-04 05:53]: On Tue, Sep 04, 2012 at 03:05:37PM +0800, Xu He Jie wrote: On 09/03/2012 10:33 PM, Dan Kenigsberg wrote: On Thu, Aug 30, 2012 at 04:26:31PM -0500, Adam Litke wrote: On Thu, Aug 30, 2012 at 11:32:02AM +0800, Xu He Jie wrote: Hi, I submited a patch for text-based console http://gerrit.ovirt.org/#/c/7165/ the issue I want to discussing as below: 1. fix port VS dynamic port Use fix port for all VM's console. connect console with 'ssh vmUUID@ip -p port'. Distinguishing VM by vmUUID. The current implement was vdsm will allocated port for console dynamically and spawn sub-process when VM creating. In sub-process the main thread responsible for accept new connection and dispatch output of console to each connection. When new connection is coming, main processing create new thread for each new connection. Dynamic port will allocated port for each VM and use range port. It isn't good for firewall rules. so I got a suggestion that use fix port. and connect console with 'ssh vmuuid@hostip -p fixport'. this is simple for user. We need one process for accept new connection from fix port and when new connection is coming, spawn sub-process for each vm. But because the console only can open by one process, main process need responsible for dispatching console's output of all vms and all connection. So the code will be a little complex then dynamic port. So this is dynamic port VS fix port and simple code VS complex code. From a usability point of view, I think the fixed port suggestion is nicer. 
This means that a system administrator needs only to open one port to enable remote console access. If your initial implementation limits console access to one connection per VM would that simplify the code? Yes, using a fixed port for all consoles of all VMs seems like a cooler idea. Besides the firewall issue, there's user experience: instead of calling getVmStats to tell the vm port, and then use ssh, only one ssh call is needed. (Taking this one step further - it would make sense to add another layer on top, directing console clients to the specific host currently running the Vm.) I did not take a close look at your implementation, and did not research this myself, but have you considered using sshd for this? I suppose you can configure sshd to collect the list of known users from `getAllVmStats`, and force it to run a command that redirects VM's console to the ssh client. It has a potential of being a more robust implementation. I have considered using sshd and ssh tunnel. They can't implement fixed port and share console. Would you elaborate on that? Usually sshd listens to a fixed port 22, and allows multiple users to have independet shells. What do you mean by share console? Current implement we can do anything that what we want. Yes, it is completely under our control, but there are down sides, too: we have to maintain another process, and another entry point, instead of configuring a universally-used, well maintained and debugged application. Think of the security implications of having another remote shell access point to a host. I'd much rather trust sshd if we can make it work. Dan. At first glance, the standard sshd on the host is stronger and more robust than a custom ssh server, but the risk using the host sshd is high. If we implement this feature via host ssd, when a hacker attacks the sshd successfully, he will get access to the host shell. 
After all, the custom ssh server is not for accessing host shell, but just for forwarding the data from the guest console (a host /dev/pts/X device). If we just use a custom ssh server, the code in this server only does 1. auth, 2. data forwarding, when the hacker attacks, he just gets access to that virtual machine. Notice that there is no code written about login to the host in the custom ssh server, and the custom ssh server can be protected under selinux, only allowing it to access /dev/pts/X. In fact using a custom VNC server in qemu is as risky as a custom ssh server in vdsm. If we accepts the former one, then I can accepts the latter one. The consideration is how
Re: [vdsm] [RFC]about the implement of text-based console
On Fri, Oct 12, 2012 at 04:55:20PM +0800, Zhou Zheng Sheng wrote: on 09/04/2012 22:19, Ryan Harper wrote: * Dan Kenigsberg dan...@redhat.com [2012-09-04 05:53]: On Tue, Sep 04, 2012 at 03:05:37PM +0800, Xu He Jie wrote: On 09/03/2012 10:33 PM, Dan Kenigsberg wrote: On Thu, Aug 30, 2012 at 04:26:31PM -0500, Adam Litke wrote: On Thu, Aug 30, 2012 at 11:32:02AM +0800, Xu He Jie wrote: Hi, I submited a patch for text-based console http://gerrit.ovirt.org/#/c/7165/ the issue I want to discussing as below: 1. fix port VS dynamic port Use fix port for all VM's console. connect console with 'ssh vmUUID@ip -p port'. Distinguishing VM by vmUUID. The current implement was vdsm will allocated port for console dynamically and spawn sub-process when VM creating. In sub-process the main thread responsible for accept new connection and dispatch output of console to each connection. When new connection is coming, main processing create new thread for each new connection. Dynamic port will allocated port for each VM and use range port. It isn't good for firewall rules. so I got a suggestion that use fix port. and connect console with 'ssh vmuuid@hostip -p fixport'. this is simple for user. We need one process for accept new connection from fix port and when new connection is coming, spawn sub-process for each vm. But because the console only can open by one process, main process need responsible for dispatching console's output of all vms and all connection. So the code will be a little complex then dynamic port. So this is dynamic port VS fix port and simple code VS complex code. From a usability point of view, I think the fixed port suggestion is nicer. This means that a system administrator needs only to open one port to enable remote console access. If your initial implementation limits console access to one connection per VM would that simplify the code? Yes, using a fixed port for all consoles of all VMs seems like a cooler idea. 
Besides the firewall issue, there's user experience: instead of calling getVmStats to tell the vm port, and then use ssh, only one ssh call is needed. (Taking this one step further - it would make sense to add another layer on top, directing console clients to the specific host currently running the Vm.) I did not take a close look at your implementation, and did not research this myself, but have you considered using sshd for this? I suppose you can configure sshd to collect the list of known users from `getAllVmStats`, and force it to run a command that redirects VM's console to the ssh client. It has a potential of being a more robust implementation. I have considered using sshd and ssh tunnel. They can't implement fixed port and share console. Would you elaborate on that? Usually sshd listens to a fixed port 22, and allows multiple users to have independet shells. What do you mean by share console? Current implement we can do anything that what we want. Yes, it is completely under our control, but there are down sides, too: we have to maintain another process, and another entry point, instead of configuring a universally-used, well maintained and debugged application. Think of the security implications of having another remote shell access point to a host. I'd much rather trust sshd if we can make it work. Dan. At first glance, the standard sshd on the host is stronger and more robust than a custom ssh server, but the risk using the host sshd is high. If we implement this feature via host ssd, when a hacker attacks the sshd successfully, he will get access to the host shell. After all, the custom ssh server is not for accessing host shell, but just for forwarding the data from the guest console (a host /dev/pts/X device). If we just use a custom ssh server, the code in this server only does 1. auth, 2. data forwarding, when the hacker attacks, he just gets access to that virtual machine. 
Notice that there is no code written about login to the host in the custom ssh server, and the custom ssh server can be protected under selinux, only allowing it to access /dev/pts/X. In fact using a custom VNC server in qemu is as risky as a custom ssh server in vdsm. If we accepts the former one, then I can accepts the latter one. The consideration is how robust of the custom ssh server, and the difficulty to maintain it. In He Jie's current patch, the ssh auth and transport library is an open-source third-party project, unless the project is well maintained and well proven, using it can be risky. So my opinion is using neither the host sshd, nor a custom ssh server. Maybe we can apply the suggestion from Dan Yasny, running a standard sshd in a very small VM in every host, and forward data from this VM to other guest consoles. The ssh part is in the VM, then our work is just forwarding data from the VM via virto serial channels, to the guest via the pty. I really
Re: [vdsm] Mom Balloon policy issue
Thanks for writing this. Some thoughts inline, below. Also, cc'ing some lists in case other folks want to participate in the discussion.

On Tue, Oct 09, 2012 at 01:12:30PM -0400, Noam Slomianko wrote:
Greetings, I've fiddled around with ballooning and wanted to raise a question for debate. Currently, as long as the host is under memory pressure, MOM will try to reclaim memory from all guests with more free memory than a given threshold.

Main issue: Guest allocated memory is not the same as the resident (physical) memory used by qemu. This means that when memory is reclaimed (the balloon is inflated) we might not get back as much memory as planned (or none at all).

*Example 1: no memory is reclaimed:
name | allocated memory | used by the vm | resident memory used in the host by qemu
Vm1  | 4G               | 4G             | 4G
Vm2  | 4G               | 1G             | 1G
- MOM will inflate the balloon in vm2 (as vm1 has no free memory) and will gain no memory

One thing to keep in mind is that VMs having less host RSS than their memory allocation is a temporary condition. All VMs will eventually consume their full allocation if allowed to run. I'd be curious to know how long this process takes in general. We might be able to handle this case by refusing to inflate the balloon if: (VM free memory - planned balloon inflation) > host RSS

*Example 2: memory is reclaimed partially:
name | allocated memory | used by the vm | resident memory used in the host by qemu
Vm1  | 4G               | 4G             | 4G
Vm2  | 4G               | 1G             | 1G
Vm3  | 4G               | 1G             | 4G
- MOM will inflate the balloon in vm2 and vm3, slowly gaining memory only from vm3

The above rule extension may help here too.
This behaviour might cause us to:
* spend time reclaiming memory from many guests when we can reclaim only from a subgroup
* be under the impression that we have more potential memory to reclaim than we do
* bring inactive VMs dangerously low as they are constantly reclaimed (I've had guests crashing from kernel out of memory)

To address this I suggest that we collect guest memory stats from libvirt as well, so we have the option to use them in our calculations. This can be achieved with the command virsh dommemstat <domain>, which returns:
actual 3915372 (allocated)
rss 2141580 (resident memory used by qemu)

I would suggest adding these two fields to the VmStats that are collected by vdsm. Then, to try it out, add the fields to the GuestMemory Collector. (Note: MOM does have a collector that gathers RSS for VMs. It's called GuestQemuProc.) You can then extend the Balloon policy to add a snippet to check if the proposed balloon adjustment should be carried out. You could add the logic to the change_big_enough function.

additional topic:
* should we include per-guest config (for example a hard minimum memory cap: this vm cannot run effectively with less than 1G memory)

Yes. This is probably something we want to do. There is a whole topic around VM tagging that we should consider. In the future we will want to be able to do many different things in policy based on a VM's tag. For example, some VMs may be completely exempt from ballooning. Others may have a minimum limit. I want to avoid passing in the raw guest configuration because MOM needs to work with direct libvirt vms and with ovirt/vdsm vms. Therefore, we want to think carefully about the abstractions we use when presenting VM properties to MOM.

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
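As a rough illustration of the suggestion above, the Python sketch below parses text in the "name value" form that virsh dommemstat prints and applies the proposed guard. Both helper names are hypothetical, the values are assumed to be in KiB, and the guard reads the rule as "refuse when (free memory - planned inflation) exceeds host RSS", which is an interpretation of the message rather than MOM's actual policy code.

```python
def parse_dommemstat(output):
    """Parse 'name value' lines, as printed by virsh dommemstat, into a dict of ints."""
    stats = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            stats[fields[0]] = int(fields[1])
    return stats


def inflation_is_pointless(free_kib, planned_inflation_kib, host_rss_kib):
    """Assumed reading of the proposed rule: refuse when the guest's remaining free
    memory would still exceed what qemu actually holds resident on the host."""
    return (free_kib - planned_inflation_kib) > host_rss_kib


stats = parse_dommemstat("actual 3915372\nrss 2141580\n")
# Memory that is allocated to the guest but not resident on the host:
print(stats["actual"] - stats["rss"])

GIB = 1024 * 1024  # KiB per GiB
# Vm2 from the examples: 3G free, 1G resident; a 0.5G inflation gains nothing.
print(inflation_is_pointless(3 * GIB, GIB // 2, 1 * GIB))
# Vm3 from the examples: 3G free, 4G resident; inflating can reclaim memory.
print(inflation_is_pointless(3 * GIB, GIB // 2, 4 * GIB))
```

The 0.5G inflation step is an arbitrary value chosen for the example; in MOM a check like this would presumably sit next to change_big_enough in the Balloon policy.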
Re: [vdsm] API: Supporting internal/testing interfaces
On Thu, Oct 04, 2012 at 09:22:31AM -0400, Federico Simoncelli wrote: - Original Message - From: Saggi Mizrahi smizr...@redhat.com To: dl...@redhat.com Cc: Federico Simoncelli fsimo...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent: Thursday, October 4, 2012 1:27:27 PM Subject: Re: [vdsm] API: Supporting internal/testing interfaces - Original Message - From: Dor Laor dl...@redhat.com To: Saggi Mizrahi smizr...@redhat.com Cc: Federico Simoncelli fsimo...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent: Wednesday, October 3, 2012 10:16:26 PM Subject: Re: [vdsm] API: Supporting internal/testing interfaces On 10/03/2012 09:52 PM, Saggi Mizrahi wrote: My personal preference is using the VDSM debug hook to inject code to a running VDSM and dynamically add whatever you want. This means the code is part of the test and not VDSM. That's might be good for debugging/tracing but not for full functional tests. There are also better ways for dynamic tracing. We used to use it (before the code rotted away) to add to VDSM the startCoverage() and endCoverage() verbs for tests. Another option is having the code in an optional RPM (similar to how debug hook is loaded only if it's installed) I might also accept unpythonic things like conditional compilation Asking people nicely not to use a method that might corrupt their data-center doesn't always work with good people not to mention bad ones. Using -test devices/interfaces is a common practice. It's good to keep them live within the code base so they won't get rotten and any reasonable user is aware it's only a test api. Downstream can always compile it out before shipping. Conditional compilation kind of awkward in python, but as I said I'll agree to have that as an option. From what I understand litke's proposal is having the bindings in a different RPM but I am actually talking about the server side code not being available or at least hooked up. 
I thought that the server side was modular too and Adam's proposal was a server side additional module that registers new verbs to expose.

In any case, I personally like this being hard and tiresome to do because it makes living with bad design less tolerable.

There are some things that are harder to test and debug no matter how you implement them. To see a single extension you have to start a vm and wait for the guest to fill the lv. A better design wouldn't change the fact that if you don't expose a verb you can't use it.

In any case, I don't want new code to need to have special debug verbs; if you don't test a full VDSM you shouldn't need to have one running.

Why do you think that one thing should exclude the other? Here we're talking about providing easier ways to test more (not less).

In a perfect world, the code that does LV extend would exist in an independent class (that doesn't depend on vdsm/hsm) and can be tested with a simple, standalone unit test. Unfortunately, we do not live in a perfect world. New code should be testable in this way but we need something to test what we already have. We could always provide a debug rpm that enables yet another binding for a quick and dirty xmlrpc server. This server would stick around even after the normal BindingXMLRPC one is retired. The debug server would have no API formalization whatsoever and could be made pluggable so that test cases could be easily dropped in. This approach comes with just as many avenues of abuse as the idea I had previously suggested.

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
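A pluggable "quick and dirty" debug server of the kind described could look roughly like the Python sketch below, built on the standard library's xmlrpc.server. The make_debug_server helper, the plugin dictionary, and the debugEcho verb are hypothetical names for illustration; this is not how vdsm's BindingXMLRPC is implemented.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def make_debug_server(plugins, host="127.0.0.1", port=0):
    """Build an XML-RPC server exposing every callable in `plugins` as a verb.

    `plugins` maps verb names to callables; port 0 lets the OS pick a free port.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False, allow_none=True)
    for name, func in plugins.items():
        server.register_function(func, name)
    return server


# A test case "drops in" its own verb without touching the formal API.
server = make_debug_server({"debugEcho": lambda value: value})
threading.Thread(target=server.serve_forever, daemon=True).start()

host, port = server.server_address
proxy = ServerProxy("http://%s:%d" % (host, port))
print(proxy.debugEcho("hello"))

server.shutdown()
server.server_close()
```

Because any callable can be registered, the sketch also shows why such a server needs to stay out of production packages: it has exactly the avenues of abuse mentioned above.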
Re: [vdsm] Change in vdsm[master]: Use 'yum clean expire-cache' instead of 'yum clean all'
On Tue, Oct 02, 2012 at 07:39:32PM -0400, Ayal Baron wrote: Ayal, thanks for your thorough treatment of this subject :) I completely agree with the framework that you have laid out here. Hopefully, we can all come to an agreement on a quasi-official project-wide policy based on this and then place it up on the wiki in a gerrit workflow best practices document. First of all, in gerrit there is no immediately visible difference between '0' and no review at all so someone might have serious issues with a patch but if she did not mark it with -1 submitter might totally miss this fact. esp. if someone sent a new revision and the title of the cover comment for previous version doesn't state a -1 (so maintainer doesn't know he needs to go looking back to verify things were fixed). This is adding overhead on maintainer now to go back to each and every review and make sure that there are no comments that should have been addressed in 0. Note that if someone gave a -1, normally I'd expect that person to make sure and +1 a subsequent patch to flag to maintainer that all their problems with the patch have been addressed. My take on this is: -2 - The approach taken to solve the problem is wrong and the whole thing should either be abandoned or rewritten in a new way. I can only accept this though if the reviewer also suggests the alternative (i.e. just saying your code isn't good is bad form imo). e.g. stating things like 'circular references is bad' and giving -2 but not suggesting alternatives and explanations is bad form imo. -1 - I think there are some issues with the current patch that should be addressed *prior* to merging it (bugs in the code that would affect many people etc). 
This would also include complex code which needs explaining (if it's too complex for me to immediately understand, then it's fine to delay the merge until either a good explanation of what the code does and why it is necessary is received, or the code is simplified, or, in the worst case, a comment is added in the code). -1 should only be given with proper explanation, otherwise imo it's bad form. 0 - I have some *personal* style problems, questions which do not affect the validity of the patch *or* I think there are some changes that should be made but can definitely be done in a future patch and should not prevent merging the current version. Note that I find this very important to actually improve our current way of working. This means that if a patch improves current code but could in itself be further improved, it is valid imo to accept the current version and ask the committer to submit another patch to further improve it. Note that this would include things like (e.g.) discussions about spacing which are not enforced by pep8 tools (i.e. preventing a patch which fixes bugs from going in because of a personal interpretation of pep8 about alignment of parameters in a function signature is wrong imo). +1 - I have reviewed the code and it looks correct to me but I'm not a subject matter expert / the maintainer. Note that for things like a style-only review, '+1' should be accompanied by a cover message stating '+1 only for style', as Doron has mentioned previously on this thread. +2 - I am a subject matter expert, I have reviewed the code and it looks good to me (solves the problem properly and no serious issues left with it). As Doron mentioned, in our group (storage) the standard is to have at least 2 reviews (by different people) before committing unless the patch is *really* trivial. This means that I try to avoid giving +2 if no one else has given a +1 before. Alon, note that we apply this both to vdsm and engine.
-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
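The scoring scheme discussed in this message can be written down as a small merge check, which may help when turning it into a wiki policy. This is only a sketch of one reading of the text: the enum names and the ready_to_merge function are made up for illustration, and the requirement that a +2 be present is an assumption based on the usual gerrit convention rather than something stated in the message.

```python
from enum import IntEnum


class Score(IntEnum):
    """Review scores as described in the message (names are illustrative)."""
    REJECT_APPROACH = -2   # approach is wrong; an alternative must be suggested
    NEEDS_FIXES = -1       # issues to address *prior* to merging
    NO_BLOCKER = 0         # personal style notes or follow-up suggestions
    LOOKS_CORRECT = 1      # reviewed, but not a subject matter expert
    EXPERT_APPROVAL = 2    # subject matter expert / maintainer approval


def ready_to_merge(reviews, trivial=False):
    """Decide whether a patch set can go in.

    reviews maps reviewer name -> that reviewer's latest score on the
    current patch set, so two positive entries mean two different people.
    """
    scores = list(reviews.values())
    if any(score < 0 for score in scores):
        return False                      # an open -1 or -2 blocks the merge
    if Score.EXPERT_APPROVAL not in scores:
        return False                      # assumption: a +2 is always required
    positive = sum(1 for score in scores if score > 0)
    # "At least 2 reviews (by different people) ... unless really trivial."
    return trivial or positive >= 2
```

For example, ready_to_merge({"ayal": Score.EXPERT_APPROVAL, "doron": Score.LOOKS_CORRECT}) is True, while a lone +2 only passes when trivial=True is given. A score of 0 never blocks, which matches the point that follow-up suggestions should not hold up a merge.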
Re: [vdsm] Review Request: libvdsm schema updates
On Fri, Sep 28, 2012 at 08:29:03PM -0400, Keith Robertson wrote: What, no XSD(s)? :) XSD is really only appropriate for XML documents and this API does not use XML. - Original Message - From: Adam Litke a...@us.ibm.com To: vdsm-devel@lists.fedorahosted.org Sent: Friday, September 28, 2012 5:40:48 PM Subject: [vdsm] Review Request: libvdsm schema updates Hi vdsm developers! This is a plea for review of my libvdsm schema updates. The patches I am asking for review on are found here: http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:schema,n,z These patches simply change some elements of the API schema that are needed by the actual libvdsm code. I want to make some progress on this if possible so thanks in advance for taking a look. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
[vdsm] Review Request: libvdsm schema updates
Hi vdsm developers! This is a plea for review of my libvdsm schema updates. The patches I am asking for review on are found here: http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:schema,n,z These patches simply change some elements of the API schema that are needed by the actual libvdsm code. I want to make some progress on this if possible so thanks in advance for taking a look. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] [Engine-devel] is gerrit.ovirt.org down? eom
So the fix is to just regularly restart gerrit? Do we have any idea about the real, underlying problem? On Wed, Sep 12, 2012 at 11:56:44AM -0400, Eyal Edri wrote: - Original Message - From: Itamar Heim ih...@redhat.com To: Asaf Shakarchi ashak...@redhat.com Cc: Alon Bar-Lev alo...@redhat.com, Shireesh Anjal san...@redhat.com, engine-de...@ovirt.org, VDSM Project Development vdsm-devel@lists.fedorahosted.org, Shu Ming shum...@linux.vnet.ibm.com, Eyal Edri ee...@redhat.com Sent: Wednesday, September 12, 2012 6:34:56 PM Subject: Re: [Engine-devel] is gerrit.ovirt.org down? eom On 09/12/2012 06:23 PM, Asaf Shakarchi wrote: It happens from time to time, restart is required, Itamar only. restarted. eyal - can we make progress on the jenkins job with permission to more people to restart gerrit? the job is ready http://jenkins.ovirt.org/view/system-monitoring/job/restart_gerrit_service but i need to have jenkins user access to gerrit server + sudo access to run 'service' restart... it has access to www.ovirt.org but not to gerrit.ovirt.org. others - please email infra on gerrit issues (well, me personally always help as well) - Original Message - Yes, I am experiencing this too... Itamar? - Original Message - From: Shu Ming shum...@linux.vnet.ibm.com To: Alon Bar-Lev alo...@redhat.com Cc: Shireesh Anjal san...@redhat.com, engine-de...@ovirt.org, VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Wednesday, September 12, 2012 5:50:14 PM Subject: Re: [Engine-devel] is gerrit.ovirt.org down? eom It seems gerrit has downed for several times recently. Is there any special reason? 于 2012-9-12 22:45, Alon Bar-Lev: yes. - Original Message - From: Shireesh Anjal san...@redhat.com To: engine-de...@ovirt.org Sent: Wednesday, September 12, 2012 5:43:35 PM Subject: [Engine-devel] is gerrit.ovirt.org down? 
eom ___ Engine-devel mailing list engine-de...@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel -- 舒明 Shu Ming Open Virtualization Engineering; CSTL, IBM Corp. Tel: 86-10-82451626 Tieline: 9051626 E-mail: shum...@cn.ibm.com or shum...@linux.vnet.ibm.com Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District, Beijing 100193, PRC -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] Patch review process
On Sun, Sep 09, 2012 at 02:33:00PM -0400, Alon Bar-Lev wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: vdsm-devel@lists.fedorahosted.org Cc: Ryan Harper ry...@linux.vnet.ibm.com, Anthony Liguori aligu...@linux.vnet.ibm.com Sent: Sunday, September 9, 2012 8:27:30 PM Subject: [vdsm] Patch review process While discussing gerrit recently, I learned that some people use gerrit simply to host work-in-progress patches and they don't intend for those to be reviewed. How can a reviewer recognize this and skip those patches when choosing what to review? Is there a way to mark certain patches as more important and others as drafts? Yes. See [1].

$ git push upstream HEAD:refs/drafts/master/description

Thanks for pointing it out. It would be nice if we could get people to start pushing WIP patches to drafts now that we have this feature.

[1] http://gerrit-documentation.googlecode.com/svn/ReleaseNotes/ReleaseNotes-2.3.html

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
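For anyone who wants to wrap the draft push in a helper script, the refspec from the command above can be assembled as in the sketch below. The function names are invented for this example. The trailing path component, which the command above spells as "description", is treated here as an optional label; that reading is an assumption, so check the gerrit documentation for your server version before relying on it.

```python
def draft_refspec(branch, label=None):
    """Build the refspec that pushes HEAD as a gerrit draft on `branch`.

    `label` is the optional trailing component shown as "description" in
    the command quoted in this thread (its exact meaning is an assumption).
    """
    if not branch or branch.startswith("/") or " " in branch:
        raise ValueError("invalid branch name: %r" % (branch,))
    ref = "refs/drafts/" + branch
    if label:
        ref += "/" + label
    return "HEAD:" + ref


def draft_push_command(remote, branch, label=None):
    """Return the git argument list; the caller decides how to run it."""
    return ["git", "push", remote, draft_refspec(branch, label)]
```

Calling draft_push_command("upstream", "master", "description") yields the same arguments as the command in the message. The list form is meant to be passed to something like subprocess.check_call so that no shell quoting is involved.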
[vdsm] Jenkins build failure for change that adds build dependencies
Hi,

My change, http://gerrit.ovirt.org/#/c/7516/ adds the following build dependencies. Since they are not installed on the system running patch verification tests I am getting build failures. Can we get these packages installed on the testing host(s) please?

+BuildRequires: gobject-introspection-devel
+BuildRequires: glib2-devel
+BuildRequires: json-glib-devel
+BuildRequires: vala
+BuildRequires: libgee-devel

-- Adam Litke a...@us.ibm.com IBM Linux Technology Center
Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm
On Thu, Jul 26, 2012 at 11:47:51AM +0300, Itamar Heim wrote: On 07/17/2012 01:19 AM, Itamar Heim wrote: On 07/09/2012 09:52 PM, Saggi Mizrahi wrote: - Original Message - From: Itamar Heim ih...@redhat.com To: Saggi Mizrahi smizr...@redhat.com Cc: Adam Litke a...@us.ibm.com, vdsm-devel@lists.fedorahosted.org Sent: Monday, July 9, 2012 11:03:43 AM Subject: Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm On 07/09/2012 05:56 PM, Saggi Mizrahi wrote: I don't think AMQP is a good low level supported protocol as it's a very complex protocol to set up and support. Also brokers are known to have their differences in standard implementation which means supporting them all is a mess. It looks like the most accepted route is the libvirt route of having a c library abstracting away client server communication and having more advanced consumers build protocol specific bridges that may have different support standards. On a more personal note, I think brokerless messaging is the way to go in ovirt because, unlike traditional clustering, worker nodes are not interchangeable so direct communication is the way to go, rendering brokers pretty much useless. but brokerless doesn't let multiple consumers which a bus provides? All consumers can connect to the host and *some* events can be broadcasted to all connected clients. The real question is weather you want to depend on AMQP's routing \ message storing Also, if you find it preferable to have a centralized host (single point of failure) to get all events from all hosts for the price of some clients (I assume read only clients) not needing to know the locations of all worker nodes. But IMHO we already have something like that, it's called the ovirt-engine, and it could send aggregated events about the cluster (maybe with some extra enginy data). The question is what does mandating a broker gives us something that an AMQP bridge wouldn't. 
The only thing I can think of is vdsm can assume unmoderated vdsm to vdsm communication bypassing the engine. This means that VDSM can have some clustered behavior that requires no engine intervention. Further more, the engine can send a request and let the nodes decide who is performing the operation among themselves. Essentially: [ engine ] [ engine ] | | VS | [vdsm][vdsm] [ broker ] | | [vdsm][vdsm] *All links are two way links This has dire consequences on API usability and supportability. So we need to converge on that. There needs to be a good reason why the aforementioned logic code can't sit on a another ovirt specific entity (lets call it ovirt-dynamo) that uses VDSM's supported API but it's own APIs (or more likely messaging algorithms) are unsupported. [engine ] ||| | [ broker ] | ||| | [vdsm]-[dynamo] : [dynamo]-[vdsm] Host A : Host B *All links are two way links 1. we have engine today 'in the path' to the history db. but it makes no sense for engine to be aware of each statistic we want to keep in the history db. same would be for an event/stats correlation service. they don't need to depend on each other for availability/redundancy. 2. we are already looking at quantum integration, which is doing engine to nodes communication via amqp. 3. with somewhat of a forward looking - moving some scheduling logic down to vdsm will probably mean we'll want one of the nodes to listen to statistics and state from the other nodes. to all of these, setting up a bus which allows multiple peer listeners seems more robust I'm still against developing a C level binding for amqp and rest support over a codebase which is in python. rest and amqp allow for both local and remote bindings in any language. C bindings should/could be a parallel implementation, but they seem like an unneeded overhead and complexity in the middle of the codebase. Sure, it's probably possible to bind a REST or AMQP API in other languages but I don't think there is an automatic way of doing it. 
That means having to keep up with maintenance of each and every binding every time the API changes. If we look at libvirt, they will say this is a large source of pain that they have recommended we avoid. For the C/gobject approach, we write a single API schema file. From that, we automatically generate the C API and bindings. Sure, the generation could be a bit complex but much of it will be someone else's codebase (and one that is used by lots of Gnome projects). -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
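To illustrate the "single schema file, generated bindings" point, here is a toy generator. It is deliberately not the C/GObject generator proposed in this thread: it emits Python stubs because that keeps the example short, and the schema entries, command names and the transport callable are all hypothetical. The point is only that adding a command to the schema is the single manual step.

```python
# Hypothetical schema entries; the real libvdsm schema is not shown here.
SCHEMA = [
    {"command": "VM.getStats", "data": {"vmID": "str"}, "returns": "dict"},
    {"command": "Host.ping", "data": {}, "returns": "bool"},
]


def generate_python_binding(schema):
    """Emit Python source with one stub function per schema command.

    Each stub forwards to `transport(command_name, argument_list)`, so the
    wire protocol stays hidden behind a single callable.
    """
    lines = ["# Generated file: do not edit by hand.", ""]
    for entry in schema:
        func = entry["command"].replace(".", "_")
        params = list(entry["data"])
        signature = ", ".join(["transport"] + params)
        call_args = ", ".join(params)
        lines.append("def %s(%s):" % (func, signature))
        lines.append('    """%s(%s) -> %s"""'
                     % (entry["command"], call_args, entry["returns"]))
        lines.append("    return transport(%r, [%s])"
                     % (entry["command"], call_args))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the generated module so it can be inspected or written to disk.
    print(generate_python_binding(SCHEMA))
```

Every binding regenerated from the same schema picks up an API change at once, which is the maintenance argument made above. A real generator would also have to handle complex types, events and error reporting, none of which is attempted here.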
Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm
On Thu, Jul 12, 2012 at 08:11:17AM +0800, Shu Ming wrote: Basically, my understanding is that we can generate two versions of libvdsm from the schema file for both the node and the management application. First, the transportation protocols(XMLRPC, REST-API) will depend on libvdsm(node version) to export the APIs to remote management application. Secondly, the management application can use libvdsm(application version ) to emit the remote call to the node. Also, transportation protocols like REST API and XML RPC API can also be generated automatically by the schema file with C, Java, Python bindings. I think this might be a bit too complex of a model. Here's how I see it... The schema generates C/gObject code which can be compiled into libvdsm. We can use the gObject introspection library to automatically generate language bindings for Java, Python, Perl, etc. The libvdsm library talks to vdsmd using a wire protocol that works locally and remotely. This wire protocol is completely hidden from library users. It's an implementation detail that can be changed later if necessary. Today I would recommend that we use xmlrpc. This means that ovirt-engine or another remote program could use libvdsm in the exact same manner as a local program. The library user just needs to call libvdsm.connect(uri). Finally, REST and AMQP bridges would be written solely against libvdsm. These bridges are probably not suitable for code generation (but we can revisit that as a separate issue because it's up to the bridge writer to determine the best approach). On 2012-7-12 2:29, Saggi Mizrahi wrote: I'm sorry, but I don't really understand the drawing - Original Message - From: Shu Ming shum...@linux.vnet.ibm.com To: Adam Litke a...@us.ibm.com Cc: vdsm-devel@lists.fedorahosted.org Sent: Wednesday, July 11, 2012 10:24:49 AM Subject: Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm Adam, Maybe, I don't fully understand your proposal. 
Here is my understanding of libvdsm in the picture. Please check the following link for the picture. http://www.ovirt.org/wiki/File:Libvdsm.JPG http://www.ovirt.org/wiki/File:Libvdsm.JPG On 2012-7-9 21:56, Adam Litke wrote: On Fri, Jul 06, 2012 at 03:53:08PM +0300, Itamar Heim wrote: On 07/06/2012 01:15 AM, Robert Middleswarth wrote: On 07/05/2012 04:45 PM, Adam Litke wrote: On Thu, Jul 05, 2012 at 03:47:42PM -0400, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Thursday, July 5, 2012 2:34:50 PM Subject: Re: [RFC] An alternative way to provide a supported interface -- libvdsm On Wed, Jun 27, 2012 at 02:50:02PM -0400, Saggi Mizrahi wrote: The idea of having a supported C API was something I was thinking about doing (But I'd rather use gobject introspection and not schema generation) But the problem is not having a C API is using the current XML RPC API as it's base I want to disect this a bit to find out exactly where there might be agreement and disagreement. C API is a good thing to implement - Agreed. I also want to use gobject introspection but I don't agree that using glib precludes the use of a formalized schema. My proposal is that we write a schema definition and generate the glib C code from that schema. I agree that the _current_ xmlrpc API makes a pretty bad base from which to start a supportable API. XMLRPC is a perfectly reasonable remote/wire protocol and I think we should continue using it as a base for the next generation API. Using a schema will ensure that the new API is well-structured. There major problems with XML-RPC (and to some extent with REST as well) are high call overhead and no two way communication (push events). Basing on XML-RPC means that we will never be able to solve these issues. I am not sure I am ready to conceed that XML-RPC is too slow for our needs. 
Can you provide some more detail around this point and possibly suggest an alternative that has even lower overhead without sacrificing the ubiquity and usability of XML-RPC? As far as the two-way communication point, what are the options besides AMQP/ZeroMQ? Aren't these even worse from an overhead perspective than XML-RPC? Regarding two-way communication: you can write AMQP brokers based on the C API and run one on each vdsm host. Assuming the C API supports events, what else would you need? I personally think that using something like AMQP for inter-node communication and engine - node would be optimal. With a rest interface that just send messages though something like AMQP. I would also not dismiss AMQP so soon we want a bug with more than a single listener at engine side (engine, history db, maybe event correlation service). collectd as a means
Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm
On Fri, Jul 06, 2012 at 03:53:08PM +0300, Itamar Heim wrote: On 07/06/2012 01:15 AM, Robert Middleswarth wrote: On 07/05/2012 04:45 PM, Adam Litke wrote: On Thu, Jul 05, 2012 at 03:47:42PM -0400, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Thursday, July 5, 2012 2:34:50 PM Subject: Re: [RFC] An alternative way to provide a supported interface -- libvdsm On Wed, Jun 27, 2012 at 02:50:02PM -0400, Saggi Mizrahi wrote: The idea of having a supported C API was something I was thinking about doing (But I'd rather use gobject introspection and not schema generation) But the problem is not having a C API is using the current XML RPC API as it's base I want to disect this a bit to find out exactly where there might be agreement and disagreement. C API is a good thing to implement - Agreed. I also want to use gobject introspection but I don't agree that using glib precludes the use of a formalized schema. My proposal is that we write a schema definition and generate the glib C code from that schema. I agree that the _current_ xmlrpc API makes a pretty bad base from which to start a supportable API. XMLRPC is a perfectly reasonable remote/wire protocol and I think we should continue using it as a base for the next generation API. Using a schema will ensure that the new API is well-structured. There major problems with XML-RPC (and to some extent with REST as well) are high call overhead and no two way communication (push events). Basing on XML-RPC means that we will never be able to solve these issues. I am not sure I am ready to conceed that XML-RPC is too slow for our needs. Can you provide some more detail around this point and possibly suggest an alternative that has even lower overhead without sacrificing the ubiquity and usability of XML-RPC? 
As far as the two-way communication point, what are the options besides AMQP/ZeroMQ? Aren't these even worse from an overhead perspective than XML-RPC? Regarding two-way communication: you can write AMQP brokers based on the C API and run one on each vdsm host. Assuming the C API supports events, what else would you need? I personally think that using something like AMQP for inter-node communication and engine - node would be optimal. With a rest interface that just send messages though something like AMQP. I would also not dismiss AMQP so soon we want a bug with more than a single listener at engine side (engine, history db, maybe event correlation service). collectd as a means for statistics already supports it as well. I'm for having REST as well, but not sure as main one for a consumer like ovirt engine. I agree that a message bus could be a very useful model of communication between ovirt-engine components and multiple vdsm instances. But the complexities and dependencies of AMQP do not make it suitable for use as a low-level API. AMQP will repel new adopters. Why not establish a libvdsm that is more minimalist and can be easily used by everyone? Then AMQP brokers can be built on top of the stable API with ease. All AMQP should require of the low-level API are standard function calls and an events mechanism. Thanks Robert The current XML-RPC API contains a lot of decencies and inefficiencies and we would like to retire it as soon as we possibly can. Engine would like us to move to a message based API and 3rd parties want something simple like REST so it looks like no one actually wants to use XML-RPC. Not even us. I am proposing that AMQP brokers and REST APIs could be written against the public API. In fact, they need not even live in the vdsm tree anymore if that is what we choose. Core vdsm would only be responsible for providing libvdsm and whatever language bindings we want to support. 
If we take the libvdsm route, the only reason to even have a REST bridge is only to support OSes other then Linux which is something I'm not sure we care about at the moment. That might be true regarding the current in-tree implementation. However, I can almost guarantee that someone wanting to write a web GUI on top of standalone vdsm would want a REST API to talk to. But libvdsm makes this use case of no concern to the core vdsm developers. I do think that having C supportability in our API is a good idea, but the current API should not be used as the base. Let's _start_ with a schema document that describes today's API and then clean it up. I think that will work better than starting from scratch. Once my schema is written I will post it and we can 'patch' it as a community until we arrive at a 1.0 version we are all happy with. +1 Ok. Redoubling my efforts to get this done. Describing the output of list(True) takes awhile
Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
On Tue, Jun 26, 2012 at 11:11:51PM +0800, Shu Ming wrote: On 2012-6-26 20:45, Adam Litke wrote: On Tue, Jun 26, 2012 at 09:53:10AM +0800, Xu He Jie wrote: On 06/26/2012 05:19 AM, Adam Litke wrote: On Mon, Jun 25, 2012 at 05:53:31PM +0300, Dan Kenigsberg wrote: On Mon, Jun 25, 2012 at 08:28:29AM -0500, Adam Litke wrote: On Fri, Jun 22, 2012 at 06:45:43PM -0400, Andrew Cathrow wrote: - Original Message - From: Ryan Harper ry...@us.ibm.com To: Adam Litke a...@us.ibm.com Cc: Anthony Liguori aligu...@redhat.com, VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Friday, June 22, 2012 12:45:42 PM Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager * Adam Litke a...@us.ibm.com [2012-06-22 11:35]: On Thu, Jun 21, 2012 at 12:17:19PM +0300, Dor Laor wrote: On 06/19/2012 08:12 PM, Saggi Mizrahi wrote: - Original Message - From: Deepak C Shetty deepa...@linux.vnet.ibm.com To: Ryan Harper ry...@us.ibm.com Cc: Saggi Mizrahi smizr...@redhat.com, Anthony Liguori aligu...@redhat.com, VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent: Tuesday, June 19, 2012 10:58:47 AM Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager On 06/19/2012 01:13 AM, Ryan Harper wrote: * Saggi Mizrahismizr...@redhat.com [2012-06-18 10:05]: I would like to put on to the table for descussion the growing need for a way to more easily reuse of the functionality of VDSM in order to service projects other than Ovirt-Engine. Originally VDSM was created as a proprietary agent for the sole purpose of serving the then proprietary version of what is known as ovirt-engine. Red Hat, after acquiring the technology, pressed on with it's commitment to open source ideals and released the code. But just releasing code into the wild doesn't build a community or makes a project successful. Further more when building open source software you should aspire to build reusable components instead of monolithic stacks. Saggi, Thanks for sending this out. 
I've been trying to pull together some thoughts on what else is needed for vdsm as a community. I know that for some time downstream has been the driving force for all of the work and now with a community there are challenges in finding our own way. While we certainly don't want to make downstream efforts harder, I think we need to develop and support our own vision for what vdsm can be come, some what independent of downstream and other exploiters. Revisiting the API is definitely a much needed endeavor and I think adding some use-cases or sample applications would be useful in demonstrating whether or not we're evolving the API into something easier to use for applications beyond engine. We would like to expose a stable, documented, well supported API. This gives us a chance to rethink the VDSM API from the ground up. There is already work in progress of making the internal logic of VDSM separate enough from the API layer so we could continue feature development and bug fixing while designing the API of the future. In order to achieve this though we need to do several things: 1. Declare API supportability guidelines 2. Decide on an API transport (e.g. REST, ZMQ, AMQP) 3. Make the API easily consumable (e.g. proper docs, example code, extending the API, etc) 4. Implement the API itself In the earlier we'd discussed working to have similarities in the modeling between the oVirt API and VDSM but that seems to have dropped off the radar. Yes, the current REST API has attempted to be compatible with the current ovirt-engine API. Going forward, I am not sure how easy this will be to maintain given than engine is built on Java and vdsm is built on Python. Could you elaborate why the language difference is an issue? Isn't this what APIs are supposed to solve? The main language issue is that ovirt-engine has built their API using a set of Java-specific frameworks (JAXB and its dependents). 
It's true, if you google for 'python jaxb' you will find some sourceforge projects that attempt to bring the jaxb interface to python but I don't think that's the right approach. If you're writing a java project, do things the java way. If you're writing a python project, do them the python way. Right now I am focused on defining the current API (API.py/xmlrpc) mechanically (creating a schema and API documentation). XSD is not the correct language for that task (which is why I forsee a divergence at least at first). I want to take a stab at defining the API in a beneficial, long-term manner. Adam, Can you explain why you think XSD is not the correct language? Is it because of the lacking of full python language code generator? Is it possible to modify the existing code generator to address that issue? What is the benefit to introduce a new schema
Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
On Tue, Jun 26, 2012 at 04:47:35PM +0100, Daniel P. Berrange wrote: On Tue, Jun 26, 2012 at 05:37:26PM +0300, Dan Kenigsberg wrote: On Mon, Jun 25, 2012 at 04:19:28PM -0500, Adam Litke wrote: 1) Completely define the current XMLRPC API including all functions, parameters, and return values. Complex data structures can be broken down into their basic types. These are: int, str, bool, list, dict, typed-dict, enum. I have already started this process and am using Qemu's QAPI schema language. You can see that here [1]. For an example of what that looks like describing the vdsm API see this snippet [2]. 2) Import the parser/generator code from qemu for the above schema. Vdsm will require a few extensions such as typed-dictionaries, tuples, and type aliases. Adapt the generator so that it can produce a libvdsm which provides API language bindings for python, c, and java. 3) Implement a vdsm shell in terms of libvdsm. In fact, this can be largely auto-generated from the schema and accompanying documentation. This can serve to model how new transports can be written. For example, an AMQP implementation can be implemented entirely outside of the vdsm project if we wished. It only needs to talk to vdsm via libvdsm. Easy as 1,2,3 :) [1] http://git.qemu.org/?p=qemu.git;a=blob;f=qapi-schema.json;h=3b6e3468b440b4b681f321c9525a3d83bea2137a;hb=HEAD [2] http://fpaste.org/rt96/ Probably more than you bargained for when asking for more info... :) Indeed! I am still at a loss why the languages take such a prominent place in your choice for an API. A good API is easily consumable by any language. I think you are both right here. A good API is easily consumed from any language, but this doesn't mean there is zero cost to starting to consume it from a client. You either way to be able to auto-generate code for the client side APIs in all your languages of choice, or even better, you want the client side APIs to be just do runtime dynamic dispatch based on published schema. 
Thanks for commenting! On one hand, dynamic dispatch seems attractive but I think it dramatically increases complexity on both the client and server sides. Does anyone know of a prominent open source project that has been successful with dynamic dispatch? I am inclined to go with the C library approach because it is tried and tested and it fits the model of other virtualization libraries that I am familiar with. If you go down the route of writing a C based libvdsm for VDSM, then my recommendation would be to use the GObject APIs. You can then take full advantage of the GObject Introspection capabilities to have full dynamic dispatch in languages like python, perl, javascript, or full auto-generation of code in Vala, C#, etc. Ahh, thanks for reminding me of this. GObject definitely seems like the way to go. I assume there are no real implications for the schema definition and that the heavy-lifting for GObject support would be limited to the C code generator. Time to take a closer look at the GObject stuff. I certainly wouldn't waste time writing your own code-generator for all the various languages, since that's just reinventing the wheel that GObject Introspection already provides for the most part. Agreed. I would love to avoid this! -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
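As a rough illustration of step 1 in the plan quoted above (describing every value in terms of int, str, bool, list, dict, typed-dict and enum), the sketch below checks a Python value against a small set of named types. The notation is only loosely modelled on QAPI-style entries ('*name' for an optional field, ['T'] for a list of T); the type names VmStatus and VmInfo and their fields are invented for the example and do not come from the vdsm schema.

```python
BASIC_TYPES = {"int": int, "str": str, "bool": bool, "list": list, "dict": dict}

# Hypothetical named types used only for this example.
TYPES = {
    "VmStatus": {"enum": ["Up", "Down", "Paused"]},
    "VmInfo": {"type": {"vmId": "str", "status": "VmStatus", "*memSize": "int"}},
}


def check(value, type_name, types=TYPES):
    """Return True if `value` conforms to the named schema type."""
    if isinstance(type_name, list):            # ['T'] means "list of T"
        return (isinstance(value, list)
                and all(check(item, type_name[0], types) for item in value))
    if type_name in BASIC_TYPES:
        if type_name == "int" and isinstance(value, bool):
            return False                       # bool is a subclass of int
        return isinstance(value, BASIC_TYPES[type_name])
    spec = types[type_name]
    if "enum" in spec:
        return value in spec["enum"]
    fields = spec["type"]                      # typed dictionary
    if not isinstance(value, dict):
        return False
    allowed = {name.lstrip("*") for name in fields}
    if set(value) - allowed:
        return False                           # unknown keys are rejected
    for name, field_type in fields.items():
        key = name.lstrip("*")
        if key not in value:
            if name.startswith("*"):
                continue                       # optional field may be absent
            return False
        if not check(value[key], field_type, types):
            return False
    return True
```

With these definitions, check({"vmId": "a", "status": "Up"}, "VmInfo") passes and the same dict with status "Gone" fails. A checker like this could run in tests against real return values while the schema is being written, which is one way to catch places where the documented types and the actual API output disagree.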
Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
On Fri, Jun 22, 2012 at 06:45:43PM -0400, Andrew Cathrow wrote:
- Original Message -
From: Ryan Harper ry...@us.ibm.com
To: Adam Litke a...@us.ibm.com
Cc: Anthony Liguori aligu...@redhat.com, VDSM Project Development vdsm-devel@lists.fedorahosted.org
Sent: Friday, June 22, 2012 12:45:42 PM
Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

* Adam Litke a...@us.ibm.com [2012-06-22 11:35]:
On Thu, Jun 21, 2012 at 12:17:19PM +0300, Dor Laor wrote:
On 06/19/2012 08:12 PM, Saggi Mizrahi wrote:
- Original Message -
From: Deepak C Shetty deepa...@linux.vnet.ibm.com
To: Ryan Harper ry...@us.ibm.com
Cc: Saggi Mizrahi smizr...@redhat.com, Anthony Liguori aligu...@redhat.com, VDSM Project Development vdsm-devel@lists.fedorahosted.org
Sent: Tuesday, June 19, 2012 10:58:47 AM
Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

On 06/19/2012 01:13 AM, Ryan Harper wrote:
* Saggi Mizrahi smizr...@redhat.com [2012-06-18 10:05]:

I would like to put on the table for discussion the growing need for a way to more easily reuse the functionality of VDSM in order to service projects other than Ovirt-Engine. Originally VDSM was created as a proprietary agent for the sole purpose of serving the then proprietary version of what is known as ovirt-engine. Red Hat, after acquiring the technology, pressed on with its commitment to open source ideals and released the code. But just releasing code into the wild doesn't build a community or make a project successful. Furthermore, when building open source software you should aspire to build reusable components instead of monolithic stacks.

Saggi, Thanks for sending this out. I've been trying to pull together some thoughts on what else is needed for vdsm as a community. I know that for some time downstream has been the driving force for all of the work and now with a community there are challenges in finding our own way. While we certainly don't want to make downstream efforts harder, I think we need to develop and support our own vision for what vdsm can become, somewhat independent of downstream and other exploiters. Revisiting the API is definitely a much needed endeavor and I think adding some use-cases or sample applications would be useful in demonstrating whether or not we're evolving the API into something easier to use for applications beyond engine.

We would like to expose a stable, documented, well supported API. This gives us a chance to rethink the VDSM API from the ground up. There is already work in progress on making the internal logic of VDSM separate enough from the API layer so we could continue feature development and bug fixing while designing the API of the future. In order to achieve this though we need to do several things:

1. Declare API supportability guidelines
2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
3. Make the API easily consumable (e.g. proper docs, example code, extending the API, etc)
4. Implement the API itself

Earlier we'd discussed working to have similarities in the modeling between the oVirt API and VDSM but that seems to have dropped off the radar.

Yes, the current REST API has attempted to be compatible with the current ovirt-engine API. Going forward, I am not sure how easy this will be to maintain given that engine is built on Java and vdsm is built on Python.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
On Mon, Jun 25, 2012 at 05:53:31PM +0300, Dan Kenigsberg wrote:
On Mon, Jun 25, 2012 at 08:28:29AM -0500, Adam Litke wrote:
On Fri, Jun 22, 2012 at 06:45:43PM -0400, Andrew Cathrow wrote:

Earlier we'd discussed working to have similarities in the modeling between the oVirt API and VDSM but that seems to have dropped off the radar.

Yes, the current REST API has attempted to be compatible with the current ovirt-engine API. Going forward, I am not sure how easy this will be to maintain given that engine is built on Java and vdsm is built on Python.

Could you elaborate why the language difference is an issue? Isn't this what APIs are supposed to solve?

The main language issue is that ovirt-engine has built their API using a set of Java-specific frameworks (JAXB and its dependents). It's true, if you google for 'python jaxb' you will find some sourceforge projects that attempt to bring the jaxb interface to python but I don't think that's the right approach. If you're writing a java project, do things the java way. If you're writing a python project, do them the python way.

Right now I am focused on defining the current API (API.py/xmlrpc) mechanically (creating a schema and API documentation). XSD is not the correct language for that task (which is why I foresee a divergence at least at first). I want to take a stab at defining the API
Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
code for the API
- We should be able to auto-generate some API bindings.

[1]: http://git.qemu.org/?p=qemu.git;a=blob_plain;f=qapi-schema.json;h=3b6e3468b440b4b681f321c9525a3d83bea2137a;hb=HEAD

Regards, Dor

[1] http://ovirt.org/wiki/VDSM_Stable_API_Plan

nice to query what plugins/capabilities are supported and accordingly the client can take a decision and/or call the appropriate APIs w/o worrying about ENOTSUPP kind of error. It does become blurry when we talk about Repository Engines... that was also targeted to provide pluggability in managing Images.. how will that co-exist with API level pluggability? IIUC, StorageProvisioning (via libstoragemgmt) can be one such optional support that can fit as a plug-in nicely, right?

You will have an introspective verb to get supported storage engines. Without the engine the hosts will not be able to log in to an image repo but it will not be an API level error. You will get UnsupportedRepoFormatError or something similar no matter which version of VDSM you use. The error is part of the interface and engines will expose their format and parameter in some way.

- kvm tool integration into the API
  - there are lots of different kvm virt tools for various tasks and they are all stand-alone tools. Can we integrate their use into the node level API? Think libguestfs, virt-install, p2v/v2v tooling. All of these are available, but there isn't an easy way to use these tools through an API.
- host management operations
  - vdsm already does some host level configuration (see networking e.g.); it would be good to think about extending the API to cover other areas of configuration and updates
    - hardware enumeration
    - driver level information
    - storage configuration (we've got a bit of a discussion going around libstoragemgmt here)
- performance monitoring/debugging
  - is the host collecting enough information to do debug/perf analysis
  - can we support specific configurations of a host that optimize for specific workloads
  - and can we do this in the API such that third-parties can supply and maintain specific workload configurations

All of these are dependent on one another and the permutations are endless. This is why I think we should try and work on each one separately. All discussions will be done openly on the mailing list and until the final version comes out nothing is set in stone. If you think you have anything to contribute to this process, please do so either by commenting on the discussions or by sending code/docs/whatever patches. Once the API solidifies it will be quite difficult to change fundamental things, so speak now or forever hold your peace. Note that this is just an introductory email. There will be a quick follow up email to kick start the discussions.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
Re: [vdsm] [virt-node] RFC: API Supportability
On Thu, Jun 21, 2012 at 01:20:40PM +0300, Dan Kenigsberg wrote:
On Wed, Jun 20, 2012 at 10:42:16AM -0500, Adam Litke wrote:
On Tue, Jun 19, 2012 at 10:17:28AM -0400, Saggi Mizrahi wrote:

I've opened a wiki page [1] for the stable API and extracted some of the TODO points so we don't forget. Everyone can feel free to add more stuff.

[1] http://ovirt.org/wiki/VDSM_Stable_API_Plan

Rest of the comments inline

- Original Message -
From: Adam Litke a...@us.ibm.com
To: Saggi Mizrahi smizr...@redhat.com
Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, Barak Azulay bazu...@redhat.com, Itamar Heim ih...@redhat.com, Ayal Baron aba...@redhat.com, Anthony Liguori aligu...@redhat.com
Sent: Monday, June 18, 2012 12:23:10 PM
Subject: Re: [virt-node] RFC: API Supportability

On Mon, Jun 18, 2012 at 11:02:25AM -0400, Saggi Mizrahi wrote:

The first thing we need to decide is API supportability. I'll list the questions that need to be answered. The decision made here will have great effect on transport selection (especially API change process and versioning) so try and think about this without going to specific technicalities (eg. X can't be done on REST).

Thanks for sending this out. I will take a crack at these questions... I would like to pose an additional question to be answered:

- Should API parameter and return value constraints be formally defined? If so, how?

Think of this as defining an API schema. For example: When creating a VM, which parameters are required/optional? What are the valid formats for specifying a VM disk? What are all of the possible task states?

Has to be part of response to the call that retrieves the state. This will allow us to change the states in a BC manner.

I am not sure I agree. I think it should be a part of the schema but not transmitted along with each API response involving a task. This would increase traffic and make responses unnecessarily verbose.

Is there a maximum length for the storage domain description?

I totally agree, how depends on the transport of choice but in any case I think the definition should be done in a declarative manner (XML\JSON) using concrete types (important for binding with C\Java) and have some *code to enforce* that the input is correct. This will prevent clients from not adhering to the schema by exploiting python's relatively lax approach to types. We already had issues with the engine wrongly sending numbers as strings and having this break internally because some change in the python code made it not handle the conversion very well.

Our schema should fully define a set of simple types and complex types. Each defined simple type will have an internal validation function to verify conformity of a given input. Complex types consist of nested lists and dicts of simple types. They are validated first by validating members as simple types and then checking for missing and/or extra data.

When designing a dependable API, we should not desert our agility. ovirt-Engine has enjoyed the possibility of saying "hey, we want another field reported in getVdsStats" and presto, here it was. Complex types should be easily extendible (with a proper update of the API minor version, or a capabilities set).

+1

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
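The validation scheme described above (simple types with a validation function each, complex types built from nested lists and dicts, members checked first and then missing or extra data) could be sketched in Python roughly as follows. This is a minimal illustration, not vdsm code: the spec notation, the SIMPLE_TYPES table, the validate function, and the VM_SPEC fields are all assumptions made for the example.

```python
def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


# Each simple type has its own validation function (hypothetical set).
SIMPLE_TYPES = {
    'int': _is_int,
    'str': lambda v: isinstance(v, str),
    'bool': lambda v: isinstance(v, bool),
}


def validate(value, spec):
    """Return a list of error strings; an empty list means the value conforms.

    spec is a simple type name, a one-element list [element_spec],
    or a dict mapping field names to specs.
    """
    if isinstance(spec, str):
        if SIMPLE_TYPES[spec](value):
            return []
        return ['expected %s, got %r' % (spec, value)]
    if isinstance(spec, list):
        if not isinstance(value, list):
            return ['expected list, got %r' % (value,)]
        errors = []
        for index, item in enumerate(value):
            errors += ['[%d]: %s' % (index, e) for e in validate(item, spec[0])]
        return errors
    if not isinstance(value, dict):
        return ['expected dict, got %r' % (value,)]
    errors = []
    # First validate the members that are present...
    for key in sorted(set(spec) & set(value)):
        errors += ['%s: %s' % (key, e) for e in validate(value[key], spec[key])]
    # ...then check for missing and extra data.
    for key in sorted(set(spec) - set(value)):
        errors.append('missing field %r' % key)
    for key in sorted(set(value) - set(spec)):
        errors.append('unexpected field %r' % key)
    return errors


# Hypothetical complex type for a VM creation request.
VM_SPEC = {'name': 'str', 'memSize': 'int', 'disks': [{'path': 'str'}]}

good = validate({'name': 'vm1', 'memSize': 512, 'disks': [{'path': '/a'}]}, VM_SPEC)
bad = validate({'name': 'vm1', 'memSize': '512', 'disks': []}, VM_SPEC)
```

The second call models the failure mentioned in the thread, a client sending a number as a string: it is rejected at the API boundary instead of breaking somewhere inside the server.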
Re: [vdsm] refactor clientif move the api implement to sub-module
On Wed, Jun 20, 2012 at 06:08:57PM +0800, Xu He Jie wrote:

Hi, folks

I am trying to move the API implementation into sub-modules, so that we no longer need a singleton or need to pass clientif everywhere. I am sending this mail to describe the idea, which I think will help with review. I have added an API registration mechanism, so other modules can register their API implementations with the API layer. I am trying to move the VM API and VM-related state (like clientif.vmContainer, etc.) to a new module called vmm. vmm.VMM is similar to hsm: it is responsible for managing VMs and for registering the VM implementation with the API layer. In the same way, I move all storage-related APIs to hsm, and hsm registers them with the API layer. After all APIs have moved to sub-modules, we can rename clientif and no longer need to pass clientif anywhere. :)

I have already submitted a rough version to gerrit: http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:wip_refactor_clientif,n,z

I took a cursory look at the code you have submitted. I think you need to more thoroughly describe your design. I was particularly confused by your use of Abstract Base Classes for this scenario. Can you explain in more depth why you have done this? Is there a simpler way to accomplish what you need to do? I looked at your VMM patch and I am unsure why you need to define VMBase and VMImpl separately. It means declaring the set of functions in two separate files. Thanks for shining some more light on your methodology.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
Re: [vdsm] [virt-node] RFC: API Supportability
unrecognized parameters to a function. Anytime a new parameter is added to a function, a corresponding flag should be specified to enable handling of that parameter. In this way, an old server can return an error for 'Unknown flag'. Have I missed any cases?

- How will versioning be expressed in the bindings?

The API should have a call to return the overall version. Also, the capabilities call should list all noteworthy features that are present.

- Do we restrict newer clients from using old APIs when talking with a new server?

No. A new client that wants to be the most compatible across vdsm versions may choose to use an old API (even if a flashier one is available).

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
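The flag rule above (every new parameter is enabled by a flag, so an old server can reject it with 'Unknown flag') together with the version and capabilities calls might look like the following Python sketch. Everything here is hypothetical: the Server class, the WITH_DESCRIPTION flag, the create_volume call, and the choice to report known flags as the capability list are assumptions for illustration, not the vdsm API.

```python
class UnknownFlagError(Exception):
    """Raised when a client passes a flag this server version does not know."""


class Server(object):
    def __init__(self, version, known_flags):
        self.version = version
        self.known_flags = frozenset(known_flags)

    def get_version(self):
        # Call returning the overall API version.
        return self.version

    def get_capabilities(self):
        # Assumption: capabilities are reported as the list of known flags.
        return sorted(self.known_flags)

    def create_volume(self, size, flags=(), description=None):
        # Reject any flag this server version does not understand.
        unknown = sorted(set(flags) - self.known_flags)
        if unknown:
            raise UnknownFlagError('Unknown flag: %s' % ', '.join(unknown))
        volume = {'size': size}
        # The newer parameter is honoured only when its flag is set.
        if 'WITH_DESCRIPTION' in flags:
            volume['description'] = description
        return volume


old_server = Server((1, 0), [])
new_server = Server((1, 1), ['WITH_DESCRIPTION'])

# A client that wants maximum compatibility can stick to the old call form.
plain = old_server.create_volume(10)
described = new_server.create_volume(10, flags=['WITH_DESCRIPTION'],
                                     description='scratch')
```

A client can either call get_capabilities first and adapt, or simply try the new form and handle UnknownFlagError from an old server.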
Re: [vdsm] A Tool for PEP 8 Patches to Find Code Logic Changes
On Thu, Jun 07, 2012 at 12:03:30PM +0800, Zhou Zheng Sheng wrote:

Hi,

Since there is no coverage report on tests in vdsm, if a PEP 8 patch passes the tests, we are still not sure that there is no mistake in it. Viewing the diff reports on all the changes consumes a lot of time, and some small but fatal mistakes (like a misspelled variable name) can easily be missed by human eyes. So I tried the compiler module of Python and wrote a tool named 'pydiff'. pydiff parses two Python scripts into Abstract Syntax Trees. These data structures reflect the logic of the code, and pydiff performs a recursive compare on the trees. Then pydiff reports differences and the corresponding line numbers. In this way, pydiff ignores code style changes, and reports only logical changes of the code. I think this tool can save us a lot of time. After a PEP 8 patch passes the vdsm tests and pydiff, I will have some confidence that the patch probably does not break anything in vdsm.

This is a very nice tool. Thanks for sharing it. I would like to see all authors of PEP8 patches use this to check their patches for semantic correctness. This should greatly improve our ability to complete the PEP8 cleanup quickly.

Here is a usage example:

test_o.py

    def foo(a, b):
        pass

    if __name__ == '__main__':
        A = [1, 2, 3]
        print (4, 5, 6), \
            over
        foo(1, 2)
        print 'Hello World'

test_n.py

    def foo(a, b):
        pass


    if __name__ == '__main__':
        A = [1, 2, 3]
        print (4, 5, 6), over
        fooo(
            1, 2)
        print ('Hello '
               'World')

Some differences between the files are just a matter of style. The only significant difference is that the function call foo() is misspelled in test_n.py. Run pydiff.py, it will report:

    $ python pydiff.py test_*.py
    1 difference(s)
    first file: test_n.py
    second file: test_o.py
    ((8, 'fooo'), (8, 'foo'))

This report tells us that 'fooo' in line 8 of test_n.py is different from 'foo' in line 8 of test_o.py. It can also find insertions or deletions. Here is another simple example:

old.py

    print 'Hello 1'
    print 'Hello 2'
    print 'Hello 3'
    print 'Hello 4'
    print 'Hello 5'

new.py

    print 'Hello 1'
    print 'Hello 3'
    print 'Hello 4'
    print 'Hello 5'
    print 'Hello 5'

Run pydiff:

    $ pydiff old.py new.py
    2 difference(s)
    first file: old.py
    second file: new.py
    ((2, Printnl([Const('Hello 2')], None)), (2, None))
    ((5, None), (5, Printnl([Const('Hello 5')], None)))

Here ((2, Printnl([Const('Hello 2')], None)), (2, None)) means there is a print statement in line 2 of old.py, but no corresponding statement in new.py, so we know the statement was deleted in new.py. ((5, None), (5, Printnl([Const('Hello 5')], None))) means there is a print statement in line 5 of new.py, but no corresponding statement in old.py, so we know the statement was inserted in new.py.

Sometimes the change in code logic is acceptable, for example, changing "aDict.has_key(Key)" into "Key in aDict". pydiff will report a difference in this case, but it is up to the user to judge whether it is acceptable. pydiff is just a tool to help you find these changes. I hope it can be helpful for PEP 8 patch reviewers. If you find any bugs, please let me know. The script is in the attachment.

--
Thanks and best regards!
Zhou Zheng Sheng / 周征晟
E-mail: zhshz...@linux.vnet.ibm.com
Telephone: 86-10-82454397

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
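The core idea of pydiff, comparing syntax trees so that layout changes are ignored, can be sketched with the standard ast module. This is not the attached pydiff script, which is not reproduced in this thread and used the Python 2 compiler module; the functions logic_equal and changed_statements below are hypothetical names, and the sketch targets Python 3.

```python
import ast


def logic_equal(old_source, new_source):
    """True if both sources parse to the same tree.

    ast.dump() leaves out line and column numbers by default, so
    whitespace, line breaks, and comments do not affect the result.
    """
    return ast.dump(ast.parse(old_source)) == ast.dump(ast.parse(new_source))


def changed_statements(old_source, new_source):
    """Return (old_lineno, new_lineno) for top-level statements that differ.

    Simplification: statements are paired by position, so an inserted or
    deleted statement shifts every later pair; a real tool would align them.
    """
    old_body = ast.parse(old_source).body
    new_body = ast.parse(new_source).body
    return [(old.lineno, new.lineno)
            for old, new in zip(old_body, new_body)
            if ast.dump(old) != ast.dump(new)]


OLD = "a = [1, 2, 3]\nfoo(1, 2)\nprint('Hello World')\n"
STYLE_ONLY = "a = [1, 2, 3]\n\n\nfoo( 1,\n     2)\nprint('Hello '\n      'World')\n"
TYPO = "a = [1, 2, 3]\n\n\nfooo(1, 2)\nprint('Hello World')\n"

same = logic_equal(OLD, STYLE_ONLY)
typo_lines = changed_statements(OLD, TYPO)
```

Adjacent string literals are merged by the parser, so splitting 'Hello World' across two lines counts as a style-only change, while the misspelled call is reported with its line number in each file.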
[vdsm] Help please! -- [postmas...@us.ibm.com: Delivery Status Notification (Failure)]
Hi. Recently my mails to vdsm-devel have started bouncing. I think this was caused by a temporary misconfiguration of my local mail setup that resulted in my email appearing as 'agli...@us.ibm.com'. Could someone please verify that my list membership is still configured with the proper email address 'a...@us.ibm.com'? Thanks a lot and sorry for causing trouble! :) - Forwarded message from e33.co.us.ibm.com PostMaster postmas...@us.ibm.com - Date: Wed, 6 Jun 2012 15:33:12 -0600 From: e33.co.us.ibm.com PostMaster postmas...@us.ibm.com To: a...@us.ibm.com Subject: Delivery Status Notification (Failure) X-MailerServer: XMail 1.27mod32-ISS X-MailerError: Message = [1339018392162.92bffba0.49b4.2825d.e33] Server = [e33.co.us.ibm.com] [00] XMail bounce: Rcpt=[vdsm-devel@lists.fedorahosted.org];Error=[550 5.1.1 vdsm-devel@lists.fedorahosted.org: Recipient address rejected: User unknown in local recipient table] [01] Error sending message [1339018392162.92bffba0.49b4.2825d.e33] from [e33.co.us.ibm.com]. ID:12060621-2398---0746823F Mail From: a...@us.ibm.com Rcpt To: vdsm-devel@lists.fedorahosted.org Server:[hosted03.fedoraproject.org.] [02] The reason of the delivery failure was: 550 5.1.1 vdsm-devel@lists.fedorahosted.org: Recipient address rejected: User unknown in local recipient table [05] Here is listed the initial part of the message: Received: from /spool/local by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for vdsm-devel@lists.fedorahosted.org from a...@us.ibm.com; Wed, 6 Jun 2012 15:33:12 -0600 Received: from d03dlp02.boulder.ibm.com (9.17.202.178) by e33.co.us.ibm.com (192.168.1.133) with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted; Wed, 6 Jun 2012 15:33:10 -0600 Received: from d03relay03.boulder.ibm.com (d03relay03.boulder.ibm.com [9.17.195.228]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id E4C9F3E4004C for vdsm-devel@lists.fedorahosted.org; Wed, 6 Jun 2012 21:33:08 + (WET) Received: from d03av05.boulder.ibm.com (d03av05.boulder.ibm.com [9.17.195.85]) by d03relay03.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q56LX2Fe134162 for vdsm-devel@lists.fedorahosted.org; Wed, 6 Jun 2012 15:33:05 -0600 Received: from d03av05.boulder.ibm.com (loopback [127.0.0.1]) by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q56LX0DP010799 for vdsm-devel@lists.fedorahosted.org; Wed, 6 Jun 2012 15:33:01 -0600 Received: from us.ibm.com (sig-9-76-23-222.mts.ibm.com [9.76.23.222]) by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with SMTP id q56LWviG010656; Wed, 6 Jun 2012 15:32:58 -0600 Received: by us.ibm.com (sSMTP sendmail emulation); Wed, 6 Jun 2012 16:32:57 -0500 From: Adam Litke a...@us.ibm.com Date: Wed, 6 Jun 2012 16:32:57 -0500 To: Rodrigo Trujillo rodrigo.truji...@linux.vnet.ibm.com Cc: vdsm-devel@lists.fedorahosted.org Subject: Re: [vdsm] About xmlrpc an rest api Message-ID: 20120606213257.GU2671@localhost.localdomain References: 4fcfab74.8030...@linux.vnet.ibm.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 4fcfab74.8030...@linux.vnet.ibm.com User-Agent: Mutt/1.5.21 (2010-09-15) X-Content-Scanned: Fidelis XPS MAILER x-cbid: 12060621-2398---0746823F On Wed, Jun 06, 2012 at 04:11:48PM -0300, Rodrigo Trujillo wrote: Hi, I have researched about the VDSM APIs, but was not clear to me how to use them. Where can I find documentation about them and how to use with python ? I wrote this python script (with help from this list) to create a VM using the xmlrpc interface. It is not trivial (as you will see). 
I am certain that you will need to modify this to get it working in your environment. In the future, we hope to make this far easier to do. We want to save you from needing to do the storage manipulations. Also, a REST API should organize the API much better than the xmlrpc (which was never meant to be friendly to end users).

#!/usr/bin/python
import sys
import uuid
import time

- End forwarded message -

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] Help please! -- [postmas...@us.ibm.com: Delivery Status Notification (Failure)]
On Thu, Jun 07, 2012 at 05:33:04PM +0300, Itamar Heim wrote:
On 06/07/2012 05:07 PM, Adam Litke wrote:

Hi. Recently my mails to vdsm-devel have started bouncing. I think this was caused by a temporary misconfiguration of my local mail setup that resulted in my email appearing as 'agli...@us.ibm.com'. Could someone please verify that my list membership is still configured with the proper email address 'a...@us.ibm.com'? Thanks a lot and sorry for causing trouble! :)

I think you just got this like other people who replied to that email. i.e., it's from your mail server, not from the mailing list

Hmm, ok. It seems that the mail is flowing now and is arriving at the list. Also this reply has worked. Hope all is okay now then.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
Re: [vdsm] SSLError with vdsm
On Thu, Jun 07, 2012 at 05:35:54PM +0300, Itamar Heim wrote:
On 06/07/2012 09:58 AM, Wenyi Gao wrote:
On 2012-06-07 13:51, Zhou Zheng Sheng wrote:

Hi,

It is because a normal user does not have the privilege to access the keys in /etc/pki/vdsm/keys/ and the certificates in /etc/pki/vdsm/certs/. You can su to root or sudo vdsClient to use an SSL connection.

On 2012-06-07 13:03, Wenyi Gao wrote:

Hi guys,

When I ran the command vdsClient -s 0 getVdsCaps, I got the following error:

    $ vdsClient -s 0 getVdsCaps
    Traceback (most recent call last):
      File "/usr/share/vdsm/vdsClient.py", line 2275, in <module>
        code, message = commands[command][0](commandArgs)
      File "/usr/share/vdsm/vdsClient.py", line 403, in do_getCap
        return self.ExecAndExit(self.s.getVdsCapabilities())
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1224, in __call__
        return self.__send(self.__name, args)
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1578, in __request
        verbose=self.__verbose
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1264, in request
        return self.single_request(host, handler, request_body, verbose)
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1292, in single_request
        self.send_content(h, request_body)
      File "/usr/lib64/python2.7/xmlrpclib.py", line 1439, in send_content
        connection.endheaders(request_body)
      File "/usr/lib64/python2.7/httplib.py", line 954, in endheaders
        self._send_output(message_body)
      File "/usr/lib64/python2.7/httplib.py", line 814, in _send_output
        self.send(msg)
      File "/usr/lib64/python2.7/httplib.py", line 776, in send
        self.connect()
      File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 98, in connect
        cert_reqs=self.cert_reqs)
      File "/usr/lib64/python2.7/ssl.py", line 381, in wrap_socket
        ciphers=ciphers)
      File "/usr/lib64/python2.7/ssl.py", line 141, in __init__
        ciphers)
    SSLError: [Errno 185090050] _ssl.c:340: error:0B084002:x509 certificate routines:X509_load_cert_crl_file:system lib

But if I set ssl = false in /etc/vdsm/vdsm.conf, then run vdsClient 0 getVdsCaps, the problem goes away. Does anyone know what causes the problem above?

Thanks.
Wenyi Gao

--
Thanks and best regards!
Zhou Zheng Sheng / 周征晟
E-mail: zhshz...@linux.vnet.ibm.com
Telephone: 86-10-82454397

Yes, it works. Thanks.

maybe send a patch to check the permissions and give a proper error message for the next user failing on this?

+1. Great suggestion!

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
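The permission check suggested at the end of this thread could look roughly like the sketch below. It is not a vdsm patch: the function names and the message wording are assumptions, and only the two directories named in the thread are used as example paths. It relies on os.access from the standard library.

```python
import os


def unreadable_paths(paths):
    """Return the paths the current user cannot read (or that do not exist)."""
    return [path for path in paths if not os.access(path, os.R_OK)]


def ssl_precheck_message(paths):
    """Return a readable error message, or None if every path is readable."""
    blocked = unreadable_paths(paths)
    if not blocked:
        return None
    return ('Cannot read SSL files: %s. Run the client as root (or with '
            'sudo), or check the file permissions.' % ', '.join(blocked))


# Example call with the directories mentioned in the thread; the result
# depends on the machine and on the user running it.
message = ssl_precheck_message(['/etc/pki/vdsm/keys/', '/etc/pki/vdsm/certs/'])
```

A client could print this message and exit before attempting the SSL connection, so the user sees the likely cause instead of the X509_load_cert_crl_file traceback.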
Re: [vdsm] Agenda for today's call
On Mon, Jun 04, 2012 at 12:58:21PM +0300, Dan Kenigsberg wrote: Hi All, I have fewer talk issues for today, please suggest others, or else the call would be short and to the point! - reviewers/verifiers are still missing for pep8 patches. A branch was created, but not much action has taken place on it http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:pep8cleaning,n,z - Upcoming oVirt-3.1 release: version bump to 4.9.7? to 4.10? - Vdsm/MOM integration: could we move MOM to gerrit.ovirt.org? I would like to add: - screen sharing options for REST API online code review -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] RFC: Writeup on VDSM-libstoragemgmt integration
offload capability in the domain metadata -- If available, and override is not configured, it will use LSM to offload LUN/File snapshot -- If override is configured or capability is not available, it will use its internal logic to create snapshot (qcow2). - Copy/Clone vmdisk flow -- VDSM will check the copy offload capability in the domain metadata -- If available, and override is not configured, it will use LSM to offload LUN/File copy -- If override is configured or capability is not available, it will use its internal logic to copy the disk (eg: dd cmd in case of LUN). 7) LSM potential changes: - list features/capabilities of the array. Eg: copy offload, thin prov. etc. - list containers (aka pools) (present in LSM today) - Ability to list different types of arrays being managed, their capabilities and used/free space - Ability to create/list/delete/resize volumes (LUN or exports, available in LSM as of today) - Get monitoring info with object (LUN/snapshot/volume) as optional parameter for specific info. eg: container/pool free/used space, raid type etc. Need to make sure above info is listed in a coherent way across arrays (number of LUNs, raid type used? free/total per container/pool, per LUN?). Also need I/O statistics wherever possible. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
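The offload decision described for the snapshot and copy flows can be sketched as follows. This is only an illustration: the function name, the capability strings, and the return values are hypothetical and do not come from VDSM or libstoragemgmt.

```python
def choose_backend(operation, domain_capabilities, override=False):
    """Pick who performs an operation ('snapshot' or 'copy').

    Returns 'lsm' when the domain metadata advertises the matching
    offload capability and no override is configured; otherwise
    returns 'internal' (qcow2 snapshot, dd copy, and so on).
    """
    capability = operation + "_offload"  # hypothetical capability name
    if capability in domain_capabilities and not override:
        return "lsm"
    return "internal"
```

For example, `choose_backend("copy", {"copy_offload"})` selects LSM, while passing `override=True` or an empty capability set falls back to the internal logic.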
Re: [vdsm] VDSM API/clientIF instance design issue
On Thu, May 31, 2012 at 04:08:52PM +0300, Dan Kenigsberg wrote: On Thu, May 31, 2012 at 09:03:37PM +0800, Mark Wu wrote: On 05/30/2012 11:01 PM, Dan Kenigsberg wrote: On Wed, May 30, 2012 at 10:49:29PM +0800, Mark Wu wrote: Hi Guys, Recently, I have been working on integrating MOM into VDSM. MOM needs to use the VDSM API to interact with it. But currently, an instance of clientIF is required to use the vdsm API. Passing clientIF to MOM is not a good choice since it's a vdsm internal object. So I am trying to remove the parameter 'cif' from the interface definition and change to accessing the globally unique clientIF instance in API.py. Please remind me - why don't we continue to pass the clientIF instance, even if it means mentioning it each and every time an API.py object is created? It may be annoying (and thus serve as a reminder that we should probably retire much of clientIF...), but it should work. In the old MOM integration patch, I passed the clientIF instance to MOM by the following method: vdsmInterface.setConnection(self._cif) Here are your comments on the patch: _cif is not the proper API to interact with Vdsm. API.py is. Please change MOM to conform to this, if possible. I think that mom should receive an API object (even API.Global()!) that it needs for its operation. Even passing BindingXMLRPC() object is more APIish than the internal clientIF object. Please do not blame me! ;-) I do not mind passing an API.Global() that happens to hold an internal private reference to _clientIF. I just want that if we find the way to obliterate clientIF, we won't need to send a patch to MOM, too. So I am trying to remove cif from the API definition so that MOM can call the VDSM API without having clientIF. I do not understand - MOM could receive an API object, it does not have to construct it by itself. Today, the API consists of several, unlinked objects so passing a single API.Global() would not be enough.
We either need to allow MOM to construct its own API objects or produce them by calling methods in API.Global(). Personally, I think the code would be cleaner if clientIF is a singleton (Mark's latest patch) as opposed to adding factory methods to API.Global(). -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
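To make the singleton option concrete, here is a rough sketch of API objects that look up a shared instance instead of taking a 'cif' argument. `ClientIF` and `VM` below are simplified stand-ins written for this illustration, not the real vdsm classes, and `getInstance` is an assumed method name.

```python
import threading


class ClientIF(object):
    """Stand-in for vdsm's internal clientIF object."""

    _instance = None
    _lock = threading.Lock()

    def __init__(self):
        self.vms = {}

    @classmethod
    def getInstance(cls):
        # Create the shared instance on first use; later calls reuse it.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls()
        return cls._instance


class VM(object):
    """Stand-in for an API.py object that no longer needs a cif parameter."""

    def __init__(self, vm_id):
        self._cif = ClientIF.getInstance()
        self.id = vm_id
```

With this shape, a consumer such as MOM can construct `VM("some-uuid")` directly and never handles the internal object itself.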
Re: [vdsm] a problem with pep8
On Fri, May 18, 2012 at 03:56:05PM +0800, ShaoHe Feng wrote: A comment exceeds 80 characters, and it is a URL, such as # http:///bb///eee/fff/ What can I do? Is this OK? # http://bb// # /eee/fff/ # (the link is too long to fit in one line, copy it and paste it to one line) It would be nice if we could annotate the source code to disable certain checks in places such as this. Clearly the rigid line length restriction would result in a less readable comment if followed here. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
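As an illustration of the kind of exemption being asked for, here is a small standalone line-length check that skips comment lines consisting only of a URL. This is a hypothetical helper written for this thread; it does not describe how the pep8 tool itself behaves.

```python
import re

# A comment line that holds nothing but a single URL.
URL_ONLY_COMMENT = re.compile(r"^\s*#\s*\S+://\S+\s*$")


def long_lines(source, limit=79):
    """Return 1-based numbers of lines over the limit, ignoring URL-only comments."""
    offenders = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > limit and not URL_ONLY_COMMENT.match(line):
            offenders.append(number)
    return offenders
```

The design choice here is to exempt a narrow, easily recognized pattern rather than switch the check off for a whole file.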
Re: [vdsm] RESTful VM creation
No comments at all on this?? On Wed, May 09, 2012 at 09:35:29AM -0500, Adam Litke wrote: I would like to discuss a problem that is going to affect VM creation in the new REST API. This topic has come up previously and I want to revive that discussion because it is blocking a proper implementation of VM.create(). Consider a RESTful VM creation sequence: POST /api/vms/define - Define a new VM in the system POST /api/vms/id/disks/add - Add a new disk to the VM POST /api/vms/id/cdroms/add - Add a cdrom POST /api/vms/id/nics/add - Add a NIC PUT /api/vms/id - Change boot sequence POST /api/vms/id/start - Boot the VM Unfortunately this is not possible today with vdsm because a VM must be fully-specified at the time of creation and it will be started immediately. As I see it there are two ways forward: 1.) Deviate from a REST model and require a VM resource definition to include all sub-collections inline. -- or -- 2.) Support storage of vm definitions so that powered off VMs can be manipulated by the API. My preference would be #2 because: it makes the API more closely follow RESTful principles, it maintains parity with the cluster-level VM manipulation API, and it makes the API easier to use in standalone mode. Here is my idea on how this could be accomplished without committing to stateful host storage. In the past we have discussed adding an API for storing arbitrary metadata blobs on the master storage domain. If this API were available we could use it to create a transient VM construction site. Let's walk through the above RESTful sequence again and see how my idea would work in practice: * POST /api/vms/define - Define a new VM in the system A new VM definition would be written to the master storage domain metadata area. * GET /api/vms/new-uuid The normal 'list' API is consulted as usual. The VM will not be found there because it is not yet created. Next, the metadata area is consulted. The VM is found there and will be returned. The VM state will be 'New'. 
* POST /api/vms/id/disks/add - Add a new disk to the VM For 'New' VMs, this will update the VM metadata blob with the new disk information. Otherwise, this will call the hotplugDisk API. * POST /api/vms/id/cdroms/add - Add a cdrom For 'New' VMs, this will update the VM metadata blob with the new cdrom information. If we want to support hotplugged CDROMs we can call that API later. * POST /api/vms/id/nics/add - Add a NIC For 'New' VMs, this will update the VM metadata blob with the new nic information. Otherwise it triggers the hotplugNic API. * PUT /api/vms/id - Change boot sequence Only valid for 'New' VMs. Updates the metadata blob according to the parameters specified. * POST /api/vms/id/start - Boot the VM Load the metadata from the master storage domain metadata area. Call the VM.create() API. Remove the metadata from the master storage domain. VDSM will automatically purge old metadata from the master storage domain. This could be done any time a domain is: attached as master, deactivated, and periodically. How does this idea sound? I am certain that it can be improved by those of you with more experience and different viewpoints. Thoughts and comments? -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
[vdsm] RESTful VM creation
I would like to discuss a problem that is going to affect VM creation in the new REST API. This topic has come up previously and I want to revive that discussion because it is blocking a proper implementation of VM.create(). Consider a RESTful VM creation sequence: POST /api/vms/define - Define a new VM in the system POST /api/vms/id/disks/add - Add a new disk to the VM POST /api/vms/id/cdroms/add - Add a cdrom POST /api/vms/id/nics/add - Add a NIC PUT /api/vms/id - Change boot sequence POST /api/vms/id/start - Boot the VM Unfortunately this is not possible today with vdsm because a VM must be fully-specified at the time of creation and it will be started immediately. As I see it there are two ways forward: 1.) Deviate from a REST model and require a VM resource definition to include all sub-collections inline. -- or -- 2.) Support storage of vm definitions so that powered off VMs can be manipulated by the API. My preference would be #2 because: it makes the API more closely follow RESTful principles, it maintains parity with the cluster-level VM manipulation API, and it makes the API easier to use in standalone mode. Here is my idea on how this could be accomplished without committing to stateful host storage. In the past we have discussed adding an API for storing arbitrary metadata blobs on the master storage domain. If this API were available we could use it to create a transient VM construction site. Let's walk through the above RESTful sequence again and see how my idea would work in practice: * POST /api/vms/define - Define a new VM in the system A new VM definition would be written to the master storage domain metadata area. * GET /api/vms/new-uuid The normal 'list' API is consulted as usual. The VM will not be found there because it is not yet created. Next, the metadata area is consulted. The VM is found there and will be returned. The VM state will be 'New'. 
* POST /api/vms/id/disks/add - Add a new disk to the VM For 'New' VMs, this will update the VM metadata blob with the new disk information. Otherwise, this will call the hotplugDisk API. * POST /api/vms/id/cdroms/add - Add a cdrom For 'New' VMs, this will update the VM metadata blob with the new cdrom information. If we want to support hotplugged CDROMs we can call that API later. * POST /api/vms/id/nics/add - Add a NIC For 'New' VMs, this will update the VM metadata blob with the new nic information. Otherwise it triggers the hotplugNic API. * PUT /api/vms/id - Change boot sequence Only valid for 'New' VMs. Updates the metadata blob according to the parameters specified. * POST /api/vms/id/start - Boot the VM Load the metadata from the master storage domain metadata area. Call the VM.create() API. Remove the metadata from the master storage domain. VDSM will automatically purge old metadata from the master storage domain. This could be done any time a domain is: attached as master, deactivated, and periodically. How does this idea sound? I am certain that it can be improved by those of you with more experience and different viewpoints. Thoughts and comments? -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
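The walk-through above can be sketched as a small in-memory model. Everything here is an assumption made for illustration: `ConstructionSite` stands in for the proposed metadata area on the master storage domain, and the `create` callable stands in for VM.create().

```python
import uuid


class ConstructionSite(object):
    """In-memory stand-in for VM definitions stored as metadata blobs."""

    def __init__(self):
        self._blobs = {}

    def define(self, name):
        """POST /api/vms/define: store a new, not yet created VM."""
        vm_id = str(uuid.uuid4())
        self._blobs[vm_id] = {"name": name, "state": "New", "disks": [],
                              "cdroms": [], "nics": [], "boot": []}
        return vm_id

    def get(self, vm_id):
        """GET /api/vms/<id>: look the VM up in the metadata area."""
        return self._blobs[vm_id]

    def add_device(self, vm_id, kind, spec):
        """POST .../disks/add, .../cdroms/add, .../nics/add for 'New' VMs."""
        self._blobs[vm_id][kind].append(spec)

    def set_boot(self, vm_id, order):
        """PUT /api/vms/<id>: only meaningful while the VM is 'New'."""
        self._blobs[vm_id]["boot"] = list(order)

    def start(self, vm_id, create):
        """POST .../start: load the blob, call create(), then remove the blob."""
        definition = self._blobs[vm_id]
        result = create(definition)
        del self._blobs[vm_id]
        return result
```

The blob is removed only after `create()` returns, which mirrors the order given in the proposal (load, create, remove).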
Re: [vdsm] error when run vdsClient
On Tue, May 08, 2012 at 11:51:02PM +0300, Dan Kenigsberg wrote: On Wed, May 09, 2012 at 01:42:45AM +0800, ShaoHe Feng wrote: I built vdsm with $ sudo ./autobuild.sh and all tests passed. Then I installed the rpm package and started vdsm: $ sudo systemctl start vdsmd.service But I get an error when I run vdsClient: File /usr/share/vdsm/vdsClient.py, line 28, in module from vdsm import vdscli ImportError: cannot import name vdscli But if I change to root, vdsClient works. I have also noticed this problem. I have found that changing out of the vdsm source directory 'fixes' it as well. $ ls /usr/lib/python2.7/site-packages/vdsm/vdscli.py -al -rw-r--r--. 1 root root 4113 May 9 01:20 /usr/lib/python2.7/site-packages/vdsm/vdscli.py What's your $PWD? Maybe you have some vdsm module/package in your PYTHONPATH that hides the one in site-packages. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
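One quick way to test the shadowing theory is to ask Python which file it would import a package from. The helper below is a sketch for Python 3 (the thread itself concerns Python 2.7, where printing `vdsm.__file__` after `import vdsm` gives the same information); `module_origin` is a name invented for this example.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

If `module_origin("vdsm")` points into the source checkout rather than site-packages, a local directory is hiding the installed package.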
[vdsm] RFD: NEW API getAllTasks
The current APIs for retrieving all task information do not actually return all task information. I would like to introduce a new API that corrects this and other issues with the current API while preserving backwards compatibility with ovirt-engine for as long as is necessary. The current APIs: getAllTasksInfo(spUUID=None, options = None): - Returns a dictionary that maps a task UUID to a task verb. - Despite having 'all' in the name, this API only returns tasks that have an 'spm' tag. - This call returns only one piece of information for each task. - The spUUID parameter is deprecated and ignored. getAllTasksStatuses(spUUID=None, options = None): - Returns a dictionary of task status information. - Despite having 'all' in the name, this API only returns tasks that have an 'spm' tag. - The spUUID parameter is deprecated and ignored. I propose the following new API: getAllTasks(tag=None, options=None): - Returns a dictionary of task information. The info from both of the above functions would be merged into a single result set. - If tag is None, all tasks are returned. otherwise, only tasks matching the tag are returned. - The spUUID parameter is dropped. options is for future extension and is currently not used. This new API includes all functionality that is available in the old calls. In the future, ovirt-engine could switch to this API and preserve the current semantics by passing tag='spm' to getAllTasks. Meanwhile, API users that really want all tasks (gluster and the REST API) can get what they need. Thoughts on this idea? -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
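A rough sketch of the proposed semantics, assuming each task is represented as a dictionary with 'verb', 'status' and 'tags' entries. That layout is invented for this example and is not the task manager's real data structure.

```python
def get_all_tasks(tasks, tag=None):
    """Merge verb and status information per task, optionally filtered by tag.

    tasks maps a task UUID to {'verb': str, 'status': dict, 'tags': list}.
    With tag=None every task is returned; otherwise only tasks carrying
    the given tag are included.
    """
    result = {}
    for task_id, task in tasks.items():
        if tag is not None and tag not in task["tags"]:
            continue
        info = {"verb": task["verb"]}
        info.update(task["status"])
        result[task_id] = info
    return result
```

Calling `get_all_tasks(tasks, tag="spm")` would reproduce the current filtered behaviour, while `get_all_tasks(tasks)` returns everything.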
[vdsm] getAllTasksInfo API
Hi, While developing the REST API I was having trouble using the getAllTasks(Info|Statuses) API to get tasks information. I found out that hsm is hard-coding a tagged search for 'spm' in the calls to the task manager. Is there a reason that this tag must be hard-coded or can we remove it as in the patch below? With this patch applied I am able to list all tasks. If this patch is acceptable, I would be happy to submit it to gerrit for approval. Thanks! commit 72621b2ffe5a0a21ba1023dada36b405bf2111f2 Author: Adam Litke a...@us.ibm.com Date: Mon Apr 16 13:56:55 2012 -0500 Don't hardcode the 'spm' tag when getting information for all tasks. diff --git a/vdsm/storage/hsm.py b/vdsm/storage/hsm.py index 2755aef..51ee17c 100644 --- a/vdsm/storage/hsm.py +++ b/vdsm/storage/hsm.py @@ -1694,7 +1694,7 @@ class HSM: :options: ? #getSharedLock(tasksResource...) -allTasksStatus = self.taskMng.getAllTasksStatuses(spm) +allTasksStatus = self.taskMng.getAllTasksStatuses() return dict(allTasksStatus=allTasksStatus) @@ -1733,7 +1733,7 @@ class HSM: #getSharedLock(tasksResource...) # TODO: if spUUID passed, make sure tasks are relevant only to pool -allTasksInfo = self.taskMng.getAllTasksInfo(spm) +allTasksInfo = self.taskMng.getAllTasksInfo() return dict(allTasksInfo=allTasksInfo) -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
[vdsm] MOM Integration Plan
Hi all, Very shortly Mark will be sending some patches for review that implement the long-awaited integration of mom with vdsm. I felt it would be easier to understand the changes to vdsm if they were explained a bit better. In support of this I have created a wiki page on ovirt.org with a diagram: http://ovirt.org/wiki/Features/MomIntegration To facilitate discussion, here is the text of that page: As discussed at the oVirt Workshop and elsewhere, integrating mom with vdsm will benefit oVirt by providing a mechanism for dynamic, policy-based tuning. This mechanism will pave the way for implementing memory ballooning policies, can enhance migration policy, and will replace the existing ksm tuning thread. MOM exists today as an independent library that can be used by python programs such as vdsm or in standalone mode (by using the accompanying momd program). Mom's operation is very configurable. The management policy is written in a Lisp-like language and is replaceable by the end user. Additionally, plugins allow you to customize the types of information collected and the manner in which it is collected. Similarly, Controller plugins permit a completely flexible control API to be created. To integrate mom, vdsm will initialize the mom library in a new thread and start it. Therefore, mom and vdsm will exist in the same process. Vdsm will configure the mom instance to use plugins and a policy that exclusively target the vdsm API. All statistics collection will occur via API calls and any management actions (including adjustments to KSM and VM balloons) will be done through the vdsm api as well. Mom will not use libvirt at all (not even to monitor for new VMs on the system). Packaging logistics: - Mom is an independent package that is already in Fedora. Any changes to mom that are required to support this integration will be submitted to the mom project for inclusion. Vdsm will consume the standard MOM package as a python module/library.
In order to control its mom instance, vdsm will ship a mom configuration file and a mom policy file that will set mom's default behavior. At startup, vdsmd will import mom and initialize it with the configuration and policy files. From that point on, mom will interact with vdsm through the well-defined API in API.py. New features needed in vdsm: --- In order to fully benefit from mom's capabilities, vdsm should implement the following extra features/APIs: - Collection of more memory statistics via ovirt-guest-agent including the current memory balloon value. - A vmBalloon API to set a new balloon target. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
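The in-process arrangement described above can be pictured with a generic background thread that alternates between collecting statistics and applying a control action. `PolicyEngine` is a hypothetical stand-in written for this sketch; it does not use or describe the real mom library interface.

```python
import threading


class PolicyEngine(threading.Thread):
    """Stand-in for a policy engine running inside the vdsm process."""

    def __init__(self, collect, control, interval=0.01):
        super().__init__(daemon=True)
        self._collect = collect      # callable returning a stats dict
        self._control = control      # callable acting on those stats
        self._interval = interval
        self._stop_event = threading.Event()

    def run(self):
        while not self._stop_event.is_set():
            self._control(self._collect())
            # Sleep between cycles, but wake up early when asked to stop.
            self._stop_event.wait(self._interval)

    def stop(self):
        self._stop_event.set()
```

In the integration described here, both callables would go through the vdsm API rather than talking to libvirt directly.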
Re: [vdsm] VDSM host network configuration
On Wed, Feb 15, 2012 at 06:36:48PM +0200, Dan Kenigsberg wrote: On Thu, Feb 16, 2012 at 12:05:16AM +0800, Lei Li wrote: Hi, We are working on VDSM network REST APIs. To support them we need to get the list of configured networks. I found that VDSM network has a function 'listNetworks' in configNetwork.py. It can get and display the current configured network like this: # python configNetwork.py list Networks: ['bridge_one', 'bridge_three', 'bridge_two'] Vlans: [] Nics: ['eth0'] Bondings: [] But there are some problems with it. It cannot display the defined networks after a host restart, even though the created config files are still there (/etc/sysconfig/network-scripts/..). Did I miss anything? Or is there some way to avoid this? Your suggestions and thoughts would be appreciated. Lei, on my vdsm host, running python /usr/share/vdsm/configNetwork.py list gives me the following output: Networks: ['ovirtmgmt'] Vlans: [] Nics: ['eth1', 'eth0'] Bondings: ['bond4', 'bond0', 'bond1', 'bond2', 'bond3'] and python /usr/share/vdsm/configNetwork.py show ovirtmgmt gives: Bridge ovirtmgmt: vlan=None, bonding=None, nics=['eth0'] These results are what I would expect to see. Could you describe how you reproduce the problem (in as much detail as possible)? You define a network, persist it, and restart the host? Hi Dan. As I understand it there is not a problem with vdsm in this regard. Lei is trying to model the current networking APIs in REST. To do this you might have something like: /vdsm-api/networks/ - Get a list of bridges configured for vdsm /vdsm-api/networks/confirm - Mark the current network config as safe /vdsm-api/networks/add - Add a new network /vdsm-api/networks/ovirtmgmt/ - View details of the ovirtmgmt network /vdsm-api/networks/ovirtmgmt/edit - Edit the ovirtmgmt network /vdsm-api/networks/ovirtmgmt/delete - Delete the ovirtmgmt network The current vdsm API lacks a facility to display the /vdsm-api/networks/ URI because there is no function to get such a list.
To create such an API, one might call out to 'configNetwork.py list'. Is there support for adding such an API to API.py? How about an API to fetch network info via configNetwork.py show? Also, I think the networking APIs should be organized into a Network class within API.py. Did Vdsm restart after boot? What is reported by getVdsCaps ? -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
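If a list API were built by calling out to 'configNetwork.py list', its output would need to be parsed. The sketch below assumes the tool prints one 'Name: [python list]' entry per line, as the samples quoted in this thread suggest; `parse_network_list` is a hypothetical helper.

```python
import ast


def parse_network_list(output):
    """Turn 'Networks: [...]' style lines into a dict of lists."""
    result = {}
    for line in output.splitlines():
        key, separator, value = line.partition(":")
        if not separator or not value.strip():
            continue  # skip blank or malformed lines
        # literal_eval only accepts Python literals, so it is safe here.
        result[key.strip().lower()] = ast.literal_eval(value.strip())
    return result
```

A proper API in API.py would more likely return this structure directly instead of round-tripping through text; the parser only shows what the REST layer needs.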
Re: [vdsm] vdsm hangs in SamplingMethod after reinstall
On Sun, Feb 12, 2012 at 06:46:25PM -0500, Ayal Baron wrote: - Original Message - On Thu, Feb 09, 2012 at 07:15:48PM -0500, Ayal Baron wrote: - Original Message - Hi. I am running into a very annoying problem when working on vdsm lately. My development process involves stopping vdsm, replacing files, and restarting it. I do this pretty frequently. Sometimes, after restarting vdsm the XMLRPC call getStorageDomainsList() hangs. The following line is the last to Can you post the exact flow you're running? Still working on this. It isn't reproducing reliably -- only when I really need to get some work done :) print in the log: Thread-18::DEBUG::2012-02-09 17:11:46,793::misc::1017::SamplingMethod::(__call__) Trying to enter sampling method (storage.sdc.refreshStorage) The only solution I've been able to come up with is restarting my machine. When stopping vdsm I search for any stale threads but I am unable to find them. Do you know what else might be causing DynamicBarrier.enter() to hang for a long period of time? Do the threading primitives use some sort of temporary disk storage that needs to be cleaned up? Thanks for the help! Try to add some logging in sdc.py: def refreshStorage(self): ADD LOG HERE Yep have done this and I am not even getting into the refreshStorage function. We actually hang in DynamicBarrier.enter(). I am going to add some debugging to determine which locking operation gets stuck. On the face of it it sounds like a python bug. Is supervdsm running? did you try killing it as well? Are you sure there is no 'Got in to sampling method' line in the log? Have you tried adding logging in 'enter' to see at what stage exactly you get stuck? 
(side note - code should probably be updated with 'with' as it was originally written for use with python 2.4) multipath.rescan() I have a feeling that your issue is not with SamplingMethod -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
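For the debugging step discussed here (finding out which locking operation gets stuck), a small context manager can log before and after each acquisition and also gives the 'with' style mentioned in the side note. This is a generic sketch; `logged_lock` is a hypothetical helper and not part of vdsm's misc module.

```python
import logging
import threading
from contextlib import contextmanager

log = logging.getLogger("barrier-debug")


@contextmanager
def logged_lock(lock, name):
    """Acquire a lock, logging each stage so a hang is easy to locate."""
    log.debug("waiting for %s", name)
    lock.acquire()
    log.debug("acquired %s", name)
    try:
        yield
    finally:
        lock.release()
        log.debug("released %s", name)
```

If the log shows "waiting for" without a matching "acquired", that lock is the one blocking the caller.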
[vdsm] vdsm hangs in SamplingMethod after reinstall
Hi. I am running into a very annoying problem when working on vdsm lately. My development process involves stopping vdsm, replacing files, and restarting it. I do this pretty frequently. Sometimes, after restarting vdsm the XMLRPC call getStorageDomainsList() hangs. The following line is the last to print in the log: Thread-18::DEBUG::2012-02-09 17:11:46,793::misc::1017::SamplingMethod::(__call__) Trying to enter sampling method (storage.sdc.refreshStorage) The only solution I've been able to come up with is restarting my machine. When stopping vdsm I search for any stale threads but I am unable to find them. Do you know what else might be causing DynamicBarrier.enter() to hang for a long period of time? Do the threading primitives use some sort of temporary disk storage that needs to be cleaned up? Thanks for the help! -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
Re: [vdsm] [Engine-devel] [RFC] New Connection Management API
, it is what I was looking for. If all is well, and it usually is, VDSM will not invoke a disconnect. So the caller would have to call unmanage if the connection succeeded at the end of the flow. Agree. Now, if you are already calling unmanage if connection succeeded you can just call it anyway. Not exactly; an example I gave earlier in the thread was that VDSM hangs or has another error and the engine cannot initiate unmanage. Instead, let's assume the host is fenced (self-fence or external fence does not matter); in this scenario the engine will not issue unmanage. instead of doing: (with your suggestion) manage wait until succeeds or lastError has value try: do stuff finally: unmanage do: (with the canonical flow) manage try: wait until succeeds or lastError has value do stuff finally: unmanage This is simpler to do than having another connection type. You are assuming the engine can communicate with VDSM and there are scenarios where it is not feasible. Now that we got that out of the way let's talk about the 2nd use case. Since I did not ask VDSM to clean after the (engine) user and you don't want to do it I am not sure we need to discuss this. If you insist we can start the discussion on who should implement the cleanup mechanism but I'm afraid I have no strong arguments for VDSM to do it, so I'd rather not go there ;) You dropped from the discussion my request for supporting a list of connections for manage and unmanage verbs. API client died in the middle of the operation and unmanage was never called. Your suggested definition means that unless there was a problem with the connection VDSM will still have this connection active. The engine will have to clean it anyway. The problem is, VDSM has no way of knowing that a client died, forgot or is thinking really hard and will continue on in about 2 minutes. Connections that live until they die are a lifecycle that is hard to define and work with. Solving this problem is theoretically simple.
Have clients hold some sort of session token and force the client to update it at a specified interval. You could bind resources (like domains, VMs, connections) to that session token so when it expires VDSM auto cleans the resources. This kind of mechanism is out of the scope of this API change. Furthermore I think that this mechanism should sit in the engine since the session might actually contain resources from multiple hosts and resources that are not managed by VDSM. In GUI flows specifically the user might do actions that don't even touch the engine and forcing it to refresh the engine token is simpler than having it refresh the VDSM token. I understand that engine currently has no way of tracking a user session. This, as I said, is also true in the case of VDSM. We can start and argue about which project should implement the session semantics. But as I see it it's not relevant to the connection management API. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
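The canonical flow quoted in this message (manage, then try: wait and do stuff, finally: unmanage) can be sketched in Python as follows. `FakeConnectionClient` and its `status()` method are invented for this illustration; the real verbs and their return values are still under discussion in this thread.

```python
import time


class FakeConnectionClient(object):
    """Hypothetical stand-in for a client of the manage/unmanage verbs."""

    def __init__(self, polls_until_connected=2):
        self._polls_left = polls_until_connected
        self.managed = set()

    def manage(self, conn_id):
        self.managed.add(conn_id)

    def unmanage(self, conn_id):
        self.managed.discard(conn_id)

    def status(self, conn_id):
        """Return (connected, last_error) for the connection."""
        self._polls_left -= 1
        return (self._polls_left <= 0, None)


def with_connection(client, conn_id, work, poll_interval=0.0, max_polls=100):
    """Manage a connection, wait for it, run work(), and always unmanage."""
    client.manage(conn_id)
    try:
        for _ in range(max_polls):
            connected, last_error = client.status(conn_id)
            if connected or last_error is not None:
                break
            time.sleep(poll_interval)
        else:
            raise RuntimeError("timed out waiting for %s" % conn_id)
        if last_error is not None:
            raise RuntimeError(last_error)
        return work()
    finally:
        client.unmanage(conn_id)
```

Because the wait happens inside the try block, unmanage runs whether the connection fails, the work raises, or everything succeeds. It does not cover the case raised in the thread where the client itself dies before reaching the finally clause.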
Vdsm sync call agenda items
Hi Ayal, I would like to propose two agenda items for Monday's call: - vdsm testing (in preparation for oVirt Test Day) - my API refactoring patches Hopefully by Monday folks will have had a chance to look at the patches and we can discuss what I have done and the next steps. Thanks. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
Re: API design and plan
On Thu, Dec 08, 2011 at 04:56:17AM -0500, Ayal Baron wrote: - Original Message - On Tue, Dec 06, 2011 at 08:46:57AM -0600, Adam Litke wrote: On Tue, Dec 06, 2011 at 02:58:59PM +0200, Dan Kenigsberg wrote: On Mon, Dec 05, 2011 at 11:34:18AM -0600, Adam Litke wrote: Hi everyone. On today's VDSM call we discussed the requirements, design, and plan for updating the API to include support for QMF and single-host REST API. All members present arrived at a general consensus on the best way to design the next-generation API. I have tried to capture this discussion in the oVirt wiki: http://ovirt.org/wiki/Vdsm_API Please take a look at this page and let's discuss any changes that may be needed in order to adopt it as a working plan that we can begin to execute. Thanks! Very nice, I've fixed two bullets about the future of the xml-rpc. Thanks... Updates look good to me. I think that we are missing something here: how do we model Vdsm-to-Vdsm communication, in a binding-blind way? I'm less worried about the storage-based mailbox used for lvextend requests: my problem is with migration command. Ok, interesting... Besides migration, are there other features (current or planned) that would involve P2P communication? I want to ensure we consider the full problem space. Well, I can imagine we would like a host in distress to migrate VMs to whomever can take them, without central management driving this process. (CAVE split brain) At the moment I cannot think of something that cannot be implemented by QMF events. Ayal? Currently, the implementation of the migrate verb includes contacting the remote Vdsm over xml-rpc before issuing the libvirt migrateToURI2 command ('migrationCreate' verb). A Vdsm user who chooses to use the REST binding is likely to want this to be implemented using a REST request to the destination. This means that the implementation of Vdsm depends on the chosen binding.
The issue can be mitigated by requiring the binding level to provide a callback for migrationCreate (and any other future Vdsm-world requests). This would complicate the beautiful png at http://ovirt.org/wiki/Vdsm_API#Design ... Does anyone have another suggestion? Actually, I think you are blending the external API with vdsm internals. As a management server or ovirt-engine, I don't care about the protocol that vdsm uses to contact the migration recipient. As far as I am concerned this is a special case internal function call. For that purpose, I think xmlrpc is perfectly well-suited to the task and should be used unconditionally, regardless of the bindings used to initiate the migration. So I would propose that we modify the design such that we keep an extremely thin xmlrpc server active whose sole purpose is to service internal P2P requests. Interesting. We could avoid even that, if we could register a callback with libvirt, so that destination libvirtd called destination Vdsm to verify that all storage and networking resources are ready, before executing qemu. DanPB, can something like that be done? (I guess it is not realistic since we may need to pass vdsm-specific data from source to dest, and libvirt is not supposed to be a general purpose transport.) Dan. I don't understand the issue. The whole point of the REST API is to be an easily consumable *single* node management API. Once you start coordinating among different nodes then you need clustering and management (either distributed or centralized), in both cases it is fine to require having a bus in which case you have your method of communications between hosts to replace current xml-rpc. Implicit in this statement is an assertion that live migration between two vdsm instances will not be supported without orchestration from an ovirt-engine instance. I don't agree with placing such a limitation on vdsm since p2p migration is already well-supported by the underlying components (libvirt and qemu).
Requiring an additional xml-rpc server sounds wrong to me. The other option is to support a migrateCreate binding in REST and QMF. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ vdsm-devel mailing list vdsm-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/vdsm-devel
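For readers following the thread, the "binding provides a callback for migrationCreate" idea can be sketched roughly as below. This is only an illustration of the design being discussed, not Vdsm code: the class name MigrationSource, the callback signature, and the start_transfer parameter are all hypothetical, and the real migrate verb would call libvirt where the sketch calls start_transfer.

```python
class MigrationSource(object):
    """Binding-agnostic migration logic (hypothetical sketch).

    The active binding (xml-rpc, REST, QMF, ...) supplies
    create_on_destination, so this class never needs to know which
    transport is used to reach the destination Vdsm.
    """

    def __init__(self, create_on_destination):
        # Assumed callback signature: (dest_host, vm_params) -> bool
        self._create_on_destination = create_on_destination

    def migrate(self, dest_host, vm_params, start_transfer):
        # Step 1: ask the destination to prepare (the 'migrationCreate' step).
        if not self._create_on_destination(dest_host, vm_params):
            raise RuntimeError(
                "destination %s refused migrationCreate" % dest_host)
        # Step 2: hand over to whatever performs the actual transfer.
        # start_transfer is a stand-in for the libvirt migration call.
        return start_transfer(dest_host)
```

Each binding would register its own callback, for example one that issues an xml-rpc call and another that issues a REST request. The alternative proposed in the thread (a thin, always-on xml-rpc server for internal P2P requests) amounts to hard-wiring a single such callback.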
Re: [Engine-devel] API design and plan
On Thu, Dec 08, 2011 at 06:48:53AM +0200, Itamar Heim wrote: On 12/05/2011 07:34 PM, Adam Litke wrote: Hi everyone. On today's VDSM call we discussed the requirements, design, and plan for updating the API to include support for QMF and single-host REST API. All members present arrived at a general consensus on the best way to design the next-generation API. I have tried to capture this discussion in the oVirt wiki: http://ovirt.org/wiki/Vdsm_API Please take a look at this page and let's discuss any changes that may be needed in order to adopt it as a working plan that we can begin to execute. Thanks! as you are going to plan an api... This piece by Geert Jansen summarizes lessons learned from the RHEV-M (ovirt) REST API project https://fedorahosted.org/pipermail/rhevm-api/2011-August/002714.html Thanks for the link! This is proving to be a very insightful read. I am finding that I have come to many of these same conclusions in my own way as I have been designing the API (especially regarding the use of JSON over XML). -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
ovirt-guest-agent memory statistics
To support decisions regarding a host's capacity to run virtual machines, it is useful to have an expanded set of guest memory statistics. These should be collected by the ovirt-guest-agent and made available by the vdsm getVmStats() API. Once this has been done, it will be possible to write a host-side MOM policy for auto-ballooning. The current set of vetted memory stats is published in the virtio specification: http://ozlabs.org/~rusty/virtio-spec/virtio-0.9.3.pdf (Appendix G, page 42)
swap_in - the total number of pages swapped in
swap_out - the total number of pages swapped out
minflt - the total number of minor page faults
majflt - the total number of major page faults
memfree - the amount of memory that is completely unused (in Linux: MemFree)
memtot - the total amount of available memory (in Linux: MemTotal)
In Linux, these values can all be obtained by reading /proc/meminfo and /proc/vmstat. On Windows there is an existing implementation in the virtio balloon driver. How does everyone feel about adding these to the current set of guest stats? -- Adam Litke a...@us.ibm.com IBM Linux Technology Center
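As a rough illustration of the Linux side, the six statistics listed above could be gathered along these lines. This is a minimal sketch, not the ovirt-guest-agent implementation. The function names are hypothetical, and the mapping of the statistics onto the /proc/vmstat counters pswpin, pswpout, pgfault and pgmajfault is an assumption, in particular deriving minflt as pgfault minus pgmajfault. The memory values are returned in kB, as reported by /proc/meminfo.

```python
def parse_counters(text):
    """Parse 'name value' or 'Name:  value kB' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            counters[fields[0].rstrip(':')] = int(fields[1])
        except ValueError:
            continue  # skip lines whose second field is not a number
    return counters


def guest_memory_stats(meminfo_text, vmstat_text):
    """Build the proposed statistics from the contents of the two files."""
    meminfo = parse_counters(meminfo_text)
    vmstat = parse_counters(vmstat_text)
    majflt = vmstat.get('pgmajfault', 0)
    return {
        'swap_in': vmstat.get('pswpin', 0),
        'swap_out': vmstat.get('pswpout', 0),
        # Assumption: pgfault counts all faults, so minor = total - major.
        'minflt': vmstat.get('pgfault', 0) - majflt,
        'majflt': majflt,
        'memfree': meminfo.get('MemFree', 0),   # kB
        'memtot': meminfo.get('MemTotal', 0),   # kB
    }


def read_guest_memory_stats():
    """Read the live values on a Linux guest."""
    with open('/proc/meminfo') as meminfo, open('/proc/vmstat') as vmstat:
        return guest_memory_stats(meminfo.read(), vmstat.read())
```

Keeping the parsing separate from the file reads means the mapping can be checked against canned text without running on a Linux guest.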