Re: [vdsm] pep8 issue

2014-05-20 Thread Adam Litke

On 13/05/14 17:22 -0300, Amador Pahim wrote:

Building vdsm/master in F20, I've got:

./vdsm/virt/migration.py:223:19: E225 missing whitespace around operator

In vdsm/virt/migration.py:

218 e.err = (libvirt.VIR_ERR_OPERATION_ABORTED,  # error code
219          libvirt.VIR_FROM_QEMU,              # error domain
220          'operation aborted',                # error message
221          libvirt.VIR_ERR_WARNING,            # error level
222          '', '', '',                         # str1, str2, str3
223          -1, -1)                             # int1, int2
224 raise e
pep8 is not accepting the negative integer literal; instead, it is
treating the minus sign as a binary operator. A quick workaround is to
change -1 to int(-1). Is this a known issue?


I found this one too and am planning to submit the same workaround.
Actually you can just do (-1) without int().
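For illustration, here is a toy layout that reproduces the complaint; the
values stand in for the libvirt constants in migration.py, and only the
continuation-line alignment matters:

```python
# The pep8 version discussed here misreads the unary minus on an aligned
# continuation line as a binary operator and reports E225.  Parenthesizing
# the literal is enough to silence it:
err = ('operation aborted',  # error message
       '', '', '',           # str1, str2, str3
       (-1), (-1))           # int1, int2 -- was: -1, -1
assert err[-2:] == (-1, -1)
```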

--
Adam Litke
___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] FW: Fwd: Question about MOM

2014-03-26 Thread Adam Litke

On 26/03/14 03:50 -0700, Chegu Vinod wrote:

removing the email alias


Restoring the email alias.  Please keep discussions as public as
possible to allow others to contribute to the design and planning.



Jason.

Please see below...


On 3/26/2014 1:38 AM, Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC) wrote:

Hi All,

Following the discussion below, I got these points:
1. MOM gathering NUMA information (topology, statistics, ...) will change in
future. (one side using the VDSM API, another side using libvirt and system APIs)


I didn't follow your sentence..

Please work with Adam/Martin and provide the needed APIs on the VDSM
side, so that the MOM entity thread can use those APIs to extract what
it needs about NUMA topology and cpu/memory usage. As I see it, this is
probably the only piece that would be relevant to make available at the
earliest (preferably in oVirt 3.5), and it would enable MOM to pursue
next steps as they see fit.


Beyond that, at this point (for oVirt 3.5) let us not spend more
time on MOM internals, please. Let us leave that to Adam and Martin to
pursue as/when they see fit.



2. Martin and Adam will take a look at the MOM policy in the oVirt scheduler
when the NUMA feature is turned on.

Yes please.

3. The oVirt engine will have a NUMA-aware placement algorithm to make VMs
run within NUMA nodes in the best way.


The algorithm here is driven by user-specified pinning requests and/or
by the oVirt scheduler. In the case of a user request (upon approval
from the oVirt scheduler), VDSM/libvirt will be explicitly told what to
do via numatune/cputune etc. In the absence of a user-specified pinning
request, I don't know if the oVirt scheduler intends to convey
numatune/cputune-type requests to libvirt...



4. The oVirt engine will have some algorithm to automatically configure
virtual NUMA when creating a big VM (big memory or many vcpus).


This is a good suggestion but in my view should be taken up after 
oVirt 3.5.

For now just accept and process the user specified requests...

5. Investigate whether KSM and memory ballooning have the right tuning
parameters when the NUMA feature is turned on.

That is for Adam/Martin et al., not for your specific project.

We just need to ensure that they have the basic NUMA info they need
(via the VDSM API I mentioned above), so that they can work on their
part independently as/when they see fit.



6. Investigate whether Automatic NUMA balancing is keeping processes
reasonably balanced, and notify the oVirt engine.

Not sure I follow what you are saying...

Here is what I have in my mind :

Check if the target host has Automatic NUMA balancing enabled (you can
use sysctl -a | grep numa_balancing or a similar underlying mechanism
to determine this). If it's present, then check whether it's enabled (a
value of 1 means enabled, 0 means disabled), and convey this
information to the oVirt engine GUI for display (this is a hint for a
user, if they wish, to skip manual pinning). This in my view is the
minimum at this point (and it would be great if we can make it happen
for oVirt 3.5).
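A minimal Python sketch of that check, reading the procfs file that
sysctl consults (the function name is ours, not an existing vdsm API):

```python
import os

def numa_balancing_state():
    """Return True/False for enabled/disabled, or None when the kernel
    does not expose Automatic NUMA balancing at all."""
    # Same value `sysctl kernel.numa_balancing` reports.
    path = "/proc/sys/kernel/numa_balancing"
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip() == "1"
```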


I think since we have vdsm you can choose to enable autonuma always
(when it is present).  Are there any drawbacks to enabling it always?



We can discuss (at some later point, i.e. post oVirt 3.5) whether we
should really provide a way for the user to disable Automatic NUMA
balancing. Changing the other NUMA balancing tunables is just not going
to happen, as far as I can see at this point (so let us not worry about
that right now).




7. Investigate whether libvirt has any NUMA tuning APIs.

No. There is nothing to investigate here..

IMO.  libvirt should not be playing with the host wide NUMA settings.






Please feel free to correct me if I am missing something.


See above


BTW, I think there is nothing left for us in the oVirt 3.5 release, am I right?


If you are referring to just the MOM stuff then with the exception of 
my comment about having an appropriate API on the VDSM for enabling 
MOM there is nothing else.


Vinod



Best Regards,
Jason Liao

-Original Message-
From: Vinod, Chegu
Sent: March 21, 2014 21:32
To: Adam Litke
Cc: Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC); vdsm-devel; Martin Sivak; 
Gilad Chaplik; Liang, Shang-Chun (David Liang, HPservers-Core-OE-PSC); Shi, 
Xiao-Lei (Bruce, HP Servers-PSC-CQ); Doron Fediuck
Subject: Re: FW: Fwd: Question about MOM

On 3/21/2014 6:13 AM, Adam Litke wrote:

On 20/03/14 18:03 -0700, Chegu Vinod wrote:

On 3/19/2014 11:01 PM, Liao, Chuan (Jason Liao,
HPservers-Core-OE-PSC) wrote:

Add Vinod in this thread.

Best Regards, Jason Liao

-Original Message- From: Adam Litke
[mailto:ali...@redhat.com] Sent: March 19, 2014 21:23 To: Doron Fediuck
Cc: vdsm-devel; Liao, Chuan (Jason Liao, HPservers-Core-OE-PSC);
Martin Sivak; Gilad Chaplik; Liang, Shang-Chun (David Liang,
HPservers-Core-OE-PSC); Shi, Xiao-Lei (Bruce, HP Servers-PSC-CQ)
Subject: Re: Fwd: Question about MOM

On 19/03/14 05:50 -0400, Doron Fediuck wrote

Re: [vdsm] FW: Fwd: Question about MOM

2014-03-21 Thread Adam Litke

On 20/03/14 18:03 -0700, Chegu Vinod wrote:

On 3/19/2014 11:01 PM, Liao, Chuan (Jason Liao,
HPservers-Core-OE-PSC) wrote:

Add Vinod in this thread.

Best Regards, Jason Liao

-Original Message- From: Adam Litke
[mailto:ali...@redhat.com] Sent: March 19, 2014 21:23 To: Doron
Fediuck Cc: vdsm-devel; Liao, Chuan (Jason Liao,
HPservers-Core-OE-PSC); Martin Sivak; Gilad Chaplik; Liang,
Shang-Chun (David Liang, HPservers-Core-OE-PSC); Shi, Xiao-Lei
(Bruce, HP Servers-PSC-CQ) Subject: Re: Fwd: Question about MOM

On 19/03/14 05:50 -0400, Doron Fediuck wrote:

Moving this to the vdsm list.

- Forwarded Message - From: Chuan Liao (Jason Liao,
HPservers-Core-OE-PSC) chuan.l...@hp.com To: Martin Sivak
msi...@redhat.com, ali...@redhat.com, Doron Fediuck
dfedi...@redhat.com, Gilad Chaplik gchap...@redhat.com Cc:
Shang-Chun Liang (David Liang, HPservers-Core-OE-PSC)
shangchun.li...@hp.com, Xiao-Lei Shi (Bruce, HP
Servers-PSC-CQ) xiao-lei@hp.com Sent: Wednesday, March 19,
2014 11:28:01 AM Subject: Question about MOM

Hi All,

I am new to the MOM feature.

In my understanding, MOM is the collector for both host and guest
statistics, and it sets the right policy for KSM and memory ballooning
to get better performance.

Yes this is correct.  In oVirt, MOM runs as another vdsm thread and
uses the vdsm API to collect host and guest statistics.  Those
statistics are fed into a policy file which can create some outputs
(such as ksm tuning parameters and guest balloon sizes).  MOM then
uses the vdsm API to apply those outputs to the system.



Ok..Understood about the statistics gathering part and then
initiating policy driven inputs for the ksm and balloning on the host
etc.

Perhaps this was already discussed earlier ? Does the MOM thread in
vdsm intend to gather the NUMA topology of the host from the VDSM
(using some new TBD or some enhanced existing API) or does it intend
to collect this directly from the host using libvirt/libnuma etc ?


When MOM is using the VDSM HypervisorInterface, it must get all of its
information from vdsm.  It is considered an API layering violation for
MOM to access the system or libvirt connection directly.  When running
with the Libvirt HypervisorInterface, it should use libvirt and the
system directly as necessary.  Your new features should consider this
and make use of the HypervisorInterface abstraction to provide both
implementations.
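A rough sketch of that layering rule (hypothetical method names; MOM's
real HypervisorInterface differs):

```python
from abc import ABC, abstractmethod

class HypervisorInterface(ABC):
    """Abstraction MOM programs against; new features should work
    through this interface, never around it."""
    @abstractmethod
    def getHostStats(self):
        ...

class VdsmInterface(HypervisorInterface):
    """Must get ALL information through the vdsm API -- touching
    libvirt or the system directly would be a layering violation."""
    def __init__(self, vdsm_api):
        self._api = vdsm_api

    def getHostStats(self):
        return self._api.getVdsStats()

class LibvirtInterface(HypervisorInterface):
    """Free to query libvirt / the system directly as necessary."""
    def __init__(self, conn):
        self._conn = conn

    def getHostStats(self):
        return {"cpuMap": self._conn.getCPUMap()}
```

A NUMA feature would then add, say, a getNumaStats() method to the
interface and implement it once per backend.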


I am not sure how it relates to NUMA; can anyone explain it to me?


Jason, Here is my understanding (and I believe I am just
paraphrasing/echoing Adam's comments ).

MOM's NUMA related enhancements are independent of what the oVirt
UI/oVirt scheduler does.

It is likely that MOM's vdsm thread may choose to extract information
about NUMA topology (includes dynamic stuff like cpu usage or free
memory) from the VDSM (i.e. if they choose to not get it directly
from libvirt/libnuma or /proc etc).

How MOM interprets that NUMA information along with other statistics
that it gathers (along side with user requested SLA requirements for
each guest etc) should be left to MOM to decide and direct
KSM/ballooning related actions. I don't believe we need to intervene
in the MOM related internals.


Once we decide to have NUMA-aware MOM policies there will need to be
some infrastructure enhancements to enable it.  I think Martin and I
will take the lead on it since we have been thinking about these kinds
of issues for some time now.


I guess we need to start by examining the currently planned use
cases.  Please feel free to correct me if I am missing something or
over-simplifying something: 1) NUMA-aware placement - Try to
schedule VMs to run on hosts where the guest will not have to span
multiple NUMA nodes.


I guess you are referring to the case where the user (and/or the
oVirt scheduler) has not explicitly directed libvirt on the host to
schedule the VM in some specific way... In those cases the decision
is left to the smarts of the host OS scheduler to take care of it
(that includes the future/smarter Automatic NUMA balancing enabled
scheduler).


Yes.  For this one, we need a numa-aware placement algorithm on
engine, and the autonuma feature available and configured on all virt
hosts.  In the first phase I don't anticipate any changes to MOM
internals.  I would prefer to observe the performance characteristics
of this and tweak MOM in the future to address actual performance
problems we see.


  2) Virtual NUMA topology - Emulate a NUMA topology inside the VM.


Yes. Irrespective of any NUMA pinning specified for the backing
resources of a guest, when the guest size increases it is a required
practice to have a virtual NUMA topology enabled. (This helps the OS
running inside the guest to scale/perform much better by making
NUMA-aware decisions etc. It also helps the applications running in
that OS to scale/perform better.)


Agreed.  One point I might make then... Should the VM creation process
on engine automatically configure virtual NUMA (even if the user
doesn't

Re: [vdsm] Fwd: Question about MOM

2014-03-19 Thread Adam Litke

On 19/03/14 05:50 -0400, Doron Fediuck wrote:

Moving this to the vdsm list.

- Forwarded Message -
From: Chuan Liao (Jason Liao, HPservers-Core-OE-PSC) chuan.l...@hp.com
To: Martin Sivak msi...@redhat.com, ali...@redhat.com, Doron Fediuck 
dfedi...@redhat.com, Gilad Chaplik gchap...@redhat.com
Cc: Shang-Chun Liang (David Liang, HPservers-Core-OE-PSC) shangchun.li...@hp.com, 
Xiao-Lei Shi (Bruce, HP Servers-PSC-CQ) xiao-lei@hp.com
Sent: Wednesday, March 19, 2014 11:28:01 AM
Subject: Question about MOM

Hi All,

I am new to the MOM feature.

In my understanding, MOM is the collector for both host and guest
statistics, and it sets the right policy for KSM and memory ballooning
to get better performance.


Yes this is correct.  In oVirt, MOM runs as another vdsm thread and
uses the vdsm API to collect host and guest statistics.  Those
statistics are fed into a policy file which can create some outputs
(such as ksm tuning parameters and guest balloon sizes).  MOM then
uses the vdsm API to apply those outputs to the system.
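That collect/evaluate/apply cycle can be sketched as follows (toy names
and a toy policy rule; the real MOM policy engine evaluates a policy
file, not a Python function):

```python
class StubHypervisor:
    """Stand-in for MOM's vdsm-backed hypervisor interface."""
    def __init__(self):
        self.applied = None

    def collect_stats(self):
        # Host and guest statistics, gathered via the vdsm API.
        return {"host_free_mem": 512, "guest_mem": {"vm1": 1024}}

    def apply(self, outputs):
        # Outputs (ksm tunables, balloon sizes) pushed back via vdsm.
        self.applied = outputs

def simple_policy(stats):
    # Toy rule: shrink guest balloons by 10% when host memory is low.
    low = stats["host_free_mem"] < 1024
    return {"balloon": {vm: int(m * 0.9) if low else m
                        for vm, m in stats["guest_mem"].items()}}

def tick(hypervisor, policy):
    stats = hypervisor.collect_stats()   # collect statistics
    outputs = policy(stats)              # feed them into the policy
    hypervisor.apply(outputs)            # apply the outputs
    return outputs
```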


I am not sure how it relates to NUMA; can anyone explain it to me?


I guess we need to start by examining the currently planned use cases.
Please feel free to correct me if I am missing something or
over-simplifying something:
 1) NUMA-aware placement - Try to schedule VMs to run on hosts where
the guest will not have to span multiple NUMA nodes.
 2) Virtual NUMA topology - Emulate a NUMA topology inside the VM.

These two use cases are intertwined because VMs with NUMA can be
scheduled with more flexibility (albeit with more sophistication)
since the scheduler can fit the VM onto hosts where the memory can be
split across multiple Host NUMA nodes.

 3) Manual NUMA pinning - Allow advanced admins to schedule a VM to
run on a specific host with a manual pinning strategy.

Most of these use cases involve the engine scheduler and engine UI.
There is not much for MOM to do to support their direct
implementation.  We should focus on managing interactions with other
SLA features that MOM does implement:
 - How should KSM be adjusted when NUMA is in effect?  In a NUMA
   host, are there numa-aware KSM tunables that we should use?
 - When ballooning VMs, should we take into account how much memory
   we need to reclaim from VMs on a node by node basis?

Lastly, let's see if MOM needs to manage the existing NUMA utilities
in place on the system.  I don't know much about AutoNUMA.  Does it
have tunables that should be adjusted or is it completely autonomous?
Does libvirt have any NUMA tuning APIs that MOM may want to call to
enhance performance in certain situations?

One of the main questions I ask when trying to decide if MOM should
manage a particular setting is: Is this something that is set once
and stays the same or is it something that must change dynamically in
accordance with current system conditions?  In the former case, it is
probably best managed by engine or vdsm directly.  In the latter case,
it fits the MOM model.

Hope this was helpful!  Please feel free to continue engaging this
list with any additional questions that you might have.


On the engine side, there is only one button for this feature: Sync MoM
Policy, right?

On the vdsm side, I saw that momIF handles this, right?

Best Regards, Jason Liao



--
Adam Litke


Re: [vdsm] mom RPMs for 3.4

2014-02-03 Thread Adam Litke

On 01/02/14 22:48 +, Dan Kenigsberg wrote:

On Fri, Jan 31, 2014 at 04:56:12PM -0500, Adam Litke wrote:

On 31/01/14 08:36 +0100, Sandro Bonazzola wrote:
On 30/01/2014 19:30, Adam Litke wrote:
On 30/01/14 18:13 +, Dan Kenigsberg wrote:
On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote:
Hi Sandro,

After updating the MOM project's build system, I have used jenkins to
produce a set of RPMs that I would like to tag into the oVirt 3.4
release.  Please see the jenkins job [1] for the relevant artifacts
for EL6[2], F19[3], and F20[4].

Dan, should I submit a patch to vdsm to make it require mom = 0.4.0?
I want to be careful to not break people's environments this late in
the 3.4 release cycle.  What is the best way to minimize that damage?

Hey, we're in beta. I prefer making this requirement explicit now
over having users' supervdsmd.log rotate due to log spam.

In that case, Sandro, can you let me know when those RPMs hit the
ovirt repos (for master and 3.4) and then I will submit a patch to
vdsm to require the new version.


mom 0.4.0 was built in last night's nightly job [1] and published by the
publisher job [2], so it's already available in the nightly repo [3].

For 3.4.0, a beta 2 release has been planned [4] for 2014-02-06, so we'll
include your builds in that release.

I presume the scripting for the 3.4 release rpms will produce a version
without the git-rev based suffix, i.e. mom-0.4.0-1.rpm?

I need to figure out how to handle a problem that might be a bit
unique to mom.  MOM is used by non-oVirt users who install it from the
main Fedora repository.  I think it's fine that we are producing our
own rpms in oVirt (that may have additional patches applied and may
resync to upstream mom code more frequently than would be desired for
the main Fedora repository).  Given this, I think it makes sense to
tag the oVirt RPMs with a special version suffix to indicate that
these are oVirt produced and not upstream Fedora.

For example:
The next Fedora update will be mom-0.4.0-1.f20.rpm.
The next oVirt update will be mom-0.4.0-1ovirt.f20.rpm.

Is this the best practice for accomplishing my goals?  One other thing
I'd like to have the option of doing is to make vdsm depend on an
ovirt distribution of mom so that the upstream Fedora version will not
satisfy the dependency for vdsm.


What is the motivation for this? You would not like to bother Fedora
users with updates that are required only for oVirt?


Yes, that was my thinking.  It seems that oVirt requires updates more
frequently than users that use MOM with libvirt directly and the
Fedora update process is a bit more heavy than oVirt's at the moment.


Vdsm itself is built, signed, and distributed via Fedora. It is also
copied into the ovirt repo, for completeness' sake. Could MoM do the
same?


If vdsm is finding this to work well then surely I can do the same
with MOM.  The 0.4.0 build is in updates-testing right now and should
be able to be tagged stable in a day or two.



Re: [vdsm] mom RPMs for 3.4

2014-01-31 Thread Adam Litke

On 31/01/14 08:36 +0100, Sandro Bonazzola wrote:

On 30/01/2014 19:30, Adam Litke wrote:

On 30/01/14 18:13 +, Dan Kenigsberg wrote:

On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote:

Hi Sandro,

After updating the MOM project's build system, I have used jenkins to
produce a set of RPMs that I would like to tag into the oVirt 3.4
release.  Please see the jenkins job [1] for the relevant artifacts
for EL6[2], F19[3], and F20[4].

Dan, should I submit a patch to vdsm to make it require mom = 0.4.0?
I want to be careful to not break people's environments this late in
the 3.4 release cycle.  What is the best way to minimize that damage?


Hey, we're in beta. I prefer making this requirement explicit now
over having users' supervdsmd.log rotate due to log spam.


In that case, Sandro, can you let me know when those RPMs hit the
ovirt repos (for master and 3.4) and then I will submit a patch to
vdsm to require the new version.



mom 0.4.0 was built in last night's nightly job [1] and published by the
publisher job [2], so it's already available in the nightly repo [3].

For 3.4.0, a beta 2 release has been planned [4] for 2014-02-06, so we'll
include your builds in that release.


I presume the scripting for the 3.4 release rpms will produce a version
without the git-rev based suffix, i.e. mom-0.4.0-1.rpm?

I need to figure out how to handle a problem that might be a bit
unique to mom.  MOM is used by non-oVirt users who install it from the
main Fedora repository.  I think it's fine that we are producing our
own rpms in oVirt (that may have additional patches applied and may
resync to upstream mom code more frequently than would be desired for
the main Fedora repository).  Given this, I think it makes sense to
tag the oVirt RPMs with a special version suffix to indicate that
these are oVirt produced and not upstream Fedora.

For example:
The next Fedora update will be mom-0.4.0-1.f20.rpm.
The next oVirt update will be mom-0.4.0-1ovirt.f20.rpm.

Is this the best practice for accomplishing my goals?  One other thing
I'd like to have the option of doing is to make vdsm depend on an
ovirt distribution of mom so that the upstream Fedora version will not
satisfy the dependency for vdsm.

Thoughts?
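One way to express such a suffix, assuming a conventional RPM spec (the
tag values here are illustrative, not the actual mom spec):

```spec
Version: 0.4.0
Release: 1.ovirt%{?dist}
# Produces mom-0.4.0-1.ovirt.fc20.noarch.rpm on F20.  rpm's version
# comparison treats the extra .ovirt segment as newer than a bare
# Release of 1%{?dist}, so the oVirt build supersedes the Fedora one.
```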


[vdsm] mom RPMs for 3.4

2014-01-30 Thread Adam Litke

Hi Sandro,

After updating the MOM project's build system, I have used jenkins to
produce a set of RPMs that I would like to tag into the oVirt 3.4
release.  Please see the jenkins job [1] for the relevant artifacts
for EL6[2], F19[3], and F20[4].

Dan, should I submit a patch to vdsm to make it require mom = 0.4.0?
I want to be careful to not break people's environments this late in
the 3.4 release cycle.  What is the best way to minimize that damage?

[1] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/
[2] 
http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=centos6-host/artifact/exported-artifacts/mom-0.4.0-1.el6.noarch.rpm
[3] 
http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=fedora19-host/artifact/exported-artifacts/mom-0.4.0-1.fc19.noarch.rpm
[4] 
http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=fedora20-host/artifact/exported-artifacts/mom-0.4.0-1.fc20.noarch.rpm


Re: [vdsm] [Engine-devel] Copy reviewer scores on trivial rebase/commit msg changes

2014-01-21 Thread Adam Litke

On 18/01/14 01:48 +0200, Itamar Heim wrote:

I'd like to enable these - comments welcome:

1. label.Label-Name.copyAllScoresOnTrivialRebase

If true, all scores for the label are copied forward when a new patch 
set is uploaded that is a trivial rebase. A new patch set is 
considered as trivial rebase if the commit message is the same as in 
the previous patch set and if it has the same code delta as the 
previous patch set. This is the case if the change was rebased onto a 
different parent. This can be used to enable sticky approvals, 
reducing turn-around for trivial rebases prior to submitting a change. 
Defaults to false.



2. label.Label-Name.copyAllScoresIfNoCodeChange

If true, all scores for the label are copied forward when a new patch 
set is uploaded that has the same parent commit as the previous patch 
set and the same code delta as the previous patch set. This means only 
the commit message is different. This can be used to enable sticky 
approvals on labels that only depend on the code, reducing turn-around 
if only the commit message is changed prior to submitting a change. 
Defaults to false.
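For reference, in Gerrit these flags are set per label in the project's
project.config; a sketch, assuming the Code-Review label is the one
being configured:

```ini
[label "Code-Review"]
    copyAllScoresOnTrivialRebase = true
    copyAllScoresIfNoCodeChange = true
```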


I am a bit late to the party, but +1 from me for trying both.  I guess
it will be quite rare that something bad happens here; so unlikely, in
fact, that the time saved on all the previous patches will far offset
the time lost fixing the corner cases.


Re: [vdsm] oVirt 3.4.0 alpha repository closure failure

2014-01-10 Thread Adam Litke

On 10/01/14 10:01 +, Dan Kenigsberg wrote:

On Fri, Jan 10, 2014 at 08:48:52AM +0100, Sandro Bonazzola wrote:

Hi,
oVirt 3.4.0 alpha repository has been composed but alpha has not been announced 
due to repository closure failures:

on CentOS 6.5:

# repoclosure -r ovirt-3.4.0-alpha -l ovirt-3.3.2 -l base -l epel -l 
glusterfs-epel -l updates -l extra -l glusterfs-noarch-epel -l ovirt-stable -n
Reading in repository metadata - please wait
Checking Dependencies
Repos looked at: 8
   base
   epel
   glusterfs-epel
   glusterfs-noarch-epel
   ovirt-3.3.2
   ovirt-3.4.0-alpha
   ovirt-stable
   updates
Num Packages in Repos: 16581
package: mom-0.3.2-20140101.git2691f25.el6.noarch from ovirt-3.4.0-alpha
  unresolved deps:
 procps-ng


Adam, this seems like a real bug in http://gerrit.ovirt.org/#/c/22087/ :
el6 still carries the older procps (which is, btw, provided by
procps-ng).


Done.
http://gerrit.ovirt.org/23137





package: vdsm-hook-vhostmd-4.14.0-1.git6fdd55f.el6.noarch from ovirt-3.4.0-alpha
  unresolved deps:
 vhostmd


Douglas, could you add a with_vhostmd option to the spec, and have it
default to 0 on el*, and to 1 on fedoras?

Thanks,
Dan.



Re: [vdsm] Smarter network_setup hooks

2014-01-03 Thread Adam Litke

On 03/01/14 12:20 +, Dan Kenigsberg wrote:

Recently, Miguel Angel Ajo (CCed) has added a nice functionality to the
implementation of setupNetworks in Vdsm: two hook points where added:
before and after the setupNetworks verb takes place.

This is useful because sometimes, Vdsm's configuration is not good
enough for the user. For example, someone may need to set various
ETHTOOL_OPTS on a nic. Now, they can put a script under
/usr/libexec/vdsm/after_network_setup/ that tweaks their ifcfg-eth*
files after they have been written by Vdsm.

However, the hook script only knows that *a* change of network
configuration took place. It does not know which change took place, and
has to figure this out on its own.

Enter http://gerrit.ovirt.org/20330, which allows hooks to be passed
dictionaries in JSON format.

I'd like to discuss it here, as it introduces a new Vdsm/Hook API that
is quite different than what we have for other hooks. Unlike with Vm
and VmDevice creation, where Vdsm uses libvirt's xml definition
internally as well as to communicate with the hooks,
before/after_network_setup have to define their own means of
communication.

I would like to suggest to use the same information passed on the
Engine/Vdsm API, and extend its reach into the hook script. The three
arguments to setupNetworks(networks, bondings, options) would be dumped
as json strings, to be read by the hook script.

This option is very simple to use and implement, it gives the hook all
the information that Vdsm-proper has, and allows for greatest
flexibility for hook writers. This is also the down side of this idea:
hook script may do all kinds of things with this information, some of
them unsupportable, and they should be notified when Engine/Vdsm API
changes.

In my opinion, it is a small price to pay: hooks have always had the
China Store Rule - if you break something, you own it. Hook users must
know what they're doing, and take care not to use deprecated bits of the
API.

What is your opinion? Comments and suggestions are most welcome!


Seems like a logical thing to do.  What specific mechanism do you
suggest for passing the JSON strings to the hook script?  If passed as
arguments to the hook script we would need to consider shell escaping
and argv length restrictions.

What about writing these out to a special file and adding a new
getContext() call to the hooking module.  A script that is unconcerned
with the context would not require any changes.  But a script that
wants access would simply do:

   ctx = hooking.getContext()

and ctx would be the contents of the special file already decoded into
a native Python object for easy consumption.  This could easily be
extended to any hook which may want to provide some context to
implementors.
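A sketch of what such a helper could look like (the context file path
and its delivery via an environment variable are assumptions, not
existing vdsm behaviour):

```python
import json
import os

def getContext():
    """Return the setupNetworks context (networks, bondings, options)
    decoded into native Python objects."""
    # Hypothetical location; vdsm would write this file before
    # running the hook scripts.
    path = os.environ.get("_hook_json_context",
                          "/var/run/vdsm/hook-context.json")
    with open(path) as f:
        return json.load(f)
```

A hook script that is unconcerned with the context simply never calls it.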

One more question comes to mind:  Are there any pieces of information
that we would need to redact from the context (passwords or other
sensitive information)?




Re: [vdsm] VDSM - top 10 with patches with no activity for more than 30 days

2013-03-05 Thread Adam Litke
On Thu, 2013-02-28 at 12:51 -0500, Doron Fediuck wrote:
 - Original Message -
  From: Itamar Heim ih...@redhat.com
  To: vdsm-devel@lists.fedorahosted.org
  Sent: Wednesday, February 20, 2013 5:39:21 PM
  Subject: [vdsm] VDSM - top 10 with patches with no activity for more than 
  30days
  
  thoughts on how to trim these?
  (in openstack gerrit they auto-abandon patches with no activity for a
  couple of weeks - author can revive them back when they are relevant)
  
preferred_email | count
+--
fsimo...@redhat.com | 34
smizr...@redhat.com | 23
lvro...@linux.vnet.ibm.com  | 13
ewars...@redhat.com | 12
wu...@linux.vnet.ibm.com| 12
x...@linux.vnet.ibm.com | 11
shao...@linux.vnet.ibm.com  | 6
li...@linux.vnet.ibm.com| 6
zhshz...@linux.vnet.ibm.com | 6
shum...@linux.vnet.ibm.com  | 5
  ___
 
 Review day? Anyone thinks a monthly review day will
 help?

We've discussed this in the past and part of the reason for the backlog
is that folks like Saggi and Federico like to use gerrit to store
work-in-progress patches that don't need review.  They may not be
working on those patches at the moment but want them in gerrit for them
to come back to.  If we want to allow this use of gerrit then we will
always have some stale patches lying around.

-- 
Adam Litke a...@linux.vnet.ibm.com
IBM Linux Technology Center



Re: [vdsm] [yajsonrpc]questions about json rpc

2013-02-25 Thread Adam Litke
On Thu, Feb 21, 2013 at 06:10:35PM +0800, ShaoHe Feng wrote:
 Hi, Adam
 An error arises when I call the JSON-RPC server via AsyncoreReactor,
 while I can call it successfully with a simple TCPReactor written by
 myself. How can I call the JSON-RPC server via AsyncoreReactor
 correctly?
 
  >>> address = ("127.0.0.1", 4044)
  >>> clientsReactor = asyncoreReactor.AsyncoreReactor()
  >>> reactor = TestClientWrapper(clientsReactor.createClient(address))
  >>> jsonAPI = JsonRpcClient(reactor)
  >>> jsonAPI.connect()
  >>> jsonAPI.callMethod("Host.ping", [], 1, 10)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib64/python2.7/site-packages/yajsonrpc/client.py", line 39, in callMethod
      resp = self._transport.recv(timeout=timeout)
    File "/usr/share/vdsm/tests/jsonRpcUtils.py", line 100, in recv
      return self._queue.get(timeout=timeout)[1]
    File "/usr/lib64/python2.7/Queue.py", line 176, in get
      raise Empty
  Queue.Empty

Sheldon,

You and I resolved this problem but I will answer it here as well for the
benefit of everyone.

When using the Asyncore framework, there is a reactor on the server but also on
the client.  Asyncore is multi-threaded and an event loop must be started for
the client reactor in order to process the server responses.  See
tests/jsonRpcUtils.py:43 for the call to initialize the event loop thread in the
client reactor.
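A minimal model of that fix (this is not the yajsonrpc API, just the
principle): the server's reply only lands in the client's queue if
something is running the client-side event loop.

```python
import queue
import threading
import time

responses = queue.Queue()

def client_event_loop():
    # Stand-in for the client reactor's thread processing server
    # responses; without it, get() below raises queue.Empty on timeout.
    time.sleep(0.05)
    responses.put((None, {"result": True}))  # simulated Host.ping reply

threading.Thread(target=client_event_loop, daemon=True).start()
resp = responses.get(timeout=1)[1]
```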

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] VDSM Repository Reorganization

2013-02-25 Thread Adam Litke
On Tue, Feb 19, 2013 at 03:53:46PM -0500, Saggi Mizrahi wrote:
  I'm not sure what's the purpose of having different versions of the
  client/server on the same machine. The software repository is one and
  it should provide both (as they're built from the same source).
  This is the standard way of delivering client/server applications in
  all the distributions. We can change that but we must have a good
  reason.
 There isn't really a reason. But, as I said, you don't want them to
 depend on each other or have the schema in its own rpm.
 This means that you have to distribute them separately.
 
 I also want to allow to update the client on a host without updating
 the server. This is because you may want to have a script that works
 across the cluster but not update all the hosts.
 
 Now, even though you will use only old methods, the schema itself
 might become unparsable by old implementations.

This should never happen.  Right now each symbol in the schema is represented
by a single OrderedDict, and the parsing code just loads the schema file into
a list of these dicts.  Once loaded, the vdsmapi module categorizes symbols
according to the top-level keys.  Unrecognized symbol types are simply skipped.
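That skip-unknown behaviour can be sketched as follows (hypothetical
key names; the real vdsmapi module differs):

```python
def categorize(symbols):
    """Group schema symbols by their top-level key, silently skipping
    symbol types this implementation does not recognize."""
    known = {"class", "command", "type", "enum"}
    by_kind = {}
    for sym in symbols:          # each symbol: a dict with one top-level key
        kind = next(iter(sym))
        if kind not in known:
            continue             # unrecognized symbol types are skipped
        by_kind.setdefault(kind, []).append(sym)
    return by_kind
```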

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] VDSM Repository Reorganization

2013-02-25 Thread Adam Litke
In order to make progress on the file reorg, I want to summarize the discussion
and propose that a consensus has been reached regarding placement of the schema
file.

The current code has a routine find_schema() that can locate the schema file in
the development source tree or in an installed location.  Therefore, it only
needs to appear in the source tree in a single location and we will not need any
symlinks for this purpose.  Recently, the API handling code (schema and parsing
module) have been split into their own rpm.  This should solve the installation
problem since any entity that needs access to the schema and parser should
simply install the vdsm-api rpm.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] May I apply for a user account on jenkins.ovirt.org to run VDSM functional tests?

2013-01-30 Thread Adam Litke
On Tue, Jan 29, 2013 at 12:21:46PM +0100, Ewoud Kohl van Wijngaarden wrote:
 On Tue, Jan 29, 2013 at 06:15:08AM -0500, Eyal Edri wrote:
  - Original Message -
   From: Zhou Zheng Sheng zhshz...@linux.vnet.ibm.com
   To: in...@ovirt.org
   Cc: ShaoHe Feng shao...@linux.vnet.ibm.com
   Sent: Tuesday, January 29, 2013 12:24:27 PM
   Subject: May I apply for a user account on jenkins.ovirt.org to run VDSM  
   functional tests?
  
   Hi all,
  
   I notice there is no VDSM functional tests running in oVirt Jenkins.
   Currently in VDSM we have some XML-RPC functional test cases for
   iSCSI,
   localfs and glusterfs storage as well as creating and destroying VMs
   on
   those storage. Functional tests through JSON-RPC are under review. I
   also submit a patch to Gerrit for running the tests easily
   (http://gerrit.ovirt.org/#/c/11238/). More test cases will be added
   to
   improve test coverage and reduce the chance of regression.
  
   Some bugs that can not be covered by unit test can be caught by
   functional tests. I think it would be helpful to run these functional
   tests continuously. We can also configure the Gerrit trigger in
   Jenkins
   to run functional tests when someone verifies the patch or when it
   gets
   approved but not merged. This may be helpful to the maintainer.
  
   I've setup a Jenkins job for VDSM functional tests in my lab server.
   You
   can refer to the job configuration of my current setup
   (https://github.com/edwardbadboy/vdsm-jenkins/blob/master/config.xml).
   After my patch in Gerrit is accepted, the job configuration will be
   simpler and the hacks can be removed. May I apply a user account for
   creating job in the oVirt Jenkins?
  
 
  Hi Zhou,
  Basically there shouldn't be any problem with that.
  we have an option for giving a 'power-user' permissions for certain
  users on oVirt misc projects to add and configure jobs for their
  project.
 
  it requires knowledge in jenkins, which it seems that you have and
  recognition from the team/other developers from the relevant project
  (in this case, VDSM) that you are an active member of the project.
  (just a formality essentially)
 
  I've added engine-devel list to this thread so anyone from vdsm team
  can vote +1 for adding you as a power user for jenkins.
 
  once we receive a few +1s and no objections I'll create a user for
  you and send you the details.
 
 
 I think vdsm-devel is more relevant here.

Also a big +1 from me.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] [Engine-devel] RFC: New Storage API

2013-01-22 Thread Adam Litke
On Tue, Jan 22, 2013 at 11:36:57PM +0800, Shu Ming wrote:
 2013-1-15 5:34, Ayal Baron:
 image and volume are overused everywhere and it would be extremely confusing 
 to have multiple meanings to the same terms in the same system (we have 
 image today which means virtual disk and volume which means a part of a 
 virtual disk).
 Personally I don't like the distinction between image and volume done in 
 ec2/openstack/etc seeing as they're treated as different types of entities 
 there while the only real difference is mutability (images are read-only, 
 volumes are read-write).
 To move to the industry terminology we would need to first change all 
 references we have today to image and volume in the system (I would say also 
 in ovirt-engine side) to align with the new meaning.
 Despite my personal dislike of the terms, I definitely see the value in 
 converging on the same terminology as the rest of the industry but to do so 
 would be an arduous task which is out of scope of this discussion imo 
 (patches welcome though ;)
 
 Another distinction between OpenStack and oVirt is how
 Nova/ovirt-engine look upon storage systems. In OpenStack, a
 stand-alone storage service (Cinder) exports the raw storage block
 device to Nova. On the other hand, in oVirt the storage system is
 tightly bound to the cluster scheduling system, which integrates the
 storage sub-system, the VM dispatching sub-system and the ISO image
 sub-system. This combination makes the sub-systems form a whole that
 is easy to deploy, but it also makes them more opaque and harder to
 reuse and maintain. This new storage API proposal gives us an
 opportunity to separate these sub-systems into new components that
 export better, loosely coupled APIs to VDSM.

A very good point and an important goal in my opinion.  I'd like to see
ovirt-engine become more of a GUI for configuring the storage component (like it
does for Gluster) rather than the centralized manager of storage.  The clustered
storage should be able to take care of itself as long as the peer hosts can
negotiate the SDM role.  

It would be cool if someone could actually dedicate a non-virtualization host
whose only job is to handle SDM operations.  Such a host could choose to
only deploy the standalone HSM service and not the complete vdsm package.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] API Documentation & Since tag

2013-01-15 Thread Adam Litke
On Mon, Jan 14, 2013 at 05:45:45PM -0500, Saggi Mizrahi wrote:
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: vdsm-devel@lists.fedorahosted.org, Vinzenz Feenstra 
  vfeen...@redhat.com
  Sent: Monday, January 14, 2013 5:21:41 PM
  Subject: Re: [vdsm] API Documentation & Since tag
  
  On Mon, Jan 14, 2013 at 12:37:57PM -0500, Saggi Mizrahi wrote:
   
   
   - Original Message -
From: Adam Litke a...@us.ibm.com
To: Vinzenz Feenstra vfeen...@redhat.com
Cc: vdsm-devel@lists.fedorahosted.org
Sent: Friday, January 11, 2013 9:03:19 AM
Subject: Re: [vdsm] API Documentation & Since tag

On Fri, Jan 11, 2013 at 10:19:45AM +0100, Vinzenz Feenstra wrote:
 Hi everyone,
 
 We are currently documenting the API in vdsmapi-schema.json
 I noticed that we have there documented when a certain element
 newly
 is introduced using the 'Since' tag.
 However I also noticed that we are not documenting when a field
 was
 newly added, nor do we update the 'since' tag.
 
 We should start documenting in what version we've introduced a
 field.
 A suggestion by saggi was to add to the comment for example:
 @since: 4.10.3
 
 What is your point of view on this?

I do think it's a good idea to add this information.  How about
supporting
multiple Since lines in the comment like the following made up
example:

##
# @FenceNodePowerStatus:
#
# Indicates the power state of a remote host.
#
# @on:The remote host is powered on
#
# @off:   The remote host is powered off
#
# @unknown:   The power status is not known
#
# @sentient:  The host is alive and powered by its own metabolism
#
# Since: 4.10.0 - @FenceNodePowerStatus
# Since: 10.2.0 - @sentient
##
   I don't like the fact that both lines don't point to the same type
   of token.
   I also don't like that it's a repeat of the type names and field
   names.
   
   I prefer Vinzenz original suggestion (on IRC) of moving the Since
   token up and then
   have it be a state.  It also makes discerning what entities you can
   use up to a
   certain version easier if you make sure to keep them sorted.
   
   We can do this because the order of the fields and availability is
   undetermined (unlike real structs).
  
  That is not correct.  These structures are parsed into an OrderedDict
  and the
  ordering is important (especially for languages like C which might
  use real
  structs).
 The wire format, JSON, ignores the ordering. Furthermore, for
 languages like C we can't use actual structs, because then we would
 have to bump a major version every time we add a field, as
 sizeof(struct Foo) changes.
  
   
   ##
   # @FenceNodePowerStatus:
   #
   # Indicates the power state of a remote host.
   #
   # Since: 4.10.0
   #
   # @on:The remote host is powered on
   #
   # @off:   The remote host is powered off
   #
   # @unknown:   The power status is not known
   #
   # Since: 10.2.0
   #
   # @sentient:  The host is alive and powered by its own metabolism
   #
   ##
   
   The problem though is that it makes since a property of the fields
   and not of
   the struct. This isn't that much of a problem as we can assume the
   earliest
   version is the time when the struct was introduced.
  
  I don't like this any better than my suggestion.  Aside from the fact
  that field
  ordering is important (in the data structure itself), this spreads
  the since
  information throughout the comment rather than concentrating it in a
  single
  place.
 
 Well, thinking about it, I don't understand why structs need to have a
 Since property anyway. Only verbs should have it. Structs are
 available (by inference) since the earliest call that produces them.
 
 All fields in a struct are optional anyway. Old versions wouldn't try
 and access them, new clients should always assume these fields may
 not be returned anyway.

All _newly_added_ fields must be optional.  Fields that are part of the original
definition of the type can be required fields.  This reminds me that we will
need to audit the schema for fields that can be made optional.  For example,
when creating Vm*Device objects, the VmDeviceAddress member can be omitted.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] API Documentation & Since tag

2013-01-14 Thread Adam Litke
On Mon, Jan 14, 2013 at 12:37:57PM -0500, Saggi Mizrahi wrote:
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Vinzenz Feenstra vfeen...@redhat.com
  Cc: vdsm-devel@lists.fedorahosted.org
  Sent: Friday, January 11, 2013 9:03:19 AM
  Subject: Re: [vdsm] API Documentation & Since tag
  
  On Fri, Jan 11, 2013 at 10:19:45AM +0100, Vinzenz Feenstra wrote:
   Hi everyone,
   
   We are currently documenting the API in vdsmapi-schema.json
   I noticed that we have there documented when a certain element
   newly
   is introduced using the 'Since' tag.
   However I also noticed that we are not documenting when a field was
   newly added, nor do we update the 'since' tag.
   
   We should start documenting in what version we've introduced a
   field.
   A suggestion by saggi was to add to the comment for example:
   @since: 4.10.3
   
   What is your point of view on this?
  
  I do think it's a good idea to add this information.  How about
  supporting
  multiple Since lines in the comment like the following made up
  example:
  
  ##
  # @FenceNodePowerStatus:
  #
  # Indicates the power state of a remote host.
  #
  # @on:The remote host is powered on
  #
  # @off:   The remote host is powered off
  #
  # @unknown:   The power status is not known
  #
  # @sentient:  The host is alive and powered by its own metabolism
  #
  # Since: 4.10.0 - @FenceNodePowerStatus
  # Since: 10.2.0 - @sentient
  ##
 I don't like the fact that both lines don't point to the same type of token.
 I also don't like that it's a repeat of the type names and field names.
 
 I prefer Vinzenz original suggestion (on IRC) of moving the Since token up 
 and then
 have it be a state.  It also makes discerning what entities you can use up to 
 a
 certain version easier if you make sure to keep them sorted.
 
 We can do this because the order of the fields and availability is 
 undetermined (unlike real structs).

That is not correct.  These structures are parsed into an OrderedDict and the
ordering is important (especially for languages like C which might use real
structs).

 
 ##
 # @FenceNodePowerStatus:
 #
 # Indicates the power state of a remote host.
 #
 # Since: 4.10.0
 #
 # @on:The remote host is powered on
 #
 # @off:   The remote host is powered off
 #
 # @unknown:   The power status is not known
 #
 # Since: 10.2.0
 #
 # @sentient:  The host is alive and powered by its own metabolism
 #
 ##
 
 The problem though is that it makes since a property of the fields and not of
 the struct. This isn't that much of a problem as we can assume the earliest
 version is the time when the struct was introduced.

I don't like this any better than my suggestion.  Aside from the fact that field
ordering is important (in the data structure itself), this spreads the since
information throughout the comment rather than concentrating it in a single
place.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] API Documentation & Since tag

2013-01-11 Thread Adam Litke
On Fri, Jan 11, 2013 at 10:19:45AM +0100, Vinzenz Feenstra wrote:
 Hi everyone,
 
 We are currently documenting the API in vdsmapi-schema.json
 I noticed that we have there documented when a certain element newly
 is introduced using the 'Since' tag.
 However I also noticed that we are not documenting when a field was
 newly added, nor do we update the 'since' tag.
 
 We should start documenting in what version we've introduced a field.
 A suggestion by saggi was to add to the comment for example: @since: 4.10.3
 
 What is your point of view on this?

I do think it's a good idea to add this information.  How about supporting
multiple Since lines in the comment like the following made up example:

##
# @FenceNodePowerStatus:
#
# Indicates the power state of a remote host.
#
# @on:The remote host is powered on
#
# @off:   The remote host is powered off
#
# @unknown:   The power status is not known
#
# @sentient:  The host is alive and powered by its own metabolism
#
# Since: 4.10.0 - @FenceNodePowerStatus
# Since: 10.2.0 - @sentient
##

Remember that any patch to change the schema format will require changes to
process-schema.py as well.
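For the proposed comment format, the corresponding change to the schema processor could be as small as the following sketch (illustrative only, not the real process-schema.py):

```python
import re

# A doc comment in the proposed format, with per-symbol Since lines.
comment = """\
##
# @FenceNodePowerStatus:
#
# Since: 4.10.0 - @FenceNodePowerStatus
# Since: 10.2.0 - @sentient
##"""

since = {}
for line in comment.splitlines():
    # Match lines like "# Since: 4.10.0 - @symbol".
    m = re.match(r'#\s*Since:\s*([\d.]+)\s*-\s*@(\w+)', line)
    if m:
        since[m.group(2)] = m.group(1)

print(since)
```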

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Managing async tasks

2012-12-17 Thread Adam Litke
On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com To: vdsm-devel@lists.fedorahosted.org
  Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com,
  Saggi Mizrahi smizr...@redhat.com, Federico Simoncelli
  fsimo...@redhat.com, engine-de...@ovirt.org Sent: Monday, December 17,
  2012 12:00:49 PM Subject: Managing async tasks
  
  On today's vdsm call we had a lively discussion around how asynchronous
  operations should be handled in the future.  In an effort to include more
  people in the discussion and to better capture the resulting conversation I
  would like to continue that discussion here on the mailing list.
  
  A lot of ideas were thrown around about how 'tasks' should be handled in the
  future.  There are a lot of ways that it can be done.  To determine how we
  should implement it, it's probably best if we start with a set of
  requirements.  If we can first agree on these, it should be easy to find a
  solution that meets them.  I'll take a stab at identifying a first set of
  POSSIBLE requirements:
  
  - Standardized method for determining the result of an operation
  
This is a big one for me because it directly affects the consumability of
the API.  If each verb has different semantics for discovering whether it
has completed successfully, then the API will be nearly impossible to use
easily.
 Since there is no way to assure whether some tasks completed successfully or
 failed, especially around the murky waters of storage, I say this requirement
 should be removed.  At least not in the context of a task.

I don't agree.  Please feel free to convince me with some examples.  If we
cannot provide feedback to a user as to whether their request has been satisfied
or not, then we have some bigger problems to solve.
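As a strawman for this requirement, every verb could hand back the same result envelope that clients poll in a uniform way. All names here are invented for illustration, and the "operation" runs inline rather than asynchronously:

```python
import uuid

# Hypothetical uniform result envelope; invented names, not the vdsm API.
tasks = {}

def start_operation(fn, *args):
    # A real async system would run fn in the background; running it
    # inline keeps the sketch focused on the envelope itself.
    task_id = str(uuid.uuid4())
    try:
        value = fn(*args)
        tasks[task_id] = {'status': 'done', 'code': 0, 'result': value}
    except Exception as exc:
        tasks[task_id] = {'status': 'done', 'code': 1, 'message': str(exc)}
    return task_id

def get_result(task_id):
    # Clients discover every operation's outcome through the same call,
    # with the same status/code semantics for all verbs.
    return tasks[task_id]

tid = start_operation(lambda a, b: a + b, 2, 3)
print(get_result(tid))
```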

  
  
  Sorry.  That's my list :)  Hopefully others will be willing to add other
  requirements for consideration.
  
  From my understanding, task recovery (stop, abort, rollback, etc) will not
  be generally supported and should not be a requirement.
  

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Managing async tasks

2012-12-17 Thread Adam Litke
On Mon, Dec 17, 2012 at 03:12:34PM -0500, Saggi Mizrahi wrote:
 This is an addendum to my previous email.
 
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: Adam Litke a...@us.ibm.com
  Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, 
  Federico Simoncelli
  fsimo...@redhat.com, engine-de...@ovirt.org, 
  vdsm-devel@lists.fedorahosted.org
  Sent: Monday, December 17, 2012 2:52:06 PM
  Subject: Re: Managing async tasks
  
  
  
  - Original Message -
   From: Adam Litke a...@us.ibm.com
   To: Saggi Mizrahi smizr...@redhat.com
   Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron
   aba...@redhat.com, Federico Simoncelli
   fsimo...@redhat.com, engine-de...@ovirt.org,
   vdsm-devel@lists.fedorahosted.org
   Sent: Monday, December 17, 2012 2:16:25 PM
   Subject: Re: Managing async tasks
   
   On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote:


- Original Message -
 From: Adam Litke a...@us.ibm.com To:
 vdsm-devel@lists.fedorahosted.org
 Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron
 aba...@redhat.com,
 Saggi Mizrahi smizr...@redhat.com, Federico Simoncelli
 fsimo...@redhat.com, engine-de...@ovirt.org Sent: Monday,
 December 17,
 2012 12:00:49 PM Subject: Managing async tasks
 
 On today's vdsm call we had a lively discussion around how
 asynchronous
 operations should be handled in the future.  In an effort to
 include more
 people in the discussion and to better capture the resulting
 conversation I
 would like to continue that discussion here on the mailing
 list.
 
 A lot of ideas were thrown around about how 'tasks' should be
 handled in the
 future.  There are a lot of ways that it can be done.  To
 determine how we
 should implement it, it's probably best if we start with a set
 of
 requirements.  If we can first agree on these, it should be
 easy
 to find a
 solution that meets them.  I'll take a stab at identifying a
 first set of
 POSSIBLE requirements:
 
 - Standardized method for determining the result of an
 operation
 
   This is a big one for me because it directly affects the
   consumability of
   the API.  If each verb has different semantics for
   discovering
   whether it
   has completed successfully, then the API will be nearly
   impossible to use
   easily.
Since there is no way to assure if of some tasks completed
successfully or
failed, especially around the murky waters of storage, I say this
requirement
should be removed.  At least not in the context of a task.
   
    I don't agree.  Please feel free to convince me with some examples.
If we
   cannot provide feedback to a user as to whether their request has
   been satisfied
   or not, then we have some bigger problems to solve.
  If VDSM sends a write command to a storage server, and the connection
  hangs up before the ACK has returned.
  The operation has been committed but VDSM has no way of knowing if
  that happened as far as VDSM is concerned it got an ETIMEO or EIO.
  This is the same problem that the engine has with VDSM.
  If VDSM creates an image\VM\network\repo but the connection hangs up
  before the response can be sent back as far as the engine is
  concerned the operation times out.
  This is an inherent issue with clustering.
  This is why I want to move away from tasks being *the* trackable
  objects.
  Tasks should be short. As short as possible.
  Run VM should just persist the VM information on the VDSM host and
  return. The rest of the tracking should be done using the VM ID.
  Create image should return once VDSM persisted the information about
  the request on the repository and created the metadata files.
  Tracking should be done on the repo or the imageId.
 
 The thing is that I know how long a VM object should live (or an Image 
 object).
 So tracking it is straightforward. How long a task should live is very 
 problematic and quite context specific.
 It depends on what the task is.
 I think it's quite confusing from an API standpoint to have every task have a 
 different scope, id requirement and life-cycle.
 
 VDSM has two types of APIs:
 
 CRUD objects - VM, Image, Repository, Bridge, Storage Connections
 General transient methods - getBiosInfo(), getDeviceList()
 
 The latter are quite simple to manage. They don't need any special handling. 
 If you lost a getBiosInfo() call you just send another one, no harm done.
 The same is even true with things that change the host like getDeviceList()
 
 What we are really arguing about is fitting the CRUD objects to some generic 
 task oriented scheme.
 I'm saying it's a waste of time as you can quite easily have flows to recover 
 from each operation.
 
 Create - Check if the object exists
 Read - Read again
 Update - either update again or read and update if update
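The per-verb recovery flows listed above amount to idempotent retry: after a lost response, the client re-checks object state instead of consulting a task. A minimal sketch with invented names:

```python
# Invented names for illustration; this is not the vdsm storage API.
class Repo:
    def __init__(self):
        self.images = {}

    def create_image(self, img_id):
        self.images[img_id] = {'id': img_id}

    def exists(self, img_id):
        return img_id in self.images

def create_with_recovery(repo, img_id):
    try:
        repo.create_image(img_id)
    except TimeoutError:
        pass  # response lost; fall through to the recovery check
    # Recovery flow for Create: check whether the object now exists,
    # rather than asking a task object what happened.
    return repo.exists(img_id)

repo = Repo()
print(create_with_recovery(repo, 'img-1'))
```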

Re: [vdsm] RFC: New Storage API

2012-12-10 Thread Adam Litke
 operation it will tell it to value one over the other. For
   example, whether to copy all the data or just create a qcow based
   of a snapshot.
   The default is space.
  
   You might have also noticed that it is never explicitly specified
   where to look for existing images. This is done purposefully, VDSM
   will always look in all connected repositories for existing
   objects.
   For very large setups this might be problematic. To mitigate the
   problem you have these options:
   participatingRepositories=[repoId, ...] which tell VDSM to narrow
   the search to just these repositories
   and
   imageHints={imgId: repoId} which will force VDSM to look for those
   image ID just in those repositories and fail if it doesn't find
   them there.
  
  
  --
  ---
  舒明 Shu Ming
  Open Virtualization Engineering; CSTL, IBM Corp.
  Tel: 86-10-82451626  Tieline: 9051626 E-mail: shum...@cn.ibm.com or
  shum...@linux.vnet.ibm.com
  Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
  District, Beijing 100193, PRC
  
  
  

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] RFC: New Storage API

2012-12-10 Thread Adam Litke
On Mon, Dec 10, 2012 at 02:03:09PM -0500, Saggi Mizrahi wrote:
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: Deepak C Shetty deepa...@linux.vnet.ibm.com, engine-devel 
  engine-de...@ovirt.org, VDSM Project
  Development vdsm-devel@lists.fedorahosted.org
  Sent: Monday, December 10, 2012 1:49:31 PM
  Subject: Re: [vdsm] RFC: New Storage API
  
  On Fri, Dec 07, 2012 at 02:53:41PM -0500, Saggi Mizrahi wrote:
  
  snip
  
1) Can you provide more info on why there is a exception for 'lvm
based
block domain'. Its not coming out clearly.
   File based domains are responsible for syncing up object
   manipulation (creation\deletion)
   The backend is responsible for making sure it all works either by
   having a single writer (NFS) or having it's own locking mechanism
   (gluster).
   In our LVM based domains VDSM is responsible for basic object
   manipulation.
   The current design uses an approach where there is a single host
   responsible for object creation\deleteion it is the
   SRM\SDM\SPM\S?M.
   If we ever find a way to make it fully clustered without a big hit
   in performance the S?M requirement will be removed form that type
   of domain.
  
  I would like to see us maintain a LOCALFS domain as well.  For this,
  we would
  also need SRM, correct?
 No, why?

Sorry, nevermind.  I was thinking of a scenario with multiple clients talking to
a single vdsm and making sure they don't stomp on one another.  This is
probably not something we are going to care about though.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] RFC: New Storage API

2012-12-10 Thread Adam Litke
On Mon, Dec 10, 2012 at 03:36:23PM -0500, Saggi Mizrahi wrote:
  Statements like this make me start to worry about your userData
  concept.  It's a
  sign of a bad API if the user needs to invent a custom metadata
  scheme for
  itself.  This reminds me of the abomination that is the 'custom'
  property in the
  vm definition today.
 In one sentence: If VDSM doesn't care about it, VDSM doesn't manage it.
 
 userData being a void* is quite common and I don't understand why you would 
 think it's a sign of a bad API.
 Furthermore, giving the user a choice about how to represent its own metadata 
 and what fields it wants to keep seems reasonable to me.
 Especially given the fact that VDSM never reads it.
 
 The reason we are pulling away from the current system of VDSM understanding 
 the extra data is that it makes that data tied to VDSMs on disk format.
 VDSM on disk format has to be very stable because of clusters with multiple 
 VDSM versions.
 Furthermore, since this is actually manager data, it has to be tied to the 
 manager's backward-compatibility lifetime as well.
 Having it be opaque to VDSM ties it to only one, simpler, support lifetime 
 instead of two.
 
 I guess you are implying that it will make it problematic for multiple users 
 to read userData left by another user because the formats might not be 
 compatible.
 The solution is that all parties interested in using VDSM storage agree on 
 format, and common fields, and supportability, and all the other things that 
 choosing a supporting *something* entails.
 This is, however, out of the scope of VDSM. When the time comes I think how 
 the userData blob is actually parsed and what fields it keeps should be 
 discussed on ovirt-devel or engine-devel.
 
 The crux of the issue is that VDSM manages only what it cares about, and the 
 user can't modify that directly.
 This is done because everything we expose we commit to.
 If you want any information persisted like:
 - Human readable name (in whatever encoding)
 - Is this a template or a snapshot
 - What user owns this image
 
 You can just put it in the userData.
 VDSM is not going to impose what encoding you use.
 It's not going to decide if you represent your users as IDs or names or ldap 
 queries or Public Keys.
 It's not going to decide if you have explicit templates or not.
 It's not going to decide if you care what is the logical image chain.
 It's not going to decide anything that is out of its scope.
 No format is future proof, no selection of fields will be good for any 
 situation.
 I'd much rather it be someone else's problem when any of them need to be 
 changed.
 They have currently been VDSMs problem and it has been hell to maintain.

In general, I actually agree with most of this.  What I want to avoid is pushing
things that should actually be a part of the API into this userData blob.  We do
want to keep the API as simple as possible to give vdsm flexibility.  If, over
time, we find that users are always using userData to work around something
missing in the API, this could be a really good sign that the API needs
extension.
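The opaque-blob contract being described might look like this on the vdsm side; the class and field names are invented for illustration:

```python
import json

# Invented names; a sketch of "vdsm persists userData verbatim and
# never parses it", while only the manager knows its encoding.
class ImageStore:
    def __init__(self):
        self._meta = {}

    def set_user_data(self, img_id, blob):
        # The blob is stored as-is; vdsm treats it as a void*.
        self._meta[img_id] = blob

    def get_user_data(self, img_id):
        return self._meta[img_id]

store = ImageStore()
# The manager serializes whatever fields it cares about.
manager_meta = {'name': 'golden-image', 'template': True, 'owner': 'alice'}
store.set_user_data('img-42', json.dumps(manager_meta))

# Only the manager-side parties that agreed on the format interpret it.
print(json.loads(store.get_user_data('img-42'))['owner'])
```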

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] moving the collection of statistics to external process

2012-12-06 Thread Adam Litke
On Thu, Dec 06, 2012 at 11:19:34PM +0800, Shu Ming wrote:
 On 2012-12-6 4:51, Itamar Heim wrote:
 On 12/05/2012 10:33 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 10:21:39PM +0200, Itamar Heim wrote:
 On 12/05/2012 10:16 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote:
 On 12/05/2012 08:57 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote:
 On 12/05/2012 04:42 PM, Adam Litke wrote:
 I wanted to know what do you think about it and if
 you have better
 solution to avoid initiate so many threads? And
 if splitting vdsm is
 a good idea here?
 In first look, my opinion is that it can help
 and would be nice to
 have vmStatisticService that runs and writes to
 separate log the vms
 status.
 Vdsm recently started requiring the MOM package. MOM
 also performs some host
 and guest statistics collection as part of the
 policy framework.  I think it
 would be a really good idea to consolidate all stats
 collection into MOM.  Then,
 all stats become usable within the policy and by
 vdsm for its own internal
 purposes.  Today, MOM has one stats collection
 thread per VM and one thread for
 the host stats.  It has an API for gathering the
 most recently collected stats
 which vdsm can use.
 
 
 isn't this what collectd (and its libvirt plugin) or
 pcp are already doing?
 
 Lots of things collect statistics, but as of right now,
 we're using MOM and
 we're not yet using collectd on the host, right?
 
 
 I think we should have a single stats collection service
 and clients for it.
 I think mom and vdsm should get their stats from that service,
 rather than have either beholden to any new stats something needs to
 collect.
 
 How would this work for collecting guest statistics?  Would
 we require collectd
 to be installed in all guests running under oVirt?
 
 
 my understanding is collectd is installed on the host, and uses
 collectd's libvirt plugin to collect guest statistics?
 
 Yes, but some statistics can only be collected by making a call
 to the oVirt
 guest agent (eg. guest memory statistics).  The logical next
 step would be to
 write a collectd plugin for ovirt-guest-agent, but vdsm owns the
 connections to
 the guest agents and probably does not want to multiplex those
 connections for
 many reasons (security being the main one).
 
 
 and some will come from qemu-ga which libvirt will support?
 maybe a collectd vdsm plugin for the guest agent stats?
 
 
 I am thinking of having collectd as a stand-alone service to
 collect the statistics from both the oVirt guest agent and qemu-ga.
 Then collectd can export the information to the host proc file
 system in a layered architecture.  Then MOM or other vdsm services
 can get the information from the proc file system like other OS
 statistics exported on the host.

You wouldn't use the host /proc filesystem for this purpose.  /proc is an
interface between userspace and the kernel.  It is not for direct application
use.

The problem I see with hooking collectd up to ovirt-ga is that vdsm still needs
a connection to ovirt-ga for things like shutdown and desktopLogin.  Today vdsm
owns the connection to the guest agent, and there is no nice way to multiplex
that connection for use by multiple clients simultaneously.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] link state semantics

2012-12-05 Thread Adam Litke
On Wed, Dec 05, 2012 at 04:25:48AM -0500, Antoni Segura Puimedon wrote:
 
 
 - Original Message -
  From: Igor Lvovsky ilvov...@redhat.com
  To: Antoni Segura Puimedon asegu...@redhat.com
  Cc: Alona Kaplan alkap...@redhat.com, vdsm-devel@lists.fedorahosted.org
  Sent: Wednesday, December 5, 2012 10:17:50 AM
  Subject: Re: [vdsm] link state semantics
  
  
  
  - Original Message -
   From: Antoni Segura Puimedon asegu...@redhat.com
   To: vdsm-devel@lists.fedorahosted.org
   Cc: Alona Kaplan alkap...@redhat.com
   Sent: Tuesday, December 4, 2012 7:32:34 PM
   Subject: [vdsm] link state semantics
   
   Hi list!
   
   We are working on the new 3.2 feature for adding support for
   updating
   VM
   devices, more specifically at the moment network devices.
   
   There is one point of the design which is not yet consensual and
   we'd
   need to agree on a proper and clean design that would satisfy us
   all:
   
   My current proposal, as reflected by patch:
  http://gerrit.ovirt.org/#/c/9560/5/vdsm_api/vdsmapi-schema.json
   and its parent is to have a linkActive boolean that is true for
   link
   status 'up' and false for link status 'down'.
   
   We want to support a none (dummy) network that is used to
   dissociate
   vnics
   from any real network. The semantics, as you can see in the patch
   are
   that
   unless you specify a network, updateDevice will place the interface
   on that
   network. However, Adam Litke argues that not specifying a network
   should
   keep the vnic on the network it currently is, as network is an
   optional
   parameter and 'linkActive' is also optional and has this preserve
   current
   state semantics.
   
   I can certainly see the merit of what Adam proposes, and the
   implementation
   would be that linkActive becomes an enum like so:
   
   {'enum': 'linkState'/* or linkActive */ , 'data': ['up', 'down',
   'disconnected']}
   
  
  If you are going for this use 'linkState'
  
   With this change, network would only be changed if one different
   than
   the current
   one is specified and the vnic would be taken to the dummy bridge
   when
   the linkState
   would be set to 'disconnected'.
  
  In general +1 for new one, with a little doubt.
  It looks a bit inconsistent that we leave the network as is if it is
  omitted from input, but if linkState is 'disconnected' we will move it
  to the dummy bridge.
  But I can live with it.
 
 Yes, the 'disconnected' overrules the network and that, as you point
 out, can be a source of confusion. I propose to add a warning to the
 return dictionary that tells the user that setting disconnected overrules
 any network setting.
 
  
   
   There is also an objection, raised by Adam about the semantics of
   portMirroring.
   The current behavior from my patch is:
   
   portMirroring is None or is not set - No action taken.
   portMirroring = [] - No action taken.
   portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to
   the specified vnic.
   
   His proposal is:
   portMirroring is None or is not set - No action taken.
   portMirroring = [] - Unset port mirroring to the vnic that is
   currently set.
   portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to
   the specified vnic.
   
  
  +1 for Adam's approach, just don't forget to unset portMirroring from
  all nets set before if they are not in the new portMirroring = [a,b,z]
 
 So you're saying:
 
 portMirroring is None or is not set - No action taken.
 portMirroring = [] - Unset port mirroring to the vnic that is
   currently set.
 portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to
the specified vnic AND unset any other mirroring.
 
 I'm fine with it, I think it is even more complete and correct.

Yes, +1.

  
   I would really welcome comments on this to have finally an
   agreement
   to the api for this
   feature.
   
   Best,
   
   Toni
   
  

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Host bios information

2012-12-05 Thread Adam Litke
On Wed, Dec 05, 2012 at 11:05:21AM +0200, ybronhei wrote:
 Today in the API we display general information about the host that
 vdsm exports via the getCapabilities API.
 
 We decided to add bios information as part of the information that
 is displayed in UI under host's general sub-tab.
 
 To summarize the feature - We'll modify the General tab to Software
 Information and add another tab for Hardware Information which will
 include all the bios data that we'll decide to gather from the host
 and display.
 
 Following this feature page:
 http://www.ovirt.org/Features/Design/HostBiosInfo for more details.
 All the parameters that can be displayed are mentioned in the wiki.
 
 I would greatly appreciate your comments and questions.

Seems good to me but I would like to throw out one suggestion.
getVdsCapabilities is already a huge command that does a lot of time-consuming
things.  As part of the vdsm API refactoring, we are going to start favoring
small and concise APIs over bag APIs.  Perhaps we should just add a new verb:
Host.getVdsBiosInfo() that returns only this information.
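For illustration, a dedicated verb might be declared alongside the existing schema entries roughly like this (written here as Python literals in the style vdsmapi-schema.json uses; the type name and fields are invented for this sketch, not the actual vdsm schema):

```python
# Hypothetical sketch only -- 'BiosInfo' and its fields are illustrative,
# not taken from the real vdsmapi-schema.json.
bios_info_type = {'type': 'BiosInfo',
                  'data': {'systemManufacturer': 'str',
                           'systemProductName': 'str',
                           'biosVendor': 'str',
                           'biosVersion': 'str',
                           'biosReleaseDate': 'str'}}

bios_info_command = {'command': {'class': 'Host', 'name': 'getVdsBiosInfo'},
                     'returns': 'BiosInfo'}
```

The point is only that a small, single-purpose verb keeps the expensive getVdsCapabilities call out of the hot path.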

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] moving the collection of statistics to external process

2012-12-05 Thread Adam Litke
On Wed, Dec 05, 2012 at 04:23:16PM +0200, ybronhei wrote:
 As part of investigating an issue where starting 200 VMs at the same time
 takes hours for a still-undetermined reason, we thought about moving the
 collection of statistics outside vdsm.

Thanks for bringing up this issue.  I think this could be a good idea on its own
merits (better modularity, etc).

 It can help because the stat collection runs in internal vdsm threads
 that can consume quite a bit of time.  I'm not sure if it would help
 with the issue of starting many VMs simultaneously, but it might
 improve vdsm response.

In general, threads should be really cheap to create so I expect there is
another cause for the performance bottleneck.  That being said, I think we
should still look at this feature.

 Currently we start a thread for each VM and then collect stats on them
 at constant intervals, and it must affect vdsm if we have 200 threads
 like this that can take some time.  For example, if we have connection
 errors to storage and we can't receive its response, all 200 threads
 can get stuck and lock other threads (GIL issue).
 
 I wanted to know what you think about it and whether you have a better
 solution to avoid initiating so many threads?  And is splitting vdsm a
 good idea here?
 At first look, my opinion is that it can help and it would be nice to
 have a vmStatisticService that runs and writes the VMs' status to a
 separate log.

Vdsm recently started requiring the MOM package.  MOM also performs some host
and guest statistics collection as part of the policy framework.  I think it
would be a really good idea to consolidate all stats collection into MOM.  Then,
all stats become usable within the policy and by vdsm for its own internal
purposes.  Today, MOM has one stats collection thread per VM and one thread for
the host stats.  It has an API for gathering the most recently collected stats
which vdsm can use.

 The problem with this solution is that if those interval functions need
 to communicate with internal parts of vdsm to set values or start
 internal processes when something has changed, it depends on the stat
 function, and I'm not sure that a stat function should control internal
 flows.
 Today we rely on this method to recognize connectivity errors, but we
 can add polling mechanisms for those issues (which can raise the same
 problems we are trying to deal with..)

I agree.  Any cases where the stats collection threads are triggering internal
vdsm logic need to be cleaned up.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Host bios information

2012-12-05 Thread Adam Litke
On Wed, Dec 05, 2012 at 05:25:10PM +0200, ybronhei wrote:
 On 12/05/2012 04:32 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 11:05:21AM +0200, ybronhei wrote:
  Today in the API we display general information about the host that
  vdsm exports via the getCapabilities API.
 
 We decided to add bios information as part of the information that
 is displayed in UI under host's general sub-tab.
 
  To summarize the feature - We'll modify the General tab to Software
 Information and add another tab for Hardware Information which will
 include all the bios data that we'll decide to gather from the host
 and display.
 
 Following this feature page:
 http://www.ovirt.org/Features/Design/HostBiosInfo for more details.
 All the parameters that can be displayed are mentioned in the wiki.
 
 I would greatly appreciate your comments and questions.
 
 Seems good to me but I would like to throw out one suggestion.
 getVdsCapabilities is already a huge command that does a lot of time 
 consuming
 things.  As part of the vdsm API refactoring, we are going to start favoring
 small and concise APIs over bag APIs.  Perhaps we should just add a new 
 verb:
 Host.getVdsBiosInfo() that returns only this information.
 
 It also leads to changes in how the engine collects the parameters with
 the new API request, and I'm not sure we should get into that.  Right
 now we have a specific, known way the engine requests capabilities, and
 when and how it affects the status of the host shown in the UI.
 To simplify this feature I prefer to use the current way of gathering
 and providing the host's information.  If we decide to split the host's
 capabilities API, it needs an RFC mail of its own, because it changes
 the engine's internal flows and makes this feature into something much
 more influential.

I don't understand.  Why can't you just call both APIs, one after the other?

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] moving the collection of statistics to external process

2012-12-05 Thread Adam Litke
On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote:
 On 12/05/2012 04:42 PM, Adam Litke wrote:
 I wanted to know what do you think about it and if you have better
 solution to avoid initiate so many threads? And if splitting vdsm is
 a good idea here?
 In first look, my opinion is that it can help and would be nice to
 have vmStatisticService that runs and writes to separate log the vms
 status.
 Vdsm recently started requiring the MOM package.  MOM also performs some host
 and guest statistics collection as part of the policy framework.  I think it
 would be a really good idea to consolidate all stats collection into MOM.  
 Then,
 all stats become usable within the policy and by vdsm for its own internal
 purposes.  Today, MOM has one stats collection thread per VM and one thread 
 for
 the host stats.  It has an API for gathering the most recently collected 
 stats
 which vdsm can use.
 
 
 isn't this what collectd (and its libvirt plugin) or pcp are already doing?

Lots of things collect statistics, but as of right now, we're using MOM and
we're not yet using collectd on the host, right?

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] moving the collection of statistics to external process

2012-12-05 Thread Adam Litke
On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote:
 On 12/05/2012 08:57 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote:
 On 12/05/2012 04:42 PM, Adam Litke wrote:
 I wanted to know what do you think about it and if you have better
 solution to avoid initiate so many threads? And if splitting vdsm is
 a good idea here?
 In first look, my opinion is that it can help and would be nice to
 have vmStatisticService that runs and writes to separate log the vms
 status.
 Vdsm recently started requiring the MOM package.  MOM also performs some 
 host
 and guest statistics collection as part of the policy framework.  I think 
 it
 would be a really good idea to consolidate all stats collection into MOM.  
 Then,
 all stats become usable within the policy and by vdsm for its own internal
 purposes.  Today, MOM has one stats collection thread per VM and one 
 thread for
 the host stats.  It has an API for gathering the most recently collected 
 stats
 which vdsm can use.
 
 
 isn't this what collectd (and its libvirt plugin) or pcp are already doing?
 
  Lots of things collect statistics, but as of right now, we're using MOM and
 we're not yet using collectd on the host, right?
 
 
 I think we should have a single stats collection service and clients for it.
 I think mom and vdsm should get their stats from that service,
 rather than have either beholden to any new stats something needs to
 collect.

How would this work for collecting guest statistics?  Would we require collectd
to be installed in all guests running under oVirt?

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] moving the collection of statistics to external process

2012-12-05 Thread Adam Litke
On Wed, Dec 05, 2012 at 10:21:39PM +0200, Itamar Heim wrote:
 On 12/05/2012 10:16 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote:
 On 12/05/2012 08:57 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote:
 On 12/05/2012 04:42 PM, Adam Litke wrote:
 I wanted to know what do you think about it and if you have better
 solution to avoid initiate so many threads? And if splitting vdsm is
 a good idea here?
 In first look, my opinion is that it can help and would be nice to
 have vmStatisticService that runs and writes to separate log the vms
 status.
 Vdsm recently started requiring the MOM package.  MOM also performs some 
 host
 and guest statistics collection as part of the policy framework.  I 
 think it
 would be a really good idea to consolidate all stats collection into 
 MOM.  Then,
 all stats become usable within the policy and by vdsm for its own 
 internal
 purposes.  Today, MOM has one stats collection thread per VM and one 
 thread for
 the host stats.  It has an API for gathering the most recently collected 
 stats
 which vdsm can use.
 
 
 isn't this what collectd (and its libvirt plugin) or pcp are already 
 doing?
 
  Lots of things collect statistics, but as of right now, we're using MOM 
 and
 we're not yet using collectd on the host, right?
 
 
 I think we should have a single stats collection service and clients for it.
 I think mom and vdsm should get their stats from that service,
 rather than have either beholden to any new stats something needs to
 collect.
 
 How would this work for collecting guest statistics?  Would we require 
 collectd
 to be installed in all guests running under oVirt?
 
 
 my understanding is collectd is installed on the host, and uses
 collectd's libvirt plugin to collect guest statistics?

Yes, but some statistics can only be collected by making a call to the oVirt
guest agent (eg. guest memory statistics).  The logical next step would be to
write a collectd plugin for ovirt-guest-agent, but vdsm owns the connections to
the guest agents and probably does not want to multiplex those connections for
many reasons (security being the main one).

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] moving the collection of statistics to external process

2012-12-05 Thread Adam Litke
On Wed, Dec 05, 2012 at 10:51:23PM +0200, Itamar Heim wrote:
 On 12/05/2012 10:33 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 10:21:39PM +0200, Itamar Heim wrote:
 On 12/05/2012 10:16 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 09:01:24PM +0200, Itamar Heim wrote:
 On 12/05/2012 08:57 PM, Adam Litke wrote:
 On Wed, Dec 05, 2012 at 08:30:10PM +0200, Itamar Heim wrote:
 On 12/05/2012 04:42 PM, Adam Litke wrote:
 I wanted to know what do you think about it and if you have better
 solution to avoid initiate so many threads? And if splitting vdsm is
 a good idea here?
 In first look, my opinion is that it can help and would be nice to
 have vmStatisticService that runs and writes to separate log the vms
 status.
 Vdsm recently started requiring the MOM package.  MOM also performs 
 some host
 and guest statistics collection as part of the policy framework.  I 
 think it
 would be a really good idea to consolidate all stats collection into 
 MOM.  Then,
 all stats become usable within the policy and by vdsm for its own 
 internal
 purposes.  Today, MOM has one stats collection thread per VM and one 
 thread for
 the host stats.  It has an API for gathering the most recently 
 collected stats
 which vdsm can use.
 
 
 isn't this what collectd (and its libvirt plugin) or pcp are already 
 doing?
 
  Lots of things collect statistics, but as of right now, we're using MOM 
 and
 we're not yet using collectd on the host, right?
 
 
 I think we should have a single stats collection service and clients for 
 it.
 I think mom and vdsm should get their stats from that service,
 rather than have either beholden to any new stats something needs to
 collect.
 
 How would this work for collecting guest statistics?  Would we require 
 collectd
 to be installed in all guests running under oVirt?
 
 
  my understanding is collectd is installed on the host, and uses
  collectd's libvirt plugin to collect guest statistics?
 
 Yes, but some statistics can only be collected by making a call to the oVirt
 guest agent (eg. guest memory statistics).  The logical next step would be to
 write a collectd plugin for ovirt-guest-agent, but vdsm owns the connections 
 to
 the guest agents and probably does not want to multiplex those connections 
 for
 many reasons (security being the main one).
 
 
 and some will come from qemu-ga which libvirt will support?
 maybe a collectd vdsm plugin for the guest agent stats?

Then you still have vdsm plus one other entity in the business of stats
collection.  I don't see how that's any better than what we have today.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] object instancing in the new VDSM API

2012-12-04 Thread Adam Litke
 and is logically
 wrong.  What you need to do is remove redundant arguments and split up verbs
 that do more than one thing.
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi
  smizr...@redhat.com Cc: vdsm-devel vdsm-de...@fedorahosted.org, Ayal
  Baron aba...@redhat.com, Barak Azulay bazu...@redhat.com, ybronhei
  ybron...@redhat.com Sent: Monday, December 3, 2012 5:46:31 PM Subject: Re:
  object instancing in the new VDSM API
  
  On Mon, Dec 03, 2012 at 04:34:28PM -0500, Saggi Mizrahi wrote:
   Currently the suggested scheme treats everything as instances and object
   have methods.  This puts instancing as the responsibility of the API
   bindings.  I suggest changing it to the way json was designed with
   namespaces and methods.
   
   For example instead for the api being:
   
   vm = host.getVMsList()[0] vm.getInfo()
   
   the API should be:
   
   vmID = host.getVMsList()[0] api.VMsManager.getVMInfo(vmID)
   
    And it should be up to the caller to decide how to wrap everything in objects.
  
  For VMs, your example looks nice, but for today's Volumes it's not so nice.
  To properly identify a Volume, we must pass the storage pool id, storage
  domain id, image id, and volume id.  If we are working with two Volumes, we
  would need 8 parameters unless we optimize for context and assume that the
  storage pool uuid is the same for both volumes, etc.  The problem with that
  optimization is that we require clients to understand internal
  implementation details.
  
  How should the StorageDomain.getVolumes API return a list of Volumes?  A
  list of Volume ids is not enough information for most commands that involve
  a Volume.
  
   The problem with the API bindings controlling the instancing is that: 1)
   We have to *have* and *pass* implicit api obj which is problematic to
   maintain.  For example, you have to have the api object as a member of
   instance for the method calls to work.  This means that you can't recreate
   or pool API objects easily. You effectively need to add a move method to
   move the object to another API object to use it on a different host.
  
  You already make assumptions like this when passing around bare UUIDs.  For
  example, you know that a Storage Domain cannot be associated with multiple
  Storage Pools at the same time.  With instantiated objects, all of those
  associations are baked into the objects.  A client never constructs objects.
  It only receives pre-instantiated objects by calling other APIs.
  
   2) Because the objects are opaque it might be hard to know what fields of
   the instance to persist to get the same object.
  
  No.  You just persist the whole object identifier the way it was given to
  you.  In the case of Volumes, it may be an object containing 4 string uuids.
  It could also be a string in the form /spuuid/sduuid/imguuid/voluuid.  In
  the end it doesn't really matter which form it's in because the client will
  not manipulate it.  Perhaps some flattened string is best in order to enable
  easy database storage.
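A minimal sketch of that flattened-string idea (helper names are invented for illustration; vdsm would define the real encoding):

```python
# Hypothetical helpers, not vdsm API: pack the four UUIDs that identify a
# volume into one opaque string a client can store and pass back verbatim.
def encode_volume_ref(sp_uuid, sd_uuid, img_uuid, vol_uuid):
    # '/spuuid/sduuid/imguuid/voluuid' as suggested above
    return '/'.join(['', sp_uuid, sd_uuid, img_uuid, vol_uuid])


def decode_volume_ref(ref):
    # Server-side: split the opaque reference back into its components.
    _, sp_uuid, sd_uuid, img_uuid, vol_uuid = ref.split('/')
    return sp_uuid, sd_uuid, img_uuid, vol_uuid
```

The client never needs to know the internal structure; it only stores and returns the string.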
  
   3) It breaks the distinction between by-value and by-reference objects.
  
  The distinction is made in the schema.  Reference objects have methods and
  are called 'class' in the schema.  Value objects have only fields and are
  called 'type'.
  
   4) Any serious user will make its own instance classes that conform to
   its design and flow, so they don't really add any convenience to anything
   apart from tests.  You will create your own VM object, and because it's
   in the manager scope it will be the same instance across all hosts.
   Instead of being able to pass the same ID to any host (as the vmID remains
   the same) you will have to create and instance object to use either before
   every call for simplicity or cache for each host for performance benefits.
  
  This is a pretty good argument for using namespacing instead of instances,
  however...  I still think that all object references need to be an opaque
  type and it should not be legal to roll your own object reference from a set
  of other objects (eq. create a volume reference from imgUUID, sdUUID, and
  spUUID).  The API should be explicit about the relationships between
  objects.
  
  If you want to write your own instance classes you still can.  Just pass the
  vdsm-generated identifier into your object's constructor to use for later
  API calls.
  
   5) It makes us pass a weird __obj__ parameter to each call that
    symbolizes self and makes it hard for a user who chooses to use their own
    bindings to understand what it does.
  
  Fair.  '__obj__' is a terrible name.  I would be okay with changing the
  semantics so that all API calls take an 'id' parameter as their first
  argument.  I guess this could always be a string with an unspecified format.
  For Volumes, we can decide how we want to encode the 4 uuids.  Vdsm would
  then need to parse this value on the server side to pull out the relevant
  IDs.
  
   6) It's syntactic

Re: [vdsm] VDSM tasks, the future

2012-12-04 Thread Adam Litke
On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote:
 Because I started hinting about how VDSM tasks are going to look going
 forward, I thought it better to just write everything in an email so we can
 talk about it in context.  This is not set in stone and I'm still debating
 things myself, but it's very close to being done.

Don't debate them yourself, debate them here!  Even better, propose your idea in
schema form to show how a command might work exactly.

 - Everything is asynchronous.  The nature of message-based communication is
 that you can't have synchronous operations.  This is not really debatable
 because it's just how TCP\AMQP\messaging works.

Can you show how a traditionally synchronous command might work?  Let's take
Host.getVmList as an example.

 - Task IDs will be decided by the caller.  This is how json-rpc works, and it
 also makes sense because now the engine can track the task without needing a
 stage where we give it the task ID back.  IDs are reusable as long as no one
 else is using them at the time, so they can be used for synchronizing
 operations between clients (making sure a command is only executed once on a
 specific host without locking).
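For example, a caller-chosen id in a JSON-RPC 2.0 request could look like this (the method and parameter names are illustrative, not the final vdsm API):

```python
import json

# Sketch of a JSON-RPC 2.0 request where the caller picks the id, so the
# engine can track (or safely retry) the task with no separate round trip
# to learn a server-assigned task ID.
request = {
    'jsonrpc': '2.0',
    'id': 'engine-42:copy-image-0001',   # caller-chosen, reusable later
    'method': 'Image.copy',              # illustrative method name
    'params': {'srcImgUUID': '...', 'dstImgUUID': '...'},
}
wire = json.dumps(request)
```

A retried request carrying the same id is recognizable as the same logical operation, which is what enables the "executed only once without locking" property described above.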
 
 - Tasks are transient.  If VDSM restarts it forgets all the task information.
 There are 2 ways to have persistent tasks: 1. The task creates an object that
 you can continue work on in VDSM.  The new storage does that by the fact that
 copyImage() returns once the target volume has been created but before the data
 has been fully copied.  From that moment on the state of the copy can be
 queried from any host using getImageStatus() and the specific copy operation
 can be queried with getTaskStatus() on the host performing it.  After VDSM
 crashes, depending on policy, either VDSM will create a new task to continue
 the copy or someone else will send a command to continue the operation and
 that will be a new task.  2. VDSM tasks just start other operations track-able
 not through the task interface. For example Gluster:
 gluster.startVolumeRebalance() will return once it has been registered with
 Gluster.  gluster.getOperationStatuses() will return the state of the operation
 from any host.  Each call is a task in itself.

I worry about this approach because every command has a different semantic for
checking progress.  For migration, we have to check VM status on the src and
dest hosts.  For image copy we need to use a special status call on the dest
image.  It would be nice if there was a unified method for checking on an
operation.  Maybe that can be completion events.

Client:   vdsm:
---   -

Image.copy(...)  --
 --  Operation Started
Wait for event   ...
 --  Event: Operation id done code

For an early error:

Client:   vdsm:
---   -

Image.copy(...)  --
 --  Error: code
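The client side of the flow above could be sketched like this (the event dictionary shape is invented for illustration; no such protocol is defined yet):

```python
import queue

# Toy sketch of the completion-event flow: issue the call, then block on
# an event stream until the matching "done" event (or an error) arrives.
# The event fields ('id', 'type', 'code') are assumptions for this sketch.
def wait_for_completion(events, op_id, timeout=30.0):
    while True:
        event = events.get(timeout=timeout)  # raises queue.Empty on timeout
        if event.get('id') != op_id:
            continue                          # event for another operation
        if event['type'] == 'error':
            raise RuntimeError(event['code'])
        if event['type'] == 'done':
            return event['code']
```

This keeps every asynchronous command on the same completion semantics instead of a per-command progress-checking procedure.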


 - No task tags.  They are silly, and the caller can encode whatever they want
 in the task ID if they really want to tag tasks.

Yes.  Agreed.

 - No explicit recovery stage.  VDSM will be crash-only; there should be
 efforts to make everything crash-safe.  Where that is problematic, as in the
 case of networking, VDSM will recover on start without having a task for it.

How does this work in practice for something like creating a new image from a
template?

 - No clean Task: Tasks can be started by any number of hosts, which means that
 there is no way to own all tasks.  There could be cases where VDSM starts
 tasks on its own and thus they have no owner at all.  The caller needs to
 continually track the state of VDSM.  We will have broadcast events to
 mitigate polling.

If a disconnected client might have missed a completion event, it will need to
check state.  This means each async operation that changes state must document a
procedure for checking progress of a potentially ongoing operation.  For
Image.copy, that process would be to lookup the new image and check its state.

 - No revert Impossible to implement safely.

How do the engine folks feel about this?  I am ok with it :)

 - No SPM\HSM tasks.  SPM\SDM is no longer necessary for all domain types (only
 for type).  What used to be SPM tasks, or tasks that persist and can be
 restarted on other hosts, is talked about in previous bullet points.
 
A nice simplification.


-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] API.py validation

2012-12-04 Thread Adam Litke
On Tue, Dec 04, 2012 at 08:43:11AM -0500, Antoni Segura Puimedon wrote:
 Hi all,
 
 I am currently working in adding a new feature to vdsm which requires a new
 entry point in vdsm, thus requiring:
 - Parameter definitions in vdsm_api/vdsmapi-schema.json
 - Implementation and checks in vdsm/API.py and other modules.
 
 Typically, we check for the presence/absence of required/optional parameters
 in API.py using utils.validateMinimalKeySet or just if/else clauses. I think
 this process could benefit from a more automatic and less duplicated effort,
 i.e., parsing vdsmapi-schema.json in a similar way as process-schema.py does,
 to make a memoized method that is able to check whether an API call is correct
 according to the API definitions. A very good side effect would be that this
 would really keep us from forgetting to update the schema.

Yes, this is a good idea.  I do want to add some checking.  For now, the best
place to add it would probably be in the DynamicBridge class which dispatches
json-rpc calls to the correct internal methods.  Unfortunately this would
exclude the xmlrpc api from the automatic checking.  I guess that's ok since
xmlrpc will be going away.
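A rough sketch of what such checking could look like (the schema content and names here are invented; the real implementation would parse vdsmapi-schema.json rather than a hard-coded table):

```python
import functools

# Hypothetical per-command parameter specs, standing in for what would be
# extracted from vdsmapi-schema.json.
SCHEMA = {
    'VM.updateDevice': {'required': {'vmID', 'params'},
                        'optional': {'flags'}},
}


@functools.lru_cache(maxsize=None)
def _spec(method):
    # Memoized lookup, so the schema is consulted once per method.
    return SCHEMA[method]


def validate_call(method, args):
    """Reject calls with missing required keys or unknown keys."""
    spec = _spec(method)
    missing = spec['required'] - set(args)
    unknown = set(args) - spec['required'] - spec['optional']
    if missing or unknown:
        raise ValueError('missing=%s unknown=%s'
                         % (sorted(missing), sorted(unknown)))
```

Hooked into DynamicBridge's dispatch path, this would give every json-rpc call automatic validation, and keeping the schema current becomes mandatory rather than optional.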

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] link state semantics

2012-12-04 Thread Adam Litke
On Tue, Dec 04, 2012 at 12:32:34PM -0500, Antoni Segura Puimedon wrote:
 Hi list!
 
 We are working on the new 3.2 feature for adding support for updating VM
 devices, more specifically at the moment network devices.
 
 There is one point of the design on which there is not yet consensus, and
 we'd need to agree on a proper and clean design that would satisfy us all:
 
 My current proposal, as reflected by patch:
http://gerrit.ovirt.org/#/c/9560/5/vdsm_api/vdsmapi-schema.json
 and its parent is to have a linkActive boolean that is true for link
 status 'up' and false for link status 'down'.
 
 We want to support a none (dummy) network that is used to dissociate vnics
 from any real network. The semantics, as you can see in the patch are that
 unless you specify a network, updateDevice will place the interface on that
 network. However, Adam Litke argues that not specifying a network should
 keep the vnic on the network it currently is, as network is an optional
 parameter and 'linkActive' is also optional and has this preserve current
 state semantics.
 
 I can certainly see the merit of what Adam proposes, and the implementation
 would be that linkActive becomes an enum like so:
 
 {'enum': 'linkState'/* or linkActive */ , 'data': ['up', 'down', 
 'disconnected']}
 
 With this change, network would only be changed if one different than the current
 one is specified, and the vnic would be taken to the dummy bridge when the
 linkState would be set to 'disconnected'.
 
 There is also an objection, raised by Adam about the semantics of 
 portMirroring.
 The current behavior from my patch is:
 
 portMirroring is None or is not set - No action taken.
 portMirroring = [] - No action taken.
 portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the 
 specified vnic.
 
 His proposal is:
 portMirroring is None or is not set - No action taken.
 portMirroring = [] - Unset port mirroring to the vnic that is currently set.
 portMirroring = [a,b,z] - Set port mirroring for nets a,b and z to the 
 specified vnic.
 
 I would really welcome comments on this so we can finally agree on the API for
 this feature.

+1 to the updated proposal.  Is there any better way to do it?
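For illustration, the combined semantics discussed above can be sketched as a small handler — omitted means "leave unchanged", `'disconnected'` moves the vnic to the dummy bridge, and an empty `portMirroring` list unsets mirroring. The `update_nic` function, the `'dummy'` network name, and the dict fields are all hypothetical, not the actual vdsm code:

```python
def update_nic(nic, linkState=None, network=None, portMirroring=None):
    """Apply the proposed updateDevice semantics to a vnic state dict.

    linkState: 'up' | 'down' | 'disconnected'; None leaves link state alone.
    network:   None leaves the current network alone.
    portMirroring: None = no action, [] = unset, [a, b] = set those networks.
    """
    if linkState == 'disconnected':
        # Dissociate from any real network via the dummy bridge.
        nic['network'] = 'dummy'  # placeholder name for the dummy bridge
        nic['linkActive'] = False
    elif linkState is not None:
        nic['linkActive'] = (linkState == 'up')

    if network is not None and linkState != 'disconnected':
        nic['network'] = network  # only change when explicitly requested

    if portMirroring is not None:
        # [] unsets mirroring; a non-empty list replaces the mirrored nets.
        nic['portMirroring'] = list(portMirroring)
    return nic
```

Note how every parameter defaults to "preserve current state", which is the key property Adam argued for.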

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] RFC: New Storage API

2012-12-04 Thread Adam Litke
 information (like
Volume.getInfo)?  (I see some more info below...)

 All operations return once the operation has been committed to disk, NOT when 
 the operation actually completes.
 This is done so that:
 - operations come to a stable state as quickly as possible.
 - In case where there is an SDM, only small portion of the operation actually 
 needs to be performed on the SDM host.
 - No matter how many times the operation fails and on how many hosts, you can 
 always resume the operation and choose when to do it.
 - You can stop an operation at any time and remove the resulting object, 
 making a distinction between stopping because the host is overloaded and I 
 don't want that image
 
 This means that after calling any operation that creates a new image the user 
 must then call getImageStatus() to check what is the status of the image.
 The status of the image can be either optimized, degraded, or broken.
 Optimized means that the image is available and you can run VMs of it.
 Degraded means that the image is available and will run VMs, but there might be 
 a better way for VDSM to represent the underlying data. 
 Broken means that the image can't be used at the moment, probably because 
 not all the data has been set up on the volume.
 
 Apart from that, VDSM will also return the last persisted status information, 
 which will contain:
 hostID - the last host to try and optimize or fix the image
 stage - X/Y (eg. 1/10) the last persisted stage of the fix.

Do you have some examples of what the stages would be?  I think these should be
defined in enums so that the user can check on what the individual stages mean.
What happens when the low level implementation of an operation changes?  The
meaning of the stages will change completely.

 percent_complete - -1 or 0-100, the last persisted completion percentage of 
 the aforementioned stage. -1 means that no progress is available for that 
 operation.

 last_error - This will only be filled if the operation failed because of 
 something other than IO or a VDSM crash, for obvious reasons.
  It will usually be set if the task was manually stopped
 
 The user can either be satisfied with that information, or ask the host 
 specified in hostID whether it is still working on that image by checking its 
 running tasks.
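From the caller's side, the "commit fast, poll for convergence" model above amounts to a loop like the following. The verb name `getImageStatus`, the state names, and the `stage`/`percent_complete` fields follow the proposal, but the exact signatures are assumptions about an API that was still being designed:

```python
import time


def wait_for_image(api, repo_id, img_id, poll=5, timeout=600):
    """Poll the image status until it leaves the 'broken' state.

    Per the proposal, 'optimized' and 'degraded' images are both usable;
    only 'broken' means the caller must keep waiting (or run a mend Fix).
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = api.getImageStatus(repo_id, img_id)
        if status['state'] in ('optimized', 'degraded'):
            return status
        # Still broken: report the last persisted progress, if any.
        pct = status.get('percent_complete', -1)
        print('image %s: stage %s, %s%%' % (img_id, status.get('stage'), pct))
        time.sleep(poll)
    raise TimeoutError('image %s still broken after %ss' % (img_id, timeout))
```

This also shows why the stage values would need to be stable, documented enums: a client that surfaces `stage` to users breaks silently if the low-level implementation renumbers its steps.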
 
 checkStorageRepository(self, repositoryId, options={}):
 A method to go over a storage repository and scan for any existing problems. 
 This includes degraded\broken images and deleted images that have not yet been 
 physically deleted\merged.
 It returns a list of Fix objects.
 Fix objects come in 4 types:
 clean - cleans data, run them to get more space.
 optimize - run them to optimize a degraded image

What is an example of a degraded image?

 merge - Merges two images together. Doing this sometimes
 makes more images ready for optimizing or cleaning.
 The reason it is different from optimize is that
 unmerged images are considered optimized.
 mend - mends a broken image

What does this mean?

 The user can read these types and prioritize fixes. Fixes also contain opaque 
 FIX data and they should be sent as received to
 fixStorageRepository(self, repositoryId, fix, options={}):
 
 That will start a fix operation.

Could we have an automatic fix mode where vdsm just does the right thing (for
most things)?

 All major operations automatically start the appropriate Fix to bring the 
 created object to an optimize\degraded state (the one that is quicker) unless 
 one of the options is
 AutoFix=False. This is only useful for repos that might not be able to create 
 volumes on all hosts (SDM) but would like to have the actual IO distributed 
 in the cluster.
 
 Other common options is the strategy option:
 It has currently 2 possible values
 space and performance - In case VDSM has 2 ways of completing the same 
 operation it will tell it to value one over the other. For example, whether 
 to copy all the data or just create a qcow based of a snapshot.
 The default is space.

I like this a lot.

 You might have also noticed that it is never explicitly specified where to 
 look for existing images. This is done purposefully, VDSM will always look in 
 all connected repositories for existing objects.
 For very large setups this might be problematic. To mitigate the problem you 
 have these options:
 participatingRepositories=[repoId, ...] which tell VDSM to narrow the search 
 to just these repositories
 and
 imageHints={imgId: repoId} which will force VDSM to look for those image ID 
 just in those repositories and fail if it doesn't find them there.

I would like to have a better way of specifying these optional parameters
without burying them in an options structure.  I will think a little more about
this.  Strategy can just be a two optional flags in a 'flags' argument.  For the
participatingRepositories and imageHints options, I think we need to use real
parameters.
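To make that last suggestion concrete, here is one possible signature shape: strategy as a flags argument, and the search-narrowing hints promoted from the opaque options dict to real optional parameters. Every name here is illustrative — this builds a request dict purely to show the shape, not any actual vdsm verb:

```python
# Hypothetical strategy flags, replacing options={'strategy': 'space'}.
STRATEGY_SPACE = 0x1
STRATEGY_PERFORMANCE = 0x2


def copy_image(dst_repo_id, img_id, flags=STRATEGY_SPACE,
               participating_repos=None, image_hints=None):
    """Build an illustrative request; a real verb would dispatch it.

    participating_repos=None means "search all connected repositories",
    matching the proposal's default behavior; image_hints pins specific
    image IDs to specific repositories and fails if they are not there.
    """
    return {
        'dstRepo': dst_repo_id,
        'imgId': img_id,
        'strategy': ('performance' if flags & STRATEGY_PERFORMANCE
                     else 'space'),
        'participatingRepositories': participating_repos,
        'imageHints': image_hints or {},
    }
```

The gain is discoverability: the optional knobs show up in the API signature (and hence in generated docs and schema checks) instead of being buried in an untyped dict.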

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-12-03 Thread Adam Litke
On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi
  smizr...@redhat.com Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg
  dan...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Ayal
  Baron aba...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent:
  Thursday, November 29, 2012 5:22:43 PM Subject: Re: RFD: API: Identifying
  vdsm objects in the next-gen API
  
  On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
   They are not future proof as the paradigm is completely different.
   Storage domain IDs are not static any more (and are not guaranteed to be
   unique or the same across the cluster.  Image IDs represent the ID of the
   projected data and not the actual unique path.  Just as an example, to run
   a VM you give a list of domains that might contain the needed images in
   the chain and the image ID of the tip.  The paradigm is changed to and
   most calls get non synchronous number of images and domains.  Further
   more, the APIs themselves are completely different. So future proofing is
   not really an issue.
  
  I don't understand this at all.  Perhaps we could all use some education on
  the architecture of the planned architectural changes.  If I can pass an
  arbitrary list of domainIDs that _might_ contain the data, why wouldn't I
  just pass all of them every time?  In that case, why are they even required
  since vdsm would have to search anyway?
 It's for optimization mostly; the engine usually has a good idea of where
 stuff is, and having it give hints to VDSM can speed up the search process.
 Also, the engine knows how transient some storage pieces are. If you have a
 domain that is only there for backup or owned by another manager sharing the
 host, you don't want your VMs using the disks that are on that storage,
 effectively preventing it from being removed (though we do have plans to have
 qemu switch base snapshots at runtime for just that).

This is not a clean design.  If the search is slow, then maybe we need to
improve caching internally.  Making a client cache a bunch of internal IDs to
pass around sounds like a complete layering violation to me.

  
   As to making the current API a bit simpler. As I said, making them opaque
   is problematic as currently the engine is responsible for creating the
   IDs.
  
  As I mentioned in my last post, engine still can specify the ID's when the
  object is first created.  From that point forward the ID never changes so it
  can be baked into the identifier.
 Where will this identifier be persisted?
  
   Further more, some calls require you to play with these (making a template
   instead of a snapshot).  Also, the full chain and topology needs to be
   completely visible to the engine.
  
  Please provide a specific example of how you play with the IDs.  I can guess
  where you are going, but I don't want to divert the thread.
 The relationship between volumes and images is deceptive at the moment.  IMG
 is the chain and volume is a member, IMGUUID is only used for verification
 and to detect when we hit a template going up the chain.  When you do
 operation on images assumptions are being guaranteed about the resulting IDs.
 When you copy an image, you assume to know all the new IDs as they remain the
 same.  With your method I can't tell what the new opaque result is going to
 be.  Preview mode (another abomination being deprecated) relies on the
 disconnect between imgUUID and volUUID.  Live migration currently moves a lot
 of the responsibility to the engine.

No client should need to know about all of these internal details.  I understand
that's the way it is today, and that's one of the main reasons that the API is a
complete pain to use.

  
   These things, as you said, are problematic. But this is the way things are
   today.
  
  We are changing them.
 Any intermediary step is needlessly problematic for existing clients.  Work is
 already in progress for fixing the API properly, making some calls a bit nicer
 isn't an excuse to start making more compatibility code in the engine.

The engine won't need compatibility code.  This only would impact the jsonrpc
bindings which aren't used by engine yet.  When engine switches over, then yes
it would need to adapt.

  
   As for task IDs.  Currently task IDs are only used for storage and they
   get persisted to disk. This is WRONG and is not the case with the new
   storage API.  Because we moved to an asynchronous message based protocol
   (json-rpc over TCP\AMQP) there is no need to generate a task ID. it is
   built in to json-rpc.  json-rpc specifies that the IDs have to be unique
   for a client as long as the request is still active.  This is good enough
   as internally we can have a verb for a client to query it's own running
   tasks and a verb to query other host tasks by mangling in the client
   before the ID.  Because the protocol is
  
  So

Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-12-03 Thread Adam Litke
On Mon, Dec 03, 2012 at 03:57:42PM -0500, Saggi Mizrahi wrote:
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi
  smizr...@redhat.com Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg
  dan...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Ayal
  Baron aba...@redhat.com, vdsm-devel@lists.fedorahosted.org Sent: Monday,
  December 3, 2012 3:30:21 PM Subject: Re: RFD: API: Identifying vdsm objects
  in the next-gen API
  
  On Thu, Nov 29, 2012 at 05:59:09PM -0500, Saggi Mizrahi wrote:
   
   
   - Original Message -
From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi
smizr...@redhat.com Cc: engine-de...@linode01.ovirt.org, Dan
Kenigsberg dan...@redhat.com, Federico Simoncelli
fsimo...@redhat.com, Ayal Baron aba...@redhat.com,
vdsm-devel@lists.fedorahosted.org Sent: Thursday, November 29, 2012
5:22:43 PM Subject: Re: RFD: API: Identifying vdsm objects in the
next-gen API

On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
 They are not future proof as the paradigm is completely different.
 Storage domain IDs are not static any more (and are not guaranteed to
 be unique or the same across the cluster.  Image IDs represent the ID
 of the projected data and not the actual unique path.  Just as an
 example, to run a VM you give a list of domains that might contain the
 needed images in the chain and the image ID of the tip.  The paradigm
 is changed to and most calls get non synchronous number of images and
 domains.  Further more, the APIs themselves are completely different.
 So future proofing is not really an issue.

I don't understand this at all.  Perhaps we could all use some education
on the architecture of the planned architectural changes.  If I can pass
an arbitrary list of domainIDs that _might_ contain the data, why
wouldn't I just pass all of them every time?  In that case, why are they
even required since vdsm would have to search anyway?
   It's for optimization mostly; the engine usually has a good idea of where
   stuff is, and having it give hints to VDSM can speed up the search process.
   Also, the engine knows how transient some storage pieces are. If you
   have a domain that is only there for backup or owned by another manager
   sharing the host, you don't want your VMs using the disks that are on that
   storage, effectively preventing it from being removed (though we do have
   plans to have qemu switch base snapshots at runtime for just that).
  
  This is not a clean design.  If the search is slow, then maybe we need to
  improve caching internally.  Making a client cache a bunch of internal IDs
  to pass around sounds like a complete layering violation to me.
 You can't cache this; if the same template exists on 2 different NFS
 domains, only the engine has enough information to know which you should use.
 We only have the engine give us this information when starting a VM or
 merging\copying an image that resides on multiple domains.  It is also
 completely optional. I didn't like it either.

Is it even valid for the same template (with identical uuids) to exist in two
places?  I thought uuids aren't supposed to collide.  I can envision some
scenario where a cached storagedomain/storagepool relationship becomes invalid
because another user detached the storagedomain.  In that case, the API just
returns the normal error about sd XXX is not attached to sp XXX.  So I don't
see any problem here.

  

 As to making the current API a bit simpler. As I said, making them
 opaque is problematic as currently the engine is responsible for
 creating the IDs.

As I mentioned in my last post, engine still can specify the ID's when
the object is first created.  From that point forward the ID never
changes so it can be baked into the identifier.
   Where will this identifier be persisted?

 Further more, some calls require you to play with these (making a
 template instead of a snapshot).  Also, the full chain and topology
 needs to be completely visible to the engine.

Please provide a specific example of how you play with the IDs.  I can
guess where you are going, but I don't want to divert the thread.
   The relationship between volumes and images is deceptive at the moment.
    IMG is the chain and volume is a member, IMGUUID is only used for
   verification and to detect when we hit a template going up the chain.
   When you do operation on images assumptions are being guaranteed about the
   resulting IDs.  When you copy an image, you assume to know all the new IDs
   as they remain the same.  With your method I can't tell what the new
   opaque result is going to be.  Preview mode (another abomination being
   deprecated) relies on the disconnect between imgUUID and volUUID.  Live
   migration currently moves a lot of the responsibility to the engine.
  
  No client

Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

2012-11-29 Thread Adam Litke
On Thu, Nov 29, 2012 at 10:00:12AM +0200, Dan Kenigsberg wrote:
 On Wed, Nov 28, 2012 at 03:29:35PM -0600, Adam Litke wrote:
  On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:
   
   
   - Original Message -
From: Dan Kenigsberg dan...@redhat.com
To: Alon Bar-Lev alo...@redhat.com
Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, 
engine-devel engine-de...@ovirt.org, users
us...@ovirt.org
Sent: Wednesday, November 28, 2012 10:39:42 PM
Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
 
   No... we need it as compatibility with older engines...
   We keep minimum changes there for legacy, until end-of-life.
  
  Is there an EoL statement for oVirt-3.1?
  We can make sure that oVirt-3.2's vdsm installs properly with
  ovirt-3.1's vdsm-bootstrap, or even require that Engine must be
  upgraded
  to ovirt-3.2 before upgrading any of the hosts. Is it too harsh
  to
  our
  vast install base?  us...@ovirt.org, please chime in!
 
 
 I tried to find such, but the more I dig I find that we need to
 support old legacy.

Why, exactly? Fedora gives no such guarntees (heck, I'm stuck with an
unupgradable F16). Should we be any better than our (currently
single)
platform?
   
   We should start and detach from specific distro procedures.
   

 
  * legacy-removed: change machine width core file
   # echo /var/lib/vdsm/core  /proc/sys/kernel/core_pattern

Yeah, qemu-kvm and libvirtd are much more stable than in the
old
days,
but wouldn't we want to keep a means to collect the corpses
of
dead
processes from hypervisors? It has helped us nail down nasty
bugs,
even
in Python.
   
   It does not mean it should be at /var/lib/vdsm ... :)
  
  I don't get the joke :-(. If you mind the location, we can think
  of
  somewhere else to put the core dumps. Would it be hard to
  reinstate a
  parallel feature in otopi?
 
 I usually do not make any jokes...
 A global system setting should not go into package specific
 location.
 Usually core dumps are off by default, I like this approach as
 unattended system may fast consume all disk space because of
 dumps.

If a host fills up with dumps so quickly, it's a sign that it should
not
be used for production, and that someone should look into the cores.
(P.S. we have a logrotate rule for them in vdsm)
   
   There should be a vdsm-debug-aids (or similar) to perform such changes.
    Again, I don't think vdsm should (by default) modify any system-wide 
    parameter such as this.
    But I will be happy to hear more views.
  
  I agree with your statement above that a single package should not override 
  a
  global system setting.  We should really work to remove as many of these 
  from
  vdsm as we possibly can.  It will help to make vdsm a much 
  safer/well-behaved
  package.
 
 I'm fine with dropping these from vdsm, but I think they are good for
 ovirt - we would like to (be able to) enfornce policy on our nodes.
 
 If configuring core dumps is removed from vdsm, it should go somewhere
 else, or our log-collector users would miss their beloved dumps.

Yes, I agree.  From my point of view the plan was to do the following:

1. Remove unnecessary system configuration changes.  This includes things like
Royce's supervdsm startup process patch (and accompanying sudo-supervdsm
conversions) which allows us to remove some of the sudo configuration.

2. Isolate the remaining tweaks into vdsm-tool.

3. Provide a service/program that can be run to configure a system to work in an
ovirt-engine controlled cluster.

Doing this allows vdsm to be safely installed on any system as a basic
prerequisite for other software.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] RFD: API: Identifying vdsm objects in the next-gen API

2012-11-29 Thread Adam Litke
On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:
 They are not future proof as the paradigm is completely different.  Storage
 domain IDs are not static any more (and are not guaranteed to be unique or the
 same across the cluster.  Image IDs represent the ID of the projected data and
 not the actual unique path.  Just as an example, to run a VM you give a list
 of domains that might contain the needed images in the chain and the image ID
  of the tip.  The paradigm has changed, and most calls take a variable number
  of images and domains.  Furthermore, the APIs themselves are
  completely different. So future-proofing is not really an issue.

I don't understand this at all.  Perhaps we could all use some education on the
architecture of the planned architectural changes.  If I can pass an arbitrary
list of domainIDs that _might_ contain the data, why wouldn't I just pass all
of them every time?  In that case, why are they even required since vdsm would
have to search anyway?

 As to making the current API a bit simpler. As I said, making them opaque is
 problematic as currently the engine is responsible for creating the IDs.

As I mentioned in my last post, engine still can specify the ID's when the
object is first created.  From that point forward the ID never changes so it can
be baked into the identifier.

 Further more, some calls require you to play with these (making a template
 instead of a snapshot).  Also, the full chain and topology needs to be
 completely visible to the engine.

Please provide a specific example of how you play with the IDs.  I can guess
where you are going, but I don't want to divert the thread.

 These things, as you said, are problematic. But this is the way things are
 today.

We are changing them.

 As for task IDs.  Currently task IDs are only used for storage and they get
 persisted to disk. This is WRONG and is not the case with the new storage API.
 Because we moved to an asynchronous message based protocol (json-rpc over
 TCP\AMQP) there is no need to generate a task ID. it is built in to json-rpc.
 json-rpc specifies that the IDs have to be unique for a client as long as the
 request is still active.  This is good enough as internally we can have a verb
  for a client to query its own running tasks and a verb to query other host
 tasks by mangling in the client before the ID.  Because the protocol is

So this would rely on the client keeping the connection open and as soon as it
disconnects it would lose the ability to query tasks from before the connection
went down?  I don't know if it's a good idea to conflate message ID's with task
ID's.  While the protocol can operate asynchronously, some calls have
synchronous semantics and others have asynchronous semantics.  I would expect
sync calls to return their data immediately and async calls to return
immediately with either: an error code, or an 'operation started' message and
associated ID for querying the status of the operation.

 asynchronous, all calls are asynchronous by nature as well.  Tasks will no longer
 be persisted or expected to be persisted. It's the caller's responsibility to
 query the state and see if the operation succeeded or failed if the caller or
 VDSM died in the middle of the call. The current cleanTask() system can't be
 used when more than one client is using VDSM and will not be used for anything
 other than legacy storage.

I agree about not persisting tasks in the future.  Although I think finished
tasks should remain in memory for some time so they can be queried by a client
who must reconnect.
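The in-memory retention idea can be sketched as a small registry: finished tasks stay queryable for a grace period, nothing is written to disk, and expiry happens lazily on lookup. The class and its API are hypothetical, not proposed vdsm code:

```python
import threading
import time
import uuid


class TaskRegistry(object):
    """Keep finished tasks in memory for `retention` seconds so a client
    that reconnects can still query their outcome; no on-disk persistence."""

    def __init__(self, retention=300):
        self._lock = threading.Lock()
        self._tasks = {}  # task_id -> (result, finished_at or None)
        self._retention = retention

    def start(self):
        tid = str(uuid.uuid4())
        with self._lock:
            self._tasks[tid] = (None, None)  # still running
        return tid

    def finish(self, tid, result):
        with self._lock:
            self._tasks[tid] = (result, time.time())

    def query(self, tid):
        """Return (result, finished_at), or None if unknown/expired."""
        with self._lock:
            self._expire()
            return self._tasks.get(tid)

    def _expire(self):
        # Lazily drop finished tasks older than the retention window.
        now = time.time()
        for tid, (_, done) in list(self._tasks.items()):
            if done is not None and now - done > self._retention:
                del self._tasks[tid]
```

A registry like this would let the "query my own running tasks" verb work across reconnects without reintroducing the persisted-task machinery the thread is arguing against.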

 AFAIK Apart from storage all objects IDs are constructed with a single ID,
 name or alias. VMs, storageConnections, network interfaces. So it's not a real
 issue.  I agree that in the future we should keep the idiom of pass
 configuration once, name it, and keep using the name to reference the object.

Yes, storage is the major problem here.

 - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Saggi Mizrahi smizr...@redhat.com
  Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, 
  Federico Simoncelli
  fsimo...@redhat.com, Ayal Baron aba...@redhat.com, 
  vdsm-devel@lists.fedorahosted.org
  Sent: Thursday, November 29, 2012 4:18:40 PM
  Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API
  
  On Thu, Nov 29, 2012 at 02:16:42PM -0500, Saggi Mizrahi wrote:
   This is all only valid for the current storage API the new one
   doesn't have
   pools or volumes. Only domains and images.  Also, images and
   domains are more
   loosely coupled and make this method problematic.
  
  I am looking for an incremental way to bridge the differences.  It's
  been 2
  years and we still don't have the revamped storage API so I am
  planning on what
  we have being around for awhile :)  I think that defining object
  identifiers as
  opaque structured types is also future proof.  In the future an
  Image-ng object
  we can drop 'storagepoolID

Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

2012-11-28 Thread Adam Litke
On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:
 
 
 - Original Message -
  From: Dan Kenigsberg dan...@redhat.com
  To: Alon Bar-Lev alo...@redhat.com
  Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, 
  engine-devel engine-de...@ovirt.org, users
  us...@ovirt.org
  Sent: Wednesday, November 28, 2012 10:39:42 PM
  Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
  
  On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:
   
 No... we need it as compatibility with older engines...
 We keep minimum changes there for legacy, until end-of-life.

Is there an EoL statement for oVirt-3.1?
We can make sure that oVirt-3.2's vdsm installs properly with
ovirt-3.1's vdsm-bootstrap, or even require that Engine must be
upgraded
to ovirt-3.2 before upgrading any of the hosts. Is it too harsh
to
our
vast install base?  us...@ovirt.org, please chime in!
   
   
   I tried to find such, but the more I dig I find that we need to
   support old legacy.
  
  Why, exactly? Fedora gives no such guarntees (heck, I'm stuck with an
  unupgradable F16). Should we be any better than our (currently
  single)
  platform?
 
 We should start and detach from specific distro procedures.
 
  
   
* legacy-removed: change machine width core file
 # echo /var/lib/vdsm/core  /proc/sys/kernel/core_pattern
  
  Yeah, qemu-kvm and libvirtd are much more stable than in the
  old
  days,
  but wouldn't we want to keep a means to collect the corpses
  of
  dead
  processes from hypervisors? It has helped us nail down nasty
  bugs,
  even
  in Python.
 
 It does not mean it should be at /var/lib/vdsm ... :)

I don't get the joke :-(. If you mind the location, we can think
of
somewhere else to put the core dumps. Would it be hard to
reinstate a
parallel feature in otopi?
   
   I usually do not make any jokes...
   A global system setting should not go into package specific
   location.
   Usually core dumps are off by default, I like this approach as
   unattended system may fast consume all disk space because of
   dumps.
  
  If a host fills up with dumps so quickly, it's a sign that it should
  not
  be used for production, and that someone should look into the cores.
  (P.S. we have a logrotate rule for them in vdsm)
 
 There should be a vdsm-debug-aids (or similar) to perform such changes.
  Again, I don't think vdsm should (by default) modify any system-wide 
  parameter such as this.
  But I will be happy to hear more views.

I agree with your statement above that a single package should not override a
global system setting.  We should really work to remove as many of these from
vdsm as we possibly can.  It will help to make vdsm a much safer/well-behaved
package.

 
  
   If sysadmin manually enables dumps, he may do this at a location of
   his own choice.
  
  Note that we've just swapped hats: you're arguing for letting a local
  admin log in and mess with system configuration, and I'm for keeping
  a
  centralized feature for storing and collecting core dumps.
 
  Problems like crashes are investigated per case and reproduction scenario.
  But again, I may be wrong and we should have a VDSM API command to start/stop 
  storing dumps and manage this via its master...

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

2012-11-27 Thread Adam Litke
On Tue, Nov 27, 2012 at 10:42:00AM +0200, Livnat Peer wrote:
 On 26/11/12 16:59, Adam Litke wrote:
  On Mon, Nov 26, 2012 at 02:57:19PM +0200, Livnat Peer wrote:
  On 26/11/12 03:15, Shu Ming wrote:
  Livnat,
 
  Thanks for your summary.  I got comments below.
 
  2012-11-25 18:53, Livnat Peer:
  Hi All,
   We have been discussing $subject for a while and I'd like to summarize
  what we agreed and disagreed on thus far.
 
  The way I see it there are two related discussions:
 
 
  1. Getting VDSM networking stack to be distribution agnostic.
  - We are all in agreement that VDSM API should be generic enough to
  incorporate multiple implementation. (discussed on this thread: Alon's
  suggestion, Mark's patch for adding support for netcf etc.)
 
  - We would like to maintain at least one implementation as the
  working/up-to-date implementation for our users, this implementation
  should be distribution agnostic (as we all acknowledge this is an
  important goal for VDSM).
  I also think that with the agreement of this community we can choose to
  change our focus, from time to time, from one implementation to another
  as we see fit (today it can be OVS+netcf and in a few months we'll use
  the quantum based implementation if we agree it is better)
 
  2. The second discussion is about persisting the network configuration
  on the host vs. dynamically retrieving it from a centralized location
  like the engine. Danken raised a concern that even if going with the
  dynamic approach the host should persist the management network
  configuration.
 
  About dynamical retrieving from a centralized location,  when will the
  retrieving start? Just in the very early stage of host booting before
  network functions?  Or after the host startup and in the normal running
  state of the host?  Before retrieving the configuration,  how does the
  host network connecting to the engine? I think we need a basic well
  known network between hosts and the engine first.  Then after the
  retrieving, hosts should reconfigure the network for later management.
  However, the timing to retrieve and reconfigure are challenging.
 
 
  We did not discuss the dynamic approach in details on the list so far
  and I think this is a good opportunity to start this discussion...
 
  From what was discussed previously I can say that the need for a well
  known network was raised by danken, it was referred to as the management
  network, this network would be used for pulling the full host network
  configuration from the centralized location, at this point the engine.
 
  About the timing for retrieving the configuration, there are several
  approaches. One of them was described by Alon, and I think he'll join
  this discussion and maybe put it in his own words, but the idea was to
   'keep' the network synchronized at all times. When the host has a
   communication channel to the engine and the engine detects there is a
  mismatch in the host configuration, the engine initiates 'apply network
  configuration' action on the host.
 
  Using this approach we'll have a single path of code to maintain and
  that would reduce code complexity and bugs - That's quoting Alon Bar-Lev
  (Alon, I hope I did not twist your words/idea).
 
  On the other hand the above approach makes local tweaks on the host
  (done manually by the administrator) much harder.
  
  I worry a lot about the above if we take the dynamic approach.  It seems we'd
  need to introduce before/after 'apply network configuration' hooks where the
  admin could add custom config commands that aren't yet modeled by engine.
  
 
 yes, and I'm not sure the administrators would like the fact that we are
 'forcing' them to write everything in a script and get familiar with the
 VDSM hooking mechanism (which in some cases requires the use of custom
 properties on the engine level) instead of running a simple command line.
 
  Any other approaches?
  
  Static configuration has the advantage of allowing a host to bring itself
  back online independent of the engine.  This is also useful for anyone who
  may want to deploy a vdsm node in standalone mode.
  
  I think it would be possible to easily support a quasi-static configuration
  mode simply by extending the design of the dynamic approach slightly.  In
  dynamic mode, the network configuration is passed down as a well-defined
  data structure.  When a particular configuration has been committed, vdsm
  could write a copy of that configuration data structure to
  /var/run/vdsm/network-config.json.  During a subsequent boot, if the engine
  cannot be contacted after activating the management network, the cached
  configuration can be applied using the same code as for dynamic mode.  We'd
  have to flesh out the circumstances under which this would happen.
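 The quasi-static fallback described above can be sketched in a few lines.
 This is illustrative only: the cache path comes from the paragraph above,
 but the function names and the engine/apply callbacks are hypothetical, not
 actual vdsm API.

```python
import json
import os

# Path suggested in the thread for the cached copy of the committed config.
CACHE_PATH = "/var/run/vdsm/network-config.json"


def commit_config(config, apply_fn, cache_path=CACHE_PATH):
    """Apply a network configuration pushed by the engine and cache it."""
    apply_fn(config)                       # same code path as dynamic mode
    with open(cache_path, "w") as f:
        json.dump(config, f)


def boot_config(fetch_from_engine, apply_fn, cache_path=CACHE_PATH):
    """On boot, prefer the engine copy; fall back to the cached one."""
    try:
        config = fetch_from_engine()
    except IOError:                        # engine unreachable
        if not os.path.exists(cache_path):
            return None                    # no cache: stay on the management network
        with open(cache_path) as f:
            config = json.load(f)
    apply_fn(config)
    return config
```

 On a healthy boot the engine copy wins; the cache is only read when the
 engine cannot be contacted after the management network comes up.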
 
 I like this approach a lot but we need to consider that network
 configuration is an accumulated state, for example -
 
 1. The engine sends a setup

Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

2012-11-27 Thread Adam Litke
On Mon, Nov 26, 2012 at 06:13:01PM -0500, Alon Bar-Lev wrote:
 Hello,
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com To: Alon Bar-Lev alo...@redhat.com
  Cc: Livnat Peer lp...@redhat.com, VDSM Project Development
  vdsm-devel@lists.fedorahosted.org Sent: Tuesday, November 27, 2012
  12:51:36 AM Subject: Re: [vdsm] Future of Vdsm network configuration -
  Thread mid-summary
  
  Nice writeup!  I like where this is going but see my comments inline below.
  
  On Mon, Nov 26, 2012 at 03:18:22PM -0500, Alon Bar-Lev wrote:
   
   
   - Original Message -
From: Livnat Peer lp...@redhat.com To: Shu Ming
shum...@linux.vnet.ibm.com Cc: Alon Bar-Lev abar...@redhat.com,
VDSM Project Development vdsm-devel@lists.fedorahosted.org Sent:
Monday, November 26, 2012 2:57:19 PM Subject: Re: [vdsm] Future of Vdsm
network configuration - Thread mid-summary

On 26/11/12 03:15, Shu Ming wrote:
 Livnat,
 
 Thanks for your summary.  I got comments below.
 
 2012-11-25 18:53, Livnat Peer:
 Hi All, We have been discussing $subject for a while and I'd like to
 summarized what we agreed and disagreed on thus far.

 The way I see it there are two related discussions:


 1. Getting VDSM networking stack to be distribution agnostic.  - We
 are all in agreement that VDSM API should be generic enough to
 incorporate multiple implementation. (discussed on this thread:
 Alon's suggestion, Mark's patch for adding support for netcf etc.)

 - We would like to maintain at least one implementation as the
 working/up-to-date implementation for our users, this implementation
 should be distribution agnostic (as we all acknowledge this is an
 important goal for VDSM).  I also think that with the agreement of
 this community we can choose to change our focus, from time to time,
 from one implementation to another as we see fit (today it can be
 OVS+netcf and in a few months we'll use the quantum based
 implementation if we agree it is better)

 2. The second discussion is about persisting the network
 configuration on the host vs. dynamically retrieving it from a
 centralized location like the engine. Danken raised a concern that
 even if going with the dynamic approach the host should persist the
 management network configuration.
 
 About dynamical retrieving from a centralized location,  when will the
 retrieving start? Just in the very early stage of host booting before
 network functions?  Or after the host startup and in the normal
 running state of the host?  Before retrieving the configuration,  how
 does the host network connecting to the engine? I think we need a
 basic well known network between hosts and the engine first.  Then
 after the retrieving, hosts should reconfigure the network for later
 management.  However, the timing to retrieve and reconfigure are
 challenging.
 

We did not discuss the dynamic approach in details on the list so far
and I think this is a good opportunity to start this discussion...

From what was discussed previously I can say that the need for a well
known network was raised by danken, it was referred to as the management
network, this network would be used for pulling the full host network
configuration from the centralized location, at this point the engine.

About the timing for retrieving the configuration, there are several
approaches. One of them was described by Alon, and I think he'll join
this discussion and maybe put it in his own words, but the idea was to
'keep' the network synchronized at all times. When the host has a
communication channel to the engine and the engine detects there is a
mismatch in the host configuration, the engine initiates an 'apply network
configuration' action on the host.

Using this approach we'll have a single path of code to maintain and
that would reduce code complexity and bugs - That's quoting Alon Bar-Lev
(Alon, I hope I did not twist your words/idea).

On the other hand the above approach makes local tweaks on the host
(done manually by the administrator) much harder.

Any other approaches ?

I'd like to add a more general question to the discussion what are the
advantages of taking the dynamic approach?  So far I collected two
reasons:

-It is a 'cleaner' design, removes complexity on VDSM code, easier to
maintain going forward, and less bug prone (I agree with that one, as
long as we keep the retrieving configuration mechanism/algorithm
simple).

-It adheres to the idea of having a stateless hypervisor - some more
input on this point would be appreciated

Any other advantages?

discussing the benefits of having the persisted

Livnat

   
   Sorry for the delay. Some more

Re: [vdsm] [RFC]about the implement of text-based console

2012-11-27 Thread Adam Litke
-starter for me.
 
 3. Extend Spice to support console
 Is it possible to implement a spice client that can be run in pure text
 mode without a GUI environment? If we extend the protocol to support a
 console stream but the client must be run in a GUI, it will be less
 useful.
 
 pros
   No new VMs and server process, easy for maintenance.
 cons
   Must wait for Spice developers to commit the support.
   Need a special client program in CLI; the user may prefer an existing
 client program like ssh. It is not a big problem because this feature
 can be put into the oVirt shell.

Can someone familiar with spice weigh in on whether a console connection as
described here could survive a live migration?  In general, I really like this
approach if it can be done cleanly.  Spice is already oVirt's primary end-user
application so in a deployed environment, we'd expect users to already have this
program.  If a scripted interface is required, I am sure that I/O redirection
could be added either to the existing spice client or as part of a new
spice-console program.  This approach also works with a vdsm that is connected
to ovirt-engine or running in standalone mode.

This seems like the best approach to me so long as the spice team agrees that it
can and should be done.

 4. oVirt shell - Engine - libvirtd
 This is the current workaround described in
 
 http://wiki.ovirt.org/wiki/Features/Serial_Console_in_CLI#Currently_operational_workaround
 
 The design is good but I do not like Engine talking to libvirtd
 directly, thus comes the VDSM console streaming API below.
 
 Work to do
   Provide console streaming API from Engine to be invoked in oVirt shell.
   Implement the serial-console command in oVirt shell.
 
 pros
   Support migration. Engine can reconnect to the guest automatically
 after migration while keeping the connection from oVirt shell.
   Fit well in the current oVirt architecture: no new server process
 introduced, no new VM introduced, easy to maintain and manage.
 cons
   Engine talking to libvirtd directly breaks the encapsulation of VDSM.
   Users only can get the console stream from Engine, they can not
 directly connect to the host as VNC and the above two sshd solutions
 do.

I agree that this is a layering violation and should not be pursued as the
long-term solution.  We do not want to expose the libvirt connection outside of
the host.

 5. VDSM console streaming API
 Implement new APIs in VDSM to forward the raw data from console. It
 exposes getConsoleReadStream() and getConsoleWriteStream() via
 XMLRPC binding. Then Engine can get the console data stream via API
 instead of directly connecting to libvirtd. Other things will be the
 same as solution 4.
 
 Work to do
   Implement getConsoleReadStream() and getConsoleWriteStream() in VDSM.
   Provide console streaming API from Engine to be invoked in oVirt shell.
   Implement the serial-console command in oVirt shell.
   Optional: Implement a client program in vdsClient to consume the
 stream API.
 
 pros
   Same as solution 4
 cons
   We cannot allow an ordinary user to connect directly to VDSM and invoke
 the stream API, because there is no ACL in VDSM; once a client cert
 is set up for the ordinary user, he can call all the APIs in VDSM and
 get total control. So the ordinary user can only get the stream from
 Engine, and we leave the ACL to Engine.

One issue that was raised is console buffering.  What happens if a client does
not call getConsoleReadStream() fast enough?  Will characters be dropped?  This
could create a reliability problem and would make scripting against this
interface risky at best.
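To make the buffering concern concrete: if vdsm kept only a bounded buffer per
console between getConsoleReadStream() calls, a slow client would silently lose
output. A minimal sketch of that failure mode (the class and its names are
hypothetical, not a proposed vdsm interface):

```python
from collections import deque


class ConsoleBuffer:
    """Bounded per-VM console buffer; the oldest bytes drop on overflow."""

    def __init__(self, limit=4096):
        self._buf = deque(maxlen=limit)  # maxlen evicts oldest items silently
        self.dropped = 0

    def feed(self, data):
        """Append console output, counting bytes a slow reader will lose."""
        for byte in data:
            if len(self._buf) == self._buf.maxlen:
                self.dropped += 1        # this byte pushes out an unread one
            self._buf.append(byte)

    def read(self):
        """Drain whatever survived since the last poll."""
        data = bytes(self._buf)
        self._buf.clear()
        return data
```

A reliable scripting interface would need either flow control (blocking the
guest's pty writes) or an unbounded buffer, each with its own costs.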

 
 I like solution 4 best.

I will note again for others that you mentioned you like #5 (console streaming
API) best.  I think the spice approach is best based on weighing the following
requirements:

1. Simple and easy to maintain
2. Can access via the host or ovirt-engine
3. Scripting mode is possible
4. Reliable

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

2012-11-26 Thread Adam Litke
 of persistent
configuration is:

- To allow the host to operate independently of the engine in either a failure
  scenario or in a standalone configuration. 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Future of Vdsm network configuration - Thread mid-summary

2012-11-26 Thread Adam Litke
 on hosts serving as hypervisors has the flexibility
 argument. However, at mass deployment in a large data center or a dynamic
 environment, this flexibility argument becomes a liability.

Today oVirt plays in the small data center realm so I do think it's important to
give appropriate weight to the flexibility argument.  It should be possible to
build different environments based on the needs of the deployment.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Future of Vdsm network configuration

2012-11-14 Thread Adam Litke
On Wed, Nov 14, 2012 at 11:53:06AM +0200, Livnat Peer wrote:
 On 14/11/12 00:28, Adam Litke wrote:
  On Sun, Nov 11, 2012 at 09:46:43AM -0500, Alon Bar-Lev wrote:
 
 
  - Original Message -
  From: Dan Kenigsberg dan...@redhat.com
  To: vdsm-de...@fedorahosted.org
  Sent: Sunday, November 11, 2012 4:07:30 PM
  Subject: [vdsm] Future of Vdsm network configuration
 
  Hi,
 
  Nowadays, when vdsm receives the setupNetwork verb, it mangles
  /etc/sysconfig/network-scripts/ifcfg-* files and restarts the network
  service, so they are read by the responsible SysV service.
 
  This is very much Fedora-oriented, and not up with the new themes
  in Linux network configuration. Since we want oVirt and Vdsm to be
  distribution agnostic, and support new features, we have to change.
 
  setupNetwork is responsible for two different things:
  (1) configure the host networking interfaces, and
  (2) create virtual networks for guests and connect them to the world
  over (1).
 
  Functionality (2) is provided by building Linux software bridges, and
  vlan devices. I'd like to explore moving it to Open vSwitch, which
  would
  enable a host of functionalities that we currently lack (e.g.
  tunneling). One thing that worries me is the need to reimplement our
  config snapshot/recovery on ovs's database.
 
  As far as I know, ovs is unable to maintain host level parameters of
  interfaces (e.g. eth0's IPv4 address), so we need another
  tool for functionality (1): either speak to NetworkManager directly,
  or
  to use NetCF, via its libvirt virInterface* wrapper.
 
  I have minor worries about NetCF's breadth of testing and usage; I
  know
  it is intended to be cross-platform, but unlike ovs, I am not aware
  of a
  wide Debian usage thereof. On the other hand, its API is ready for
  vdsm's
  usage for quite a while.
 
  NetworkManager has become ubiquitous, and we'd better integrate with
  it
  better than our current setting of NM_CONTROLLED=no. But as DPB tells
  us,
  https://lists.fedorahosted.org/pipermail/vdsm-devel/2012-November/001677.html
  we'd better offload integration with NM to libvirt.
 
  We would like to take Network configuration in VDSM to the next level
  and make it distribution agnostic in addition for setting the
  infrastructure for more advanced features to be used going forward.
  The path we think of taking is to integrate with OVS and for feature
  completeness use NetCF, via its libvirt virInterface* wrapper. Any
  comments or feedback on this proposal is welcomed.
 
  Thanks to the oVirt net team members whose input has helped in writing
  this email.
 
  Hi,
 
  As far as I see this, network manager is a monster that is a huge 
  dependency
  to have just to create bridges or configure network interfaces... It is 
  true
  that on a host where network manager lives it would be not polite to define
  network resources not via its interface, however I don't like we force 
  network
  manager.
 
  libvirt is long not used as virtualization library but system management
  agent, I am not sure this is the best system agent I would have chosen.
 
  I think that all the terms and building blocks got lost in time... and the
  result integration became more and more complex.
 
  Stabilizing such multi-layered component environment is much harder than
  monolithic environment.
 
  I would really want to see vdsm as monolithic component with full control 
  over
  its resources, I believe this is the only way vdsm can be stable enough to 
  be
  production grade.
 
  Hypervisor should be a total slave of manager (or cluster), so I have no
  problem in bypassing/disabling any distribution specific tool in favour of
  atoms (brctl, iproute), in non persistence mode.
 
  I know this derive some more work, but I don't think it is that complex to
  implement and maintain.
 
  Just my 2 cents...
  
  I couldn't disagree more.  What you are suggesting requires that we
  reimplement every single networking feature in oVirt by ourselves.  If we
  want to support the (absolutely critical) goal of being distro agnostic,
  then we need to implement the same functionality across multiple distros
  too.  This is more work than we will ever be able to keep up with.  If you
  think it's hard to stabilize the integration of an external networking
  library, imagine how hard it will be to stabilize our own rewritten and
  buggy version.  This is not how open source is supposed to work.  We should
  be assembling distinct, modular, pre-existing components together when they
  are available.  If NetworkManager has integration problems, let's work
  upstream to fix them.  If its dependencies are too great, let's modularize
  it so we don't need to ship the parts that we don't need.
  
 
 I agree with Adam on this one, reimplementing the networking management
 layer by ourselves using only atoms seems like duplication of work that
 was already done and available for our use both by NM

Re: [vdsm] Review needed: 3.2 release feature -- libvdsm

2012-11-08 Thread Adam Litke
On Tue, Nov 06, 2012 at 05:49:22PM +0200, Dan Kenigsberg wrote:
 On Mon, Oct 29, 2012 at 10:20:04AM -0500, Adam Litke wrote:
  Hi everyone,
  
  libvdsm is listed as a release feature for 3.2 (preview only)[1][2].  There
  is a set of patches up in gerrit that could use a wide review from the
  community.
  The plan is to merge the new json-rpc server[3] first so if you could
  concentrate your reviews there it would yield the greatest benefit.  Thanks!
  
  [1] http://wiki.ovirt.org/wiki/OVirt_3.2_release-management
  [2] http://wiki.ovirt.org/wiki/Features/libvdsm
  [3] http://gerrit.ovirt.org/#/c/8614/
 
 [3] defines the format of each message as
 
 <size><json-data>
 
 where size is a binary value, used to split a (tcp) stream into
 messages. I would like to consider another splitting scheme, which I
 find better suited to the textual nature of jsonrpc: terminate each
 message with the newline character. It makes the protocol easier to
 sniff and debug (in case you've missed part of a message).
 
 The downside is that we would need to require clients to
 escape literal newlines, and unescape them in responses (both are done
 by python's json module, and the latter is part of the json standard).
 

Thanks for bringing up this point.  I would like to make this protocol
compatible with existing clients.  Is there a standard for segmenting messages
over the channel?  I suppose it depends on the transport layer.
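For reference, the size-prefixed scheme under discussion can be sketched as
follows. The 4-byte big-endian length prefix is an assumption for illustration,
not necessarily the exact format in the patch under review:

```python
import json
import struct


def send_msg(sock, obj):
    """Frame a JSON-RPC message as a 4-byte big-endian length plus payload."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    chunks = b""
    while len(chunks) < n:
        chunk = sock.recv(n - len(chunks))
        if not chunk:
            raise EOFError("connection closed mid-message")
        chunks += chunk
    return chunks


def recv_msg(sock):
    """Read one length-prefixed message; no streaming JSON parser needed."""
    (size,) = struct.unpack("!I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, size))
```

The receiver never has to scan the payload for a delimiter, which is the
coding-efficiency point Tony makes further down the thread.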


-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Review needed: 3.2 release feature -- libvdsm

2012-11-08 Thread Adam Litke
On Tue, Nov 06, 2012 at 11:40:53AM -0600, Tony Asleson wrote:
 On 11/06/2012 09:49 AM, Dan Kenigsberg wrote:
  On Mon, Oct 29, 2012 at 10:20:04AM -0500, Adam Litke wrote:
  Hi everyone,
 
  libvdsm is listed as a release feature for 3.2 (preview only)[1][2].  
  There is a
  set of patches up in gerrit that could use a wide review from the 
  community.
  The plan is to merge the new json-rpc server[3] first so if you could
  concentrate your reviews there it would yield the greatest benefit.  
  Thanks!
 
  [1] http://wiki.ovirt.org/wiki/OVirt_3.2_release-management
  [2] http://wiki.ovirt.org/wiki/Features/libvdsm
  [3] http://gerrit.ovirt.org/#/c/8614/
  
  [3] defines the format of each message as
  
  <size><json-data>
  
  where size is a binary value, used to split a (tcp) stream into
  messages. I would like to consider another splitting scheme, which I
  find better suited to the textual nature of jsonrpc: terminate each
  message with the newline character. It makes the protocol easier to
  sniff and debug (in case you've missed part of a message).
  
  The downside is that we would need to require clients to
  escape literal newlines, and unescape them in responses (both are done
  by python's json module, and the latter is part of the json standard).
 
 I use json-rpc for IPC in libStoragemgmt (out of process plug-ins) with
 unix domain sockets.  I adopted the <size><json-data> model as well*.
 
 I chose this because it allows the use of non-stream capable json
 parsers.  I wanted to ensure that the transport and protocol would be
 language and parser agnostic.
 
 You could achieve the message separation with new lines as you suggest,
 but then you may have to parse the message stream twice.  Once to find
 the message delimiter and once again to parse the json itself, depending
 on json parser.  Having the size at the beginning of the message is
 incredibly convenient from a coding efficiency standpoint.
 
 As for debug, I just log the message payload if needed.  I haven't had
 the need to use a packet trace, but I'm not sure having a single newline
 separating messages would be obvious in a single frame capture?
 
 Would it be possible to compromise and keep the length and add the
 newline at the end?  So <size><payload><newline>?  You could then pass
 the message payload to the parser without having to escape the
 newlines?

Thanks for weighing in on this!  If you use the size and newline, how do you
account for the newline char in the size value?  It seems unnecessary to me to
include this character, since you can use a combination of logfile analysis
and scanning the data stream for 'id': to find method calls and responses.
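The newline-delimited alternative works because a standards-conforming JSON
encoder escapes literal newlines inside strings, so a raw newline on the wire
can only be a message terminator. A sketch, assuming Python's json module on
both ends:

```python
import json


def encode_line(obj):
    """One message per line; json.dumps escapes any newline inside strings."""
    return json.dumps(obj) + "\n"


def decode_stream(data):
    """Split a buffered stream on newlines and parse each message."""
    return [json.loads(line) for line in data.splitlines() if line]
```

Note the trade-off Tony raises: the receiver must scan every byte for the
delimiter before it can hand a complete message to the parser.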


-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] Review needed: 3.2 release feature -- libvdsm

2012-10-29 Thread Adam Litke
Hi everyone,

libvdsm is listed as a release feature for 3.2 (preview only)[1][2].  There is a
set of patches up in gerrit that could use a wide review from the community.
The plan is to merge the new json-rpc server[3] first so if you could
concentrate your reviews there it would yield the greatest benefit.  Thanks!

[1] http://wiki.ovirt.org/wiki/OVirt_3.2_release-management
[2] http://wiki.ovirt.org/wiki/Features/libvdsm
[3] http://gerrit.ovirt.org/#/c/8614/
-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] New component 'mom' added to Bugzilla

2012-10-25 Thread Adam Litke
Hi all,

MOM is becoming a bigger part of oVirt and unfortunately it may have bugs at
some point :(  Thanks to Yaniv we have a new 'mom' component in oVirt's bugzilla
where you can report these.

To file a new bug against MOM: 
https://bugzilla.redhat.com/enter_bug.cgi?product=oVirt;component=mom

Thanks!

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] vdsm API schema

2012-10-22 Thread Adam Litke
On Sun, Oct 21, 2012 at 09:09:52AM +0200, Itamar Heim wrote:
 On 07/15/2012 03:12 AM, Adam Litke wrote:
 For the past few weeks I have been working on creating a schema that fully
 describes the vdsm API.  I am mostly finished with that effort and I wanted
 to share the results with the team.  Attached are two files: the raw schema
 and an html document with cross-linked type information.
 
 This should already be useful in its current form, but I have bigger plans.
 I would first like to get help to correct errors in the schema.  Then, I will
 start the process of writing a code generator that will create C/gObject code
 that we can compile into a libvdsm with language bindings for python, java,
 etc.
 
 Please take a look at the attached files and let me know what you think?
 
 P.S.  I tried to attach these to the oVirt Wiki, but they are not permitted
 file types.
 
 Hi Adam,
 
 that's quite a big scheme to review.
 have you thought about ways to solicit inputs for it?
 (maybe schedule per topic reviews of the new scheme/api for VM
 operations (virt), network (host level, vm level), storage, sla
 policy, etc.)?

For the first pass, we are trying to replicate the current API as much as
possible.  For subsequent refactoring, I'd expect the discussions to occur in
the community around the patches that are implementing the proposed changes.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Change in vdsm[master]: schema: New type VmParameters

2012-10-18 Thread Adam Litke
On Thu, Oct 18, 2012 at 02:12:58AM -0400, lvro...@linux.vnet.ibm.com wrote:
 Royce Lv has posted comments on this change.
 
 Change subject: schema: New type VmParameters
 ..
 
 
 Patch Set 4:
 
 Adam,
  I saw 3 kinds of vm related description spread in the code:
 
 (1)vm.conf: query via vm.status()(vm.py), used as return value when changed 
 vm's conf(changeCD, hotplugDisk)
 
 (2)vm.stats: query via vm.getStats()(vm.py), which is VM's live stats and 
 used when calling getVmStats.(also used by MOM)
 
 (3)vm.parameter: parameter passed to vm.create().(API.py)
 
 You are trying to split (3) from (1); but to me, the live info should be
 split as (2) from (1).

To me, VmDefinition contains the hardware properties of the VM (things like
devices, amount of memory, number of cpus).  It also contains things that can
only be known at runtime (VNC display port, device bus information (if not
specified in advance), current cdrom disk, etc).  VmStatistics are different
because they are measured (network activity, cpu usage, etc).  VmParameters is
like a streamlined VmDefinition where we remove items that cannot be specified
at create time.
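That split can be illustrated with hypothetical payloads. Every field name
below is made up for illustration and is not taken from the actual schema:

```python
# VmParameters: only what a caller may specify at create time.
vm_parameters = {
    "vmId": "a5c1f1f0-0000-0000-0000-000000000001",
    "memSize": 2048,   # MiB
    "smp": 2,          # number of vcpus
}

# VmDefinition: the create-time parameters plus runtime-assigned properties
# (display port, current cdrom, ...) that cannot be specified in advance.
vm_definition = dict(vm_parameters, displayPort=5900, cdrom="fedora.iso")

# VmStatistics: measured values, kept separate from the definition.
vm_statistics = {"cpuUser": 1.5, "rxRate": 0.2, "txRate": 0.1}

# Every create-time parameter appears in the definition, but not vice versa.
runtime_only = set(vm_definition) - set(vm_parameters)
```

So VmParameters is a streamlined VmDefinition, while VmStatistics lives in a
different object entirely, matching points (1)-(3) above.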

 --
 To view, visit http://gerrit.ovirt.org/7839
 To unsubscribe, visit http://gerrit.ovirt.org/settings
 
 Gerrit-MessageType: comment
 Gerrit-Change-Id: I00d1b9aed55cbfc2210c1a4091bce17d45b90e67
 Gerrit-PatchSet: 4
 Gerrit-Project: vdsm
 Gerrit-Branch: master
 Gerrit-Owner: Adam Litke a...@us.ibm.com
 Gerrit-Reviewer: Adam Litke a...@us.ibm.com
 Gerrit-Reviewer: Federico Simoncelli fsimo...@redhat.com
 Gerrit-Reviewer: Royce Lv lvro...@linux.vnet.ibm.com
 Gerrit-Reviewer: Saggi Mizrahi smizr...@redhat.com
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] vdsClient error after reinstalling vdsm

2012-10-17 Thread Adam Litke
I have found that I cannot run vdsClient from within the vdsm source tree.  Is
it possible that this is the problem you see as well?  Perhaps after rebooting
you logged in and were in a different directory?

On Wed, Oct 17, 2012 at 10:53:04AM -0400, Laszlo Hornyak wrote:
 Hi!
 
 This is a low priority problem. Each time I reinstall vdsm from rpm, I get 
 this error when running vdsClient:
 
 Traceback (most recent call last):
   File /usr/lib64/python2.6/runpy.py, line 122, in _run_module_as_main
 __main__, fname, loader, pkg_name)
   File /usr/lib64/python2.6/runpy.py, line 34, in _run_code
 exec code in run_globals
   File /usr/share/vdsm/vdsClient.py, line 28, in module
 from vdsm import vdscli
 ImportError: cannot import name vdscli
 
 And after a reboot it works fine again. Very strange behavior. Does anyone
 know how to make it work without a reboot?
 
 Thx,
 Laszlo

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] [RFC]about the implement of text-based console

2012-10-15 Thread Adam Litke
On Mon, Oct 15, 2012 at 04:40:00AM -0400, Dan Yasny wrote:
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: Zhou Zheng Sheng zhshz...@linux.vnet.ibm.com
  Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org
  Sent: Friday, 12 October, 2012 3:10:57 PM
  Subject: Re: [vdsm] [RFC]about the implement of text-based console
  
  On Fri, Oct 12, 2012 at 04:55:20PM +0800, Zhou Zheng Sheng wrote:
   
   on 09/04/2012 22:19, Ryan Harper wrote:
   * Dan Kenigsberg dan...@redhat.com [2012-09-04 05:53]:
   On Tue, Sep 04, 2012 at 03:05:37PM +0800, Xu He Jie wrote:
   On 09/03/2012 10:33 PM, Dan Kenigsberg wrote:
   On Thu, Aug 30, 2012 at 04:26:31PM -0500, Adam Litke wrote:
   On Thu, Aug 30, 2012 at 11:32:02AM +0800, Xu He Jie wrote:
   Hi,
   
  I submited a patch for text-based console
   http://gerrit.ovirt.org/#/c/7165/
   
   the issue I want to discussing as below:
   1. fix port VS dynamic port
   
   Use fix port for all VM's console. connect console with 'ssh
   vmUUID@ip -p port'.
   Distinguishing VM by vmUUID.
   
   
  The current implement was vdsm will allocated port for
  console
   dynamically and spawn sub-process when VM creating.
   In sub-process the main thread responsible for accept new
   connection
   and dispatch output of console to each connection.
   When new connection is coming, main processing create new
   thread for
   each new connection. Dynamic port will allocated
   port for each VM and use range port. It isn't good for
   firewall rules.
   
   
  so I got a suggestion that use fix port. and connect
  console with
   'ssh vmuuid@hostip -p fixport'. this is simple for user.
   We need one process for accept new connection from fix port
   and when
   new connection is coming, spawn sub-process for each vm.
   But because the console only can open by one process, main
   process
   need responsible for dispatching console's output of all vms
   and all
   connection.
   So the code will be a little complex then dynamic port.
   
  So this is dynamic port VS fix port and simple code VS
  complex code.
   From a usability point of view, I think the fixed port

Re: [vdsm] [RFC]about the implement of text-based console

2012-10-12 Thread Adam Litke
On Fri, Oct 12, 2012 at 04:55:20PM +0800, Zhou Zheng Sheng wrote:
 
 on 09/04/2012 22:19, Ryan Harper wrote:
 * Dan Kenigsberg dan...@redhat.com [2012-09-04 05:53]:
 On Tue, Sep 04, 2012 at 03:05:37PM +0800, Xu He Jie wrote:
 On 09/03/2012 10:33 PM, Dan Kenigsberg wrote:
 On Thu, Aug 30, 2012 at 04:26:31PM -0500, Adam Litke wrote:
 On Thu, Aug 30, 2012 at 11:32:02AM +0800, Xu He Jie wrote:
 Hi,
 
I submited a patch for text-based console
 http://gerrit.ovirt.org/#/c/7165/
 
 the issues I want to discuss are below:
 1. fixed port vs. dynamic port
 
 Use a fixed port for all VMs' consoles. Connect to a console with 'ssh
 vmUUID@ip -p port'.
 Distinguish VMs by vmUUID.
 
 
The current implementation is that vdsm allocates a port for the console
 dynamically and spawns a sub-process when the VM is created.
 In the sub-process the main thread is responsible for accepting new connections
 and dispatching the console output to each connection.
 When a new connection comes in, the main process creates a new thread for
 it. With dynamic ports, a port is allocated
 for each VM from a port range. That isn't good for firewall rules.
 
 
 So I got a suggestion to use a fixed port, and to connect to the console with
 'ssh vmuuid@hostip -p fixport'. This is simpler for the user.
 We need one process to accept new connections on the fixed port and, when
 a new connection comes in, spawn a sub-process for each VM.
 But because the console can only be opened by one process, the main process
 needs to be responsible for dispatching the console output of all VMs to all
 connections.
 So the code will be a little more complex than with dynamic ports.
 
So this is dynamic port vs. fixed port, and simple code vs. complex code.
 From a usability point of view, I think the fixed port suggestion is 
 nicer.
 This means that a system administrator needs only to open one port to 
 enable
 remote console access.  If your initial implementation limits console 
 access to
 one connection per VM would that simplify the code?
 Yes, using a fixed port for all consoles of all VMs seems like a cooler
 idea. Besides the firewall issue, there's user experience: instead of
 calling getVmStats to tell the vm port, and then use ssh, only one ssh
 call is needed. (Taking this one step further - it would make sense to
 add another layer on top, directing console clients to the specific host
 currently running the Vm.)
 
 I did not take a close look at your implementation, and did not research
 this myself, but have you considered using sshd for this? I suppose you
 can configure sshd to collect the list of known users from
 `getAllVmStats`, and force it to run a command that redirects VM's
 console to the ssh client. It has a potential of being a more robust
 implementation.
 I have considered using sshd and an ssh tunnel. They
 can't implement a fixed port and a shared console.
 Would you elaborate on that? Usually sshd listens to a fixed port 22,
 and allows multiple users to have independent shells. What do you mean by
 share console?
 
 With the current implementation
 we can do anything that we want.
 Yes, it is completely under our control, but there are down sides, too:
 we have to maintain another process, and another entry point, instead of
 configuring a universally-used, well maintained and debugged
 application.
 Think of the security implications of having another remote shell
 access point to a host.  I'd much rather trust sshd if we can make it
 work.
 
 
 Dan.
 
 At first glance, the standard sshd on the host is stronger and more
 robust than a custom ssh server, but the risk using the host sshd is
 high. If we implement this feature via the host sshd, when a hacker
 attacks the sshd successfully, he will get access to the host shell.
 After all, the custom ssh server is not for accessing host shell,
 but just for forwarding the data from the guest console (a host
 /dev/pts/X device). If we just use a custom ssh server, the code in
 this server only does 1. auth, 2. data forwarding, when the hacker
 attacks, he just gets access to that virtual machine. Notice that
 there is no code written about login to the host in the custom ssh
 server, and the custom ssh server can be protected under selinux,
 only allowing it to access /dev/pts/X.
 
 In fact using a custom VNC server in qemu is as risky as a custom
 ssh server in vdsm. If we accept the former one, then I can accept
 the latter one. The consideration is how robust the custom ssh
 server is, and the difficulty of maintaining it. In He Jie's current
 patch, the ssh auth and transport library is an open-source
 third-party project, unless the project is well maintained and well
 proven, using it can be risky.
 
 So my opinion is using neither the host sshd, nor a custom ssh
 server. Maybe we can apply the suggestion from Dan Yasny, running a
 standard sshd in a very small VM in every host, and forward data
 from this VM to other guest consoles. The ssh part is in the VM,
 then our work is just forwarding data from the VM via virto serial
 channels, to the guest via the pty.

I really
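
To make the dispatcher idea in this thread concrete, here is a minimal sketch: one listener on a fixed port, console output fanned out to every connected viewer, and viewer input fed back to the guest pty. Everything here is illustrative, not vdsm code — plain TCP stands in for the ssh layer, there is a single VM, and auth and error handling are omitted.

```python
import os
import select
import socket

def serve_console(master_fd, port):
    """Fan output from one pty master out to every connected TCP client,
    and feed client input back to the pty.  Sketch only: plain TCP (no
    ssh framing or auth), a single VM, no error handling."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(5)
    clients = []
    while True:
        ready, _, _ = select.select([srv, master_fd] + clients, [], [])
        for fd in ready:
            if fd is srv:
                conn, _ = srv.accept()          # a new console viewer
                clients.append(conn)
            elif fd == master_fd:
                data = os.read(master_fd, 4096)
                for c in clients:               # broadcast console output
                    c.sendall(data)
            else:
                data = fd.recv(4096)
                if data:
                    os.write(master_fd, data)   # keystrokes go to the guest
                else:
                    clients.remove(fd)
                    fd.close()
```

The real proposal would wrap the accept path in ssh authentication keyed on the vmUUID login name; the fan-out loop is the part that stays the same whichever transport wins the thread's debate.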

Re: [vdsm] Mom Balloon policy issue

2012-10-09 Thread Adam Litke
Thanks for writing this.  Some thoughts inline, below.  Also, cc'ing some lists
in case other folks want to participate in the discussion.

On Tue, Oct 09, 2012 at 01:12:30PM -0400, Noam Slomianko wrote:
 Greetings,
 
 I've fiddled around with ballooning and wanted to raise a question for debate.
 
 Currently, as long as the host is under memory pressure, MOM will try to 
 reclaim memory from all guests with more free memory than a given 
 threshold.
 
 Main issue: Guest allocated memory is not the same as the resident (physical) 
 memory used by qemu.
 This means that when memory is reclaimed back (the balloon is inflated) we 
 might not get back as much memory as planned (or none at all).
 
  *Example1 no memory is reclaimed back:
 name | allocated memory | used by the vm | resident memory used in the 
 host by qemu
 Vm1  |   4G |   4G,  |4G
 Vm2  |   4G |   1G   |1G
  - MOM will inflate the balloon in vm2 (as vm1 has no free memory) and will 
 gain no memory

One thing to keep in mind is that VMs having less host RSS than their memory
allocation is a temporary condition.  All VMs will eventually consume their full
allocation if allowed to run.  I'd be curious to know how long this process
takes in general.

We might be able to handle this case by refusing to inflate the balloon if:
(VM free memory - planned balloon inflation) > host RSS
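
The comparison operator in the rule above was stripped in the archive; reading it as "refuse when the guest's free memory, less the planned inflation, still exceeds host RSS" is consistent with all three VMs in the examples. A sketch of that reading (function names and KiB units are illustrative, not MOM's actual API):

```python
def should_refuse_inflation(free_kib, host_rss_kib, planned_inflation_kib):
    """Sketch of the proposed guard.  Refuse a balloon inflation when the
    guest's free memory, minus the planned inflation, still exceeds the
    host RSS of its qemu process: in that case the pages we would take
    back were never resident, so the host gains nothing."""
    return free_kib - planned_inflation_kib > host_rss_kib

GiB = 1024 * 1024  # KiB per GiB
```

Checked against the examples: Vm1 (0 free, 4G RSS) is allowed, Vm2 (3G free, 1G RSS) is refused, and Vm3 (3G free, 4G RSS) is allowed, matching "gaining only from vm3" below.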


  *Example2 memory is reclaimed partially:
 name | allocated memory | used by the vm | resident memory used in the 
 host by qemu
 Vm1  |   4G |   4G,  |4G
 Vm2  |   4G |   1G   |1G
 Vm3  |   4G |   1G   |4G
  - MOM will inflate the balloon in vm2 and vm3 slowly gaining only from vm3

The above rule extension may help here too.

 this behaviour might cause us to:
  * spend time reclaiming memory from many guests when we can reclaim only 
 from a subgroup
  * be under the impression that we have more potential memory to reclaim than 
 we actually do
  * bring inactive VMs dangerously low as they are constantly reclaimed (I've 
 had guests crash with kernel out-of-memory errors)
 
 
 To address this I suggest that we collect guest memory stats from libvirt as 
 well, so we have the option to use them in our calculations.
 This can be achieved with the command 'virsh dommemstat domain', which 
 returns:
 actual 3915372 (allocated)
 rss 2141580 (resident memory used by qemu)

I would suggest adding these two fields to the VmStats that are collected by
vdsm.  Then, to try it out, add the fields to the GuestMemory Collector.  (Note:
MOM does have a collector that gathers RSS for VMs.  It's called GuestQemuProc).
You can then extend the Balloon policy to add a snippet to check if the proposed
balloon adjustment should be carried out.  You could add the logic to the
change_big_enough function.
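
The two counters quoted above are easy to pull out of the `virsh dommemstat` text; the helper below is only a sketch (a real MOM collector would more likely go through libvirt's python bindings than shell out to virsh):

```python
import subprocess

def parse_dommemstat(text):
    """Turn 'virsh dommemstat <domain>' output into a dict of KiB counters."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats

def dommemstat(domain):
    """Illustrative only: shell out to virsh and parse the result."""
    out = subprocess.check_output(["virsh", "dommemstat", domain], text=True)
    return parse_dommemstat(out)
```

With `actual` and `rss` available per VM, a guard like the one proposed earlier in the thread can be evaluated before each balloon adjustment.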

 additional topic:
  * should we include per guest config (for example a hard minimum memory cap, 
 this vm cannot run effectively with less than 1G memory)

Yes.  This is probably something we want to do.  There is a whole topic around
VM tagging that we should consider.  In the future we will want to be able to do
 many different things in policy based on a VM's tag.  For example, some VMs may
be completely exempt from ballooning.  Others may have a minimum limit.

I want to avoid passing in the raw guest configuration because MOM needs to work
with direct libvirt vms and with ovirt/vdsm vms.  Therefore, we want to think
carefully about the abstractions we use when presenting VM properties to MOM.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] API: Supporting internal/testing interfaces

2012-10-04 Thread Adam Litke
On Thu, Oct 04, 2012 at 09:22:31AM -0400, Federico Simoncelli wrote:
 - Original Message -
  From: Saggi Mizrahi smizr...@redhat.com
  To: dl...@redhat.com
  Cc: Federico Simoncelli fsimo...@redhat.com, 
  vdsm-devel@lists.fedorahosted.org
  Sent: Thursday, October 4, 2012 1:27:27 PM
  Subject: Re: [vdsm] API: Supporting internal/testing interfaces
  
  - Original Message -
   From: Dor Laor dl...@redhat.com
   To: Saggi Mizrahi smizr...@redhat.com
   Cc: Federico Simoncelli fsimo...@redhat.com,
   vdsm-devel@lists.fedorahosted.org
   Sent: Wednesday, October 3, 2012 10:16:26 PM
   Subject: Re: [vdsm] API: Supporting internal/testing interfaces
   
   On 10/03/2012 09:52 PM, Saggi Mizrahi wrote:
My personal preference is using the VDSM debug hook to inject
code
to a running VDSM and dynamically add whatever you want.
This means the code is part of the test and not VDSM.
   
   That's might be good for debugging/tracing but not for full
   functional
   tests. There are also better ways for dynamic tracing.
   
   
We used to use it (before the code rotted away) to add to VDSM
the
startCoverage() and endCoverage() verbs for tests.
   
Another option is having the code in an optional RPM (similar to
how debug hook is loaded only if it's installed)
   
I might also accept unpythonic things like conditional
compilation
   
Asking people nicely not to use a method that might corrupt their
data-center doesn't always work with good people not to mention
bad ones.
   
   Using -test devices/interfaces is a common practice. It's good to
   keep
   them live within the code base so they won't get rotten and any
   reasonable user is aware it's only a test api.
   
   Downstream can always compile it out before shipping.
  
  Conditional compilation is kind of awkward in python, but as I said I'll
  agree to have that as an option.
  From what I understand litke's proposal is having the bindings in a
  different RPM but I am actually talking about the server side code
  not being available or at least hooked up.
 
 I thought that the server side was modular too and Adam's proposal was
 a server side additional module that registers new verbs to expose.
 
  In any case, I personally like this being hard and tiresome to do
  because it makes living with bad design less tolerable.
 
 There are some things that are harder to test and debug no matter how
 you implement them. To see a single extension you have to start a vm
 and wait for the guest to fill the lv. A better design wouldn't change
 the fact that if you don't expose a verb you can't use it.
 
  In any case, I don't want new code to need to have special debug
  verbs, if you don't test a full VDSM you shouldn't need to have one
  running.
 
 Why do you think that one thing should exclude the other? Here we're talking
 about providing easier ways to test more (not less).

In a perfect world, the code that does LV extend would exist in an independent
class (one that doesn't depend on vdsm/hsm and can be tested with a simple,
standalone unit test).  Unfortunately, we do not live in a perfect world.  New
code should be testable in this way but we need something to test what we
already have.

We could always provide a debug rpm that enables a yet another binding for a
quick and dirty xmlrpc server.  This server would stick around even after the
normal BindingXMLRPC one is retired.  The debug server would have no API
formalization whatsoever and could be made pluggable so that test cases could be
easily dropped in.  This approach comes with just as many avenues of abuse as
the idea I had previously suggested.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] Change in vdsm[master]: Use 'yum clean expire-cache' instead of 'yum clean all'

2012-10-03 Thread Adam Litke
On Tue, Oct 02, 2012 at 07:39:32PM -0400, Ayal Baron wrote:


Ayal, thanks for your thorough treatment of this subject :)  I completely agree
with the framework that you have laid out here.  Hopefully, we can all come to
an agreement on a quasi-official project-wide policy based on this and then
place it up on the wiki in a gerrit workflow best practices document.

 
 First of all, in gerrit there is no immediately visible difference between 
 '0' and no review at all so someone might have serious issues with a patch 
 but if she did not mark it with -1 submitter might totally miss this fact.
 esp. if someone sent a new revision and the title of the cover comment for 
 previous version doesn't state a -1 (so maintainer doesn't know he needs to 
 go looking back to verify things were fixed).
 This is adding overhead on maintainer now to go back to each and every review 
 and make sure that there are no comments that should have been addressed in 0.
 Note that if someone gave a -1, normally I'd expect that person to make sure 
 and +1 a subsequent patch to flag to maintainer that all their problems with 
 the patch have been addressed.
 
 My take on this is:
 
 -2 - The approach taken to solve the problem is wrong and the whole thing 
 should either be abandoned or rewritten in a new way.  I can only accept this 
 though if the reviewer also suggests the alternative (i.e. just saying your 
 code isn't good is bad form imo).  e.g. stating things like 'circular 
 references is bad' and giving -2 but not suggesting alternatives and 
 explanations is bad form imo.
 
 -1 - I think there are some issues with the current patch that should be 
 addressed *prior* to merging it (bugs in the code that would affect many 
 people etc).  This would also include complex code which needs explaining (if 
 it's too complex for me to immediately understand then it's fine to delay the 
 merge until either a good explanation of why this is mandatory and of what 
 the code does is received, or a simplification of the code is submitted, or, 
 at worst, a comment is added in the code).
  -1 should only be given with proper explanation, otherwise imo it's bad form.
 
 0 - I have some *personal* style problems, questions which do not affect the 
 validity of the patch *or* I think there are some changes that should be made 
 but can definitely be done in a future patch and should not prevent merging 
 the current version.  Note that I find this very important to actually 
 improve our current way of working.  This means that if a patch improves 
 current code but could in itself be further improved, it is valid imo to 
 accept current version and ask committer to submit another patch to further 
 improve it.
 Note that this would include things like (e.g.) discussions about spacing 
 which are not enforced by pep8 tools (i.e. preventing a patch which fixes 
 bugs from going in because of a personal interpretation of pep8 about the 
 alignment of parameters in a function signature is wrong imo).
 
 +1 - I have reviewed the code and it looks correct to me but I'm not a 
 subject matter expert / the maintainer.
  Note that for things like style review only '+1' should be accompanied 
 by a cover commit message stating - +1 only for style as Doron has 
 mentioned previously on this thred.
 
 +2 - I am a subject matter expert, I have reviewed the code and it looks good 
 to me (solves the problem properly and no serious issues left with it).
 
 As Doron mentioned, in our group (storage) the standard is to have at least 2 
 reviews (by different people) before committing unless the patch is *really* 
 trivial.
 This means that I try to avoid giving +2 if no one else has given a +1 before.
 
 Alon, note that we apply this both to vdsm and engine.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] Review Request: libvdsm schema updates

2012-10-01 Thread Adam Litke
On Fri, Sep 28, 2012 at 08:29:03PM -0400, Keith Robertson wrote:
 What, no XSD(s)? :)

XSD is really only appropriate for XML documents and this API does not use XML.

 
 - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: vdsm-devel@lists.fedorahosted.org
  Sent: Friday, September 28, 2012 5:40:48 PM
  Subject: [vdsm] Review Request: libvdsm schema updates
  
  Hi vdsm developers!
  
  This is a plea for review of my libvdsm schema updates.  The patches
  I am asking
  for review on are found here:
  
  http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:schema,n,z
  
  These patches simply change some elements of the API schema that are
  needed by the
  actual libvdsm code.  I want to make some progress on this if
  possible so thanks
  in advance for taking a look.
  
  --
  Adam Litke a...@us.ibm.com
  IBM Linux Technology Center
  
  ___
  vdsm-devel mailing list
  vdsm-devel@lists.fedorahosted.org
  https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
  
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] Review Request: libvdsm schema updates

2012-09-28 Thread Adam Litke
Hi vdsm developers!

This is a plea for review of my libvdsm schema updates.  The patches I am asking
for review on are found here: 

http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:schema,n,z

These patches simply change some elements of the API schema that are needed by 
the
actual libvdsm code.  I want to make some progress on this if possible so thanks
in advance for taking a look.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [Engine-devel] is gerrit.ovirt.org down? eom

2012-09-12 Thread Adam Litke
So the fix is to just regularly restart gerrit?  Do we have any idea about the
real, underlying problem?

On Wed, Sep 12, 2012 at 11:56:44AM -0400, Eyal Edri wrote:
 
 
 - Original Message -
  From: Itamar Heim ih...@redhat.com
  To: Asaf Shakarchi ashak...@redhat.com
  Cc: Alon Bar-Lev alo...@redhat.com, Shireesh Anjal 
  san...@redhat.com, engine-de...@ovirt.org, VDSM Project
  Development vdsm-devel@lists.fedorahosted.org, Shu Ming 
  shum...@linux.vnet.ibm.com, Eyal Edri
  ee...@redhat.com
  Sent: Wednesday, September 12, 2012 6:34:56 PM
  Subject: Re: [Engine-devel] is gerrit.ovirt.org down? eom
  
  On 09/12/2012 06:23 PM, Asaf Shakarchi wrote:
   It happens from time to time, restart is required, Itamar only.
  
  restarted.
  eyal - can we make progress on the jenkins job with permission to
   more
  people to restart gerrit?
 
 the job is ready 
 http://jenkins.ovirt.org/view/system-monitoring/job/restart_gerrit_service
 but i need to have jenkins user access to gerrit server + sudo access to run 
 'service' restart... 
 
 it has access to www.ovirt.org but not to gerrit.ovirt.org. 
 
  others - please email infra on gerrit issues (well, me personally
  always
  help as well)
  
  
   - Original Message -
  
   Yes, I am experiencing this too...
  
   Itamar?
  
   - Original Message -
   From: Shu Ming shum...@linux.vnet.ibm.com
   To: Alon Bar-Lev alo...@redhat.com
   Cc: Shireesh Anjal san...@redhat.com, engine-de...@ovirt.org,
   VDSM Project Development
   vdsm-devel@lists.fedorahosted.org
   Sent: Wednesday, September 12, 2012 5:50:14 PM
   Subject: Re: [Engine-devel] is gerrit.ovirt.org down? eom
  
    It seems gerrit has gone down several times recently. Is there
   any
   special reason?
   于 2012-9-12 22:45, Alon Bar-Lev:
   yes.
  
   - Original Message -
   From: Shireesh Anjal san...@redhat.com
   To: engine-de...@ovirt.org
   Sent: Wednesday, September 12, 2012 5:43:35 PM
   Subject: [Engine-devel] is gerrit.ovirt.org down? eom
  
  
   ___
   Engine-devel mailing list
   engine-de...@ovirt.org
   http://lists.ovirt.org/mailman/listinfo/engine-devel
  
   ___
   Engine-devel mailing list
   engine-de...@ovirt.org
   http://lists.ovirt.org/mailman/listinfo/engine-devel
  
  
  
   --
   ---
   舒明 Shu Ming
   Open Virtualization Engineerning; CSTL, IBM Corp.
   Tel: 86-10-82451626  Tieline: 9051626 E-mail: shum...@cn.ibm.com
   or
   shum...@linux.vnet.ibm.com
   Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
   District, Beijing 100193, PRC
  
  
  
   ___
   Engine-devel mailing list
   engine-de...@ovirt.org
   http://lists.ovirt.org/mailman/listinfo/engine-devel
  
  
  
 
 ___
 vdsm-devel mailing list
 vdsm-devel@lists.fedorahosted.org
 https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] Patch review process

2012-09-11 Thread Adam Litke
On Sun, Sep 09, 2012 at 02:33:00PM -0400, Alon Bar-Lev wrote:
 
 
 - Original Message -
  From: Adam Litke a...@us.ibm.com
  To: vdsm-devel@lists.fedorahosted.org
  Cc: Ryan Harper ry...@linux.vnet.ibm.com, Anthony Liguori 
  aligu...@linux.vnet.ibm.com
  Sent: Sunday, September 9, 2012 8:27:30 PM
  Subject: [vdsm] Patch review process
  
  While discussing gerrit recently, I learned that some people use
  gerrit simply
  to host work-in-progress patches and they don't intend for those to
  be reviewed.
  How can a reviewer recognize this and skip those patches when
  choosing what to
  review?  Is there a way to mark certain patches as more important and
  others as
  drafts?
 
 Yes.
 
 See [1].
 
 $ git push upstream HEAD:refs/drafts/master/description

Thanks for pointing it out.  It would be nice if we could get people to start
pushing WIP patches to drafts now that we have this feature.

 
 [1] 
 http://gerrit-documentation.googlecode.com/svn/ReleaseNotes/ReleaseNotes-2.3.html
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


[vdsm] Jenkins build failure for change that adds build dependencies

2012-08-30 Thread Adam Litke
Hi,

My change, http://gerrit.ovirt.org/#/c/7516/ adds the following build
dependencies.  Since they are not installed on the system running patch
verification tests I am getting build failures.  Can we get these packages
installed on the testing host(s) please?

+BuildRequires: gobject-introspection-devel
+BuildRequires: glib2-devel
+BuildRequires: json-glib-devel
+BuildRequires: vala
+BuildRequires: libgee-devel

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm

2012-07-26 Thread Adam Litke
On Thu, Jul 26, 2012 at 11:47:51AM +0300, Itamar Heim wrote:
 On 07/17/2012 01:19 AM, Itamar Heim wrote:
 On 07/09/2012 09:52 PM, Saggi Mizrahi wrote:
 
 
 - Original Message -
 From: Itamar Heim ih...@redhat.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Adam Litke a...@us.ibm.com, vdsm-devel@lists.fedorahosted.org
 Sent: Monday, July 9, 2012 11:03:43 AM
 Subject: Re: [vdsm] [RFC] An alternative way to provide a supported
 interface -- libvdsm
 
 On 07/09/2012 05:56 PM, Saggi Mizrahi wrote:
 I don't think AMQP is a good low level supported protocol as it's a
 very complex protocol to set up and support.
 Also brokers are known to have their differences in standard
 implementation which means supporting them all is a mess.
 
 It looks like the most accepted route is the libvirt route of
 having a c library abstracting away client server communication
 and having more advanced consumers build protocol specific bridges
 that may have different support standards.
 
 On a more personal note, I think brokerless messaging is the way to
 go in ovirt because, unlike traditional clustering, worker nodes
 are not interchangeable so direct communication is the way to go,
 rendering brokers pretty much useless.
 
 but brokerless doesn't allow the multiple consumers which a bus provides?
 All consumers can connect to the host and *some* events can be
 broadcasted to all connected clients.
 
 The real question is whether you want to depend on AMQP's routing \
 message storing
 Also, if you find it preferable to have a centralized host (single
 point of failure) to get all events from all hosts for the price of
 some clients (I assume read only clients) not needing to know the
 locations of all worker nodes.
 But IMHO we already have something like that, it's called the
 ovirt-engine, and it could send aggregated events about the cluster
 (maybe with some extra enginy data).
 
 The question is whether mandating a broker gives us something that
 an AMQP bridge wouldn't.
 The only thing I can think of is vdsm can assume unmoderated vdsm to
 vdsm communication bypassing the engine.
 This means that VDSM can have some clustered behavior that requires no
 engine intervention.
 Further more, the engine can send a request and let the nodes decide
 who is performing the operation among themselves.
 
 Essentially:
 
 [  engine  ]  [  engine  ]
 | |  VS  |
 [vdsm][vdsm]  [  broker  ]
   | |
[vdsm][vdsm]
 
 *All links are two way links
 
 This has dire consequences on API usability and supportability. So we
 need to converge on that.
 
 There needs to be a good reason why the aforementioned logic code
 can't sit on a another ovirt specific entity (lets call it
 ovirt-dynamo) that uses VDSM's supported API but it's own APIs (or
 more likely messaging algorithms) are unsupported.
 
   [engine   ]
 |||
 |  [   broker   ] |
 |||   |
 [vdsm]-[dynamo] : [dynamo]-[vdsm]
  Host A  :  Host B
 
 *All links are two way links
 
 1. we have engine today 'in the path' to the history db. but it makes no
 sense for engine to be aware of each statistic we want to keep in the
 history db.
 same would be for an event/stats correlation service.
 they don't need to depend on each other for availability/redundancy.
 
 2. we are already looking at quantum integration, which is doing engine
 to nodes communication via amqp.
 
 3. with somewhat of a forward looking - moving some scheduling logic
 down to vdsm will probably mean we'll want one of the nodes to listen
 to statistics and state from the other nodes.
 
 to all of these, setting up a bus which allows multiple peer listeners
 seems more robust
 
 
 I'm still against developing a C level binding for amqp and rest
 support over a codebase which is in python.
 rest and amqp allow for both local and remote bindings in any language.
 C bindings should/could be a parallel implementation, but they seem
 like an unneeded overhead and complexity in the middle of the
 codebase.

Sure, it's probably possible to bind a REST or AMQP API in other languages but I
don't think there is an automatic way of doing it.  That means having to keep up
with maintenance of each and every binding every time the API changes.  If we
look at libvirt, they will say this is a large source of pain and one they have
recommended we avoid.

For the C/gobject approach, we write a single API schema file.  From that, we
automatically generate the C API and bindings.  Sure, the generation could be a
bit complex but much of it will be someone else's codebase (and one that is used
by lots of Gnome projects).
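
As a toy illustration of "write one schema, generate the C API from it": the schema shape and the `vds_` naming below are invented for this sketch; vdsm's real schema format and generator differ.

```python
# One hypothetical schema entry: a dotted verb name, a C return type,
# and a list of (argument name, C type) pairs.
SCHEMA = {
    "VM.getStats": {"returns": "GHashTable *",
                    "args": [("vmID", "const char *")]},
}

def c_prototype(name, desc):
    """Map a dotted schema verb to a C function prototype string."""
    fn = "vds_" + name.replace(".", "_").lower()
    args = ", ".join("%s%s" % (ctype, arg) for arg, ctype in desc["args"])
    return "%s%s(%s);" % (desc["returns"], fn, args)
```

The point of the approach is that the per-language bindings never need hand maintenance: gobject introspection consumes the generated C/gObject code, so the schema file stays the single source of truth.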

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm

2012-07-12 Thread Adam Litke
On Thu, Jul 12, 2012 at 08:11:17AM +0800, Shu Ming wrote:
 Basically,  my understanding is that we can generate two versions of
 libvdsm from the schema file for both the node and the management
 application.  First, the transportation protocols(XMLRPC, REST-API)
 will depend on libvdsm(node version) to export the APIs to remote
 management application.  Secondly, the management application can
 use libvdsm(application version ) to emit the remote call to the
 node.   Also, transport protocols like the REST API and XML-RPC API
 can be generated automatically from the schema file with C, Java,
 Python bindings.

I think this might be a bit too complex of a model.  Here's how I see it...

The schema generates C/gObject code which can be compiled into libvdsm.  We can
use the gObject introspection library to automatically generate language
bindings for Java, Python, Perl, etc.

The libvdsm library talks to vdsmd using a wire protocol that works locally and
remotely.  This wire protocol is completely hidden from library users.  It's an
implementation detail that can be changed later if necessary.  Today I would
recommend that we use xmlrpc.  This means that ovirt-engine or another remote
program could use libvdsm in the exact same manner as a local program.  The
library user just needs to call libvdsm.connect(uri).

Finally, REST and AMQP bridges would be written solely against libvdsm.  These
bridges are probably not suitable for code generation (but we can revisit that
as a separate issue because it's up to the bridge writer to determine the best
approach).
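
The layering described above — a stable client library in front of a replaceable wire protocol — can be sketched in a few lines. `VdsmConnection` and `connect` are hypothetical names for illustration, not the real libvdsm API:

```python
import xmlrpc.client

class VdsmConnection:
    """Hypothetical facade: callers never see the wire protocol."""

    def __init__(self, uri):
        # Today an xmlrpc transport; this is an implementation detail
        # that could be swapped later without touching library users.
        self._proxy = xmlrpc.client.ServerProxy(uri)

    def __getattr__(self, verb):
        # Delegate any API verb (getVmStats, list, ...) to the transport.
        return getattr(self._proxy, verb)

def connect(uri):
    return VdsmConnection(uri)
```

A REST or AMQP bridge would then be written against this facade alone, which is why the wire protocol underneath can change without breaking either the bridges or remote users like ovirt-engine.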

 
 On 2012-7-12 2:29, Saggi Mizrahi wrote:
 I'm sorry, but I don't really understand the drawing
 
 - Original Message -
 From: Shu Ming shum...@linux.vnet.ibm.com
 To: Adam Litke a...@us.ibm.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Sent: Wednesday, July 11, 2012 10:24:49 AM
 Subject: Re: [vdsm] [RFC] An alternative way to provide a supported 
 interface -- libvdsm
 
 Adam,
 
 Maybe,  I don't fully understand your proposal.  Here is my
 understanding of libvdsm in the picture. Please check the following
 link
 for the picture.
 
 http://www.ovirt.org/wiki/File:Libvdsm.JPG
 
 
 http://www.ovirt.org/wiki/File:Libvdsm.JPG
 
 On 2012-7-9 21:56, Adam Litke wrote:
 On Fri, Jul 06, 2012 at 03:53:08PM +0300, Itamar Heim wrote:
 On 07/06/2012 01:15 AM, Robert Middleswarth wrote:
 On 07/05/2012 04:45 PM, Adam Litke wrote:
 On Thu, Jul 05, 2012 at 03:47:42PM -0400, Saggi Mizrahi wrote:
 - Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project
 Development vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, July 5, 2012 2:34:50 PM
 Subject: Re: [RFC] An alternative way to provide a supported
 interface -- libvdsm
 
 On Wed, Jun 27, 2012 at 02:50:02PM -0400, Saggi Mizrahi wrote:
 The idea of having a supported C API was something I was
 thinking
 about doing
 (But I'd rather use gobject introspection and not schema
 generation) But the
 problem is not having a C API is using the current XML RPC
 API as
 it's base
 I want to dissect this a bit to find out exactly where there
 might be
 agreement
 and disagreement.
 
 C API is a good thing to implement - Agreed.
 
 I also want to use gobject introspection but I don't agree
 that using
 glib
 precludes the use of a formalized schema.  My proposal is that
 we
 write a schema
 definition and generate the glib C code from that schema.
 
 I agree that the _current_ xmlrpc API makes a pretty bad base
 from
 which to
 start a supportable API.  XMLRPC is a perfectly reasonable
 remote/wire protocol
 and I think we should continue using it as a base for the next
 generation API.
 Using a schema will ensure that the new API is
 well-structured.
 The major problems with XML-RPC (and to some extent with REST
 as
 well) are high call overhead and no two-way communication (push
 events). Basing on XML-RPC means that we will never be able to
 solve
 these issues.
 I am not sure I am ready to concede that XML-RPC is too slow for
 our
 needs.  Can
 you provide some more detail around this point and possibly
 suggest an
 alternative that has even lower overhead without sacrificing the
 ubiquity and
 usability of XML-RPC?  As far as the two-way communication
 point, what
 are the
 options besides AMQP/ZeroMQ?  Aren't these even worse from an
 overhead
 perspective than XML-RPC?  Regarding two-way communication: you
 can
 write AMQP
 brokers based on the C API and run one on each vdsm host.
   Assuming
 the C API
 supports events, what else would you need?
 I personally think that using something like AMQP for inter-node
 communication and engine - node would be optimal.  With a rest
 interface
 that just send messages though something like AMQP.
 I would also not dismiss AMQP so soon
 we want a bus with more than a single listener at the engine side
 (engine, history db, maybe event correlation service).
 collectd as a means

Re: [vdsm] [RFC] An alternative way to provide a supported interface -- libvdsm

2012-07-09 Thread Adam Litke
On Fri, Jul 06, 2012 at 03:53:08PM +0300, Itamar Heim wrote:
 On 07/06/2012 01:15 AM, Robert Middleswarth wrote:
 On 07/05/2012 04:45 PM, Adam Litke wrote:
 On Thu, Jul 05, 2012 at 03:47:42PM -0400, Saggi Mizrahi wrote:
 
 - Original Message -
 From: Adam Litke a...@us.ibm.com
 To: Saggi Mizrahi smizr...@redhat.com
 Cc: Anthony Liguori anth...@codemonkey.ws, VDSM Project
 Development vdsm-devel@lists.fedorahosted.org
 Sent: Thursday, July 5, 2012 2:34:50 PM
 Subject: Re: [RFC] An alternative way to provide a supported
 interface -- libvdsm
 
 On Wed, Jun 27, 2012 at 02:50:02PM -0400, Saggi Mizrahi wrote:
 The idea of having a supported C API was something I was thinking
 about doing
 (But I'd rather use gobject introspection and not schema
 generation) But the
 problem is not having a C API is using the current XML RPC API as
 it's base
 I want to dissect this a bit to find out exactly where there might be
 agreement
 and disagreement.
 
 C API is a good thing to implement - Agreed.
 
 I also want to use gobject introspection but I don't agree that using
 glib
 precludes the use of a formalized schema.  My proposal is that we
 write a schema
 definition and generate the glib C code from that schema.
 
 I agree that the _current_ xmlrpc API makes a pretty bad base from
 which to
 start a supportable API.  XMLRPC is a perfectly reasonable
 remote/wire protocol
 and I think we should continue using it as a base for the next
 generation API.
 Using a schema will ensure that the new API is well-structured.
 The major problems with XML-RPC (and to some extent with REST as
 well) are high call overhead and no two-way communication (push
 events). Basing on XML-RPC means that we will never be able to solve
 these issues.
 I am not sure I am ready to concede that XML-RPC is too slow for our
 needs.  Can
 you provide some more detail around this point and possibly suggest an
 alternative that has even lower overhead without sacrificing the
 ubiquity and
 usability of XML-RPC?  As far as the two-way communication point, what
 are the
 options besides AMQP/ZeroMQ?  Aren't these even worse from an overhead
 perspective than XML-RPC?  Regarding two-way communication: you can
 write AMQP
 brokers based on the C API and run one on each vdsm host.  Assuming
 the C API
 supports events, what else would you need?
 I personally think that using something like AMQP for inter-node
 communication and engine - node would be optimal.  With a rest interface
 that just send messages though something like AMQP.
 
 I would also not dismiss AMQP so soon
 we want a bus with more than a single listener at the engine side
 (engine, history db, maybe event correlation service).
 collectd as a means for statistics already supports it as well.
 I'm for having REST as well, but not sure as main one for a consumer
 like ovirt engine.

I agree that a message bus could be a very useful model of communication between
ovirt-engine components and multiple vdsm instances.  But the complexities and
dependencies of AMQP make it unsuitable for use as a low-level API.  AMQP
will repel new adopters.  Why not establish a libvdsm that is more minimalist
and can be easily used by everyone?  Then AMQP brokers can be built on top of
the stable API with ease.  All AMQP should require of the low-level API is
standard function calls and an events mechanism.
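To illustrate that claim, here is a rough sketch of an event bridge layered on a libvdsm-style events mechanism. All names here (EventSource, the in-memory bus) are illustrative stand-ins, not the real vdsm or AMQP interfaces; a real bridge would hand events to an AMQP broker instead of a list:

```python
class EventSource:
    """Stands in for libvdsm's events mechanism: callers subscribe callbacks."""
    def __init__(self):
        self._subscribers = []

    def subscribe(self, cb):
        self._subscribers.append(cb)

    def emit(self, event):
        for cb in self._subscribers:
            cb(event)

class Bridge:
    """Republishes libvdsm events onto a message bus (simulated by a list).

    Note it needs nothing from the low-level API beyond function calls
    and event subscription."""
    def __init__(self, source):
        self.bus = []  # a real bridge would publish these to AMQP
        source.subscribe(self.bus.append)

vdsm_events = EventSource()
bridge = Bridge(vdsm_events)
vdsm_events.emit({"type": "vmStatusChanged", "vmId": "1234", "status": "Up"})
print(bridge.bus[0]["type"])  # -> vmStatusChanged
```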

 
 Thanks
 Robert
 The current XML-RPC API contains a lot of deficiencies and
 inefficiencies and we
 would like to retire it as soon as we possibly can. Engine would
 like us to
 move to a message based API and 3rd parties want something simple
 like REST so
 it looks like no one actually wants to use XML-RPC. Not even us.
 I am proposing that AMQP brokers and REST APIs could be written
 against the
 public API.  In fact, they need not even live in the vdsm tree
 anymore if that
 is what we choose.  Core vdsm would only be responsible for providing
 libvdsm
 and whatever language bindings we want to support.
 If we take the libvdsm route, the only reason to even have a REST
 bridge is only to support OSes other then Linux which is something
 I'm not sure we care about at the moment.
 That might be true regarding the current in-tree implementation.
 However, I can
 almost guarantee that someone wanting to write a web GUI on top of
 standalone
 vdsm would want a REST API to talk to.  But libvdsm makes this use
 case of no
 concern to the core vdsm developers.
 
 I do think that having C supportability in our API is a good idea,
 but the
 current API should not be used as the base.
 Let's _start_ with a schema document that describes today's API and
 then clean
 it up.  I think that will work better than starting from scratch.
   Once my
 schema is written I will post it and we can 'patch' it as a community
 until we
 arrive at a 1.0 version we are all happy with.
 +1
 Ok.  Redoubling my efforts to get this done.  Describing the output of
 list(True) takes a while

Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-26 Thread Adam Litke
On Tue, Jun 26, 2012 at 11:11:51PM +0800, Shu Ming wrote:
 On 2012-6-26 20:45, Adam Litke wrote:
 On Tue, Jun 26, 2012 at 09:53:10AM +0800, Xu He Jie wrote:
 On 06/26/2012 05:19 AM, Adam Litke wrote:
 On Mon, Jun 25, 2012 at 05:53:31PM +0300, Dan Kenigsberg wrote:
 On Mon, Jun 25, 2012 at 08:28:29AM -0500, Adam Litke wrote:
 On Fri, Jun 22, 2012 at 06:45:43PM -0400, Andrew Cathrow wrote:
 - Original Message -
 From: Ryan Harper ry...@us.ibm.com
 To: Adam Litke a...@us.ibm.com
 Cc: Anthony Liguori aligu...@redhat.com, VDSM Project 
 Development vdsm-devel@lists.fedorahosted.org
 Sent: Friday, June 22, 2012 12:45:42 PM
 Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host 
 manager
 
 * Adam Litke a...@us.ibm.com [2012-06-22 11:35]:
 On Thu, Jun 21, 2012 at 12:17:19PM +0300, Dor Laor wrote:
 On 06/19/2012 08:12 PM, Saggi Mizrahi wrote:
 - Original Message -
 From: Deepak C Shetty deepa...@linux.vnet.ibm.com
 To: Ryan Harper ry...@us.ibm.com
 Cc: Saggi Mizrahi smizr...@redhat.com, Anthony Liguori
 aligu...@redhat.com, VDSM Project Development
 vdsm-devel@lists.fedorahosted.org
 Sent: Tuesday, June 19, 2012 10:58:47 AM
 Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt
 host manager
 
 On 06/19/2012 01:13 AM, Ryan Harper wrote:
 * Saggi Mizrahismizr...@redhat.com  [2012-06-18 10:05]:
 I would like to put on the table for discussion the
 growing
 need for a way
 to more easily reuse the functionality of VDSM in order to
 service projects
 other than Ovirt-Engine.
 
 Originally VDSM was created as a proprietary agent for the
 sole
 purpose of
 serving the then proprietary version of what is known as
 ovirt-engine. Red Hat,
 after acquiring the technology, pressed on with it's
 commitment to
 open source
 ideals and released the code. But just releasing code into
 the
 wild doesn't
 build a community or makes a project successful. Further more
 when
 building
 open source software you should aspire to build reusable
 components instead of
 monolithic stacks.
 
 Saggi,
 
 Thanks for sending this out.  I've been trying to pull
 together
 some
 thoughts on what else is needed for vdsm as a community.  I
 know
 that
 for some time downstream has been the driving force for all of
 the
 work
 and now with a community there are challenges in finding our
 own
 way.
 
 While we certainly don't want to make downstream efforts
 harder, I
 think
 we need to develop and support our own vision for what vdsm
 can be
 come,
 some what independent of downstream and other exploiters.
 
 Revisiting the API is definitely a much needed endeavor and I
 think
 adding some use-cases or sample applications would be useful
 in
 demonstrating whether or not we're evolving the API into
 something
 easier to use for applications beyond engine.
 
 We would like to expose a stable, documented, well supported
 API.
 This gives
 us a chance to rethink the VDSM API from the ground up. There
 is
 already work
 in progress of making the internal logic of VDSM separate
 enough
 from the API
 layer so we could continue feature development and bug fixing
 while designing
 the API of the future.
 
 In order to achieve this though we need to do several things:
  1. Declare API supportability guidelines
  2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
  3. Make the API easily consumable (e.g. proper docs,
  example
  code, extending
 the API, etc)
  4. Implement the API itself
 Earlier we'd discussed working to have similarities in the 
 modeling between the oVirt API and VDSM but that seems to have dropped 
 off the radar.
 Yes, the current REST API has attempted to be compatible with the current
 ovirt-engine API.  Going forward, I am not sure how easy this will be to
 maintain given that engine is built on Java and vdsm is built on Python.
 Could you elaborate why the language difference is an issue? Isn't this
 what APIs are supposed to solve?
 The main language issue is that ovirt-engine has built their API using a 
 set of
 Java-specific frameworks (JAXB and its dependents).  It's true, if you 
 google
 for 'python jaxb' you will find some sourceforge projects that attempt to 
 bring
 the jaxb interface to python but I don't think that's the right approach.  
 If
 you're writing a java project, do things the java way.  If you're writing a
 python project, do them the python way.  Right now I am focused on 
 defining the
 current API (API.py/xmlrpc) mechanically (creating a schema and API
 documentation).  XSD is not the correct language for that task (which is 
 why I
 foresee a divergence at least at first).  I want to take a stab at defining 
 the
 API in a beneficial, long-term manner.
 
 Adam,
 
 Can you explain why you think XSD is not the correct language?  Is
 it because of the lack of a full Python language code generator? Is
 it possible to modify the existing code generator to address that
 issue?  What is the benefit to introduce a new schema

Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-26 Thread Adam Litke
On Tue, Jun 26, 2012 at 04:47:35PM +0100, Daniel P. Berrange wrote:
 On Tue, Jun 26, 2012 at 05:37:26PM +0300, Dan Kenigsberg wrote:
  On Mon, Jun 25, 2012 at 04:19:28PM -0500, Adam Litke wrote:
   1) Completely define the current XMLRPC API including all functions, 
   parameters,
   and return values.  Complex data structures can be broken down into their 
   basic
   types.  These are: int, str, bool, list, dict, typed-dict, enum.  I have 
   already
   started this process and am using Qemu's QAPI schema language.  You can 
   see that
   here [1].  For an example of what that looks like describing the vdsm API 
   see
   this snippet [2].
   
   2) Import the parser/generator code from qemu for the above schema.  Vdsm 
   will
   require a few extensions such as typed-dictionaries, tuples, and type 
   aliases.
   Adapt the generator so that it can produce a libvdsm which provides API 
   language
   bindings for python, c, and java.
   
   3) Implement a vdsm shell in terms of libvdsm.  In fact, this can be 
   largely
   auto-generated from the schema and accompanying documentation.  This can 
   serve
   to model how new transports can be written.  For example, an AMQP 
   implementation
   can be implemented entirely outside of the vdsm project if we wished.  It 
   only
   needs to talk to vdsm via libvdsm.
   
   Easy as 1,2,3 :)
   
   [1] 
   http://git.qemu.org/?p=qemu.git;a=blob;f=qapi-schema.json;h=3b6e3468b440b4b681f321c9525a3d83bea2137a;hb=HEAD
   [2] http://fpaste.org/rt96/
   
   Probably more than you bargained for when asking for more info... :)
  
  Indeed!
  
  I am still at a loss why the languages take such a prominent place in
  your choice for an API. A good API is easily consumable by any language.
 
 I think you are both right here.  A good API is easily consumed from any
 language, but this doesn't mean there is zero cost to starting to consume
 it from a client. You either want to be able to auto-generate code for the
 client side APIs in all your languages of choice, or even better, you want
 the client side APIs to just do runtime dynamic dispatch based on a
 published schema.

Thanks for commenting!  On one hand, dynamic dispatch seems attractive but I
think dramatically increases complexity on both the client and server sides.
Does anyone know of a prominent open source project that has been successful
with dynamic dispatch?  I am inclined to go with the C library approach because
it is tried and tested and it fits the model of other virtualization libraries
that I am familiar with.
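For readers unfamiliar with the idea being compared here, a rough sketch of runtime dynamic dispatch: the client builds callable methods from a published schema at run time instead of using generated code. The schema content and method names below are made up for illustration:

```python
# Hypothetical published schema: each command lists its expected parameters.
SCHEMA = {
    "VM.create": {"params": ["vmId", "memSize"]},
    "VM.destroy": {"params": ["vmId"]},
}

class DynamicClient:
    """Builds API methods on the fly from the schema (no generated code)."""
    def __init__(self, schema, transport):
        self._schema = schema
        self._transport = transport  # callable(name, kwargs) doing the I/O

    def __getattr__(self, name):
        dotted = name.replace("_", ".")
        if dotted not in self._schema:
            raise AttributeError(name)
        expected = set(self._schema[dotted]["params"])
        def method(**kwargs):
            # Client-side check against the published schema.
            unknown = set(kwargs) - expected
            if unknown:
                raise TypeError("unexpected params: %s" % sorted(unknown))
            return self._transport(dotted, kwargs)
        return method

calls = []  # record what would go over the wire
client = DynamicClient(SCHEMA, lambda name, args: calls.append((name, args)))
client.VM_create(vmId="1234", memSize=512)
print(calls[0][0])  # -> VM.create
```

The server-side complexity this hides (schema distribution, version skew between published schemas) is part of why a generated C library may be the simpler trade-off.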

 If you go down the route of writing a C based libvdsm for VDSM, then my
 recommendation would be to use the GObject APIs. You can then take full
 advantage of the GObject Introspection capabilities to have full dynamic
 dispatch in languages like python, perl, javascript, or full auto-generation
 of code in Vala, C#, etc

Ahh, thanks for reminding me of this.  GObject definitely seems like the way to
go.  I assume there are no real implications for the schema definition and that
the heavy-lifting for GObject support would be limited to the C code generator.
Time to take a closer look at the GObject stuff.

 I certainly wouldn't waste time writing your own code-generator for all
 the various languages, since that's just reinventing the wheel that
 GObject Introspection already provides for the most part.

Agreed.  I would love to avoid this!

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel


Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-25 Thread Adam Litke
On Fri, Jun 22, 2012 at 06:45:43PM -0400, Andrew Cathrow wrote:
 
 
 - Original Message -
  From: Ryan Harper ry...@us.ibm.com
  To: Adam Litke a...@us.ibm.com
  Cc: Anthony Liguori aligu...@redhat.com, VDSM Project Development 
  vdsm-devel@lists.fedorahosted.org
  Sent: Friday, June 22, 2012 12:45:42 PM
  Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager
  
  * Adam Litke a...@us.ibm.com [2012-06-22 11:35]:
   On Thu, Jun 21, 2012 at 12:17:19PM +0300, Dor Laor wrote:
On 06/19/2012 08:12 PM, Saggi Mizrahi wrote:


- Original Message -
From: Deepak C Shetty deepa...@linux.vnet.ibm.com
To: Ryan Harper ry...@us.ibm.com
Cc: Saggi Mizrahi smizr...@redhat.com, Anthony Liguori
aligu...@redhat.com, VDSM Project Development
vdsm-devel@lists.fedorahosted.org
Sent: Tuesday, June 19, 2012 10:58:47 AM
Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt
host manager

On 06/19/2012 01:13 AM, Ryan Harper wrote:
* Saggi Mizrahismizr...@redhat.com  [2012-06-18 10:05]:
I would like to put on the table for discussion the
growing
need for a way
to more easily reuse the functionality of VDSM in order to
service projects
other than Ovirt-Engine.

Originally VDSM was created as a proprietary agent for the
sole
purpose of
serving the then proprietary version of what is known as
ovirt-engine. Red Hat,
after acquiring the technology, pressed on with its
commitment to
open source
ideals and released the code. But just releasing code into
the
wild doesn't
build a community or make a project successful. Furthermore,
when
building
open source software you should aspire to build reusable
components instead of
monolithic stacks.

Saggi,

Thanks for sending this out.  I've been trying to pull
together
some
thoughts on what else is needed for vdsm as a community.  I
know
that
for some time downstream has been the driving force for all of
the
work
and now with a community there are challenges in finding our
own
way.

While we certainly don't want to make downstream efforts
harder, I
think
we need to develop and support our own vision for what vdsm
can be
come,
some what independent of downstream and other exploiters.

Revisiting the API is definitely a much needed endeavor and I
think
adding some use-cases or sample applications would be useful
in
demonstrating whether or not we're evolving the API into
something
easier to use for applications beyond engine.

We would like to expose a stable, documented, well supported
API.
This gives
us a chance to rethink the VDSM API from the ground up. There
is
already work
 
in progress of making the internal logic of VDSM separate
enough
from the API
layer so we could continue feature development and bug fixing
while designing
the API of the future.

In order to achieve this though we need to do several things:
 1. Declare API supportability guidelines
 2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
 3. Make the API easily consumable (e.g. proper docs,
 example
 code, extending
the API, etc)
 4. Implement the API itself
 
 Earlier we'd discussed working to have similarities in the modeling 
 between the oVirt API and VDSM but that seems to have dropped off the radar.

Yes, the current REST API has attempted to be compatible with the current
ovirt-engine API.  Going forward, I am not sure how easy this will be to
maintain given that engine is built on Java and vdsm is built on Python.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-25 Thread Adam Litke
On Mon, Jun 25, 2012 at 05:53:31PM +0300, Dan Kenigsberg wrote:
 On Mon, Jun 25, 2012 at 08:28:29AM -0500, Adam Litke wrote:
  On Fri, Jun 22, 2012 at 06:45:43PM -0400, Andrew Cathrow wrote:
   
   
   - Original Message -
From: Ryan Harper ry...@us.ibm.com
To: Adam Litke a...@us.ibm.com
Cc: Anthony Liguori aligu...@redhat.com, VDSM Project Development 
vdsm-devel@lists.fedorahosted.org
Sent: Friday, June 22, 2012 12:45:42 PM
Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt host 
manager

* Adam Litke a...@us.ibm.com [2012-06-22 11:35]:
 On Thu, Jun 21, 2012 at 12:17:19PM +0300, Dor Laor wrote:
  On 06/19/2012 08:12 PM, Saggi Mizrahi wrote:
  
  
  - Original Message -
  From: Deepak C Shetty deepa...@linux.vnet.ibm.com
  To: Ryan Harper ry...@us.ibm.com
  Cc: Saggi Mizrahi smizr...@redhat.com, Anthony Liguori
  aligu...@redhat.com, VDSM Project Development
  vdsm-devel@lists.fedorahosted.org
  Sent: Tuesday, June 19, 2012 10:58:47 AM
  Subject: Re: [vdsm] [virt-node] VDSM as a general purpose virt
  host manager
  
  On 06/19/2012 01:13 AM, Ryan Harper wrote:
  * Saggi Mizrahismizr...@redhat.com  [2012-06-18 10:05]:
  I would like to put on the table for discussion the
  growing
  need for a way
  to more easily reuse the functionality of VDSM in order to
  service projects
  other than Ovirt-Engine.
  
  Originally VDSM was created as a proprietary agent for the
  sole
  purpose of
  serving the then proprietary version of what is known as
  ovirt-engine. Red Hat,
  after acquiring the technology, pressed on with its
  commitment to
  open source
  ideals and released the code. But just releasing code into
  the
  wild doesn't
  build a community or make a project successful. Furthermore,
  when
  building
  open source software you should aspire to build reusable
  components instead of
  monolithic stacks.
  
  Saggi,
  
  Thanks for sending this out.  I've been trying to pull
  together
  some
  thoughts on what else is needed for vdsm as a community.  I
  know
  that
  for some time downstream has been the driving force for all of
  the
  work
  and now with a community there are challenges in finding our
  own
  way.
  
  While we certainly don't want to make downstream efforts
  harder, I
  think
  we need to develop and support our own vision for what vdsm
  can be
  come,
  some what independent of downstream and other exploiters.
  
  Revisiting the API is definitely a much needed endeavor and I
  think
  adding some use-cases or sample applications would be useful
  in
  demonstrating whether or not we're evolving the API into
  something
  easier to use for applications beyond engine.
  
  We would like to expose a stable, documented, well supported
  API.
  This gives
  us a chance to rethink the VDSM API from the ground up. There
  is
  already work
   
  in progress of making the internal logic of VDSM separate
  enough
  from the API
  layer so we could continue feature development and bug fixing
  while designing
  the API of the future.
  
  In order to achieve this though we need to do several things:
   1. Declare API supportability guidelines
   2. Decide on an API transport (e.g. REST, ZMQ, AMQP)
   3. Make the API easily consumable (e.g. proper docs,
   example
   code, extending
  the API, etc)
   4. Implement the API itself
   
   Earlier we'd discussed working to have similarities in the 
   modeling between the oVirt API and VDSM but that seems to have dropped 
   off the radar.
  
  Yes, the current REST API has attempted to be compatible with the current
  ovirt-engine API.  Going forward, I am not sure how easy this will be to
  maintain given that engine is built on Java and vdsm is built on Python.
 
 Could you elaborate why the language difference is an issue? Isn't this
 what APIs are supposed to solve?

The main language issue is that ovirt-engine has built their API using a set of
Java-specific frameworks (JAXB and its dependents).  It's true, if you google
for 'python jaxb' you will find some sourceforge projects that attempt to bring
the jaxb interface to python but I don't think that's the right approach.  If
you're writing a java project, do things the java way.  If you're writing a
python project, do them the python way.  Right now I am focused on defining the
current API (API.py/xmlrpc) mechanically (creating a schema and API
documentation).  XSD is not the correct language for that task (which is why I
foresee a divergence at least at first).  I want to take a stab at defining the
API
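For a sense of what defining the API in a schema could look like, here is a QAPI-style fragment in the spirit of qemu's qapi-schema.json. The type, field, and command names are hypothetical illustrations, not the actual vdsm schema:

```json
{ "enum": "VmDisplay",
  "data": ["vnc", "qxl"] }

{ "type": "VmDefinition",
  "data": { "vmId": "str", "memSize": "int", "display": "VmDisplay" } }

{ "command": "VM.create",
  "data": { "definition": "VmDefinition" },
  "returns": "bool" }
```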

Re: [vdsm] [virt-node] VDSM as a general purpose virt host manager

2012-06-22 Thread Adam Litke
 code for the API
 - We should be able to auto-generate some API bindings.

[1]: 
http://git.qemu.org/?p=qemu.git;a=blob_plain;f=qapi-schema.json;h=3b6e3468b440b4b681f321c9525a3d83bea2137a;hb=HEAD

 
 Regards,
 Dor
 
 [1] http://ovirt.org/wiki/VDSM_Stable_API_Plan
 
 nice
 to query what plugins/capabilities are supported and accordingly the
 client can take a decision and/or call the appropriate APIs w/o
 worrying
 about ENOTSUPP kind of error.
 It does become blurry when we talk about Repository Engines... that
 was
 also targeted to provide pluggability in managing Images... how will
 that co-exist with API level pluggability ?
 
 IIUC, StorageProvisioning (via libstoragemgmt) can be one such
 optional
 support that can fit as a plug-in nicely, right ?
 You will have an introspective verb to get supported storage engines. 
 Without the engine the hosts will not be able to log in to an image repo but 
 it will not be an API level error. You will get UnsupportedRepoFormatError 
 or something similar no matter which version of VDSM you use. The error is 
 part of the interface and engines will expose their format and parameter in 
 some way.
 
   - kvm tool integration into the API
   - there are lots of different kvm virt tools for various
   tasks
   and they are all stand-alone tools.  Can we integrate
   their
   use into the node level API.  Think libguestfs,
   virt-install,
   p2v/v2v tooling.  All of these are available, but there
   isn't an
    easy way to use these tools through an API.
 
   - host management operations
   - vdsm already does some host level configuration (see
 networking e.g.) it would be good to think about
 extending
   the API to cover other areas of configuration and updates
   - hardware enumeration
   - driver level information
   - storage configuration
   (we've got a bit of a discussion going around
libstoragemgmt here)
 
   - performance monitoring/debugging
   - is the host collecting enough information to do
   debug/perf
   analysis
   - can we support specific configurations of a host that
   optimize
   for specific workloads
   - and can we do this in the API such that
   third-parties can
   supply and maintain specific workload configurations
 
 All of these are dependent on one another and the permutations are
 endless.
 This is why I think we should try and work on each one separately.
 All
 discussions will be done openly on the mailing list and until the
 final version
 comes out nothing is set in stone.
 
 If you think you have anything to contribute to this process,
 please do so
 either by commenting on the discussions or by sending
 code/docs/whatever
 patches. Once the API solidifies it will be quite difficult to
 change
 fundamental things, so speak now or forever hold your peace. Note
 that this is
 just an introductory email. There will be a quick follow up email
 to kick start
 the discussions.
 
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] [virt-node] RFC: API Supportability

2012-06-21 Thread Adam Litke
On Thu, Jun 21, 2012 at 01:20:40PM +0300, Dan Kenigsberg wrote:
 On Wed, Jun 20, 2012 at 10:42:16AM -0500, Adam Litke wrote:
  On Tue, Jun 19, 2012 at 10:17:28AM -0400, Saggi Mizrahi wrote:
   I've opened a wiki page [1] for the stable API and extracted some of the 
   TODO points so we don't forget. Everyone can feel free to add more 
   stuff.
   
   [1] http://ovirt.org/wiki/VDSM_Stable_API_Plan
   
   Rest of the comments inline
   - Original Message -
From: Adam Litke a...@us.ibm.com
To: Saggi Mizrahi smizr...@redhat.com
Cc: VDSM Project Development vdsm-devel@lists.fedorahosted.org, 
Barak Azulay bazu...@redhat.com, Itamar
Heim ih...@redhat.com, Ayal Baron aba...@redhat.com, Anthony 
Liguori aligu...@redhat.com
Sent: Monday, June 18, 2012 12:23:10 PM
Subject: Re: [virt-node] RFC: API Supportability

On Mon, Jun 18, 2012 at 11:02:25AM -0400, Saggi Mizrahi wrote:
 The first thing we need to decide is API supportability. I'll list the
 questions that need to be answered. The decision made here will have 
 great
 effect on transport selection (especially API change process and
 versioning) so try and think about this without going into specific
 technicalities (e.g. X can't be done on REST).

Thanks for sending this out.  I will take a crack at these questions...

I would like to pose an additional question to be answered:

- Should API parameter and return value constraints be formally 
defined?  If
so, how?

Think of this as defining an API schema.  For example: When creating a 
VM,
which parameters are required/optional?  What are the valid formats for
specifying a VM disk?  What are all of the possible task states?
   Has to be part of response to the call that retrieves the state. This will
   allow us to change the states in a BC manner.
  
  I am not sure I agree.  I think it should be a part of the schema but not
  transmitted along with each API response involving a task.  This would 
  increase
  traffic and make responses unnecessarily verbose.
  
 Is there a maximum length for the storage domain description?
    I totally agree; how depends on the transport of choice, but in any case I
    think the definition should be done in a declarative manner (XML/JSON)
    using concrete types (important for binding with C/Java) and have some
    *code to enforce* that the input is correct. This will prevent clients
    from not adhering to the schema by exploiting Python's relatively lax
    approach to types. We already had issues with the engine wrongly sending
    numbers as strings, where a change in the Python code made it stop
    handling the conversion and break internally.
  
  Our schema should fully define a set of simple types and complex types.  
  Each
  defined simple type will have an internal validation function to verify
  conformity of a given input.  Complex types consist of nested lists and 
  dicts of
  simple types.  They are validated first by validating members as simple 
  types
  and then checking for missing and/or extra data.
 
 When designing a dependable API, we should not desert our agility.
 ovirt-Engine has enjoyed the possibility of saying hey, we want another
 field reported in getVdsStats and presto, here it was.
 Complex types should be easily extendible (with a proper update of the
 API minor version, or a capabilities set).

+1
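A minimal sketch of the validation scheme described above — simple types with internal validation functions, complex types checked member-wise for missing and extra fields. The type names are illustrative, not the actual vdsm schema:

```python
# Each simple type carries its own validation function.
SIMPLE = {
    "str": lambda v: isinstance(v, str),
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "bool": lambda v: isinstance(v, bool),
}

# Complex types are dicts of named, typed members.
COMPLEX = {
    "VmDefinition": {"vmId": "str", "memSize": "int"},
}

def validate(type_name, value):
    if type_name in SIMPLE:
        return SIMPLE[type_name](value)
    members = COMPLEX[type_name]
    if not isinstance(value, dict):
        return False
    if set(value) != set(members):  # reject missing and extra fields
        return False
    return all(validate(t, value[k]) for k, t in members.items())

print(validate("VmDefinition", {"vmId": "1234", "memSize": 512}))  # True
print(validate("VmDefinition", {"vmId": "1234"}))                  # False
```

Extending a complex type then means adding one entry to its member dict, which speaks to the agility concern raised above.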

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] refactor clientif move the api implement to sub-module

2012-06-20 Thread Adam Litke
On Wed, Jun 20, 2012 at 06:08:57PM +0800, Xu He Jie wrote:
 Hi, folks
 
   I am trying to move the API implementation into sub-modules, so we don't
 need a singleton or to pass clientif around everywhere. I am sending this
 mail to describe the idea; I think it is good for review.
 
   So I added an API registration mechanism: other modules can register
 their API implementations with the API layer.
 I moved the VM API and VM state (like clientif.vmContainer, etc.) to
 a new module called vmm. vmm.VMM is similar to hsm: it is
 responsible for managing VMs and registers the VM implementation with
 the API layer. Likewise, all storage-related APIs move to hsm, and
 hsm registers them with the API layer.
 
   After all api move to submodule, we can rename clientif and
 needn't passing client to any where. :)
 
   I have already submit a rough version to gerrit: 
 http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:wip_refactor_clientif,n,z

I took a cursory look at the code you have submitted.  I think you need to more
thoroughly describe your design.  I was particularly confused by your use of
Abstract Base Classes for this scenario.  Can you explain in more depth why you
have done this?  Is there a simpler way to accomplish what you need to do?  I
looked at your VMM patch and I am unsure why you need to define VMBase and
VMImpl separately.  It means declaring the set of functions in two separate
files.  Thanks for shining some more light on your methodology.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] [virt-node] RFC: API Supportability

2012-06-18 Thread Adam Litke
 unrecognized parameters to a function.

Anytime a new parameter is added to a function, a corresponding flag should be
specified to enable handling of that parameter.  In this way, an old server can
return an error for 'Unknown flag'.
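A rough sketch of how such flag checking could work; the flag names and the `some_call` function are made up for illustration:

```python
# Hypothetical flag handling: the server declares the flags it knows
# about, and any unrecognized bit in the caller's mask is an error, so
# an old server cleanly rejects parameters it cannot honor.
FLAG_FORCE = 1 << 0
FLAG_ASYNC = 1 << 1   # imagine this bit was added in a newer server

KNOWN_FLAGS = FLAG_FORCE | FLAG_ASYNC

def check_flags(flags):
    unknown = flags & ~KNOWN_FLAGS
    if unknown:
        raise ValueError("Unknown flag(s): 0x%x" % unknown)

def some_call(arg, flags=0):
    check_flags(flags)
    # ... dispatch on the individual flags here ...
    return {'force': bool(flags & FLAG_FORCE)}
```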

Have I missed any cases?

  - How will versioning be expressed in the bindings?

The API should have a call to return the overall version.  Also, the
capabilities call should list all noteworthy features that are present.

  - Do we restrict newer clients from using old APIs when talking with a new
server?

No.  A new client that wants to be the most compatible across vdsm versions may
choose to use an old API (even if a flashier one is available).

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] A Tool for PEP 8 Patches to Find Code Logic Changes

2012-06-07 Thread Adam Litke
On Thu, Jun 07, 2012 at 12:03:30PM +0800, Zhou Zheng Sheng wrote:
 Hi,
 
 Since there is no coverage report on tests in vdsm, even if a PEP 8 patch
 passes the tests, we still cannot be sure there is no mistake in it.
 Reviewing the diff of all the changes consumes a lot of time, and
 some small but fatal mistakes (like a misspelled variable name) can easily
 be missed by human eyes.
 
 So I experimented with the compiler module of Python. I wrote a tool named
 'pydiff'. pydiff parses two Python scripts into Abstract Syntax Trees.
 These data structures reflect the logic of the code, and pydiff
 performs a recursive comparison of the trees. It then reports the
 differences and the corresponding line numbers. In this way, pydiff
 ignores code style changes and reports only logical changes to the code.
 
 I think this tool can save us a lot of time. After a PEP 8 patch passes
 the vdsm tests and pydiff, I can have some confidence that the patch
 does not break anything in vdsm.

This is a very nice tool.  Thanks for sharing it.  I would like to see all
authors of PEP8 patches use this to check their patches for semantic
correctness.  This should greatly improve our ability to complete the PEP8
cleanup quickly.
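The core idea can be sketched in a few lines with the modern `ast` module (the Python 2 `compiler` module that pydiff used was later removed); `logical_diff` below is an illustrative reduction of the approach, not the attached script itself:

```python
import ast

def logical_diff(src_a, src_b):
    """Return True when two sources have the same logical structure.

    ast.dump() omits line/column attributes by default, so two sources
    that differ only in formatting (wrapping, indentation inside
    brackets) produce identical dumps, while a changed identifier or a
    removed statement produces different ones -- pydiff's idea in
    miniature, minus the per-node difference reporting.
    """
    dump_a = ast.dump(ast.parse(src_a))
    dump_b = ast.dump(ast.parse(src_b))
    return dump_a == dump_b
```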

 Here is a usage example:
 
  test_o.py 
 def foo(a, b):
 pass
 
 if __name__ == '__main__':
 A = [1, 2, 3]
 print (4, 5, 6), \
 "over"
 foo(1, 2)
 print 'Hello World'
 
 
  test_n.py 
 def foo(a, b):
 pass
 
 if __name__ == '__main__':
 A = [1,
 2, 3]
 print (4, 5, 6), "over"
 fooo(
 1, 2)
 print ('Hello '
 'World')
 
 
 Some differences between the files are just a matter of style. The only
 significant difference is that the function call foo() is misspelled
 (as fooo) in test_n.py.
 
 Run pydiff.py, it will report:
 
 $ python pydiff.py test_*.py
 1 difference(s)
 first file: test_n.py
 second file: test_o.py
 
 ((8, 'fooo'), (8, 'foo'))
 
 This report tells us that 'fooo' in line 8 of test_n.py is different
 from 'foo' in line 8 of test_o.py.
 
 
 It can also find insertions or deletions. Here is another simple example:
 
  old.py 
 print 'Hello 1'
 print 'Hello 2'
 print 'Hello 3'
 print 'Hello 4'
 print 'Hello 5'
 
  new.py 
 print 'Hello 1'
 print 'Hello 3'
 print 'Hello 4'
 print 'Hello 5'
 print 'Hello 5'
 
 Run pydiff:
 
 $ pydiff old.py new.py
 2 difference(s)
 first file: old.py
 second file: new.py
 
 ((2, Printnl([Const('Hello 2')], None)), (2, None))
 
 ((5, None), (5, Printnl([Const('Hello 5')], None)))
 
 Here ((2, Printnl([Const('Hello 2')], None)), (2, None)) means there
 is a print statement in line 2 of old.py, but no corresponding statement
 in new.py, so we can know the statement is deleted in new.py.
 ((5, None), (5, Printnl([Const('Hello 5')], None))) means there is a
 print statement in line 5 of new.py, but no corresponding statement in
 old.py, so we can know the statement is inserted in new.py.
 
 
 Sometimes the change in code logic is acceptable, for example, change
 aDict.has_key(Key) into Key in aDict. pydiff can report a difference
 in this case, but it is up to the user to judge whether it's acceptable.
 pydiff is just a tool to help you find these changes.
 
 I hope it can be helpful for PEP 8 patch reviewers. If you find any
 bugs, please let me know. The script is in the attachment.
 
 -- 
 Thanks and best regards!
 
 Zhou Zheng Sheng / 周征晟
 E-mail: zhshz...@linux.vnet.ibm.com
 Telephone: 86-10-82454397
 




-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] Help please! -- [postmas...@us.ibm.com: Delivery Status Notification (Failure)]

2012-06-07 Thread Adam Litke
Hi.  Recently my mails to vdsm-devel have started bouncing.  I think this was
caused by a temporary misconfiguration of my local mail setup that resulted in
my email appearing as 'agli...@us.ibm.com'.  Could someone please verify that my
list membership is still configured with the proper email address
'a...@us.ibm.com'?  Thanks a lot and sorry for causing trouble! :)

- Forwarded message from e33.co.us.ibm.com PostMaster 
postmas...@us.ibm.com -

Date: Wed, 6 Jun 2012 15:33:12 -0600
From: e33.co.us.ibm.com PostMaster postmas...@us.ibm.com
To: a...@us.ibm.com
Subject: Delivery Status Notification (Failure)
X-MailerServer: XMail 1.27mod32-ISS
X-MailerError: Message = [1339018392162.92bffba0.49b4.2825d.e33] Server = 
[e33.co.us.ibm.com]

[00] XMail bounce: Rcpt=[vdsm-devel@lists.fedorahosted.org];Error=[550 5.1.1 
vdsm-devel@lists.fedorahosted.org: Recipient address rejected: User unknown 
in local recipient table]


[01] Error sending message [1339018392162.92bffba0.49b4.2825d.e33] from 
[e33.co.us.ibm.com].

ID:12060621-2398---0746823F
Mail From: a...@us.ibm.com
Rcpt To:   vdsm-devel@lists.fedorahosted.org
Server:[hosted03.fedoraproject.org.]


[02] The reason of the delivery failure was:

550 5.1.1 vdsm-devel@lists.fedorahosted.org: Recipient address rejected: User 
unknown in local recipient table


[05] Here is listed the initial part of the message:

Received: from /spool/local
by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
Violators will be prosecuted
for vdsm-devel@lists.fedorahosted.org from a...@us.ibm.com;
Wed, 6 Jun 2012 15:33:12 -0600
Received: from d03dlp02.boulder.ibm.com (9.17.202.178)
by e33.co.us.ibm.com (192.168.1.133) with IBM ESMTP SMTP Gateway: 
Authorized Use Only! Violators will be prosecuted;
Wed, 6 Jun 2012 15:33:10 -0600
Received: from d03relay03.boulder.ibm.com (d03relay03.boulder.ibm.com 
[9.17.195.228])
by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id E4C9F3E4004C
for vdsm-devel@lists.fedorahosted.org; Wed,  6 Jun 2012 21:33:08 
+ (WET)
Received: from d03av05.boulder.ibm.com (d03av05.boulder.ibm.com [9.17.195.85])
by d03relay03.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id 
q56LX2Fe134162
for vdsm-devel@lists.fedorahosted.org; Wed, 6 Jun 2012 15:33:05 -0600
Received: from d03av05.boulder.ibm.com (loopback [127.0.0.1])
by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP 
id q56LX0DP010799
for vdsm-devel@lists.fedorahosted.org; Wed, 6 Jun 2012 15:33:01 -0600
Received: from us.ibm.com (sig-9-76-23-222.mts.ibm.com [9.76.23.222])
by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with SMTP id 
q56LWviG010656;
Wed, 6 Jun 2012 15:32:58 -0600
Received: by us.ibm.com (sSMTP sendmail emulation); Wed,  6 Jun 2012 16:32:57 
-0500
From: Adam Litke a...@us.ibm.com
Date: Wed, 6 Jun 2012 16:32:57 -0500
To: Rodrigo Trujillo rodrigo.truji...@linux.vnet.ibm.com
Cc: vdsm-devel@lists.fedorahosted.org
Subject: Re: [vdsm] About xmlrpc an rest api
Message-ID: 20120606213257.GU2671@localhost.localdomain
References: 4fcfab74.8030...@linux.vnet.ibm.com
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 4fcfab74.8030...@linux.vnet.ibm.com
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12060621-2398---0746823F

On Wed, Jun 06, 2012 at 04:11:48PM -0300, Rodrigo Trujillo wrote:
 Hi,
 
 I  have researched about the VDSM APIs, but was not clear to me how
 to use them.
 Where can I find documentation about them and how to use with python ?

I wrote this python script (with help from this list) to create a VM using the
xmlrpc interface.  It is not trivial (as you will see).  I am certain that you
will need to modify this to get it working in your environment.  In the future,
we hope to make this far easier to do.  We want to save you from needing to do
the storage manipulations.  Also, a REST API should organize the API much better
than the xmlrpc (which was never meant to be friendly to end users).


#!/usr/bin/python

import sys
import uuid
import time

- End forwarded message -

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Help please! -- [postmas...@us.ibm.com: Delivery Status Notification (Failure)]

2012-06-07 Thread Adam Litke
On Thu, Jun 07, 2012 at 05:33:04PM +0300, Itamar Heim wrote:
 On 06/07/2012 05:07 PM, Adam Litke wrote:
 Hi.  Recently my mails to vdsm-devel have started bouncing.  I think this was
 caused by a temporary misconfiguration of my local mail setup that resulted 
 in
 my email appearing as 'agli...@us.ibm.com'.  Could someone please verify 
 that my
 list membership is still configured with the proper email address
 'a...@us.ibm.com'?  Thanks a lot and sorry for causing trouble! :)
 
 
 I think you just got this like other people who replied to that email.
 i.e., it's from your mail server, not from the mailing list

Hmm, ok.  It seems that the mail is flowing now and is arriving at the list.
Also this reply has worked.  Hope all is okay now then.

 
 - Forwarded message from e33.co.us.ibm.com 
 PostMasterpostmas...@us.ibm.com  -
 
 Date: Wed, 6 Jun 2012 15:33:12 -0600
 From: e33.co.us.ibm.com PostMasterpostmas...@us.ibm.com
 To: a...@us.ibm.com
 Subject: Delivery Status Notification (Failure)
 X-MailerServer: XMail 1.27mod32-ISS
 X-MailerError: Message = [1339018392162.92bffba0.49b4.2825d.e33] Server = 
 [e33.co.us.ibm.com]
 
 [00] XMail bounce: Rcpt=[vdsm-devel@lists.fedorahosted.org];Error=[550 
 5.1.1vdsm-devel@lists.fedorahosted.org: Recipient address rejected: User 
 unknown in local recipient table]
 
 
 [01] Error sending message [1339018392162.92bffba0.49b4.2825d.e33] from 
 [e33.co.us.ibm.com].
 
 ID:12060621-2398---0746823F
 Mail From:a...@us.ibm.com
 Rcpt To:vdsm-devel@lists.fedorahosted.org
 Server:[hosted03.fedoraproject.org.]
 
 
 [02] The reason of the delivery failure was:
 
 550 5.1.1vdsm-devel@lists.fedorahosted.org: Recipient address rejected: 
 User unknown in local recipient table
 
 
 [05] Here is listed the initial part of the message:
 
 Received: from /spool/local
  by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! 
  Violators will be prosecuted
  forvdsm-devel@lists.fedorahosted.org  froma...@us.ibm.com;
  Wed, 6 Jun 2012 15:33:12 -0600
 Received: from d03dlp02.boulder.ibm.com (9.17.202.178)
  by e33.co.us.ibm.com (192.168.1.133) with IBM ESMTP SMTP Gateway: 
  Authorized Use Only! Violators will be prosecuted;
  Wed, 6 Jun 2012 15:33:10 -0600
 Received: from d03relay03.boulder.ibm.com (d03relay03.boulder.ibm.com 
 [9.17.195.228])
  by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id E4C9F3E4004C
  forvdsm-devel@lists.fedorahosted.org; Wed,  6 Jun 2012 21:33:08 + 
  (WET)
 Received: from d03av05.boulder.ibm.com (d03av05.boulder.ibm.com 
 [9.17.195.85])
  by d03relay03.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id 
  q56LX2Fe134162
  forvdsm-devel@lists.fedorahosted.org; Wed, 6 Jun 2012 15:33:05 -0600
 Received: from d03av05.boulder.ibm.com (loopback [127.0.0.1])
  by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP 
  id q56LX0DP010799
  forvdsm-devel@lists.fedorahosted.org; Wed, 6 Jun 2012 15:33:01 -0600
 Received: from us.ibm.com (sig-9-76-23-222.mts.ibm.com [9.76.23.222])
  by d03av05.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with SMTP id 
  q56LWviG010656;
  Wed, 6 Jun 2012 15:32:58 -0600
 Received: by us.ibm.com (sSMTP sendmail emulation); Wed,  6 Jun 2012 
 16:32:57 -0500
 From: Adam Litkea...@us.ibm.com
 Date: Wed, 6 Jun 2012 16:32:57 -0500
 To: Rodrigo Trujillorodrigo.truji...@linux.vnet.ibm.com
 Cc: vdsm-devel@lists.fedorahosted.org
 Subject: Re: [vdsm] About xmlrpc an rest api
 Message-ID:20120606213257.GU2671@localhost.localdomain
 References:4fcfab74.8030...@linux.vnet.ibm.com
 MIME-Version: 1.0
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 In-Reply-To:4fcfab74.8030...@linux.vnet.ibm.com
 User-Agent: Mutt/1.5.21 (2010-09-15)
 X-Content-Scanned: Fidelis XPS MAILER
 x-cbid: 12060621-2398---0746823F
 
 On Wed, Jun 06, 2012 at 04:11:48PM -0300, Rodrigo Trujillo wrote:
 Hi,
 
 I  have researched about the VDSM APIs, but was not clear to me how
 to use them.
 Where can I find documentation about them and how to use with python ?
 
 I wrote this python script (with help from this list) to create a VM using 
 the
 xmlrpc interface.  It is not trivial (as you will see).  I am certain that 
 you
 will need to modify this to get it working in your environment.  In the 
 future,
 we hope to make this far easier to do.  We want to save you from needing to 
 do
 the storage manipulations.  Also, a REST API should organize the API much 
 better
 than the xmlrpc (which was never meant to be friendly to end users).
 
 
 #!/usr/bin/python
 
 import sys
 import uuid
 import time
 
 - End forwarded message -
 
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] SSLError with vdsm

2012-06-07 Thread Adam Litke
On Thu, Jun 07, 2012 at 05:35:54PM +0300, Itamar Heim wrote:
 On 06/07/2012 09:58 AM, Wenyi Gao wrote:
 On 2012-06-07 13:51, Zhou Zheng Sheng wrote:
 Hi,
 It is because normal users do not have the privilege to access the keys
 in /etc/pki/vdsm/keys/ and the certificates in /etc/pki/vdsm/certs/. You
 can su to root or use sudo to run vdsClient over an SSL connection.
 
 On 2012-06-07 13:03, Wenyi Gao wrote:
 
 Hi guys,
 
 When I ran the command vdsClient -s 0 getVdsCaps, I got the
 following error:
 
 
 $ vdsClient -s 0 getVdsCaps
  Traceback (most recent call last):
    File "/usr/share/vdsm/vdsClient.py", line 2275, in <module>
  code, message = commands[command][0](commandArgs)
    File "/usr/share/vdsm/vdsClient.py", line 403, in do_getCap
  return self.ExecAndExit(self.s.getVdsCapabilities())
    File "/usr/lib64/python2.7/xmlrpclib.py", line 1224, in __call__
  return self.__send(self.__name, args)
    File "/usr/lib64/python2.7/xmlrpclib.py", line 1578, in __request
  verbose=self.__verbose
    File "/usr/lib64/python2.7/xmlrpclib.py", line 1264, in request
  return self.single_request(host, handler, request_body, verbose)
    File "/usr/lib64/python2.7/xmlrpclib.py", line 1292, in single_request
  self.send_content(h, request_body)
    File "/usr/lib64/python2.7/xmlrpclib.py", line 1439, in send_content
  connection.endheaders(request_body)
    File "/usr/lib64/python2.7/httplib.py", line 954, in endheaders
  self._send_output(message_body)
    File "/usr/lib64/python2.7/httplib.py", line 814, in _send_output
  self.send(msg)
    File "/usr/lib64/python2.7/httplib.py", line 776, in send
  self.connect()
    File "/usr/lib/python2.7/site-packages/vdsm/SecureXMLRPCServer.py", line 98, in connect
  cert_reqs=self.cert_reqs)
    File "/usr/lib64/python2.7/ssl.py", line 381, in wrap_socket
  ciphers=ciphers)
    File "/usr/lib64/python2.7/ssl.py", line 141, in __init__
  ciphers)
  SSLError: [Errno 185090050] _ssl.c:340: error:0B084002:x509
  certificate routines:X509_load_cert_crl_file:system lib
 
 
 
 But if I set ssl = false in /etc/vdsm/vdsm.conf, then run
 vdsClient 0 getVdsCaps, the problem goes away.
 
 Does anyone know what causes the problem above? Thanks.
 
 
 Wenyi Gao
 
 
 
 
 --
 Thanks and best regards!
 
 Zhou Zheng Sheng / 周征晟
 E-mail:zhshz...@linux.vnet.ibm.com
 Telephone: 86-10-82454397
 
 
 
 Yes, it works. Thanks.
 
 maybe send a patch to check the permissions and give a proper error
 message for the next user failing on this?

+1.  Great suggestion!
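Such a pre-flight check could be as simple as the sketch below; the key/cert file names are assumptions based on the paths in the traceback, not necessarily vdsm's exact file names:

```python
import os

# Sketch of the suggested check: before wrapping the socket, verify the
# current user can actually read the key and certificate files, and fail
# with a clear message instead of an obscure SSLError.
KEY_FILE = '/etc/pki/vdsm/keys/vdsmkey.pem'     # assumed path
CERT_FILE = '/etc/pki/vdsm/certs/vdsmcert.pem'  # assumed path

def check_cert_access(paths=(KEY_FILE, CERT_FILE)):
    for path in paths:
        if not os.access(path, os.R_OK):
            raise RuntimeError(
                "Cannot read %s -- run as root or fix permissions" % path)
```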

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] Agenda for today's call

2012-06-04 Thread Adam Litke
On Mon, Jun 04, 2012 at 12:58:21PM +0300, Dan Kenigsberg wrote:
 Hi All,
 
 I have fewer talk issues for today, please suggest others, or else the
 call would be short and to the point!
 
 
 - reviewers/verifiers are still missing for pep8 patches.
   A branch was created, but not much action has taken place on it
   
 http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:pep8cleaning,n,z
 
 - Upcoming oVirt-3.1 release: version bump to 4.9.7? to 4.10?
 
 - Vdsm/MOM integration: could we move MOM to gerrit.ovirt.org?
 

I would like to add:
- screen sharing options for REST API online code review

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] RFC: Writeup on VDSM-libstoragemgmt integration

2012-05-31 Thread Adam Litke
 offload capability in the
 domain metadata
 -- If available, and override is not configured, it will use
 LSM to offload LUN/File snapshot
 -- If override is configured or capability is not available,
 it will use its internal logic to create
snapshot (qcow2).
 
 - Copy/Clone vmdisk flow
 -- VDSM will check the copy offload capability in the domain
 metadata
 -- If available, and override is not configured, it will use
 LSM to offload LUN/File copy
 -- If override is configured or capability is not available,
 it will use its internal logic to create
snapshot (eg: dd cmd in case of LUN).
 
 7) LSM potential changes:
 
 - list features/capabilities of the array. Eg: copy offload,
 thin prov. etc.
 - list containers (aka pools) (present in LSM today)
 - Ability to list different types of arrays being managed, their
 capabilities and used/free space
 - Ability to create/list/delete/resize volumes ( LUN or exports,
 available in LSM as of today)
 - Get monitoring info with object (LUN/snapshot/volume) as
 optional parameter for specific info. eg: container/pool free/used
 space, raid type etc.
 
  Need to make sure the above info is listed in a coherent way across
  arrays (number of LUNs, raid type used? free/total per
  container/pool, per LUN?). We also need I/O statistics wherever
  possible.
 
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center


Re: [vdsm] VDSM API/clientIF instance design issue

2012-05-31 Thread Adam Litke
On Thu, May 31, 2012 at 04:08:52PM +0300, Dan Kenigsberg wrote:
 On Thu, May 31, 2012 at 09:03:37PM +0800, Mark Wu wrote:
  On 05/30/2012 11:01 PM, Dan Kenigsberg wrote:
  On Wed, May 30, 2012 at 10:49:29PM +0800, Mark Wu wrote:
  Hi Guys,
  
   Recently, I have been working on integrating MOM into VDSM. MOM needs
   to use the VDSM API to interact with it. But currently, using the vdsm
   API requires an instance of clientIF. Passing clientIF to MOM is
   not a good choice since it is a vdsm-internal object. So I tried to
   remove the parameter 'cif' from the interface definition and changed
   the code to access the globally unique clientIF instance in API.py.
  Please remind me - why don't we continue to pass the clientIF instance,
  even if it means mentioning it each and every time an API.py object is
  created? It may be annoying (and thus serve as a reminder that we should
  probably retire much of clientIF...), but it should work.
  
  In the old MOM integration patch,  I passed the clientIF instance to
  MOM by the following method:
  vdsmInterface.setConnection(self._cif)
  
  Here's your comments on the patch:
  
  _cif is not the proper API to interact with Vdsm. API.py is. Please
  change MOM to conform to this, if possible.
  
  I think that mom should receive an API object (even API.Global()!)
  that it needs for its operation. Even passing BindingXMLRPC() object
  is more APIish than the internal clientIF object.
 
 Please do not blame me! ;-)
 
 I do not mind passing an API.Global() that happens to hold an internal
 private reference to _clientIF. I just want that if we find the way to
 obliterate clientIF, we won't need to send a patch to MOM, too.
 
  
  So I try to remove cif from API definition to make MOM can call the
  VDSM API without having clientIF.
 
 I do not understand - MOM could receive an API object, it does not have
 to construct it by itself.

Today, the API consists of several, unlinked objects so passing a single
API.Global() would not be enough.  We either need to allow MOM to construct its
own API objects or produce them by calling methods in API.Global().  Personally,
I think the code would be cleaner if clientIF is a singleton (Mark's latest
patch) as opposed to adding factory methods to API.Global().
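For illustration, the singleton approach under discussion might look roughly like this; the class names are simplified stand-ins for vdsm's clientIF and API.Global, not the actual patch:

```python
# Sketch: clientIF keeps one shared instance, and API objects look it
# up instead of having it passed through every constructor, so MOM can
# build API objects without ever touching clientIF directly.
class ClientIF:
    _instance = None

    @classmethod
    def getInstance(cls):
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

class Global:
    """Stand-in for API.Global; note the missing 'cif' parameter."""
    def __init__(self):
        self._cif = ClientIF.getInstance()
```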

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center


Re: [vdsm] a problem with pepe8

2012-05-18 Thread Adam Litke
On Fri, May 18, 2012 at 03:56:05PM +0800, ShaoHe Feng wrote:
 a comment exceeds 80 characters, and it is a url link,
 such as
 # 
 http:///bb///eee/fff/
 
 what can I do?
 is this OK?
 # http://bb//
 # /eee/fff/
 # (the link is too long to fit in one line, copy it and paste it to
 one line)

It would be nice if we could annotate the source code to disable certain checks
in places such as this.  Clearly the rigid line length restriction would result
in a less readable comment if followed here.
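Checkers in the pep8/flake8 family later grew exactly this kind of annotation: a trailing `# noqa` comment asks the tool to skip checks on that line. Assuming a checker that honors it, the URL could stay on one readable line (the URL below is made up):

```python
# The long URL stays on one line, readable and copy-pastable; the
# trailing "# noqa" comment tells the style checker to skip this line.
BUG_URL = 'http://bugtracker.example.com/bugzilla/show_bug.cgi?id=123456&with=a&very=long&query=string'  # noqa
```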

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] RESTful VM creation

2012-05-16 Thread Adam Litke
No comments at all on this??

On Wed, May 09, 2012 at 09:35:29AM -0500, Adam Litke wrote:
 I would like to discuss a problem that is going to affect VM creation in the 
 new
 REST API.  This topic has come up previously and I want to revive that
 discussion because it is blocking a proper implementation of VM.create().
 
 Consider a RESTful VM creation sequence:
   POST /api/vms/define - Define a new VM in the system
   POST /api/vms/<id>/disks/add - Add a new disk to the VM
   POST /api/vms/<id>/cdroms/add - Add a cdrom
   POST /api/vms/<id>/nics/add - Add a NIC
   PUT /api/vms/<id> - Change boot sequence
   POST /api/vms/<id>/start - Boot the VM
 
 Unfortunately this is not possible today with vdsm because a VM must be
 fully-specified at the time of creation and it will be started immediately.
 
 As I see it there are two ways forward:
 
 1.) Deviate from a REST model and require a VM resource definition to include
 all sub-collections inline.
 -- or --
 2.) Support storage of vm definitions so that powered off VMs can be 
 manipulated
 by the API.
 
 My preference would be #2 because: it makes the API more closely follow 
 RESTful
 principles, it maintains parity with the cluster-level VM manipulation API, 
 and
 it makes the API easier to use in standalone mode.
 
 Here is my idea on how this could be accomplished without committing to 
 stateful
 host storage.  In the past we have discussed adding an API for storing 
 arbitrary
 metadata blobs on the master storage domain.  If this API were available we
 could use it to create a transient VM construction site.  Let's walk through
 the above RESTful sequence again and see how my idea would work in practice:
 
 * POST /api/vms/define - Define a new VM in the system
 A new VM definition would be written to the master storage domain metadata 
 area.
 
 * GET /api/vms/<new-uuid>
 The normal 'list' API is consulted as usual.  The VM will not be found there
 because it is not yet created.  Next, the metadata area is consulted.  The VM 
 is
 found there and will be returned.  The VM state will be 'New'.
 
 * POST /api/vms/<id>/disks/add - Add a new disk to the VM
 For 'New' VMs, this will update the VM metadata blob with the new disk
 information.  Otherwise, this will call the hotplugDisk API.
 
 * POST /api/vms/<id>/cdroms/add - Add a cdrom
 For 'New' VMs, this will update the VM metadata blob with the new cdrom
 information.  If we want to support hotplugged CDROMs we can call that API
 later.
 
 * POST /api/vms/<id>/nics/add - Add a NIC
 For 'New' VMs, this will update the VM metadata blob with the new nic
 information.  Otherwise it triggers the hotplugNic API.
 
 * PUT /api/vms/<id> - Change boot sequence
 Only valid for 'New' VMs.  Updates the metadata blob according to the 
 parameters
 specified.
 
 * POST /api/vms/<id>/start - Boot the VM
 Load the metadata from the master storage domain metadata area.  Call the
 VM.create() API.  Remove the metadata from the master storage domain.
 
 VDSM will automatically purge old metadata from the master storage domain.  
 This
 could be done any time a domain is: attached as master, deactivated, and
 periodically.
 
 How does this idea sound?  I am certain that it can be improved by those of 
 you
 with more experience and different viewpoints.  Thoughts and comments?
 
 -- 
 Adam Litke a...@us.ibm.com
 IBM Linux Technology Center
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] RESTful VM creation

2012-05-09 Thread Adam Litke
I would like to discuss a problem that is going to affect VM creation in the new
REST API.  This topic has come up previously and I want to revive that
discussion because it is blocking a proper implementation of VM.create().

Consider a RESTful VM creation sequence:
  POST /api/vms/define - Define a new VM in the system
  POST /api/vms/<id>/disks/add - Add a new disk to the VM
  POST /api/vms/<id>/cdroms/add - Add a cdrom
  POST /api/vms/<id>/nics/add - Add a NIC
  PUT /api/vms/<id> - Change boot sequence
  POST /api/vms/<id>/start - Boot the VM

Unfortunately this is not possible today with vdsm because a VM must be
fully-specified at the time of creation and it will be started immediately.

As I see it there are two ways forward:

1.) Deviate from a REST model and require a VM resource definition to include
all sub-collections inline.
-- or --
2.) Support storage of vm definitions so that powered off VMs can be manipulated
by the API.

My preference would be #2 because: it makes the API more closely follow RESTful
principles, it maintains parity with the cluster-level VM manipulation API, and
it makes the API easier to use in standalone mode.

Here is my idea on how this could be accomplished without committing to stateful
host storage.  In the past we have discussed adding an API for storing arbitrary
metadata blobs on the master storage domain.  If this API were available we
could use it to create a transient VM construction site.  Let's walk through
the above RESTful sequence again and see how my idea would work in practice:

* POST /api/vms/define - Define a new VM in the system
A new VM definition would be written to the master storage domain metadata area.

* GET /api/vms/<new-uuid>
The normal 'list' API is consulted as usual.  The VM will not be found there
because it is not yet created.  Next, the metadata area is consulted.  The VM is
found there and will be returned.  The VM state will be 'New'.

* POST /api/vms/<id>/disks/add - Add a new disk to the VM
For 'New' VMs, this will update the VM metadata blob with the new disk
information.  Otherwise, this will call the hotplugDisk API.

* POST /api/vms/<id>/cdroms/add - Add a cdrom
For 'New' VMs, this will update the VM metadata blob with the new cdrom
information.  If we want to support hotplugged CDROMs we can call that API
later.

* POST /api/vms/<id>/nics/add - Add a NIC
For 'New' VMs, this will update the VM metadata blob with the new nic
information.  Otherwise it triggers the hotplugNic API.

* PUT /api/vms/<id> - Change boot sequence
Only valid for 'New' VMs.  Updates the metadata blob according to the parameters
specified.

* POST /api/vms/<id>/start - Boot the VM
Load the metadata from the master storage domain metadata area.  Call the
VM.create() API.  Remove the metadata from the master storage domain.

VDSM will automatically purge old metadata from the master storage domain.  This
could be done any time a domain is: attached as master, deactivated, and
periodically.
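The lifecycle described above can be modeled in a few lines. Everything here is a toy stand-in: a dict plays the role of the master domain metadata area, and `start_vm` stands in for the real VM.create() call:

```python
# Toy model of the proposed 'New' VM lifecycle: definitions live in a
# metadata store until start, then are handed to create() and purged.
metadata_store = {}   # vmId -> definition blob (master domain metadata)
running = {}          # vmId -> definition, stands in for VM.create()

def define_vm(vm_id, params):
    metadata_store[vm_id] = dict(params, devices=[], state='New')

def add_device(vm_id, device):
    vm = metadata_store[vm_id]
    # For non-'New' VMs the hotplug path would be used instead.
    assert vm['state'] == 'New'
    vm['devices'].append(device)

def start_vm(vm_id):
    vm = metadata_store.pop(vm_id)   # purge from the metadata area
    vm['state'] = 'Up'
    running[vm_id] = vm              # VM.create() would happen here
```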

How does this idea sound?  I am certain that it can be improved by those of you
with more experience and different viewpoints.  Thoughts and comments?

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] error when run vdsClient

2012-05-08 Thread Adam Litke
On Tue, May 08, 2012 at 11:51:02PM +0300, Dan Kenigsberg wrote:
 On Wed, May 09, 2012 at 01:42:45AM +0800, ShaoHe Feng wrote:
  
  $ sudo ./autobuild.sh
  build vdsm, and all test OK.
  
  then rpm install the rpm package.
  
  and start the vdsm
  $ sudo systemctl start vdsmd.service
  
  but I got an error when running vdsClient:
  
    File "/usr/share/vdsm/vdsClient.py", line 28, in <module>
  from vdsm import vdscli
  ImportError: cannot import name vdscli
  
  but when I change to root, vdsClient works.

I have also noticed this problem.  I have found that changing out of the vdsm
source directory 'fixes' it as well.

  
  $ ls /usr/lib/python2.7/site-packages/vdsm/vdscli.py -al
  -rw-r--r--. 1 root root 4113 May  9 01:20
  /usr/lib/python2.7/site-packages/vdsm/vdscli.py
 
 What's your $PWD? Maybe you have some vdsm module/package in your
 PYTHONPATH that hides the one in site-packages.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] RFD: NEW API getAllTasks

2012-05-07 Thread Adam Litke
The current APIs for retrieving all task information do not actually return all
task information.  I would like to introduce a new API that corrects this and
other issues with the current API while preserving backwards compatibility with
ovirt-engine for as long as is necessary.

The current APIs:

getAllTasksInfo(spUUID=None, options = None):
 - Returns a dictionary that maps a task UUID to a task verb.
 - Despite having 'all' in the name, this API only returns tasks that have an
   'spm' tag.
 - This call returns only one piece of information for each task.
 - The spUUID parameter is deprecated and ignored.

getAllTasksStatuses(spUUID=None, options = None):
 - Returns a dictionary of task status information.
 - Despite having 'all' in the name, this API only returns tasks that have an
   'spm' tag.
 - The spUUID parameter is deprecated and ignored.


I propose the following new API:

getAllTasks(tag=None, options=None):
 - Returns a dictionary of task information.  The info from both of the above
   functions would be merged into a single result set.
 - If tag is None, all tasks are returned.  Otherwise, only tasks matching the
   tag are returned.
 - The spUUID parameter is dropped.  options is for future extension and is
   currently not used.

This new API includes all functionality that is available in the old calls.  In
the future, ovirt-engine could switch to this API and preserve the current
semantics by passing tag='spm' to getAllTasks.  Meanwhile, API users that really
want all tasks (gluster and the REST API) can get what they need.
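
As a sketch of the proposed semantics (the task layout and field names here are
made up for illustration; the real task manager stores richer objects):

```python
# Hypothetical merge of getAllTasksInfo + getAllTasksStatuses into one call,
# with an optional tag filter replacing the hard-coded 'spm' lookup.
def get_all_tasks(tasks, tag=None):
    result = {}
    for uuid, task in tasks.items():
        if tag is not None and tag not in task.get('tags', ()):
            continue
        merged = dict(task.get('info', {}))   # info and status merged into
        merged.update(task.get('status', {})) # a single result set
        result[uuid] = merged
    return result


tasks = {
    'u1': {'tags': ['spm'],
           'info': {'verb': 'createVolume'},
           'status': {'taskState': 'running'}},
    'u2': {'tags': ['gluster'],
           'info': {'verb': 'volumeCreate'},
           'status': {'taskState': 'finished'}},
}

print(sorted(get_all_tasks(tasks)))         # every task
print(sorted(get_all_tasks(tasks, 'spm')))  # engine-compatible 'spm' view
```

Engine compatibility then reduces to passing tag='spm', while gluster and the
REST API call it with no tag and really do get all tasks.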

Thoughts on this idea?

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] getAllTasksInfo API

2012-04-16 Thread Adam Litke
Hi,

While developing the REST API I was having trouble using the
getAllTasks(Info|Statuses) API to get tasks information.  I found out that hsm
is hard-coding a tagged search for 'spm' in the calls to the task manager.  Is
there a reason that this tag must be hard-coded or can we remove it as in the
patch below?  With this patch applied I am able to list all tasks.

If this patch is acceptable, I would be happy to submit it to gerrit for
approval.  Thanks!


commit 72621b2ffe5a0a21ba1023dada36b405bf2111f2
Author: Adam Litke a...@us.ibm.com
Date:   Mon Apr 16 13:56:55 2012 -0500

Don't hardcode the 'spm' tag when getting information for all tasks.

diff --git a/vdsm/storage/hsm.py b/vdsm/storage/hsm.py
index 2755aef..51ee17c 100644
--- a/vdsm/storage/hsm.py
+++ b/vdsm/storage/hsm.py
@@ -1694,7 +1694,7 @@ class HSM:
         :options: ?

         #getSharedLock(tasksResource...)
-        allTasksStatus = self.taskMng.getAllTasksStatuses("spm")
+        allTasksStatus = self.taskMng.getAllTasksStatuses()
         return dict(allTasksStatus=allTasksStatus)
 
 
@@ -1733,7 +1733,7 @@ class HSM:
 
         #getSharedLock(tasksResource...)
         # TODO: if spUUID passed, make sure tasks are relevant only to pool
-        allTasksInfo = self.taskMng.getAllTasksInfo("spm")
+        allTasksInfo = self.taskMng.getAllTasksInfo()
         return dict(allTasksInfo=allTasksInfo)
 
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] MOM Integration Plan

2012-02-27 Thread Adam Litke
Hi all,

Very shortly Mark will be sending some patches for review that implement the
long-awaited integration of mom with vdsm.  I felt it would be easier to
understand the changes to vdsm if they were explained a bit better.  In support
of this I have created a wiki page on ovirt.org with a diagram:

http://ovirt.org/wiki/Features/MomIntegration

To facilitate discussion, here is the text of that page:

As discussed at the oVirt Workshop and elsewhere, integrating mom with vdsm will
benefit oVirt by providing a mechanism for dynamic, policy-based tuning.  This
mechanism will pave the way for implementing memory ballooning policies, can
enhance migration policy, and will replace the existing ksm tuning thread.

MOM exists today as an independent library that can be used by python programs
such as vdsm, or in standalone mode (by using the accompanying momd program).
Mom's operation is very configurable.  The management policy is written in a
small Lisp-like language and is replaceable by the end user.  Additionally,
Collector plugins allow you to customize the types of information collected and
the manner in which it is collected.  Similarly, Controller plugins permit a
completely flexible control API to be created.

To integrate mom, vdsm will initialize the mom library in a new thread and start
it.  Therefore, mom and vdsm will exist in the same process.  Vdsm will
configure the mom instance to use plugins and a policy that exclusively target
the vdsm API.  All statistics collection will occur via API calls and any
management actions (including adjustments to KSM and VM balloons) will be done
through the vdsm api as well.  Mom will not use libvirt at all (not even to
monitor for new VMs on the system).

Packaging logistics:
-
Mom is an independent package that is already in Fedora.  Any changes to mom
that are required to support this integration will be submitted to the mom
project for inclusion.  Vdsm will consume the standard MOM package as a python
module/library.

In order to control its mom instance, vdsm will ship a mom configuration file
and a mom policy file that will set mom's default behavior.  At startup, vdsmd
will import mom and initialize it with the configuration and policy files.  From
that point on, mom will interact with vdsm through the well-defined API in
API.py.

New features needed in vdsm:
---
In order to fully benefit from mom's capabilities, vdsm should implement the
following extra features/APIs:

- Collection of more memory statistics via ovirt-guest-agent including the
  current memory balloon value.
- A vmBalloon API to set a new balloon target.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] VDSM host network configuration

2012-02-15 Thread Adam Litke
On Wed, Feb 15, 2012 at 06:36:48PM +0200, Dan Kenigsberg wrote:
 On Thu, Feb 16, 2012 at 12:05:16AM +0800, Lei Li wrote:
  Hi,
  
  We are working on VDSM network REST APIs, to support the functions we need 
  to get
  the list of configured networks. I found that VDSM network has a function
  'listNetworks' in configNetwork.py. It can get and display the current 
  configured
  network like this:
  
  # python configNetwork.py list
  Networks: ['bridge_one', 'bridge_three', 'bridge_two']
  Vlans: []
  Nics: ['eth0']
  Bondings: []
  
  But there are some problems with it. It cannot display the defined
  networks after
  host restart, but the created config files are still
  there (/etc/sysconfig/network-scripts/..).
  Did I miss anything? Or Is there some way to avoid this?
  
  Your suggestion and thoughts would be appreciated.

Lei, on my vdsm host, running python /usr/share/vdsm/configNetwork.py list gives
me the following output:

Networks: ['ovirtmgmt']
Vlans: []
Nics: ['eth1', 'eth0']
Bondings: ['bond4', 'bond0', 'bond1', 'bond2', 'bond3']

and python /usr/share/vdsm/configNetwork.py show ovirtmgmt gives:

Bridge ovirtmgmt: vlan=None, bonding=None, nics=['eth0']

These results are what I would expect to see.

 Could you describe how you reproduce the problem (with as much details)?
 You define a network, persist it, and restart the host?

Hi Dan.  As I understand it there is not a problem with vdsm in this regard.
Lei is trying to model the current networking APIs in REST.  To do this you
might have something like:

/vdsm-api/networks/ .................. Get a list of bridges configured for vdsm
/vdsm-api/networks/confirm ........... Mark the current network config as safe
/vdsm-api/networks/add ............... Add a new network
/vdsm-api/networks/ovirtmgmt/ ........ View details of the ovirtmgmt network
/vdsm-api/networks/ovirtmgmt/edit .... Edit the ovirtmgmt network
/vdsm-api/networks/ovirtmgmt/delete .. Delete the ovirtmgmt network

The current vdsm API lacks a facility to display the /vdsm-api/networks/ URI
because there is no function to get such a list.  To create such an API, one
might call out to 'configNetwork.py list'.  Is there support for adding such an
API to API.py?  How about an API to fetch network info via 
configNetwork.py show?  

Also, I think the networking APIs should be organized into a Network class 
within API.py.
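
For instance, a thin mapping from the listNetworks output shown above onto the
proposed URI space might look like this (the helper name and the dict shape are
hypothetical; the input mirrors the quoted 'list' output):

```python
# Project a configNetwork.py 'list' result onto the REST URI layout above.
def network_uris(listing, base='/vdsm-api/networks'):
    uris = [base + '/', base + '/confirm', base + '/add']
    for net in listing['networks']:
        uris.append('%s/%s/' % (base, net))
        uris.append('%s/%s/edit' % (base, net))
        uris.append('%s/%s/delete' % (base, net))
    return uris

listing = {'networks': ['ovirtmgmt'],
           'vlans': [],
           'nics': ['eth1', 'eth0'],
           'bondings': ['bond4', 'bond0', 'bond1', 'bond2', 'bond3']}

for uri in network_uris(listing):
    print(uri)
```

With such a helper in a Network class in API.py, the REST binding would only
have to format the result, not shell out to configNetwork.py.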

 
 Did Vdsm restart after boot? What is reported by getVdsCaps ?

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] vdsm hangs in SamplingMethod after reinstall

2012-02-14 Thread Adam Litke
On Sun, Feb 12, 2012 at 06:46:25PM -0500, Ayal Baron wrote:
 
 
 - Original Message -
  On Thu, Feb 09, 2012 at 07:15:48PM -0500, Ayal Baron wrote:
   
   
   - Original Message -
Hi.  I am running into a very annoying problem when working on
vdsm
lately.  My
development process involves stopping vdsm, replacing files, and
restarting it.
I do this pretty frequently.  Sometimes, after restarting vdsm
the
XMLRPC call
getStorageDomainsList() hangs.  The following line is the last to
 
 Can you post the exact flow you're running?

Still working on this.  It isn't reproducing reliably -- only when I really need
to get some work done :)

 
print in the
log:

Thread-18::DEBUG::2012-02-09
17:11:46,793::misc::1017::SamplingMethod::(__call__) Trying to
enter
sampling method (storage.sdc.refreshStorage)

The only solution I've been able to come up with is restarting my
machine.  When
stopping vdsm I search for any stale threads but I am unable to
find
them.  Do
you know what else might be causing DynamicBarrier.enter() to
hang
for a long
period of time?  Do the threading primitives use some sort of
temporary disk
storage that needs to be cleaned up?  Thanks for the help!
   
   Try to add some logging in sdc.py:
   def refreshStorage(self):
ADD LOG HERE
  
  Yep have done this and I am not even getting into the refreshStorage
  function.
  We actually hang in DynamicBarrier.enter().  I am going to add some
  debugging to
  determine which locking operation gets stuck.
 
 On the face of it it sounds like a python bug.
 Is supervdsm running? did you try killing it as well?
 Are you sure there is no 'Got in to sampling method' line in the log?
 Have you tried adding logging in 'enter' to see at what stage exactly you get 
 stuck?
 
 (side note - code should probably be updated with 'with' as it was originally 
 written for use with python 2.4)
 
 
  
   multipath.rescan()
   
   I have a feeling that your issue is not with SamplingMethod
   


   
  
  
  
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



[vdsm] vdsm hangs in SamplingMethod after reinstall

2012-02-09 Thread Adam Litke
Hi.  I am running into a very annoying problem when working on vdsm lately.  My
development process involves stopping vdsm, replacing files, and restarting it.
I do this pretty frequently.  Sometimes, after restarting vdsm the XMLRPC call
getStorageDomainsList() hangs.  The following line is the last to print in the
log:

Thread-18::DEBUG::2012-02-09 
17:11:46,793::misc::1017::SamplingMethod::(__call__) Trying to enter sampling 
method (storage.sdc.refreshStorage)

The only solution I've been able to come up with is restarting my machine.  When
stopping vdsm I search for any stale threads but I am unable to find them.  Do
you know what else might be causing DynamicBarrier.enter() to hang for a long
period of time?  Do the threading primitives use some sort of temporary disk
storage that needs to be cleaned up?  Thanks for the help!

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] [Engine-devel] [RFC] New Connection Management API

2012-01-26 Thread Adam Litke
 and forcing it to refresh the engine token is simpler than
  having it refresh the VDSM token.
  
  I understand that engine currently has no way of tracking a user session. 
  This, as I said, is also true in the case of VDSM. We can start and argue 
  about which project should implement the session semantics. But as I see it 
  it's not relevant to the connection management API.
 

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [vdsm] [Engine-devel] [RFC] New Connection Management API

2012-01-26 Thread Adam Litke
, it is what I was
  looking for.
  
   If all is well, and it usually is, VDSM will not invoke a
   disconnect.
   So the caller would have to call unmanage if the connection
   succeeded at the end of the flow.
  
  agree.
  
   Now, if you are already calling unmanage if connection succeeded
   you can just call it anyway.
  
  not exactly, an example I gave earlier on the thread was that VDSM
  hangs
  or have other error and the engine can not initiate unmanaged,
  instead
  let's assume the host is fenced (self-fence or external fence does
  not
  matter), in this scenario the engine will not issue unmanage.
  
   
   instead of doing: (with your suggestion)
   
   manage
   wait until succeeds or lastError has value
   try:
 do stuff
   finally:
 unmanage
   
   do: (with the canonical flow)
   ---
   manage
   try:
 wait until succeeds or lastError has value
 do stuff
   finally:
 unmanage
   
   This is simpler to do than having another connection type.
  
  You are assuming the engine can communicate with VDSM and there are
  scenarios where it is not feasible.
  
   
   Now that we got that out of the way lets talk about the 2nd use
   case.
  
  Since I did not ask VDSM to clean after the (engine) user and you
  don't
  want to do it I am not sure we need to discuss this.
  
  If you insist we can start the discussion on who should implement the
  cleanup mechanism but I'm afraid I have no strong arguments for VDSM
  to
  do it, so I rather not go there ;)
  
  
  You dropped from the discussion my request for supporting list of
  connections for manage and unmanage verbs.
  
   API client died in the middle of the operation and unmanage was
   never called.
   
   Your suggested definition means that unless there was a problem
   with the connection VDSM will still have this connection active.
   The engine will have to clean it anyway.
   
   The problem is, VDSM has no way of knowing that a client died,
   forgot or is thinking really hard and will continue on in about 2
   minutes.
  
   
   Connections that live until they die is a hard to define and work
   with lifecycle. Solving this problem is theoretically simple.
   
   Have clients hold some sort of session token and force the client
   to update it at a specified interval. You could bind resources
   (like domains, VMs, connections) to that session token so when it
   expires VDSM auto cleans the resources.
   
   This kind of mechanism is out of the scope of this API change.
   Further more I think that this mechanism should sit in the engine
   since the session might actually contain resources from multiple
   hosts and resources that are not managed by VDSM.
   
   In GUI flows specifically the user might do actions that don't even
   touch the engine and forcing it to refresh the engine token is
   simpler than having it refresh the VDSM token.
   
   I understand that engine currently has no way of tracking a user
   session. This, as I said, is also true in the case of VDSM. We can
   start and argue about which project should implement the session
   semantics. But as I see it it's not relevant to the connection
   management API.
  
  
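
To make the token idea above concrete, here is a minimal sketch of resources
bound to a refreshable session with a sweep on expiry.  Every name is
illustrative, and it deliberately takes no position on whether engine or VDSM
should own the mechanism:

```python
import time

class SessionRegistry:
    def __init__(self, ttl):
        self.ttl = ttl
        self.sessions = {}    # token -> [last_refresh, set(resources)]

    def open(self, token):
        self.sessions[token] = [time.monotonic(), set()]

    def refresh(self, token):
        # Clients must call this at least every `ttl` seconds.
        self.sessions[token][0] = time.monotonic()

    def bind(self, token, resource):
        # Domains, VMs, connections... anything tied to the session.
        self.sessions[token][1].add(resource)

    def sweep(self):
        """Drop expired sessions and return the resources to auto-clean."""
        now = time.monotonic()
        reclaimed = set()
        for token in list(self.sessions):
            last, resources = self.sessions[token]
            if now - last > self.ttl:
                reclaimed |= resources
                del self.sessions[token]
        return reclaimed

reg = SessionRegistry(ttl=0.05)
reg.open('engine-1')
reg.bind('engine-1', 'connection:iscsi-target-1')
time.sleep(0.1)               # the client died or forgot to refresh
print(reg.sweep())            # the bound connection comes back for cleanup
```

A dead or wedged client simply stops refreshing, and its connections are
reclaimed without anyone ever calling unmanage.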

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Vdsm sync call agenda items

2012-01-12 Thread Adam Litke
Hi Ayal,

I would like to propose two agenda items for Monday's call:
 - vdsm testing (in preparation for oVirt Test Day)
 - my API refactoring patches

Hopefully by Monday folks will have had a chance to look at the patches and we
can discuss what I have done and the next steps.

Thanks.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: API design and plan

2011-12-08 Thread Adam Litke
On Thu, Dec 08, 2011 at 04:56:17AM -0500, Ayal Baron wrote:
 
 
 - Original Message -
  On Tue, Dec 06, 2011 at 08:46:57AM -0600, Adam Litke wrote:
   On Tue, Dec 06, 2011 at 02:58:59PM +0200, Dan Kenigsberg wrote:
On Mon, Dec 05, 2011 at 11:34:18AM -0600, Adam Litke wrote:
 Hi everyone.  On today's VDSM call we discussed the
 requirements, design, and
 plan for updating the API to include support for QMF and
 single-host REST API.
 All members present arrived at a general consensus on the best
 way to design the
 next-generation API.  I have tried to capture this discussion
 in the oVirt wiki:
 
  http://ovirt.org/wiki/Vdsm_API
 
 Please take a look at this page and let's discuss any changes
 that may be needed
 in order to adopt it as a working plan that we can begin to
 execute.  Thanks!


Very nice, I've fixed two bullets about the future of the
xml-rpc.
   
   Thanks... Updates look good to me.
   
I think that we are missing something here: how do we model
Vdsm-to-Vdsm
communication, in a binding-blind way? I'm less worried about the
storage-based mailbox used for lvextend requests: my problem is
with
migration command.
   
   Ok, interesting...  Besides migration, are there other features
   (current or
   planned) that would involve P2P communication?  I want to ensure we
   consider the
   full problem space.
  
  Well, I can imagine we would like a host in distress to migrate VMs
  to
  whomever can take them, without central management driving this
  process.
  (CAVE split brain)
  
  At the momemt I cannot think of something that cannot be implemented
  by
  QMF events. Ayal?
  
   
Currently, the implementation of the migrate verb includes
contacting
the remote Vdsm over xml-rpc before issuing the libvirt
migrateToURI2
command ('migrationCreate' verb).

A Vdsm user who choose to use the REST binding, is likely to want
this to
be implemented this using a REST request to the destination. This
means
that the implementation of Vdsm depends on the chosen binding.

The issue can be mitigating by requiring the binding level to
provide a
callback for migrationCreate (and any other future Vdsm-world
requests).
This would complicate the beautiful png at
http://ovirt.org/wiki/Vdsm_API#Design ... Does anyone have
another
suggestion?
   
   Actually, I think you are blending the external API with vdsm
   internals.  As a
   management server or ovirt-engine, I don't care about the protocol
   that vdsm
   uses to contact the migration recipient.  As far as I am concerned
   this is a
   special case internal function call.  For that purpose, I think
   xmlrpc is
   perfectly well-suited to the task and should be used
   unconditionally, regardless
   of the bindings used to initiate the migration.
   
   So I would propose that we modify the design such that we keep an
   extremely thin
   xmlrpc server active whose sole purpose is to service internal P2P
   requests.
  
  Interesting. We could avoid even that, if we could register a
  callback
  with libvirt, so that destination libvirtd called destination Vdsm to
  verify that all storage and networking resources are ready, before
  executing qemu. DanPB, can something like that be done? (I guess it
  is
  not realistic since we may need to pass vdsm-specific data from
  source
  to dest, and libvirt is not supposed to be a general purpose
  transport.)
  
  Dan.
 
 I don't understand the issue.  The whole point of the REST API is to be an
 easily consumable *single* node management API.  Once you start coordinating
 among different nodes then you need clustering and management (either
 distributed or centralized), in both cases it is fine to require having a bus
 in which case you have your method of communications between hosts to replace
 current xml-rpc.

Implicit in this statement is an assertion that live migration between two vdsm
instances will not be supported without orchestration from an ovirt-engine
instance.  I don't agree with placing such a limitation on vdsm since p2p
migration is already well-supported by the underlying components (libvirt and
qemu).

 Requiring an additional xml-rpc server sounds wrong to me.

The other option is to support a migrateCreate binding in REST and QMF.

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



Re: [Engine-devel] API design and plan

2011-12-08 Thread Adam Litke
On Thu, Dec 08, 2011 at 06:48:53AM +0200, Itamar Heim wrote:
 On 12/05/2011 07:34 PM, Adam Litke wrote:
 Hi everyone.  On today's VDSM call we discussed the requirements, design, and
 plan for updating the API to include support for QMF and single-host REST 
 API.
 All members present arrived at a general consensus on the best way to design 
 the
 next-generation API.  I have tried to capture this discussion in the oVirt 
 wiki:
 
   http://ovirt.org/wiki/Vdsm_API
 
 Please take a look at this page and let's discuss any changes that may be 
 needed
 in order to adopt it as a working plan that we can begin to execute.  Thanks!
 
 
 as you are going to plan an api...
 This piece by Geert Jansen summarizes lessons learned from the
 RHEV-M (ovirt) REST API project
 https://fedorahosted.org/pipermail/rhevm-api/2011-August/002714.html

Thanks for the link!  This is proving to be a very insightful read.  I am
finding that I have come to many of these same conclusions in my own way as I
have been designing the API (especially regarding the use of JSON over XML).

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



ovirt-guest-agent memory statistics

2011-12-07 Thread Adam Litke
To support decisions regarding a host's capacity to run virtual machines, it is
useful to have an expanded set of guest memory statistics.  These should be
collected by the ovirt-guest-agent and made available by the vdsm getVmStats()
API.  Once this has been done, it will be possible to write a host-side MOM
policy for auto-ballooning.

The current set of vetted memory stats is published in the virtio specification:

http://ozlabs.org/~rusty/virtio-spec/virtio-0.9.3.pdf (Appendix G, page 42)

swap_in - the total number of pages swapped in
swap_out - the total number of pages swapped out
minflt - the total number of minor page faults 
majflt - the total number of major page faults
memfree - the amount of memory that is completely unused (in Linux: MemFree)
memtot - the total amount of available memory (in Linux: MemTotal)

In Linux, these values can all be obtained by reading /proc/meminfo and
/proc/vmstat.  On Windows there is an existing implementation in the virtio
balloon driver.
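
As a sketch, the Linux side of the collection could look like the following.
The parsing runs on captured sample text so the example is self-contained; a
guest agent would read the real files.  Note that /proc/vmstat's pgfault
counts all faults, so minflt is derived by subtracting pgmajfault:

```python
# Map /proc counter names onto the proposed stat names.
def parse_proc(text, wanted):
    stats = {}
    for line in text.splitlines():
        parts = line.replace(':', '').split()
        if parts and parts[0] in wanted:
            stats[wanted[parts[0]]] = int(parts[1])
    return stats

# Sample captures of /proc/meminfo and /proc/vmstat.
MEMINFO = """MemTotal:       2049908 kB
MemFree:         670612 kB
"""
VMSTAT = """pswpin 10
pswpout 20
pgfault 12345
pgmajfault 678
"""

stats = parse_proc(MEMINFO, {'MemTotal': 'memtot', 'MemFree': 'memfree'})
vm = parse_proc(VMSTAT, {'pswpin': 'swap_in', 'pswpout': 'swap_out',
                         'pgfault': 'pgfault', 'pgmajfault': 'majflt'})
stats['swap_in'] = vm['swap_in']
stats['swap_out'] = vm['swap_out']
stats['majflt'] = vm['majflt']
stats['minflt'] = vm['pgfault'] - vm['majflt']   # minor = total - major
print(stats)
```

The resulting dict is exactly the six-field set above, ready to be folded into
the getVmStats() response.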

How does everyone feel about adding these to the current set of guest stats?

-- 
Adam Litke a...@us.ibm.com
IBM Linux Technology Center



  1   2   >