[ovirt-devel] Re: Purging inactive maintainers from vdsm-master-maintainers
I also agree with the proposal. It's sad to turn in my keys but I'm likely unable to perform many duties expected of a maintainer at this point. I know that people can still find me via the git history :) On Thu, Nov 28, 2019 at 3:37 AM Milan Zamazal wrote: > Dan Kenigsberg writes: > > > On Wed, Nov 27, 2019 at 4:33 PM Francesco Romani > wrote: > >> > >> On 11/27/19 3:25 PM, Nir Soffer wrote: > > > >> > I want to remove inactive contributors from vdsm-master-maintainers. > >> > > >> > I suggest the simple rule of 2 years of inactivity for removing from > >> > this group, > >> > based on git log. > >> > > >> > See the list below for current status: > >> > https://gerrit.ovirt.org/#/admin/groups/106,members > >> > >> > >> No objections, keeping the list minimal and current is a good idea. > > > > > > I love removing dead code; I feel a bit different about removing old > > colleagues. Maybe I'm just being nostalgic. > > > > If we introduce this policy (which I understand is healthy), let us > > give a long warning period (6 months?) before we apply the policy to > > existing dormant maintainers. We should also make sure that we > > actively try to contact a person before he or she is dropped. > > I think this is a reasonable proposal. 
> > Regards, > Milan > ___ > Devel mailing list -- devel@ovirt.org > To unsubscribe send an email to devel-le...@ovirt.org > Privacy Statement: https://www.ovirt.org/site/privacy-policy/ > oVirt Code of Conduct: > https://www.ovirt.org/community/about/community-guidelines/ > List Archives: > https://lists.ovirt.org/archives/list/devel@ovirt.org/message/QCMGKR2IRYTITM2T3YMLXGOZCT4BHYGL/ > -- Adam Litke He / Him / His Principal Software Engineer Red Hat <https://www.redhat.com/> ali...@redhat.com ___ Devel mailing list -- devel@ovirt.org To unsubscribe send an email to devel-le...@ovirt.org Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/devel@ovirt.org/message/2HBNMTBENTMVYI543OZWH2MNROTMLVXH/
Re: [ovirt-devel] vdsm stable branch maintainership
+1 On Tue, Jan 9, 2018 at 8:17 AM, Francesco Romani <from...@redhat.com> wrote: > On 01/09/2018 12:43 PM, Dan Kenigsberg wrote: > > Hello, > > > > I would like to nominate Milan Zamazal and Petr Horacek as maintainers > > of vdsm stable branches. This job requires understanding of vdsm > > packaging and code, a lot of attention to details and awareness of the > > requirements of other components and teams. > > > > I believe that both Milan and Petr have these qualities. I am certain > > they would work in responsive caution when merging and tagging patches > > to the stable branches. > > > > vdsm maintainers, please confirm if you approve. > > +1 > > -- > Francesco Romani > Senior SW Eng., Virtualization R&D > Red Hat > IRC: fromani github: @fromanirh > > -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] jsonrpc go client
On Fri, Jul 14, 2017 at 9:32 AM, Piotr Kliczewski < piotr.kliczew...@gmail.com> wrote: > On Fri, Jul 14, 2017 at 3:14 PM, Dan Kenigsberg <dan...@redhat.com> wrote: > > On Fri, Jul 14, 2017 at 3:11 PM, Piotr Kliczewski > > <piotr.kliczew...@gmail.com> wrote: > >> All, > >> > >> I pushed a very simple jsonrpc go client [1] which allows talking to > >> vdsm. I had a request to create it, but if there are more people > >> willing to use it I am happy to maintain it. > Awesome Piotr! Thanks for the great work. > >> > >> Please let me know if you find any issues with it or you have any > >> feature requests. > > > > Interesting. Which use case do you see for this client? > > Currently, Vdsm has very few clients: Engine, vdsm-client, mom and > > hosted-engine. Too often we forget about the non-Engine ones and break > > them, so I'd be happy to learn more about a 5th. > > Adam asked for the client for his storage related changes. I am not > sure about the specific use case. > I am looking at implementing a vdsm flexvol driver for kubernetes. This would allow kubernetes pods to access vdsm volumes using the native PV and PVC mechanisms. > > > > > Regarding https://github.com/pkliczewski/vdsm-jsonrpc-go/blob/master/example/main.go > > : programming without exceptions and try-except is a pain. Don't you > > need to check the retval of Subscribe and disconnect on failure? > > The example is by no means perfect, and you are correct. I will fix. > -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [ovirt-users] Feature: enhanced OVA support
Great feature! I am glad to see you plan to use the existing imageio framework for transferring data. Will you allow export of VMs from a particular snapshot? I guess that's how you'll have to do it if you want to support export of running VMs. I think you should definitely have a comment in the ovf to indicate that an OVA was generated by oVirt. People will try to use this new feature to import random OVAs from who knows where. I'd also recommend adding a version to this comment, or perhaps even a schema version in case you need to deal with compatibility issues in the future. I agree with Yaniv Kaul that we should offer to sparsify the VM to optimize it for export. We should also return compressed data. When exporting, does it make sense to cache the stored OVA file in some sort of ephemeral storage (host local is fine, storage domain may be better) in order to allow the client to resume or restart an interrupted download without having to start from scratch? On Sun, May 14, 2017 at 9:56 AM, Arik Hadas <aha...@redhat.com> wrote: > Hi everyone, > > We would like to share our plan for extending the currently provided > support for OVA files with: > 1. Support for uploading OVA. > 2. Support for exporting a VM/template as OVA. > 3. Support for importing OVA that was generated by oVirt (today, we only > support those that are VMware-compatible). > 4. Support for downloading OVA. > > This can be found on the feature page > <http://www.ovirt.org/develop/release-management/features/virt/enhance-import-export-with-ova/> > . > > Your feedback and cooperation will be highly appreciated. > > Thanks, > Arik > > > ___ > Users mailing list > us...@ovirt.org > http://lists.ovirt.org/mailman/listinfo/users > > -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
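The generator annotation suggested above could be sketched as an XML comment in the OVF envelope. Note this is purely illustrative: the comment text, version key, and element name below are invented, since the thread does not define any actual schema.

```python
# Hypothetical OVF generator annotation; the marker text and version key
# are illustrative only, not a defined oVirt format.
import xml.etree.ElementTree as ET

envelope = ET.Element("Envelope")
envelope.append(ET.Comment("Generated by oVirt; export-schema-version=1.0"))

data = ET.tostring(envelope).decode()
# An importer can cheaply check for the marker before attempting an
# oVirt-specific import path for a random OVA from "who knows where".
is_ovirt_ova = "Generated by oVirt" in data
```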
Re: [ovirt-devel] [VDSM] Adding Pylint to 'check' target
I like the current structure of the make check rule which has an increasing number of sub-targets (pep8, pyflakes, tests, etc) so it is still easy to run individual targets if the check rule is more than you need. For me adding this is a big +1. On Mon, May 15, 2017 at 12:04 PM, Dan Kenigsberg <dan...@redhat.com> wrote: > On Mon, May 15, 2017 at 4:47 PM, Fred Rolland <froll...@redhat.com> wrote: > > Hi, > > > > We are introducing Pylint to be performed as part of the 'check' target. > > Once that patch [1] is merged, every execution of 'make check' will > > also include a Pylint analysis. > > > > Note that execution time will be longer by about 2 minutes. > > > > However, you can use the 'jobs' flag to tell 'make' to execute recipes > > simultaneously. > > Be aware that the output of the jobs will be interleaved. > > > > For example, running 'make' with two parallel jobs: > > > > make --jobs=2 check > > > > Regards, > > Freddy > > > > [1] https://gerrit.ovirt.org/#/c/76390/ > > I'm not as frequent a user of `make check` as I used to be, but I'm > cool with this addition. I'd like to hear if others are bothered. > > Dan. > -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
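The interleaving behavior described above is easy to reproduce with a throwaway Makefile; the target names and the /tmp path here are demo-only, not vdsm's real targets.

```shell
# Write a tiny Makefile with two independent sub-targets (demo-only names),
# then run them in parallel the same way 'make --jobs=2 check' would.
printf '.PHONY: check a b\ncheck: a b\na:\n\t@echo "job a done"\nb:\n\t@echo "job b done"\n' > /tmp/demo.mk
# With --jobs=2 both recipes may run simultaneously and their output can interleave.
make -f /tmp/demo.mk --jobs=2 check
```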
Re: [ovirt-devel] Vdsm merge rights
+2 :) On Fri, May 12, 2017 at 6:16 AM, Nir Soffer <nsof...@redhat.com> wrote: > +1 > > בתאריך יום ו׳, 12 במאי 2017, 12:59, מאת Fabian Deutsch < > fdeut...@redhat.com>: > >> +1 >> >> On Fri, May 12, 2017 at 11:25 AM, Edward Haas <eh...@redhat.com> wrote: >> > Good news! +2 >> > >> > On Fri, May 12, 2017 at 11:27 AM, Piotr Kliczewski <pklic...@redhat.com >> > >> > wrote: >> >> >> >> +1 >> >> >> >> On Fri, May 12, 2017 at 9:14 AM, Dan Kenigsberg <dan...@redhat.com> >> wrote: >> >>> >> >>> I'd like to nominate Francesco to the vdsm-maintainers >> >>> >> >>> https://gerrit.ovirt.org/#/admin/groups/uuid- >> becbf722723417c336de6c1646749678acae8b89 >> >>> list, so he can merge patches without waiting for Nir, Adam or me. >> >>> >> >>> I believe that he proved to be thorough and considerate (and paranoid) >> >>> as the job requires. >> >>> >> >>> Vdsm maintainers, please approve. >> >>> >> >>> Dan >> >> >> >> >> > >> > >> > ___ >> > Devel mailing list >> > Devel@ovirt.org >> > http://lists.ovirt.org/mailman/listinfo/devel >> ___ >> Devel mailing list >> Devel@ovirt.org >> http://lists.ovirt.org/mailman/listinfo/devel >> > > ___ > Devel mailing list > Devel@ovirt.org > http://lists.ovirt.org/mailman/listinfo/devel > -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] New design for the Gerrit UI
I really like the colors on the patternfly scheme. Great job! On Thu, May 4, 2017 at 9:25 AM, Evgheni Dereveanchin <edere...@redhat.com> wrote: > Thanks everyone for the great feedback! > > So there's two options I see now: > 1) keep the default header scheme with white background, just add the > project logo into the corner > 2) try to adapt to the Patternfly scheme as used in oVirt's Admin UI > currently. > > I've swapped the header background color to #393f45 as used in oVirt for a > quick test: > https://gerrit-staging.phx.ovirt.org/ > > Is this more readable? If yes - I can continue working in this direction > to add gradients > and other patternfly style elements. Otherwise I'll just go with option 1 > and stick to the default style we have now. > > On Thu, May 4, 2017 at 2:45 PM, Martin Sivak <msi...@redhat.com> wrote: > >> > It will help if someone can suggest an alternate CSS which we can use >> or specific color codes, >> >> Well.. keep it as it is or make it really dark (like the patternfly >> menu). I do not care about logos but big area filled with non-neutral color >> is always going to be an issue. >> >> Martin >> >> On Thu, May 4, 2017 at 2:15 PM, Eyal Edri <ee...@redhat.com> wrote: >> >>> >>> >>> On Thu, May 4, 2017 at 3:05 PM, Martin Perina <mper...@redhat.com> >>> wrote: >>> >>>> I agree with Milan and Martin, even after few minutes looking at it, >>>> the green >>>> with combination of white background just made my eyes burning :-( >>>> >>>> Would it be possible to use more darker colors (at least for top >>>> banner/menu)? >>>> For example darker colors we use in oVirt engine welcome page ... >>>> >>> >>> Thanks for the feedback, >>> It will help if someone can suggest an alternate CSS which we can use or >>> specific color codes, >>> otherwise it will be long trial and error process until we'll find >>> something that will suite everyone. 
>>> >>> >>> >>>> >>>> >>>> Martin >>>> >>>> On Thu, May 4, 2017 at 5:53 AM, Martin Sivak <msi...@redhat.com> wrote: >>>> >>>>> I agree with Milan here. The light green background makes the menu >>>>> items to be almost unreadable, the search button (slightly different >>>>> green color) blends with the background and generally the color pulls >>>>> my eyes away from the content. I wouldn't feel comfortable looking at >>>>> the screen for a whole day. >>>>> >>>>> Martin >>>>> >>>>> On Thu, May 4, 2017 at 9:57 AM, Milan Zamazal <mzama...@redhat.com> >>>>> wrote: >>>>> > Evgheni Dereveanchin <edere...@redhat.com> writes: >>>>> > >>>>> >> The Infra team is working on customizing the look of Gerrit to make >>>>> it fit >>>>> >> better with other oVirt services. I want to share the result of this >>>>> >> effort. Hopefully we can gather some feedback before applying the >>>>> design to >>>>> >> oVirt's instance of Gerrit. >>>>> >> >>>>> >> Please visit the Staging instance to check it out: >>>>> >> >>>>> >> https://gerrit-staging.phx.ovirt.org/ >>>>> > >>>>> > Thank you for the preview. While it fits better with oVirt services, >>>>> > there is one thing that makes me uncomfortable with it: low contrast. >>>>> > The top green bar is probably directly violating Web Accessibility >>>>> > Guidelines (AA level; see >>>>> > https://www.w3.org/TR/WCAG20/#visual-audio-contrast-contrast), but I >>>>> > find all the green parts harder to read than in the current version. >>>>> > So it would be nice if the contrast could be improved. 
>>>>> > >>>>> > Thanks, >>>>> > Milan >>>>> > ___ >>>>> > Devel mailing list >>>>> > Devel@ovirt.org >>>>> > http://lists.ovirt.org/mailman/listinfo/devel >>>>> ___ >>>>> Devel mailing list >>>>> Devel@ovirt.org >>>>> http://lists.ovirt.org/mailman/listinfo/devel >>>>> >>>> >>>> >>>> ___ >>>> Infra mailing list >>>> in...@ovirt.org >>>> http://lists.ovirt.org/mailman/listinfo/infra >>>> >>>> >>> >>> >>> -- >>> >>> Eyal edri >>> >>> >>> ASSOCIATE MANAGER >>> >>> RHV DevOps >>> >>> EMEA VIRTUALIZATION R >>> >>> >>> Red Hat EMEA <https://www.redhat.com/> >>> <https://red.ht/sig> TRIED. TESTED. TRUSTED. >>> <https://redhat.com/trusted> >>> phone: +972-9-7692018 <+972%209-769-2018> >>> irc: eedri (on #tlv #rhev-dev #rhev-integ) >>> >> >> >> ___ >> Devel mailing list >> Devel@ovirt.org >> http://lists.ovirt.org/mailman/listinfo/devel >> > > > > -- > Regards, > Evgheni Dereveanchin > > ___ > Infra mailing list > in...@ovirt.org > http://lists.ovirt.org/mailman/listinfo/infra > > -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [VDSM] granting network+2 to Eddy
+1 On Tue, Feb 28, 2017 at 6:03 AM, Francesco Romani <from...@redhat.com> wrote: > > On 02/28/2017 08:32 AM, Dan Kenigsberg wrote: > > Hi, > > > > After more than a year of substantial contribution to Vdsm networking, > > and after several months of me upgrading his score, I would like to > > nominate Eddy as a maintainer for network-related code in Vdsm, in > > master and stable branches. > > > > Current Vdsm maintainers and others: please approve my suggestion if > > you agree with it. > > Approved > > -- > Francesco Romani > Red Hat Engineering Virtualization R & D > IRC: fromani > > -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [VDSM] Correct implementation of virt-sysprep job
On 06/12/16 22:06 +0200, Arik Hadas wrote: Adam, :) You seem upset. Sorry if I touched on a nerve... Just out of curiosity: when you write "v2v has promised" - what exactly do you mean? the tool? Richard Jones (the maintainer of virt-v2v)? Shahar and I that implemented the integration with virt-v2v? I'm not aware of such a promise by any of these options :) Some history... Earlier this year Nir, Francesco (added), Shahar, and I began discussing the similarities between what storage needed to do with external commands and what was designed specifically for v2v. I am not sure if you were involved in the project at that time. The plan was to create common infrastructure that could be extended to fit the unique needs of the verticals. The v2v code was going to be moved over to the new infrastructure (see [1]) and the only thing that stopped the initial patch was lack of a VMWare testing environment for verification. At that time storage refocused on developing verbs that used the new infrastructure and have been maintaining its suitability for general use. Conversion of v2v -> Host Jobs is obviously a lower priority item and much more difficult now due to the early missed opportunity. Anyway, let's say that you were given such a promise by someone and thus consider that mechanism to be deprecated - it doesn't really matter. I may be biased but I think my opinion does matter. The current implementation doesn't well fit to this flow (it requires per-volume job, it creates leases that are not needed for template's disks, ...) and with the "next-gen API" with proper support for virt flows not even being discussed with us (and iiuc also not with the infra team) yet, I don't understand what do you suggest except for some strong, though irrelevant, statements. If you are willing to engage in a good-faith technical discussion I am sure I can help you to understand. These operations to storage demand some form of locking protection. 
If volume leases aren't appropriate then perhaps we should use the VM Leases / xleases that Nir is finishing off for 4.1 now. I suggest loud and clear to reuse (not to add dependencies, not to enhance, ..) an existing mechanism for a very similar flow of virt-v2v that works well and simple. I clearly remember discussions involving infra (hello Oved), virt (hola Michal), and storage where we decided that new APIs performing async operations involving external commands should use the HostJobs infrastructure instead of adding more information to Host Stats. These were the "famous" entity polling meetings. Of course plans can change but I have never been looped into any such discussions. Do you "promise" to implement your "next gen API" for 4.1 as an alternative? I guess we need the design first. On Tue, Dec 6, 2016 at 5:04 PM, Adam Litke <ali...@redhat.com> wrote: On 05/12/16 11:17 +0200, Arik Hadas wrote: On Mon, Dec 5, 2016 at 10:05 AM, Nir Soffer <nsof...@redhat.com> wrote: On Sun, Dec 4, 2016 at 8:50 PM, Shmuel Melamud <smela...@redhat.com> wrote: > > Hi! > > I'm currently working on integration of virt-sysprep into oVirt. > > Usually, if user creates a template from a regular VM, and then creates new VMs from this template, these new VMs inherit all configuration of the original VM, including SSH keys, UDEV rules, MAC addresses, system ID, hostname etc. It is unfortunate, because you cannot have two network devices with the same MAC address in the same network, for example. > > To avoid this, user must clean all machine-specific configuration from the original VM before creating a template from it. You can do this manually, but there is virt-sysprep utility that does this automatically. > > Ideally, virt-sysprep should be seamlessly integrated into template creation process. But the first step is to create a simple button: user selects a VM, clicks the button and oVirt executes virt-sysprep on the VM. > > virt-sysprep works directly on VM's filesystem. 
It accepts list of all disks of the VM as parameters: > > virt-sysprep -a disk1.img -a disk2.img -a disk3.img > > The architecture is as follows: command on the Engine side runs a job on VDSM side and tracks its success/failure. The job on VDSM side runs virt-sysprep. > > The question is how to implement the job correctly? > > I thought about using storage jobs, but they are designed to work only with a single volume, correct? New
Re: [ovirt-devel] [VDSM] Correct implementation of virt-sysprep job
voked on VM rather than particular disk makes it less suitable. These are more appropriately called HostJobs and the have the following semantics: - They represent an external process running on a single host - They are not persisted. If the host or vdsm restarts, the job is aborted - They operate on entities. Currently storage is the first adopter of the infrastructure but virt was going to adopt these for the next-gen API. Entities can be volumes, storage domains, vms, network interfaces, etc. - Job status and progress is reported by the Host Jobs API. If a job is not present, then the underlying entitie(s) must be polled by engine to determine the actual state. 3. V2V jobs - no mechanism is provided to resume failed jobs, no leases, etc This is the old infra upon which Host Jobs are built. v2v has promised to move to Host Jobs in the future so we should not add new dependencies to this code. I have some arguments for using V2V-like jobs [1]: 1. creating template from vm is rarely done - if host goes unresponsive or any other failure is detected we can just remove the template and report the error We can chose this error handling with Host Jobs as well. 2. the phase of virt-sysprep is, unlike typical storage operation, short - reducing the risk of failures during the process Reduced risk of failures is never an excuse to have lax error handling. The storage flavored host jobs provide tons of utilities for making error handling standardized, easy to implement, and correct. 3. during the operation the VM is down - by locking the VM/template and its disks on the engine side, we render leases-like mechanism redundant Eventually we want to protect all operations on storage with sanlock leases. This is safer and allows for a more distributed approach to management. Again, the use of leases correctly in host jobs requires about 5 lines of code. The benefits of standardization far outweigh any perceived simplification resulting from omitting it. 4. 
in the worst case - the disk will not be corrupted (only some of the data might be removed). Again, the way engine chooses to handle job failures is independent of the mechanism. Let's separate that from this discussion. So I think that the mechanism for storage jobs is an over-kill for this case. We can keep it simple by generalise the V2V-job for other virt-tools jobs, like virt-sysprep. I think we ought to standardize on the Host Jobs framework where we can collaborate on unit tests, standardized locking and error handling, abort logic, etc. When v2v moves to host jobs then we will have a unified method of handling ephemeral jobs that are tied to entities. -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
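The HostJob semantics listed above (not persisted, tied to an entity, abortable, status polled by engine) can be sketched roughly as follows. All class and field names here are invented for illustration; this is not vdsm's actual jobs API.

```python
# Illustrative-only sketch of HostJob semantics: in-memory (lost on vdsm
# restart), scoped to one entity, abortable, with polled progress/status.
import threading
import uuid

class HostJob:
    def __init__(self, entity_id, work):
        self.id = str(uuid.uuid4())
        self.entity_id = entity_id   # the volume/domain/VM the job operates on
        self.status = "pending"
        self.progress = 0            # percent, reported via the jobs API
        self._aborted = threading.Event()
        self._work = work            # generator yielding (step, total_steps)

    def run(self):
        self.status = "running"
        for step, total in self._work():
            if self._aborted.is_set():
                self.status = "aborted"
                return
            self.progress = 100 * step // total
        self.status = "done"

    def abort(self):
        self._aborted.set()

def work():
    for i in range(1, 5):
        yield i, 4

job = HostJob("vol-1", work)
job.run()
```

If the job is absent when engine polls (e.g. after a host restart), engine falls back to inspecting the underlying entity, exactly as the message describes.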
Re: [ovirt-devel] [VDSM] FAIL: test_intra_domain_copy('block', 'cow', 'cow') (storage_sdm_copy_data_test.TestCopyDataDIV)
mg convert -p -t none -T none -f qcow2 /var/tmp/tmpKpf5ys/mnt/blockSD/beb999a9-a90b-4c49-9d4c-7b21b3adb164/images/c406fa29-72c8-4a26-9601-2bd5e0b1cbd0/1b8476de-236b-4c8e-aea0-134b213b7cb7 -O qcow2 -o compat=0.10 /var/tmp/tmpKpf5ys/mnt/blockSD/beb999a9-a90b-4c49-9d4c-7b21b3adb164/images/939d015f-e57e-4bf1-9013-b648fae347ae/0e7c49ae-6c7c-4526-89c2-17b923c70dfa (cwd None) (qemuimg:247) 21:16:08 2016-11-16 21:14:26,805 DEBUG (MainThread) [storage.Misc.excCmd] /usr/bin/taskset --cpu-list 0-3 /usr/bin/dd iflag=direct skip=5 bs=512 if=/var/tmp/tmpKpf5ys/dev/beb999a9-a90b-4c49-9d4c-7b21b3adb164/metadata count=1 (cwd None) (commands:69) 21:16:08 2016-11-16 21:14:26,820 DEBUG (MainThread) [storage.Misc.excCmd] SUCCESS: = '1+0 records in\n1+0 records out\n512 bytes copied, 0.000380116 s, 1.3 MB/s\n'; = 0 (commands:93) 21:16:08 2016-11-16 21:14:26,821 DEBUG (MainThread) [storage.Misc] err: ['1+0 records in', '1+0 records out', '512 bytes copied, 0.000380116 s, 1.3 MB/s'], size: 512 (misc:138) 21:16:08 2016-11-16 21:14:26,823 INFO (MainThread) [storage.VolumeManifest] Tearing down volume beb999a9-a90b-4c49-9d4c-7b21b3adb164/0e7c49ae-6c7c-4526-89c2-17b923c70dfa justme True (blockVolume:386) 21:16:08 2016-11-16 21:14:26,824 INFO (MainThread) [storage.VolumeManifest] Tearing down volume beb999a9-a90b-4c49-9d4c-7b21b3adb164/1b8476de-236b-4c8e-aea0-134b213b7cb7 justme True (blockVolume:386) 21:16:08 2016-11-16 21:14:26,824 INFO (MainThread) [root] Job 'f8a60ab8-2ba9-473d-bf46-82080c283137' completed (jobs:203) 21:16:08 2016-11-16 21:14:26,825 INFO (MainThread) [root] Job 'f8a60ab8-2ba9-473d-bf46-82080c283137' will be deleted in 3600 seconds (jobs:245) 21:16:08 - >> end captured logging << - -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] [VDSM] Failing make check
Hi Piotr, I am now seeing consistent Jenkins failures during make check (when producing the schema html doc) and I suspect this[1] change. Can you take a look please? [1] https://gerrit.ovirt.org/#/c/56387/ -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [vdsm] tests failures
On 10/11/16 15:43 +0200, Nir Soffer wrote: On Thu, Nov 10, 2016 at 3:39 PM, Piotr Kliczewski <piotr.kliczew...@gmail.com> wrote: All, Few mins ago I saw build [1] failure due to: 13:36:42 ERROR: test_abort_during_copy('block') (storage_sdm_copy_data_test.TestCopyDataDIV) 13:36:42 -- 13:36:42 Traceback (most recent call last): 13:36:42 File "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/testlib.py", line 133, in wrapper 13:36:42 return f(self, *args) 13:36:42 File "/home/jenkins/workspace/vdsm_master_check-patch-el7-x86_64/vdsm/tests/storage_sdm_copy_data_test.py", line 279, in test_abort_during_copy 13:36:42 raise RuntimeError("Timeout waiting for thread") 13:36:42 RuntimeError: Timeout waiting for thread Adam, can you take a look? Is this happening all of the time or intermittently? If intermittent then we can increase the timeout or just ignore for the time being since it's probably caused by an overloaded Jenkins slave. -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
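The failure pattern in the traceback (join a worker thread with a timeout and fail loudly if it is still alive) has roughly this shape; the timeout values are illustrative, not the test's actual settings.

```python
import threading
import time

def wait_for_thread(t, timeout=5):
    # Same shape as the failing check: join with a timeout and raise if the
    # thread outlives it, e.g. on an overloaded Jenkins slave.
    t.join(timeout)
    if t.is_alive():
        raise RuntimeError("Timeout waiting for thread")

t = threading.Thread(target=lambda: time.sleep(0.1))
t.start()
wait_for_thread(t)  # returns normally; a slower thread would raise
```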
Re: [ovirt-devel] system tests failing on template export
On 17/10/16 11:51 +0200, Piotr Kliczewski wrote: Adam, I see constant failures due to this and found: 2016-10-17 03:55:21,045 ERROR (jsonrpc/3) [storage.TaskManager.Task] Task=`8989d694-7099-449b-bd66-4d63786be089`::Unexpected error (task:870) Traceback (most recent call last): File "/usr/share/vdsm/storage/task.py", line 877, in _run return fn(*args, **kargs) File "/usr/lib/python2.7/site-packages/vdsm/logUtils.py", line 50, in wrapper res = f(*args, **kwargs) File "/usr/share/vdsm/storage/hsm.py", line 2212, in getAllTasksInfo allTasksInfo = sp.getAllTasksInfo() File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 77, in wrapper raise SecureError("Secured object is not in safe state") SecureError: Secured object is not in safe state This usually indicates that the SPM role has been lost which happens most likely due to connection issues with the storage. What is the storage environment being used for the system tests? Please take a look not sure whether it is related. You can find latest build here [1] Thanks, Piotr [1] http://jenkins.ovirt.org/job/ovirt_master_system-tests/668/ On Fri, Oct 14, 2016 at 11:22 AM, Evgheni Dereveanchin <edere...@redhat.com> wrote: Hello, We've got several cases today where system tests failed when attempting to export templates: http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/testReport/junit/(root)/004_basic_sanity/template_export/ Related engine.log looks something like this: https://paste.fedoraproject.org/449936/47643643/raw/ I could not find any obvious issues in SPM logs, could someone please take a look to confirm what may be causing this issue? Full logs from the test are available here: http://jenkins.ovirt.org/job/ovirt_master_system-tests/655/artifact/ Regards, Evgheni Dereveanchin ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] SPM or SDM for a new verb?
On 05/07/16 12:07 +0300, Shmuel Melamud wrote: Hi! I'm writing code for a new verb (sparsifyInplace) in VDSM and got two different opinions about whether to use SPM or SDM for it: 1) SDM is the new correct approach, need to use it. 2) SDM is on early stage and may be changed significantly, so it is better to use SPM as mature and reliable approach. What's your opinion? SDM is definitely the better way to go, if you can, since it will make less work for you in the future and also make your verb use host resources more efficiently. My guess is that sparsifyInplace just needs to run a command against a volume path that is visible to a selected vdsm host and wait for it to complete. Do you intend for this to be run also while a VM is using the volume? For SDM verbs in vdsm there is a basic formula. All verbs are asynchronous. A new public API function is created in HSM. This function unpacks parameters and then creates and schedules a HostJob instance. The HostJob performs any necessary locking and does the work. It also has an interface for progress reporting and for aborting the operation. Engine monitors HostJobs using a public vdsm API. While it's true that SDM is in early stages, the underlying infrastructure that you will need has been upstream for awhile now. I'll be happy to provide some additional details if you have further questions. -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] [VDSM] Test fails only under make check
I have written a new test [1] and when running 'make check' I get a nasty ImportError (see below). When running the same test using run_tests_local.sh directly it works fine. Any ideas what might be going on? [1] https://gerrit.ovirt.org/#/c/60060/1/tests/storage_hsm_test.py == ERROR: Failure: ImportError (No module named 'Queue') -- Traceback (most recent call last): File "/usr/lib/python3.4/site-packages/nose/failure.py", line 39, in runTest raise self.exc_val.with_traceback(self.tb) File "/usr/lib/python3.4/site-packages/nose/loader.py", line 418, in loadTestsFromName addr.filename, addr.module) File "/usr/lib/python3.4/site-packages/nose/importer.py", line 47, in importFromPath return self.importFromDir(dir_path, fqname) File "/usr/lib/python3.4/site-packages/nose/importer.py", line 94, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python3.4/imp.py", line 235, in load_module return load_source(name, filename, file) File "/usr/lib64/python3.4/imp.py", line 171, in load_source module = methods.load() File "", line 1220, in load File "", line 1200, in _load_unlocked File "", line 1129, in _exec File "", line 1471, in exec_module File "", line 321, in _call_with_frames_removed File "/home/alitke/src/vdsm/tests/storage_hsm_test.py", line 26, in from storagetestlib import fake_file_env File "/home/alitke/src/vdsm/tests/storagetestlib.py", line 24, in from storagefakelib import FakeLVM File "/home/alitke/src/vdsm/tests/storagefakelib.py", line 32, in from storage import lvm as real_lvm File "/home/alitke/src/vdsm/vdsm/storage/lvm.py", line 41, in from vdsm.storage import devicemapper File "/home/alitke/src/vdsm/lib/vdsm/storage/devicemapper.py", line 30, in from vdsm.storage import misc File "/home/alitke/src/vdsm/lib/vdsm/storage/misc.py", line 36, in import Queue ImportError: No module named 'Queue' -- -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
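The error above is the Python 2 to 3 rename: `Queue` became `queue` in Python 3, and the traceback shows `misc.py` doing a bare `import Queue` while nose here runs under Python 3.4. A conventional compatibility shim, one possible fix but not necessarily the one vdsm adopted, looks like:

```python
# six-style compatibility import: keeps working on Python 2 while picking up
# the renamed module on Python 3.
try:
    import Queue as queue  # Python 2
except ImportError:
    import queue           # Python 3

q = queue.Queue()
q.put("item")
```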
Re: [ovirt-devel] Undelivered mail warnings from Gerrit
On 02/06/16 11:39 +0200, Tomáš Golembiovský wrote: Hi, for the last two weeks I've been getting lots of warnings about undelivered mail from Gerrit. The importnat thing in the message being: The original message was received at Wed, 1 Jun 2016 14:57:54 -0400 from gerrit.ovirt.org [127.0.0.1] - Transcript of session follows - <vdsm-patc...@fedorahosted.org>... Deferred: Connection timed out with hosted-lists01.fedoraproject.org. Warning: message still undelivered after 4 hours Will keep trying until message is 5 days old Anyone else experiencing the same problem? Is this being worked on? It's affecting me quite severely also. -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [VDSM] New versioning scheme
On 01/06/16 13:51 +0300, Nir Soffer wrote: Hi all, We are going to branch 4.0 today, and it is a good time to update our versioning scheme. I suggest we use the standard ovirt versioning, used by most projects: 1. master vdsm-4.19.0-201606011345.gitxxxyyy 2. 4.0 vdsm-4.18.1 The important invariant is that any build from master is considered newer compared with the stable builds, since master always contains all stable code, and new code. Second invariant, the most recent build from master is always newer compared with any other master build - the timestamp enforces this. Thoughts? +1 -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
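Both invariants can be illustrated with plain comparisons; real packaging goes through RPM version comparison, so this is only a sketch using the version numbers from the mail.

```python
# Invariant 1: any master build (4.19.0) sorts after any stable build (4.18.x).
master = (4, 19, 0)
stable = (4, 18, 1)
assert master > stable

# Invariant 2: YYYYMMDDHHMM snapshot stamps compare chronologically even as
# plain strings, so the most recent master build always wins.
stamps = ["201606011345", "201605301200", "201606020900"]
newest = max(stamps)
```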
Re: [ovirt-devel] [vdsm] Another network test failing
On 17/05/16 09:45 +0300, Dan Kenigsberg wrote: On Tue, May 17, 2016 at 01:10:19AM +0300, Nir Soffer wrote: On Mon, May 16, 2016 at 4:52 PM, Adam Litke <ali...@redhat.com> wrote: > On 15/05/16 15:10 +0300, Dan Kenigsberg wrote: >> >> On Sun, May 15, 2016 at 10:33:30AM +0300, Edward Haas wrote: >>> >>> On Tue, May 10, 2016 at 8:19 PM, Adam Litke <ali...@redhat.com> wrote: >>> >>> > On 10/05/16 18:08 +0300, Dan Kenigsberg wrote: >>> > >>> >> On Mon, May 09, 2016 at 02:48:43PM -0400, Adam Litke wrote: >>> >> >>> >>> When running make check on my local system I often (but not always) >>> >>> get the following error: >>> >>> >>> >> >>> >> Do you have any clue related to when this happens? (your pwd, >>> >> pythonpath) >>> >> >>> > >>> > Maybe it's a side effect of the way nose loads and runs tests? >>> > >>> > Did it begin with the recent move of netinfo under vdsm.network? >>> >> https://gerrit.ovirt.org/56713 (CommitDate: Thu May 5) or did you see >>> >> it >>> >> earlier? >>> >> >>> > >>> > That's possible. It only started happening recently. It seems to >>> > fail only when run under 'make check' but not when run via >>> > ./run_tests_local.sh. >>> >>> >>> Is it possible that on the same machine you have installed an older vdsm >>> version >>> and it somehow conflicts? (resolving vdsm from the site-packages instead >>> from >>> the local workspace) >> >> >> Or maybe you have *.pyc from an older directory structure left in your >> working directory? > > > I think this was the issue. Removing *.pyc from the source tree fixed > it. Thanks! git clean -dxf is very useful from time to time Yet very dangerous in another times (yes, once upon a time I had the only copy of a helper script hiding within the leafs of a git tree) Been there too. Maybe it's time we fix 'make clean'. -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
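The failure mode discussed here -- bytecode left behind after a module moved -- can be cleaned up without the collateral damage of `git clean -dxf`. A sketch of the narrower cleanup (the file names are illustrative):

```python
# Remove only stale compiled bytecode under a tree, leaving untracked
# source files (the "only copy of a helper script" case) untouched.
import os
import tempfile

tree = tempfile.mkdtemp()
# Simulate leftovers from an old layout: a .pyc with no matching .py,
# next to an untracked script that must survive the cleanup.
open(os.path.join(tree, "netinfo.pyc"), "w").close()
open(os.path.join(tree, "helper.py"), "w").close()

for root, _dirs, files in os.walk(tree):
    for name in files:
        if name.endswith(".pyc"):
            os.remove(os.path.join(root, name))

print(sorted(os.listdir(tree)))  # ['helper.py']
```

The shell equivalent is `find . -name '*.pyc' -delete`, and `git clean -dxn` previews what the destructive variant would remove.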
Re: [ovirt-devel] [vdsm] Another network test failing
On 15/05/16 15:10 +0300, Dan Kenigsberg wrote: On Sun, May 15, 2016 at 10:33:30AM +0300, Edward Haas wrote: On Tue, May 10, 2016 at 8:19 PM, Adam Litke <ali...@redhat.com> wrote: > On 10/05/16 18:08 +0300, Dan Kenigsberg wrote: > >> On Mon, May 09, 2016 at 02:48:43PM -0400, Adam Litke wrote: >> >>> When running make check on my local system I often (but not always) >>> get the following error: >>> >> >> Do you have any clue related to when this happens? (your pwd, >> pythonpath) >> > > Maybe it's a side effect of the way nose loads and runs tests? > > Did it begin with the recent move of netinfo under vdsm.network? >> https://gerrit.ovirt.org/56713 (CommitDate: Thu May 5) or did you see it >> earlier? >> > > That's possible. It only started happening recently. It seems to > fail only when run under 'make check' but not when run via > ./run_tests_local.sh. Is it possible that on the same machine you have installed an older vdsm version and it somehow conflicts? (resolving vdsm from the site-packages instead from the local workspace) Or maybe you have *.pyc from an older directory structure left in your working directory? I think this was the issue. Removing *.pyc from the source tree fixed it. Thanks! -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [vdsm] Another network test failing
On 10/05/16 18:08 +0300, Dan Kenigsberg wrote: On Mon, May 09, 2016 at 02:48:43PM -0400, Adam Litke wrote: When running make check on my local system I often (but not always) get the following error: Do you have any clue related to when this happens? (your pwd, pythonpath) Maybe it's a side effect of the way nose loads and runs tests? Did it begin with the recent move of netinfo under vdsm.network? https://gerrit.ovirt.org/56713 (CommitDate: Thu May 5) or did you see it earlier? That's possible. It only started happening recently. It seems to fail only when run under 'make check' but not when run via ./run_tests_local.sh. $ rpm -qa | grep libvirt-python libvirt-python-1.2.18-1.fc23.x86_64 == ERROR: Failure: ImportError (No module named 'libvirt') -- Traceback (most recent call last): File "/usr/lib/python3.4/site-packages/nose/failure.py", line 39, in runTest raise self.exc_val.with_traceback(self.tb) File "/usr/lib/python3.4/site-packages/nose/loader.py", line 418, in loadTestsFromName addr.filename, addr.module) File "/usr/lib/python3.4/site-packages/nose/importer.py", line 47, in importFromPath return self.importFromDir(dir_path, fqname) File "/usr/lib/python3.4/site-packages/nose/importer.py", line 94, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python3.4/imp.py", line 235, in load_module return load_source(name, filename, file) File "/usr/lib64/python3.4/imp.py", line 171, in load_source module = methods.load() File "", line 1220, in load File "", line 1200, in _load_unlocked File "", line 1129, in _exec File "", line 1471, in exec_module File "", line 321, in _call_with_frames_removed File "/home/alitke/src/vdsm/tests/network/models_test.py", line 27, in from vdsm.network.netinfo import bonding, mtus File "/home/alitke/src/vdsm/lib/vdsm/network/netinfo/__init__.py", line 26, in from vdsm import libvirtconnection File "/home/alitke/src/vdsm/lib/vdsm/libvirtconnection.py", line 29, in import libvirt ImportError: No 
module named 'libvirt' -- Ran 234 tests in 11.084s FAILED (SKIP=42, errors=1) -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] [vdsm] Another network test failing
When running make check on my local system I often (but not always) get the following error: $ rpm -qa | grep libvirt-python libvirt-python-1.2.18-1.fc23.x86_64 == ERROR: Failure: ImportError (No module named 'libvirt') -- Traceback (most recent call last): File "/usr/lib/python3.4/site-packages/nose/failure.py", line 39, in runTest raise self.exc_val.with_traceback(self.tb) File "/usr/lib/python3.4/site-packages/nose/loader.py", line 418, in loadTestsFromName addr.filename, addr.module) File "/usr/lib/python3.4/site-packages/nose/importer.py", line 47, in importFromPath return self.importFromDir(dir_path, fqname) File "/usr/lib/python3.4/site-packages/nose/importer.py", line 94, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File "/usr/lib64/python3.4/imp.py", line 235, in load_module return load_source(name, filename, file) File "/usr/lib64/python3.4/imp.py", line 171, in load_source module = methods.load() File "", line 1220, in load File "", line 1200, in _load_unlocked File "", line 1129, in _exec File "", line 1471, in exec_module File "", line 321, in _call_with_frames_removed File "/home/alitke/src/vdsm/tests/network/models_test.py", line 27, in from vdsm.network.netinfo import bonding, mtus File "/home/alitke/src/vdsm/lib/vdsm/network/netinfo/__init__.py", line 26, in from vdsm import libvirtconnection File "/home/alitke/src/vdsm/lib/vdsm/libvirtconnection.py", line 29, in import libvirt ImportError: No module named 'libvirt' -- Ran 234 tests in 11.084s FAILED (SKIP=42, errors=1) -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Vdsm api package
On 29/03/16 21:01 +0300, Nir Soffer wrote: Hi all, In the Vdsm call, we discussed a way to expose vdsm errors to its clients (e.g., engine, hosted engine agent/setup). The idea is to have a vdsmapi package, holding: - errors.py - all public errors - events.py - all events sent by vdsm - client.py - library for communicating with vdsm - schema.py - the client will use this to autogenerate online help and validate messages - schema.yaml - we probably need several files (gluster, events, etc.) This will allow other projects talking with vdsm to do: from vdsmapi import client, errors ... try: client.list(vmId="xxxyyy") except errors.NoSuchVM: ... (this is a fake example, the real api may be different) Engine can build-require vdsmapi, and generate a Java module with the public errors from the vdsmapi/errors.py module, instead of keeping this hardcoded in engine and updating it every time vdsm adds a new public error. Vdsm will use this package when building responses to clients. Edi was concerned about sharing the errors module, so maybe we need a package: vdsmapi/ errors/ network.py virt.py storage.py gluster.py We can still expose all the errors via errors/__init__.py, so clients do not have to care about the area of the application the error comes from. Thoughts? Seems like a fantastic idea. Would engine builds of the master branch always fetch the errors module from vdsm's master branch or would there be some synchronization points? -- Adam Litke
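The flat-namespace idea -- per-area error modules re-exported from `errors/__init__.py` -- can be sketched in a few lines (all class and module names here are illustrative, echoing the "fake example" caveat above; the real vdsm error hierarchy may differ):

```python
# Per-area modules (errors/storage.py, errors/virt.py, ...) would each
# define their exceptions; errors/__init__.py re-exports them so a client
# can catch errors without knowing which area of vdsm raised them.

class VdsmError(Exception):
    """Base class every public vdsm error would share."""

class NoSuchVM(VdsmError):           # would live in errors/virt.py
    pass

class StorageDomainDown(VdsmError):  # would live in errors/storage.py
    pass

# Client code needs only the flat namespace and the shared base class:
def lookup_vm(vm_id):
    raise NoSuchVM(vm_id)

try:
    lookup_vm("xxxyyy")
except VdsmError as e:
    print(type(e).__name__)  # NoSuchVM
```

A shared base class also answers the synchronization question in part: engine-side code generation only needs the public class names and hierarchy, not the implementation.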
[ovirt-devel] oVirt presentations with reveal.js
I recently gave an oVirt talk at FOSDEM and decided to use reveal.js[1] to create my slides[2]. Reveal.js is a really slick framework for creating HTML5-based presentations that look clean and modern. You write your slides in HTML markup and a JavaScript library provides everything else (it's a long list of features, so see the demo presentation for details[3]). I was pretty happy with the oVirt-themed slides I created, so I decided to package them up into a reusable template. To try it out simply: 1. git clone https://github.com/aglitke/reveal.js.git ovirt-template 2. cd ovirt-template 3. git checkout -b ovirt-template 4. firefox index.html The main changes I have made are to add a floating oVirt logo to the upper-right corner of the slides and a footer. Both can display while presenting and when the slides are printed. I am not a graphic designer or a web developer, so I am keen to accept contributions from people with more experience in this area. Enjoy! [1] https://github.com/hakimel/reveal.js [2] http://aglitke.github.io/fosdem-2016/#/ [3] http://lab.hakim.se/reveal-js/#/ -- Adam Litke
Re: [ovirt-devel] Changing the name of VDSM in oVirt 4.0.
On 26/01/16 19:26 +0200, Nir Soffer wrote: On Tue, Jan 26, 2016 at 5:29 PM, Yaniv Dary <yd...@redhat.com> wrote: I suggest for ease of use and tracking we change the versioning to align to the engine (4.0.0 in oVirt 4.0 GA) to make it easy to know which version was in which release and also change the package naming to something like ovirt-host-manager\ovirt-host-agent. When we think about the names, we should consider all the components installed or running on the host. Here is the current names and future options: Also consider that we have discussed breaking vdsmd into its sub-components. In that case we'd need names for: vdsm-storage vdsm-virt vdsm-network etc I am thinking of vdsm as a service provider to the engine. Today it provides a virtualization hypervisor, a storage repository, network configuration services, etc. I think using the word 'provider' is too long (and possibly too vague). We could just make up something to represent the concept of an endpoint that ovirt-engine uses to get things done. For example, an engine often connects to gears to get things done (but gear is already taken by OpenShift, sadly). How about ovirt-minion? :) ovirt-target? ovirt-element? ovirt-unit? Also consider that an abbreviation or acronym is still okay. Thanks for reading to the bottom of my pre-coffee stream of consciousness. Of the alternatives listed below, I'd be inclined to support 'ovirt-host*'. Current names: vdsmd supervdsmd vdsm-tool vdsClient (we have also two hosted engine daemons, I don't remember the names) Here are some options in no particular order to name these components: Alt 1: ovirt-hypervisor ovirt-hypervisor-helper ovirt-hypervisor-tool ovirt-hyperviosr-cli Alt 2: ovirt-host ovirt-host-helper ovirt-host-tool ovirt-host-cli Alt 3: ovirt-agent ovirt-agent-helper ovirt-agent-tool ovirt-agent-cli Thoughts? Nir -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] Merging patches into vdsm without CI
Hi all, Recent breakage in the vdsm CI flows has caused the change upload trigger to be disabled, so CI scores are no longer being automatically applied to uploaded changes. As a result, patches cannot be merged into vdsm. I have a queue of patches which are otherwise ready for merge (which have passed CI in the past but needed rebasing). These patches have been stalled for almost a week now. What can we do to unfreeze the vdsm development process in the short and long term? Earlier today I worked with Sandro and David on manually running CI on my dev machine but am getting 100s of failures (so it looks like this won't even be a good short-term solution). -- Adam Litke
Re: [ovirt-devel] Vdsm: extending maintainers team
On 04/08/15 09:58 +0100, Dan Kenigsberg wrote: If you follow Vdsm development, you probably have noticed that we are short of active maintainers. Thankfully, we have great developers that - in my opinion - can fill that gap. I am impressed by the quality of their reviews, their endurance, and most importantly - their ability to unbreak whatever code they approve. I'd like to nominate - Nir Soffer - for storage - Francesco Romani - for virt - Piotr Kliczewski - for infra For the time being, I would like to keep my own single point of merger (unless I'm away, of course). Active and former maintainers: please approve A big +2 from me! This is really needed and Nir, Francesco, and Piotr are absolutely the right candidates for maintainership. (My apologies for the delay in responding as I was on PTO.) -- Adam Litke
Re: [ovirt-devel] Stomp regression in vdsm master
On 12/06/15 15:13 +0200, Piotr Kliczewski wrote: On Fri, Jun 12, 2015 at 3:11 PM, Adam Litke ali...@redhat.com wrote: On 12/06/15 11:46 +0200, Piotr Kliczewski wrote: On Fri, Jun 12, 2015 at 11:28 AM, Michal Skrivanek michal.skriva...@redhat.com wrote: On 12 Jun 2015, at 02:38, Adam Litke wrote: On 09/06/15 08:41 +0200, Piotr Kliczewski wrote: Adam, Thank you for reporting. There is work in parallel on the engine side so please refresh your engine as well. The changes that you listed should work with engine 3.5 but will fail as you described for older master. I upgraded engine to latest master (72368f3) and vdsm as well (718909e) and connections were still completely broken between my engine and vdsm until I reverted https://gerrit.ovirt.org/#/c/38451/ . I think there is something real here. I got similar reports from Omer as of yesterday ~noon, both sides latest Is vdsm-jsonrpc-java latest? I have the following in my local maven repo: $ find ~/.m2 -name \*jsonrpc\*.jar /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.1.1-SNAPSHOT/vdsm-jsonrpc-java-client-1.1.1-SNAPSHOT.jar /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.1.1-SNAPSHOT/vdsm-jsonrpc-java-client-1.1.1-20150420.133832-3.jar above is the latest merged. Can you share your logs? Attached. /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.0.15/vdsm-jsonrpc-java-client-1.0.15.jar /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.1.0-SNAPSHOT/vdsm-jsonrpc-java-client-1.1.0-SNAPSHOT.jar /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.1.0-SNAPSHOT/vdsm-jsonrpc-java-client-1.1.0-20150407.125052-6.jar Thanks, Piotr On Mon, Jun 8, 2015 at 10:54 PM, Adam Litke ali...@redhat.com wrote: Hi Piotr, Today I refreshed my vdsm master branch and got the 4 commits at the bottom of this email (among others). 
My engine started having connection timeouts to vdsm (100% connectivity failure). Reverting the commits resolved the problem for me. I don't have logs at the moment but just wanted to share this info in case anyone else started experiencing connectivity problems to vdsm. 14897fea06e8f21ae99144ee0294b21e08ea0892 stomp: calling super explicitly ed12db391f2f147443baf52b5519d51ad5bd3410 stomp: allow single stomp reactor ac85274145cd82eec804e3585b3cd12a6c13261a stompreactor: fix naming of default destination c80ab0657d4f0454c3141aadeadcf134e5f16de7 stomp: server side subscriptions -- Adam Litke -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel -- Adam Litke -- Adam Litke ovirt-logs.tgz Description: application/gzip ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Stomp regression in vdsm master
On 12/06/15 15:36 +0200, Piotr Kliczewski wrote: On Fri, Jun 12, 2015 at 3:23 PM, Adam Litke ali...@redhat.com wrote: On 12/06/15 15:13 +0200, Piotr Kliczewski wrote: On Fri, Jun 12, 2015 at 3:11 PM, Adam Litke ali...@redhat.com wrote: On 12/06/15 11:46 +0200, Piotr Kliczewski wrote: On Fri, Jun 12, 2015 at 11:28 AM, Michal Skrivanek michal.skriva...@redhat.com wrote: On 12 Jun 2015, at 02:38, Adam Litke wrote: On 09/06/15 08:41 +0200, Piotr Kliczewski wrote: Adam, Thank you for reporting. There is work in parallel on the engine side so please refresh your engine as well. The changes that you listed should work with engine 3.5 but will fail as you described for older master. I upgraded engine to latest master (72368f3) and vdsm as well (718909e) and connections were still completely broken between my engine and vdsm until I reverted https://gerrit.ovirt.org/#/c/38451/ . I think there is something real here. I got similar reports from Omer as of yesterday ~noon, both sides latest Is vdsm-jsonrpc-java latest? I have the following in my local maven repo: $ find ~/.m2 -name \*jsonrpc\*.jar /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.1.1-SNAPSHOT/vdsm-jsonrpc-java-client-1.1.1-SNAPSHOT.jar /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.1.1-SNAPSHOT/vdsm-jsonrpc-java-client-1.1.1-20150420.133832-3.jar above is the latest merged. Can you share your logs? Attached. Looking at the logs I do not see any issues but I do not see any processed messages on vdsm side. Please apply this patch [1] it should solve this issue. Indeed it does. Thanks! 
[1] https://gerrit.ovirt.org/#/c/38819 /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.0.15/vdsm-jsonrpc-java-client-1.0.15.jar /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.1.0-SNAPSHOT/vdsm-jsonrpc-java-client-1.1.0-SNAPSHOT.jar /home/alitke/.m2/repository/org/ovirt/vdsm-jsonrpc-java/vdsm-jsonrpc-java-client/1.1.0-SNAPSHOT/vdsm-jsonrpc-java-client-1.1.0-20150407.125052-6.jar Thanks, Piotr On Mon, Jun 8, 2015 at 10:54 PM, Adam Litke ali...@redhat.com wrote: Hi Piotr, Today I refreshed my vdsm master branch and got the 4 commits at the bottom of this email (among others). My engine started having connection timeouts to vdsm (100% connectivity failure). Reverting the commits resolved the problem for me. I don't have logs at the moment but just wanted to share this info in case anyone else started experiencing connectivity problems to vdsm. 14897fea06e8f21ae99144ee0294b21e08ea0892 stomp: calling super explicitly ed12db391f2f147443baf52b5519d51ad5bd3410 stomp: allow single stomp reactor ac85274145cd82eec804e3585b3cd12a6c13261a stompreactor: fix naming of default destination c80ab0657d4f0454c3141aadeadcf134e5f16de7 stomp: server side subscriptions -- Adam Litke -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel -- Adam Litke -- Adam Litke -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Stomp regression in vdsm master
On 09/06/15 08:41 +0200, Piotr Kliczewski wrote: Adam, Thank you for reporting. There is work in parallel on the engine side so please refresh your engine as well. The changes that you listed should work with engine 3.5 but will fail as you described for older master. I upgraded engine to latest master (72368f3) and vdsm as well (718909e) and connections were still completely broken between my engine and vdsm until I reverted https://gerrit.ovirt.org/#/c/38451/ . I think there is something real here. Thanks, Piotr On Mon, Jun 8, 2015 at 10:54 PM, Adam Litke ali...@redhat.com wrote: Hi Piotr, Today I refreshed my vdsm master branch and got the 4 commits at the bottom of this email (among others). My engine started having connection timeouts to vdsm (100% connectivity failure). Reverting the commits resolved the problem for me. I don't have logs at the moment but just wanted to share this info in case anyone else started experiencing connectivity problems to vdsm. 14897fea06e8f21ae99144ee0294b21e08ea0892 stomp: calling super explicitly ed12db391f2f147443baf52b5519d51ad5bd3410 stomp: allow single stomp reactor ac85274145cd82eec804e3585b3cd12a6c13261a stompreactor: fix naming of default destination c80ab0657d4f0454c3141aadeadcf134e5f16de7 stomp: server side subscriptions -- Adam Litke
[ovirt-devel] Stomp regression in vdsm master
Hi Piotr, Today I refreshed my vdsm master branch and got the 4 commits at the bottom of this email (among others). My engine started having connection timeouts to vdsm (100% connectivity failure). Reverting the commits resolved the problem for me. I don't have logs at the moment but just wanted to share this info in case anyone else started experiencing connectivity problems to vdsm. 14897fea06e8f21ae99144ee0294b21e08ea0892 stomp: calling super explicitly ed12db391f2f147443baf52b5519d51ad5bd3410 stomp: allow single stomp reactor ac85274145cd82eec804e3585b3cd12a6c13261a stompreactor: fix naming of default destination c80ab0657d4f0454c3141aadeadcf134e5f16de7 stomp: server side subscriptions -- Adam Litke
Re: [ovirt-devel] [ACTION NEEDED] Packages for 3.5.3-RC1
On 14/05/15 11:46 +0200, Sandro Bonazzola wrote: # Please review the list of rpms / jobs: # ovirt-hosted-engine-ha-1.2.6 http://jenkins.ovirt.org/job/ovirt-hosted-engine-ha_any_create-rpms_manual/7/ # otopi-1.3.2 http://jenkins.ovirt.org/job/otopi_any_create-rpms_manual/16/ # ovirt-engine-dwh-3.5.3_rc http://jenkins.ovirt.org/job/manual-build-tarball/516/ # ovirt-engine-reports-3.5.3_rc http://jenkins.ovirt.org/job/manual-build-tarball/517/ # qemu-kvm-ev-2.1.2-23.el7_1.3 http://jenkins.ovirt.org/job/qemu_master_create-rpms-el7-x86_64_merged/3/ # qemu-kvm-rhev-0.12.1.2-2.448.el6_6.3 http://jenkins.ovirt.org/job/qemu-kvm-rhev_create-rpms_el6/506/ # ovirt-log-collector-3.5.3-0.1.master.git8b7826f http://jenkins.ovirt.org/job/ovirt-log-collector_3.5_create-rpms-fc20-x86_64_merged/25/ http://jenkins.ovirt.org/job/ovirt-log-collector_3.5_create-rpms-el7-x86_64_merged/15/ http://jenkins.ovirt.org/job/ovirt-log-collector_3.5_create-rpms-el6-x86_64_merged/26/ # ovirt-hosted-engine-setup-1.2.4-0.0.master.git62654a6 http://jenkins.ovirt.org/job/ovirt-hosted-engine-setup_3.5_create-rpms-fc20-x86_64_merged/60/ http://jenkins.ovirt.org/job/ovirt-hosted-engine-setup_3.5_create-rpms-el7-x86_64_merged/55/ http://jenkins.ovirt.org/job/ovirt-hosted-engine-setup_3.5_create-rpms-el6-x86_64_merged/61/ # ovirt-engine-3.5.3_rc1 http://jenkins.ovirt.org/job/manual-build-tarball/518/ # vdsm-4.16.16 http://jenkins.ovirt.org/job/manual-build-tarball/523/ #mom, to be released in Fedora and EPEL #ACTION: Adam, please provide the tarball to be released in src. See http://jenkins.ovirt.org/job/manual-build-tarball/524/ Fedora updates are in-progress... #optimizer http://jenkins.ovirt.org/job/ovirt-optimizer_master_create-rpms_merged/72/ # ovirt-node-plugin-hosted-engine # ACTION Fabian / Douglas to provide version to be released. -- Sandro Bonazzola Better technology. Faster innovation. Powered by community collaboration. 
See how it works at redhat.com -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] [vdsm][mom][jsonrpc] new VDSM interface for MOM
On 13/05/15 22:53 +0200, Piotr Kliczewski wrote: On Wed, May 13, 2015 at 9:52 PM, Adam Litke ali...@redhat.com wrote: On 11/05/15 04:28 -0400, Francesco Romani wrote: Hi everyone, I'm working to brush up and enhance my old hack https://gerrit.ovirt.org/#/c/37827/1 That patch adds a new MOM interface, to talk with VDSM using the RPC interface. On top of that, I want to make efficient use of VDSM API (avoid redundant call, possibly issuing only one getAllVmStats call and caching the results, and so forth) Next step will be to backport optimizations to current vdsmInterface. Or maybe, even replacing the new vdsminterface with the new one I'm developing :) I'd like to use the blessed JSON-RPC interface, but what's the recommended way to do that? What is (or will be!) the official recommended VDSM external client interface? I thought about patch https://gerrit.ovirt.org/#/c/39203/ But my _impression_ is that patch will depend on VDSM's internal reactor, thus is not very suitable to be used into an external process. I've written my own extremely crude client using the stomp library. Nir also has a patch [1] on gerrit to do this. Maybe he can provide some insight. It'd be nice if the vdsm-yajsonrpc package could provide a full-featured client class that could be easily integrated into projects like MOM. I will try to provide simple client for people to use. Thanks Piotr! I'm sure you can come up with a much more elegant way to stitch the existing classes together to do what we need. -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
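Until such a client library exists, the message shape itself is plain JSON-RPC 2.0. A sketch of framing and parsing one request (the method name is an illustrative vdsm-style verb, and the STOMP transport is omitted entirely):

```python
import json
import uuid

# A JSON-RPC 2.0 request as a vdsm client would frame it; the transport
# layer (STOMP over TCP) is out of scope for this sketch.
request = {
    "jsonrpc": "2.0",
    "method": "Host.getAllVmStats",  # illustrative method name
    "params": {},
    "id": str(uuid.uuid4()),
}
wire = json.dumps(request)

# A server reply pairs the same id with either "result" or "error",
# which is what lets a client correlate async responses.
decoded = json.loads(wire)
assert decoded["id"] == request["id"]
print(decoded["method"])  # Host.getAllVmStats
```

Because responses carry the request id, a client class mainly needs a dispatch table from pending ids to callers on top of whatever transport it uses.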
Re: [ovirt-devel] [vdsm][mom][jsonrpc] new VDSM interface for MOM
On 11/05/15 04:28 -0400, Francesco Romani wrote: Hi everyone, I'm working to brush up and enhance my old hack https://gerrit.ovirt.org/#/c/37827/1 That patch adds a new MOM interface, to talk with VDSM using the RPC interface. On top of that, I want to make efficient use of VDSM API (avoid redundant call, possibly issuing only one getAllVmStats call and caching the results, and so forth) Next step will be to backport optimizations to current vdsmInterface. Or maybe, even replacing the new vdsminterface with the new one I'm developing :) I'd like to use the blessed JSON-RPC interface, but what's the recommended way to do that? What is (or will be!) the official recommended VDSM external client interface? I thought about patch https://gerrit.ovirt.org/#/c/39203/ But my _impression_ is that patch will depend on VDSM's internal reactor, thus is not very suitable to be used into an external process. I've written my own extremely crude client using the stomp library. Nir also has a patch [1] on gerrit to do this. Maybe he can provide some insight. It'd be nice if the vdsm-yajsonrpc package could provide a full-featured client class that could be easily integrated into projects like MOM. [1] https://gerrit.ovirt.org/#/c/35181/ -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
Re: [ovirt-devel] Engine on Fedora 21
On 02/02/15 15:41 -0500, Greg Sheremeta wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Juan Hernández jhern...@redhat.com Cc: devel@ovirt.org Sent: Monday, February 2, 2015 2:26:29 PM Subject: Re: [ovirt-devel] Engine on Fedora 21 On 02/02/15 13:55 +0100, Juan Hernández wrote: On 02/02/2015 07:56 AM, Sandro Bonazzola wrote: Il 29/01/2015 22:30, Adam Litke ha scritto: On 29/01/15 16:18 -0500, Yedidyah Bar David wrote: - Original Message - From: Adam Litke ali...@redhat.com To: devel@ovirt.org Sent: Thursday, January 29, 2015 9:46:27 PM Subject: [ovirt-devel] Engine on Fedora 21 Hi all, Today I tried running engine on my Fedora 21 laptop. I tried two approaches for deploying jboss: using the ovirt-jboss-as package, and by downloading and unpacking jboss-7.1.1 into /usr/share as I have done in the past. engine-setup runs without errors but when I try to start engine the application does not seem to deploy in jboss and there are no errors reported (engine.log is empty). Is there a reasonable expectation that I should be able to get this working on F21 or am I wasting my time? Does anyone have any ideas on how I can resolve the startup issues? Which Java version did you try to use it with? java-1.8.0-openjdk-1.8.0.31-3.b13.fc21.x86_64 Did you have a look at [1]? In short: won't be, wait for f22. Yeah, didn't see much documentation of specific issues and the tracker bug looks pretty clean as far as general engine usability goes. Everything should be installable right now in F21 but jboss-as 7.1 doesn't work with java 1.8. We'll need to move to wildfly or backport java7 in order to make it working. Alternatively, if it is for development purposes only, you may want to consider using JBoss EAP 6.x instead of JBoss AS 7.1.1. 
The root cause of the incompatibility has been fixed there (and in WildFly): https://issues.jboss.org/browse/WFLY-2057 You can get JBoss EAP from here: http://www.jboss.org/products/eap/download Then you can unzip it to your favorite directory and use it during installation of oVirt Engine: # engine-setup --jboss-home=/whatever/jboss-eap-6.3 It should work well with Java 8. If it doesn't work it is good to know, as we will need to fix it eventually. Thanks for the suggestions everyone. I ended up installing openJDK-1.7 alongside the stock 1.8 and it's working again for me. via yum or did you just download it? Both :) I had to use yumdownloader to grab the rpms from the f20 repo and then install them using rpm --nodeps (since the 1.8 rpms from f21 have an Obsoletes: openjdk-1.7). Perhaps depending on your answer, can't this fulfill the F21 support feature? I'll try to give those other JBoss versions a try soon though. -- Adam Litke
[ovirt-devel] Engine on Fedora 21
Hi all,

Today I tried running engine on my Fedora 21 laptop. I tried two approaches for deploying jboss: using the ovirt-jboss-as package, and by downloading and unpacking jboss-7.1.1 into /usr/share as I have done in the past. engine-setup runs without errors, but when I try to start engine the application does not seem to deploy in jboss and there are no errors reported (engine.log is empty). Is there a reasonable expectation that I should be able to get this working on F21, or am I wasting my time? Does anyone have any ideas on how I can resolve the startup issues? See logs attached...

-- Adam Litke

14:42:59,814 INFO [org.jboss.modules] JBoss Modules version 1.1.1.GA
14:42:59,962 INFO [org.jboss.msc] JBoss MSC version 1.0.2.GA
14:42:59,994 INFO [org.jboss.as] JBAS015899: JBoss AS 7.1.1.Final Brontes starting
14:43:02,359 INFO [org.xnio] XNIO Version 3.0.3.GA
14:43:02,369 INFO [org.jboss.as.logging] JBAS011502: Removing bootstrap log handlers
14:44:55,300 INFO [org.jboss.as.logging] JBAS011503: Restored bootstrap log handlers
14:44:55,340 INFO [com.arjuna.ats.jbossatx] ARJUNA032018: Destroying TransactionManagerService
14:44:55,341 INFO [com.arjuna.ats.jbossatx] ARJUNA032014: Stopping transaction recovery manager
14:44:55,832 INFO [org.apache.coyote.http11.Http11Protocol] Pausing Coyote HTTP/1.1 on http--0.0.0.0-8443
14:44:55,832 INFO [org.apache.coyote.http11.Http11Protocol] Stopping Coyote HTTP/1.1 on http--0.0.0.0-8443
14:44:55,832 INFO [org.apache.coyote.http11.Http11Protocol] Pausing Coyote HTTP/1.1 on http--0.0.0.0-8080
14:44:55,834 INFO [org.apache.coyote.http11.Http11Protocol] Stopping Coyote HTTP/1.1 on http--0.0.0.0-8080
14:45:01,218 INFO [org.jboss.as] JBAS015950: JBoss AS 7.1.1.Final Brontes stopped in 5933ms
2015-01-29 14:43:02,379 INFO [org.xnio.nio] (MSC service thread 1-1) XNIO NIO Implementation Version 3.0.3.GA
2015-01-29 14:43:02,381 INFO [org.jboss.as.security] (ServerService Thread Pool -- 31) JBAS013101: Activating Security Subsystem
2015-01-29 14:43:02,383 INFO [org.jboss.as.security] (MSC service thread 1-4) JBAS013100: Current PicketBox version=4.0.7.Final
2015-01-29 14:43:02,486 INFO [org.jboss.as.naming] (ServerService Thread Pool -- 28) JBAS011800: Activating Naming Subsystem
2015-01-29 14:43:02,488 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 23) JBAS010280: Activating Infinispan subsystem.
2015-01-29 14:43:02,709 INFO [org.jboss.as.connector] (MSC service thread 1-13) JBAS010408: Starting JCA Subsystem (JBoss IronJacamar 1.0.9.Final)
2015-01-29 14:43:03,353 INFO [org.jboss.remoting] (MSC service thread 1-2) JBoss Remoting version 3.2.3.GA
2015-01-29 14:43:04,106 INFO [org.jboss.as.connector.subsystems.datasources] (ServerService Thread Pool -- 19) JBAS010404: Deploying non-JDBC-compliant driver class org.postgresql.Driver (version 9.1)
2015-01-29 14:43:04,111 INFO [org.jboss.as.naming] (MSC service thread 1-10) JBAS011802: Starting Naming Service
2015-01-29 14:43:05,082 INFO [org.jboss.as.remoting] (MSC service thread 1-12) JBAS017100: Listening on /127.0.0.1:8703
2015-01-29 14:43:05,152 INFO [org.apache.coyote.http11.Http11Protocol] (MSC service thread 1-3) Starting Coyote HTTP/1.1 on http--0.0.0.0-8080
2015-01-29 14:43:06,622 INFO [org.apache.coyote.http11.Http11Protocol] (MSC service thread 1-15) Starting Coyote HTTP/1.1 on http--0.0.0.0-8443
2015-01-29 14:43:06,669 INFO [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-13) JBAS010400: Bound data source [java:/ENGINEDataSourceNoJTA]
2015-01-29 14:43:06,670 INFO [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-13) JBAS010400: Bound data source [java:/ENGINEDataSource]

___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] mom-0.4.3
Hi all,

A recent commit [1] to vdsm (a99bacf) introduced a dependency on mom-0.4.3 but mom-0.4.3 does not yet exist. To work around this problem you may build a pre-release src.rpm [2] of mom-0.4.3 that includes the needed functionality. Once we have enough content for a mom point release I'll build the official upstream packages and release an update. Sorry for the inconvenience.

[1] http://gerrit.ovirt.org/#/c/35407/
[2] http://people.redhat.com/~alitke/mom-0.4.3-0.0.aglpre.fc20.src.rpm

-- Adam Litke
Re: [ovirt-devel] contEIOVMs regression?
On 21/11/14 10:03 -0500, Nir Soffer wrote:

- Original Message -
From: Adam Litke ali...@redhat.com
To: Nir Soffer nsof...@redhat.com
Cc: devel@ovirt.org, Francesco Romani from...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Dan Kenigsberg dan...@redhat.com
Sent: Friday, November 21, 2014 4:46:13 PM
Subject: Re: contEIOVMs regression?

On 20/11/14 17:37 -0500, Nir Soffer wrote:

- Original Message -
From: Adam Litke ali...@redhat.com
To: devel@ovirt.org
Cc: Nir Soffer nsof...@redhat.com, Francesco Romani from...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Dan Kenigsberg dan...@redhat.com
Sent: Thursday, November 20, 2014 9:15:33 PM
Subject: contEIOVMs regression?

Hi list,

I am taking a look at Bug 1157421 [1] which describes a situation where VMs that are paused with an -EIO error are not automatically resumed after the problem with storage has been corrected. I have some patches [2] on gerrit that resolve the problem. Since this appears to be a regression I am looking at a non-intrusive way to fix it in the 3.5 branch. There is some disagreement on the proper way to fix this so I am hoping we can arrive at a solution through an open discussion.

The main issue at hand is with the Event/Callback mechanism we use to call clientIF.contEIOVMs. According to my experiments and this online discussion [3], weakref does not work for instance methods such as clientIF.contEIOVMs. Our Event class uses weakref to prevent it from holding references to registered callback functions.

Why is making the event system more correct required to fix [1]?

I see two easy ways to fix the regression:

I don't follow, what is the regression?

Assuming that at some point contEIOVMs actually worked and was able to automatically resume VMs, then we have a regression because, given the weakref problems I am describing herein, there is no way that it is working now. The only way we don't have a regression is if this code has never worked to begin with.
The current code in master does work - when I fixed this last time, the problem was that we did not register the callback before starting the monitors, and that the monitors did not issue a state change the first time a monitor checked the domain state. I verified that contEIOVMs is called and that it does try to continue VMs.

Very curious. I am working with 3.5.0. The main difference (other than branch) is that I am working in an environment with no connected storage pool. Though I still can't see how the weakref stuff could be working in master.

If this does not break now (with current code), please open a bug.

1) Treat clientIF as a singleton class (which it is) and make contEIOVMs a module-level method which gets the clientIF instance and calls its bound contEIOVMs method. See my patches [2] for the code behind this idea.

This is the wrong direction. There is only one place using that horrible getInstance(), and it also could just create the single instance that we need. We should remove getInstance() instead of using it in new code.

2) Allow Event to maintain a strong reference on the bound clientIF.contEIOVMs method. This will allow the current code to work as designed but will change the Event implementation to accommodate this specific use case. Since no one else appears to be using this code, it should have no functional impact.

The code is already holding a strong reference now, no change is needed :-)

I disagree. From vdsm/storage/misc.py:

class Event(object):
    ...
    def register(self, func, oneshot=False):
        with self._syncRoot:
            self._registrar[id(func)] = (weakref.ref(func), oneshot)
            # ^^^ He's dead Jim
    ...

The function is converted into a weak reference. Since, in this case, the function is an instance method, the reference is immediately dead on arrival. I have verified this with debugging statements in my environment.

So you suggest that taking a weakref to an instance method returns a dead reference?
I thought that the problem is that an instance method keeps a hard reference to the instance, so the weakref is useless.

Yeah, try out this test program to see what I mean:

#!/usr/bin/env python
import weakref
from functools import partial

class A(object):
    def __init__(self):
        self.r1 = weakref.ref(self.a)
        self.r2 = partial(A.a, weakref.proxy(self))

    def a(self):
        print "Hello from a"

def main():
    obj = A()
    print obj.r1
    obj.r2()

if __name__ == '__main__':
    main()

-- Adam Litke
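The dead-on-arrival behavior discussed above, and one stdlib way around it that still avoids keeping the instance alive, can be sketched in a few lines. This is an illustrative sketch only: ClientIF and contEIOVMs here are hypothetical stand-ins mirroring the names in the thread, not the real vdsm classes, and it uses Python 3's weakref.WeakMethod rather than anything vdsm actually did.

```python
import weakref


class ClientIF:
    """Hypothetical stand-in for vdsm's clientIF singleton."""

    def __init__(self):
        self.resumed = []

    def contEIOVMs(self):
        self.resumed.append("vm")


cif = ClientIF()

# A plain weakref to a bound method is dead on arrival: each attribute
# access builds a fresh bound-method object, which is collected as soon
# as weakref.ref() returns (CPython refcounting).
dead = weakref.ref(cif.contEIOVMs)
assert dead() is None

# weakref.WeakMethod re-binds the method on demand: the callback stays
# usable while the instance lives, without the registry keeping the
# instance alive.
cb = weakref.WeakMethod(cif.contEIOVMs)
method = cb()
assert method is not None
method()
assert cif.resumed == ["vm"]

# Once the instance goes away, the weak method dies with it.
del method, cif
assert cb() is None
```

An Event-style registry could store `WeakMethod` objects for bound methods and plain `weakref.ref` for free functions, which is roughly the behavior the original `weakref.ref(func)` line was presumably aiming for.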
[ovirt-devel] contEIOVMs regression?
Hi list,

I am taking a look at Bug 1157421 [1] which describes a situation where VMs that are paused with an -EIO error are not automatically resumed after the problem with storage has been corrected. I have some patches [2] on gerrit that resolve the problem. Since this appears to be a regression I am looking at a non-intrusive way to fix it in the 3.5 branch. There is some disagreement on the proper way to fix this so I am hoping we can arrive at a solution through an open discussion.

The main issue at hand is with the Event/Callback mechanism we use to call clientIF.contEIOVMs. According to my experiments and this online discussion [3], weakref does not work for instance methods such as clientIF.contEIOVMs. Our Event class uses weakref to prevent it from holding references to registered callback functions. I see two easy ways to fix the regression:

1) Treat clientIF as a singleton class (which it is) and make contEIOVMs a module-level method which gets the clientIF instance and calls its bound contEIOVMs method. See my patches [2] for the code behind this idea.

2) Allow Event to maintain a strong reference on the bound clientIF.contEIOVMs method. This will allow the current code to work as designed but will change the Event implementation to accommodate this specific use case. Since no one else appears to be using this code, it should have no functional impact.

Are there any other ideas I'm missing? I am aware of plans to refactor this code for 3.6 but I am more interested in a short-term, practical solution to address the current regression. Thanks for offering your insight on this problem.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1157421
[2] http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:bug1157421,n,z

-- Adam Litke
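Option (2) above can be sketched with a toy Event class that simply stores the callback object itself, so a registered bound method (and therefore its instance) stays alive while registered. The names Event, ClientIF, and contEIOVMs mirror the discussion, but this is a minimal illustration under those assumptions, not the vdsm implementation.

```python
import threading


class Event:
    """Toy event: register() keeps a strong reference to the callback."""

    def __init__(self):
        self._lock = threading.Lock()
        self._registrar = {}

    def register(self, func, oneshot=False):
        with self._lock:
            # Strong reference: the bound method object we were handed
            # (and the instance behind it) cannot be collected while it
            # remains registered.
            self._registrar[id(func)] = (func, oneshot)

    def emit(self, *args, **kwargs):
        with self._lock:
            callbacks = list(self._registrar.items())
        for key, (func, oneshot) in callbacks:
            func(*args, **kwargs)
            if oneshot:
                with self._lock:
                    self._registrar.pop(key, None)


class ClientIF:
    """Hypothetical stand-in for the clientIF singleton."""

    def __init__(self):
        self.calls = 0

    def contEIOVMs(self):
        self.calls += 1


cif = ClientIF()
event = Event()
event.register(cif.contEIOVMs)
event.emit()
event.emit()
assert cif.calls == 2
```

The trade-off is exactly the one debated in the thread: the registry now pins clientIF in memory, which is harmless for a process-lifetime singleton but would leak for short-lived objects.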
Re: [ovirt-devel] Unstable network connections after installing vdsm
On 10/11/14 11:19 -0500, Ondřej Svoboda wrote:

Hi Adam, there are some known issues, which depend on the versions of software you are running. If you are using EL7 (or Fedoras?) you may want to switch SELinux to permissive mode and turn off NetworkManager (both are separate problems with bugs open for them [1, 2]). Then give it another go. I think this could be your case. Please begin with NM, which is my suspect.

Using Fedora 20, and I think this was the culprit. On my newly installed machines I have yet to reproduce the problem with NM disabled. Have we considered making the vdsm package conflict with NetworkManager? Or is this just a temporary situation?

If you are on EL6 you might be experiencing the traffic control (tc) utility or even the kernel not supporting certain commands. I finally came around to looking at this problem so I might be able to sort it out (or ask Toni for help). In the meantime, could you let us know what version of VDSM, selinux-policy and NetworkManager you are running? Could you attach /var/log/vdsm/supervdsm.log and /var/log/vdsm/vdsm.log? Does something (NetworkManager!) in the journal seem fishy?

No repro, but here are the package versions:

vdsm-4.16.0-522.git4a3768f.fc20.x86_64
NetworkManager-0.9.9.0-46.git20131003.fc20.x86_64
selinux-policy-3.12.1-193.fc20.noarch

-- Adam Litke
[ovirt-devel] Unstable network connections after installing vdsm
I've been experiencing peculiar and annoying networking behavior on my oVirt development hosts, and I'm hoping someone familiar with vdsm networking configuration can help me get to the bottom of it. My setup is two mini-Dells acting as virt hosts and ovirt engine running on my laptop. The dells get their network config from a cobbler instance running on my laptop, which also provides PXE services. After freshly installing the dells, I get a nice, stable network connection. After installing vdsm, the connection seems to drop occasionally. I have to visit the machine, log into the console, and execute 'dhclient ovirtmgmt'. This fixes the problem again for a while.

Does this sound like anything someone has seen before? What would be the best way to start debugging/diagnosing this issue? Thanks in advance for your responses.

-- Adam Litke
Re: [ovirt-devel] [ovirt-users] OVIRT-3.5-TEST-DAY-3: replace XML-rpc with JSON-rpc
On 17/09/14 15:46 -0400, Francesco Romani wrote:

- Original Message -
From: Francesco Romani from...@redhat.com
To: devel@ovirt.org
Cc: users us...@ovirt.org
Sent: Wednesday, September 17, 2014 5:33:01 PM
Subject: [ovirt-users] OVIRT-3.5-TEST-DAY-3: replace XML-rpc with JSON-rpc

Everything I tried went OK, and logs look good to me. I ran into a few hiccups, which I mention for the sake of completeness:

- VDSM refused to start or run VMs initially: libvirt config included relics from a past environment on the same box, not JSON-rpc's fault. Fixed with a new config and (later) a reboot.
- Trying recovery, Engine took longer than expected to sync up with VDSM. I have no hard data and a feeling is not enough to file a BZ, so I didn't.
- Still trying recovery, one and just one time Engine had stale data from VDSM (reported two VMs as present which actually aren't). Not sure it was related to JSON-rpc, can't reproduce, so didn't file a BZ.

I need to partially amend this statement as, running more benchmarks/profiling, I got this twice in a row:

INFO:root:starting 100 vms
INFO:root:start: serial execution
INFO:root:Starting VM: XS_C000
INFO:root:Starting VM: XS_C001
INFO:root:Starting VM: XS_C002
Traceback (most recent call last):
  File "./observe.py", line 154, in <module>
    data = bench(host, 'XS_C%03i', first, last, api, outfile, mins * 60.)
  File "./observe.py", line 122, in bench
    start(vms)
  File "./observe.py", line 66, in start
    vm.start()
  File "./observe.py", line 54, in start
    self._handle.start()
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/brokers.py", line 16507, in start
    headers={"Correlation-Id":correlation_id}
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 118, in request
    persistent_auth=self._persistent_auth)
  File "/usr/lib/python2.7/site-packages/ovirtsdk/infrastructure/proxy.py", line 140, in __doRequest
    persistent_auth=persistent_auth
  File "/usr/lib/python2.7/site-packages/ovirtsdk/web/connection.py", line 134, in doRequest
    raise RequestError, response
ovirtsdk.infrastructure.errors.RequestError:
status: 400
reason: Bad Request
detail: Network error during communication with the Host.

(this is a runner script using the oVirt SDK for Python; source is available on demand and will be published anyway soon[ish])

On engine logs I see something like this: http://fpaste.org/134263/

Since the above is way too vague to file a meaningful BZ, I'm now continuing the investigation to see if there is a bug somewhere or if it's a hiccup of my local environment.

I just want to note that I have been experiencing vague, intermittent jsonRPC issues with my environment also. I have filed 1143042 which I believe to be a symptom of unreliable communication. It seems to me that we have a definite problem to work out.

-- Adam Litke
[ovirt-devel] mom-0.4.2 released (Karma requested)
Hi all, I have released mom-0.4.2 and submitted updates for f20[1], el6[2], and epel7[3]. If you have a spare cycle, please install this new version from the updates-testing repository and add a comment in the fedora updates system. This will help expedite the rollout of this update. Thanks! [1] https://admin.fedoraproject.org/updates/FEDORA-2014-10757/mom-0.4.2-1.fc20 [2] https://admin.fedoraproject.org/updates/mom-0.4.2-1.el6 [3] https://admin.fedoraproject.org/updates/mom-0.4.2-1.el7 -- Adam Litke
Re: [ovirt-devel] mom-0.4.2 released (Karma requested)
On 12/09/14 18:09 +0200, Sven Kieske wrote:

I see no new branch created at http://gerrit.ovirt.org/#/admin/projects/mom,branches ?

This is all still in the master branch. We decided there is no point in creating branches for most releases since we're just releasing straight releases into fedora. If there becomes a need for a stable branch in the future, more will need to change than just a branch in mom. We'll need to ship mom RPMs in oVirt and make sure that vdsm depends on the oVirt version instead of the latest upstream version. Hoping to avoid all of this for now.

Also a changelog would be awesome to have (there seem to be no huge changes since 0.4.1).

Nothing huge here. The purpose of this release is to allow vdsm to bump the version of mom it depends on, for enabling the memory ballooning functional tests.

Thanks for the new release anyway!

My pleasure.

-- Adam Litke
[ovirt-devel] mom-0.4.2 release and oVirt-3.5
Hi all, Dan has asked for a new release of mom (so that vdsm can be sure to depend on the latest code upstream). I would like to do one more release prior to oVirt-3.5 in order to get anything required for 3.5 features in the upstream Fedora/EPEL repos. Is there anything else that will be needed for this release? -- Adam Litke
[ovirt-devel] How do I build oVirt Jenkins jobs locally
Hey guys, I am following http://www.ovirt.org/Local_Jenkins_For_The_People in order to set up a build env for ovirt-engine. I've got the basic setup running but I'd like to be able to build rpms in the same way that we do on oVirt.org. I came across the 'jenkins' repo in gerrit but I can't figure out how to use that to create an XML file for the create_rpms job suitable for import into jenkins. Can anyone point me in the right direction? -- Adam Litke
Re: [ovirt-devel] What does your oVirt development environment look like?
On 21/08/14 09:17 -0400, Greg Sheremeta wrote:

Good idea. Thanks, and thank you for sharing. I work on the UI, so I don't have much of a need for a complex setup. I have the two mini dells, and then I have two much more powerful personal machines that I use for work -- machine 1 (dauntless) is my main development machine, and machine 2 (starbase) is my main home server. I compile and run engine on dauntless, and starbase serves NFS and SMB. I don't have iscsi set up, although I probably should learn this. I use nested virt for all my hosts,

For a Friday afternoon project you might want to check out this easy-to-follow guide for targetcli. It's what I use for software iSCSI and it works pretty well for me: https://wiki.archlinux.org/index.php/ISCSI_Target

so mini dell 1 and mini dell 2 both run Fedora 20 and I basically just remote to them to install VMs via virt-manager. I had cobbler running at one point, but I got frustrated with it one too many times and gave up. Now I just have a giant collection of isos available via NFS (and scattered on the desktops of the mini dells :)) I typically install fresh hosts using the F20 network-install iso. It's a little slower, but very reliable.

Yeah, I am wondering if this would be a better approach (though I really do like the unattended PXE installations I can do with cobbler).

I tend to not need more than one or two database instances at a time. I gave up using my laptop for primary development because I need three monitors on my dev rig, and my laptop supports two max. (I'm currently heartbroken at the lack of USB3 video for linux. See [1].) I basically use my laptop as a remote viewer to dauntless now when I'm working in bed or wanting to sit out on the porch. (RealVNC encrypted mode -- I use an xrandr script to toggle off two of dauntless's monitors, and then I full-screen VNC.) Old pic of my desk: [2]

Wow, I feel really low-tech with my single widescreen monitor here.
Dauntless, starbase, the dells, and all monitors are connected to a giant UPS. Home network equipment is all connected to another UPS. I've given some thought to building a distributed compile of ovirt (specifically the GWT part -- maybe distribute each permutation to worker nodes), but I was under the impression that most people just use their laptop for work. I think a distributed compile would be pretty nice for me, but not sure how many people would use it. ?

I try to compile engine as infrequently as possible. Due to what it does to my running system, I usually reboot afterwards too.

-- Adam Litke
[ovirt-devel] What does your oVirt development environment look like?
Ever since starting to work on oVirt around 3 years ago I've been striving for the perfect development and test environment. I was inspired by Yaniv's recent deep dive on Foreman integration and thought I'd ask people to share their setups and any tips and tricks so we can all become better, more efficient developers.

My setup consists of my main work laptop and two mini-Dell servers. I run the engine on my laptop and I serve NFS and iSCSI (using targetcli) from this system as well. I use the ethernet port on the laptop to connect it to a subnet with the two Dell systems. Some goals for my setup are:

- Easy provisioning of the virt-hosts so I can quickly test on Fedora and CentOS without spending lots of time reinstalling
- Ability to test block and nfs storage
- Automation of test scenarios involving engine and hosts

To help me reach these goals I've deployed cobbler on my laptop and it does a pretty good job at managing PXE boot configurations for my hosts (and VMs) so they can be automatically installed as needed. After viewing Yaniv's presentation, it seems that Foreman/Puppet are the way of the future, but it does seem a bit more involved to set up. I am definitely curious if others are using Foreman in their personal dev/test environment and can offer some insight on how that is working out.

Thanks, and I look forward to reading about more of your setups! If we get enough of these, maybe this could make a good section of the wiki.

-- Adam Litke
Re: [ovirt-devel] What does your oVirt development environment look like?
On 15/08/14 15:57 -0400, Yair Zaslavsky wrote: - Original Message - From: ybronhei ybron...@redhat.com To: Adam Litke ali...@redhat.com, devel@ovirt.org Sent: Friday, August 15, 2014 7:36:23 PM Subject: Re: [ovirt-devel] What does your oVirt development environment look like? On 08/15/2014 09:32 AM, Adam Litke wrote: Ever since starting to work on oVirt around 3 years ago I've been striving for the perfect development and test environment. I was inspired by Yaniv's recent deep dive on Foreman integration and thought I'd ask people to share their setups and any tips and tricks so we can all become better, more efficient developers. My setup consists of my main work laptop and two mini-Dell servers. I run the engine on my laptop and I serve NFS and iSCSI (using targetcli) from this system as well. I use the ethernet port on the laptop to connect it to a subnet with the two Dell systems. Some goals for my setup are: - Easy provisioning of the virt-hosts so I can quickly test on Fedora and CentOS without spending lots of time reinstalling - Ability to test block and nfs storage - Automation of test scenarios involving engine and hosts To help me reach these goals I've deployed cobbler on my laptop and it does a pretty good job at managing PXE boot configurations for my hosts (and VMs) so they can be automatically intalled as needed. After viewing Yaniv's presentation, it seems that Forman/Puppet are the way of the future but it does seem a bit more involved to set up. I am definitely curious if others are using Foreman in their personal dev/test environment and can offer some insight on how that is working out. Thanks, and I look forward to reading about more of your setups! If we get enough of these, maybe this could make a good section of the wiki. 
Happy to hear :) for those who missed - https://www.youtube.com/watch?v=gozX891kYAY

Each one has their own needs and goals I guess, but if you say it might help, I'll never say no to sharing :P I have 3 dells under my desk. I compile the engine a lot and it's heavy for my laptop, so I clone my local working directory and build it on the strongest mini-dell using a local jenkins server (http://www.ovirt.org/Local_Jenkins_For_The_People). The other 2 I use as hypervisors when needed. Provisioning them is done by me manually :/.. cobbler PXE boot could help with an already defined image.. Other than that, I have an nfs mount for storage and a few VMs for compilation and small tests.

Haven't used Jenkins for the People for quite some time; it's awesome though. Yaniv, does your Jenkins build all your local branches?

I don't have much to share; my environment is even simpler. I am sure it's common knowledge, but still a reminder (even if only a new developer can benefit from it, it will be good) - you can create a database schema per each branch you work on, and if you need to switch between branches, you don't have to destroy your current database. Quite helpful, I must say, for someone who works 100% on engine-related stuff.

Thanks for sharing... How do you manage your multiple db schemas? Just with the engine-backup and engine-restore commands?

-- Adam Litke
Re: [ovirt-devel] What does your oVirt development environment look like?
On 15/08/14 16:20 -0400, Alon Bar-Lev wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Yair Zaslavsky yzasl...@redhat.com Cc: devel@ovirt.org Sent: Friday, August 15, 2014 11:17:05 PM Subject: Re: [ovirt-devel] What does your oVirt development environment look like? On 15/08/14 15:57 -0400, Yair Zaslavsky wrote: - Original Message - From: ybronhei ybron...@redhat.com To: Adam Litke ali...@redhat.com, devel@ovirt.org Sent: Friday, August 15, 2014 7:36:23 PM Subject: Re: [ovirt-devel] What does your oVirt development environment look like? On 08/15/2014 09:32 AM, Adam Litke wrote: Ever since starting to work on oVirt around 3 years ago I've been striving for the perfect development and test environment. I was inspired by Yaniv's recent deep dive on Foreman integration and thought I'd ask people to share their setups and any tips and tricks so we can all become better, more efficient developers. My setup consists of my main work laptop and two mini-Dell servers. I run the engine on my laptop and I serve NFS and iSCSI (using targetcli) from this system as well. I use the ethernet port on the laptop to connect it to a subnet with the two Dell systems. Some goals for my setup are: - Easy provisioning of the virt-hosts so I can quickly test on Fedora and CentOS without spending lots of time reinstalling - Ability to test block and nfs storage - Automation of test scenarios involving engine and hosts To help me reach these goals I've deployed cobbler on my laptop and it does a pretty good job at managing PXE boot configurations for my hosts (and VMs) so they can be automatically intalled as needed. After viewing Yaniv's presentation, it seems that Forman/Puppet are the way of the future but it does seem a bit more involved to set up. I am definitely curious if others are using Foreman in their personal dev/test environment and can offer some insight on how that is working out. Thanks, and I look forward to reading about more of your setups! 
If we get enough of these, maybe this could make a good section of the wiki.

Happy to hear :) for those who missed - https://www.youtube.com/watch?v=gozX891kYAY

Each one has their own needs and goals I guess, but if you say it might help, I'll never say no to sharing :P I have 3 dells under my desk. I compile the engine a lot and it's heavy for my laptop, so I clone my local working directory and build it on the strongest mini-dell using a local jenkins server (http://www.ovirt.org/Local_Jenkins_For_The_People). The other 2 I use as hypervisors when needed. Provisioning them is done by me manually :/.. cobbler PXE boot could help with an already defined image.. Other than that, I have an nfs mount for storage and a few VMs for compilation and small tests.

Haven't used Jenkins for the People for quite some time; it's awesome though. Yaniv, does your Jenkins build all your local branches?

I don't have much to share; my environment is even simpler. I am sure it's common knowledge, but still a reminder (even if only a new developer can benefit from it, it will be good) - you can create a database schema per each branch you work on, and if you need to switch between branches, you don't have to destroy your current database. Quite helpful, I must say, for someone who works 100% on engine-related stuff.

Thanks for sharing... How do you manage your multiple db schemas? Just with the engine-backup and engine-restore commands?

Just create N empty databases, install each environment to a different PREFIX, and when running engine-setup select one for each environment.

Even better. Thank you!

Refer to README.developer in the engine repo. BTW: with proper listen-port customization, you can even have N engine instances running on the same machine at the same time.

Alon

-- Adam Litke
Re: [ovirt-devel] [QA] [ACTION REQUIRED] oVirt 3.5.0 RC2 status
On 12/08/14 17:43 +0200, Sandro Bonazzola wrote:

Hi, tomorrow we should compose oVirt 3.5.0 RC2 starting at 08:00 UTC. We still have the following blockers list:

Bug ID   Whiteboard  Status  Summary
1127294  storage     POST    Live Merge: Resolve unknown merge status in vdsm after host crash
1109920  storage     POST    Live Merge: Extend internal block volumes during merge

There are several patches for master (6) that must be merged and backported to 3.5. Thanks Francesco for your reviews (I will repost the series this afternoon for followup review). I would appreciate a look by those I've included as reviewers (you received a separate email from me) so we can converge on these ASAP.

-- Adam Litke
Re: [ovirt-devel] python-ioprocess for el7?
On 14/07/14 10:12 -0400, Douglas Schilling Landgraf wrote:

On 07/12/2014 11:54 PM, Douglas Schilling Landgraf wrote:

On 07/11/2014 07:46 AM, Dan Kenigsberg wrote:

On Thu, Jul 10, 2014 at 05:01:21PM -0400, Adam Litke wrote:

Hi, I am looking for python-ioprocess RPMs (new enough for the latest vdsm requirements). Can anyone point me in the right direction? Thanks!

Looking at
https://admin.fedoraproject.org/updates/search/python-pthreading
https://admin.fedoraproject.org/updates/search/python-cpopen
https://admin.fedoraproject.org/updates/search/ioprocess
I can confirm that we are missing quite a few of our dependencies for el7. Douglas, Yaniv: can you have them built? I see that http://dl.fedoraproject.org/pub/epel/beta/7/x86_64/ already exists, and I hope to see our packages there.

Sure, please refresh; python-cpopen and python-pthreading should be there now. However, ioprocess requires Saggi's interaction.

Hi Adam, I got access to build ioprocess; it should be soon at http://dl.fedoraproject.org/pub/epel/beta/7/x86_64/ or right now you can get it via: http://koji.fedoraproject.org/koji/taskinfo?taskID=7137068

Hmm, this seems to still be version 0.3-2.

-- Adam Litke
[ovirt-devel] Custom fencing with virsh_fence
Hi all,

I am trying to configure custom fencing using fence_virsh in order to test out fencing flows with my virtualized oVirt hosts. I'm getting a failure when clicking the Test button. Can someone help me to diagnose the problem? I have applied the following settings using engine-config:

~/ovirt-engine/bin/engine-config -s CustomVdsFenceType=xxxvirt
~/ovirt-engine/bin/engine-config -s CustomFenceAgentMapping=xxxvirt=virsh
~/ovirt-engine/bin/engine-config -s CustomVdsFenceOptionMapping=xxxvirt:address=ip,username=username,password=password

(note that engine-config seems to arbitrarily limit the number of mapped options to 3. Seems like a bug to me).

Here is the log output in engine.log:

2014-07-15 11:43:34,813 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (http--0.0.0.0-8080-1) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host centennial from cluster block was chosen as a proxy to execute Status command on Host cascade.
2014-07-15 11:43:34,813 INFO [org.ovirt.engine.core.bll.FenceExecutor] (http--0.0.0.0-8080-1) Using Host centennial from cluster block as proxy to execute Status command on Host
2014-07-15 11:43:34,815 INFO [org.ovirt.engine.core.bll.FenceExecutor] (http--0.0.0.0-8080-1) Executing Status Power Management command, Proxy Host:centennial, Agent:virsh, Target Host:, Management IP:192.168.2.101, User:root, Options:
2014-07-15 11:43:34,816 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http--0.0.0.0-8080-1) START, FenceVdsVDSCommand(HostName = centennial, HostId = a34f7dbc-dd99-4831-a1a9-54c411080ec1, targetVdsId = b6b9d480-e20f-411a-9b9c-883fac32a4e5, action = Status, ip = 192.168.2.101, port = , type = virsh, user = root, password = **, options = ''), log id: 24f33bda
2014-07-15 11:43:34,875 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http--0.0.0.0-8080-1) Failed in FenceVdsVDS method, for vds: centennial; host: 192.168.2.103
2014-07-15 11:43:34,876 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http--0.0.0.0-8080-1) Command FenceVdsVDSCommand(HostName = centennial, HostId = a34f7dbc-dd99-4831-a1a9-54c411080ec1, targetVdsId = b6b9d480-e20f-411a-9b9c-883fac32a4e5, action = Status, ip = 192.168.2.101, port = , type = virsh, user = root, password = **, options = '') execution failed. Exception: ClassCastException: [Ljava.lang.Object; cannot be cast to java.lang.String
2014-07-15 11:43:34,877 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http--0.0.0.0-8080-1) FINISH, FenceVdsVDSCommand, log id: 24f33bda

-- Adam Litke
Re: [ovirt-devel] Custom fencing with virsh_fence
On 15/07/14 17:59 +0200, Juan Hernandez wrote: On 07/15/2014 05:51 PM, Adam Litke wrote: Hi all, I am trying to configure custom fencing using fence_virsh in order to test out fencing flows with my virtualized oVirt hosts. I'm getting a failure when clicking the Test button. Can someone help me to diagnose the problem? I have applied the following settings using engine-config: ~/ovirt-engine/bin/engine-config -s CustomVdsFenceType=xxxvirt ~/ovirt-engine/bin/engine-config -s CustomFenceAgentMapping=xxxvirt=virsh ~/ovirt-engine/bin/engine-config -s CustomVdsFenceOptionMapping=xxxvirt:address=ip,username=username,password=password (note that engine-config seems to arbitrarily limit the number of mapped options to 3. Seems like a bug to me). Here is the log output in engine.log: 2014-07-15 11:43:34,813 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (http--0.0.0.0-8080-1) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host centennial from cluster block was chosen as a proxy to execute Status command on Host cascade. 
2014-07-15 11:43:34,813 INFO [org.ovirt.engine.core.bll.FenceExecutor] (http--0.0.0.0-8080-1) Using Host centennial from cluster block as proxy to execute Status command on Host 2014-07-15 11:43:34,815 INFO [org.ovirt.engine.core.bll.FenceExecutor] (http--0.0.0.0-8080-1) Executing Status Power Management command, Proxy Host:centennial, Agent:virsh, Target Host:, Management IP:192.168.2.101, User:root, Options: 2014-07-15 11:43:34,816 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http--0.0.0.0-8080-1) START, FenceVdsVDSCommand(HostName = centennial, HostId = a34f7dbc-dd99-4831-a1a9-54c411080ec1, targetVdsId = b6b9d480-e20f-411a-9b9c-883fac32a4e5, action = Status, ip = 192.168.2.101, port = , type = virsh, user = root, password = **, options = ''), log id: 24f33bda 2014-07-15 11:43:34,875 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http--0.0.0.0-8080-1) Failed in FenceVdsVDS method, for vds: centennial; host: 192.168.2.103 2014-07-15 11:43:34,876 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http--0.0.0.0-8080-1) Command FenceVdsVDSCommand(HostName = centennial, HostId = a34f7dbc-dd99-4831-a1a9-54c411080ec1, targetVdsId = b6b9d480-e20f-411a-9b9c-883fac32a4e5, action = Status, ip = 192.168.2.101, port = , type = virsh, user = root, password = **, options = '') execution failed. Exception: ClassCastException: [Ljava.lang.Object; cannot be cast to java.lang.String 2014-07-15 11:43:34,877 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http--0.0.0.0-8080-1) FINISH, FenceVdsVDSCommand, log id: 24f33bda Looks like this bug: https://bugzilla.redhat.com/1114977 Indeed it is. So I looked at the host to see what the failure was and I get the following messages. It looks like engine is not passing the contents of the 'Slot' UI field as the port option. 
This, even after I changed the param mapping like so:

engine-config -s CustomVdsFenceOptionMapping=xxxvirt:address=ip,username=username,password=password,slot=port

Thread-440::DEBUG::2014-07-15 15:06:46,997::API::1165::vds::(fenceNode) fenceNode(addr=192.168.2.101,port=,agent=virsh,user=root,passwd=,action=status,secure=,options==block-cascade)
Thread-440::DEBUG::2014-07-15 15:06:46,997::utils::594::root::(execCmd) /usr/sbin/fence_virsh (cwd None)
Thread-440::DEBUG::2014-07-15 15:06:47,035::utils::614::root::(execCmd) FAILED: err = WARNING:root:Parse error: Ignoring unknown option '=block-cascade'\n\nERROR:root:Failed: You have to enter plug number or machine identification\n\nERROR:root:Please use '-h' for usage\n\n; rc = 1
Thread-440::DEBUG::2014-07-15 15:06:47,035::API::1152::vds::(fence) rc 1 inp agent=fence_virsh ipaddr=192.168.2.101 login=root action=status passwd= =block-cascade out [] err [WARNING:root:Parse error: Ignoring unknown option '=block-cascade', '', 'ERROR:root:Failed: You have to enter plug number or machine identification', '', ERROR:root:Please use '-h' for usage, '']
Thread-440::DEBUG::2014-07-15 15:06:47,035::API::1188::vds::(fenceNode) rc 1 in agent=fence_virsh ipaddr=192.168.2.101 login=root action=status passwd= =block-cascade out [] err [WARNING:root:Parse error: Ignoring unknown option '=block-cascade', '', 'ERROR:root:Failed: You have to enter plug number or machine identification', '', ERROR:root:Please use '-h' for usage, '']

-- Adam Litke
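As an aside, the malformed `=block-cascade` option in the vdsm log above is consistent with a key=value option builder that emits an empty option name when a UI field (here, "Slot") has no working entry in the option mapping. A hypothetical illustration in Python (this is not the actual engine code):

```python
# Hypothetical illustration (not the actual engine code) of how a key=value
# option builder degenerates when a UI field has no mapped option name: the
# "slot" value is emitted with an empty key, matching the "=block-cascade"
# that fence_virsh rejects in the log above.
def build_options(mapping, ui_values):
    """mapping: UI field name -> agent option name; ui_values: UI field -> value."""
    pairs = []
    for field, value in ui_values.items():
        option_name = mapping.get(field, "")   # missing mapping -> empty key
        pairs.append("%s=%s" % (option_name, value))
    return ",".join(pairs)

print(build_options({}, {"slot": "block-cascade"}))                # -> =block-cascade
print(build_options({"slot": "port"}, {"slot": "block-cascade"}))  # -> port=block-cascade
```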
Re: [ovirt-devel] [vdsm] VM recovery now depends on HSM
On 10/07/14 08:40 +0200, Michal Skrivanek wrote: On Jul 9, 2014, at 15:38 , Nir Soffer nsof...@redhat.com wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Michal Skrivanek michal.skriva...@redhat.com Cc: devel@ovirt.org Sent: Wednesday, July 9, 2014 4:19:09 PM Subject: Re: [ovirt-devel] [vdsm] VM recovery now depends on HSM On 09/07/14 13:11 +0200, Michal Skrivanek wrote: On Jul 8, 2014, at 22:36 , Adam Litke ali...@redhat.com wrote: Hi all, As part of the new live merge feature, when vdsm starts and has to recover existing VMs, it calls VM._syncVolumeChain to ensure that vdsm's view of the volume chain matches libvirt's. This involves two kinds of operations: 1) sync VM object, 2) sync underlying storage metadata via HSM. This means that HSM must be up (and the storage domain(s) that the VM is using must be accessible. When testing some rather eccentric error flows, I am finding this to not always be the case. Is there a way to have VM recovery wait on HSM to come up? How should we respond if a required storage domain cannot be accessed? Is there a mechanism in vdsm to schedule an operation to be retried at a later time? Perhaps I could just schedule the sync and it could be retried until the required resources are available. I've briefly discussed with Federico some time ago that IMHO the syncVolumeChain needs to be changed. It must not be part of VM's create flow as I expect this quite a bottleneck in big-scale environment (it is now in fact not executing only on recovery but on all 4 create flows!). I don't know how yet, but we need to find a different way. Now you just added yet another reason. So…I too ask for more insights:-) Sure, so... We switched to running syncVolumeChain at all times to cover a very rare scenario: 1. VM is running on host A 2. User initiates Live Merge on VM 3. Host A experiences a catastrophic hardware failure before engine can determine if the merge succeeded or failed 4. 
VM is restarted on Host B Since (in this case) the host cannot know if a live merge was in progress on the previous host, it needs to always check. Some ideas to mitigate: 1. When engine recreates a VM on a new host and a Live Merge was in progress, engine could call a verb to ask the host to synchronize the volume chain. This way, it only happens when engine knows it's needed and engine can be sure that the required resources (storage connections and domains) are present. This seems like the right approach. +1 I like the only when needed, since indeed we can assume the scenario is unlikely to happen most of the times (but very real indeed) Ok. I will need to expose a synchronizeDisks virt verb for this. It will be called by engine whenever a VM moves between hosts prior to a block job being resolved. ## # @VM.synchronizeDisks: # # Tell vdsm to synchronize disk metadata with the live VM state # # @vmID: The UUID of the VM # # Since: 4.16.0 ## {'command': {'class': 'VM', 'name': 'synchronizeDisks'}, 'data': {'vmID': 'UUID'}} Greg, you can call this after VmStats from the new host indicates that the block job is indeed not there anymore. You want to call it before you fetch the VM definition to check the volume chain. This way you can be sure that the new host has refreshed the config in case it was out of sync. Federico, I am thinking about how to handle the case where someone would try a cold merge here instead of starting the VM. I guess they cannot because engine will have the disk locked. Maybe that is good enough for now. 2. The syncVolumeChain call runs in the recovery case to ensure that we clean up after any missed block job events from libvirt while vdsm was stopped/restarting. can we clean up later on, does it need to be on recovery? Can it be delayed - requested by engine a little bit later? This question is where I could use some help from the experts :) Here is the scenario in question: How serious is a temporary metadata inconsistency? 1. 
Live merge starts for VM on a host 2. vdsm crashes 3. qemu completes the live merge operation and rewrites the qcow chain 4. libvirt emits an event (missed by vdsm, which is not running) 5. vdsm starts and recovers the VM. At this point, the vm conf has an outdated view of the disk. In the case of an active layer merge, the volumeID of the disk will have changed and at least one volume is removed from the chain. For an internal volume merge, one or more volumes can be missing from the chain. In addition, the metadata on the storage side is outdated. As long as engine submits no operations which depend on an accurate picture of the volume chain until it has called synchronizeDisks() we should be okay. Does vdsm initiate any operations on its own that would be sensitive to this synchronization issue (i.e. disk stats)? We need this since vdsm recovers running VMs when it starts, before engine is connected. Actually, engine cannot talk with vdsm until it has finished the recovery.
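The reconciliation being discussed can be sketched with a simplified chain representation. This is an illustration only, not vdsm's actual VM._syncVolumeChain:

```python
# Illustration only (not vdsm's actual VM._syncVolumeChain): reconcile the
# volume chain recorded in the vm conf with the chain libvirt reports after
# a block job completed while vdsm was down.
def sync_volume_chain(conf_chain, libvirt_chain):
    """conf_chain / libvirt_chain: lists of volume IDs, base first, leaf last.
    Returns (corrected chain, set of volumes merged away that still need
    storage metadata cleanup via HSM)."""
    merged_away = set(conf_chain) - set(libvirt_chain)
    if merged_away:
        # Adopt libvirt's view; the vanished volumes must then be cleaned
        # up in the storage metadata.
        return list(libvirt_chain), merged_away
    return list(conf_chain), set()

# Active layer merge: the leaf "v3" was merged into "v2", so the disk's
# current volumeID changes from "v3" to "v2".
chain, removed = sync_volume_chain(["v1", "v2", "v3"], ["v1", "v2"])
```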
Re: [ovirt-devel] [vdsm] VM recovery now depends on HSM
Sorry, adding Greg...
Re: [ovirt-devel] [vdsm] VM recovery now depends on HSM
On 09/07/14 13:11 +0200, Michal Skrivanek wrote: On Jul 8, 2014, at 22:36 , Adam Litke ali...@redhat.com wrote: Hi all, As part of the new live merge feature, when vdsm starts and has to recover existing VMs, it calls VM._syncVolumeChain to ensure that vdsm's view of the volume chain matches libvirt's. This involves two kinds of operations: 1) sync VM object, 2) sync underlying storage metadata via HSM. This means that HSM must be up (and the storage domain(s) that the VM is using must be accessible. When testing some rather eccentric error flows, I am finding this to not always be the case. Is there a way to have VM recovery wait on HSM to come up? How should we respond if a required storage domain cannot be accessed? Is there a mechanism in vdsm to schedule an operation to be retried at a later time? Perhaps I could just schedule the sync and it could be retried until the required resources are available. I've briefly discussed with Federico some time ago that IMHO the syncVolumeChain needs to be changed. It must not be part of VM's create flow as I expect this quite a bottleneck in big-scale environment (it is now in fact not executing only on recovery but on all 4 create flows!). I don't know how yet, but we need to find a different way. Now you just added yet another reason. So…I too ask for more insights:-) Sure, so... We switched to running syncVolumeChain at all times to cover a very rare scenario: 1. VM is running on host A 2. User initiates Live Merge on VM 3. Host A experiences a catastrophic hardware failure before engine can determine if the merge succeeded or failed 4. VM is restarted on Host B Since (in this case) the host cannot know if a live merge was in progress on the previous host, it needs to always check. Some ideas to mitigate: 1. When engine recreates a VM on a new host and a Live Merge was in progress, engine could call a verb to ask the host to synchronize the volume chain. 
This way, it only happens when engine knows it's needed and engine can be sure that the required resources (storage connections and domains) are present. 2. The syncVolumeChain call runs in the recovery case to ensure that we clean up after any missed block job events from libvirt while vdsm was stopped/restarting. In this case, the block job info is saved in the vm conf so the recovery flow could be changed to query libvirt for block job status on only those disks where we know about a previous operation. For those found gone, we'd call syncVolumeChain. In this scenario, we still have to deal with the race with HSM initialization and storage connectivity issues. Perhaps engine should drive this case as well? -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
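Option 2 above could be sketched roughly like this, with hypothetical names for the saved job records (this is not vdsm's actual recovery code):

```python
# Sketch of option 2 above (hypothetical names, not vdsm's recovery code):
# on recovery, query block job status only for disks with a recorded job,
# and sync the chain only where the job vanished while vdsm was down.
def disks_needing_sync(saved_jobs, live_job_ids):
    """saved_jobs: dict of jobID -> diskID recorded in the vm conf.
    live_job_ids: set of jobIDs libvirt still reports as active.
    Returns sorted diskIDs whose job disappeared and need a chain sync."""
    return sorted(disk for job, disk in saved_jobs.items()
                  if job not in live_job_ids)
```

For example, if jobs "j1" and "j2" were recorded before the restart but libvirt now reports only "j2", only the disk behind "j1" needs a syncVolumeChain call.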
[ovirt-devel] oVirt 3.5 Test Day 1 Results
I tested:
* [RFE] Prevent host fencing while kdumping - http://www.ovirt.org/Fence_kdump
* hosted-engine-setup

Results:
Bug 1115123 -- hosted-engine-setup fails with ioprocess oop_impl enabled

Adding a host with Detect kdump flow set to on and without crashkernel command line parameter results in a warning in the log as expected. I ran out of time before I was able to configure crash dump detection for my host VMs correctly. Looking forward to more thorough testing on the next test day.

-- Adam Litke
[ovirt-devel] [vdsm] VM recovery now depends on HSM
Hi all,

As part of the new live merge feature, when vdsm starts and has to recover existing VMs, it calls VM._syncVolumeChain to ensure that vdsm's view of the volume chain matches libvirt's. This involves two kinds of operations: 1) sync the VM object, 2) sync the underlying storage metadata via HSM. This means that HSM must be up (and the storage domain(s) that the VM is using must be accessible). When testing some rather eccentric error flows, I am finding this to not always be the case.

Is there a way to have VM recovery wait on HSM to come up? How should we respond if a required storage domain cannot be accessed? Is there a mechanism in vdsm to schedule an operation to be retried at a later time? Perhaps I could just schedule the sync and it could be retried until the required resources are available.

Thanks for your insights.

-- Adam Litke
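The retry idea in the last question could look roughly like this. This is a hedged sketch with hypothetical names; real vdsm code would use its own scheduling infrastructure and sleep with backoff between polls instead of looping immediately:

```python
# Hedged sketch (hypothetical names, not vdsm's scheduler) of retrying the
# volume chain sync until HSM and the required storage domain are available.
# A real implementation would sleep with backoff between polls.
def retry_until_ready(operation, is_ready, attempts):
    """Run operation() once is_ready() returns True; give up after `attempts` polls."""
    for _ in range(attempts):
        if is_ready():
            operation()
            return True
    return False

# Stubbed demonstration: HSM becomes ready on the third poll.
state = {"polls": 0, "synced": False}

def hsm_ready():
    state["polls"] += 1
    return state["polls"] >= 3

def sync_volume_chain():
    state["synced"] = True

retry_until_ready(sync_volume_chain, hsm_ready, attempts=5)
```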
Re: [ovirt-devel] [VDSM][sampling] thread pool status and handling of stuck calls
On 07/07/14 10:53 -0400, Nir Soffer wrote:
* _sampleVmJobs uses virDomainBlockJobInfo, which needs to enter the QEMU monitor. However, this needs to run only if there are active block jobs. I don't have numbers, but I expect this sampler to be idle most of the time. Adam: This is related to live merge, right?

Yes, it is currently used only for live merge. It only calls libvirt when it expects a job to be running, so indeed it is pretty much a noop most of the time.

-- Adam Litke
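A minimal sketch of that "noop when idle" behavior, with a stubbed libvirt query (illustrative names only, not vdsm's actual sampler):

```python
# Illustrative sketch (not vdsm's actual sampler) of the behavior described
# above: consult libvirt only for VMs that are expected to have an active
# block job, so the sampler is effectively a noop the rest of the time.
class VmJobsSampler:
    def __init__(self, query_libvirt):
        self._query = query_libvirt   # callable(vm_id) -> job info dict; stubbed here
        self._expected = set()        # vm_ids with a block job we initiated

    def job_started(self, vm_id):
        self._expected.add(vm_id)

    def sample(self, vm_id):
        if vm_id not in self._expected:
            return None               # common cheap path: no QEMU monitor entry
        info = self._query(vm_id)
        if not info:
            self._expected.discard(vm_id)   # job gone; stop polling this VM
        return info

calls = []
def fake_libvirt_query(vm_id):
    calls.append(vm_id)
    return {}                          # pretend the job already finished

sampler = VmJobsSampler(fake_libvirt_query)
sampler.sample("vm1")                  # no job expected: libvirt is not called
sampler.job_started("vm1")
sampler.sample("vm1")                  # one libvirt call; job gone, so vm1 is dropped
```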
Re: [ovirt-devel] oVirt's MoM feature list ?
On 13/06/14 11:52 +, Vinod, Chegu wrote: Cc'ng Gilad Vinod From: Vinod, Chegu Sent: Friday, June 13, 2014 4:45 AM To: ali...@redhat.com Subject: oVirt's MoM feature list ? Hi Adam, Where can I find some information about the future features/enhancements that are planned in MoM ? Perhaps it was discussed already in some email group or in some presentation...If yes can you please point me to the same ? This is a great question for the devel list (added to cc:). I think Doron and Martin (added to cc:) will be able to give some better responses to this as well. -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
[ovirt-devel] [vdsm] Unifying VM device representations
Hi Martin,

I noticed that you are working on some patches to refactor the VM devices and deprecate self.conf['devices']. I am a big fan of this because my Live Merge code is far more complex than it should be, since some information lives in self.conf['devices'] and some lives in self.devices. Are you planning to change the recovery code's save/restore of vm.conf to work with the new device container you are creating? It would be nice to get my code working entirely independently of self.conf['devices'] if possible.

Also, when are you aiming to have this work completed? Live Merge is needed for 3.5. Will this work be ready before then?

Thanks!

-- Adam Litke
[ovirt-devel] vdsm tasks API design discussion
feature. Isn't this what we're trying to avoid? - Original Message - From: Adam Litke ali...@redhat.com To: Dan Kenigsberg dan...@redhat.com Cc: smizr...@redhat.com, ybronhei ybron...@redhat.com, devel@ovirt.org Sent: Thursday, May 1, 2014 8:28:14 PM Subject: Re: [ovirt-devel] short recap of last vdsm call (15.4.2014) On 01/05/14 17:53 +0100, Dan Kenigsberg wrote: On Wed, Apr 30, 2014 at 01:26:18PM -0400, Adam Litke wrote: On 30/04/14 14:22 +0100, Dan Kenigsberg wrote: On Tue, Apr 22, 2014 at 02:54:29PM +0300, ybronhei wrote: hey, somehow we missed the summary of this call, and few big issues were raised there. so i would like to share it with all and hear more comments - task id in http header - allows engine to initiate calls with id instead of following vdsm response - federico already started this work, and this is mandatory for live merge feature afaiu. Adam, Federico, may I revisit this question from another angle? Why does Vdsm needs to know live-merge's task id? As far as I understand, (vmid, disk id) are enough to identify a live merge process. A vmId + diskId can uniquely identify a block job at a single moment in time since qemu guarantees that only a single block job can run at any given point in time. But this gives us no way to differentiate two sequential jobs that run on the same disk. Therefore, without having an engine-supplied jobID, we can never be sure if a one job finished and another started since the last time we polled stats. Why would Engine ever want to initiate a new live merge of a (vmId,diskId) before it has a conclusive result of the previous success/failure of the previous attempt? As far as I understand, this should never happen, and it's actually good for the API to force avoidence of such a case. Additionally, engine-supplied UUIDs is part of a developing framework for next-generation async tasks. 
Engine prefers to use a single identifier to represent any kind of task (rather than some problem domain specific combination of UUIDs). Adhering to this rule will help us to converge on a single implementation of ng async tasks moving forward. I do not think that having a (virtual) table of task_id -> (vmId, diskId) in Vdsm is much simpler than having it on the Engine machine. It needs to go somewhere. As the designers of the API we felt it would be better for vdsm to hide the semantics of when a vmId,diskId tuple can be considered a unique identifier. If we ever do generalize the concept of a transient task to other users (setupNetworks, etc.) it would be a far more consumable API if engine didn't need to handle a bunch of special cases about what constitutes a job ID and the specifics of its lifetime. UUIDs are simple and already well-supported. Why make it more difficult than it has to be? I still find the notion of a new framework for async tasks quite useful. But as I requested before, I think we should design it first, so it fits all conceivable users. In particular, we should not tie it to the existence of a running VM. We'd better settle on persistence semantics that work for everybody (such as network tasks). Last time, the idea was struck down by Saggi and others from infra, who are afraid to repeat mistakes from the current task framework. Several famous quotes apply here. The only thing we have to fear is fear itself :) Sometimes perfect is the enemy of good. Tasks redesign was always going to be driven by the need to implement one feature at first. It just so happens that we volunteered to take a stab at it for live merge. It's clear that we won't be able to completely replace the old tasks and get this feature out in one pass.
We believe the general principles of our tasks are generally extensible to cover new use cases in the future: * Jobs are given an engine-supplied UUID when started * There is a well-known way to check if a job is running or not * There is a well-known way to test if a finished job succeeded or failed. I believe we did spend quite a bit of time in March coming up with a design for NG tasks. Unfortunately it was infra who made our jobs vm-specific by requiring the job status to be passed by getVMStats rather than an object-agnostic getJobsStatus stand-alone API that could conglomerate all job types into a single response. If we do not have a task id, we do not need to worry on how to pass it, and where to persist it. There are at least 3 reasons to persist a block job ID: * To associate a specific block job operation with a specific engine-initiated flow. * So that you can clean up after a job that completed when vdsm could not receive the completion event. But if Vdsm dies before it managed to clean up, Engine would have to perform the cleanup via another host. So having this short-loop cleanup is redundant. Fair enough. We'll be doing the volume chain scan for every native VM disk at VM startup. The only exception is if we are recovering
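The three principles above can be sketched as a small job registry keyed by an engine-supplied UUID, which also shows why two sequential jobs on the same (vmId, diskId) stay distinguishable. This is illustrative code, not vdsm's implementation:

```python
# Illustrative job registry (not vdsm's implementation) showing how an
# engine-supplied UUID keeps two sequential merges on the same (vmId, diskId)
# distinguishable, per the principles listed above.
import uuid

class JobRegistry:
    def __init__(self):
        self._jobs = {}                  # job_id -> (vm_id, disk_id)

    def start(self, vm_id, disk_id):
        job_id = str(uuid.uuid4())       # in oVirt the engine would supply this
        self._jobs[job_id] = (vm_id, disk_id)
        return job_id

    def finish(self, job_id):
        self._jobs.pop(job_id, None)

    def is_running(self, job_id):
        return job_id in self._jobs

registry = JobRegistry()
first = registry.start("vm1", "disk1")
registry.finish(first)
second = registry.start("vm1", "disk1")  # same disk, but a clearly distinct job
```

Polling by (vmId, diskId) alone could not tell the first job's completion apart from the second already running; the distinct UUIDs can.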
Re: [ovirt-devel] vdsm tasks API design discussion
and over. You could also have a special field containing the version of the configuration (I would make it a hash or a UUID and not a running number) that you would persist locally on the host after you finished configuring, since the local host is the scope of setupNetworks(). Hmm, interesting. It would save time and effort on scanning network properties. But you are introducing the persistence of task end-state. I thought this was something we are trying to avoid. It would allow you to not care about any of the error state and keep sending the same configuration if you think something bad happened, until the state is what you expect it to be or an error response actually manages to find its way back to you. By using the same task ID you are guaranteed to only have the operation running once at a time. I don't mind helping anyone with making their algorithms work, but there is no escaping from the limitations listed above. If we want to make oVirt truly scalable and robust we have to start thinking about algorithms that work despite errors and not just have error flows. Agreed. This is what the ngTasks framework is supposed to achieve for us. I think you are conflating the issue of listing active operations and high-level flow design. If the async operations that make up a complex flow are themselves idempotent, then we have achieved the above. It can be done with or without a vdsm API to list running jobs. Notice I don't even mention different systems of persistence and some tasks that you should be able to get state information about from more than one host. Some jobs can survive a VDSM restart since they are not in VDSM, like stuff in Gluster or QEMU. Yep, live merge is one such job. While we don't persist the job, we do remember that it was running so we can synchronize our state with the underlying hypervisor when we restart. To make it clear, the task API shouldn't really be that useful.
Task IDs are just there to match requests to responses internally because as I explained, jobs are hard to manage generally in such a system. This by no way means that if we see a use case emerging that requires some sort of infra we would not do it. I just think it would probably be tied to some common algorithm or idiom than something truly generic used by every API call. Maybe we are talking about two different things that cannot be combined. All I want is a generic way to list ongoing host-level operations that will be useful for live merge and others. If all you want is a protocol syncronization mechanism in the style of QMP then that is different. Perhaps we need both. I'll be happy to keep the jobID as a formal API parameter and other new APIs that spawn long-running operations could do the same. Then whatever token you want to pass on the wire does not matter to me at all. Hope I made things clearer, sorry if I came out a bit rude. I'm off, I have my country's birthday to celebrate. Thanks for participating in the discussion. In the end we will end up with superior code than if we had not had this discussion. Happy Yom HaAtzmaut! -- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
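The configuration-hash idea discussed earlier in this message could be sketched like this (hypothetical names; the real setupNetworks flow is much more involved):

```python
# Sketch of the configuration-hash idea (hypothetical names; not the real
# setupNetworks flow): hash the desired config canonically, persist the hash
# locally after a successful apply, and treat a resend of the same config as
# a harmless noop, making the call safely repeatable.
import hashlib
import json

def config_hash(config):
    # Canonical JSON so equal configs hash equally regardless of key order.
    return hashlib.sha256(json.dumps(config, sort_keys=True).encode()).hexdigest()

class Host:
    def __init__(self):
        self.applied_hash = None   # persisted locally on the host
        self.apply_count = 0

    def setup_networks(self, config):
        digest = config_hash(config)
        if digest == self.applied_hash:
            return "noop"          # already configured; safe to resend
        self.apply_count += 1      # ...the actual reconfiguration would go here...
        self.applied_hash = digest
        return "applied"
```

With this, the engine can keep resending the same configuration after a suspected failure until either the host reports the expected hash or an error response finally arrives.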
Re: [ovirt-devel] short recap of last vdsm call (15.4.2014)
On 30/04/14 14:22 +0100, Dan Kenigsberg wrote: On Tue, Apr 22, 2014 at 02:54:29PM +0300, ybronhei wrote: hey, somehow we missed the summary of this call, and few big issues were raised there. so i would like to share it with all and hear more comments - task id in http header - allows engine to initiate calls with id instead of following vdsm response - federico already started this work, and this is mandatory for live merge feature afaiu.

Adam, Federico, may I revisit this question from another angle? Why does Vdsm need to know live-merge's task id? As far as I understand, (vmid, disk id) are enough to identify a live merge process.

A vmId + diskId can uniquely identify a block job at a single moment in time, since qemu guarantees that only a single block job can run at any given point in time. But this gives us no way to differentiate two sequential jobs that run on the same disk. Therefore, without having an engine-supplied jobID, we can never be sure if one job finished and another started since the last time we polled stats. Additionally, engine-supplied UUIDs are part of a developing framework for next-generation async tasks. Engine prefers to use a single identifier to represent any kind of task (rather than some problem domain specific combination of UUIDs). Adhering to this rule will help us to converge on a single implementation of ng async tasks moving forward.

If we do not have a task id, we do not need to worry on how to pass it, and where to persist it.

There are at least 3 reasons to persist a block job ID:
* To associate a specific block job operation with a specific engine-initiated flow.
* So that you can clean up after a job that completed when vdsm could not receive the completion event.
* Since we must ask libvirt about block job events on a per VM, per disk basis, tracking the devices on which we expect block jobs enables us to eliminate wasteful calls to libvirt.

Hope this makes the rationale a bit clearer...
-- Adam Litke ___ Devel mailing list Devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/devel
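The jobID argument above can be reduced to a few lines of code. This is an editorial sketch with invented names, not vdsm's actual job-tracking code: it shows why (vmId, diskId) alone cannot distinguish two sequential jobs on the same disk, while an engine-supplied jobID can.

```python
# Minimal sketch (illustrative names only): tracking block jobs by an
# engine-supplied jobID rather than by (vmId, diskId) alone.
import uuid

class BlockJobTracker:
    def __init__(self):
        # (vmId, diskId) -> jobId for jobs engine expects to be running
        self._jobs = {}

    def start_job(self, vm_id, disk_id, job_id):
        self._jobs[(vm_id, disk_id)] = job_id

    def poll(self, vm_id, disk_id, reported_job_id):
        """Compare the job reported for a disk against the one engine started."""
        expected = self._jobs.get((vm_id, disk_id))
        if expected is None:
            return "unknown-job"
        if reported_job_id != expected:
            # A different job on the same disk: the old one finished and a
            # new one started between two polls of the stats.
            return "job-replaced"
        return "job-running"

tracker = BlockJobTracker()
job1, job2 = str(uuid.uuid4()), str(uuid.uuid4())
tracker.start_job("vm-1", "disk-a", job1)
assert tracker.poll("vm-1", "disk-a", job1) == "job-running"
# Engine starts a second merge on the same disk after the first ends;
# without the jobID both polls would look identical:
tracker.start_job("vm-1", "disk-a", job2)
assert tracker.poll("vm-1", "disk-a", job2) == "job-running"
assert tracker.poll("vm-1", "disk-a", job1) == "job-replaced"
```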
Re: [Engine-devel] Share Your Thoughts
On 23/03/14 10:36 -0400, Gilad Chaplik wrote: AuditLog gets recycled after 30 days; the reason I stopped my VM may still be relevant. I would not make fields complex/composite; they need to be easily usable via the CLI, for example. I think we need multiple comments, so we need to think about the RESTful API anyhow. I guess that the next feature will be a reason for 'wipe after stop'/any other BE that needs reasoning. What about a new DB table (maybe called Annotations) that takes a business entity type, UUID, action type, timestamp, and reason string? Then the shutdown reason could be entered as a new row in the DB. It can be kept as long as we want it, and views can be adjusted to make these fields searchable. -- Adam Litke ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
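The proposed Annotations table can be sketched as follows. The engine's real schema is PostgreSQL and these column names are guesses from the description above; sqlite3 is used here only to keep the example self-contained and runnable.

```python
# Sketch of the hypothetical Annotations table proposed in the thread:
# (business entity type, UUID, action type, timestamp, reason string).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE annotations (
        entity_type TEXT NOT NULL,                      -- e.g. 'VM'
        entity_id   TEXT NOT NULL,                      -- business entity UUID
        action_type TEXT NOT NULL,                      -- e.g. 'Shutdown'
        created_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
        reason      TEXT                                -- free-form reason
    )
""")
conn.execute(
    "INSERT INTO annotations (entity_type, entity_id, action_type, reason) "
    "VALUES (?, ?, ?, ?)",
    ("VM", "0000-vm-uuid", "Shutdown", "maintenance window"),
)
# Unlike AuditLog, rows live until explicitly deleted, and views can
# expose them as searchable fields:
row = conn.execute(
    "SELECT reason FROM annotations "
    "WHERE entity_type = 'VM' AND action_type = 'Shutdown'"
).fetchone()
print(row[0])  # maintenance window
```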
Re: [Engine-devel] Asynchronous tasks for live merge
On 03/03/14 14:28 +, Dan Kenigsberg wrote: On Fri, Feb 28, 2014 at 09:30:16AM -0500, Adam Litke wrote: Hi all, As part of our plan to support live merging of VM disk snapshots it seems we will need a new form of asynchronous task in ovirt-engine. I am aware of AsyncTaskManager but it seems to be limited to managing SPM tasks. For live merge, we are going to need something called VmTasks since the async command can be run only on the host that currently runs the VM. The way I see this working from an engine perspective is: 1. RemoveSnapshotCommand in bll is invoked as usual but since the VM is found to be up, we activate an alternative live merge flow. 2. We submit a LiveMerge VDS Command for each impacted disk. This is an asynchronous command which we need to monitor for completion. 3. A VmJob is inserted into the DB so we'll remember to handle it. 4. The VDS Broker monitors the operation via an extension to the already collected VmStatistics data. Vdsm will report active Block Jobs only. Once the job stops (in error or success) it will cease to be reported by vdsm and engine will know to proceed. You describe a reasonable way for Vdsm to report whether an async operation has finished. However, may we instead use the opportunity to introduce generic hsm tasks? Sure, I am happy to have that conversation :) If I understand correctly, HSM tasks, while ideal, might be too complex to get right and would block the Live Merge feature for longer than we would like. Has anyone looked into what it would take to implement an HSM Tasks framework like this in vdsm? Are there any WIP implementations? If the scope of this is not too big, it can be completed relatively quickly, and the resulting implementation would cover all known use cases, then this could be worth it. It's important to support Live Merge soon. Regarding deprecation of the current tasks API: Could your suggested HSM Tasks framework be extended to cover SPM/SDM tasks as well? I would hope that it could.
In that case, we could look forward to a unified async task architecture in vdsm. I suggest to have something loosely modeled on posix fork/wait. - Engine asks Vdsm to start an API verb asynchronously and supplies a uuid. This is unlike fork(2), where the system chooses the pid, but that's required so that Engine could tell if the command has reached Vdsm in case of a network error. - Engine may monitor the task (a-la wait(WNOHANG)) Allon has communicated a desire to limit engine-side polling. Perhaps the active tasks could be added to the host stats? - When the task is finished, Engine may collect its result (a-la wait). Until that happens, Vdsm must report the task forever; restart or upgrade are no excuses. On reboot, though, all tasks are forgotten, so Engine may stop monitoring tasks on a fenced host. This could be a good compromise. I hate the idea of requiring engine to play janitor and clean up stale vdsm data, but there is not a much better way to do it. Allowing reboot to auto-clear tasks will at least provide some backstop to how long tasks could pile up if forgotten. This may be overkill for your use case, but it would come in useful for other cases. In particular, setupNetwork returns before it is completely done, since dhcp address acquisition may take too much time. Engine may poll getVdsCaps to see when it's done (or timeout), but it would be nicer to have a generic mechanism that can serve us all. If we were to consider this, I would want to vet the architecture against all known use cases for tasks to make sure we don't need to create a new framework in 3 months. Note that I'm suggesting a completely new task framework, at least on the Vdsm side, as the current one (with its broken persistence, arcane states and never-reliable rollback) is beyond redemption, imho. Are we okay with abandoning vdsm-side rollback entirely as we move forward? Won't that be a regression for at least some error flows (especially in the realm of SPM tasks)? 5.
When the job has completed, VDS Broker raises an event up to bll. Maybe this could be done via VmJobDAO on the stored VmJob? 6. Bll receives the event and issues a series of VDS commands to complete the operation: a) Verify the new image chain matches our expectations (the snap is no longer present in the chain). b) Delete the snapshot volume c) Remove the VmJob from the DB Could you guys review this proposed flow for sanity? The main conceptual gaps I am left with concern #5 and #6. What is the appropriate way for VDSBroker to communicate with BLL? Is there an event mechanism I can explore or should I use the database? I am leaning toward the database because it is persistent and will ensure #6 gets completed even if engine is restarted somewhere in the middle. For #6, is there an existing polling / event loop in bll that I can plug into? Thanks in advance for taking the time to think about this flow and for providing your insights!
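Steps 4 through 6 of the flow described in this thread amount to a diff between the jobs engine has persisted and the jobs vdsm still reports in the VM statistics. A rough sketch, with invented names (the real code would live in the Java VDS broker and bll, and this glosses over persistence and error handling):

```python
# Sketch of the engine-side monitoring idea: a VmJob row is persisted when
# a merge starts (step 3), and a job is considered finished once vdsm's
# per-VM stats stop reporting it (step 4), triggering cleanup (steps 5-6).
completed = []

# step 3: jobs engine started and persisted, keyed by engine-supplied jobID
persisted_vm_jobs = {"job-1": {"vm": "vm-1", "disk": "disk-a"}}

def on_vm_stats(reported_job_ids):
    """Called each monitoring cycle with the job IDs vdsm still reports."""
    finished = [j for j in persisted_vm_jobs if j not in reported_job_ids]
    for job_id in finished:
        job = persisted_vm_jobs.pop(job_id)
        # steps 5-6: raise the event to bll, verify the new image chain,
        # delete the snapshot volume, remove the VmJob row
        completed.append((job_id, job["vm"], job["disk"]))

on_vm_stats({"job-1"})  # vdsm still reports the job: nothing to do
on_vm_stats(set())      # job vanished from the stats: engine proceeds
print(completed)        # [('job-1', 'vm-1', 'disk-a')]
```

Persisting the job in the DB before acting on its completion is what makes step 6 survive an engine restart, which is the rationale given above for preferring the database over an in-memory event mechanism.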
Re: [Engine-devel] Asynchronous tasks for live merge
On 03/03/14 16:36 +0200, Itamar Heim wrote: On 03/03/2014 04:28 PM, Dan Kenigsberg wrote: On Fri, Feb 28, 2014 at 09:30:16AM -0500, Adam Litke wrote: Hi all, As part of our plan to support live merging of VM disk snapshots it seems we will need a new form of asynchronous task in ovirt-engine. I am aware of AsyncTaskManager but it seems to be limited to managing SPM tasks. For live merge, we are going to need something called VmTasks since the async command can be run only on the host that currently runs the VM. The way I see this working from an engine perspective is: 1. RemoveSnapshotCommand in bll is invoked as usual but since the VM is found to be up, we activate an alternative live merge flow. 2. We submit a LiveMerge VDS Command for each impacted disk. This is an asynchronous command which we need to monitor for completion. 3. A VmJob is inserted into the DB so we'll remember to handle it. 4. The VDS Broker monitors the operation via an extension to the already collected VmStatistics data. Vdsm will report active Block Jobs only. Once the job stops (in error or success) it will cease to be reported by vdsm and engine will know to proceed. You describe a reasonable way for Vdsm to report whether an async operation has finished. However, may we instead use the opportunity to introduce generic hsm tasks? I suggest to have something loosely modeled on posix fork/wait. - Engine asks Vdsm to start an API verb asynchronously and supplies a uuid. This is unlike fork(2), where the system chooses the pid, but that's required so that Engine could tell if the command has reached Vdsm in case of a network error. - Engine may monitor the task (a-la wait(WNOHANG)) - When the task is finished, Engine may collect its result (a-la wait). Until that happens, Vdsm must report the task forever; restart or upgrade are no excuses. On reboot, though, all tasks are forgotten, so Engine may stop monitoring tasks on a fenced host.
This may be overkill for your use case, but it would come in useful for other cases. In particular, setupNetwork returns before it is completely done, since dhcp address acquisition may take too much time. Engine may poll getVdsCaps to see when it's done (or timeout), but it would be nicer to have a generic mechanism that can serve us all. Note that I'm suggesting a completely new task framework, at least on the Vdsm side, as the current one (with its broken persistence, arcane states and never-reliable rollback) is beyond redemption, imho. 5. When the job has completed, VDS Broker raises an event up to bll. Maybe this could be done via VmJobDAO on the stored VmJob? 6. Bll receives the event and issues a series of VDS commands to complete the operation: a) Verify the new image chain matches our expectations (the snap is no longer present in the chain). b) Delete the snapshot volume c) Remove the VmJob from the DB Could you guys review this proposed flow for sanity? The main conceptual gaps I am left with concern #5 and #6. What is the appropriate way for VDSBroker to communicate with BLL? Is there an event mechanism I can explore or should I use the database? I am leaning toward the database because it is persistent and will ensure #6 gets completed even if engine is restarted somewhere in the middle. For #6, is there an existing polling / event loop in bll that I can plug into? Thanks in advance for taking the time to think about this flow and for providing your insights! The way I read Adam's proposal, there is no task entity at vdsm side to monitor, rather the state of the object the operation is performed on (similar to CreateVM, where the engine monitors the state of the VM, rather than the CreateVM request). Yeah, we use the term job in order to avoid assumptions and implications (i.e. rollback/cancel, persistence) that come with the word task.
Job essentially means libvirt Block Job, but I am trying to allow for extension in the future. Vdsm would collect block job information for devices it expects to have active block jobs and report them all under a single structure in the VM statistics. There would be no persistence of information so when a libvirt block job goes poof, vdsm will stop reporting it. -- Adam Litke
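The reporting scheme described above (query only the devices expected to have active block jobs, fold the results into one structure in the VM statistics, and silently drop jobs libvirt no longer knows about) might look roughly like this. Field and function names are invented for illustration and do not reflect vdsm's actual wire format:

```python
# Sketch: folding block-job info into the VM statistics vdsm already
# reports. No persistence: a job that libvirt stopped reporting simply
# disappears from the next stats sample.
def vm_stats_with_jobs(base_stats, expected_jobs, query_libvirt_job):
    """expected_jobs: {jobId: diskId}. query_libvirt_job(diskId) returns
    progress info for the disk's block job, or None once it has gone away."""
    jobs = {}
    for job_id, disk_id in expected_jobs.items():
        info = query_libvirt_job(disk_id)
        if info is not None:  # job vanished -> stop reporting it
            jobs[job_id] = {"disk": disk_id, "progress": info}
    stats = dict(base_stats)
    stats["vmJobs"] = jobs  # the single structure engine polls
    return stats

stats = vm_stats_with_jobs(
    {"status": "Up"},
    {"job-1": "disk-a", "job-2": "disk-b"},
    # pretend libvirt still reports a job on disk-a but not on disk-b:
    lambda disk: {"cur": 50, "end": 100} if disk == "disk-a" else None,
)
print(sorted(stats["vmJobs"]))  # ['job-1']  (job-2 is no longer reported)
```

Querying only the expected devices is what avoids the wasteful per-VM, per-disk libvirt calls mentioned earlier in the thread.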
[Engine-devel] Schema upgrade failure on master
Hi, I've recently rebased to master and it looks like the 03_05_0050_event_notification_methods.sql script is failing on schema upgrade. Is this a bug or am I doing something wrong? To upgrade I did the normal procedure with my development installation: make install-dev ... ~/ovirt/bin/engine-setup Got this result in the log file: psql:/home/alitke/ovirt-**FILTERED**/share/ovirt-**FILTERED**/dbscripts/upgrade/03_05_0050_event_notification_methods.sql:10: ERROR: column notification_method contains null values FATAL: Cannot execute sql command: --file=/home/alitke/ovirt-**FILTERED**/share/ovirt-**FILTERED**/dbscripts/upgrade/03_05_0050_event_notification_methods.sql 2014-03-03 17:20:34 DEBUG otopi.context context._executeMethod:152 method exception Traceback (most recent call last): File /usr/lib/python2.7/site-packages/otopi/context.py, line 142, in _executeMethod method['method']() File /home/alitke/ovirt-**FILTERED**/share/ovirt-**FILTERED**/setup/bin/../plugins/ovirt-**FILTERED**-setup/ovirt-**FILTERED**/db/schema.py, line 280, in _misc osetupcons.DBEnv.PGPASS_FILE File /usr/lib/python2.7/site-packages/otopi/plugin.py, line 451, in execute command=args[0], RuntimeError: Command '/home/alitke/ovirt-**FILTERED**/share/ovirt-**FILTERED**/dbscripts/schema.sh' failed to execute -- Adam Litke
[Engine-devel] Asynchronous tasks for live merge
Hi all, As part of our plan to support live merging of VM disk snapshots it seems we will need a new form of asynchronous task in ovirt-engine. I am aware of AsyncTaskManager but it seems to be limited to managing SPM tasks. For live merge, we are going to need something called VmTasks since the async command can be run only on the host that currently runs the VM. The way I see this working from an engine perspective is: 1. RemoveSnapshotCommand in bll is invoked as usual but since the VM is found to be up, we activate an alternative live merge flow. 2. We submit a LiveMerge VDS Command for each impacted disk. This is an asynchronous command which we need to monitor for completion. 3. A VmJob is inserted into the DB so we'll remember to handle it. 4. The VDS Broker monitors the operation via an extension to the already collected VmStatistics data. Vdsm will report active Block Jobs only. Once the job stops (in error or success) it will cease to be reported by vdsm and engine will know to proceed. 5. When the job has completed, VDS Broker raises an event up to bll. Maybe this could be done via VmJobDAO on the stored VmJob? 6. Bll receives the event and issues a series of VDS commands to complete the operation: a) Verify the new image chain matches our expectations (the snap is no longer present in the chain). b) Delete the snapshot volume c) Remove the VmJob from the DB Could you guys review this proposed flow for sanity? The main conceptual gaps I am left with concern #5 and #6. What is the appropriate way for VDSBroker to communicate with BLL? Is there an event mechanism I can explore or should I use the database? I am leaning toward the database because it is persistent and will ensure #6 gets completed even if engine is restarted somewhere in the middle. For #6, is there an existing polling / event loop in bll that I can plug into? Thanks in advance for taking the time to think about this flow and for providing your insights! 
-- Adam Litke
Re: [Engine-devel] mom RPMs for 3.4
On 01/02/14 22:48 +, Dan Kenigsberg wrote: On Fri, Jan 31, 2014 at 04:56:12PM -0500, Adam Litke wrote: On 31/01/14 08:36 +0100, Sandro Bonazzola wrote: On 30/01/2014 19:30, Adam Litke wrote: On 30/01/14 18:13 +, Dan Kenigsberg wrote: On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote: Hi Sandro, After updating the MOM project's build system, I have used jenkins to produce a set of RPMs that I would like to tag into the oVirt 3.4 release. Please see the jenkins job [1] for the relevant artifacts for EL6[2], F19[3], and F20[4]. Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0? I want to be careful to not break people's environments this late in the 3.4 release cycle. What is the best way to minimize that damage? Hey, we're in beta. I prefer making this requirement explicit now over having users with supervdsmd.log rotating due to log spam. In that case, Sandro, can you let me know when those RPMs hit the ovirt repos (for master and 3.4) and then I will submit a patch to vdsm to require the new version. mom 0.4.0 has been built in last night's nightly job [1] and published to nightly by publisher job [2] so it's already available on nightly [3] For 3.4.0, a beta 2 release has been planned [4] for 2014-02-06 so we'll include your builds in that release. I presume the scripting for 3.4 release rpms will produce a version without the git-rev based suffix: i.e. mom-0.4.0-1.rpm? I need to figure out how to handle a problem that might be a bit unique to mom. MOM is used by non-oVirt users who install it from the main Fedora repository. I think it's fine that we are producing our own rpms in oVirt (that may have additional patches applied and may resync to upstream mom code more frequently than would be desired for the main Fedora repository). Given this, I think it makes sense to tag the oVirt RPMs with a special version suffix to indicate that these are oVirt-produced and not upstream Fedora.
For example: The next Fedora update will be mom-0.4.0-1.f20.rpm. The next oVirt update will be mom-0.4.0-1ovirt.f20.rpm. Is this the best practice for accomplishing my goals? One other thing I'd like to have the option of doing is to make vdsm depend on an ovirt distribution of mom so that the upstream Fedora version will not satisfy the dependency for vdsm. What is the motivation for this? You would not like to bother Fedora users with updates that are required only for oVirt? Yes, that was my thinking. It seems that oVirt requires updates more frequently than users that use MOM with libvirt directly, and the Fedora update process is a bit heavier than oVirt's at the moment. Vdsm itself is built, signed, and distributed via Fedora. It is also copied into the ovirt repo, for completeness' sake. Could MoM do the same? If vdsm is finding this to work well then surely I can do the same with MOM. The 0.4.0 build is in updates-testing right now and should be able to be tagged stable in a day or two.
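Whether mom-0.4.0-1ovirt.f20 actually upgrades mom-0.4.0-1.f20 hinges on RPM's release-tag comparison, in which "1ovirt" sorts after "1". The sketch below is a much-simplified version of that segment-wise comparison (the real rpmvercmp also handles tildes, carets and several edge cases), included only to make the ordering claim checkable:

```python
# Simplified sketch of rpm-style version segment comparison: split each
# string into runs of digits and runs of letters, compare run by run
# (numeric runs beat alphabetic runs), and if all shared runs are equal,
# the string with more runs left wins.
import re

def rpm_seg_cmp(a, b):
    """Return 1 if a sorts after b, -1 if before, 0 if equal."""
    sa = re.findall(r"\d+|[a-zA-Z]+", a)
    sb = re.findall(r"\d+|[a-zA-Z]+", b)
    for x, y in zip(sa, sb):
        xd, yd = x.isdigit(), y.isdigit()
        if xd and not yd:
            return 1          # numeric segment beats alphabetic
        if yd and not xd:
            return -1
        x2, y2 = (int(x), int(y)) if xd else (x, y)
        if x2 != y2:
            return 1 if x2 > y2 else -1
    return (len(sa) > len(sb)) - (len(sa) < len(sb))

# '1ovirt' has a leftover segment after the shared '1', so it sorts
# after plain '1' and the oVirt build would win in the depsolver:
assert rpm_seg_cmp("1ovirt", "1") == 1
assert rpm_seg_cmp("1", "2") == -1
```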
Re: [Engine-devel] mom RPMs for 3.4
On 31/01/14 08:36 +0100, Sandro Bonazzola wrote: On 30/01/2014 19:30, Adam Litke wrote: On 30/01/14 18:13 +, Dan Kenigsberg wrote: On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote: Hi Sandro, After updating the MOM project's build system, I have used jenkins to produce a set of RPMs that I would like to tag into the oVirt 3.4 release. Please see the jenkins job [1] for the relevant artifacts for EL6[2], F19[3], and F20[4]. Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0? I want to be careful to not break people's environments this late in the 3.4 release cycle. What is the best way to minimize that damage? Hey, we're in beta. I prefer making this requirement explicit now over having users with supervdsmd.log rotating due to log spam. In that case, Sandro, can you let me know when those RPMs hit the ovirt repos (for master and 3.4) and then I will submit a patch to vdsm to require the new version. mom 0.4.0 has been built in last night's nightly job [1] and published to nightly by publisher job [2] so it's already available on nightly [3] For 3.4.0, a beta 2 release has been planned [4] for 2014-02-06 so we'll include your builds in that release. I presume the scripting for 3.4 release rpms will produce a version without the git-rev based suffix: i.e. mom-0.4.0-1.rpm? I need to figure out how to handle a problem that might be a bit unique to mom. MOM is used by non-oVirt users who install it from the main Fedora repository. I think it's fine that we are producing our own rpms in oVirt (that may have additional patches applied and may resync to upstream mom code more frequently than would be desired for the main Fedora repository). Given this, I think it makes sense to tag the oVirt RPMs with a special version suffix to indicate that these are oVirt-produced and not upstream Fedora. For example: The next Fedora update will be mom-0.4.0-1.f20.rpm. The next oVirt update will be mom-0.4.0-1ovirt.f20.rpm.
Is this the best practice for accomplishing my goals? One other thing I'd like to have the option of doing is to make vdsm depend on an ovirt distribution of mom so that the upstream Fedora version will not satisfy the dependency for vdsm. Thoughts?
[Engine-devel] mom RPMs for 3.4
Hi Sandro, After updating the MOM project's build system, I have used jenkins to produce a set of RPMs that I would like to tag into the oVirt 3.4 release. Please see the jenkins job [1] for the relevant artifacts for EL6[2], F19[3], and F20[4]. Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0? I want to be careful to not break people's environments this late in the 3.4 release cycle. What is the best way to minimize that damage? [1] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/ [2] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=centos6-host/artifact/exported-artifacts/mom-0.4.0-1.el6.noarch.rpm [3] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=fedora19-host/artifact/exported-artifacts/mom-0.4.0-1.fc19.noarch.rpm [4] http://jenkins.ovirt.org/view/All/job/manual-build-tarball/179/label=fedora20-host/artifact/exported-artifacts/mom-0.4.0-1.fc20.noarch.rpm
Re: [Engine-devel] mom RPMs for 3.4
On 30/01/14 18:13 +, Dan Kenigsberg wrote: On Thu, Jan 30, 2014 at 11:49:42AM -0500, Adam Litke wrote: Hi Sandro, After updating the MOM project's build system, I have used jenkins to produce a set of RPMs that I would like to tag into the oVirt 3.4 release. Please see the jenkins job [1] for the relevant artifacts for EL6[2], F19[3], and F20[4]. Dan, should I submit a patch to vdsm to make it require mom >= 0.4.0? I want to be careful to not break people's environments this late in the 3.4 release cycle. What is the best way to minimize that damage? Hey, we're in beta. I prefer making this requirement explicit now over having users with supervdsmd.log rotating due to log spam. In that case, Sandro, can you let me know when those RPMs hit the ovirt repos (for master and 3.4) and then I will submit a patch to vdsm to require the new version.
Re: [Engine-devel] Copy reviewer scores on trivial rebase/commit msg changes
On 18/01/14 01:48 +0200, Itamar Heim wrote: I'd like to enable these - comments welcome: 1. label.Label-Name.copyAllScoresOnTrivialRebase If true, all scores for the label are copied forward when a new patch set is uploaded that is a trivial rebase. A new patch set is considered as trivial rebase if the commit message is the same as in the previous patch set and if it has the same code delta as the previous patch set. This is the case if the change was rebased onto a different parent. This can be used to enable sticky approvals, reducing turn-around for trivial rebases prior to submitting a change. Defaults to false. 2. label.Label-Name.copyAllScoresIfNoCodeChange If true, all scores for the label are copied forward when a new patch set is uploaded that has the same parent commit as the previous patch set and the same code delta as the previous patch set. This means only the commit message is different. This can be used to enable sticky approvals on labels that only depend on the code, reducing turn-around if only the commit message is changed prior to submitting a change. Defaults to false. I am a bit late to the party but +1 from me for trying both. I guess it will be quite rare that something bad happens here. So unlikely, that the time saved on all the previous patches will far offset the lost time for fixing the corner cases.
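For reference, the two options quoted above are per-label settings in Gerrit's project.config. Enabling them for the review label would look roughly like this (the label name Code-Review is assumed here; the actual label name depends on the project's configuration):

```ini
[label "Code-Review"]
    copyAllScoresOnTrivialRebase = true
    copyAllScoresIfNoCodeChange = true
```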
Re: [Engine-devel] oVirt 3.4.0 alpha repository closure failure
On 10/01/14 10:01 +, Dan Kenigsberg wrote: On Fri, Jan 10, 2014 at 08:48:52AM +0100, Sandro Bonazzola wrote: Hi, oVirt 3.4.0 alpha repository has been composed but alpha has not been announced due to repository closure failures: on CentOS 6.5: # repoclosure -r ovirt-3.4.0-alpha -l ovirt-3.3.2 -l base -l epel -l glusterfs-epel -l updates -l extra -l glusterfs-noarch-epel -l ovirt-stable -n Reading in repository metadata - please wait Checking Dependencies Repos looked at: 8 base epel glusterfs-epel glusterfs-noarch-epel ovirt-3.3.2 ovirt-3.4.0-alpha ovirt-stable updates Num Packages in Repos: 16581 package: mom-0.3.2-20140101.git2691f25.el6.noarch from ovirt-3.4.0-alpha unresolved deps: procps-ng Adam, this seems like a real bug in http://gerrit.ovirt.org/#/c/22087/ : el6 still carries the older procps (which is, btw, provided by procps-ng). Done. http://gerrit.ovirt.org/23137 package: vdsm-hook-vhostmd-4.14.0-1.git6fdd55f.el6.noarch from ovirt-3.4.0-alpha unresolved deps: vhostmd Douglas, could you add a with_vhostmd option to the spec, and have it default to 0 on el*, and to 1 on fedoras? Thanks, Dan.
Re: [Engine-devel] UI: VM list not populating
On 06/01/14 11:41 -0500, Alexander Wels wrote: On Monday, January 06, 2014 11:27:07 AM Adam Litke wrote: On 06/01/14 11:19 -0500, Alexander Wels wrote: Adam, Is this just when you first login into the webadmin or whenever you go to the VM tab? In other words if you login, then switch to the templates tab and back again to the VM tab does it still not load? What about when you manually refresh the grid? Thanks for the quick response! It doesn't load at all -- first time or any other time when revisiting. In some cases in the past I would have luck by clicking the blue refresh icon but that doesn't help either. I have force refreshed the browser (Chrome) to no avail. I guess the next step is to completely restart the browser (hmm, no luck there either). Okay, then something else is going on, are there any errors in the server log? From server.log there are no ERRORs but this message may be related: 2014-01-06 13:01:54,209 WARN [org.jboss.resteasy.spi.ResteasyDeployment] (http--0.0.0.0-8080-4) Application.getSingletons() returned unknown class type: org.ovirt.engine.api.restapi.util.VmHelper Alexander On Monday, January 06, 2014 11:02:02 AM Adam Litke wrote: Hi all, I am working with the latest ovirt-engine git and am finding some strange behavior with the UI. The list of VMs never populates and I am stuck with the loading indicator. All other tabs behave normally (Hosts, Templates, Storage, etc). Also, the list of VMs can be loaded normally using the REST API. Any ideas what may be causing this strange behavior?
Re: [Engine-devel] UI: VM list not populating
On 06/01/14 11:44 -0500, Einav Cohen wrote: - Original Message - From: Alexander Wels aw...@redhat.com Sent: Monday, January 6, 2014 11:41:38 AM On Monday, January 06, 2014 11:27:07 AM Adam Litke wrote: On 06/01/14 11:19 -0500, Alexander Wels wrote: Adam, Is this just when you first login into the webadmin or whenever you go to the VM tab? In other words if you login, then switch to the templates tab and back again to the VM tab does it still not load? What about when you manually refresh the grid? Thanks for the quick response! It doesn't load at all -- first time or any other time when revisiting. In some cases in the past I would have luck by clicking the blue refresh icon but that doesn't help either. I have force refreshed the browser (Chrome) to no avail. I guess the next step is to completely restart the browser (hmm, no luck there either). Okay, then something else is going on, are there any errors in the server log? In addition to server logs: maybe also provide client logs (see instructions in [1])? thanks. 
[1] http://lists.ovirt.org/pipermail/users/2013-December/018494.html GET http://localhost:8080/ovirt-engine/webadmin/Reports.xml 404 (Not Found) 4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:16328 Mon Jan 06 13:05:00 GMT-500 2014 com.google.gwt.logging.client.LogConfiguration SEVERE: (TypeError) stack: TypeError: Cannot call method 'kk' of null at LSj (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:11622:58) at Object.JTl [as h_] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17112:15349) at l7j (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15200:166) at Object.n7j (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:16142:328) at r2j (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15199:140) at rjk (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:7040:19) at Object._jk [as qT] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17088:17294) at Object.r5j [as tV] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17085:15904) at hIj (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15873:85) at Object.kIj [as tV] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17082:510) at uKj (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:11859:40) at Object.xKj [as tV] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17082:20018) at OJj (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15471:172) at Object.RJj [as Ch] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17082:19443) at Object.jAd [as ue] 
(http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17019:23272) at cR (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:14512:137) at Object.vR (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17019:13248) at XMLHttpRequest.anonymous (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:11884:65) at _q (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:8351:29) at cr (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15114:57) at XMLHttpRequest.anonymous (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:12521:45): Cannot call method 'kk' of null com.google.gwt.core.client.JavaScriptException: (TypeError) stack: TypeError: Cannot call method 'kk' of null at LSj (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:11622:58) at Object.JTl [as h_] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17112:15349) at l7j (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15200:166) at Object.n7j (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:16142:328) at r2j (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:15199:140) at rjk (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:7040:19) at Object._jk [as qT] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17088:17294) at Object.r5j [as tV] (http://localhost:8080/ovirt-engine/webadmin/4DD22D2F78BB84E2940BB7ADF6163F25.cache.html:17085:15904
Re: [Engine-devel] UI: VM list not populating
On 06/01/14 13:12 -0500, Alexander Wels wrote: On Monday, January 06, 2014 01:03:31 PM Adam Litke wrote: On 06/01/14 11:41 -0500, Alexander Wels wrote: On Monday, January 06, 2014 11:27:07 AM Adam Litke wrote: On 06/01/14 11:19 -0500, Alexander Wels wrote: Adam, Is this just when you first login into the webadmin or whenever you go to the VM tab? In other words if you login, then switch to the templates tab and back again to the VM tab does it still not load? What about when you manually refresh the grid? Thanks for the quick response! It doesn't load at all -- first time or any other time when revisiting. In some cases in the past I would have luck by clicking the blue refresh icon but that doesn't help either. I have force refreshed the browser (Chrome) to no avail. I guess the next step is to completely restart the browser (hmm, no luck there either). Okay, then something else is going on, are there any errors in the server log? From server.log there are no ERRORs but this message may be related: 2014-01-06 13:01:54,209 WARN [org.jboss.resteasy.spi.ResteasyDeployment] (http--0.0.0.0-8080-4) Application.getSingletons() returned unknown class type: org.ovirt.engine.api.restapi.util.VmHelper Don't think that is related, as currently the web admin uses GWT RPC to communicate with the engine, and not the REST interface. So, ovirt-engine/var/log/ovirt-engine/server.log and ovirt-engine/var/log/ovirt-engine/engine.log have nothing in them? From engine.log: 2014-01-06 13:10:34,428 WARN [org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil] (org.ovirt.thread.pool-6-thread-50) Executing a command: java.util.concurrent.FutureTask , but note that there are 0 tasks in the queue. This repeats quite regularly... Other than that, nothing looks relevant. Alexander On Monday, January 06, 2014 11:02:02 AM Adam Litke wrote: Hi all, I am working with the latest ovirt-engine git and am finding some strange behavior with the UI.
The list of VMs never populates and I am stuck with the loading indicator. All other tabs behave normally (Hosts, Templates, Storage, etc). Also, the list of VMs can be loaded normally using the REST API. Any ideas what may be causing this strange behavior? ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
Re: [Engine-devel] UI: VM list not populating
On 06/01/14 13:30 -0500, Alexander Wels wrote: Yes either compile in PRETTY mode or run in GWT debug mode. Depending on how comfortable you are with doing either one. Ok I think we're getting somewhere... When compiled in draft mode the client errors look like this: GET http://localhost:8080/ovirt-engine/webadmin/Reports.xml 404 (Not Found) C5287D41B71197763AB3125431813688.cache.html:44792 Mon Jan 06 14:08:21 GMT-500 2014 com.google.gwt.logging.client.LogConfiguration SEVERE: (TypeError) stack: TypeError: Cannot call method 'get__Ljava_lang_Object_2Ljava_lang_Object_2' of null at org_ovirt_engine_ui_uicommonweb_dataprovider_AsyncDataProvider_getDisplayTypes__ILorg_ovirt_engine_core_compat_Version_2Ljava_util_List_2 (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:180644:360) at org_ovirt_engine_ui_uicommonweb_dataprovider_AsyncDataProvider_hasSpiceSupport__ILorg_ovirt_engine_core_compat_Version_2Z (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:181442:10) at Object.org_ovirt_engine_ui_uicommonweb_models_vms_SpiceConsoleModel_canBeSelected__Z [as canBeSelected__Z] (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:248240:199) at org_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_$canSelectProtocol__Lorg_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_2Lorg_ovirt_engine_ui_uicommonweb_models_ConsoleProtocol_2Z (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:187341:282) at org_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_$setDefaultSelectedProtocol__Lorg_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_2V (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:187391:9) at 
Object.org_ovirt_engine_ui_uicommonweb_models_VmConsolesImpl_VmConsolesImpl__Lorg_ovirt_engine_core_common_businessentities_VM_2Lorg_ovirt_engine_ui_uicommonweb_models_Model_2Lorg_ovirt_engine_ui_uicommonweb_ConsoleOptionsFrontendPersister$ConsoleContext_2V (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:187407:3) at org_ovirt_engine_ui_uicommonweb_models_ConsoleModelsCache_$updateCache__Lorg_ovirt_engine_ui_uicommonweb_models_ConsoleModelsCache_2Ljava_lang_Iterable_2V (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:185252:1037) at org_ovirt_engine_ui_uicommonweb_models_vms_VmListModel_$setItems__Lorg_ovirt_engine_ui_uicommonweb_models_vms_VmListModel_2Ljava_lang_Iterable_2V (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:194985:3) at Object.org_ovirt_engine_ui_uicommonweb_models_vms_VmListModel_setItems__Ljava_lang_Iterable_2V [as setItems__Ljava_lang_Iterable_2V] (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:195275:3) at Object.org_ovirt_engine_ui_uicommonweb_models_SearchableListModel$2_onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V [as onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V] (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:186570:23) at org_ovirt_engine_ui_frontend_Frontend$1_$onSuccess__Lorg_ovirt_engine_ui_frontend_Frontend$1_2Lorg_ovirt_engine_ui_frontend_communication_VdcOperation_2Lorg_ovirt_engine_core_common_queries_VdcQueryReturnValue_2V (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:168839:1451) at Object.org_ovirt_engine_ui_frontend_Frontend$1_onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V [as onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V] (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:168871:3) at 
org_ovirt_engine_ui_frontend_communication_OperationProcessor$2_$onSuccess__Lorg_ovirt_engine_ui_frontend_communication_OperationProcessor$2_2Lorg_ovirt_engine_ui_frontend_communication_VdcOperation_2Ljava_lang_Object_2V (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:173172:217) at Object.org_ovirt_engine_ui_frontend_communication_OperationProcessor$2_onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V [as onSuccess__Ljava_lang_Object_2Ljava_lang_Object_2V] (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:173190:3) at org_ovirt_engine_ui_frontend_communication_GWTRPCCommunicationProvider$4_$onSuccess__Lorg_ovirt_engine_ui_frontend_communication_GWTRPCCommunicationProvider$4_2Ljava_util_ArrayList_2V (http://localhost:8080/ovirt-engine/webadmin/C5287D41B71197763AB3125431813688.cache.html:172948:675) at Object.org_ovirt_engine_ui_frontend_communication_GWTRPCCommunicationProvider$4_onSuccess__Ljava_lang_Object_2V [as onSuccess__Ljava_lang_Object_2V]
Re: [Engine-devel] UI: VM list not populating
On 06/01/14 14:32 -0500, Daniel Erez wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Alexander Wels aw...@redhat.com Cc: engine-devel@ovirt.org Sent: Monday, January 6, 2014 9:11:48 PM Subject: Re: [Engine-devel] UI: VM list not populating

Might be an issue of a stale osinfo properties file; 'displayProtocols' has recently been introduced by [1]. Try overwriting osinfo-defaults.properties with the updated one from the latest bits: /ovirt-engine/packaging/conf/osinfo-defaults.properties -> $HOME/ovirt-engine/share/ovirt-engine/conf

[1] http://gerrit.ovirt.org/#/c/18677/14/packaging/conf/osinfo-defaults.properties

Thanks for the suggestion but it did not seem to resolve the issue. Also, my properties file has os.other.displayProtocols.value and os.other.spiceSupport.value. This seems different from [1] above, which indicates that the spiceSupport key is removed entirely. ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
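The overwrite step suggested in that message can be scripted. A minimal sketch, assuming a local ovirt-engine git checkout and an install-dev prefix like the ones discussed in the thread; the `sync_osinfo` function and both path arguments are illustrative, not part of any ovirt tooling:

```shell
# Hedged sketch: copy the packaged osinfo defaults over the installed copy
# so newly added keys (e.g. os.other.displayProtocols.value) are picked up.
# sync_osinfo SRC_TREE PREFIX
sync_osinfo() {
    src="$1/packaging/conf/osinfo-defaults.properties"
    dest="$2/share/ovirt-engine/conf/osinfo-defaults.properties"
    if [ -f "$src" ]; then
        # Overwrite the installed file with the one from the source tree.
        cp "$src" "$dest" && echo "updated $dest"
    else
        echo "source not found: $src" >&2
        return 1
    fi
}
```

A typical invocation for the setup described in the thread might be `sync_osinfo "$HOME/src/ovirt-engine" "$HOME/ovirt-engine"`, followed by an engine restart so the new properties are reloaded.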
Re: [Engine-devel] UI: VM list not populating
On 06/01/14 15:31 -0500, Daniel Erez wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Daniel Erez de...@redhat.com Cc: Alexander Wels aw...@redhat.com, engine-devel@ovirt.org Sent: Monday, January 6, 2014 9:51:57 PM Subject: Re: [Engine-devel] UI: VM list not populating On 06/01/14 14:32 -0500, Daniel Erez wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Alexander Wels aw...@redhat.com Cc: engine-devel@ovirt.org Sent: Monday, January 6, 2014 9:11:48 PM Subject: Re: [Engine-devel] UI: VM list not populating Might be an issue of a stale osinfo properties file, 'displayProtocols' has recently been introduced by [1] Try overwriting osinfo-defaults.properties with the updated one from latest bits /ovirt-engine/packaging/conf/osinfo-defaults.properties -- $HOME/ovirt-engine/share/ovirt-engine/conf [1] http://gerrit.ovirt.org/#/c/18677/14/packaging/conf/osinfo-defaults.properties Thanks for the suggestion but it did not seem to resolve the issue. Also, my proprties file has os.other.displayProtocols.value and os.other.spiceSupport.value. This seems different from [1] above which indicates that the spiceSupport key is removed entirely. Actually spiceSupport key was added a bit later by: http://gerrit.ovirt.org/#/c/18220/17/packaging/conf/osinfo-defaults.properties Can you please check if VMs list is displayed correctly from the userportal? (I just wonder if there's some race in 'initCache/initDisplayTypes' mechanism). Does not work in the User Portal either. I don't know if this is related, but I have started to observe some new errors in server.log. 
I wonder if I have done too much rebasing and schema upgrading on my local DB:

2014-01-06 15:39:20,451 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-31) Failed to refresh VDS , vds = 203848b8-1d84-4c01-a267-c11280d0ad0f : lager, error = org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [select * from getinterface_viewbyvds_id(?, ?, ?)]; nested exception is org.postgresql.util.PSQLException: The column name qos_overridden was not found in this ResultSet., continuing.:
org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [select * from getinterface_viewbyvds_id(?, ?, ?)]; nested exception is org.postgresql.util.PSQLException: The column name qos_overridden was not found in this ResultSet.
    at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:98) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:603) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:637) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:666) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:706) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:154) [dal.jar:]
    at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.doExecute(PostgresDbEngineDialect.java:120) [dal.jar:]
    at org.springframework.jdbc.core.simple.SimpleJdbcCall.execute(SimpleJdbcCall.java:181) [spring-jdbc.jar:3.1.1.RELEASE]
    at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeImpl(SimpleJdbcCallsHandler.java:137) [dal.jar:]
    at org.ovirt.engine.core.dal.dbbroker.SimpleJdbcCallsHandler.executeReadList(SimpleJdbcCallsHandler.java:103) [dal.jar:]
    at org.ovirt.engine.core.dao.network.InterfaceDaoDbFacadeImpl.getAllInterfacesForVds(InterfaceDaoDbFacadeImpl.java:167) [dal.jar:]
    at org.ovirt.engine.core.dao.network.InterfaceDaoDbFacadeImpl.getAllInterfacesForVds(InterfaceDaoDbFacadeImpl.java:150) [dal.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder.updateNetworkData(VdsBrokerObjectsBuilder.java:930) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerObjectsBuilder.updateVDSDynamicData(VdsBrokerObjectsBuilder.java:326) [vdsbroker.jar
Re: [Engine-devel] UI: VM list not populating
On 06/01/14 15:56 -0500, Daniel Erez wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Daniel Erez de...@redhat.com Cc: Alexander Wels aw...@redhat.com, engine-devel@ovirt.org Sent: Monday, January 6, 2014 10:42:08 PM Subject: Re: [Engine-devel] UI: VM list not populating On 06/01/14 15:31 -0500, Daniel Erez wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Daniel Erez de...@redhat.com Cc: Alexander Wels aw...@redhat.com, engine-devel@ovirt.org Sent: Monday, January 6, 2014 9:51:57 PM Subject: Re: [Engine-devel] UI: VM list not populating On 06/01/14 14:32 -0500, Daniel Erez wrote: - Original Message - From: Adam Litke ali...@redhat.com To: Alexander Wels aw...@redhat.com Cc: engine-devel@ovirt.org Sent: Monday, January 6, 2014 9:11:48 PM Subject: Re: [Engine-devel] UI: VM list not populating Might be an issue of a stale osinfo properties file, 'displayProtocols' has recently been introduced by [1] Try overwriting osinfo-defaults.properties with the updated one from latest bits /ovirt-engine/packaging/conf/osinfo-defaults.properties -- $HOME/ovirt-engine/share/ovirt-engine/conf [1] http://gerrit.ovirt.org/#/c/18677/14/packaging/conf/osinfo-defaults.properties Thanks for the suggestion but it did not seem to resolve the issue. Also, my proprties file has os.other.displayProtocols.value and os.other.spiceSupport.value. This seems different from [1] above which indicates that the spiceSupport key is removed entirely. Actually spiceSupport key was added a bit later by: http://gerrit.ovirt.org/#/c/18220/17/packaging/conf/osinfo-defaults.properties Can you please check if VMs list is displayed correctly from the userportal? (I just wonder if there's some race in 'initCache/initDisplayTypes' mechanism). Does not work in the User Portal either. I don't know if this is related, but I have started to observe some new errors in server.log. 
I wonder if I have done too much rebasing and schema upgrading on my local DB: Yeah, looks like the DB needs upgrading... (if you don't have any important data you can just try creating a new one). Regarding the user portal, I'm guessing you don't see any VMs as you have to assign permissions to them first from the webadmin. Can you try creating some new VMs from the user portal, to see if the list is displayed correctly. Also, look whether you get a similar error in the engine log file as the webadmin. New VMs created in the admin portal and user portal do not show up in the list. I just see the animated boxes indicating that the data is loading. The same error appears in the engine.log. I will try to blow away the data and start over. 2014-01-06 15:39:20,451 WARN [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-31) Failed to refresh VDS , vds = 203848b8-1d84-4c01-a267-c11280d0ad0f : lager, error = org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [select * from getinterface_viewbyvds_id(?, ?, ?)]; nested exception is org.postgresql.util.PSQLException: The column name qos_overridden was not found in this ResultSet., continuing.: org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [select * from getinterface_viewbyvds_id(?, ?, ?)]; nested exception is org.postgresql.util.PSQLException: The column name qos_overridden was not found in this ResultSet. 
at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:98) [spring-jdbc.jar:3.1.1.RELEASE] at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72) [spring-jdbc.jar:3.1.1.RELEASE] at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80) [spring-jdbc.jar:3.1.1.RELEASE] at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80) [spring-jdbc.jar:3.1.1.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:603) [spring-jdbc.jar:3.1.1.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:637) [spring-jdbc.jar:3.1.1.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:666) [spring-jdbc.jar:3.1.1.RELEASE] at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:706) [spring-jdbc.jar:3.1.1.RELEASE] at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect$PostgresSimpleJdbcCall.executeCallInternal(PostgresDbEngineDialect.java:154) [dal.jar:] at org.ovirt.engine.core.dal.dbbroker.PostgresDbEngineDialect
Re: [Engine-devel] ovirt-engine build segfault on Fedora 20
On 02/01/14 21:53 -0500, Greg Sheremeta wrote: Caution on upgrading your dev machine to Fedora 20. GWT compilation of safari (for Chrome) causes a segfault during the build. Strangely, the build appears to work, so I'm not sure what the net effect of the segfault is. If you only compile for gecko (Firefox) [the default], you won't see the segfault. In other words,

make clean install-dev PREFIX=$HOME/ovirt-engine DEV_EXTRA_BUILD_FLAGS_GWT_DEFAULTS=-Dgwt.userAgent=gecko1_8,safari

causes the segfault. But

make install-dev PREFIX=$HOME/ovirt-engine

works just fine. I've duplicated this with both OpenJDK and Oracle JDK.

I can confirm this on my F20 system with OpenJDK as well. So far I have not observed any problems with the resulting build. ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
[Engine-devel] Engine on Fedora 20
Has anyone had success running ovirt-engine on Fedora 20? I upgraded my system on Wednesday and thought everything was fine but then I started getting the following error: 2013-12-19 14:53:31,447 ERROR [org.ovirt.engine.core.bll.Backend] (MSC service thread 1-5) Error in getting DB connection. The database is inaccessible. Original exception is: DataAccessResourceFailureException: Error retreiving database metadata; nested exception is org.springframework.jdbc.support.MetaDataAccessException: Could not get Connection for extracting meta data; nested exception is org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:/ENGINEDataSource Has anyone encountered this recently? ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
Re: [Engine-devel] Engine on Fedora 20
On 19/12/13 15:05 -0500, Adam Litke wrote: Has anyone had success running ovirt-engine on Fedora 20? I upgraded my system on Wednesday and thought everything was fine but then I started getting the following error: 2013-12-19 14:53:31,447 ERROR [org.ovirt.engine.core.bll.Backend] (MSC service thread 1-5) Error in getting DB connection. The database is inaccessible. Original exception is: DataAccessResourceFailureException: Error retreiving database metadata; nested exception is org.springframework.jdbc.support.MetaDataAccessException: Could not get Connection for extracting meta data; nested exception is org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:/ENGINEDataSource Has anyone encountered this recently?

Thanks to alonb for his help on IRC. As it turns out, I had a poorly configured pg_hba.conf file that only started causing problems on F20. To fix it, I replaced the contents with the following two lines:

host engine engine 0.0.0.0/0 md5
host engine engine ::0/0 md5

Otherwise, it seems to be working fine. ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
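For reference, those two entries laid out in pg_hba.conf's column format (connection type, database, user, client address, auth method); this is a sketch of the relevant lines only, assuming the stock Fedora layout where the file lives at /var/lib/pgsql/data/pg_hba.conf:

```
# TYPE  DATABASE  USER    ADDRESS      METHOD
host    engine    engine  0.0.0.0/0    md5
host    engine    engine  ::0/0        md5
```

PostgreSQL has to reload its configuration (e.g. `systemctl reload postgresql` or `pg_ctl reload`) before a pg_hba.conf change takes effect.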
Re: [Engine-devel] UX: Display VM Downtime in the UI
On 18/12/13 16:04 -0500, Malini Rao wrote: - Original Message - From: Adam Litke ali...@redhat.com To: engine-devel@ovirt.org Sent: Wednesday, December 18, 2013 9:42:59 AM Subject: [Engine-devel] UX: Display VM Downtime in the UI Hi UX developers, My recent change: http://gerrit.ovirt.org/#/c/22429/ adds support for tracking the time a VM was last stopped and presenting it in the REST API. I would also like to expose this information in the admin portal. This feature has been requested by end users and is useful for managing lots of VMs which may not be used frequently. My idea is to change the 'Uptime' column in the VMs tab to 'Uptime / Downtime' or some equivalent and more compact phrasing. If the VM is Up, then last_start_time would be used to calculate uptime. If the VM is Down, then last_stop_time would be used to calculate downtime. This helps to make efficient use of the column space. Thanks for your comments! MR: I like the idea in general but can we extend to other states as well? Then we could have the col be called something like 'Time in I would argue that 'Up' and 'Down' are the only persistent states where a VM can linger for a user-controlled amount of time. The others (WaitForLaunch, PoweringDown, etc) are just transitions with their own system defined timeouts. Because of this, it really only makes sense to denote uptime and downtime. When the VM is in another state, this column would be empty. current state'. Also, I think since this col is so far from the first column that has the status icon, we should have a tooltip on the value that says ' Uptime' , 'down time' or 'Status time'. Agree on the tooltip. I am not sure how column sorting is being implemented, but if we combine uptime and downtime into a single column we have an opportunity to provide a really intuitive sort where the longest uptime machines are at the top and the longest downtime machines are at the bottom. 
This could be accomplished by treating uptime as a positive interval and downtime as a negative interval.

MR: That's an interesting idea. Not sure how that would translate if we did all states and times. Then I would think you would do descending order within each state, but then we would have to fix a sequence for the display of the various statuses based on the statuses that matter most.

This is much simpler if you just work with Up and Down. Questions for you all:

- Do you support the idea of changing the Uptime column to include Downtime as well, or would you prefer a new column instead?

MR: I do not like the idea of introducing new columns for this purpose since at any given time, only one of the columns will be populated. Another idea is to remove this column altogether and include the time for the current status as a tooltip on the status icon preceding the name.

What about adding the uptime/downtime to the status column itself? I don't necessarily think this will muddy the status much since there is still an icon on the left. ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
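The signed-interval sort floated in this thread can be sketched in a few lines. Everything below (class, field, and method names) is illustrative, not actual ovirt-engine code: the idea is just that uptime contributes a positive key, downtime a negative one, and sorting descending puts the longest-running VMs first and the longest-down VMs last.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the proposed uptime/downtime column sort; hypothetical types.
public class VmTimeSort {
    enum Status { UP, DOWN, OTHER }

    static class VmRow {
        final String name;
        final Status status;
        final long secondsInState; // since last_start_time or last_stop_time
        VmRow(String name, Status status, long secondsInState) {
            this.name = name;
            this.status = status;
            this.secondsInState = secondsInState;
        }
    }

    // Up counts positive, Down counts negative; transient states
    // (WaitForLaunch, PoweringDown, ...) contribute 0, matching the
    // suggestion that the column stays empty for them.
    static long sortKey(VmRow vm) {
        switch (vm.status) {
            case UP:   return vm.secondsInState;
            case DOWN: return -vm.secondsInState;
            default:   return 0;
        }
    }

    static List<VmRow> sortedByTime(List<VmRow> vms) {
        List<VmRow> out = new ArrayList<>(vms);
        out.sort(Comparator.comparingLong(VmTimeSort::sortKey).reversed());
        return out;
    }
}
```

With this key, one column and one sort order cover both cases, which is the intuitive ordering described above.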
[Engine-devel] Java Newbie: Renaming some functions to fix findbugs warnings
Hello, I am working on resolving some warnings produced by findbugs and am looking for some advice on how to properly resolve the problem. The Frontend class has several pairs of methods where a capitalized version is a deprecated static form and the camelCase version is the instance method. For example: @Deprecated public static void RunQuery(...) - and - public void runQuery(...) In both cases the parameters are the same so simply renaming RunQuery to runQuery will result in a conflict. Since I am new to Java and the ovirt-engine project I am looking for some advice on how to fix the function name without breaking the code or people's sense of aesthetics. Since this is a deprecated function, would it be terrible to rename it to 'runQueryStatic' or 'runQueryDeprecated'? Since the language provides syntactic annotations for 'static' and 'deprecated', both of these names feel dirty but I am not sure what would be better. Thanks for helping out a newbie! --Adam ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
Re: [Engine-devel] Java Newbie: Renaming some functions to fix findbugs warnings
Adam, We are aware of this issue and we actually have a patch somewhat ready to solve the issue [1]. We made the RunQuery/RunAction/etc methods deprecated to encourage people to no longer use them. We have a patch ready to remove all current uses of RunQuery/RunAction/etc from the code base, but haven't gotten around to rebasing/merging the patch. Alexander [1] http://gerrit.ovirt.org/#/c/18413/ Thanks for the detail! Looks like fixing this properly is far from a beginner's task. ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
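To make the clash concrete: Java does not allow a static and an instance method with the same name and parameter list in one class, which is exactly why the plain rename fails. A minimal sketch of the situation and the usual bridge (the deprecated old name delegating to the new instance method until callers are migrated); the class below is a stand-in, not the real Frontend:

```java
// Stand-in for the Frontend naming clash discussed above; not engine code.
public class Frontend {
    private static final Frontend INSTANCE = new Frontend();

    public static Frontend getInstance() {
        return INSTANCE;
    }

    private String lastQuery;

    /** The non-deprecated instance form callers should migrate to. */
    public void runQuery(String query) {
        lastQuery = query;
    }

    /**
     * Legacy static form. Renaming it to runQuery(String) would collide
     * with the instance method above, so it keeps its old capitalized
     * name and simply delegates until all call sites are converted.
     */
    @Deprecated
    public static void RunQuery(String query) {
        getInstance().runQuery(query);
    }

    public String getLastQuery() {
        return lastQuery;
    }
}
```

This keeps old callers compiling while every call ends up routed through the one instance method, which is why removing the remaining RunQuery/RunAction call sites (as in the patch referenced above) is the real fix rather than a rename.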
Re: [Engine-devel] 3.2 features for release notes
On Thu, Jan 24, 2013 at 07:30:07AM -0800, Itamar Heim wrote: doron/adam: not sure about status of vdsm-mom in 3.2? mom is enabled by default for hosts in 3.2 and will control KSM only. No user-visible changes are expected as this is primarily an infrastructure change to enable more advanced SLA in the next release. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
Re: [Engine-devel] [vdsm] RFC: New Storage API
On Tue, Jan 22, 2013 at 11:36:57PM +0800, Shu Ming wrote: 2013-1-15 5:34, Ayal Baron: image and volume are overused everywhere and it would be extremely confusing to have multiple meanings to the same terms in the same system (we have image today which means virtual disk and volume which means a part of a virtual disk). Personally I don't like the distinction between image and volume done in ec2/openstack/etc, seeing as they're treated as different types of entities there while the only real difference is mutability (images are read-only, volumes are read-write). To move to the industry terminology we would need to first change all references we have today to image and volume in the system (I would say also on the ovirt-engine side) to align with the new meaning. Despite my personal dislike of the terms, I definitely see the value in converging on the same terminology as the rest of the industry, but to do so would be an arduous task which is out of scope of this discussion imo (patches welcome though ;)

Another distinction between Openstack and oVirt is how Nova/ovirt-engine look upon storage systems. In Openstack, a stand-alone storage service (Cinder) exports the raw storage block device to Nova. On the other hand, in oVirt the storage system is tightly bound to the cluster scheduling system, which integrates the storage sub-system, the VM dispatching sub-system, and the ISO image sub-system. This combination makes all of the sub-systems form an integrated whole which is easy to deploy, but it also makes them more opaque and harder to reuse and maintain. This new storage API proposal gives us an opportunity to separate these sub-systems into new components which export better, loosely coupled APIs to VDSM.

A very good point and an important goal in my opinion. I'd like to see ovirt-engine become more of a GUI for configuring the storage component (like it does for Gluster) rather than the centralized manager of storage.
The clustered storage should be able to take care of itself as long as the peer hosts can negotiate the SDM role. It would be cool if someone could actually dedicate a non-virtualization host where its only job is to handle SDM operations. Such a host could choose to only deploy the standalone HSM service and not the complete vdsm package. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
[Engine-devel] Managing async tasks
On today's vdsm call we had a lively discussion around how asynchronous operations should be handled in the future. In an effort to include more people in the discussion and to better capture the resulting conversation I would like to continue that discussion here on the mailing list. A lot of ideas were thrown around about how 'tasks' should be handled in the future. There are a lot of ways that it can be done. To determine how we should implement it, it's probably best if we start with a set of requirements. If we can first agree on these, it should be easy to find a solution that meets them. I'll take a stab at identifying a first set of POSSIBLE requirements:

- Standardized method for determining the result of an operation

This is a big one for me because it directly affects the consumability of the API. If each verb has different semantics for discovering whether it has completed successfully, then the API will be nearly impossible to use easily. Sorry. That's my list :) Hopefully others will be willing to add other requirements for consideration. From my understanding, task recovery (stop, abort, rollback, etc) will not be generally supported and should not be a requirement. -- Adam Litke a...@us.ibm.com IBM Linux Technology Center ___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
Re: [Engine-devel] Managing async tasks
On Mon, Dec 17, 2012 at 03:12:34PM -0500, Saggi Mizrahi wrote: This is an addendum to my previous email. - Original Message - From: Saggi Mizrahi smizr...@redhat.com To: Adam Litke a...@us.ibm.com Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, Federico Simoncelli fsimo...@redhat.com, engine-devel@ovirt.org, vdsm-de...@lists.fedorahosted.org Sent: Monday, December 17, 2012 2:52:06 PM Subject: Re: Managing async tasks - Original Message - From: Adam Litke a...@us.ibm.com To: Saggi Mizrahi smizr...@redhat.com Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, Federico Simoncelli fsimo...@redhat.com, engine-devel@ovirt.org, vdsm-de...@lists.fedorahosted.org Sent: Monday, December 17, 2012 2:16:25 PM Subject: Re: Managing async tasks On Mon, Dec 17, 2012 at 12:15:08PM -0500, Saggi Mizrahi wrote: - Original Message - From: Adam Litke a...@us.ibm.com To: vdsm-de...@lists.fedorahosted.org Cc: Dan Kenigsberg dan...@redhat.com, Ayal Baron aba...@redhat.com, Saggi Mizrahi smizr...@redhat.com, Federico Simoncelli fsimo...@redhat.com, engine-devel@ovirt.org Sent: Monday, December 17, 2012 12:00:49 PM Subject: Managing async tasks On today's vdsm call we had a lively discussion around how asynchronous operations should be handled in the future. In an effort to include more people in the discussion and to better capture the resulting conversation I would like to continue that discussion here on the mailing list. A lot of ideas were thrown around about how 'tasks' should be handled in the future. There are a lot of ways that it can be done. To determine how we should implement it, it's probably best if we start with a set of requirements. If we can first agree on these, it should be easy to find a solution that meets them. I'll take a stab at identifying a first set of POSSIBLE requirements: - Standardized method for determining the result of an operation This is a big one for me because it directly affects the consumability of the API. 
If each verb has different semantics for discovering whether it has completed successfully, then the API will be nearly impossible to use easily. Since there is no way to be sure whether some tasks completed successfully or failed, especially around the murky waters of storage, I say this requirement should be removed. At least not in the context of a task. I don't agree. Please feel free to convince me with some examples. If we cannot provide feedback to a user as to whether their request has been satisfied or not, then we have some bigger problems to solve. If VDSM sends a write command to a storage server and the connection hangs up before the ACK has returned, the operation has been committed but VDSM has no way of knowing that it happened; as far as VDSM is concerned it got an ETIMEO or EIO. This is the same problem that the engine has with VDSM. If VDSM creates an image\VM\network\repo but the connection hangs up before the response can be sent back, as far as the engine is concerned the operation times out. This is an inherent issue with clustering. This is why I want to move away from tasks being *the* trackable objects. Tasks should be short. As short as possible. Run VM should just persist the VM information on the VDSM host and return. The rest of the tracking should be done using the VM ID. Create image should return once VDSM has persisted the information about the request on the repository and created the metadata files. Tracking should be done on the repo or the imageId. The thing is that I know how long a VM object should live (or an Image object), so tracking it is straightforward. How long a task should live is very problematic and quite context specific. It depends on what the task is. I think it's quite confusing from an API standpoint to have every task have a different scope, id requirement and life-cycle.
VDSM has two types of APIs:

- CRUD objects - VM, Image, Repository, Bridge, Storage Connections
- General transient methods - getBiosInfo(), getDeviceList()

The latter are quite simple to manage. They don't need any special handling. If you lost a getBiosInfo() call you just send another one, no harm done. The same is even true for things that change the host, like getDeviceList().

What we are really arguing about is fitting the CRUD objects into some generic task-oriented scheme. I'm saying it's a waste of time, as you can quite easily have flows to recover from each operation:

- Create - check if the object exists
- Read - read again
- Update - either update again or read and update if update
Re: [Engine-devel] [vdsm] RFC: New Storage API
operation, it will tell it to value one over the other. For example, whether to copy all the data or just create a qcow based on a snapshot. The default is space.

You might have also noticed that it is never explicitly specified where to look for existing images. This is done purposefully; VDSM will always look in all connected repositories for existing objects. For very large setups this might be problematic. To mitigate the problem you have these options: participatingRepositories=[repoId, ...], which tells VDSM to narrow the search to just these repositories, and imageHints={imgId: repoId}, which will force VDSM to look for those image IDs just in those repositories and fail if it doesn't find them there.

___ vdsm-devel mailing list vdsm-de...@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel

--
---
舒明 Shu Ming
Open Virtualization Engineering; CSTL, IBM Corp.
Tel: 86-10-82451626 Tieline: 9051626
E-mail: shum...@cn.ibm.com or shum...@linux.vnet.ibm.com
Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District, Beijing 100193, PRC

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center

___ Engine-devel mailing list Engine-devel@ovirt.org http://lists.ovirt.org/mailman/listinfo/engine-devel
Re: [Engine-devel] [vdsm] RFC: New Storage API
On Fri, Dec 07, 2012 at 02:53:41PM -0500, Saggi Mizrahi wrote:

snip

1) Can you provide more info on why there is an exception for 'lvm based block domain'? It's not coming out clearly.

File based domains are responsible for syncing up object manipulation (creation\deletion). The backend is responsible for making sure it all works, either by having a single writer (NFS) or having its own locking mechanism (gluster). In our LVM based domains VDSM is responsible for basic object manipulation. The current design uses an approach where there is a single host responsible for object creation\deletion: the SRM\SDM\SPM\S?M. If we ever find a way to make it fully clustered without a big hit in performance, the S?M requirement will be removed from that type of domain.

I would like to see us maintain a LOCALFS domain as well. For this, we would also need SRM, correct?

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
Re: [Engine-devel] [vdsm] RFC: New Storage API
On Mon, Dec 10, 2012 at 02:03:09PM -0500, Saggi Mizrahi wrote:

- Original Message -
From: Adam Litke a...@us.ibm.com
To: Saggi Mizrahi smizr...@redhat.com
Cc: Deepak C Shetty deepa...@linux.vnet.ibm.com, engine-devel engine-devel@ovirt.org, VDSM Project Development vdsm-de...@lists.fedorahosted.org
Sent: Monday, December 10, 2012 1:49:31 PM
Subject: Re: [vdsm] RFC: New Storage API

On Fri, Dec 07, 2012 at 02:53:41PM -0500, Saggi Mizrahi wrote:

snip

1) Can you provide more info on why there is an exception for 'lvm based block domain'? It's not coming out clearly.

File based domains are responsible for syncing up object manipulation (creation\deletion). The backend is responsible for making sure it all works, either by having a single writer (NFS) or having its own locking mechanism (gluster). In our LVM based domains VDSM is responsible for basic object manipulation. The current design uses an approach where there is a single host responsible for object creation\deletion: the SRM\SDM\SPM\S?M. If we ever find a way to make it fully clustered without a big hit in performance, the S?M requirement will be removed from that type of domain.

I would like to see us maintain a LOCALFS domain as well. For this, we would also need SRM, correct?

No, why?

Sorry, nevermind. I was thinking of a scenario with multiple clients talking to a single vdsm and making sure they don't stomp on one another. This is probably not something we are going to care about though.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
Re: [Engine-devel] VDSM tasks, the future
On Tue, Dec 04, 2012 at 10:35:01AM -0500, Saggi Mizrahi wrote: Because I started hinting about how VDSM tasks are going to look going forward, I thought it's better that I just write everything in an email so we can talk about it in context. This is not set in stone and I'm still debating things myself, but it's very close to being done.

Don't debate them yourself, debate them here! Even better, propose your idea in schema form to show how a command might work exactly.

- Everything is asynchronous.

The nature of message based communication is that you can't have synchronous operations. This is not really debatable because it's just how TCP\AMQP\messaging works.

Can you show how a traditionally synchronous command might work? Let's take Host.getVmList as an example.

- Task IDs will be decided by the caller.

This is how json-rpc works and also makes sense, because now the engine can track the task without needing a stage where we give it the task ID back. IDs are reusable as long as no one else is using them at the time, so they can be used for synchronizing operations between clients (making sure a command is only executed once on a specific host, without locking).

- Tasks are transient.

If VDSM restarts, it forgets all the task information. There are 2 ways to have persistent tasks:

1. The task creates an object that you can continue to work on in VDSM. The new storage API does that by the fact that copyImage() returns once the target volume has been created but before the data has been fully copied. From that moment on, the state of the copy can be queried from any host using getImageStatus(), and the specific copy operation can be queried with getTaskStatus() on the host performing it. After VDSM crashes, depending on policy, either VDSM will create a new task to continue the copy, or someone else will send a command to continue the operation, and that will be a new task.

2. VDSM tasks just start other operations, trackable not through the task interface. For example Gluster.
gluster.startVolumeRebalance() will return once it has been registered with Gluster. gluster.getOperationStatuses() will return the state of the operation from any host. Each call is a task in itself.

I worry about this approach because every command has a different semantic for checking progress. For migration, we have to check VM status on the src and dest hosts. For image copy we need to use a special status call on the dest image. It would be nice if there was a unified method for checking on an operation. Maybe that can be completion events:

    Client:                  vdsm:
    -------                  -----
    Image.copy(...)    -->
                       <--   Operation Started
    Wait for event ...
                       <--   Event: Operation <id> done <code>

For an early error:

    Client:                  vdsm:
    -------                  -----
    Image.copy(...)    -->
                       <--   Error: <code>

- No task tags.

They are silly, and the caller can mangle whatever it wants into the task ID if it really wants to tag tasks.

Yes. Agreed.

- No explicit recovery stage.

VDSM will be crash-only; there should be efforts to make everything crash-safe. If that is problematic, as in the case of networking, VDSM will recover on start without having a task for it.

How does this work in practice for something like creating a new image from a template?

- No clean Task.

Tasks can be started by any number of hosts; this means that there is no way to own all tasks. There could be cases where VDSM starts tasks on its own, and thus they have no owner at all. The caller needs to continually track the state of VDSM. We will have broadcast events to mitigate polling.

If a disconnected client might have missed a completion event, it will need to check state. This means each async operation that changes state must document a procedure for checking progress of a potentially ongoing operation. For Image.copy, that procedure would be to look up the new image and check its state.

- No revert.

Impossible to implement safely.

How do the engine folks feel about this? I am ok with it :)

- No SPM\HSM tasks.

SPM\SDM is no longer necessary for all domain types (only for some types).
What used to be SPM tasks (tasks that persist and can be restarted on other hosts) are covered in the previous bullet points.

A nice simplification.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
Re: [Engine-devel] [vdsm] RFC: New Storage API
information (like Volume.getInfo)? (I see some more info below...)

All operations return once the operation has been committed to disk, NOT when the operation actually completes. This is done so that:

- operations come to a stable state as quickly as possible.
- in cases where there is an SDM, only a small portion of the operation actually needs to be performed on the SDM host.
- no matter how many times the operation fails, and on how many hosts, you can always resume the operation and choose when to do it.
- you can stop an operation at any time and remove the resulting object, making a distinction between "stop because the host is overloaded" and "I don't want that image".

This means that after calling any operation that creates a new image, the user must then call getImageStatus() to check what the status of the image is. The status of the image can be either optimized, degraded, or broken. Optimized means that the image is available and you can run VMs off it. Degraded means that the image is available and will run VMs, but there might be a better way for VDSM to represent the underlying data. Broken means that the image can't be used at the moment, probably because not all the data has been set up on the volume.

Apart from that, VDSM will also return the last persisted status information, which will contain:

- hostID - the last host to try and optimize or fix the image
- stage - X/Y (eg. 1/10), the last persisted stage of the fix.

Do you have some examples of what the stages would be? I think these should be defined in enums so that the user can check on what the individual stages mean. What happens when the low level implementation of an operation changes? The meaning of the stages will change completely.

- percent_complete - -1 or 0-100, the last persisted completion percentage of the aforementioned stage. -1 means that no progress is available for that operation.
- last_error - This will only be filled if the operation failed because of something other than IO or a VDSM crash, for obvious reasons. It will usually be set if the task was manually stopped.

The user can either be satisfied with that information, or ask the host specified in hostID whether it is still working on that image by checking its running tasks.

checkStorageRepository(self, repositoryId, options={}):

A method to go over a storage repository and scan for any existing problems. This includes degraded\broken images and deleted images that have not yet been physically deleted\merged. It returns a list of Fix objects. Fix objects come in 4 types:

- clean - cleans data; run them to get more space.
- optimize - run them to optimize a degraded image.

What is an example of a degraded image?

- merge - merges two images together. Doing this sometimes makes more images ready for optimizing or cleaning. The reason it is different from optimize is that unmerged images are considered optimized.
- mend - mends a broken image.

What does this mean?

The user can read these types and prioritize fixes. Fixes also contain opaque Fix data, and they should be sent as received to fixStorageRepository(self, repositoryId, fix, options={}), which will start a fix operation.

Could we have an automatic fix mode where vdsm just does the right thing (for most things)?

All major operations automatically start the appropriate Fix to bring the created object to an optimized\degraded state (the one that is quicker) unless one of the options is AutoFix=False. This is only useful for repos that might not be able to create volumes on all hosts (SDM) but would like to have the actual IO distributed in the cluster.

Another common option is the strategy option. It currently has 2 possible values, space and performance: in case VDSM has 2 ways of completing the same operation, it tells VDSM to value one over the other. For example, whether to copy all the data or just create a qcow based on a snapshot.
The default is space.

I like this a lot.

You might have also noticed that it is never explicitly specified where to look for existing images. This is done purposefully; VDSM will always look in all connected repositories for existing objects. For very large setups this might be problematic. To mitigate the problem you have these options: participatingRepositories=[repoId, ...], which tells VDSM to narrow the search to just these repositories, and imageHints={imgId: repoId}, which will force VDSM to look for those image IDs just in those repositories and fail if it doesn't find them there.

I would like to have a better way of specifying these optional parameters without burying them in an options structure. I will think a little more about this. Strategy could just be two optional flags in a 'flags' argument. For the participatingRepositories and imageHints options, I think we need to use real parameters.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center
Re: [Engine-devel] RFD: API: Identifying vdsm objects in the next-gen API
On Mon, Dec 03, 2012 at 03:57:42PM -0500, Saggi Mizrahi wrote:

- Original Message -
From: Adam Litke a...@us.ibm.com
To: Saggi Mizrahi smizr...@redhat.com
Cc: engine-de...@linode01.ovirt.org, Dan Kenigsberg dan...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Ayal Baron aba...@redhat.com, vdsm-de...@lists.fedorahosted.org
Sent: Monday, December 3, 2012 3:30:21 PM
Subject: Re: RFD: API: Identifying vdsm objects in the next-gen API

On Thu, Nov 29, 2012 at 04:52:14PM -0500, Saggi Mizrahi wrote:

They are not future proof, as the paradigm is completely different. Storage domain IDs are not static any more (and are not guaranteed to be unique or the same across the cluster). Image IDs represent the ID of the projected data and not the actual unique path. Just as an example, to run a VM you give a list of domains that might contain the needed images in the chain and the image ID of the tip. The paradigm has changed, and most calls take a variable number of images and domains. Furthermore, the APIs themselves are completely different. So future proofing is not really an issue.

I don't understand this at all. Perhaps we could all use some education on the planned architectural changes. If I can pass an arbitrary list of domainIDs that _might_ contain the data, why wouldn't I just pass all of them every time? In that case, why are they even required, since vdsm would have to search anyway?
It's for optimization mostly; the engine usually has a good idea of where things are, and having it give hints to VDSM can speed up the search process. Also, the engine knows how transient some storage pieces are. If you have a domain that is only there for backup, or owned by another manager sharing the host, you don't want your VMs using the disks that are on that storage, effectively preventing it from being removed (though we do have plans to have qemu switch base snapshots at runtime for just that).

This is not a clean design. If the search is slow, then maybe we need to improve caching internally. Making a client cache a bunch of internal IDs to pass around sounds like a complete layering violation to me.

You can't cache this; if the same template exists on 2 different NFS domains, only the engine has enough information to know which you should use. We only have the engine give us this information when starting a VM or merging\copying an image that resides on multiple domains. It is also completely optional.

I didn't like it either. Is it even valid for the same template (with identical uuids) to exist in two places? I thought uuids aren't supposed to collide.

I can envision some scenario where a cached storagedomain/storagepool relationship becomes invalid because another user detached the storagedomain. In that case, the API just returns the normal error about sd XXX not being attached to sp XXX. So I don't see any problem here.

As to making the current API a bit simpler: as I said, making the IDs opaque is problematic, as currently the engine is responsible for creating the IDs.

As I mentioned in my last post, the engine still can specify the IDs when the object is first created. From that point forward the ID never changes, so it can be baked into the identifier.

Where will this identifier be persisted?

Furthermore, some calls require you to play with these (making a template instead of a snapshot).
Also, the full chain and topology need to be completely visible to the engine.

Please provide a specific example of how you play with the IDs. I can guess where you are going, but I don't want to divert the thread.

The relationship between volumes and images is deceptive at the moment. IMG is the chain and volume is a member; IMGUUID is only used for verification and to detect when we hit a template going up the chain. When you do operations on images, guarantees are made about the resulting IDs. When you copy an image, you assume you know all the new IDs, as they remain the same. With your method I can't tell what the new opaque result is going to be. Preview mode (another abomination being deprecated) relies on the disconnect between imgUUID and volUUID. Live migration currently moves a lot of the responsibility to the engine. No client
Re: [Engine-devel] [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)
On Thu, Nov 29, 2012 at 10:00:12AM +0200, Dan Kenigsberg wrote: On Wed, Nov 28, 2012 at 03:29:35PM -0600, Adam Litke wrote: On Wed, Nov 28, 2012 at 03:45:28PM -0500, Alon Bar-Lev wrote:

- Original Message -
From: Dan Kenigsberg dan...@redhat.com
To: Alon Bar-Lev alo...@redhat.com
Cc: VDSM Project Development vdsm-de...@lists.fedorahosted.org, engine-devel engine-devel@ovirt.org, users us...@ovirt.org
Sent: Wednesday, November 28, 2012 10:39:42 PM
Subject: Re: [vdsm] [ATTENTION] vdsm-bootstrap/host deployment (pre-3.2)

On Wed, Nov 28, 2012 at 02:57:17PM -0500, Alon Bar-Lev wrote:

No... we need it as compatibility with older engines... We keep minimum changes there for legacy, until end-of-life.

Is there an EoL statement for oVirt-3.1? We can make sure that oVirt-3.2's vdsm installs properly with ovirt-3.1's vdsm-bootstrap, or even require that Engine must be upgraded to ovirt-3.2 before upgrading any of the hosts. Is it too harsh on our vast install base? us...@ovirt.org, please chime in!

I tried to find such a statement, but the more I dig, the more I find that we need to support old legacy.

Why, exactly? Fedora gives no such guarantees (heck, I'm stuck with an unupgradable F16). Should we be any better than our (currently single) platform?

We should start to detach from specific distro procedures.

* legacy-removed: change machine-wide core file pattern

    # echo /var/lib/vdsm/core > /proc/sys/kernel/core_pattern

Yeah, qemu-kvm and libvirtd are much more stable than in the old days, but wouldn't we want to keep a means to collect the corpses of dead processes from hypervisors? It has helped us nail down nasty bugs, even in Python.

It does not mean it should be at /var/lib/vdsm ... :)

I don't get the joke :-(. If you mind the location, we can think of somewhere else to put the core dumps. Would it be hard to reinstate a parallel feature in otopi?

I usually do not make any jokes... A global system setting should not go into a package-specific location.
Usually core dumps are off by default. I like this approach, as an unattended system may quickly consume all disk space because of dumps.

If a host fills up with dumps so quickly, it's a sign that it should not be used for production, and that someone should look into the cores. (P.S. we have a logrotate rule for them in vdsm)

There should be a vdsm-debug-aids (or similar) package to perform such changes.

Again, I don't think vdsm should (by default) modify any system-wide parameter such as this. But I will be happy to hear more views.

I agree with your statement above that a single package should not override a global system setting. We should really work to remove as many of these from vdsm as we possibly can. It will help to make vdsm a much safer/well-behaved package.

I'm fine with dropping these from vdsm, but I think they are good for ovirt; we would like to (be able to) enforce policy on our nodes. If configuring core dumps is removed from vdsm, it should go somewhere else, or our log-collector users would miss their beloved dumps.

Yes, I agree. From my point of view the plan was to do the following:

1. Remove unnecessary system configuration changes. This includes things like Royce's supervdsm startup process patch (and accompanying sudo-supervdsm conversions), which allows us to remove some of the sudo configuration.
2. Isolate the remaining tweaks into vdsm-tool.
3. Provide a service/program that can be run to configure a system to work in an ovirt-engine controlled cluster.

Doing this allows vdsm to be safely installed on any system as a basic prerequisite for other software.

--
Adam Litke a...@us.ibm.com
IBM Linux Technology Center