[ovirt-devel] Re: VM rebooted during OST test_hotplug_cpu

2020-06-30 Thread Yedidyah Bar David
On Tue, Jun 30, 2020 at 9:30 PM Nir Soffer  wrote:
>
> On Tue, Jun 30, 2020 at 10:37 AM Yedidyah Bar David  wrote:
> >
> > On Tue, Jun 30, 2020 at 9:37 AM Michal Skrivanek
> >  wrote:
> > >
> > >
> > >
> > > > On 30 Jun 2020, at 08:30, Yedidyah Bar David  wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I am trying to verify fixes for ovirt-engine-rename, specifically for
> > > > OVN. Engine top patch is [1], OST patch [2]. Ran the manual job on
> > > > these [3].
> > > >
> > > > In previous patches, OST failed in earlier tests. Now, it passed these
> > > > tests, so I hope that my patches are enough for what I am trying to
> > > > do. However, [3] did fail later, during test_hotplug_cpu - it set the
> > > > number of CPUs, then tried to connect to the VM, and timed out.
> > > >
> > > > The logs imply that right after it changed the number of CPUs, the VM
> > > > was rebooted, apparently by libvirtd. Relevant log snippets:
> > > >
> > > > vdsm [4]:
> > > >
> > > > 2020-06-29 10:21:10,889-0400 DEBUG (jsonrpc/1) [virt.vm]
> > > > (vmId='7474280d-4501-4355-9425-63898757682b') Setting number of cpus
> > > > to : 2 (vm:3089)
> > > > 2020-06-29 10:21:10,952-0400 INFO  (jsonrpc/1) [api.virt] FINISH
> > > > setNumberOfCpus return={'status': {'code': 0, 'message': 'Done'},
> > > > 'vmList': {}} from=:::192.168.201.4,54576, flow_id=7f9503ed,
> > > > vmId=7474280d-4501-4355-9425-63898757682b (api:54)
> > > > 2020-06-29 10:21:11,111-0400 DEBUG (periodic/0)
> > > > [virt.sampling.VMBulkstatsMonitor] sampled timestamp 2925.602824355
> > > > elapsed 0.160 acquired True domains all (sampling:451)
> > > > 2020-06-29 10:21:11,430-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer]
> > > > Return 'VM.setNumberOfCpus' in bridge with {} (__init__:356)
> > > > 2020-06-29 10:21:11,432-0400 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer]
> > > > RPC call VM.setNumberOfCpus succeeded in 0.56 seconds (__init__:312)
> > > > 2020-06-29 10:21:12,228-0400 INFO  (libvirt/events) [virt.vm]
> > > > (vmId='7474280d-4501-4355-9425-63898757682b') reboot event (vm:1033)
> > > >
> > > > qemu [5]:
> > > >
> > > > 2020-06-29T14:21:12.260303Z qemu-kvm: terminating on signal 15 from
> > > > pid 42224 ()
> > > > 2020-06-29 14:21:12.462+: shutting down, reason=destroyed
> > > >
> > > > libvirtd [6] itself does not log anything relevant AFAICT, but at
> > > > least it shows that the above unknown process is itself:
> > > >
> > > > 2020-06-29 14:18:16.212+: 42224: error : qemuMonitorIO:620 :
> > > > internal error: End of file from qemu monitor
>
> Is this from libvirt log?

Yes

> Why would libvirt log log libvirtd pid?

Perhaps because it forks several children, and thus also includes the pid
in its logging format string? That's standard practice.

>
> > > > (Note that above line is from 3 minutes before the reboot, and the
> > > > only place in the log with '42224'. No other log there has 42224,
> > > > other than these and audit.log).
> > > >
> > > > Any idea? Is this a bug in libvirt? vdsm? I'd at least expect
> > > > something in the log for such a severe step.
>
> Is this the hosted engine vm?

No, it's "vm0". (It's the basic suite, not hosted-engine).

> If we had trouble with storage, maybe
> sanlock killed the vm because it could not renew the lease.
>
> What do we have in /var/log/sanlock.log?

https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/sanlock.log

Nothing that seems relevant - last line is:

2020-06-29 10:14:29 2523 [44292]: s4:r6 resource
bdabb997-2075-4bb6-8217-b3a99d1bd599:da464550-1f81-443a-907d-39b763f13751:/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share2/bdabb997-2075-4bb6-8217-b3a99d1bd599/dom_md/xleases:3145728

>
> Also, do we have any WARN logs in vdsm? if there was an issue with
> storage you would see warnings about blocked checkers every 10 seconds.

20 WARN lines. Last one before the problem:

2020-06-29 10:18:33,569-0400 WARN  (vm/da464550) [virt.vm]
(vmId='da464550-1f81-443a-907d-39b763f13751') Cannot find device alias
for _conf:{'type': 'lease', 'device': 'lease', 'lease_id':
'da464550-1f81-443a-907d-39b763f13751', 'sd_id':
'bdabb997-2075-4bb6-8217-b3a99d1bd599', 'path':
'/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share2/bdabb997-2075-4bb6-8217-b3a99d1bd599/dom_md/xleases',
'offset': '3145728'} _deviceXML:None alias:None config:> createXmlElem:> custom:{} device:lease
deviceId:None deviceType:None from_xml_tree:>
getXML:> get_extra_xmls:> get_identifying_attrs:> get_metadata:> hotunplug_event: is_hostdevice:False
lease_id:da464550-1f81-443a-907d-39b763f13751 log: offset:3145728
path:/rhev/data-center/mnt/192.168.200.4:_exports_nfs_share2/bdabb997-2075-4bb6-8217-b3a99d1bd599/dom_md/xleases
sd_id:bdabb997-2075-4bb6-8217-b3a99d1bd599 setup:> specParams:{} teardown:> type:lease
update_device_info:> vmid:None (core:329)

Next one after it:


[ovirt-devel] Re: Backup: how to download only used extents from imageio backend

2020-06-30 Thread Michael Ablassmeier
hi,

On Tue, Jun 30, 2020 at 04:49:01PM +0300, Nir Soffer wrote:
> On Tue, Jun 30, 2020 at 10:32 AM Michael Ablassmeier  wrote:
> >  
> > https://tranfer_node:54322/images/d471c659-889f-4e7f-b55a-a475649c48a6/extents
> >
> > As i failed to find them, are there any existing functions/api calls
> > that could be used to download only the used extents to a file/fifo
> > pipe?
> 
> To use _internal.io.copy to copy the image to tape, we need to solve
> several issues:
> 
> 1. how do you write the extents to tape so that you can extract them later?
> 2. provide a backend that knows how to stream data to tape in the right format
> 3. fix client.download() to consider the number of writers allowed by
> the backend,
>since streaming to tape using multiple writers will not be possible.

so, speaking as someone who works for a backup vendor, issues 1 and 2 are
already solved by our software: the backend is there, we just need a
way to extract the data from the API without storing it in a file
first. Something like:

 backup_vm.py full  pipe

is already sufficient, as our backup client software would simply read
the data from the pipe and send it to our backend, which handles all the
tape communication and format details.
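
For illustration, a minimal sketch of our side of that pipe (the pipe path and
chunk size are made-up examples, and backend_writer stands in for our tape
backend):

    import os

    PIPE_PATH = "/var/tmp/vm_backup.fifo"   # example path
    CHUNK_SIZE = 4 * 1024 * 1024

    def consume_backup(backend_writer):
        # Create the pipe the backup tool will write to, then stream every
        # chunk into our backend, which handles tape format, multiplexing, etc.
        if not os.path.exists(PIPE_PATH):
            os.mkfifo(PIPE_PATH)
        with open(PIPE_PATH, "rb") as pipe:
            while True:
                chunk = pipe.read(CHUNK_SIZE)
                if not chunk:
                    break
                backend_writer.write(chunk)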

The old implementation used the snapshot/attach feature, where our
backup client reads directly from the attached storage device and
sends the data to the backend, which takes care of multiplexing to tape,
possible deduplication, etc.

Tape is not the only use case here; most of the time our customers want
to write data to storage devices which do not expose a regular file
system (such as dedup services, StoreOnce, virtual tape solutions, etc.).

> To restore this backup, you need to:
> 1. find the tar in the tape (I have no idea how you would do this)
> 2. extract backup info from the tar
> 3. extract extents from the tar

1-3 are not an issue here and are handled by our backend.

> 4. start an upload transfer
> 5. for each data extent:
> read data from the tar member, and send to imageio using the right
> offset and size 

that is some good information, so it is possible to create an empty disk
with the same size using the API and then directly send the extents at
their proper offsets. How does it look with an incremental backup on top
of a just-restored full backup? Does the imageio backend automatically
rebase and commit the data from the incremental backup during upload?

As I understand it, requesting the extents directly and writing them to
a file leaves you with an image in raw format, which then needs to be
properly re-aligned with zeros and converted to qcow2 to be able to
commit any of the incremental backups I have stored somewhere. Since a
convert is possible during upload, does that mean we don't have to rebuild
the full/inc chain using a temporary file which we then upload?

> So the missing part is to create a connection to imageio and reading the data.
> 
> The easiest way is to use imageio._internal.backends.http, but note that this
> is internal now, so you should not use it outside of imageio. It is fine for
> writing proof of concept, and if you can show a good use case we can work
> on public API.

yes, that is what I noticed. My current solution would be to use the
internal functions to query the extent information and then continue
extracting the extents, to be able to pipe the data into our backend.

> You can write this using http.client.HTTPSConnection without using
> the http backend, but it would be a lot of code.

thanks for your example, I will give it a try during the POC implementation.

> We probably need to expose the backends or a simplified interface
> in the client public API to make it easier to write such applications.
> 
> Maybe something like:
> 
>  client.copy(src, dst)
> 
> Where src and dst are objects implementing imageio backend interface.
> 
> But before we do this we need to see some examples of real programs
> using imageio, to understand the requirements better.

the main feature for us would be to be able to read the data and
pipe it somewhere, which works by using the _internal API
functions, but having a stable interface for it would make it much
easier for any backup vendor to implement a client for
the new API in their software.

If anyone is interested in hearing more thoughts about that, also from
Red Hat, don't hesitate to contact me directly to set up a call.

bye,
- michael
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/QLIGGZGCN5XCKEPDOLXO3IM3TCQPKKFY/


[ovirt-devel] Re: VM rebooted during OST test_hotplug_cpu

2020-06-30 Thread Nir Soffer
On Tue, Jun 30, 2020 at 10:37 AM Yedidyah Bar David  wrote:
>
> On Tue, Jun 30, 2020 at 9:37 AM Michal Skrivanek
>  wrote:
> >
> >
> >
> > > On 30 Jun 2020, at 08:30, Yedidyah Bar David  wrote:
> > >
> > > Hi all,
> > >
> > > I am trying to verify fixes for ovirt-engine-rename, specifically for
> > > OVN. Engine top patch is [1], OST patch [2]. Ran the manual job on
> > > these [3].
> > >
> > > In previous patches, OST failed in earlier tests. Now, it passed these
> > > tests, so I hope that my patches are enough for what I am trying to
> > > do. However, [3] did fail later, during test_hotplug_cpu - it set the
> > > number of CPUs, then tried to connect to the VM, and timed out.
> > >
> > > The logs imply that right after it changed the number of CPUs, the VM
> > > was rebooted, apparently by libvirtd. Relevant log snippets:
> > >
> > > vdsm [4]:
> > >
> > > 2020-06-29 10:21:10,889-0400 DEBUG (jsonrpc/1) [virt.vm]
> > > (vmId='7474280d-4501-4355-9425-63898757682b') Setting number of cpus
> > > to : 2 (vm:3089)
> > > 2020-06-29 10:21:10,952-0400 INFO  (jsonrpc/1) [api.virt] FINISH
> > > setNumberOfCpus return={'status': {'code': 0, 'message': 'Done'},
> > > 'vmList': {}} from=:::192.168.201.4,54576, flow_id=7f9503ed,
> > > vmId=7474280d-4501-4355-9425-63898757682b (api:54)
> > > 2020-06-29 10:21:11,111-0400 DEBUG (periodic/0)
> > > [virt.sampling.VMBulkstatsMonitor] sampled timestamp 2925.602824355
> > > elapsed 0.160 acquired True domains all (sampling:451)
> > > 2020-06-29 10:21:11,430-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer]
> > > Return 'VM.setNumberOfCpus' in bridge with {} (__init__:356)
> > > 2020-06-29 10:21:11,432-0400 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer]
> > > RPC call VM.setNumberOfCpus succeeded in 0.56 seconds (__init__:312)
> > > 2020-06-29 10:21:12,228-0400 INFO  (libvirt/events) [virt.vm]
> > > (vmId='7474280d-4501-4355-9425-63898757682b') reboot event (vm:1033)
> > >
> > > qemu [5]:
> > >
> > > 2020-06-29T14:21:12.260303Z qemu-kvm: terminating on signal 15 from
> > > pid 42224 ()
> > > 2020-06-29 14:21:12.462+: shutting down, reason=destroyed
> > >
> > > libvirtd [6] itself does not log anything relevant AFAICT, but at
> > > least it shows that the above unknown process is itself:
> > >
> > > 2020-06-29 14:18:16.212+: 42224: error : qemuMonitorIO:620 :
> > > internal error: End of file from qemu monitor

Is this from libvirt log? Why would libvirt log log libvirtd pid?

> > > (Note that above line is from 3 minutes before the reboot, and the
> > > only place in the log with '42224'. No other log there has 42224,
> > > other than these and audit.log).
> > >
> > > Any idea? Is this a bug in libvirt? vdsm? I'd at least expect
> > > something in the log for such a severe step.

Is this the hosted engine vm? If we had trouble with storage, maybe
sanlock killed the vm because it could not renew the lease.

What do we have in /var/log/sanlock.log?

Also, do we have any WARN logs in vdsm? if there was an issue with
storage you would see warnings about blocked checkers every 10 seconds.

> > I’d suggest to rerun.
>
> ok
>
> > I don’t trust the CI env at all. Could be any reason.
>
> If the qemu process would have simply died, I'd agree it's likely to
> be an infra issue.
> But if libvirt indeed killed it, I'd say it's either a bug there, or
> some weird flow (which requires more logging, at least).
>
> > It’s highly unlikely to be caused by your patch, and I can see on my infra 
> > that OST is running well on both CentOS and Stream.
>
> Well, also basic-suite is now passing for quite some time now without
> any failures:
>
> https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_nightly/
>
> Thanks,
>
> >
> > >
> > > [1] https://gerrit.ovirt.org/109961
> > > [2] https://gerrit.ovirt.org/109734
> > > [3] 
> > > https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/
> > > [4] 
> > > https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log
> > > [5] 
> > > https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/libvirt/qemu/vm0.log
> > > [6] 
> > > https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/libvirt.log
> > > --
> > > Didi
> > > ___
> > > Devel mailing list -- devel@ovirt.org
> > > To unsubscribe send an email to devel-le...@ovirt.org
> > > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> > > oVirt Code of Conduct: 
> > > 

[ovirt-devel] ovirt-engine has been tagged (ovirt-engine-4.4.1.5)

2020-06-30 Thread Sandro Bonazzola

___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/DHHLGFFMEFNQARTKB32J7OZOFEDFCN62/


[ovirt-devel] vdsm CI failing randomly again

2020-06-30 Thread Ales Musil
Hi,

it seems like vdsm CI is failing randomly again on el8.

I can see the error that it cannot install the python2-pyxdg package.

2020-06-30T11:27:12.327Z] Error: Unable to find a match: python2-pyxdg nosync


The job that failed [0].
Can anyone please take a look?

Thank you.
Regards,
Ales

[0] https://jenkins.ovirt.org/job/vdsm_standard-check-patch/22328/
-- 

Ales Musil

Software Engineer - RHV Network

Red Hat EMEA 

amu...@redhat.com    IM: amusil

___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/WHDBHEWLHOU6MYVPDKCUSZVPEKXDLMRX/


[ovirt-devel] Hyperconverged Single Host Glusterfs timing issue

2020-06-30 Thread Glenn Marcy
Now that I've been able to get past my issues with the q35 BIOS, I decided to try
the hyperconverged install. Setting up the storage was no problem and the
engine setup went fine until the point where it wanted to use that glusterfs
storage. I got the error, pretty quickly I might add, of

2020-06-30 02:17:50,689-0400 DEBUG ansible on_any args 
 kwargs
2020-06-30 02:17:51,428-0400 INFO ansible task start {'status': 'OK', 
'ansible_type': 'task', 'ansible_playbook': 
'/usr/share/ovirt-hosted-engine-setup/ansible/trigger_role.yml', 
'ansible_task': 'ovirt.hosted_engine_setup : Add glusterfs storage domain'}
2020-06-30 02:17:51,428-0400 DEBUG ansible on_any args TASK: 
ovirt.hosted_engine_setup : Add glusterfs storage domain kwargs 
is_conditional:False
2020-06-30 02:17:51,429-0400 DEBUG ansible on_any args localhostTASK: 
ovirt.hosted_engine_setup : Add glusterfs storage domain kwargs
2020-06-30 02:17:53,656-0400 DEBUG var changed: host "localhost" var 
"otopi_storage_domain_details_gluster" type "" value: "{
"changed": false,
"exception": "Traceback (most recent call last):
  File 
\"/tmp/ansible_ovirt_storage_domain_payload_jh42k_ip/ansible_ovirt_storage_domain_payload.zip/ansible/modules/cloud/ovirt/ovirt_storage_domain.py\",
 line 792, in main
  File 
\"/tmp/ansible_ovirt_storage_domain_payload_jh42k_ip/ansible_ovirt_storage_domain_payload.zip/ansible/module_utils/ovirt.py\",
 line 623, in create
**kwargs
  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py\", line 
26097, in add
return self._internal_add(storage_domain, headers, query, wait)
  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 232, 
in _internal_add
return future.wait() if wait else future
  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 55, in 
wait
return self._code(response)
  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 229, 
in callback
self._check_fault(response)
  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 132, 
in _check_fault
self._raise_error(response, body)
  File \"/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py\", line 118, 
in _raise_error
raise error\novirtsdk4.Error: Fault reason is \"Operation Failed\". Fault 
detail is \"[Failed to fetch Gluster Volume List]\". HTTP response code is 
400.\n",
"failed": true,
"msg": "Fault reason is \"Operation Failed\". Fault detail is \"[Failed to 
fetch Gluster Volume List]\". HTTP response code is 400."
}"

The glusterfs CLI worked fine by that point and had no issues with volume list, so
I went into the node:6900/ovirt-engine forwarder to the appliance and could see
that, at the time of that error, the event log had a message that it was
starting to update things, and a few seconds later a message that the
volumes were available.

I am thinking that there is a need for a retry loop in this step, or that
something in the path isn't waiting for something internal to complete before
issuing the Operation Failed error.
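
For example, this is the kind of retry I have in mind, sketched directly against
the Python SDK rather than the ansible module (attempt count, delay and function
name are only illustrative):

    import time

    import ovirtsdk4 as sdk

    def add_storage_domain_with_retry(connection, sd, attempts=6, delay=10):
        # Retry the add for a while, since "Failed to fetch Gluster Volume List"
        # seems to be transient right after the volumes come up.
        sds_service = connection.system_service().storage_domains_service()
        for i in range(attempts):
            try:
                return sds_service.add(sd)
            except sdk.Error:
                if i == attempts - 1:
                    raise
                time.sleep(delay)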

Regards,
Glenn Marcy
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/T3MKM7N4KIL3TPJNGE3VZ6RP2MNEI53X/


[ovirt-devel] Re: Backup: how to download only used extents from imageio backend

2020-06-30 Thread Nir Soffer
On Tue, Jun 30, 2020 at 10:32 AM Michael Ablassmeier  wrote:
>
> hi,
>
> im currently looking at the new incremental backup api that has been
> part of the 4.4 and RHV 4.4-beta release. So far i was able to create
> full/incremental backups and restore without any problem.
>
> Now, using the backup_vm.py example from the ovirt-engine-sdk i get
> the following is happening during a full backup:
>
>  1) imageio client api requests transfer
>  2) starts qemu-img to create a local qemu image with same size
>  3) starts qemu-nbd to serve this image
>  4) reads used extents from provided imageio source, passes data to
>  qemu-nbd process
>  5) resulting file is a thin provisioned qcow image with the actual
>  data of the VM's used space.
>
> while this works great, it has one downside: if i backup a virtual
> machine with lots of used extents, or multiple virtual machines at the
> same time, i may run out of space, if my primary backup target is
> not a regular disk.
>
> Imagine i want to stream the FULL backup to tape directly like
>
>  backup_vm.py full [..]  /dev/nst0
>
> thats currently not possible, because qemu-img is not able to open
> a tape device directly, given its nature of the qcow2 format.
>
> So what iam basically looking for, is a way to download only the extents
> from the imageio server that are really in use, not depending on qemu-*
> tools, to be able to pipe the data somehwere else.
>
> Standard tools, like for example curl, will allways download the full
> provisioned image from the imageio backend (of course).
>
> I noticed is that it is possible to query the extents via:
>
>  
> https://tranfer_node:54322/images/d471c659-889f-4e7f-b55a-a475649c48a6/extents
>
> As i failed to find them, are there any existing functions/api calls
> that could be used to download only the used extents to a file/fifo
> pipe?
>
> So far, i played around with the _internal.io.copy function, beeing able
> to at least read the data into a in memory BytesIO stream, but thats not
> the solution to my "problem" :)

To use _internal.io.copy to copy the image to tape, we need to solve
several issues:

1. how do you write the extents to tape so that you can extract them later?
2. provide a backend that knows how to stream data to tape in the right format
3. fix client.download() to consider the number of writers allowed by the
   backend, since streaming to tape using multiple writers will not be
   possible.

I think we can start with a simple implementation using imageio API, and once
we have a working solution, we can consider making a backend.

A possible solution for 1 is to use tar format, creating one tar per backup.

The tar structure can be:

- backup info - JSON file with information about this backup, like vm id,
  disk id, date, checkpoint, etc. (see the example after this list)
- extents - the JSON returned from imageio as is. Using this JSON you can
  restore later every extent to the right location in the restored image
- extent 1 - first data extent (zero=False)
...
- extent N - last data extent
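
For illustration, the backup info member could be as simple as this (the field
names and values are only an example of the idea, not a defined format):

    backup_info = {
        "vm_id": "da464550-1f81-443a-907d-39b763f13751",        # example values
        "disk_id": "d471c659-889f-4e7f-b55a-a475649c48a6",
        "date": "2020-06-30T12:00:00Z",
        "checkpoint_id": "uuid-of-this-backup-checkpoint-or-null",
        "backup_mode": "full",
    }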

To restore this backup, you need to:

1. find the tar in the tape (I have no idea how you would do this)
2. extract backup info from the tar
3. extract extents from the tar
4. start an upload transfer
5. for each data extent:
   read data from the tar member, and send to imageio using the right
   offset and size (a rough sketch follows below)
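
A rough sketch of step 5 using only the standard library (this is not an
official imageio client API; I'm assuming the daemon accepts PUT with a
Content-Range header for random writes, as described in the imageio random
I/O docs - verify against your version):

    import http.client
    import ssl
    from urllib.parse import urlparse

    def upload_extent(transfer_url, cafile, offset, length, reader):
        # Write "length" bytes from "reader" at "offset" in the target disk.
        url = urlparse(transfer_url)
        context = ssl.create_default_context(cafile=cafile)
        con = http.client.HTTPSConnection(url.netloc, context=context)
        try:
            con.putrequest("PUT", url.path)
            con.putheader("Content-Range",
                          "bytes {}-{}/*".format(offset, offset + length - 1))
            con.putheader("Content-Length", str(length))
            con.endheaders()
            con.send(reader.read(length))
            res = con.getresponse()
            if res.status not in (200, 204):
                raise RuntimeError(
                    "Write failed: {} {}".format(res.status, res.read()))
        finally:
            con.close()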

Other formats are possible, but reusing tar seems like the easiest way and
will make it easier to write and read backups from tapes.

Creating a tar file and adding items using streaming can be done like this:

import tarfile

with tarfile.open("/dev/xxx", "w|") as tar:

    # Create tarinfo for extent-N
    # (setting other attributes may be needed)
    tarinfo = tarfile.TarInfo("extent-{}".format(extent_number))
    tarinfo.size = extent_size

    # reader must implement read(n), providing tarinfo.size bytes.
    tar.addfile(tarinfo, fileobj=reader)

I never tried to write directly to tape with python tarfile, but it should work.

So the missing part is to create a connection to imageio and read the data.

The easiest way is to use imageio._internal.backends.http, but note that this
is internal now, so you should not use it outside of imageio. It is fine for
writing a proof of concept, and if you can show a good use case we can work
on a public API.

With that backend, you can do this:

import io
import json

from imageio._internal.backends import http

with http.Backend(transfer_url, cafile) as backend:
    extents = list(backend.extents("zero"))

    # Write extents to tarfile. Assuming you wrote a helper write_to_tar()
    # doing the TarInfo dance.
    extents_data = json.dumps(
        [extent.to_dict() for extent in extents]).encode("utf-8")
    write_to_tar("extents", len(extents_data), io.BytesIO(extents_data))

    for n, extent in enumerate(e for e in extents if not e.zero):

        # Seek to start of extent. Reading extent.length bytes will
        # return extent data.
        backend.seek(extent.start)

        # 

[ovirt-devel] Re: VM rebooted during OST test_hotplug_cpu

2020-06-30 Thread Yedidyah Bar David
On Tue, Jun 30, 2020 at 9:37 AM Michal Skrivanek
 wrote:
>
>
>
> > On 30 Jun 2020, at 08:30, Yedidyah Bar David  wrote:
> >
> > Hi all,
> >
> > I am trying to verify fixes for ovirt-engine-rename, specifically for
> > OVN. Engine top patch is [1], OST patch [2]. Ran the manual job on
> > these [3].
> >
> > In previous patches, OST failed in earlier tests. Now, it passed these
> > tests, so I hope that my patches are enough for what I am trying to
> > do. However, [3] did fail later, during test_hotplug_cpu - it set the
> > number of CPUs, then tried to connect to the VM, and timed out.
> >
> > The logs imply that right after it changed the number of CPUs, the VM
> > was rebooted, apparently by libvirtd. Relevant log snippets:
> >
> > vdsm [4]:
> >
> > 2020-06-29 10:21:10,889-0400 DEBUG (jsonrpc/1) [virt.vm]
> > (vmId='7474280d-4501-4355-9425-63898757682b') Setting number of cpus
> > to : 2 (vm:3089)
> > 2020-06-29 10:21:10,952-0400 INFO  (jsonrpc/1) [api.virt] FINISH
> > setNumberOfCpus return={'status': {'code': 0, 'message': 'Done'},
> > 'vmList': {}} from=:::192.168.201.4,54576, flow_id=7f9503ed,
> > vmId=7474280d-4501-4355-9425-63898757682b (api:54)
> > 2020-06-29 10:21:11,111-0400 DEBUG (periodic/0)
> > [virt.sampling.VMBulkstatsMonitor] sampled timestamp 2925.602824355
> > elapsed 0.160 acquired True domains all (sampling:451)
> > 2020-06-29 10:21:11,430-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer]
> > Return 'VM.setNumberOfCpus' in bridge with {} (__init__:356)
> > 2020-06-29 10:21:11,432-0400 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer]
> > RPC call VM.setNumberOfCpus succeeded in 0.56 seconds (__init__:312)
> > 2020-06-29 10:21:12,228-0400 INFO  (libvirt/events) [virt.vm]
> > (vmId='7474280d-4501-4355-9425-63898757682b') reboot event (vm:1033)
> >
> > qemu [5]:
> >
> > 2020-06-29T14:21:12.260303Z qemu-kvm: terminating on signal 15 from
> > pid 42224 ()
> > 2020-06-29 14:21:12.462+: shutting down, reason=destroyed
> >
> > libvirtd [6] itself does not log anything relevant AFAICT, but at
> > least it shows that the above unknown process is itself:
> >
> > 2020-06-29 14:18:16.212+: 42224: error : qemuMonitorIO:620 :
> > internal error: End of file from qemu monitor
> >
> > (Note that above line is from 3 minutes before the reboot, and the
> > only place in the log with '42224'. No other log there has 42224,
> > other than these and audit.log).
> >
> > Any idea? Is this a bug in libvirt? vdsm? I'd at least expect
> > something in the log for such a severe step.
>
> I’d suggest to rerun.

ok

> I don’t trust the CI env at all. Could be any reason.

If the qemu process had simply died, I'd agree it's likely to
be an infra issue.
But if libvirt indeed killed it, I'd say it's either a bug there, or
some weird flow (which requires more logging, at least).

> It’s highly unlikely to be caused by your patch, and I can see on my infra 
> that OST is running well on both CentOS and Stream.

Well, basic-suite has also been passing for quite some time now without
any failures:

https://jenkins.ovirt.org/job/ovirt-system-tests_basic-suite-master_nightly/

Thanks,

>
> >
> > [1] https://gerrit.ovirt.org/109961
> > [2] https://gerrit.ovirt.org/109734
> > [3] 
> > https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/
> > [4] 
> > https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log
> > [5] 
> > https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/libvirt/qemu/vm0.log
> > [6] 
> > https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/libvirt.log
> > --
> > Didi
> > ___
> > Devel mailing list -- devel@ovirt.org
> > To unsubscribe send an email to devel-le...@ovirt.org
> > Privacy Statement: https://www.ovirt.org/privacy-policy.html
> > oVirt Code of Conduct: 
> > https://www.ovirt.org/community/about/community-guidelines/
> > List Archives: 
> > https://lists.ovirt.org/archives/list/devel@ovirt.org/message/JEF5QWFZF4O2OGQFHPH7SPU6SX76KF47/
>


-- 
Didi
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/HMOCTJZDQERXZEXF24IUQE653RSATGLF/


[ovirt-devel] Backup: how to download only used extents from imageio backend

2020-06-30 Thread Michael Ablassmeier
hi,

I'm currently looking at the new incremental backup API that has been
part of the 4.4 and RHV 4.4-beta release. So far I was able to create
full/incremental backups and restore without any problem.

Now, using the backup_vm.py example from the ovirt-engine-sdk, the
following is happening during a full backup:

 1) imageio client API requests transfer
 2) starts qemu-img to create a local qemu image with the same size
 3) starts qemu-nbd to serve this image
 4) reads used extents from the provided imageio source, passes data to
    the qemu-nbd process
 5) resulting file is a thin provisioned qcow image with the actual
    data of the VM's used space.

while this works great, it has one downside: if I back up a virtual
machine with lots of used extents, or multiple virtual machines at the
same time, I may run out of space if my primary backup target is
not a regular disk.

Imagine I want to stream the FULL backup to tape directly, like

 backup_vm.py full [..]  /dev/nst0

that's currently not possible, because qemu-img is not able to open
a tape device directly, given the nature of the qcow2 format.

So what I am basically looking for is a way to download only the extents
from the imageio server that are really in use, without depending on qemu-*
tools, to be able to pipe the data somewhere else.

Standard tools, like for example curl, will always download the fully
provisioned image from the imageio backend (of course).

I noticed that it is possible to query the extents via:

 https://tranfer_node:54322/images/d471c659-889f-4e7f-b55a-a475649c48a6/extents

As I failed to find them, are there any existing functions/API calls
that could be used to download only the used extents to a file/fifo
pipe?

So far, I played around with the _internal.io.copy function, being able
to at least read the data into an in-memory BytesIO stream, but that's not
the solution to my "problem" :)
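
For illustration, this is roughly the kind of download loop I am after, using
only standard HTTP calls against the transfer URL (the extent field names and
the Range header behaviour are my assumptions from the extents output above,
so they would need to be verified against the imageio documentation):

    import json
    import ssl
    from urllib.request import Request, urlopen

    TRANSFER_URL = "https://tranfer_node:54322/images/d471c659-889f-4e7f-b55a-a475649c48a6"
    FIFO_PATH = "/var/tmp/backup.fifo"      # example destination pipe

    context = ssl.create_default_context(cafile="ca.pem")   # engine CA, example path

    # Fetch the allocation map for the disk.
    with urlopen(Request(TRANSFER_URL + "/extents"), context=context) as res:
        extents = json.load(res)

    # Download only the allocated (non-zero) extents and pipe them to our backend.
    with open(FIFO_PATH, "wb") as out:
        for extent in extents:
            if extent["zero"]:
                continue
            start = extent["start"]
            end = start + extent["length"] - 1
            req = Request(TRANSFER_URL,
                          headers={"Range": "bytes={}-{}".format(start, end)})
            with urlopen(req, context=context) as res:
                out.write(res.read())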

bye,
- michael
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/ATIBK6AWZHHGVMKXUWYQU2CAVC74TDUJ/


[ovirt-devel] Untrusted CI environment

2020-06-30 Thread Sandro Bonazzola
> I’d suggest to rerun. I don’t trust the CI env at all. Could be any reason.
> It’s highly unlikely to be caused by your patch, and I can see on my infra
> that OST is running well on both CentOS and Stream.
>

If we don't trust the CI environment (I also don't really trust it that
much), why not change it to something we can trust and rely on?

-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA 

sbona...@redhat.com


Red Hat respects your work life balance. Therefore there is no need to
answer this email out of your office hours.
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/XNDMTTJ6ICIFPTP7SJA3YRHEISZL3UKE/


[ovirt-devel] Re: OST failing because of modular filtering error

2020-06-30 Thread Michal Skrivanek


> On 29 Jun 2020, at 17:12, Parth Dhanjal  wrote:
> 
> Hey!
> 
> I am unable to install gluster-ansible-roles on VMs because of an error from 
> modular filtering
>   - conflicting requests
>   - package python-six-1.9.0-2.el7.noarch is filtered out by modular filtering
>   - package python2-cryptography-1.7.2-2.el7.x86_64 is filtered out by 
> modular filtering
>   - package python2-cryptography-2.1.4-2.el7.x86_64 is filtered out by 
> modular filtering
>   - package python2-six-1.10.0-9.el7.noarch is filtered out by modular 
> filtering
>   - package python-six-1.9.0-1.el7.noarch is filtered out by modular filtering
>  
> I tried disabling the modular packages but still facing this issue, can 
> someone suggest a fix?

how did you get el7 packages? what/how are you installing exactly?

> 
> Regards
> Parth Dhanjal
> ___
> Devel mailing list -- devel@ovirt.org
> To unsubscribe send an email to devel-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/3ETIG5GJFGUKTW4HL77M432I62VZNXSA/

___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/VEPKSRD6DVDGVFBRLRCUGXHKDDXRK76A/


[ovirt-devel] Re: VM rebooted during OST test_hotplug_cpu

2020-06-30 Thread Michal Skrivanek


> On 30 Jun 2020, at 08:30, Yedidyah Bar David  wrote:
> 
> Hi all,
> 
> I am trying to verify fixes for ovirt-engine-rename, specifically for
> OVN. Engine top patch is [1], OST patch [2]. Ran the manual job on
> these [3].
> 
> In previous patches, OST failed in earlier tests. Now, it passed these
> tests, so I hope that my patches are enough for what I am trying to
> do. However, [3] did fail later, during test_hotplug_cpu - it set the
> number of CPUs, then tried to connect to the VM, and timed out.
> 
> The logs imply that right after it changed the number of CPUs, the VM
> was rebooted, apparently by libvirtd. Relevant log snippets:
> 
> vdsm [4]:
> 
> 2020-06-29 10:21:10,889-0400 DEBUG (jsonrpc/1) [virt.vm]
> (vmId='7474280d-4501-4355-9425-63898757682b') Setting number of cpus
> to : 2 (vm:3089)
> 2020-06-29 10:21:10,952-0400 INFO  (jsonrpc/1) [api.virt] FINISH
> setNumberOfCpus return={'status': {'code': 0, 'message': 'Done'},
> 'vmList': {}} from=:::192.168.201.4,54576, flow_id=7f9503ed,
> vmId=7474280d-4501-4355-9425-63898757682b (api:54)
> 2020-06-29 10:21:11,111-0400 DEBUG (periodic/0)
> [virt.sampling.VMBulkstatsMonitor] sampled timestamp 2925.602824355
> elapsed 0.160 acquired True domains all (sampling:451)
> 2020-06-29 10:21:11,430-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer]
> Return 'VM.setNumberOfCpus' in bridge with {} (__init__:356)
> 2020-06-29 10:21:11,432-0400 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer]
> RPC call VM.setNumberOfCpus succeeded in 0.56 seconds (__init__:312)
> 2020-06-29 10:21:12,228-0400 INFO  (libvirt/events) [virt.vm]
> (vmId='7474280d-4501-4355-9425-63898757682b') reboot event (vm:1033)
> 
> qemu [5]:
> 
> 2020-06-29T14:21:12.260303Z qemu-kvm: terminating on signal 15 from
> pid 42224 ()
> 2020-06-29 14:21:12.462+: shutting down, reason=destroyed
> 
> libvirtd [6] itself does not log anything relevant AFAICT, but at
> least it shows that the above unknown process is itself:
> 
> 2020-06-29 14:18:16.212+: 42224: error : qemuMonitorIO:620 :
> internal error: End of file from qemu monitor
> 
> (Note that above line is from 3 minutes before the reboot, and the
> only place in the log with '42224'. No other log there has 42224,
> other than these and audit.log).
> 
> Any idea? Is this a bug in libvirt? vdsm? I'd at least expect
> something in the log for such a severe step.

I’d suggest to rerun. I don’t trust the CI env at all. Could be any reason.
It’s highly unlikely to be caused by your patch, and I can see on my infra that 
OST is running well on both CentOS and Stream.

> 
> [1] https://gerrit.ovirt.org/109961
> [2] https://gerrit.ovirt.org/109734
> [3] 
> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/
> [4] 
> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log
> [5] 
> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/libvirt/qemu/vm0.log
> [6] 
> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/libvirt.log
> -- 
> Didi
> ___
> Devel mailing list -- devel@ovirt.org
> To unsubscribe send an email to devel-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: 
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives: 
> https://lists.ovirt.org/archives/list/devel@ovirt.org/message/JEF5QWFZF4O2OGQFHPH7SPU6SX76KF47/
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/ZCKCXCOG4IQWTFSSTAG43INH7WA5BHQS/


[ovirt-devel] VM rebooted during OST test_hotplug_cpu

2020-06-30 Thread Yedidyah Bar David
Hi all,

I am trying to verify fixes for ovirt-engine-rename, specifically for
OVN. Engine top patch is [1], OST patch [2]. Ran the manual job on
these [3].

In previous patches, OST failed in earlier tests. Now, it passed these
tests, so I hope that my patches are enough for what I am trying to
do. However, [3] did fail later, during test_hotplug_cpu - it set the
number of CPUs, then tried to connect to the VM, and timed out.
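
For context, the hotplug itself boils down to an SDK update like the following
(a rough reconstruction for illustration, not the exact OST code):

    import ovirtsdk4.types as types

    def hotplug_cpus(vm_service, sockets):
        # vm_service is the ovirtsdk4 VmService of the running VM (here vm0).
        vm_service.update(
            types.Vm(
                cpu=types.Cpu(
                    topology=types.CpuTopology(sockets=sockets, cores=1, threads=1),
                ),
            ),
        )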

The logs imply that right after it changed the number of CPUs, the VM
was rebooted, apparently by libvirtd. Relevant log snippets:

vdsm [4]:

2020-06-29 10:21:10,889-0400 DEBUG (jsonrpc/1) [virt.vm]
(vmId='7474280d-4501-4355-9425-63898757682b') Setting number of cpus
to : 2 (vm:3089)
2020-06-29 10:21:10,952-0400 INFO  (jsonrpc/1) [api.virt] FINISH
setNumberOfCpus return={'status': {'code': 0, 'message': 'Done'},
'vmList': {}} from=:::192.168.201.4,54576, flow_id=7f9503ed,
vmId=7474280d-4501-4355-9425-63898757682b (api:54)
2020-06-29 10:21:11,111-0400 DEBUG (periodic/0)
[virt.sampling.VMBulkstatsMonitor] sampled timestamp 2925.602824355
elapsed 0.160 acquired True domains all (sampling:451)
2020-06-29 10:21:11,430-0400 DEBUG (jsonrpc/1) [jsonrpc.JsonRpcServer]
Return 'VM.setNumberOfCpus' in bridge with {} (__init__:356)
2020-06-29 10:21:11,432-0400 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer]
RPC call VM.setNumberOfCpus succeeded in 0.56 seconds (__init__:312)
2020-06-29 10:21:12,228-0400 INFO  (libvirt/events) [virt.vm]
(vmId='7474280d-4501-4355-9425-63898757682b') reboot event (vm:1033)

qemu [5]:

2020-06-29T14:21:12.260303Z qemu-kvm: terminating on signal 15 from
pid 42224 ()
2020-06-29 14:21:12.462+: shutting down, reason=destroyed

libvirtd [6] itself does not log anything relevant AFAICT, but at
least it shows that the above unknown process is itself:

2020-06-29 14:18:16.212+: 42224: error : qemuMonitorIO:620 :
internal error: End of file from qemu monitor

(Note that above line is from 3 minutes before the reboot, and the
only place in the log with '42224'. No other log there has 42224,
other than these and audit.log).

Any idea? Is this a bug in libvirt? vdsm? I'd at least expect
something in the log for such a severe step.

[1] https://gerrit.ovirt.org/109961
[2] https://gerrit.ovirt.org/109734
[3] 
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/
[4] 
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/vdsm/vdsm.log
[5] 
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/libvirt/qemu/vm0.log
[6] 
https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/7031/artifact/exported-artifacts/test_logs/basic-suite-master/post-004_basic_sanity_pytest.py/lago-basic-suite-master-host-0/_var_log/libvirt.log
-- 
Didi
___
Devel mailing list -- devel@ovirt.org
To unsubscribe send an email to devel-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/devel@ovirt.org/message/JEF5QWFZF4O2OGQFHPH7SPU6SX76KF47/