[ovirt-users] Re: Incorrect Max CPU's per VM in libvirt if cluster compatibility 4.6 or 4.7

2022-07-20 Thread David Sekne
Hello,

To confirm, changing MaxNumOfCpusCoefficient solved my issue. The VM's
libvirt domain XML now reports a maximum of 64 vCPUs.
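For anyone hitting the same symptom, the change looks roughly like the sketch below. The commands are the standard engine-config tool; the coefficient value and CPU counts are assumed examples, and the max-vCPU formula is my reading of the behaviour, not documented fact.

```shell
# Query and raise the coefficient on the engine host, then restart the engine:
#   engine-config -g MaxNumOfCpusCoefficient
#   engine-config -s MaxNumOfCpusCoefficient=16
#   systemctl restart ovirt-engine

# The engine appears to derive the libvirt vCPU maximum from the VM's
# current CPU count times this coefficient:
current_cpus=4
coefficient=16
max_vcpus=$((current_cpus * coefficient))
echo "$max_vcpus"
```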

Thank you for the help.

Regards,
David
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/GMS25UIVYI6N3EOC2J2CLNSEYVL4VRXO/


[ovirt-users] Re: Incorrect Max CPU's per VM in libvirt if cluster compatibility 4.6 or 4.7

2022-07-19 Thread David Sekne
Hello,

Not sure about the feature part. I additionally tested this on oVirt
4.4.10.7-1.el8 and it works fine even when cluster compatibility there is set
to 4.6.

Engine:
MaxNumOfVmSockets: 32 version: 4.2
MaxNumOfVmSockets: 32 version: 4.3
MaxNumOfVmSockets: 32 version: 4.4
MaxNumOfVmSockets: 32 version: 4.5
MaxNumOfVmSockets: 32 version: 4.6

VM (maximum reported by libvirt):
32

Regards,
David


[ovirt-users] Incorrect Max CPU's per VM in libvirt if cluster compatibility 4.6 or 4.7

2022-07-19 Thread David Sekne
Hello,

I recently upgraded our cluster from 4.3 to 4.5 (4.5.1.3-1.el8) and afterwards
raised the cluster compatibility level to 4.7 as well. I noticed that CPU hot
plugging does not work above 16 CPUs if cluster compatibility is set to 4.6 or
4.7.

The error I get is:
Failed to hot set number of CPUS to VM testVM-3. Underlying error message: 
invalid argument: requested vcpus is greater than max allowable vcpus for the 
live domain: 32 > 16

I believe the issue is that the MaxNumOfVmSockets value set on the engine per
cluster version is not correctly applied in libvirt when the VM is started.
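One way to confirm what libvirt actually received is to read the maximum straight from the running domain; a sketch (the VM name and the sample <vcpu> element are illustrative):

```shell
# On the host, the live maximum can be queried directly:
#   virsh -r vcpucount testVM-3
# or read from the XML printed by `virsh -r dumpxml testVM-3`.
# Demonstrated here on a sample <vcpu> element:
xml='<vcpu placement="static" current="4">16</vcpu>'
max=$(echo "$xml" | grep -oP '(?<=>)[0-9]+(?=</vcpu>)')
echo "$max"
```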

My values on the engine:
MaxNumOfVmSockets: 32 version: 4.2
MaxNumOfVmSockets: 32 version: 4.3
MaxNumOfVmSockets: 32 version: 4.4
MaxNumOfVmSockets: 32 version: 4.5
MaxNumOfVmSockets: 64 version: 4.6
MaxNumOfVmSockets: 1 version: 4.7

Maximum vCPU values in libvirt for VMs, per cluster compatibility version:

4.3: 32
4.4: 32
4.5: 32
4.6: 16
4.7: 16

Is anyone else experiencing this issue (possible bug)?

Regards,
David


[ovirt-users] Re: Windows VMs randomly wont boot after compatibility change

2022-04-25 Thread David Sekne
Hello,

I played around a bit with the emulated machine option and I see we have a
slight misunderstanding. We upgraded the cluster compatibility level to 4.5
last year, at which time we indeed had to reconfigure most of the NICs on
Windows VMs (I forgot we had to do this as it wasn't my task). We changed
from 4.5 to 4.6 not long ago and no changes were needed on the VMs at
that time.

The problem we are seeing, which started after the update to 4.5, is that VMs
randomly won't boot at all (the reboot is initiated through the OS). The VM is
stuck at the boot screen (picture attached) indefinitely and can
only be booted if you stop/start it.

[image: image.png]

Have you seen any similar issues?

Regards,
David

On Fri, Apr 22, 2022 at 7:06 PM Erez Zarum  wrote:

> Hey,
> This mostly happens to Windows VMs. When you update the cluster
> compatibility level it should alert you about VMs that might have this issue
> and let you know before you resume the upgrade.
>
>
> On Fri, Apr 22, 2022 at 8:04 PM David Sekne  wrote:
>
>> Hello,
>>
>> I followed the documentation on updating the compatibility version. We
>> had no issues with NICs; all VMs just needed a reboot. No other issues
>> were displayed for the VMs after the reboot (we have around 400 VMs), so
>> I'm not sure which changes we would need to make on the VMs themselves.
>>
>> I will try changing the emulated machine for the Windows VMs to see if it
>> helps (other OSes have no issues).
>>
>> Thank you for the help.
>>
>> Regards,
>> David
>>
>> On Fri, Apr 22, 2022 at 3:02 PM Erez Zarum  wrote:
>>
>>> Hey,
>>> This is noted in the documents:
>>> https://www.ovirt.org/documentation/upgrade_guide/index.html#Changing_the_Cluster_Compatibility_Version_minor_updates
>>> I recommend, if you can't cope with it (i.e. logging in to the console and
>>> reconfiguring the NICs/disks), changing the VM custom emulated machine
>>> to "pc-q35-rhel8.1.0"; this is the default emulated machine when the cluster
>>> compatibility level is set to 4.4.
>>> Also note, if you rely heavily on VNC: there is a bug introduced in
>>> libvirt 8 that does not allow setting a password longer than 8 characters.
>>> oVirt by default tries to set a 12-character password, but until now libvirt
>>> simply ignored that and used only the first 8 characters. I don't know from
>>> which version this started, but in 4.4.10 you won't be able to use VNC, so I
>>> recommend using SPICE (which is actually much better).
>>>
>>>
>>>
>>> On Fri, Apr 22, 2022 at 2:57 PM David Sekne 
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I have noticed that some of our Windows VMs (2012 - 2022)
>>>> randomly won't boot when a reboot is initiated from the guest OS. As far as I can
>>>> tell this started happening after we raised the cluster compatibility from
>>>> 4.4 to 4.5 (it's 4.6 now). To fix it, a VM needs to be stopped and started.
>>>>
>>>> We are running oVirt 4.4.10.
>>>>
>>>> I cannot really see much in the logs when grepping for a specific VM
>>>> that had these issues.
>>>>
>>>> Example VM ID is: daf33e97-a76f-4b82-b4f2-20fa4891c88b
>>>>
>>>> I'm attaching logs:
>>>>
>>>> - Initial hypervisor where VM was running on (reboot is initiated at
>>>> 4:06:38 AM): vdsm-1.log
>>>> - Second hypervisor where VM was started after it was stopped and
>>>> started back (this was done 7:45:43 AM): vdsm-2.log
>>>> - Engine log: engine.log
>>>>
>>>> Has someone noticed any similar issues and can provide some feedback /
>>>> help?
>>>>
>>>> Regards,
>>>> David


[ovirt-users] Re: Windows VMs randomly wont boot after compatibility change

2022-04-22 Thread David Sekne
Hello,

I followed the documentation on updating the compatibility version. We had
no issues with NICs; all VMs just needed a reboot. No other issues were
displayed for the VMs after the reboot (we have around 400 VMs), so I'm not
sure which changes we would need to make on the VMs themselves.

I will try changing the emulated machine for the Windows VMs to see if it
helps (other OSes have no issues).
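For reference, the machine type a VM is currently running with can be read from its domain XML; a sketch (the sample line is illustrative):

```shell
# On the host: virsh -r dumpxml <vm> | grep 'machine='
# The machine type sits in the <os><type machine="..."> attribute, e.g.:
line='<type arch="x86_64" machine="pc-q35-rhel8.1.0">hvm</type>'
machine=$(echo "$line" | grep -oP '(?<=machine=")[^"]+')
echo "$machine"
```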

Thank you for the help.

Regards,
David

On Fri, Apr 22, 2022 at 3:02 PM Erez Zarum  wrote:

> Hey,
> This is noted in the documents:
> https://www.ovirt.org/documentation/upgrade_guide/index.html#Changing_the_Cluster_Compatibility_Version_minor_updates
> I recommend, if you can't cope with it (i.e. logging in to the console and
> reconfiguring the NICs/disks), changing the VM custom emulated machine
> to "pc-q35-rhel8.1.0"; this is the default emulated machine when the cluster
> compatibility level is set to 4.4.
> Also note, if you rely heavily on VNC: there is a bug introduced in libvirt
> 8 that does not allow setting a password longer than 8 characters. oVirt by
> default tries to set a 12-character password, but until now libvirt simply
> ignored that and used only the first 8 characters. I don't know from which
> version this started, but in 4.4.10 you won't be able to use VNC, so I
> recommend using SPICE (which is actually much better).
>
>
>
> On Fri, Apr 22, 2022 at 2:57 PM David Sekne  wrote:
>
>> Hello,
>>
>> I have noticed that some of our Windows VMs (2012 - 2022) randomly won't
>> boot when a reboot is initiated from the guest OS. As far as I can tell this
>> started happening after we raised the cluster compatibility from 4.4 to 4.5
>> (it's 4.6 now). To fix it, a VM needs to be stopped and started.
>>
>> We are running oVirt 4.4.10.
>>
>> I cannot really see much in the logs when grepping for a specific VM that
>> had these issues.
>>
>> Example VM ID is: daf33e97-a76f-4b82-b4f2-20fa4891c88b
>>
>> I'm attaching logs:
>>
>> - Initial hypervisor where VM was running on (reboot is initiated at
>> 4:06:38 AM): vdsm-1.log
>> - Second hypervisor where VM was started after it was stopped and started
>> back (this was done 7:45:43 AM): vdsm-2.log
>> - Engine log: engine.log
>>
>> Has someone noticed any similar issues and can provide some feedback /
>> help?
>>
>> Regards,
>> David


[ovirt-users] Re: Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)

2020-06-05 Thread David Sekne
Hello,

It looks like this was the problem indeed.

I have the migration policy set to post-copy (I thought this was relevant
only to VM migration, not disk migration) and had
libvirt-4.5.0-23.el7_7.6.x86_64 on the problematic hosts. Restarting
VDSM after the migration indeed resolved the issue.

For me the issue only appeared during a disk move.

I have since updated all of the hosts (libvirt-4.5.0-33.el7_8.1.x86_64) and
have not noticed the issue again.
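For anyone checking their own hosts, a quick way to compare the installed build against the one that worked for me (the version strings below are just my two builds, not an authoritative fixed-in cutoff):

```shell
installed="4.5.0-23.el7_7.6"   # e.g. from: rpm -q libvirt on an affected host
updated="4.5.0-33.el7_8.1"     # the build I no longer see the issue with
# sort -V orders version strings; if the installed build sorts first, it is older
older=$(printf '%s\n%s\n' "$installed" "$updated" | sort -V | head -n1)
[ "$older" = "$installed" ] && echo "host still on the older build"
```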

Thank you again.

Regards,

On Mon, Jun 1, 2020 at 6:53 PM Benny Zlotnik  wrote:

> Sorry for the late reply, but you may have hit this bug[1], I forgot about
> it.
> The bug happens when you live migrate a VM in post-copy mode, vdsm
> stops monitoring the VM's jobs.
> The root cause is an issue in libvirt, so it depends on which libvirt
> version you have
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1774230
>
> On Fri, May 29, 2020 at 3:54 PM David Sekne  wrote:
> >
> > Hello,
> >
> > I tried the live migration as well and it didn't help (it failed).
> >
> > The VM disks were in an illegal state, so I ended up restoring the VM from
> backup (it was the least complex solution for my case).
> >
> > Thank you both for the help.
> >
> > Regards,
> >
> > On Thu, May 28, 2020 at 5:01 PM Strahil Nikolov 
> wrote:
> >>
> >> I used  to have a similar issue and when I live migrated  (from 1  host
> to another)  it  automatically completed.
> >>
> >> Best Regards,
> >> Strahil Nikolov
> >>
> >> На 27 май 2020 г. 17:39:36 GMT+03:00, Benny Zlotnik <
> bzlot...@redhat.com> написа:
> >> >Sorry, by overloaded I meant in terms of I/O, because this is an
> >> >active layer merge, the active layer
> >> >(aabf3788-8e47-4f8b-84ad-a7eb311659fa) is merged into the base image
> >> >(a78c7505-a949-43f3-b3d0-9d17bdb41af5), before the VM switches to use
> >> >it as the active layer. So if there is constantly additional data
> >> >written to the current active layer, vdsm may have trouble finishing
> >> >the synchronization
> >> >
> >> >
> >> >On Wed, May 27, 2020 at 4:55 PM David Sekne 
> >> >wrote:
> >> >>
> >> >> Hello,
> >> >>
> >> >> Yes, no problem. XML is attached (I omitted the hostname and IP).
> >> >>
> >> >> Server is quite big (8 CPU / 32 Gb RAM / 1 Tb disk) yet not
> >> >overloaded. We have multiple servers with the same specs with no
> >> >issues.
> >> >>
> >> >> Regards,
> >> >>
> >> >> On Wed, May 27, 2020 at 2:28 PM Benny Zlotnik 
> >> >wrote:
> >> >>>
> >> >>> Can you share the VM's xml?
> >> >>> Can be obtained with `virsh -r dumpxml `
> >> >>> Is the VM overloaded? I suspect it has trouble converging
> >> >>>
> >> >>> taskcleaner only cleans up the database, I don't think it will help
> >> >here
> >> >>>


[ovirt-users] Re: Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)

2020-05-29 Thread David Sekne
Hello,

I tried the live migration as well and it didn't help (it failed).

The VM disks were in an illegal state, so I ended up restoring the VM from
backup (it was the least complex solution for my case).

Thank you both for the help.

Regards,

On Thu, May 28, 2020 at 5:01 PM Strahil Nikolov 
wrote:

> I used  to have a similar issue and when I live migrated  (from 1  host to
> another)  it  automatically completed.
>
> Best Regards,
> Strahil Nikolov
>
> На 27 май 2020 г. 17:39:36 GMT+03:00, Benny Zlotnik 
> написа:
> >Sorry, by overloaded I meant in terms of I/O, because this is an
> >active layer merge, the active layer
> >(aabf3788-8e47-4f8b-84ad-a7eb311659fa) is merged into the base image
> >(a78c7505-a949-43f3-b3d0-9d17bdb41af5), before the VM switches to use
> >it as the active layer. So if there is constantly additional data
> >written to the current active layer, vdsm may have trouble finishing
> >the synchronization
> >
> >
> >On Wed, May 27, 2020 at 4:55 PM David Sekne 
> >wrote:
> >>
> >> Hello,
> >>
> >> Yes, no problem. XML is attached (I omitted the hostname and IP).
> >>
> >> Server is quite big (8 CPU / 32 Gb RAM / 1 Tb disk) yet not
> >overloaded. We have multiple servers with the same specs with no
> >issues.
> >>
> >> Regards,
> >>
> >> On Wed, May 27, 2020 at 2:28 PM Benny Zlotnik 
> >wrote:
> >>>
> >>> Can you share the VM's xml?
> >>> Can be obtained with `virsh -r dumpxml `
> >>> Is the VM overloaded? I suspect it has trouble converging
> >>>
> >>> taskcleaner only cleans up the database, I don't think it will help
> >here
> >>>


[ovirt-users] Re: Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)

2020-05-28 Thread David Sekne
Hello,

Not sure IO could be the case. The underlying storage itself is brand new
(NVMe) connected with FC and is barely at 10 % capacity, with low IOPS and
practically zero latency. There are no IO limitations on the LUN itself. I
would also be able to see any IO problems on the other VMs (there are none in
this case).

For the time being I'm out of ideas on how to stop or complete the
task. Any suggestion is welcome.

Shutting down the VM in this state means that it probably won't start back
up (snapshots with disks in an illegal state). Worst case, I plan to restore
this VM from a backup tonight.

Regards,

On Wed, May 27, 2020 at 4:39 PM Benny Zlotnik  wrote:

> Sorry, by overloaded I meant in terms of I/O, because this is an
> active layer merge, the active layer
> (aabf3788-8e47-4f8b-84ad-a7eb311659fa) is merged into the base image
> (a78c7505-a949-43f3-b3d0-9d17bdb41af5), before the VM switches to use
> it as the active layer. So if there is constantly additional data
> written to the current active layer, vdsm may have trouble finishing
> the synchronization
>
>
> On Wed, May 27, 2020 at 4:55 PM David Sekne  wrote:
> >
> > Hello,
> >
> > Yes, no problem. XML is attached (I omitted the hostname and IP).
> >
> > Server is quite big (8 CPU / 32 Gb RAM / 1 Tb disk) yet not overloaded.
> We have multiple servers with the same specs with no issues.
> >
> > Regards,
> >
> > On Wed, May 27, 2020 at 2:28 PM Benny Zlotnik 
> wrote:
> >>
> >> Can you share the VM's xml?
> >> Can be obtained with `virsh -r dumpxml `
> >> Is the VM overloaded? I suspect it has trouble converging
> >>
> >> taskcleaner only cleans up the database, I don't think it will help here
> >>
>
>


[ovirt-users] Re: Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)

2020-05-27 Thread David Sekne
Hello,

Yes, no problem. The XML is attached (I omitted the hostname and IP).

The server is quite big (8 CPUs / 32 GB RAM / 1 TB disk) yet not overloaded. We
have multiple servers with the same specs with no issues.

Regards,

On Wed, May 27, 2020 at 2:28 PM Benny Zlotnik  wrote:

> Can you share the VM's xml?
> Can be obtained with `virsh -r dumpxml `
> Is the VM overloaded? I suspect it has trouble converging
>
> taskcleaner only cleans up the database, I don't think it will help here
>
>

[Attachment: libvirt domain XML for VM e113ff18-5687-4e03-8a27-b12c82ad6d6b.
The markup was stripped by the list archive, so only text content survives.
The recoverable details: the VM metadata still carries the stuck block job,

{"f694590a-1577-4dce-bf0c-3a8d74adf341": {"blockJobType": "commit", "topVolume": "aabf3788-8e47-4f8b-84ad-a7eb311659fa", "strategy": "commit", "jobID": "f694590a-1577-4dce-bf0c-3a8d74adf341", "disk": {"domainID": "5b396436-1edc-4e82-9224-0404d4a317dc", "imageID": "8a3a24a7-ade2-4bf2-a499-6662936996cd", "volumeID": "aabf3788-8e47-4f8b-84ad-a7eb311659fa", "poolID": "e8cb9baa-7fa8-11ea-bedc-de3258d1c5ed"}, "baseVolume": "a78c7505-a949-43f3-b3d0-9d17bdb41af5"}}

i.e. an active-layer commit of volume aabf3788-8e47-4f8b-84ad-a7eb311659fa
into base volume a78c7505-a949-43f3-b3d0-9d17bdb41af5 on disk sda (image
8a3a24a7-ade2-4bf2-a499-6662936996cd, storage domain
5b396436-1edc-4e82-9224-0404d4a317dc). The guest is a 32 GiB Skylake-Server
VM on oVirt Node 7-7.1908.0.el7.centos.]



[ovirt-users] Re: Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)

2020-05-27 Thread David Sekne
Hello,

Running `virsh blockjob <domain> sda --info` a couple of times shows 99 or
100 %. It looks like it is stuck / flapping for some reason.

Active Block Commit: [ 99 %]
Active Block Commit: [100 %]
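For watching the job in a loop, the percentage can be pulled out of that output; a small sketch (the sample line is hard-coded here, on a real host it would come from `virsh blockjob <domain> sda --info`):

```shell
out='Active Block Commit: [ 99 %]'
pct=$(echo "$out" | grep -oP '[0-9]+(?= %)')
echo "$pct"
```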

What would be the best approach to resolve this?

I see taskcleaner.sh can be used in cases like these?

Regards,


[ovirt-users] Re: Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)

2020-05-27 Thread David Sekne
Hello,

Thank you for the reply.

Unfortunately I can't see the task on any of the hosts:

vdsm-client Task getInfo taskID=f694590a-1577-4dce-bf0c-3a8d74adf341
vdsm-client: Command Task.getInfo with args {'taskID': 
'f694590a-1577-4dce-bf0c-3a8d74adf341'} failed:
(code=401, message=Task id unknown: (u'f694590a-1577-4dce-bf0c-3a8d74adf341',))

I can see it starting in the VDSM log on the host running the VM:

/var/log/vdsm/vdsm.log.2:2020-05-26 12:15:09,349+0200 INFO  (jsonrpc/6) 
[virt.vm] (vmId='e113ff18-5687-4e03-8a27-b12c82ad6d6b') Starting merge with 
jobUUID=u'f694590a-1577-4dce-bf0c-3a8d74adf341', original 
chain=a78c7505-a949-43f3-b3d0-9d17bdb41af5 < 
aabf3788-8e47-4f8b-84ad-a7eb311659fa (top), disk='sda', base='sda[1]', 
top=None, bandwidth=0, flags=12 (vm:5945) 

Also, running vdsm-client Host getAllTasks I don't see any running tasks (on
any host).

Am I missing something?

Regards,



[ovirt-users] Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)

2020-05-27 Thread David Sekne
Hello,

I'm running oVirt version 4.3.9.4-1.el7.

After a failed live storage migration, a VM got stuck with a snapshot.
Checking the engine logs, I can see that the snapshot removal task is
waiting for the Merge to complete and vice versa.

2020-05-26 18:34:04,826+02 INFO
 
[org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback]
(EE-ManagedThreadFactory-engineScheduled-Thread-70)
[90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Command
'RemoveSnapshotSingleDiskLive' (id: '60ce36c1-bf74-40a9-9fb0-7fcf7eb95f40')
waiting on child command id: 'f7d1de7b-9e87-47ba-9ba0-ee04301ba3b1'
type:'Merge' to complete
2020-05-26 18:34:04,827+02 INFO
 [org.ovirt.engine.core.bll.MergeCommandCallback]
(EE-ManagedThreadFactory-engineScheduled-Thread-70)
[90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Waiting on merge command to complete
(jobId = f694590a-1577-4dce-bf0c-3a8d74adf341)
2020-05-26 18:34:04,845+02 INFO
 [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback]
(EE-ManagedThreadFactory-engineScheduled-Thread-70)
[90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Command 'RemoveSnapshot' (id:
'47c9a847-5b4b-4256-9264-a760acde8275') waiting on child command id:
'60ce36c1-bf74-40a9-9fb0-7fcf7eb95f40' type:'RemoveSnapshotSingleDiskLive'
to complete
2020-05-26 18:34:14,277+02 INFO
 [org.ovirt.engine.core.vdsbroker.monitoring.VmJobsMonitoring]
(EE-ManagedThreadFactory-engineScheduled-Thread-96) [] VM Job
[f694590a-1577-4dce-bf0c-3a8d74adf341]: In progress (no change)

I cannot see any running tasks on the SPM (vdsm-client Host
getAllTasksInfo). I also cannot find the task ID in any of the other nodes'
logs.
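When chasing a job like this, grepping each host's vdsm.log for the job UUID is usually the quickest way to find where it lives. The jobId can be pulled from an engine.log line like the ones above; a sketch (the log line is abbreviated):

```shell
# On the hosts, something like:
#   grep f694590a-1577-4dce-bf0c-3a8d74adf341 /var/log/vdsm/vdsm.log*
line='Waiting on merge command to complete (jobId = f694590a-1577-4dce-bf0c-3a8d74adf341)'
job=$(echo "$line" | grep -oP '(?<=jobId = )[0-9a-f-]+')
echo "$job"
```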

I already tried restarting the Engine (didn't help).

To start with, I'm puzzled as to where this task is queued.

Any Ideas on how I could resolve this?

Thank you.
Regards,
David


[ovirt-users] Tasks stuck waiting on another after failed storage migration (yet not visible on SPM)

2020-05-26 Thread david . sekne
Hello,

I'm running oVirt version 4.3.9.4-1.el7. After a failed live storage migration,
a VM got stuck with a snapshot. Checking the engine logs, I can see that the
snapshot removal task is waiting for the Merge to complete and vice versa.

2020-05-26 18:34:04,826+02 INFO  
[org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommandCallback]
 (EE-ManagedThreadFactory-engineScheduled-Thread-70) 
[90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Command 'RemoveSnapshotSingleDiskLive' 
(id: '60ce36c1-bf74-40a9-9fb0-7fcf7eb95f40') waiting on child command id: 
'f7d1de7b-9e87-47ba-9ba0-ee04301ba3b1' type:'Merge' to complete
2020-05-26 18:34:04,827+02 INFO  
[org.ovirt.engine.core.bll.MergeCommandCallback] 
(EE-ManagedThreadFactory-engineScheduled-Thread-70) 
[90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Waiting on merge command to complete 
(jobId = f694590a-1577-4dce-bf0c-3a8d74adf341)
2020-05-26 18:34:04,845+02 INFO  
[org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] 
(EE-ManagedThreadFactory-engineScheduled-Thread-70) 
[90f428b0-9c4e-4ac0-8de6-1103fc13da9e] Command 'RemoveSnapshot' (id: 
'47c9a847-5b4b-4256-9264-a760acde8275') waiting on child command id: 
'60ce36c1-bf74-40a9-9fb0-7fcf7eb95f40' type:'RemoveSnapshotSingleDiskLive' to 
complete
2020-05-26 18:34:14,277+02 INFO  
[org.ovirt.engine.core.vdsbroker.monitoring.VmJobsMonitoring] 
(EE-ManagedThreadFactory-engineScheduled-Thread-96) [] VM Job 
[f694590a-1577-4dce-bf0c-3a8d74adf341]: In progress (no change)


I cannot see any running tasks on the SPM (vdsm-client Host getAllTasksInfo). I
also cannot find the task ID in any of the other nodes' logs.

I already tried restarting the Engine (didn't help).

To start with, I'm puzzled as to where the engine is getting the task info.

Any Ideas on how I could resolve this?

Thank you.
Regards,
David

