[ovirt-users] Re: remote-viewer Spice Problem on MacOS

2018-06-18 Thread J Brian Ismay

On 2018-06-18 08:11, Christophe TREFOIS wrote:

We potentially have the same issue.

In fact, for some VMs we now even get an error (even though it works from Windows).

On the hypervisor we see an “SSL error” in the libvirtd logs.

@Hendrik, do you perhaps see the same error?

140085012483328:error:140760FC:SSL routines:SSL23_GET_CLIENT_HELLO:unknown protocol:s23_srvr.c:640:
(process:375537): Spice-WARNING **: reds-stream.c:469:reds_stream_ssl_accept: SSL_accept failed, error=1



I get pretty much the same error, but I do have some other interesting 
data points. I just upgraded my production cluster from 4.1 to 4.2, and 
it then started showing this error.


On the other hand, my secondary cluster, which was installed fresh at 
4.2 with no upgrades, does not suffer from the same issue.


The error:
-

2018-06-18 11:14:02.175 defaults[57134:17925419] The domain/default pair of (.GlobalPreferences, AppleCollationOrder) does not exist
/Applications/RemoteViewer.app/Contents/MacOS/RemoteViewer: line 79: test: argument expected

(RemoteViewer-bin:57126): GSpice-WARNING **: Error receiving data: Connection reset by peer

(RemoteViewer-bin:57126): GSpice-WARNING **: main-1:0: SSL_connect: error:0005:lib(0):func(0):DH lib


--

This is with the latest OSX build of the RemoteViewer application, 
available at 
http://people.freedesktop.org/~teuf/spice-gtk-osx/dmg/0.5.7/RemoteViewer-0.5.7-1.dmg



--
Brian Ismay
CENIC Sr. Systems Administrator
jis...@cenic.org


Re: [ovirt-users] Ongoing "VM is not responding" and "ETL sampling errors".

2017-08-09 Thread J Brian Ismay

Some more information: I was doing more digging and found a possible 
correlation between the VM timeouts and the following sanlock messages:


Aug  9 07:47:57 hype-02 sanlock[1565]: 2017-08-09 07:47:57-0700 3509532 [48218]: s49 delta_renew read timeout 10 sec offset 0 /rhev/data-center/mnt/netapp:_tus__vms/b256c5ec-296a-46f0-8dec-55a1b90fd841/dom_md/ids
Aug  9 07:47:57 hype-02 sanlock[1565]: 2017-08-09 07:47:57-0700 3509532 [48218]: s49 renewal error -202 delta_length 10 last_success 3509501
Aug  9 07:47:58 hype-02 sanlock[1565]: 2017-08-09 07:47:58-0700 3509533 [51222]: s51 delta_renew read timeout 10 sec offset 0 /var/run/vdsm/storage/8b22a6bd-50d4-4a6e-9b89-8b2e00e7835c/000b7ce5-fa09-41ad-b0c6-0839d550c3df/5e5c81d1-0075-4a2c-baf5-13a08b9d025b
Aug  9 07:47:58 hype-02 sanlock[1565]: 2017-08-09 07:47:58-0700 3509533 [51222]: s51 renewal error -202 delta_length 10 last_success 3509502
Aug  9 07:47:58 hype-02 sanlock[1565]: 2017-08-09 07:47:58-0700 3509533 [48256]: s50 delta_renew read timeout 10 sec offset 0 /rhev/data-center/mnt/netapp:_hostedengine/8b22a6bd-50d4-4a6e-9b89-8b2e00e7835c/dom_md/ids
Aug  9 07:47:58 hype-02 sanlock[1565]: 2017-08-09 07:47:58-0700 3509533 [48256]: s50 renewal error -202 delta_length 10 last_success 3509502
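
The "delta_renew read timeout 10 sec" / "renewal error -202" pairs say 
that reading the lease area took longer than sanlock's 10-second I/O 
timeout, which points at NFS latency rather than at sanlock itself. As 
a rough check (a sketch under that assumption, not oVirt tooling), one 
can time page-cache-bypassing reads of the same file and watch for slow 
samples:

# Rough latency probe: time O_DIRECT 4 KiB reads at offset 0 of the ids
# file sanlock renews against (path copied from the log above). Samples
# approaching 10 s would match the renewal errors.
import mmap
import os
import time

IDS_PATH = ("/rhev/data-center/mnt/netapp:_tus__vms/"
            "b256c5ec-296a-46f0-8dec-55a1b90fd841/dom_md/ids")

buf = mmap.mmap(-1, 4096)                    # page-aligned buffer, as O_DIRECT needs
fd = os.open(IDS_PATH, os.O_RDONLY | os.O_DIRECT)
try:
    while True:
        t0 = time.monotonic()
        os.preadv(fd, [buf], 0)              # read at offset 0, like delta_renew
        elapsed = time.monotonic() - t0
        if elapsed > 1.0:                    # flag anything over one second
            print(f"{time.strftime('%H:%M:%S')} slow ids read: {elapsed:.2f}s")
        time.sleep(1)
finally:
    os.close(fd)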


Those messages are from one of my nodes; the other "not responding" 
events also match other sanlock renewal errors very closely, with the 
sanlock errors preceding the VM errors by a small margin. The other 
nodes have virtually identical logs.
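
To put numbers on that correlation, a small sketch like the following 
can help (the file locations are assumptions; run it with the node's 
syslog and a copy of engine.log). It lists every NotResponding 
transition that follows a sanlock renewal error within a minute, after 
shifting the PDT syslog stamps onto the engine's UTC clock:

# Hedged sketch: correlate sanlock renewal errors with engine.log
# NotResponding events. Paths and the 60 s window are assumptions.
import re
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=60)
YEAR = 2017                        # syslog lines carry no year
PDT_TO_UTC = timedelta(hours=7)    # node syslog is PDT, engine.log is UTC

sanlock_errors = []
with open("/var/log/messages") as f:           # node syslog (assumed path)
    for line in f:
        if "sanlock" in line and "renewal error" in line:
            ts = datetime.strptime(f"{YEAR} {line[:15]}", "%Y %b %d %H:%M:%S")
            sanlock_errors.append(ts + PDT_TO_UTC)

pattern = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*"
                     r"moved from 'Up' --> 'NotResponding'")
with open("engine.log") as f:                  # copied from the engine host
    for line in f:
        m = pattern.match(line)
        if not m:
            continue
        vm_ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        hits = [s for s in sanlock_errors if s <= vm_ts <= s + WINDOW]
        if hits:
            print(f"{vm_ts}: NotResponding, sanlock error {vm_ts - hits[-1]} earlier")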


--
Brian



[ovirt-users] Ongoing "VM is not responding" and "ETL sampling errors".

2017-08-09 Thread J Brian Ismay

Hello, list,

I have been slowly bringing up a 9-node cluster for the last few months.

All nodes are identical: dual 2-port 10G NICs, lots of memory and CPU. 
Storage is a NetApp filer accessed via NFS on a dedicated 10Gb 
dual-switch environment.


Generally everything is working fine, but ever since our last rebuild of 
the cluster, in preparation for a move into production status, we have 
been getting repeated errors in the HostedEngine console:


VM foo is not responding.
VM bar is not responding.
VM baz is not responding.

These errors happen on a fairly regular basis, and each episode 
generally involves multiple VMs hosted on different nodes. When the 
errors occur I also lose external connectivity to the VM in question, 
both via its service IP address and via the oVirt console. The actual 
outages generally appear to last 15-20 seconds, and then things recover 
and go back to normal.


We are also seeing the following error much more frequently:

ETL service sampling has encountered an error. Please consult the 
service log for more details.


I have attached snippets from the Engine's engine.log from this morning. 
If any other logs are needed to help with diagnosis, I can provide them.


--
Brian Ismay
SR. Systems Administrator
jis...@cenic.org


engine.log: NOTE, the system clock is in UTC and local time is PDT, so 
this occurred at 07:48 AM local time.
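
A quick sanity check of that offset (a trivial sketch; PDT is UTC-7):

# engine.log stamps are UTC; the first NotResponding event below is 14:48:37.
from datetime import datetime, timedelta, timezone

pdt = timezone(timedelta(hours=-7), "PDT")
first_event = datetime(2017, 8, 9, 14, 48, 37, tzinfo=timezone.utc)
print(first_event.astimezone(pdt).strftime("%H:%M %Z"))   # -> 07:48 PDT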


2017-08-09 14:48:37,237 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler6) [2cea1ef7] VM '69880324-2d2e-4a70-8071-4ae0f0ae342e'(vm1) moved from 'Up' --> 'NotResponding'
2017-08-09 14:48:37,277 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [2cea1ef7] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm1 is not responding.
2017-08-09 14:48:37,277 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler6) [2cea1ef7] VM '4471e3ee-9f69-4903-b68f-c1293aea047f'(vm2) moved from 'Up' --> 'NotResponding'
2017-08-09 14:48:37,282 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [2cea1ef7] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm2 is not responding.
2017-08-09 14:48:38,326 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler5) [cf129f7] VM '35fd4afa-12a1-4326-9db5-a86939a01fa8'(vm3) moved from 'Up' --> 'NotResponding'
2017-08-09 14:48:38,360 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler5) [cf129f7] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm3 is not responding.
2017-08-09 14:48:38,360 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler5) [cf129f7] VM 'd83e9633-3597-4046-95ee-2a166682b85e'(vm4) moved from 'Up' --> 'NotResponding'
2017-08-09 14:48:38,365 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler5) [cf129f7] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm4 is not responding.
2017-08-09 14:48:49,075 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler8) [3b1149ff] VM 'd41984d0-4418-4991-9af0-25593abac976'(vm5) moved from 'Up' --> 'NotResponding'
2017-08-09 14:48:49,130 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler8) [3b1149ff] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm5 is not responding.
2017-08-09 14:48:49,131 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler8) [3b1149ff] VM 'ed87b37d-5b79-4105-ba89-29a59361eb4e'(vm6) moved from 'Up' --> 'NotResponding'
2017-08-09 14:48:49,136 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler8) [3b1149ff] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm6 is not responding.
2017-08-09 14:48:52,221 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler7) [2973c87] VM '506980f4-6764-4cc6-bb20-c1956d8ed201'(vm7) moved from 'Up' --> 'NotResponding'
2017-08-09 14:48:52,226 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler7) [2973c87] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm7 is not responding.
2017-08-09 14:48:52,299 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler6) [2cea1ef7] VM '69880324-2d2e-4a70-8071-4ae0f0ae342e'(vm1) moved from 'NotResponding' --> 'Up'
2017-08-09 14:48:52,300 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler6) [2cea1ef7] VM '4471e3ee-9f69-4903-b68f-c1293aea047f'(vm2) moved from 'NotResponding' --> 'Up'
2017-08-09 14:48:53,373 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]