Re: [ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error

2015-10-23 Thread Simone Tiraboschi
On Fri, Oct 23, 2015 at 3:57 PM, Gianluca Cecchi 
wrote:

> On Thu, Oct 22, 2015 at 5:38 PM, Simone Tiraboschi 
> wrote:
>
>>
>>
>>> In case I want to setup a single host with self hosted engine, could I
>>> configure on hypervisor
>>> a) one NFS share for sh engine
>>> b) one NFS share for ISO DOMAIN
>>> c) a local filesystem to be used to then create a local POSIX compliant
>>> FS storage domain
>>> and work this way as a replacement of all-in-one?
>>>
>>
>> Yes, but c) is just a workaround; using another external NFS share would
>> help a lot if in the future you plan to add or migrate to a new server.
>>
>
> Why do you see this as a workaround, if I plan to have this for example as
> a personal devel infra with no other hypervisors?
> I expect better performance going directly local instead of adding
> the overhead of NFS on top
>

Just because you are using something that is not really shared as your
shared storage.


>
 Put the host in global maintenance (otherwise the engine VM will be
 restarted)
 Shutdown the engine VM
 Shutdown the host


>>>
> Please note that at some point I had to power off the hypervisor in the
> previous step, because it was stalled trying to stop two processes:
> "Watchdog Multiplexing Daemon"
> and
> "Shared Storage Lease Manager"
>
> https://drive.google.com/file/d/0BwoPbcrMv8mvTVoyNzhRNGpqN1U/view?usp=sharing
>
> It apparently managed to stop the "Watchdog Multiplexing Daemon" after
> a few minutes
>
> https://drive.google.com/file/d/0BwoPbcrMv8mvZExNNkw5LVBiXzA/view?usp=sharing
>
> But there was no way to stop the Shared Storage Lease Manager, and the screenshot above is
> from when I forced a power off yesterday, after global maintenance, a clean
> shutdown of the sh engine, and the stalled shutdown of the hypervisor.
>
>
>
>
>
>>
>
>>> Ok. And for starting all again, is this correct:
>>>
>>> a) power on hypervisor
>>> b) hosted-engine --set-maintenance --mode=none
>>>
>>> other steps required?
>>>
>>>
>> No, that's correct
>>
>
>
> Today after powering on hypervisor and waiting about 6 minutes I then ran:
>
>  [root@ovc71 ~]# ps -ef|grep qemu
> root      2104  1985  0 15:41 pts/0    00:00:00 grep --color=auto qemu
>
> --> as expected no VM in execution
>
> [root@ovc71 ~]# systemctl status vdsmd
> vdsmd.service - Virtual Desktop Server Manager
>Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
>Active: active (running) since Fri 2015-10-23 15:34:46 CEST; 3min 25s
> ago
>   Process: 1666 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh
> --pre-start (code=exited, status=0/SUCCESS)
>  Main PID: 1745 (vdsm)
>CGroup: /system.slice/vdsmd.service
>├─1745 /usr/bin/python /usr/share/vdsm/vdsm
>└─1900 /usr/libexec/ioprocess --read-pipe-fd 56 --write-pipe-fd
> 55 --max-threads 10 --...
>
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client
> step 1
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
> ask_user_info()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client
> step 1
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
> ask_user_info()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
> make_client_response()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client
> step 2
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
> parse_server_challenge()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
> ask_user_info()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
> make_client_response()
> Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client
> step 3
>
> --> I think it is expected that vdsmd starts anyway, even in global
> maintenance; is that correct?
>
> But then:
>
> [root@ovc71 ~]# hosted-engine --set-maintenance --mode=none
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
> "__main__", fname, loader, pkg_name)
>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
> exec code in run_globals
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
> line 73, in 
> if not maintenance.set_mode(sys.argv[1]):
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
> line 61, in set_mode
> value=m_global,
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
> line 259, in set_maintenance_mode
> str(value))
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
> line 201, in set_global_md_flag
> with broker.connection(self._retries, self._wait):
>   File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
> return self.gen.next()
>   File
> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 99, in connection
> 

Re: [ovirt-users] schedule a VM backup in ovirt 3.5

2015-10-23 Thread Patrick Russell
Using ovirt-node EL7 we’ve been able to live merge since 3.5.3 without any 
issues. 

-Patrick
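
For the listing/backup/removal questions quoted below, here is a minimal sketch
against the oVirt REST API (the /api base path used in 3.5); the engine URL,
credentials, VM ID and snapshot ID are placeholders:

# list snapshots of a VM
curl -k -u admin@internal:PASSWORD https://ENGINE_FQDN/api/vms/VM_ID/snapshots
# create a snapshot to back up from
curl -k -u admin@internal:PASSWORD -H "Content-Type: application/xml" \
  -d "<snapshot><description>backup</description></snapshot>" \
  https://ENGINE_FQDN/api/vms/VM_ID/snapshots
# remove the snapshot afterwards
curl -k -u admin@internal:PASSWORD -X DELETE \
  https://ENGINE_FQDN/api/vms/VM_ID/snapshots/SNAPSHOT_ID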

> On Oct 23, 2015, at 12:24 AM, Christopher Cox  wrote:
> 
> On 10/22/2015 10:46 PM, Indunil Jayasooriya wrote:
> ...
>> 
>> Hmm,
>> 
>> How to list the snapshot?
>> How to backup the VM with a snapshot?
>> Finally, how to remove this snapshot?
>> 
>> 
>> Then I think it will be done. Yesterday I tried a lot, but with no success.
>> 
>> Hope to hear from you.
> 
> Not exactly "help" but AFAIK, even with 3.5, there is no live merging of 
> snapshots so they can't be deleted unless the VM is down.  I know for large 
> snapshots that have been around for a while, removing them can take some time
> too.
> 
> Others feel free to chime in...
> 
> 
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error

2015-10-23 Thread Gianluca Cecchi
On Fri, Oct 23, 2015 at 5:05 PM, Simone Tiraboschi 
wrote:

>
>>
> OK, can you please try again the whole reboot procedure just to ensure
> that it was just a temporary NFS glitch?
>


It seems reproducible.

This time I was able to shut down the hypervisor without a manual power off.
The only strange thing is that I ran

shutdown -h now

and actually, at some point (I was able to see that the watchdog
stopped...), the VM booted again?

Related lines in messages:
Oct 23 17:33:32 ovc71 systemd: Unmounting RPC Pipe File System...
Oct 23 17:33:32 ovc71 systemd: Stopping Session 11 of user root.
Oct 23 17:33:33 ovc71 systemd: Stopped Session 11 of user root.
Oct 23 17:33:33 ovc71 systemd: Stopping user-0.slice.
Oct 23 17:33:33 ovc71 systemd: Removed slice user-0.slice.
Oct 23 17:33:33 ovc71 systemd: Stopping vdsm-dhclient.slice.
Oct 23 17:33:33 ovc71 systemd: Removed slice vdsm-dhclient.slice.
Oct 23 17:33:33 ovc71 systemd: Stopping vdsm.slice.
Oct 23 17:33:33 ovc71 systemd: Removed slice vdsm.slice.
Oct 23 17:33:33 ovc71 systemd: Stopping Sound Card.
Oct 23 17:33:33 ovc71 systemd: Stopped target Sound Card.
Oct 23 17:33:33 ovc71 systemd: Stopping LVM2 PV scan on device 8:2...
Oct 23 17:33:33 ovc71 systemd: Stopping LVM2 PV scan on device 8:16...
Oct 23 17:33:33 ovc71 systemd: Stopping Dump dmesg to /var/log/dmesg...
Oct 23 17:33:33 ovc71 systemd: Stopped Dump dmesg to /var/log/dmesg.
Oct 23 17:33:33 ovc71 systemd: Stopping Watchdog Multiplexing Daemon...
Oct 23 17:33:33 ovc71 systemd: Stopping Multi-User System.
Oct 23 17:33:33 ovc71 systemd: Stopped target Multi-User System.
Oct 23 17:33:33 ovc71 systemd: Stopping ABRT kernel log watcher...
Oct 23 17:33:33 ovc71 systemd: Stopping Command Scheduler...
Oct 23 17:33:33 ovc71 rsyslogd: [origin software="rsyslogd"
swVersion="7.4.7" x-pid="690" x-info="http://www.rsyslog.com;] exiting on
signal 15.
Oct 23 17:36:24 ovc71 rsyslogd: [origin software="rsyslogd"
swVersion="7.4.7" x-pid="697" x-info="http://www.rsyslog.com;] start
Oct 23 17:36:21 ovc71 journal: Runtime journal is using 8.0M (max 500.0M,
leaving 750.0M of free 4.8G, current limit 500.0M).
Oct 23 17:36:21 ovc71 kernel: Initializing cgroup subsys cpuset


Coming back up, for the oVirt processes I see:

[root@ovc71 ~]# systemctl status ovirt-ha-broker
ovirt-ha-broker.service - oVirt Hosted Engine High Availability
Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled)
   Active: inactive (dead) since Fri 2015-10-23 17:36:25 CEST; 31s ago
  Process: 849 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-broker stop
(code=exited, status=0/SUCCESS)
  Process: 723 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start
(code=exited, status=0/SUCCESS)
 Main PID: 844 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/ovirt-ha-broker.service

Oct 23 17:36:24 ovc71.localdomain.local systemd-ovirt-ha-broker[723]:
Starting ovirt-ha-broker: [...
Oct 23 17:36:24 ovc71.localdomain.local systemd[1]: Started oVirt Hosted
Engine High Availabili...r.
Oct 23 17:36:25 ovc71.localdomain.local systemd-ovirt-ha-broker[849]:
Stopping ovirt-ha-broker: [...
Hint: Some lines were ellipsized, use -l to show in full.

And
[root@ovc71 ~]# systemctl status nfs-server
nfs-server.service - NFS server and services
   Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled)
   Active: active (exited) since Fri 2015-10-23 17:36:27 CEST; 1min 9s ago
  Process: 1123 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited,
status=0/SUCCESS)
  Process: 1113 ExecStartPre=/usr/sbin/exportfs -r (code=exited,
status=0/SUCCESS)
 Main PID: 1123 (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/nfs-server.service

Oct 23 17:36:27 ovc71.localdomain.local systemd[1]: Starting NFS server and
services...
Oct 23 17:36:27 ovc71.localdomain.local systemd[1]: Started NFS server and
services.

So it seems that the broker tries to start and fails (17:36:25) before the NFS
server start phase completes (17:36:27)...?

Again if I then manually start ha-broker and ha-agent, they start ok and
I'm able to become operational again with the sh engine up

The systemd unit file for the broker is this:

[Unit]
Description=oVirt Hosted Engine High Availability Communications Broker

[Service]
Type=forking
EnvironmentFile=-/etc/sysconfig/ovirt-ha-broker
ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start
ExecStop=/usr/lib/systemd/systemd-ovirt-ha-broker stop

[Install]
WantedBy=multi-user.target

Probably inside the [Unit] section I should add
After=nfs-server.service

but this should hold only for a sh engine configured with NFS, so should it be
done at install/setup time?

If you want, I can make this change in my environment and verify...
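
For reference, a minimal way to try that without editing the packaged unit file
would be a systemd drop-in (a sketch only; the drop-in file name is arbitrary and
the same would apply to ovirt-ha-agent.service):

mkdir -p /etc/systemd/system/ovirt-ha-broker.service.d
cat > /etc/systemd/system/ovirt-ha-broker.service.d/nfs.conf <<'EOF'
[Unit]
# only meaningful when the hosted-engine storage domain is served by the local nfs-server
After=nfs-server.service
EOF
systemctl daemon-reload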



>
> The issue was here:  --spice-host-subject="C=EN, L=Test, O=Test, CN=Test"
> This one was just the temporary subject used by hosted-engine-setup during
> the bootstrap sequence when your engine was still to come.
> At the end that cert got replaced by the engine CA signed ones and

Re: [ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error

2015-10-23 Thread Simone Tiraboschi
On Fri, Oct 23, 2015 at 5:55 PM, Gianluca Cecchi 
wrote:

> On Fri, Oct 23, 2015 at 5:05 PM, Simone Tiraboschi 
> wrote:
>
>>
>>>
>> OK, can you please try again the whole reboot procedure just to ensure
>> that it was just a temporary NFS glitch?
>>
>
>
> It seems reproducible.
>
> This time I was able to shut down the hypervisor without a manual power off.
> The only strange thing is that I ran
>
> shutdown -h now
>
> and actually, at some point (I was able to see that the watchdog
> stopped...), the VM booted again?
>
> Related lines in messages:
> Oct 23 17:33:32 ovc71 systemd: Unmounting RPC Pipe File System...
> Oct 23 17:33:32 ovc71 systemd: Stopping Session 11 of user root.
> Oct 23 17:33:33 ovc71 systemd: Stopped Session 11 of user root.
> Oct 23 17:33:33 ovc71 systemd: Stopping user-0.slice.
> Oct 23 17:33:33 ovc71 systemd: Removed slice user-0.slice.
> Oct 23 17:33:33 ovc71 systemd: Stopping vdsm-dhclient.slice.
> Oct 23 17:33:33 ovc71 systemd: Removed slice vdsm-dhclient.slice.
> Oct 23 17:33:33 ovc71 systemd: Stopping vdsm.slice.
> Oct 23 17:33:33 ovc71 systemd: Removed slice vdsm.slice.
> Oct 23 17:33:33 ovc71 systemd: Stopping Sound Card.
> Oct 23 17:33:33 ovc71 systemd: Stopped target Sound Card.
> Oct 23 17:33:33 ovc71 systemd: Stopping LVM2 PV scan on device 8:2...
> Oct 23 17:33:33 ovc71 systemd: Stopping LVM2 PV scan on device 8:16...
> Oct 23 17:33:33 ovc71 systemd: Stopping Dump dmesg to /var/log/dmesg...
> Oct 23 17:33:33 ovc71 systemd: Stopped Dump dmesg to /var/log/dmesg.
> Oct 23 17:33:33 ovc71 systemd: Stopping Watchdog Multiplexing Daemon...
> Oct 23 17:33:33 ovc71 systemd: Stopping Multi-User System.
> Oct 23 17:33:33 ovc71 systemd: Stopped target Multi-User System.
> Oct 23 17:33:33 ovc71 systemd: Stopping ABRT kernel log watcher...
> Oct 23 17:33:33 ovc71 systemd: Stopping Command Scheduler...
> Oct 23 17:33:33 ovc71 rsyslogd: [origin software="rsyslogd"
> swVersion="7.4.7" x-pid="690" x-info="http://www.rsyslog.com;] exiting on
> signal 15.
> Oct 23 17:36:24 ovc71 rsyslogd: [origin software="rsyslogd"
> swVersion="7.4.7" x-pid="697" x-info="http://www.rsyslog.com;] start
> Oct 23 17:36:21 ovc71 journal: Runtime journal is using 8.0M (max 500.0M,
> leaving 750.0M of free 4.8G, current limit 500.0M).
> Oct 23 17:36:21 ovc71 kernel: Initializing cgroup subsys cpuset
>
>
> Coming back up, for the oVirt processes I see:
>
> [root@ovc71 ~]# systemctl status ovirt-ha-broker
> ovirt-ha-broker.service - oVirt Hosted Engine High Availability
> Communications Broker
>Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service;
> enabled)
>Active: inactive (dead) since Fri 2015-10-23 17:36:25 CEST; 31s ago
>   Process: 849 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-broker stop
> (code=exited, status=0/SUCCESS)
>   Process: 723 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start
> (code=exited, status=0/SUCCESS)
>  Main PID: 844 (code=exited, status=0/SUCCESS)
>CGroup: /system.slice/ovirt-ha-broker.service
>
> Oct 23 17:36:24 ovc71.localdomain.local systemd-ovirt-ha-broker[723]:
> Starting ovirt-ha-broker: [...
> Oct 23 17:36:24 ovc71.localdomain.local systemd[1]: Started oVirt Hosted
> Engine High Availabili...r.
> Oct 23 17:36:25 ovc71.localdomain.local systemd-ovirt-ha-broker[849]:
> Stopping ovirt-ha-broker: [...
> Hint: Some lines were ellipsized, use -l to show in full.
>
> And
> [root@ovc71 ~]# systemctl status nfs-server
> nfs-server.service - NFS server and services
>Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled)
>Active: active (exited) since Fri 2015-10-23 17:36:27 CEST; 1min 9s ago
>   Process: 1123 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited,
> status=0/SUCCESS)
>   Process: 1113 ExecStartPre=/usr/sbin/exportfs -r (code=exited,
> status=0/SUCCESS)
>  Main PID: 1123 (code=exited, status=0/SUCCESS)
>CGroup: /system.slice/nfs-server.service
>
> Oct 23 17:36:27 ovc71.localdomain.local systemd[1]: Starting NFS server
> and services...
> Oct 23 17:36:27 ovc71.localdomain.local systemd[1]: Started NFS server and
> services.
>
> So it seems that the broker tries to start and fails (17:36:25) before the NFS
> server start phase completes (17:36:27)...?
>
> Again if I then manually start ha-broker and ha-agent, they start ok and
> I'm able to become operational again with the sh engine up
>
> The systemd unit file for the broker is this:
>
> [Unit]
> Description=oVirt Hosted Engine High Availability Communications Broker
>
> [Service]
> Type=forking
> EnvironmentFile=-/etc/sysconfig/ovirt-ha-broker
> ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start
> ExecStop=/usr/lib/systemd/systemd-ovirt-ha-broker stop
>
> [Install]
> WantedBy=multi-user.target
>
> Probably inside the [Unit] section I should add
> After=nfs-server.service
>
>
Ok, I understood.
You are right: the broker was failing because the NFS storage was not ready,
because it was served in loopback and there isn't any explicit service

Re: [ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error

2015-10-23 Thread Gianluca Cecchi
On Thu, Oct 22, 2015 at 5:38 PM, Simone Tiraboschi 
wrote:

>
>
>> In case I want to setup a single host with self hosted engine, could I
>> configure on hypervisor
>> a) one NFS share for sh engine
>> b) one NFS share for ISO DOMAIN
>> c) a local filesystem to be used to then create a local POSIX compliant FS
>> storage domain
>> and work this way as a replacement of all-in-one?
>>
>
> Yes, but c) is just a workaround; using another external NFS share would
> help a lot if in the future you plan to add or migrate to a new server.
>

Why do you see this as a workaround, if I plan to have this for example as
a personal devel infra with no other hypervisors?
I expect better performance going directly local instead of adding
the overhead of NFS on top



>>

>>> Put the host in global maintenance (otherwise the engine VM will be
>>> restarted)
>>> Shutdown the engine VM
>>> Shutdown the host
>>>
>>>
>>
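For reference, those steps as commands (a minimal sketch using the hosted-engine
CLI that already appears elsewhere in this thread):

hosted-engine --set-maintenance --mode=global
hosted-engine --vm-shutdown     # or shut the engine VM down from inside the guest
shutdown -h now                 # on the host
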
Please note that at some point I had to power off the hypervisor in the
previous step, because it was stalled trying to stop two processes:
"Watchdog Multiplexing Daemon"
and
"Shared Storage Lease Manager"
https://drive.google.com/file/d/0BwoPbcrMv8mvTVoyNzhRNGpqN1U/view?usp=sharing

It apparently managed to stop the "Watchdog Multiplexing Daemon" after
a few minutes
https://drive.google.com/file/d/0BwoPbcrMv8mvZExNNkw5LVBiXzA/view?usp=sharing

But there was no way to stop the Shared Storage Lease Manager, and the screenshot above is
from when I forced a power off yesterday, after global maintenance, a clean
shutdown of the sh engine, and the stalled shutdown of the hypervisor.





>

>> Ok. And for starting all again, is this correct:
>>
>> a) power on hypervisor
>> b) hosted-engine --set-maintenance --mode=none
>>
>> other steps required?
>>
>>
> No, that's correct
>


Today after powering on hypervisor and waiting about 6 minutes I then ran:

 [root@ovc71 ~]# ps -ef|grep qemu
root      2104  1985  0 15:41 pts/0    00:00:00 grep --color=auto qemu

--> as expected no VM in execution

[root@ovc71 ~]# systemctl status vdsmd
vdsmd.service - Virtual Desktop Server Manager
   Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
   Active: active (running) since Fri 2015-10-23 15:34:46 CEST; 3min 25s ago
  Process: 1666 ExecStartPre=/usr/libexec/vdsm/vdsmd_init_common.sh
--pre-start (code=exited, status=0/SUCCESS)
 Main PID: 1745 (vdsm)
   CGroup: /system.slice/vdsmd.service
   ├─1745 /usr/bin/python /usr/share/vdsm/vdsm
   └─1900 /usr/libexec/ioprocess --read-pipe-fd 56 --write-pipe-fd
55 --max-threads 10 --...

Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client
step 1
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client
step 1
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
make_client_response()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client
step 2
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
parse_server_challenge()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
ask_user_info()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5
make_client_response()
Oct 23 15:34:46 ovc71.localdomain.local python[1745]: DIGEST-MD5 client
step 3

--> I think it is expected that vdsmd starts anyway, even in global
maintenance; is that correct?
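
A quick way to check from the host (a sketch; hosted-engine --vm-status only
answers once the HA broker/agent are up, and the output format varies by version):

hosted-engine --vm-status                        # shows whether global maintenance is set
systemctl status ovirt-ha-agent ovirt-ha-broker  # confirm the HA services themselves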

But then:

[root@ovc71 ~]# hosted-engine --set-maintenance --mode=none
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
line 73, in 
if not maintenance.set_mode(sys.argv[1]):
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_setup/set_maintenance.py",
line 61, in set_mode
value=m_global,
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
line 259, in set_maintenance_mode
str(value))
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py",
line 201, in set_global_md_flag
with broker.connection(self._retries, self._wait):
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
return self.gen.next()
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 99, in connection
self.connect(retries, wait)
  File
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
line 78, in connect
raise BrokerConnectionError(error_msg)
ovirt_hosted_engine_ha.lib.exceptions.BrokerConnectionError: Failed to
connect to broker, the number of errors has exceeded the limit (1)

What to do next?
___
Users mailing list

Re: [ovirt-users] oVirt monitoring with libirt-snmp

2015-10-23 Thread Kevin COUSIN
Hi Dan, 

I want to monitor my oVirt infrastructure with Zabbix. I found a Zabbix
template, but it needs to modify libvirtd.conf to enable libvirt-snmp to
access VM information (https://github.com/jensdepuydt/zabbix-ovirt).
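
For reference, the approach suggested in the reply quoted below, adding a SASL
user instead of relaxing auth_unix_rw, would look roughly like this (the username
is a placeholder and the database path is the usual libvirt default):

saslpasswd2 -a libvirt monitoring             # prompts for a password for the new user
sasldblistusers2 -f /etc/libvirt/passwd.db    # verify the user was created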



   Kevin

- Mail original -
> De: "Dan Kenigsberg" 
> À: "Roman Mohr" 
> Cc: "Kevin COUSIN" , "users" 
> Envoyé: Jeudi 22 Octobre 2015 09:39:49
> Objet: Re: [ovirt-users] oVirt monitoring with libirt-snmp

> On Wed, Oct 21, 2015 at 05:48:17PM +0200, Roman Mohr wrote:
>> Hi Kevin,
>> 
>> 
>> you should not change auth_unix_rw=sasl. I never used libvirt-snmp, but it
>> is safe to create another user like this:
>> 
>> > saslpasswd2 -a libvirt 
>> 
>> I did that several times on my hosts.
>> 
>> 
>> On Mon, Oct 19, 2015 at 4:43 PM, Kevin COUSIN 
>> wrote:
>> 
>> > Hi list,
>> >
>> > Is it safe to edit /etc/libvirt/libvirtd.conf? I need to change
>> > auth_unix_rw="sasl" because I want to allow libvirt-snmp to access VM
>> > information. Perhaps I need to create a SASL user instead?
> 
> Would you share your use case for libvirt-snmp?
> 
> The supposed danger is that whatever uses snmp may modify VM state under
> the feet of oVirt and surprise it.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error

2015-10-23 Thread Gianluca Cecchi
On Fri, Oct 23, 2015 at 6:10 PM, Simone Tiraboschi 
wrote:

>
>
>>
>> Probably inside the [unit] section I should add
>> After=nfs-server.service
>>
>>
> Ok, I understood.
> You are right: the broker was failing because the NFS storage was not ready,
> because it was served in loopback and there isn't any explicit service
> dependency on that.
>
> We are not imposing it because an NFS shared domain is generally
> thought to be served from an external system, while a loopback NFS is just
> a degenerate case.
> Simply fix it manually.
>
>

OK, understood. Done, and the fix works as expected.


> it should be:
> remote-viewer --spice-ca-file=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
> spice://ovc71.localdomain.local?tls-port=5900 --spice-host-subject="C=US,
> O=localdomain.local, CN=ovc71.localdomain.local"
>
>
>
same error...

[root@ovc71 ~]# remote-viewer
--spice-ca-file=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
spice://ovc71.localdomain.local?tls-port=5900 --spice-host-subject="C=US,
O=localdomain.local, CN=ovc71.localdomain.local"

** (remote-viewer:4788): WARNING **: Couldn't connect to accessibility bus:
Failed to connect to socket /tmp/dbus-Gb5xXSKiKK: Connection refused
GLib-GIO-Message: Using the 'memory' GSettings backend.  Your settings will
not be saved or shared with other applications.
(/usr/bin/remote-viewer:4788): Spice-Warning **:
ssl_verify.c:492:openssl_verify: ssl: subject 'C=US, O=localdomain.local,
CN=ovc71.localdomain.local' verification failed
(/usr/bin/remote-viewer:4788): Spice-Warning **:
ssl_verify.c:494:openssl_verify: ssl: verification failed

(remote-viewer:4788): GSpice-WARNING **: main-1:0: SSL_connect:
error:0001:lib(0):func(0):reason(1)


even if I copy the /etc/pki/vdsm/libvirt-spice/ca-cert.pem from hypervisor
to my pc in /tmp and run:

[g.cecchi@ope46 ~]$ remote-viewer --spice-ca-file=/tmp/ca-cert.pem
spice://ovc71.localdomain.local?tls-port=5900 --spice-host-subject="C=US,
O=localdomain.local,
CN=ovc71.localdomain.local"(/usr/bin/remote-viewer:8915): Spice-Warning **:
ssl_verify.c:493:openssl_verify: ssl: subject 'C=US, O=localdomain.local,
CN=ovc71.localdomain.local' verification failed
(/usr/bin/remote-viewer:8915): Spice-Warning **:
ssl_verify.c:495:openssl_verify: ssl: verification failed

(remote-viewer:8915): GSpice-WARNING **: main-1:0: SSL_connect:
error:0001:lib(0):func(0):reason(1)
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Error while executing action New SAN Storage Domain: Cannot zero out volume

2015-10-23 Thread Devin A. Bougie
Every time I try to create a Data / iSCSI Storage Domain, I receive an "Error 
while executing action New SAN Storage Domain: Cannot zero out volume" error.

iscsid does log in to the node, and the volumes appear to have been created.
However, I cannot use it to create or import a Data / iSCSI storage domain.

[root@lnx84 ~]# iscsiadm -m node
#.#.#.#:3260,1 iqn.2015-10.N.N.N.lnx88:lnx88.target1

[root@lnx84 ~]# iscsiadm -m session
tcp: [1] #.#.#.#:3260,1 iqn.2015-10.N.N.N.lnx88:lnx88.target1 (non-flash)

[root@lnx84 ~]# pvscan
  PV /dev/mapper/1IET_00010001   VG f73c8720-77c3-42a6-8a29-9677db54bac6   lvm2 
[547.62 GiB / 543.75 GiB free]
...
[root@lnx84 ~]# lvscan
  inactive  '/dev/f73c8720-77c3-42a6-8a29-9677db54bac6/metadata' 
[512.00 MiB] inherit
  inactive  '/dev/f73c8720-77c3-42a6-8a29-9677db54bac6/outbox' [128.00 
MiB] inherit
  inactive  '/dev/f73c8720-77c3-42a6-8a29-9677db54bac6/leases' [2.00 
GiB] inherit
  inactive  '/dev/f73c8720-77c3-42a6-8a29-9677db54bac6/ids' [128.00 
MiB] inherit
  inactive  '/dev/f73c8720-77c3-42a6-8a29-9677db54bac6/inbox' [128.00 
MiB] inherit
  inactive  '/dev/f73c8720-77c3-42a6-8a29-9677db54bac6/master' [1.00 
GiB] inherit
...
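
A couple of checks that might narrow this down (a sketch only; the multipath
device name is taken from the pvscan output above, the vdsm.log path is the
default one, and the exact log text may differ):

grep -i 'zero' /var/log/vdsm/vdsm.log | tail -n 20    # host-side error behind the engine message
dd if=/dev/mapper/1IET_00010001 of=/dev/null bs=1M count=16 iflag=direct   # direct-I/O read test against the LUN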

Any help would be greatly appreciated.

Many thanks,
Devin

Here are the relevant lines from engine.log:
--
2015-10-23 16:04:56,925 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] 
(ajp--127.0.0.1-8702-8) START, GetDeviceListVDSCommand(HostName = lnx84, HostId 
= a650e161-75f6-4916-bc18-96044bf3fc26, storageType=ISCSI), log id: 44a64578
2015-10-23 16:04:57,681 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.GetDeviceListVDSCommand] 
(ajp--127.0.0.1-8702-8) FINISH, GetDeviceListVDSCommand, return: [LUNs 
[id=1IET_00010001, physicalVolumeId=wpmBIM-tgc1-yKtH-XSwc-40wZ-Kn49-btwBFn, 
volumeGroupId=8gZEwa-3x5m-TiqA-uEPX-gC04-wkzx-PlaQDu, serial=SIET_VIRTUAL-DISK, 
lunMapping=1, vendorId=IET, productId=VIRTUAL-DISK, _lunConnections=[{ id: 
null, connection: #.#.#.#, iqn: iqn.2015-10.N.N.N.lnx88:lnx88.target1, vfsType: 
null, mountOptions: null, nfsVersion: null, nfsRetrans: null, nfsTimeo: null 
};], deviceSize=547, vendorName=IET, pathsDictionary={sdi=true}, lunType=ISCSI, 
status=Used, diskId=null, diskAlias=null, storageDomainId=null, 
storageDomainName=null]], log id: 44a64578
2015-10-23 16:05:06,474 INFO  
[org.ovirt.engine.core.bll.storage.AddSANStorageDomainCommand] 
(ajp--127.0.0.1-8702-8) [53dd8c98] Running command: AddSANStorageDomainCommand 
internal: false. Entities affected :  ID: aaa0----123456789aaa 
Type: SystemAction group CREATE_STORAGE_DOMAIN with role type ADMIN
2015-10-23 16:05:06,488 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateVGVDSCommand] 
(ajp--127.0.0.1-8702-8) [53dd8c98] START, CreateVGVDSCommand(HostName = lnx84, 
HostId = a650e161-75f6-4916-bc18-96044bf3fc26, 
storageDomainId=cb5b0e2e-d68d-462a-b8fa-8894a6e0ed19, 
deviceList=[1IET_00010001], force=true), log id: 12acc23b
2015-10-23 16:05:07,379 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateVGVDSCommand] 
(ajp--127.0.0.1-8702-8) [53dd8c98] FINISH, CreateVGVDSCommand, return: 
dDaCCO-PLDu-S2nz-yOjM-qpOW-PGaa-ecpJ8P, log id: 12acc23b
2015-10-23 16:05:07,384 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] 
(ajp--127.0.0.1-8702-8) [53dd8c98] START, 
CreateStorageDomainVDSCommand(HostName = lnx84, HostId = 
a650e161-75f6-4916-bc18-96044bf3fc26, storageDomain=StorageDomainStatic[lnx88, 
cb5b0e2e-d68d-462a-b8fa-8894a6e0ed19], 
args=dDaCCO-PLDu-S2nz-yOjM-qpOW-PGaa-ecpJ8P), log id: cc93ec6
2015-10-23 16:05:10,356 ERROR 
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] 
(ajp--127.0.0.1-8702-8) [53dd8c98] Failed in CreateStorageDomainVDS method
2015-10-23 16:05:10,360 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] 
(ajp--127.0.0.1-8702-8) [53dd8c98] Command 
org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand return 
value 
 StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=374, mMessage=Cannot 
zero out volume: ('/dev/cb5b0e2e-d68d-462a-b8fa-8894a6e0ed19/metadata',)]]
2015-10-23 16:05:10,367 INFO  
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] 
(ajp--127.0.0.1-8702-8) [53dd8c98] HostName = lnx84
2015-10-23 16:05:10,370 ERROR 
[org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] 
(ajp--127.0.0.1-8702-8) [53dd8c98] Command 
CreateStorageDomainVDSCommand(HostName = lnx84, HostId = 
a650e161-75f6-4916-bc18-96044bf3fc26, storageDomain=StorageDomainStatic[lnx88, 
cb5b0e2e-d68d-462a-b8fa-8894a6e0ed19], 
args=dDaCCO-PLDu-S2nz-yOjM-qpOW-PGaa-ecpJ8P) execution failed. Exception: 
VDSErrorException: VDSGenericException: VDSErrorException: Failed to 
CreateStorageDomainVDS, error = Cannot zero out volume: 
('/dev/cb5b0e2e-d68d-462a-b8fa-8894a6e0ed19/metadata',), code = 374
2015-10-23 16:05:10,381 INFO  

Re: [ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error

2015-10-23 Thread Simone Tiraboschi
On Fri, Oct 23, 2015 at 4:56 PM, Gianluca Cecchi 
wrote:

> On Fri, Oct 23, 2015 at 4:42 PM, Simone Tiraboschi 
> wrote:
>
>>
>>
>>
>> Are ovirt-ha-agent and ovirt-ha-broker up and running?
>> Can you please try to restart them via systemd?
>>
>>
>> In the meantime I found in the logs that they failed to start.
>
> I found in the broker log the message
> Thread-1730::ERROR::2015-10-22
> 17:31:47,016::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
> Error handling request, data: 'set-storage-domain FilesystemBackend
> dom_type=nfs3 sd_uuid=f53854cd-8767-4011-9564-36dc36e0a5d1'
> Traceback (most recent call last):
> ...
> BackendFailureException: path to storage domain
> f53854cd-8767-4011-9564-36dc36e0a5d1 not found in /rhev/data-center/mnt
>
> so probably the NFS part was not in place yet when the broker attempted to
> start?
> I saw that I now actually had:
>
> [root@ovc71 ovirt-hosted-engine-ha]# ll
> /rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN
> total 0
> -rwxr-xr-x. 1 vdsm kvm  0 Oct 23 16:46 __DIRECT_IO_TEST__
> drwxr-xr-x. 5 vdsm kvm 47 Oct 22 15:49 f53854cd-8767-4011-9564-36dc36e0a5d1
>
> and I was able to run
>
> systemctl start ovirt-ha-broker.service
> and verify it correctly started.
> and the same for
> systemctl start ovirt-ha-agent
>
> after a couple of minutes the sh engine VM was powered on and I was able
> to access the web admin portal.
>
>
OK, can you please try again the whole reboot procedure just to ensure that
it was just a temporary NFS glitch?


> But if I try to connect to its console with
>
> [root@ovc71 ovirt-hosted-engine-ha]# hosted-engine --add-console-password
> Enter password:
> code = 0
> message = 'Done'
>
> and then
> # remote-viewer --spice-ca-file=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
> spice://localhost?tls-port=5900 --spice-host-subject="C=EN, L=Test, O=Test,
> CN=Test"
>
> ** (remote-viewer:7173): WARNING **: Couldn't connect to accessibility
> bus: Failed to connect to socket /tmp/dbus-Gb5xXSKiKK: Connection refused
> GLib-GIO-Message: Using the 'memory' GSettings backend.  Your settings
> will not be saved or shared with other applications.
> (/usr/bin/remote-viewer:7173): Spice-Warning **:
> ssl_verify.c:492:openssl_verify: ssl: subject 'C=EN, L=Test, O=Test,
> CN=Test' verification failed
> (/usr/bin/remote-viewer:7173): Spice-Warning **:
> ssl_verify.c:494:openssl_verify: ssl: verification failed
>
>
The issue was here:  --spice-host-subject="C=EN, L=Test, O=Test, CN=Test"
This one was just the temporary subject used by hosted-engine-setup during
the bootstrap sequence when your engine was still to come.
At the end that cert got replaced by the engine CA signed ones, so you
have to substitute that subject to match the one you used during your setup.
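
A quick way to read the subject actually being served (a sketch; the server
certificate path is an assumption based on the CA path already used above, and
the s_client call targets the SPICE TLS port shown in the command):

openssl x509 -noout -subject -in /etc/pki/vdsm/libvirt-spice/server-cert.pem
echo | openssl s_client -connect ovc71.localdomain.local:5900 2>/dev/null \
  | openssl x509 -noout -subject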


> (remote-viewer:7173): GSpice-WARNING **: main-1:0: SSL_connect:
> error:0001:lib(0):func(0):reason(1)
>
> I get an error window with
> Unable to connect to the graphic server spice://localhost?tls-port=5900
>
> [root@ovc71 ovirt-hosted-engine-ha]# netstat -tan | grep 5900
> tcp        0      0 0.0.0.0:5900            0.0.0.0:*               LISTEN
>
>
> the qemu command line of the sh engine is:
> qemu      4489     1 23 16:41 ?        00:02:35 /usr/libexec/qemu-kvm
> -name HostedEngine -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu
> Nehalem -m 8192 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1
> -uuid 9e654c4a-925c-48ba-9818-6908b7714d3a -smbios
> type=1,manufacturer=oVirt,product=oVirt
> Node,version=7-1.1503.el7.centos.2.8,serial=97F39B57-FA7D-2A47-9E0E-304705DE227D,uuid=9e654c4a-925c-48ba-9818-6908b7714d3a
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/HostedEngine.monitor,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=2015-10-23T14:41:23,driftfix=slew -global
> kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot -boot strict=on
> -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive
> if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device
> ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive
> file=/var/run/vdsm/storage/f53854cd-8767-4011-9564-36dc36e0a5d1/45ae3a4a-2190-4494-9419-b7c2af8a7aef/52b97c5b-96ae-4efc-b2e0-f56cde243384,if=none,id=drive-virtio-disk0,format=raw,serial=45ae3a4a-2190-4494-9419-b7c2af8a7aef,cache=none,werror=stop,rerror=stop,aio=threads
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:16:6a:b6,bus=pci.0,addr=0x3
> -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/9e654c4a-925c-48ba-9818-6908b7714d3a.com.redhat.rhevm.vdsm,server,nowait
> -device
> 

Re: [ovirt-users] Testing self hosted engine in 3.6: hostname not resolved error

2015-10-23 Thread Gianluca Cecchi
On Fri, Oct 23, 2015 at 4:42 PM, Simone Tiraboschi 
wrote:

>
>
>
> Are ovirt-ha-agent and ovirt-ha-broker up and running?
> Can you please try to restart them via systemd?
>
>
> In the meantime I found in the logs that they failed to start.

I found in the broker log the message
Thread-1730::ERROR::2015-10-22
17:31:47,016::listener::192::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle)
Error handling request, data: 'set-storage-domain FilesystemBackend
dom_type=nfs3 sd_uuid=f53854cd-8767-4011-9564-36dc36e0a5d1'
Traceback (most recent call last):
...
BackendFailureException: path to storage domain
f53854cd-8767-4011-9564-36dc36e0a5d1 not found in /rhev/data-center/mnt

so probably the NFS part was not in place yet when the broker attempted to
start?
I saw that I now actually had:

[root@ovc71 ovirt-hosted-engine-ha]# ll
/rhev/data-center/mnt/ovc71.localdomain.local:_NFS__DOMAIN
total 0
-rwxr-xr-x. 1 vdsm kvm  0 Oct 23 16:46 __DIRECT_IO_TEST__
drwxr-xr-x. 5 vdsm kvm 47 Oct 22 15:49 f53854cd-8767-4011-9564-36dc36e0a5d1

and I was able to run

systemctl start ovirt-ha-broker.service
and verify it correctly started.
and the same for
systemctl start ovirt-ha-agent

after a couple of minutes the sh engine VM was powered on and I was able to
access the web admin portal.

But if I try to connect to its console with

[root@ovc71 ovirt-hosted-engine-ha]# hosted-engine --add-console-password
Enter password:
code = 0
message = 'Done'

and then
# remote-viewer --spice-ca-file=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
spice://localhost?tls-port=5900 --spice-host-subject="C=EN, L=Test, O=Test,
CN=Test"

** (remote-viewer:7173): WARNING **: Couldn't connect to accessibility bus:
Failed to connect to socket /tmp/dbus-Gb5xXSKiKK: Connection refused
GLib-GIO-Message: Using the 'memory' GSettings backend.  Your settings will
not be saved or shared with other applications.
(/usr/bin/remote-viewer:7173): Spice-Warning **:
ssl_verify.c:492:openssl_verify: ssl: subject 'C=EN, L=Test, O=Test,
CN=Test' verification failed
(/usr/bin/remote-viewer:7173): Spice-Warning **:
ssl_verify.c:494:openssl_verify: ssl: verification failed

(remote-viewer:7173): GSpice-WARNING **: main-1:0: SSL_connect:
error:0001:lib(0):func(0):reason(1)

I get an error window with
Unable to connect to the graphic server spice://localhost?tls-port=5900

[root@ovc71 ovirt-hosted-engine-ha]# netstat -tan | grep 5900
tcp        0      0 0.0.0.0:5900            0.0.0.0:*               LISTEN


the qemu command line of the sh engine is:
qemu      4489     1 23 16:41 ?        00:02:35 /usr/libexec/qemu-kvm -name
HostedEngine -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Nehalem
-m 8192 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid
9e654c4a-925c-48ba-9818-6908b7714d3a -smbios
type=1,manufacturer=oVirt,product=oVirt
Node,version=7-1.1503.el7.centos.2.8,serial=97F39B57-FA7D-2A47-9E0E-304705DE227D,uuid=9e654c4a-925c-48ba-9818-6908b7714d3a
-no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/HostedEngine.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc
base=2015-10-23T14:41:23,driftfix=slew -global
kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot -boot strict=on
-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive
if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device
ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive
file=/var/run/vdsm/storage/f53854cd-8767-4011-9564-36dc36e0a5d1/45ae3a4a-2190-4494-9419-b7c2af8a7aef/52b97c5b-96ae-4efc-b2e0-f56cde243384,if=none,id=drive-virtio-disk0,format=raw,serial=45ae3a4a-2190-4494-9419-b7c2af8a7aef,cache=none,werror=stop,rerror=stop,aio=threads
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=26,id=hostnet0,vhost=on,vhostfd=27 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:16:3e:16:6a:b6,bus=pci.0,addr=0x3
-chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/9e654c4a-925c-48ba-9818-6908b7714d3a.com.redhat.rhevm.vdsm,server,nowait
-device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
-chardev
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/9e654c4a-925c-48ba-9818-6908b7714d3a.org.qemu.guest_agent.0,server,nowait
-device
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
-chardev spicevmc,id=charchannel2,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=3,chardev=charchannel2,id=channel2,name=com.redhat.spice.0
-chardev
socket,id=charchannel3,path=/var/lib/libvirt/qemu/channels/9e654c4a-925c-48ba-9818-6908b7714d3a.org.ovirt.hosted-engine-setup.0,server,nowait
-device

Re: [ovirt-users] Issue with kernel 2.6.32-573.3.1.el6.x86_64?

2015-10-23 Thread Michael Kleinpaste
Not an option unfortunately. The system is a production one and we don't
have spare equipment to test that on.

On Fri, Oct 23, 2015, 12:47 AM Giorgio Bersano 
wrote:

> Hi Michael,
> if you have the possibility to test with
> kernel-2.6.32-504.16.2.el6.x86_64 and
> kernel-2.6.32-504.23.4.el6.x86_64 please do it.
>
> I'm expecting that 2.6.32-504.16.2.el6 is the latest correctly working
> kernel. Please tell us if this is your case.
> There are clear signs of misbehaviours due to changes in the VLAN code
> between the two kernels above.
>
> Also take a look at https://bugs.centos.org/view.php?id=9467 and
> https://bugzilla.redhat.com/show_bug.cgi?id=1263561 .
>
> Best regards,
> Giorgio.
>
> P.S. sorry for top posting. I tried to revert this thread but I
> couldn't do it in an effective way :(
>
>
>
> 2015-10-21 23:05 GMT+02:00 Michael Kleinpaste
> :
> > VMs are on different VLANs and use a central Vyos VM as the firewall and
> > default gateway.  The only indication I had that the packets were getting
> > dropped or being sent out of order was by tshark'ing the traffic.  Tons
> and
> > tons of resends.
> >
> > The problem was definitely resolved after I dropped back to the prior
> kernel
> > (2.6.32-504.12.2.el6.x86_64).
> >
> >
> > On Tue, Oct 20, 2015 at 11:51 PM Ido Barkan  wrote:
> >>
> >> Hi Michael,
> >> Can you describe your network architecture for this VM (inside the
> >> host)?
> >> Do you know where the packets get dropped?
> >> Ido
> >>
> >> On Tue, Sep 22, 2015 at 1:19 AM, Michael Kleinpaste
> >>  wrote:
> >> > Nobody's seen this?
> >> >
> >> > On Wed, Sep 16, 2015 at 9:08 AM Michael Kleinpaste
> >> >  wrote:
> >> >>
> >> >> So I patched my vhosts and updated the kernel to
> >> >> 2.6.32-573.3.1.el6.x86_64.  Afterwards the networking became unstable
> >> >> for my
> >> >> vyatta firewall vm.  Lots of packet loss and out of order packets
> >> >> (based on
> >> >> my tshark at the time).
> >> >>
> >> >> Has anybody else experienced this?
> >> >> --
> >> >> Michael Kleinpaste
>
-- 
*Michael Kleinpaste*
Senior Systems Administrator
SharperLending, LLC.
www.SharperLending.com
michael.kleinpa...@sharperlending.com
(509) 324-1230   Fax: (509) 324-1234
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Issue with kernel 2.6.32-573.3.1.el6.x86_64?

2015-10-23 Thread Ido Barkan
It might help to tcpdump at different points (devices: bridge, VLAN device,
physical NIC or bond) in order to isolate the evil dropper.
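
A sketch of what that could look like; the interface names and the VM address
below are placeholders that depend on the host configuration:

tcpdump -nn -i bond0 -c 200 'vlan and host VM_IP'   # physical bond, frames still tagged
tcpdump -nn -i bond0.100 -c 200 'host VM_IP'        # VLAN device
tcpdump -nn -i ovirtmgmt -c 200 'host VM_IP'        # bridge the VM is plugged into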
Ido

Thanks,
Ido
On Oct 22, 2015 12:06 AM, "Michael Kleinpaste" <
michael.kleinpa...@sharperlending.com> wrote:

> VMs are on different VLANs and use a central Vyos VM as the firewall and
> default gateway.  The only indication I had that the packets were getting
> dropped or being sent out of order was by tshark'ing the traffic.  Tons and
> tons of resends.
>
> The problem was definitely resolved after I dropped back to the prior
> kernel (2.6.32-504.12.2.el6.x86_64).
>
> On Tue, Oct 20, 2015 at 11:51 PM Ido Barkan  wrote:
>
>> Hi Michael,
>> Can you describe your network architecture for this VM (inside the host)?
>> Do you know where the packets get dropped?
>> Ido
>>
>> On Tue, Sep 22, 2015 at 1:19 AM, Michael Kleinpaste
>>  wrote:
>> > Nobody's seen this?
>> >
>> > On Wed, Sep 16, 2015 at 9:08 AM Michael Kleinpaste
>> >  wrote:
>> >>
>> >> So I patched my vhosts and updated the kernel to
>> >> 2.6.32-573.3.1.el6.x86_64.  Afterwards the networking became unstable
>> for my
>> >> vyatta firewall vm.  Lots of packet loss and out of order packets
>> (based on
>> >> my tshark at the time).
>> >>
>> >> Has anybody else experienced this?
>> >> --
>> >> Michael Kleinpaste
>> >> Senior Systems Administrator
>> >> SharperLending, LLC.
>> >> www.SharperLending.com
>> >> michael.kleinpa...@sharperlending.com
>> >> (509) 324-1230   Fax: (509) 324-1234
>> >
>> > --
>> > Michael Kleinpaste
>> > Senior Systems Administrator
>> > SharperLending, LLC.
>> > www.SharperLending.com
>> > michael.kleinpa...@sharperlending.com
>> > (509) 324-1230   Fax: (509) 324-1234
>> >
>> > ___
>> > Users mailing list
>> > Users@ovirt.org
>> > http://lists.ovirt.org/mailman/listinfo/users
>> >
>>
>>
>>
>> --
>> Thanks,
>> Ido Barkan
>>
> --
> *Michael Kleinpaste*
> Senior Systems Administrator
> SharperLending, LLC.
> www.SharperLending.com
> michael.kleinpa...@sharperlending.com
> (509) 324-1230   Fax: (509) 324-1234
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Issue with kernel 2.6.32-573.3.1.el6.x86_64?

2015-10-23 Thread Giorgio Bersano
Hi Michael,
if you have the possibility to test with
kernel-2.6.32-504.16.2.el6.x86_64 and
kernel-2.6.32-504.23.4.el6.x86_64 please do it.
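
As a sketch, assuming those packages are still available in your configured
repositories and the usual EL6 kernel naming:

yum install kernel-2.6.32-504.16.2.el6 kernel-2.6.32-504.23.4.el6
grubby --set-default /boot/vmlinuz-2.6.32-504.16.2.el6.x86_64   # boot the candidate kernel first
reboot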

I'm expecting that 2.6.32-504.16.2.el6 is the latest correctly working
kernel. Please tell us if this is your case.
There are clear signs of misbehaviours due to changes in the VLAN code
between the two kernels above.

Also take a look at https://bugs.centos.org/view.php?id=9467 and
https://bugzilla.redhat.com/show_bug.cgi?id=1263561 .

Best regards,
Giorgio.

P.S. sorry for top posting. I tried to revert this thread but I
couldn't do it in an effective way :(



2015-10-21 23:05 GMT+02:00 Michael Kleinpaste
:
> VMs are on different VLANs and use a central Vyos VM as the firewall and
> default gateway.  The only indication I had that the packets were getting
> dropped or being sent out of order was by tshark'ing the traffic.  Tons and
> tons of resends.
>
> The problem was definitely resolved after I dropped back to the prior kernel
> (2.6.32-504.12.2.el6.x86_64).
>
>
> On Tue, Oct 20, 2015 at 11:51 PM Ido Barkan  wrote:
>>
>> Hi Michael,
>> Can you describe your network architecture for this VM (inside the host)?
>> Do you know where the packets get dropped?
>> Ido
>>
>> On Tue, Sep 22, 2015 at 1:19 AM, Michael Kleinpaste
>>  wrote:
>> > Nobody's seen this?
>> >
>> > On Wed, Sep 16, 2015 at 9:08 AM Michael Kleinpaste
>> >  wrote:
>> >>
>> >> So I patched my vhosts and updated the kernel to
>> >> 2.6.32-573.3.1.el6.x86_64.  Afterwards the networking became unstable
>> >> for my
>> >> vyatta firewall vm.  Lots of packet loss and out of order packets
>> >> (based on
>> >> my tshark at the time).
>> >>
>> >> Has anybody else experienced this?
>> >> --
>> >> Michael Kleinpaste
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users