Re: [ovirt-users] Hosted Engine Migration fails

2015-02-26 Thread Soeren Malchow
Hi,

We tried this (Roy's mail below), and yes, shutdown always works.

But now this problem comes up with regular machines as well.

The environment is set up like this:

Engine: oVirt 3.5.1.1-1.el6 on CentOS 6

The storage backend is Gluster 3.6.2-1.el7 on CentOS 7

Compute hosts: libvirt-1.2.9.1-2.fc20, kvm 2.1.2-7.fc20, vdsm-4.16.10-8.gitc937927.fc20

All compute servers are 100% identical.

The storage cluster was tested manually and works just fine.
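
For reference, a couple of basic Gluster-side checks while a migration is failing (the volume name is a placeholder):

gluster volume status <volname>      # are all bricks online and their ports reachable?
gluster volume heal <volname> info   # any entries pending heal or in split-brain?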

The network interfaces are not fully utilized; utilization is more like 15%.

Log output is as below. The thing in the log output I do not understand is this:

2015-02-26T14:50:26.650595Z qemu-system-x86_64: load of migration failed: 
Input/output error

From the qemu log.
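
The full qemu log sits under libvirt's per-VM log directory on each host. Since "load of migration failed" is printed by the incoming qemu process, the copy on the destination host is the one to look at, for example (the VM name is a placeholder):

tail -n 50 /var/log/libvirt/qemu/<vm-name>.log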

Also if I shut down machines, put their host into maintenance and start them 
somewhere else, everything works just fine.

Can someone help with this? Any idea where to look?

Regards
Soeren



From the VDSM log

I just tried to migrate a machine; this is what happens on the source:

-- snip --

vdsm.log:Thread-49548::DEBUG::2015-02-26 
15:42:26,692::__init__::469::jsonrpc.JsonRpcServer::(_serveRequest) Calling 
'VM.migrate' in bridge with {u'params': {u'tunneled': u'false', u'dstqemu': 
u'172.19.2.31', u'src': u'compute04', u'dst': u'compute01:54321', u'vmId': 
u'b75823d1-00f0-457e-a692-8b95f73907db', u'abortOnError': u'true', u'method': 
u'online'}, u'vmID': u'b75823d1-00f0-457e-a692-8b95f73907db'}
vdsm.log:Thread-49548::DEBUG::2015-02-26 15:42:26,694::API::510::vds::(migrate) 
{u'tunneled': u'false', u'dstqemu': u'IPADDR', u'src': u'compute04', u'dst': 
u'compute01:54321', u'vmId': u'b75823d1-00f0-457e-a692-8b95f73907db', 
u'abortOnError': u'true', u'method': u'online'}
vdsm.log:Thread-49549::DEBUG::2015-02-26 
15:42:26,699::migration::103::vm.Vm::(_setupVdsConnection) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Destination server is: 
compute01:54321
vdsm.log:Thread-49549::DEBUG::2015-02-26 
15:42:26,702::migration::105::vm.Vm::(_setupVdsConnection) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Initiating connection with 
destination
vdsm.log:Thread-49549::DEBUG::2015-02-26 
15:42:26,733::migration::155::vm.Vm::(_prepareGuest) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Migration started
vdsm.log:Thread-49549::DEBUG::2015-02-26 
15:42:26,755::migration::238::vm.Vm::(run) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::migration semaphore acquired after 
0 seconds
vdsm.log:Thread-49549::DEBUG::2015-02-26 
15:42:27,211::migration::298::vm.Vm::(_startUnderlyingMigration) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::starting migration to 
qemu+tls://compute01/system with miguri tcp://IPADDR
vdsm.log:Thread-49550::DEBUG::2015-02-26 
15:42:27,213::migration::361::vm.Vm::(run) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::migration downtime thread started
vdsm.log:Thread-49551::DEBUG::2015-02-26 
15:42:27,216::migration::410::vm.Vm::(monitor_migration) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::starting migration monitor thread
vdsm.log:Thread-49550::DEBUG::2015-02-26 
15:43:42,218::migration::370::vm.Vm::(run) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::setting migration downtime to 50
vdsm.log:Thread-49550::DEBUG::2015-02-26 
15:44:57,222::migration::370::vm.Vm::(run) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::setting migration downtime to 100
vdsm.log:Thread-49550::DEBUG::2015-02-26 
15:46:12,227::migration::370::vm.Vm::(run) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::setting migration downtime to 150
vdsm.log:Thread-49551::WARNING::2015-02-26 
15:47:07,279::migration::458::vm.Vm::(monitor_migration) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Migration stalling: remaining 
(1791MiB) > lowmark (203MiB). Refer to RHBZ#919201.
vdsm.log:Thread-49551::WARNING::2015-02-26 
15:47:17,281::migration::458::vm.Vm::(monitor_migration) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Migration stalling: remaining 
(1398MiB) > lowmark (203MiB). Refer to RHBZ#919201.
vdsm.log:Thread-49550::DEBUG::2015-02-26 
15:47:27,233::migration::370::vm.Vm::(run) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::setting migration downtime to 200
vdsm.log:Thread-49551::WARNING::2015-02-26 
15:47:27,283::migration::458::vm.Vm::(monitor_migration) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Migration stalling: remaining 
(1066MiB) > lowmark (203MiB). Refer to RHBZ#919201.
vdsm.log:Thread-49551::WARNING::2015-02-26 
15:47:37,285::migration::458::vm.Vm::(monitor_migration) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Migration stalling: remaining 
(701MiB) > lowmark (203MiB). Refer to RHBZ#919201.
vdsm.log:Thread-49551::WARNING::2015-02-26 
15:47:47,287::migration::458::vm.Vm::(monitor_migration) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Migration stalling: remaining 
(361MiB) > lowmark (203MiB). Refer to RHBZ#919201.
vdsm.log:Thread-49551::WARNING::2015-02-26 
15:48:07,291::migration::458::vm.Vm::(monitor_migration) 
vmId=`b75823d1-00f0-457e-a692-8b95f73907db`::Migration stalling: remaining 
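
The "Migration stalling" warnings mean that the amount of data left to transfer has climbed back above the lowest mark reached earlier, i.e. the guest is dirtying memory faster than it is being copied. A hedged sketch of the vdsm-side tuning knobs on the hosts, assuming these keys exist in this vdsm version (values are examples only; vdsmd needs a restart afterwards):

# /etc/vdsm/vdsm.conf
[vars]
migration_max_bandwidth = 100    # MiB/s per outgoing migration (example value)
migration_downtime = 500         # ms of allowed downtime, raised stepwise as seen in the log above
max_outgoing_migrations = 2      # fewer parallel migrations leave more bandwidth per VM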

Re: [ovirt-users] Problem deploying ovirt-hosted-engine to iSCSI

2015-02-26 Thread Justin Clacherty

On 25/02/2015 6:23 PM, Simone Tiraboschi wrote:

- Original Message -

From: Justin Clacherty jus...@redfish.com.au
To: Simone Tiraboschi stira...@redhat.com
Cc: Donny Davis do...@cloudspin.me, users@ovirt.org
Sent: Wednesday, February 25, 2015 5:26:37 AM
Subject: Re: [ovirt-users] Problem deploying ovirt-hosted-engine to iSCSI

On 25/02/2015 12:35 AM, Simone Tiraboschi wrote:

Can you please paste here the output of vdsClient -s 0 getDeviceList 3
thanks

Sure. Once the iSCSI target is logged in, the call to vdsClient never
returns. After a few minutes I hit Ctrl-C; that's what caused the
traceback. See below for the output.

Could you please upload somewhere your VDSM logs (/var/log/vdsm/vdsm.log)?

I have it connecting now. I tested it on another machine again and it 
wasn't working. Changed the MTU back to 1500 on both the SAN and the 
host and now it works fine. Not sure why MTU of 9000 doesn't work, but 
I'll worry about that later.


Thanks.



Re: [ovirt-users] Problem deploying ovirt-hosted-engine to iSCSI

2015-02-26 Thread Fabian Deutsch
- Original Message -
 On 25/02/2015 6:23 PM, Simone Tiraboschi wrote:
  - Original Message -
  From: Justin Clacherty jus...@redfish.com.au
  To: Simone Tiraboschi stira...@redhat.com
  Cc: Donny Davis do...@cloudspin.me, users@ovirt.org
  Sent: Wednesday, February 25, 2015 5:26:37 AM
  Subject: Re: [ovirt-users] Problem deploying ovirt-hosted-engine to iSCSI
 
  On 25/02/2015 12:35 AM, Simone Tiraboschi wrote:
  Can you please paste here the output of vdsClient -s 0 getDeviceList 3
  thanks
  Sure. Once the iscsi target is logged in the call to vdsClient never
  returns. After a few minutes I hit ctrl-c, that's what caused the
  traceback. See below for the output.
  Could you please upload somewhere your VDSM logs (/var/log/vdsm/vdsm.log)?
 
 I have it connecting now. I tested it on another machine again and it
 wasn't working. Changed the MTU back to 1500 on both the SAN and the
 host and now it works fine. Not sure why MTU of 9000 doesn't work, but
 I'll worry about that later.

Maybe some intermediate device cannot handle that jumbo frame?

- fabian


Re: [ovirt-users] Problem deploying ovirt-hosted-engine to iSCSI

2015-02-26 Thread Justin Clacherty

On 26/02/2015 7:40 PM, Fabian Deutsch wrote:

- Original Message -

I have it connecting now. I tested it on another machine again and it
wasn't working. Changed the MTU back to 1500 on both the SAN and the
host and now it works fine. Not sure why MTU of 9000 doesn't work, but
I'll worry about that later.

Maybe some intermediate device cannot handle that jumbo frame?

There's only a switch, which is configured for an MTU of 9000. It doesn't
matter for now; I'll figure it out later.
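
One way to narrow that down is to send non-fragmenting pings of jumbo size from the host to the SAN and see whether they survive the switch. A sketch (the address is a placeholder; 8972 bytes of payload plus 28 bytes of IP/ICMP headers is exactly 9000):

ping -M do -s 8972 <san-ip>    # only succeeds if every hop really passes MTU 9000
ping -M do -s 1472 <san-ip>    # baseline check at MTU 1500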



Re: [ovirt-users] oVirt Manager problem

2015-02-26 Thread Moti Asayag


- Original Message -
 From: Massimo Mad mad196...@gmail.com
 To: users@ovirt.org
 Sent: Wednesday, February 25, 2015 11:14:28 AM
 Subject: [ovirt-users] oVirt Manager problem
 
 Hi,
 Here is the result of the command; the audit_log tables are actually very large.
 What can I do to prevent the file system from filling up?
 

It seems that the total size of the objects (table and indices) around the
audit_log table is about 2 GB.

You can control the number of entries in this table by setting a shorter
period for keeping data in that table. The default is 30 days.
You can check by:
engine-config -g AuditLogAgingThreshold

and if you wish to set a smaller number of days to keep, e.g. for a week:
engine-config -s AuditLogAgingThreshold=7 

and restart the ovirt-engine service for the change to take effect.

However, I'm curious whether flooding by a certain event-log entry in that
table might have caused it to grow to that volume, unless it simply reflects
the actual workload of the system.
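
A quick way to check for such flooding is to group the table by event type, in the same style as the query below. A sketch, assuming the standard audit_log columns:

psql engine -c "SELECT log_type_name, count(*) AS cnt FROM audit_log GROUP BY log_type_name ORDER BY cnt DESC LIMIT 10;"

Note also that once old rows are expired, the space is only returned to the filesystem after the table is vacuumed (e.g. VACUUM FULL audit_log; during a maintenance window); a plain VACUUM only makes it reusable internally.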

 psql engine -c SELECT nspname || '.' || relname AS relation,
 pg_size_pretty(pg_relation_size(C.oid)) AS size
 FROM pg_class C
 LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
 WHERE nspname NOT IN ('pg_catalog', 'information_schema')
 ORDER BY pg_relation_size(C.oid) DESC
 LIMIT 20;
 
 relation | size
 ---------------------------------+---------
 public.audit_log | 826 MB
 public.idx_audit_correlation_id | 159 MB
 public.idx_audit_log_job_id | 114 MB
 public.idx_audit_log_storage_domain_name | 114 MB
 public.idx_audit_log_user_name | 114 MB
 public.idx_audit_log_vm_name | 114 MB
 public.idx_audit_log_storage_pool_name | 114 MB
 public.idx_audit_log_vds_name | 114 MB
 public.idx_audit_log_vm_template_name | 113 MB
 public.pk_audit_log | 90 MB
 public.idx_audit_log_log_time | 90 MB
 pg_toast.pg_toast_2618 | 1336 kB
 public.vm_dynamic | 680 kB
 public.vm_statistics | 496 kB
 public.disk_image_dynamic | 288 kB
 public.vds_interface_statistics | 256 kB
 public.vm_device | 208 kB
 public.vds_dynamic | 128 kB
 pg_toast.pg_toast_2619 | 120 kB
 public.pk_disk_image_dynamic | 104 kB
 (20 rows)
 
 Regards Massimo
 


Re: [ovirt-users] 3.5.1 not detecting Haswell CPU

2015-02-26 Thread Francesco Romani
- Original Message -
 From: Blaster blas...@556nato.com
 To: users@ovirt.org
 Sent: Monday, February 23, 2015 5:43:17 AM
 Subject: [ovirt-users] 3.5.1 not detecting Haswell CPU

Hi,
 
 I just upgraded from a working 3.5.0 on F20 to 3.5.1.  After the
 upgrade, 3.5.1 started detecting my i7-4790k as a Sandy Bridge, and not
 the Haswell it properly detected with 3.5.0.  I had to force the CPU
 type to Sandy Bridge in order to bring the node up.
 
   vdsClient -s 0 getVdsCaps | grep cpu
  cpuCores = '4'
  cpuFlags =
 'fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,nopl,xtopology,nonstop_tsc,aperfmperf,eagerfpu,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,est,tm2,ssse3,fma,cx16,xtpr,pdcm,pcid,sse4_1,sse4_2,x2apic,movbe,popcnt,tsc_deadline_timer,aes,xsave,avx,f16c,rdrand,lahf_lm,abm,ida,arat,pln,pts,dtherm,tpr_shadow,vnmi,flexpriority,ept,vpid,fsgsbase,tsc_adjust,bmi1,hle,avx2,smep,bmi2,erms,invpcid,rtm,xsaveopt,model_Nehalem,model_Conroe,model_coreduo,model_core2duo,model_Penryn,model_Westmere,model_n270,model_SandyBridge'
  cpuModel = 'Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz'
  cpuSockets = '1'
  cpuSpeed = '4300.000'
  cpuThreads = '8'
  numaNodes = {'0': {'cpus': [0, 1, 2, 3, 4, 5, 6, 7],
 'totalMemory': '32022'}}
 
 # cat /proc/cpuinfo
 processor   : 0
 vendor_id   : GenuineIntel
 cpu family  : 6
 model   : 60
 model name  : Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
 stepping: 3
 microcode   : 0x1c
 cpu MHz : 4299.843
 cache size  : 8192 KB
 physical id : 0
 siblings: 8
 core id : 0
 cpu cores   : 4
 apicid  : 0
 initial apicid  : 0
 fpu : yes
 fpu_exception   : yes
 cpuid level : 13
 wp  : yes
 flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
 mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
 syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good
 nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64
 monitor ds_cpl vmx est tm2 ssse3 fma cx16 xtpr pdcm pcid sse4_1 sse4_2
 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm
 abm ida arat pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
 fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt
 bugs:
 bogomips: 7999.90
 clflush size: 64
 cache_alignment : 64
 address sizes   : 39 bits physical, 48 bits virtual
 power management:

I have vague memories of similar bugs in the past. Can you please share
the libvirt debug logs of one hypervisor host which has this misbehaviour?
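
In case it helps, a minimal sketch for turning on libvirt debug logging on one of the hosts (these lines go into /etc/libvirt/libvirtd.conf; the filter list is only an example, and libvirtd needs a restart afterwards):

log_filters="1:cpu 1:qemu 1:libvirt"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"

systemctl restart libvirtd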

Bests,

-- 
Francesco Romani
RedHat Engineering Virtualization R&D
Phone: 8261328
IRC: fromani