Hi,

i have a two node cluster with several VirtualDomains as resources. Normally 
live migration is no problem. But rarely it fails, without giving any 
reasonable 
message in the logs. I tried to migrate several VirtualDmains concurrently from 
ha-idg-2 to ha-idg-1. One VirtualDomain failed, the others suceeded.

This happened already once before, i *think* it was also when i tried to 
migrate several VirtualDomains concurrently, but i'm not complete sure (it's 
already
some time ago).

But the error appears rarely, it often suceeded without any problems. I *think* 
it's related to migrating several domains concurrently.

node 1:
2018-12-03T16:03:03.037872+01:00 ha-idg-1 crmd[8615]:  warning: Action 44 
(vm_mausdb_migrate_to_0) on ha-idg-2 failed (target: 0 vs. rc: 1): Error
2018-12-03T16:03:03.038234+01:00 ha-idg-1 crmd[8615]:   notice: Transition 44 
aborted by operation vm_mausdb_migrate_to_0 'modify' on ha-idg-2: Event failed
2018-12-03T16:03:03.038569+01:00 ha-idg-1 crmd[8615]:  warning: Action 44 
(vm_mausdb_migrate_to_0) on ha-idg-2 failed (target: 0 vs. rc: 1): Error

2018-12-03T16:03:03.041771+01:00 ha-idg-1 stonith-ng[8611]:  warning: 
fence_ha-idg-2 has 'action' parameter, which should never be specified in 
configuration
2018-12-03T16:03:03.042112+01:00 ha-idg-1 stonith-ng[8611]:  warning: Mapping 
action='Off' to pcmk_off_action='Off'
2018-12-03T16:03:03.042365+01:00 ha-idg-1 stonith-ng[8611]:  warning: Mapping 
action='Off' to pcmk_reboot_action='Off'
2018-12-03T16:03:03.208895+01:00 ha-idg-1 sshd[6397]: Received disconnect from 
146.107.235.132 port 46004:11: disconnected by user
2018-12-03T16:03:03.209367+01:00 ha-idg-1 sshd[6397]: Disconnected from 
146.107.235.132 port 46004
2018-12-03T16:03:03.209670+01:00 ha-idg-1 sshd[6397]: pam_unix(sshd:session): 
session closed for user root
2018-12-03T16:03:03.213996+01:00 ha-idg-1 systemd-logind[1627]: Removed session 
17.
2018-12-03T16:03:03.286986+01:00 ha-idg-1 crmd[8615]:   notice: Transition 44 
(Complete=6, Pending=0, Fired=0, Skipped=1, Incomplete=9, 
Source=/var/lib/pacemaker/pengine/pe-input-346.bz2): Stopped
2018-12-03T16:03:03.290762+01:00 ha-idg-1 stonith-ng[8611]:  warning: 
fence_ha-idg-2 has 'action' parameter, which should never be specified in 
configuration
2018-12-03T16:03:03.291224+01:00 ha-idg-1 stonith-ng[8611]:  warning: Mapping 
action='Off' to pcmk_off_action='Off'
2018-12-03T16:03:03.291619+01:00 ha-idg-1 stonith-ng[8611]:  warning: Mapping 
action='Off' to pcmk_reboot_action='Off'
2018-12-03T16:03:03.313864+01:00 ha-idg-1 pengine[8614]:  warning: Processing 
failed op migrate_to for vm_mausdb on ha-idg-2: unknown error (1)
2018-12-03T16:03:03.314193+01:00 ha-idg-1 pengine[8614]:  warning: Processing 
failed op migrate_to for vm_mausdb on ha-idg-2: unknown error (1)
2018-12-03T16:03:03.316485+01:00 ha-idg-1 pengine[8614]:   notice: Recover 
vm_mausdb#011(Started ha-idg-2 -> ha-idg-1)
2018-12-03T16:03:03.316805+01:00 ha-idg-1 pengine[8614]:   notice: Migrate 
vm_geneious#011(Started ha-idg-2 -> ha-idg-1)
2018-12-03T16:03:03.317914+01:00 ha-idg-1 pengine[8614]:   notice: Calculated 
transition 45, saving inputs in /var/lib/pacemaker/pengine/pe-input-347.bz2

node2:
2018-12-03T16:02:32.312598+01:00 ha-idg-2 VirtualDomain(vm_mausdb)[7988]: INFO: 
mausdb: Starting live migration to ha-idg-1 (using: virsh 
--connect=qemu:///system --quiet migrate --live  ma
usdb qemu+ssh://ha-idg-1/system ).
2018-12-03T16:02:32.315358+01:00 ha-idg-2 VirtualDomain(vm_geneious)[7989]: 
INFO: geneious: Starting live migration to ha-idg-1 (using: virsh 
--connect=qemu:///system --quiet migrate --live
  geneious qemu+ssh://ha-idg-1/system ).
2018-12-03T16:02:32.391911+01:00 ha-idg-2 lrmd[14256]:   notice: executing - 
rsc:vm_sim action:stop call_id:78
2018-12-03T16:02:32.394065+01:00 ha-idg-2 stonith-ng[14255]:  warning: 
fence_ha-idg-1 has 'action' parameter, which should never be specified in 
configuration
2018-12-03T16:02:32.394369+01:00 ha-idg-2 stonith-ng[14255]:  warning: Mapping 
action='Off' to pcmk_off_action='Off'
2018-12-03T16:02:32.394619+01:00 ha-idg-2 stonith-ng[14255]:  warning: Mapping 
action='Off' to pcmk_reboot_action='Off'
2018-12-03T16:02:32.459622+01:00 ha-idg-2 VirtualDomain(vm_sim)[8599]: INFO: 
Domain sim already stopped.
2018-12-03T16:02:32.464042+01:00 ha-idg-2 lrmd[14256]:   notice: finished - 
rsc:vm_sim action:stop call_id:78 pid:8599 exit-code:0 exec-time:72ms 
queue-time:0ms
2018-12-03T16:02:32.464750+01:00 ha-idg-2 crmd[14259]:   notice: Result of stop 
operation for vm_sim on ha-idg-2: 0 (ok)
2018-12-03T16:02:32.470496+01:00 ha-idg-2 stonith-ng[14255]:  warning: 
fence_ha-idg-1 has 'action' parameter, which should never be specified in 
configuration
2018-12-03T16:02:32.470794+01:00 ha-idg-2 stonith-ng[14255]:  warning: Mapping 
action='Off' to pcmk_off_action='Off'
2018-12-03T16:02:32.471043+01:00 ha-idg-2 stonith-ng[14255]:  warning: Mapping 
action='Off' to pcmk_reboot_action='Off'
2018-12-03T16:02:32.628562+01:00 ha-idg-2 stonith-ng[14255]:  warning: 
fence_ha-idg-1 has 'action' parameter, which should never be specified in 
configuration
2018-12-03T16:02:32.628970+01:00 ha-idg-2 stonith-ng[14255]:  warning: Mapping 
action='Off' to pcmk_off_action='Off'
2018-12-03T16:02:32.629226+01:00 ha-idg-2 stonith-ng[14255]:  warning: Mapping 
action='Off' to pcmk_reboot_action='Off'
2018-12-03T16:03:02.836145+01:00 ha-idg-2 libvirtd[3117]: 2018-12-03 
15:03:02.835+0000: 4515: error : qemuMigrationCheckJobStatus:1456 : operation 
failed: migration job: unexpectedly failed
2018-12-03T16:03:03.006918+01:00 ha-idg-2 VirtualDomain(vm_mausdb)[7988]: 
ERROR: mausdb: live migration to ha-idg-1 failed: 1
2018-12-03T16:03:03.032370+01:00 ha-idg-2 kernel: [ 6593.265436] br0: port 
2(vnet0) entered disabled state
2018-12-03T16:03:03.032383+01:00 ha-idg-2 kernel: [ 6593.266268] device vnet0 
left promiscuous mode
2018-12-03T16:03:03.032385+01:00 ha-idg-2 kernel: [ 6593.266275] br0: port 
2(vnet0) entered disabled state
2018-12-03T16:03:03.032789+01:00 ha-idg-2 lrmd[14256]:   notice: 
vm_mausdb_migrate_to_0:7988:stderr [ error: operation failed: migration job: 
unexpectedly failed ]
2018-12-03T16:03:03.033126+01:00 ha-idg-2 lrmd[14256]:   notice: 
vm_mausdb_migrate_to_0:7988:stderr [ ocf-exit-reason:mausdb: live migration to 
ha-idg-1 failed: 1 ]
2018-12-03T16:03:03.033376+01:00 ha-idg-2 lrmd[14256]:   notice: finished - 
rsc:vm_mausdb action:migrate_to call_id:75 pid:7988 exit-code:1 
exec-time:31065ms queue-time:0ms
2018-12-03T16:03:03.033909+01:00 ha-idg-2 crmd[14259]:   notice: Result of 
migrate_to operation for vm_mausdb on ha-idg-2: 1 (unknown error)
2018-12-03T16:03:03.034198+01:00 ha-idg-2 crmd[14259]:   notice: 
ha-idg-2-vm_mausdb_migrate_to_0:75 [ error: operation failed: migration job: 
unexpectedly failed\nocf-exit-reason:mausdb: live migration to ha-idg-1 failed: 
1\n ]

/var/log/pacemaker.log:
Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng:     info: cib_devices_update:       
Updating devices to version 1.789.6
Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng:  warning: xml2device_params:        
fence_ha-idg-2 has 'action' parameter, which should never be specified in 
configuration
Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng:  warning: map_action:       Mapping 
action='Off' to pcmk_off_action='Off'
Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng:  warning: map_action:       Mapping 
action='Off' to pcmk_reboot_action='Off'
Dec 03 16:02:32 [8611] ha-idg-1 stonith-ng:     info: cib_device_update:        
Device fence_ha-idg-1 has been disabled on ha-idg-1: score=-INFINITY
Dec 03 16:02:37 [8610] ha-idg-1        cib:     info: cib_process_ping: 
Reporting our current digest to ha-idg-1: 2715b3a8fd056fbd4d84cb12b911a328 for 
1.789.6 (0x1aebf00 0)
Dec 03 16:03:03 [8610] ha-idg-1        cib:     info: cib_perform_op:   Diff: 
--- 1.789.6 2
Dec 03 16:03:03 [8610] ha-idg-1        cib:     info: cib_perform_op:   Diff: 
+++ 1.789.7 (null)
Dec 03 16:03:03 [8610] ha-idg-1        cib:     info: cib_perform_op:   +  
/cib:  @num_updates=7
Dec 03 16:03:03 [8610] ha-idg-1        cib:     info: cib_perform_op:   +  
/cib/status/node_state[@id='1084777492']/lrm[@id='1084777492']/lrm_resources/lrm_resource[@id='vm_mausdb']/lrm_rsc
_op[@id='vm_mausdb_last_0']:  
@transition-magic=0:1;44:44:0:e4b76be7-fdb6-47c1-a905-9a49650b4180, 
@exit-reason=mausdb: live migration to ha-idg-1 failed: 1, @call-id=75, 
@rc-code=1, @op-sta
tus=0, @exec-time=31065
Dec 03 16:03:03 [8610] ha-idg-1        cib:     info: cib_perform_op:   ++ 
/cib/status/node_state[@id='1084777492']/lrm[@id='1084777492']/lrm_resources/lrm_resource[@id='vm_mausdb']:
  <lrm_
rsc_op id="vm_mausdb_last_failure_0" operation_key="vm_mausdb_migrate_to_0" 
operation="migrate_to" crm-debug-origin="do_update_resource" 
crm_feature_set="3.0.13" transition-key="44:44:0:e4b
76be7-fdb6-47c1-a905-9a49650b4180" 
transition-magic="0:1;44:44:0:e4b76be7-fdb6-47c1-a905-9a49650b4180" 
exit-reason="mausdb: live migrat
Dec 03 16:03:03 [8610] ha-idg-1        cib:     info: cib_process_request:      
Completed cib_modify operation for section status: OK (rc=0, 
origin=ha-idg-2/crmd/74, version=1.789.7)
Dec 03 16:03:03 [8615] ha-idg-1       crmd:  warning: status_from_rc:   Action 
44 (vm_mausdb_migrate_to_0) on ha-idg-2 failed (target: 0 vs. rc: 1): Error
Dec 03 16:03:03 [8615] ha-idg-1       crmd:   notice: abort_transition_graph:   
Transition 44 aborted by operation vm_mausdb_migrate_to_0 'modify' on ha-idg-2: 
Event failed | magic=0:1;44:4
4:0:e4b76be7-fdb6-47c1-a905-9a49650b4180 cib=1.789.7 
source=match_graph_event:310 complete=false
Dec 03 16:03:03 [8615] ha-idg-1       crmd:     info: match_graph_event:        
Action vm_mausdb_migrate_to_0 (44) confirmed on ha-idg-2 (rc=1)
Dec 03 16:03:03 [8615] ha-idg-1       crmd:     info: process_graph_event:      
Detected action (44.44) vm_mausdb_migrate_to_0.75=unknown error: failed
Dec 03 16:03:03 [8615] ha-idg-1       crmd:  warning: status_from_rc:   Action 
44 (vm_mausdb_migrate_to_0) on ha-idg-2 failed (target: 0 vs. rc: 1): Error
Dec 03 16:03:03 [8615] ha-idg-1       crmd:     info: abort_transition_graph:   
Transition 44 aborted by operation vm_mausdb_migrate_to_0 'create' on ha-idg-2: 
Event failed | magic=0:1;44:4
4:0:e4b76be7-fdb6-47c1-a905-9a49650b4180 cib=1.789.7 
source=match_graph_event:310 complete=false
Dec 03 16:03:03 [8615] ha-idg-1       crmd:     info: match_graph_event:        
Action vm_mausdb_migrate_to_0 (44) confirmed on ha-idg-2 (rc=1)
Dec 03 16:03:03 [8615] ha-idg-1       crmd:     info: process_graph_event:      
Detected action (44.44) vm_mausdb_migrate_to_0.75=unknown error: failed
Dec 03 16:03:03 [8611] ha-idg-1 stonith-ng:     info: 
update_cib_stonith_devices_v2:    Updating device list from the cib: modify 
lrm_rsc_op[@id='vm_mausdb_last_0']

Hosts are SLES 12 SP3, pacenaker is the most recent for my OS: 
pacemaker-1.1.16-6.5.1.x86_64

I don't have any clue, i'm thankful for any hint. I can provide you with 
further information if needed.


Bernd
-- 

Bernd Lentes 
Systemadministration 
Institut für Entwicklungsgenetik 
Gebäude 35.34 - Raum 208 
HelmholtzZentrum münchen 
[ mailto:bernd.len...@helmholtz-muenchen.de | 
bernd.len...@helmholtz-muenchen.de ] 
phone: +49 89 3187 1241 
fax: +49 89 3187 2294 
[ http://www.helmholtz-muenchen.de/idg | http://www.helmholtz-muenchen.de/idg ] 

wer Fehler macht kann etwas lernen 
wer nichts macht kann auch nichts lernen
 

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDirig.in Petra Steiner-Hoffmann
Stellv.Aufsichtsratsvorsitzender: MinDirig. Dr. Manfred Wolter
Geschaeftsfuehrer: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, 
Dr. rer. nat. Alfons Enhsen
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

Reply via email to