Re: [pve-devel] migration problems since qemu 1.3

2012-12-24 Thread Dietmar Maurer
 virtio0: cephkvmpool1:vm-105-disk-1,iops_rd=215,iops_wr=155,mbps_rd=130,mbps_wr=90,size=20G

Please can you also test without ceph?
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-23 Thread Alexandre DERUMIER
maybe can you try to recompile pve-qemu-kvm without this patch :

include fix-off-by-1-error-in-RAM-migration-code.patch
https://git.proxmox.com/?p=pve-qemu-kvm.git;a=commit;h=e01e677960fbc6787f8358543047307fca67facb

this comes from this git
http://git.qemu.org/?p=qemu.git;a=commit;h=7ec81e56edc2b2007ce0ae3982aa5c18af9546ab
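For reference, a rough sketch of how such a rebuild could look, assuming the patch is referenced from debian/patches/series in the pve-qemu-kvm repository and that its Makefile provides a Debian package target (both are assumptions, not confirmed here):

# hypothetical rebuild without the suspect patch
git clone git://git.proxmox.com/git/pve-qemu-kvm.git
cd pve-qemu-kvm
sed -i '/fix-off-by-1-error-in-RAM-migration-code.patch/d' debian/patches/series
make deb   # assumed target; otherwise build the Debian package by hand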





- Mail original - 

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
À: Dietmar Maurer diet...@proxmox.com 
Cc: Alexandre DERUMIER aderum...@odiso.com, pve-devel@pve.proxmox.com 
Envoyé: Samedi 22 Décembre 2012 20:19:37 
Objet: Re: [pve-devel] migration problems since qemu 1.3 

Hi, 

 Please can we track the bug here: 
 https://bugzilla.proxmox.com/show_bug.cgi?id=298 

Didn't know about the bug report. Great to see that I'm not the only one. 

 @Stefan: Does it work when the VM does not use any network device? 

No, that changes nothing. I've removed all network devices and VM 
migration is still not working ;-( 

Merry Xmas. 

Greets, 
Stefan 
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-23 Thread Stefan Priebe

Hi,
Am 23.12.2012 14:18, schrieb Alexandre DERUMIER:

I have redone tests on my side, with linux and windows guests, vms with 4 - 
16GB ram
I really can't reproduce your problem.

migration speed is around 400MB/S.

How much mem was in use? Did you try my suggestion with
find / -type f -print | xargs cat > /dev/null

BEFORE you migrate?


maybe can you try the last qemu git version ? I see some big migration patches 
committed these last days

Already tried that. No Change.

Stefan
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-23 Thread Alexandre DERUMIER
How much mem was in use? Did you try my suggestion with 
find / -type f -print | xargs cat > /dev/null 

I filled the memory buffers with a fio read benchmark. (tested with 4GB and 16GB guests)

storage tested: nfs and iscsi. (don't have my rbd cluster for now)
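A cache-filling read of that kind, run inside the guest, could look roughly like the fio job below; file name, size and I/O engine are illustrative assumptions, since the exact job used is not shown here:

# buffered sequential read to populate the guest page cache (illustrative values)
fio --name=cachefill --rw=read --bs=1M --size=4G \
    --filename=/root/fio.cachefill --ioengine=psync --direct=0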










- Mail original - 

De: Stefan Priebe s.pri...@profihost.ag 
À: Alexandre DERUMIER aderum...@odiso.com 
Cc: pve-devel@pve.proxmox.com, Dietmar Maurer diet...@proxmox.com 
Envoyé: Dimanche 23 Décembre 2012 14:21:07 
Objet: Re: [pve-devel] migration problems since qemu 1.3 

Hi, 
Am 23.12.2012 14:18, schrieb Alexandre DERUMIER: 
 I have redone tests on my side, with linux and windows guests, vms with 4 - 
 16GB ram 
 I really can't reproduce your problem. 
 
 migration speed is around 400MB/S. 
How much mem was in use? Did you try my suggestion with 
find / -type f -print | xargs cat > /dev/null 

BEFORE you migrate? 

 maybe can you try the last qemu git version ? I see some big migration patches 
 committed these last days 
Already tried that. No Change. 

Stefan 
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-22 Thread Dietmar Maurer
Please can we track the bug here:

https://bugzilla.proxmox.com/show_bug.cgi?id=298

@Stefan: Does it work when the VM does not use any network device?

___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-22 Thread Stefan Priebe - Profihost AG

Hi,

 Please can we track the bug here:
 https://bugzilla.proxmox.com/show_bug.cgi?id=298

Didn't know about the bug report. Great to see that I'm not the only one.

 @Stefan: Does it work when the VM does not use any network device?

No, that changes nothing. I've removed all network devices and VM 
migration is still not working ;-(


Merry Xmas.

Greets,
Stefan
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-21 Thread Stefan Priebe - Profihost AG

Hi,

even more news. The kvm is responsive again after cancelling the 
migration and waiting around 1-2 minutes.


During these two minutes the kvm process on the source host is 
running at 100% CPU.


Greets,
Stefan
Am 21.12.2012 14:46, schrieb Stefan Priebe - Profihost AG:


This time it hangs at the first query-migrate:
--
Dec 21 14:44:43 starting migration of VM 100 to node 'cloud1-1203'
(10.255.0.22)
Dec 21 14:44:43 copying disk images
Dec 21 14:44:43 starting VM 100 on remote node 'cloud1-1203'
Dec 21 14:44:46 starting migration tunnel
Dec 21 14:44:46 starting online/live migration on port 6
Dec 21 14:44:46 migrate-set-capabilities, capabilities = [HASH(0x3933ed0)]
Dec 21 14:44:46 migrate-set-cache-size, value = 429496729
Dec 21 14:44:46 start migrate tcp:localhost:6
Dec 21 14:44:48 query-migrate
---

I can reproduce this by assigning min. 4GB of memory to a machine and then
filling the buffers and cache by:

find / -type f -print | xargs cat > /dev/null

And then start a migrate.
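For that last step one can check that the cache is actually filled and then kick off the migration from the CLI; a minimal sketch, reusing the VM id and target node from the log above (the -online switch of qm migrate is assumed to match this PVE version):

free -m                              # buffers/cache should now be large
qm migrate 100 cloud1-1203 -online   # start the live migration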

Stefan
Am 21.12.2012 11:43, schrieb Stefan Priebe - Profihost AG:

Hi Alexandre,

I've added some debugging / logging code.

The output stops / hangs at query-migrate. See here:

Dec 21 11:41:59 starting migration of VM 100 to node 'cloud1-1203'
(10.255.0.22)
Dec 21 11:41:59 copying disk images
Dec 21 11:41:59 starting VM 100 on remote node 'cloud1-1203'
Dec 21 11:42:02 starting migration tunnel
Dec 21 11:42:03 starting online/live migration on port 6
Dec 21 11:42:03 migrate-set-capabilities, capabilities =
[HASH(0x39a9fb0)]
Dec 21 11:42:03 migrate-set-cache-size, value = 429496729
Dec 21 11:42:03 start migrate tcp:localhost:6
Dec 21 11:42:05 query-migrate
Dec 21 11:42:05 migration status: active (transferred 468063329,
remaining 3764068352), total 4303814656)
Dec 21 11:42:07 query-migrate

I can't even ping the VM anymore.
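When it hangs like this it may help to check whether QEMU still answers on its monitor socket at all; a rough sketch, assuming the PVE QMP socket path /var/run/qemu-server/<vmid>.qmp and that socat is installed (both assumptions):

echo '{"execute":"qmp_capabilities"} {"execute":"query-migrate"}' \
    | socat - UNIX-CONNECT:/var/run/qemu-server/100.qmp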

Stefan

Am 21.12.2012 08:58, schrieb Alexandre DERUMIER:

Hi Stefan, any news ?

I'm trying to reproduce your problem, but it works fine for me, no
crash...

- Mail original -

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag
À: Alexandre DERUMIER aderum...@odiso.com
Cc: pve-devel@pve.proxmox.com
Envoyé: Jeudi 20 Décembre 2012 16:09:42
Objet: Re: [pve-devel] migration problems since qemu 1.3

Hi,
Am 20.12.2012 15:57, schrieb Alexandre DERUMIER:

Just an idea (not sure it's the problem), can you try to comment

$qmpclient->queue_cmd($vmid, $ballooncb, 'query-balloon');

in QemuServer.pm, line 2081.

and restart pvedaemon && pvestatd ?


This doesn't change anything.

Right now the kvm process is running on the old and the new machine.

An strace on the pid on the new machine shows a loop of:


[pid 28351] ... futex resumed ) = -1 ETIMEDOUT (Connection timed
out)
[pid 28351] futex(0x7ff8b8025388, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 28351] futex(0x7ff8b8026024,
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11801, {1356016143,
843092000},  unfinished ...
[pid 28285] mremap(0x7ff77bfe4000, 160378880, 160411648, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160411648, 160448512, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160448512, 160481280, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160481280, 160514048, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160514048, 160546816, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160546816, 160583680, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160583680, 160616448, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160616448, 160649216, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160649216, 160681984, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160681984, 160718848, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160718848, 160751616, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160751616, 160784384, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160784384, 160817152, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160817152, 160854016, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160854016, 160886784, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160886784, 160919552, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160919552, 160952320, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160952320, 160989184, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160989184, 161021952, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161021952, 161054720, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161054720, 161087488, MREMAP_MAYMOVE)
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161087488, 161124352, MREMAP_MAYMOVE

Re: [pve-devel] migration problems since qemu 1.3

2012-12-21 Thread Alexandre DERUMIER
Hi Stefan,

I'll try to reproduce it, maybe qemu-devel can help too ?

I'll be offline until 26/12 (christmas).

Merry Xmas to all.

Alexandre

- Mail original - 

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
À: Alexandre DERUMIER aderum...@odiso.com 
Cc: pve-devel@pve.proxmox.com 
Envoyé: Vendredi 21 Décembre 2012 14:51:54 
Objet: Re: [pve-devel] migration problems since qemu 1.3 

Hi, 

even more news. The kvm is responsive again after cancelling the 
migration and waiting around 1-2 minutes. 

During these two minutes the kvm process on the source host is 
running at 100% CPU. 

Greets, 
Stefan 
Am 21.12.2012 14:46, schrieb Stefan Priebe - Profihost AG: 
 
 This time it hangs at the first query-migrate: 
 -- 
 Dec 21 14:44:43 starting migration of VM 100 to node 'cloud1-1203' 
 (10.255.0.22) 
 Dec 21 14:44:43 copying disk images 
 Dec 21 14:44:43 starting VM 100 on remote node 'cloud1-1203' 
 Dec 21 14:44:46 starting migration tunnel 
 Dec 21 14:44:46 starting online/live migration on port 6 
 Dec 21 14:44:46 migrate-set-capabilities, capabilities = [HASH(0x3933ed0)] 
 Dec 21 14:44:46 migrate-set-cache-size, value = 429496729 
 Dec 21 14:44:46 start migrate tcp:localhost:6 
 Dec 21 14:44:48 query-migrate 
 --- 
 
 I can reproduce this by assigning min. 4GB of memory to a machine and then 
 filling the buffers and cache by: 
 
 find / -type f -print | xargs cat > /dev/null 
 
 And then start a migrate. 
 
 Stefan 
 Am 21.12.2012 11:43, schrieb Stefan Priebe - Profihost AG: 
 Hi Alexandre, 
 
 i've added some debugging / logging code. 
 
 The output stops / hangs at query migrate. See here: 
 
 Dec 21 11:41:59 starting migration of VM 100 to node 'cloud1-1203' 
 (10.255.0.22) 
 Dec 21 11:41:59 copying disk images 
 Dec 21 11:41:59 starting VM 100 on remote node 'cloud1-1203' 
 Dec 21 11:42:02 starting migration tunnel 
 Dec 21 11:42:03 starting online/live migration on port 6 
 Dec 21 11:42:03 migrate-set-capabilities, capabilities = 
 [HASH(0x39a9fb0)] 
 Dec 21 11:42:03 migrate-set-cache-size, value = 429496729 
 Dec 21 11:42:03 start migrate tcp:localhost:6 
 Dec 21 11:42:05 query-migrate 
 Dec 21 11:42:05 migration status: active (transferred 468063329, 
 remaining 3764068352), total 4303814656) 
 Dec 21 11:42:07 query-migrate 
 
 I can't even ping the VM anymore. 
 
 Stefan 
 
 Am 21.12.2012 08:58, schrieb Alexandre DERUMIER: 
 Hi Stefan, any news ? 
 
 I'm trying to reproduce your problem, but it works fine for me, no 
 crash... 
 
 - Mail original - 
 
 De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
 À: Alexandre DERUMIER aderum...@odiso.com 
 Cc: pve-devel@pve.proxmox.com 
 Envoyé: Jeudi 20 Décembre 2012 16:09:42 
 Objet: Re: [pve-devel] migration problems since qemu 1.3 
 
 Hi, 
 Am 20.12.2012 15:57, schrieb Alexandre DERUMIER: 
 Just an idea (not sure it's the problem), can you try to comment 
 
 $qmpclient->queue_cmd($vmid, $ballooncb, 'query-balloon'); 
 
 in QemuServer.pm, line 2081. 
 
 and restart pvedaemon && pvestatd ? 
 
 This doesn't change anything. 
 
 Right now the kvm process is running on the old and the new machine. 
 
 An strace on the pid on the new machine shows a loop of: 
 
  
 [pid 28351] ... futex resumed ) = -1 ETIMEDOUT (Connection timed 
 out) 
 [pid 28351] futex(0x7ff8b8025388, FUTEX_WAKE_PRIVATE, 1) = 0 
 [pid 28351] futex(0x7ff8b8026024, 
 FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11801, {1356016143, 
 843092000},  unfinished ... 
 [pid 28285] mremap(0x7ff77bfe4000, 160378880, 160411648, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160411648, 160448512, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160448512, 160481280, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160481280, 160514048, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160514048, 160546816, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160546816, 160583680, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160583680, 160616448, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160616448, 160649216, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160649216, 160681984, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160681984, 160718848, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160718848, 160751616, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160751616, 160784384, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160784384, 160817152, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160817152, 160854016, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid 28285] mremap(0x7ff77bfe4000, 160854016, 160886784, MREMAP_MAYMOVE) 
 = 0x7ff77bfe4000 
 [pid

[pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Stefan Priebe - Profihost AG

Hello list,

I have massive migration problems since switching to qemu 1.3. Mostly the 
migration just hangs, never finishes, and suddenly the VM is just dead / 
not running anymore.


Has anybody seen this too?

Greets,
Stefan
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Alexandre DERUMIER
Do you mean migration between qemu-kvm 1.2 -> qemu 1.3 ?

(because it's not supported)

or migration between qemu 1.3 -> qemu 1.3 ?


also, I don't know if it's related, but in the changelog:
http://wiki.qemu.org/ChangeLog/1.3

Live Migration, Save/Restore
The stop and cont commands have new semantics on the destination machine 
during migration. Previously, the outcome depended on whether the commands were 
issued before or after the source connected to the destination QEMU: in 
particular, cont would fail if issued before connection, and undo the 
effect of the -S command-line option if issued after. Starting from this 
version, the effect of stop and cont will always take place at the end of 
migration (overriding the presence or absence of the -S option) and cont will 
never fail. This change should be transparent, since the old behavior was 
usually subject to a race condition.


- Mail original - 

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
À: pve-devel@pve.proxmox.com 
Envoyé: Jeudi 20 Décembre 2012 09:46:01 
Objet: [pve-devel] migration problems since qemu 1.3 

Hello list, 

I have massive migration problems since switching to qemu 1.3. Mostly the 
migration just hangs, never finishes, and suddenly the VM is just dead / 
not running anymore. 

Has anybody seen this too? 

Greets, 
Stefan 
___ 
pve-devel mailing list 
pve-devel@pve.proxmox.com 
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel 
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Stefan Priebe - Profihost AG

Hi Alexandre,
Am 20.12.2012 09:50, schrieb Alexandre DERUMIER:

Do you mean migration between qemu-kvm 1.2 -> qemu 1.3 ?

No.


or migration between qemu 1.3 -> qemu 1.3 ?
Yes. It works fine with NEWLY started VMs, but if the VMs are running 
more than 1-3 days it stops working and the VMs just crash during 
migration.



also,I don't know if it's related, but in the changelog:
http://wiki.qemu.org/ChangeLog/1.3

Live Migration, Save/Restore
The stop and cont commands have new semantics on the destination machine during migration. Previously, the outcome depended 
on whether the commands were issued before or after the source connected to the destination QEMU: in particular, cont would fail if 
issued before connection, and undo the effect of the -S command-line option if issued after. Starting from this version, the effect of 
stop and cont will always take place at the end of migration (overriding the presence or absence of the -S option) and 
cont will never fail. This change should be transparent, since the old behavior was usually subject to a race condition.
I don't see how this could result in a crash of the whole VM and a 
migration not working at all.


Greets,
Stefan
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Alexandre DERUMIER
with last git, I think it's related to balloon driver enabled by default, and 
qmp command send (see my previous mail).


can you try to replace (in QemuServer.pm)

if (!defined($conf->{balloon}) || $conf->{balloon}) {
    vm_mon_cmd($vmid, "balloon", value => $conf->{balloon}*1024*1024)
        if $conf->{balloon};

    vm_mon_cmd($vmid, 'qom-set',
               path => "machine/peripheral/balloon0",
               property => "stats-polling-interval",
               value => 2);
}

by

if (!defined($conf->{balloon}) || $conf->{balloon}) {
    vm_mon_cmd_nocheck($vmid, "balloon", value => $conf->{balloon}*1024*1024)
        if $conf->{balloon};

    vm_mon_cmd_nocheck($vmid, 'qom-set',
                       path => "machine/peripheral/balloon0",
                       property => "stats-polling-interval",
                       value => 2);
}


(vm_mon_cmd_nocheck)

- Mail original - 

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
À: Alexandre DERUMIER aderum...@odiso.com 
Cc: pve-devel@pve.proxmox.com 
Envoyé: Jeudi 20 Décembre 2012 11:48:06 
Objet: Re: [pve-devel] migration problems since qemu 1.3 

Hi, 

Am 20.12.2012 10:04, schrieb Alexandre DERUMIER: 
 Yes. It works fine with NEWLY started VMs, but if the VMs are running 
 more than 1-3 days it stops working and the VMs just crash during 
 migration. 
 Maybe VMs running since 1-3 days have more memory in use, so it takes more time to 
 live migrate. 

I see totally different outputs - the vm crashes and the status output 
stops. 

with git from yesterday i'm just getting this: 
-- 
Dec 20 11:34:21 starting migration of VM 100 to node 'cloud1-1203' 
(10.255.0.22) 
Dec 20 11:34:21 copying disk images 
Dec 20 11:34:21 starting VM 100 on remote node 'cloud1-1203' 
Dec 20 11:34:23 ERROR: online migrate failure - command '/usr/bin/ssh -o 
'BatchMode=yes' root@10.255.0.22 qm start 100 --stateuri tcp --skiplock 
--migratedfrom cloud1-1202' failed: exit code 255 
Dec 20 11:34:23 aborting phase 2 - cleanup resources 
Dec 20 11:34:24 ERROR: migration finished with problems (duration 00:00:03) 
TASK ERROR: migration problems 
-- 
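To see the real error behind that exit code 255, the remote start command quoted in the log can simply be re-run by hand and its output inspected:

ssh -o 'BatchMode=yes' root@10.255.0.22 qm start 100 --stateuri tcp --skiplock --migratedfrom cloud1-1202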


 Does it crash at start of the migration ? or in the middle of the migration ? 

At the beginning mostly i see no more output after: 
migration listens on port 6 


 what is your vm conf ? (memory size, storage ?) 
2GB mem, RBD / Ceph Storage 

Stefan 
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Stefan Priebe - Profihost AG

Hi,

at least migration works at all ;-) I'll wait until tomorrow and test 
again. I've restarted all VMs with latest pve-qemu-kvm.


Thanks!

Am 20.12.2012 11:57, schrieb Alexandre DERUMIER:

with last git, I think it's related to balloon driver enabled by default, and 
qmp command send (see my previous mail).


can you try to replace (in QemuServer.pm)

 if (!defined($conf-{balloon}) || $conf-{balloon}) {
 vm_mon_cmd($vmid, balloon, value = $conf-{balloon}*1024*1024)
 if $conf-{balloon};

 vm_mon_cmd($vmid, 'qom-set',
path = machine/peripheral/balloon0,
property = stats-polling-interval,
value = 2);
 }

by

 if (!defined($conf-{balloon}) || $conf-{balloon}) {
 vm_mon_cmd_nocheck($vmid, balloon, value = 
$conf-{balloon}*1024*1024)
 if $conf-{balloon};

 vm_mon_cmd_nocheck($vmid, 'qom-set',
path = machine/peripheral/balloon0,
property = stats-polling-interval,
value = 2);
 }


(vm_mon_cmd_nocheck)

- Mail original -

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag
À: Alexandre DERUMIER aderum...@odiso.com
Cc: pve-devel@pve.proxmox.com
Envoyé: Jeudi 20 Décembre 2012 11:48:06
Objet: Re: [pve-devel] migration problems since qemu 1.3

Hi,

Am 20.12.2012 10:04, schrieb Alexandre DERUMIER:

Yes. It works fine with NEWLY started VMs, but if the VMs are running
more than 1-3 days it stops working and the VMs just crash during
migration.

Maybe VMs running since 1-3 days have more memory in use, so it takes more time to 
live migrate.


I see totally different outputs - the vm crashes and the status output
stops.

with git from yesterday i'm just getting this:
--
Dec 20 11:34:21 starting migration of VM 100 to node 'cloud1-1203'
(10.255.0.22)
Dec 20 11:34:21 copying disk images
Dec 20 11:34:21 starting VM 100 on remote node 'cloud1-1203'
Dec 20 11:34:23 ERROR: online migrate failure - command '/usr/bin/ssh -o
'BatchMode=yes' root@10.255.0.22 qm start 100 --stateuri tcp --skiplock
--migratedfrom cloud1-1202' failed: exit code 255
Dec 20 11:34:23 aborting phase 2 - cleanup resources
Dec 20 11:34:24 ERROR: migration finished with problems (duration 00:00:03)
TASK ERROR: migration problems
--



Does it crash at start of the migration ? or in the middle of the migration ?


At the beginning mostly i see no more output after:
migration listens on port 6



what is your vm conf ? (memory size, storage ?)

2GB mem, RBD / Ceph Storage

Stefan


___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Alexandre DERUMIER
i had it again. 
Have you applied the fix from today about ballooning ?
https://git.proxmox.com/?p=qemu-server.git;a=commit;h=95381ce06cea266d40911a7129da6067a1640cbf
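One quick way to check whether that qemu-server commit is actually installed on both nodes is to compare the package versions; pveversion -v is standard PVE tooling, while the exact version string that carries the fix is left to be looked up:

pveversion -v | grep qemu-server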

I even cannot connect anymore through console to this VM. 

mmm, seems that something breaks qmp on the source vm... 
Is the source vm running ? (is ssh working?)




- Mail original - 

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
À: Alexandre DERUMIER aderum...@odiso.com 
Cc: pve-devel@pve.proxmox.com 
Envoyé: Jeudi 20 Décembre 2012 15:27:53 
Objet: Re: [pve-devel] migration problems since qemu 1.3 

Hi, 

i had it again. 

Migration hangs at: 
Dec 20 15:23:03 starting migration of VM 107 to node 'cloud1-1202' 
(10.255.0.20) 
Dec 20 15:23:03 copying disk images 
Dec 20 15:23:03 starting VM 107 on remote node 'cloud1-1202' 
Dec 20 15:23:06 starting migration tunnel 
Dec 20 15:23:06 starting online/live migration on port 6 

I even cannot connect anymore through console to this VM. 

Stefan 

Am 20.12.2012 12:31, schrieb Stefan Priebe - Profihost AG: 
 Hi, 
 
 at least migration works at all ;-) I'll wait until tomorrow and test 
 again. I've restarted all VMs with latest pve-qemu-kvm. 
 
 Thanks! 
 
 Am 20.12.2012 11:57, schrieb Alexandre DERUMIER: 
 with last git, I think it's related to balloon driver enabled by 
 default, and qmp command send (see my previous mail). 
 
 
 can you try to replace (in QemuServer.pm) 
 
 if (!defined($conf->{balloon}) || $conf->{balloon}) { 
     vm_mon_cmd($vmid, "balloon", value => $conf->{balloon}*1024*1024) 
         if $conf->{balloon}; 
 
     vm_mon_cmd($vmid, 'qom-set', 
                path => "machine/peripheral/balloon0", 
                property => "stats-polling-interval", 
                value => 2); 
 } 
 
 by 
 
 if (!defined($conf->{balloon}) || $conf->{balloon}) { 
     vm_mon_cmd_nocheck($vmid, "balloon", value => $conf->{balloon}*1024*1024) 
         if $conf->{balloon}; 
 
     vm_mon_cmd_nocheck($vmid, 'qom-set', 
                        path => "machine/peripheral/balloon0", 
                        property => "stats-polling-interval", 
                        value => 2); 
 } 
 
 
 (vm_mon_cmd_nocheck) 
 
 - Mail original - 
 
 De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
 À: Alexandre DERUMIER aderum...@odiso.com 
 Cc: pve-devel@pve.proxmox.com 
 Envoyé: Jeudi 20 Décembre 2012 11:48:06 
 Objet: Re: [pve-devel] migration problems since qemu 1.3 
 
 Hi, 
 
 Am 20.12.2012 10:04, schrieb Alexandre DERUMIER: 
 Yes. It works fine with NEWLY started VMs, but if the VMs are running 
 more than 1-3 days it stops working and the VMs just crash during 
 migration. 
 Maybe VMs running since 1-3 days have more memory in use, so it takes more 
 time to live migrate. 
 
 I see totally different outputs - the vm crashes and the status output 
 stops. 
 
 with git from yesterday i'm just getting this: 
 -- 
 Dec 20 11:34:21 starting migration of VM 100 to node 'cloud1-1203' 
 (10.255.0.22) 
 Dec 20 11:34:21 copying disk images 
 Dec 20 11:34:21 starting VM 100 on remote node 'cloud1-1203' 
 Dec 20 11:34:23 ERROR: online migrate failure - command '/usr/bin/ssh -o 
 'BatchMode=yes' root@10.255.0.22 qm start 100 --stateuri tcp --skiplock 
 --migratedfrom cloud1-1202' failed: exit code 255 
 Dec 20 11:34:23 aborting phase 2 - cleanup resources 
 Dec 20 11:34:24 ERROR: migration finished with problems (duration 
 00:00:03) 
 TASK ERROR: migration problems 
 -- 
 
 
 Does it crash at start of the migration ? or in the middle of the 
 migration ? 
 
 At the beginning mostly i see no more output after: 
 migration listens on port 6 
 
 
 what is your vm conf ? (memory size, storage ?) 
 2GB mem, RBD / Ceph Storage 
 
 Stefan 
 
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Stefan Priebe - Profihost AG

Hi,
Am 20.12.2012 15:49, schrieb Alexandre DERUMIER:

i had it again.

Have you applied the fix from today about ballooning ?
https://git.proxmox.com/?p=qemu-server.git;a=commit;h=95381ce06cea266d40911a7129da6067a1640cbf


Yes.


I even cannot connect anymore through console to this VM.


mmm, seems that something breaks qmp on the source vm...
Is the source vm running ? (is ssh working?)
It is marked as running, the kvm process is still there. But no service 
is running anymore - so I cannot even connect via ssh anymore.


Stefan


- Mail original -

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag
À: Alexandre DERUMIER aderum...@odiso.com
Cc: pve-devel@pve.proxmox.com
Envoyé: Jeudi 20 Décembre 2012 15:27:53
Objet: Re: [pve-devel] migration problems since qemu 1.3

Hi,

i had it again.

Migration hangs at:
Dec 20 15:23:03 starting migration of VM 107 to node 'cloud1-1202'
(10.255.0.20)
Dec 20 15:23:03 copying disk images
Dec 20 15:23:03 starting VM 107 on remote node 'cloud1-1202'
Dec 20 15:23:06 starting migration tunnel
Dec 20 15:23:06 starting online/live migration on port 6

I even cannot connect anymore through console to this VM.

Stefan

Am 20.12.2012 12:31, schrieb Stefan Priebe - Profihost AG:

Hi,

at least migration works at all ;-) I'll wait until tomorrow and test
again. I've restarted all VMs with latest pve-qemu-kvm.

Thanks!

Am 20.12.2012 11:57, schrieb Alexandre DERUMIER:

with last git, I think it's related to balloon driver enabled by
default, and qmp command send (see my previous mail).


can you try to replace (in QemuServer.pm)

if (!defined($conf->{balloon}) || $conf->{balloon}) {
    vm_mon_cmd($vmid, "balloon", value => $conf->{balloon}*1024*1024)
        if $conf->{balloon};

    vm_mon_cmd($vmid, 'qom-set',
               path => "machine/peripheral/balloon0",
               property => "stats-polling-interval",
               value => 2);
}

by

if (!defined($conf->{balloon}) || $conf->{balloon}) {
    vm_mon_cmd_nocheck($vmid, "balloon", value => $conf->{balloon}*1024*1024)
        if $conf->{balloon};

    vm_mon_cmd_nocheck($vmid, 'qom-set',
                       path => "machine/peripheral/balloon0",
                       property => "stats-polling-interval",
                       value => 2);
}


(vm_mon_cmd_nocheck)

- Mail original -

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag
À: Alexandre DERUMIER aderum...@odiso.com
Cc: pve-devel@pve.proxmox.com
Envoyé: Jeudi 20 Décembre 2012 11:48:06
Objet: Re: [pve-devel] migration problems since qemu 1.3

Hi,

Am 20.12.2012 10:04, schrieb Alexandre DERUMIER:

Yes. It works fine with NEWLY started VMs, but if the VMs are running
more than 1-3 days it stops working and the VMs just crash during
migration.

Maybe VMs running since 1-3 days have more memory in use, so it takes more
time to live migrate.


I see totally different outputs - the vm crashes and the status output
stops.

with git from yesterday i'm just getting this:
--
Dec 20 11:34:21 starting migration of VM 100 to node 'cloud1-1203'
(10.255.0.22)
Dec 20 11:34:21 copying disk images
Dec 20 11:34:21 starting VM 100 on remote node 'cloud1-1203'
Dec 20 11:34:23 ERROR: online migrate failure - command '/usr/bin/ssh -o
'BatchMode=yes' root@10.255.0.22 qm start 100 --stateuri tcp --skiplock
--migratedfrom cloud1-1202' failed: exit code 255
Dec 20 11:34:23 aborting phase 2 - cleanup resources
Dec 20 11:34:24 ERROR: migration finished with problems (duration
00:00:03)
TASK ERROR: migration problems
--



Does it crash at start of the migration ? or in the middle of the
migration ?


At the beginning mostly i see no more output after:
migration listens on port 6



what is your vm conf ? (memory size, storage ?)

2GB mem, RBD / Ceph Storage

Stefan


___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Alexandre DERUMIER
Just an idea (not sure it's the problem), can you try to comment

$qmpclient->queue_cmd($vmid, $ballooncb, 'query-balloon');

in QemuServer.pm, line 2081.

and restart pvedaemon && pvestatd ?
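A concrete way to apply that suggestion, assuming QemuServer.pm is installed at /usr/share/perl5/PVE/QemuServer.pm on the node and the services use their usual init script names (both assumptions):

# comment out the query-balloon queue_cmd call (verify the line number first)
sed -i '2081 s/^/#/' /usr/share/perl5/PVE/QemuServer.pm
service pvedaemon restart && service pvestatd restart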



- Mail original - 

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
À: Alexandre DERUMIER aderum...@odiso.com 
Cc: pve-devel@pve.proxmox.com 
Envoyé: Jeudi 20 Décembre 2012 15:50:38 
Objet: Re: [pve-devel] migration problems since qemu 1.3 

Hi, 
Am 20.12.2012 15:49, schrieb Alexandre DERUMIER: 
 i had it again. 
 Have you applied the fix from today about ballooning ? 
 https://git.proxmox.com/?p=qemu-server.git;a=commit;h=95381ce06cea266d40911a7129da6067a1640cbf
  

Yes. 

 I even cannot connect anymore through console to this VM. 
 
 mmm, seems that something breaks qmp on the source vm... 
 Is the source vm running ? (is ssh working?) 
It is marked as running, the kvm process is still there. But no service 
is running anymore - so I cannot even connect via ssh anymore. 

Stefan 

 - Mail original - 
 
 De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
 À: Alexandre DERUMIER aderum...@odiso.com 
 Cc: pve-devel@pve.proxmox.com 
 Envoyé: Jeudi 20 Décembre 2012 15:27:53 
 Objet: Re: [pve-devel] migration problems since qemu 1.3 
 
 Hi, 
 
 i had it again. 
 
 Migration hangs at: 
 Dec 20 15:23:03 starting migration of VM 107 to node 'cloud1-1202' 
 (10.255.0.20) 
 Dec 20 15:23:03 copying disk images 
 Dec 20 15:23:03 starting VM 107 on remote node 'cloud1-1202' 
 Dec 20 15:23:06 starting migration tunnel 
 Dec 20 15:23:06 starting online/live migration on port 6 
 
 I even cannot connect anymore through console to this VM. 
 
 Stefan 
 
 Am 20.12.2012 12:31, schrieb Stefan Priebe - Profihost AG: 
 Hi, 
 
 at least migration works at all ;-) I'll wait until tomorrow and test 
 again. I've restarted all VMs with latest pve-qemu-kvm. 
 
 Thanks! 
 
 Am 20.12.2012 11:57, schrieb Alexandre DERUMIER: 
 with last git, I think it's related to balloon driver enabled by 
 default, and qmp command send (see my previous mail). 
 
 
 can you try to replace (in QemuServer.pm) 
 
 if (!defined($conf->{balloon}) || $conf->{balloon}) { 
     vm_mon_cmd($vmid, "balloon", value => $conf->{balloon}*1024*1024) 
         if $conf->{balloon}; 
 
     vm_mon_cmd($vmid, 'qom-set', 
                path => "machine/peripheral/balloon0", 
                property => "stats-polling-interval", 
                value => 2); 
 } 
 
 by 
 
 if (!defined($conf->{balloon}) || $conf->{balloon}) { 
     vm_mon_cmd_nocheck($vmid, "balloon", value => $conf->{balloon}*1024*1024) 
         if $conf->{balloon}; 
 
     vm_mon_cmd_nocheck($vmid, 'qom-set', 
                        path => "machine/peripheral/balloon0", 
                        property => "stats-polling-interval", 
                        value => 2); 
 } 
 
 
 (vm_mon_cmd_nocheck) 
 
 - Mail original - 
 
 De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
 À: Alexandre DERUMIER aderum...@odiso.com 
 Cc: pve-devel@pve.proxmox.com 
 Envoyé: Jeudi 20 Décembre 2012 11:48:06 
 Objet: Re: [pve-devel] migration problems since qemu 1.3 
 
 Hi, 
 
 Am 20.12.2012 10:04, schrieb Alexandre DERUMIER: 
 Yes. It works fine with NEWLY started VMs, but if the VMs are running 
 more than 1-3 days it stops working and the VMs just crash during 
 migration. 
 Maybe VMs running since 1-3 days have more memory in use, so it takes more 
 time to live migrate. 
 
 I see totally different outputs - the vm crashes and the status output 
 stops. 
 
 with git from yesterday i'm just getting this: 
 -- 
 Dec 20 11:34:21 starting migration of VM 100 to node 'cloud1-1203' 
 (10.255.0.22) 
 Dec 20 11:34:21 copying disk images 
 Dec 20 11:34:21 starting VM 100 on remote node 'cloud1-1203' 
 Dec 20 11:34:23 ERROR: online migrate failure - command '/usr/bin/ssh -o 
 'BatchMode=yes' root@10.255.0.22 qm start 100 --stateuri tcp --skiplock 
 --migratedfrom cloud1-1202' failed: exit code 255 
 Dec 20 11:34:23 aborting phase 2 - cleanup resources 
 Dec 20 11:34:24 ERROR: migration finished with problems (duration 
 00:00:03) 
 TASK ERROR: migration problems 
 -- 
 
 
 Does it crash at start of the migration ? or in the middle of the 
 migration ? 
 
 At the beginning mostly i see no more output after: 
 migration listens on port 6 
 
 
 what is your vm conf ? (memory size, storage ?) 
 2GB mem, RBD / Ceph Storage 
 
 Stefan 
 
___
pve-devel mailing list
pve-devel@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Stefan Priebe - Profihost AG

Hi,
Am 20.12.2012 15:57, schrieb Alexandre DERUMIER:

Just an idea (not sure it's the problem), can you try to comment

$qmpclient->queue_cmd($vmid, $ballooncb, 'query-balloon');

in QemuServer.pm, line 2081.

and restart pvedaemon && pvestatd ?


This doesn't change anything.

Right now the kvm process is running on the old and the new machine.

An strace on the pid on the new machine shows a loop of:


[pid 28351] ... futex resumed )   = -1 ETIMEDOUT (Connection timed 
out)

[pid 28351] futex(0x7ff8b8025388, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 28351] futex(0x7ff8b8026024, 
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11801, {1356016143, 
843092000},  unfinished ...
[pid 28285] mremap(0x7ff77bfe4000, 160378880, 160411648, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160411648, 160448512, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160448512, 160481280, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160481280, 160514048, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160514048, 160546816, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160546816, 160583680, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160583680, 160616448, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160616448, 160649216, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160649216, 160681984, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160681984, 160718848, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160718848, 160751616, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160751616, 160784384, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160784384, 160817152, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160817152, 160854016, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160854016, 160886784, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160886784, 160919552, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160919552, 160952320, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160952320, 160989184, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 160989184, 161021952, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161021952, 161054720, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161054720, 161087488, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161087488, 161124352, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161124352, 161157120, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161157120, 161189888, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161189888, 161222656, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161222656, 161259520, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161259520, 161292288, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161292288, 161325056, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28351] ... futex resumed )   = -1 ETIMEDOUT (Connection timed 
out)

[pid 28351] futex(0x7ff8b8025388, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 28351] futex(0x7ff8b8026024, 
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11803, {1356016144, 
843283000},  unfinished ...
[pid 28285] mremap(0x7ff77bfe4000, 161325056, 161357824, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161357824, 161394688, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161394688, 161427456, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161427456, 161460224, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28345] ... restart_syscall resumed ) = -1 ETIMEDOUT (Connection 
timed out)
[pid 28345] futex(0x7ff8caa2e274, FUTEX_CMP_REQUEUE_PRIVATE, 1, 
2147483647, 0x7ff8caa2e1b0, 872) = 1

[pid 28347] ... futex resumed )   = 0
[pid 28345] futex(0x7ff8caa241a8, FUTEX_WAKE_PRIVATE, 1 unfinished ...
[pid 28347] futex(0x7ff8caa2e1b0, FUTEX_WAKE_PRIVATE, 1 unfinished ...
[pid 28345] ... futex resumed )   = 0
[pid 28347] ... futex resumed )   = 0
[pid 28345] futex(0x7ff8caa2420c, 
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 799, {1356016153, 
954319000},  unfinished ...
[pid 28347] sendmsg(19, {msg_name(0)=NULL, msg_iov(1)=[{\t, 1}], 
msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 1
[pid 28347] futex(0x7ff8caa2e274, FUTEX_WAIT_PRIVATE, 873, NULL 
unfinished ...
[pid 28285] mremap(0x7ff77bfe4000, 161460224, 161492992, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161492992, 161529856, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161529856, 161562624, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000
[pid 28285] mremap(0x7ff77bfe4000, 161562624, 

Re: [pve-devel] migration problems since qemu 1.3

2012-12-20 Thread Alexandre DERUMIER
Hi Stefan, any news ?

I'm trying to reproduce your problem, but it works fine for me, no crash...

- Mail original - 

De: Stefan Priebe - Profihost AG s.pri...@profihost.ag 
À: Alexandre DERUMIER aderum...@odiso.com 
Cc: pve-devel@pve.proxmox.com 
Envoyé: Jeudi 20 Décembre 2012 16:09:42 
Objet: Re: [pve-devel] migration problems since qemu 1.3 

Hi, 
Am 20.12.2012 15:57, schrieb Alexandre DERUMIER: 
 Just an idea (not sure it's the problem), can you try to comment 
 
 $qmpclient->queue_cmd($vmid, $ballooncb, 'query-balloon'); 
 
 in QemuServer.pm, line 2081. 
 
 and restart pvedaemon && pvestatd ? 

This doesn't change anything. 

Right now the kvm process is running on the old and the new machine. 

An strace on the pid on the new machine shows a loop of: 

 
[pid 28351] ... futex resumed ) = -1 ETIMEDOUT (Connection timed 
out) 
[pid 28351] futex(0x7ff8b8025388, FUTEX_WAKE_PRIVATE, 1) = 0 
[pid 28351] futex(0x7ff8b8026024, 
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11801, {1356016143, 
843092000},  unfinished ... 
[pid 28285] mremap(0x7ff77bfe4000, 160378880, 160411648, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160411648, 160448512, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160448512, 160481280, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160481280, 160514048, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160514048, 160546816, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160546816, 160583680, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160583680, 160616448, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160616448, 160649216, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160649216, 160681984, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160681984, 160718848, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160718848, 160751616, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160751616, 160784384, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160784384, 160817152, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160817152, 160854016, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160854016, 160886784, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160886784, 160919552, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160919552, 160952320, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160952320, 160989184, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 160989184, 161021952, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161021952, 161054720, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161054720, 161087488, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161087488, 161124352, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161124352, 161157120, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161157120, 161189888, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161189888, 161222656, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161222656, 161259520, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161259520, 161292288, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161292288, 161325056, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28351] ... futex resumed ) = -1 ETIMEDOUT (Connection timed 
out) 
[pid 28351] futex(0x7ff8b8025388, FUTEX_WAKE_PRIVATE, 1) = 0 
[pid 28351] futex(0x7ff8b8026024, 
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 11803, {1356016144, 
843283000},  unfinished ... 
[pid 28285] mremap(0x7ff77bfe4000, 161325056, 161357824, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161357824, 161394688, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161394688, 161427456, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28285] mremap(0x7ff77bfe4000, 161427456, 161460224, MREMAP_MAYMOVE) 
= 0x7ff77bfe4000 
[pid 28345] ... restart_syscall resumed ) = -1 ETIMEDOUT (Connection 
timed out) 
[pid 28345] futex(0x7ff8caa2e274, FUTEX_CMP_REQUEUE_PRIVATE, 1, 
2147483647, 0x7ff8caa2e1b0, 872) = 1 
[pid 28347] ... futex resumed ) = 0 
[pid 28345] futex(0x7ff8caa241a8, FUTEX_WAKE_PRIVATE, 1 unfinished ... 
[pid 28347] futex(0x7ff8caa2e1b0, FUTEX_WAKE_PRIVATE, 1 unfinished ... 
[pid 28345] ... futex resumed ) = 0 
[pid 28347] ... futex resumed ) = 0 
[pid 28345] futex(0x7ff8caa2420c, 
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 799, {1356016153, 
954319000},  unfinished ... 
[pid 28347] sendmsg(19, {msg_name(0)=NULL, msg_iov(1)=[{\t, 1}], 
msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 1 
[pid 28347] futex