Yup, the problem got solved by increasing the timeout values of the stop and
monitor operations; I'm not sure now which one was the cause.
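For reference, this is roughly how the timeouts can be raised in the CIB — a sketch only, assuming a Heartbeat 2.x Xen primitive (the resource id, file path, and timeout values here are made-up examples, not my actual config):

```xml
<!-- Hypothetical example: per-operation timeouts on a Xen resource in a
     Heartbeat 2.x CIB. Ids, path, and values are illustrative only. -->
<primitive id="qclvmsles01" class="ocf" provider="heartbeat" type="Xen">
  <operations>
    <!-- give Xen plenty of time to shut the domain down -->
    <op id="qclvmsles01-stop" name="stop" timeout="120s"/>
    <op id="qclvmsles01-monitor" name="monitor" interval="10s" timeout="60s"/>
    <!-- migration ops can also need longer timeouts -->
    <op id="qclvmsles01-migrate_to" name="migrate_to" timeout="300s"/>
    <op id="qclvmsles01-migrate_from" name="migrate_from" timeout="300s"/>
  </operations>
  <instance_attributes id="qclvmsles01-ia">
    <attributes>
      <nvpair id="qclvmsles01-xmfile" name="xmfile"
              value="/etc/xen/vm/qclvmsles01"/>
    </attributes>
  </instance_attributes>
</primitive>
```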
I'm now having an issue with live migration: after a few times telling
Heartbeat to switch the resource from one node to another using my placement
constraints, I got a kernel panic in the VM...
I'm using SLES 10 SP1-RC4. I'll paste the output, just in case someone can
figure out what the problem is, though it's more of a Xen issue...
Once I got:
qclsles02:~ # xm console migrating-qclvmsles01
Unable to handle kernel NULL pointer dereference at 00000000000001d8 RIP:
<ffffffff801df29d>{blk_start_queue+4}
PGD 3a447067 PUD 3a469067 PMD 0
Oops: 0002 [1] SMP
last sysfs file: /class/net/eth0/statistics/collisions
CPU 0
Modules linked in: 8250 serial_core ipv6 nfs lockd nfs_acl sunrpc apparmor
aamatch_pcre loop dm_mod ext3 jbd xenblk xennet
Pid: 2748, comm: suspend Tainted: G U 2.6.16.46-0.10-xen #1
RIP: e030:[<ffffffff801df29d>] <ffffffff801df29d>{blk_start_queue+4}
RSP: e02b:ffff880013f89d98 EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880000650000 R08: 000000003b9ac878 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000002 R12: ffff880000b16848
R13: ffff880014e1da00 R14: ffff88003ffedd08 R15: ffff880014e1c000
FS: 00002ad048c41c90(0000) GS:ffffffff803a4000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process suspend (pid: 2748, threadinfo ffff880013f88000, task
ffff88002fdec140)
Stack: ffff880000650000 ffffffff8800c969 00000000ffffffff ffffffff8800cbec
ffff880000b16848 0000000080252db9 ffff8800006500d8 ffff880000b16800
0000000000000000 ffff880000b16848
Call Trace: <ffffffff8800c969>{:xenblk:kick_pending_request_queues+24}
<ffffffff8800cbec>{:xenblk:blkfront_resume+631}
<ffffffff80141c0a>{keventd_create_kthread+0}
<ffffffff8025377c>{resume_dev+88} <ffffffff8024ec47>{xen_suspend+0}
<ffffffff80253724>{resume_dev+0}
<ffffffff8024803d>{bus_for_each_dev+67}
<ffffffff8024ec47>{xen_suspend+0}
<ffffffff802536a1>{xenbus_resume+37}
<ffffffff8024f4a5>{__xen_suspend+1541}
<ffffffff80141c0a>{keventd_create_kthread+0}
<ffffffff8024ec50>{xen_suspend+9} <ffffffff8024ec47>{xen_suspend+0}
<ffffffff80141eae>{kthread+212} <ffffffff8010baba>{child_rip+8}
<ffffffff80141c0a>{keventd_create_kthread+0}
<ffffffff80141dda>{kthread+0}
<ffffffff8010bab2>{child_rip+0}
Code: f0 0f ba
And here's another one:
qclsles02:~ # xm console qclvmsles01
Unable to handle kernel NULL pointer dereference at 00000000000001d8 RIP:
<ffffffff801df29d>{blk_start_queue+4}
PGD 2db6d067 PUD 2db71067 PMD 0
Oops: 0002 [1] SMP
last sysfs file: /block/xvdc/removable
CPU 0
Modules linked in: 8250 serial_core ipv6 nfs lockd nfs_acl sunrpc apparmor
aamatch_pcre loop dm_mod ext3 jbd xenblk xennet
Pid: 3099, comm: suspend Tainted: G U 2.6.16.46-0.10-xen #1
RIP: e030:[<ffffffff801df29d>] <ffffffff801df29d>{blk_start_queue+4}
RSP: e02b:ffff88002bec3d98 EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88000063c000 R08: 000000003b9ac938 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000002 R12: ffff880000b16848
R13: ffff88003ad29a00 R14: ffff88003ffedd08 R15: ffff88003ad28000
FS: 00002b885f3c1fc0(0000) GS:ffffffff803a4000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process suspend (pid: 3099, threadinfo ffff88002bec2000, task
ffff88002d0d0080)
Stack: ffff88000063c000 ffffffff8800c969 00000000ffffffff ffffffff8800cbec
ffff880000b16848 0000000080252db9 ffff88000063c0d8 ffff880000b16800
0000000000000000 ffff880000b16848
Call Trace: <ffffffff8800c969>{:xenblk:kick_pending_request_queues+24}
<ffffffff8800cbec>{:xenblk:blkfront_resume+631}
<ffffffff80141c0a>{keventd_create_kthread+0}
<ffffffff8025377c>{resume_dev+88} <ffffffff8024ec47>{xen_suspend+0}
<ffffffff80253724>{resume_dev+0}
<ffffffff8024803d>{bus_for_each_dev+67}
<ffffffff8024ec47>{xen_suspend+0}
<ffffffff802536a1>{xenbus_resume+37}
<ffffffff8024f4a5>{__xen_suspend+1541}
<ffffffff80141c0a>{keventd_create_kthread+0}
<ffffffff8024ec50>{xen_suspend+9} <ffffffff8024ec47>{xen_suspend+0}
<ffffffff80141eae>{kthread+212} <ffffffff8010baba>{child_rip+8}
<ffffffff80141c0a>{keventd_create_kthread+0}
<ffffffff80141dda>{kthread+0}
<ffffffff8010bab2>{child_rip+0}
Code: f0 0f ba b7 d8 01 00 00 02 f0 0f ba af d8 01 00 00 06 19 c0
RIP <ffffffff801df29d>{blk_start_queue+4} RSP <ffff88002bec3d98>
CR2: 00000000000001d8
<6>Setting mem allocation to 1048576 kiB
Setting mem allocation to 1048576 kiB
Thanks for your replies, guys.
On 5/18/07, Lars Marowsky-Bree <[EMAIL PROTECTED]> wrote:
On 2007-05-18T10:38:10, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> > I have a problem when my resource migrates from one node to another:
> > during the migration, after a few minutes the resource goes into the
> > "not running" state, even though it's working fine. I mean the resource
> > does migrate from one node to the other; it's just as if the RA can't
> > wait for Xen to finish the migration, so it reports the resource as
> > "not running".
>
> it does this on the node it's migrating from?
Sounds like a timeout problem to me. Likely the timeouts for the
migrate_from and migrate_to ops should be increased in his environment.
Sincerely,
Lars
--
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
--
René Jr Purcell
Project lead, security and systems
Techno Centre Logiciels Libres, http://www.tc2l.ca/
Telephone: (418) 681-2929 #124