On Wed, Feb 26, 2014 at 09:19:27PM +0400, Vitaly Lavrov wrote:
> On 26.02.2014 19:41, Serge Hallyn wrote:
> >Quoting Vitaly Lavrov ([email protected]):
> >>On 25.02.2014 22:54, Serge Hallyn wrote:
> >>>Quoting Vitaly Lavrov ([email protected]):
> >>>>On 23.02.2014 03:36, Stéphane Graber wrote:
> >>>>>Hi,
> >>>>>
> >>>>>Thanks for your patch.
> >>>>>
> >>>>>Can I just ask you to sign it off? (Signed-off-by: Name <email>)
> >>>>Hi!
> >>>>
> >>>>I found the source of the problem with a reboot of the container, but do
> >>>>not know how best to fix it.
> >>>>We have a race condition between the end of the old container and the
> >>>>creation of the network interfaces in the new container. Insert
> >>>>usleep (100000) before lxc_delete_network() solves the problem with a
> >>>>reboot, but it's a bad way.
> >>>>
> >>>>How to wait until the completion of the container?
> >>>
> >>>How exactly are you doing the test? just script
> >>>
> >>>    lxc-start;
> >>>    lxc-stop;
> >>>    lxc-start;
> >>lxc-stop -rn container
> >
> >A-ha! Thanks. Yes, this is a bug in our reboot handling in
> >lxcapi_start(). I can reproduce it trivially with lxc-stop -r
> >on any container with lxc.network.type = phys.
>
> "lxc.network.type = phys" has another bug
>
> *** glibc detected *** lxc-start: realloc(): invalid pointer: 0x0948eed0 ***
> ======= Backtrace: =========
> /lib/libc.so.6(+0x7710b)[0xb756d10b]
> /lib/libc.so.6(realloc+0x2c5)[0xb75720b5]
> /usr/lib/liblxc.so.1(__lxc_start+0x5d2)[0xb76c0c12]
> /usr/lib/liblxc.so.1(lxc_start+0x4c)[0xb76c15ac]
> /usr/lib/liblxc.so.1(+0x42a2c)[0xb76eaa2c]
> lxc-start(main+0x267)[0x8048e07]
> /lib/libc.so.6(__libc_start_main+0xf5)[0xb750f5a5]
> lxc-start[0x8049245]
> ======= Memory map: ========
>
> src/lxc/start.c:753 save_phys_nics()
> -----------------------------------------------------------------------
>         conf->saved_nics = realloc(conf->saved_nics,
>                 (conf->num_savednics+1)*sizeof(struct saved_nic));
> -----------------------------------------------------------------------
>
> The patch is simple.
>
> --- src/lxc/conf.c.orig 2014-02-26 13:21:40.263953511 +0400
> +++ src/lxc/conf.c 2014-02-26 20:39:46.710074311 +0400
> @@ -2606,6 +2606,7 @@ void lxc_rename_phys_nics_on_shutdown(st
>         }
>         conf->num_savednics = 0;
>         free(conf->saved_nics);
> +       conf->saved_nics = NULL;
>  }
>
>  static char *default_rootfs_mount = LXCROOTFSMOUNT;
> @@ -4119,8 +4120,8 @@ static void lxc_clear_saved_nics(struct
>                 return;
>         for (i=0; i < conf->num_savednics; i++)
>                 free(conf->saved_nics[i].orig_name);
> -       conf->saved_nics = 0;
>         free(conf->saved_nics);
> +       conf->saved_nics = NULL;
>  }
>
>  void lxc_conf_free(struct lxc_conf *conf)
> --

That patch looks reasonable to me, can you send it separately to the
mailing-list including a commit message and Signed-off-by line so I can
already apply that one to master?

Thanks!

>
>
> But there is a more difficult problem.
>
> Function lxc_rename_phys_nics_on_shutdown() does not always work as it should.
>
> ------
> lxc-start 1393409939.368 INFO lxc_conf - running to reset 1 nic names
> lxc-start 1393409939.368 WARN lxc_conf - resetting nic 3 to eth2 failed: No such device
> ------
>
> I added a wait loop and debug printing and that's what got:
> -----
> lxc-start 1393433485.531 INFO lxc_conf - running to reset 1 nic names
> lxc-start 1393433485.532 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 1ms
> lxc-start 1393433485.533 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 2ms
> lxc-start 1393433485.534 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 3ms
> lxc-start 1393433485.536 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 4ms
> lxc-start 1393433485.537 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 5ms
> lxc-start 1393433485.538 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 6ms
> lxc-start 1393433485.539 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 7ms
> lxc-start 1393433485.540 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 8ms
> lxc-start 1393433485.541 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 9ms
> lxc-start 1393433485.542 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 10ms
> lxc-start 1393433485.543 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 11ms
> lxc-start 1393433485.544 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 12ms
> lxc-start 1393433485.545 WARN lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 13ms
> lxc-start 1393433485.562 INFO lxc_conf - resetting nic 3 to eth2, delay 14ms
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> If you wait 12-20 ms renaming network interface works.
> The same problem with vlan interface.
>
>
> >_______________________________________________
> >lxc-devel mailing list
> >[email protected]
> >http://lists.linuxcontainers.org/listinfo/lxc-devel
> >
>
> _______________________________________________
> lxc-devel mailing list
> [email protected]
> http://lists.linuxcontainers.org/listinfo/lxc-devel

-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
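
For reference, the realloc() abort quoted above comes from a pointer that is
freed but not reset, so the next realloc() on it receives a stale address.
A minimal sketch of the free-then-NULL idiom the patch applies follows. The
names struct nic_conf, save_nic() and clear_saved_nics() are hypothetical,
simplified stand-ins and not the actual LXC code; only the field names
saved_nics, num_savednics and orig_name are taken from the thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the structures named in the thread. */
struct saved_nic {
    int ifindex;
    char *orig_name;
};

struct nic_conf {
    struct saved_nic *saved_nics;
    int num_savednics;
};

/* Append one entry, using the same realloc() growth pattern as the quoted
 * save_phys_nics() snippet. Returns 0 on success, -1 on allocation failure. */
static int save_nic(struct nic_conf *conf, int ifindex, const char *name)
{
    struct saved_nic *tmp;
    size_t len;
    char *copy;

    assert(conf != NULL && name != NULL);

    len = strlen(name) + 1;
    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, name, len);

    /* realloc(NULL, n) behaves like malloc(n); a stale freed pointer does not. */
    tmp = realloc(conf->saved_nics,
                  ((size_t)conf->num_savednics + 1) * sizeof(*tmp));
    if (!tmp) {
        free(copy);
        return -1;
    }

    conf->saved_nics = tmp;
    conf->saved_nics[conf->num_savednics].ifindex = ifindex;
    conf->saved_nics[conf->num_savednics].orig_name = copy;
    conf->num_savednics++;
    return 0;
}

/* Free every entry, then the array, and reset both fields so that a later
 * save_nic() starts from a clean state. */
static void clear_saved_nics(struct nic_conf *conf)
{
    int i;

    assert(conf != NULL);

    for (i = 0; i < conf->num_savednics; i++)
        free(conf->saved_nics[i].orig_name);
    free(conf->saved_nics);
    conf->saved_nics = NULL;
    conf->num_savednics = 0;
}
```

With the pointer reset to NULL, a save, clear, save sequence (roughly what a
container reboot does) is safe. The second hunk of the quoted patch also fixes
the ordering: assigning 0 before free() turned the free() into a no-op and
leaked the array.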
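
The wait loop mentioned in the quoted log is not shown in the thread. As an
illustration only, a bounded retry of that general shape might look like the
sketch below. It assumes the failing operation reports -ENODEV and that
attempts are spaced about 1 ms apart, as the log timestamps suggest. The names
retry_while_nodev(), rename_fn and fake_rename() are hypothetical; this is not
LXC code and not a proposed fix for the underlying race.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback type: returns 0 on success, a negative errno on failure. */
typedef int (*rename_fn)(void *arg);

/*
 * Call op() until it stops returning -ENODEV or max_tries attempts have been
 * made, sleeping step_ms milliseconds between attempts.
 * Returns the result of the last op() call.
 */
static int retry_while_nodev(rename_fn op, void *arg, long step_ms, int max_tries)
{
    int ret = -ENODEV;
    int attempt;

    assert(op != NULL && step_ms >= 0 && max_tries > 0);

    for (attempt = 1; attempt <= max_tries; attempt++) {
        ret = op(arg);
        if (ret != -ENODEV)
            break;              /* success, or an error that waiting will not fix */

        if (attempt < max_tries) {
            struct timespec ts;

            ts.tv_sec = step_ms / 1000;
            ts.tv_nsec = (step_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);   /* an interrupted sleep just retries sooner */
        }
    }
    return ret;
}

/* Hypothetical stand-in for the rename call: fails until the counter reaches 0. */
static int fake_rename(void *arg)
{
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        return -ENODEV;
    }
    return 0;
}
```

A caller would pass the real rename operation in place of fake_rename() and
pick max_tries so that the total wait covers the 12-20 ms window reported
above. Polling like this only hides the race; the thread leaves open how to
wait properly for the old container to finish.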
