On 26.02.2014 19:41, Serge Hallyn wrote:
Quoting Vitaly Lavrov ([email protected]):
On 25.02.2014 22:54, Serge Hallyn wrote:
Quoting Vitaly Lavrov ([email protected]):
On 23.02.2014 03:36, Stéphane Graber wrote:
Hi,

Thanks for your patch.

Can I just ask you to sign it off? (Signed-off-by: Name <email>)
Hi!

I found the source of the problem with rebooting the container, but I do not
know how best to fix it.
There is a race condition between the shutdown of the old container and the
creation of the network interfaces in the new container. Inserting
usleep(100000) before lxc_delete_network() fixes the reboot, but that is a
poor solution.

How can we wait until the old container has fully exited?

How exactly are you doing the test?  Just a script like

        lxc-start;
        lxc-stop;
        lxc-start;
lxc-stop -rn container

A-ha!  Thanks.  Yes, this is a bug in our reboot handling in
lxcapi_start().  I can reproduce it trivially with lxc-stop -r
on any container with lxc.network.type = phys.

"lxc.network.type = phys" has another bug:

*** glibc detected *** lxc-start: realloc(): invalid pointer: 0x0948eed0 ***
======= Backtrace: =========
/lib/libc.so.6(+0x7710b)[0xb756d10b]
/lib/libc.so.6(realloc+0x2c5)[0xb75720b5]
/usr/lib/liblxc.so.1(__lxc_start+0x5d2)[0xb76c0c12]
/usr/lib/liblxc.so.1(lxc_start+0x4c)[0xb76c15ac]
/usr/lib/liblxc.so.1(+0x42a2c)[0xb76eaa2c]
lxc-start(main+0x267)[0x8048e07]
/lib/libc.so.6(__libc_start_main+0xf5)[0xb750f5a5]
lxc-start[0x8049245]
======= Memory map: ========

src/lxc/start.c:753 save_phys_nics()
-----------------------------------------------------------------------
        conf->saved_nics = realloc(conf->saved_nics,
                (conf->num_savednics+1)*sizeof(struct saved_nic));
-----------------------------------------------------------------------

The patch is simple.

--- src/lxc/conf.c.orig 2014-02-26 13:21:40.263953511 +0400
+++ src/lxc/conf.c      2014-02-26 20:39:46.710074311 +0400
@@ -2606,6 +2606,7 @@ void lxc_rename_phys_nics_on_shutdown(st
        }
        conf->num_savednics = 0;
        free(conf->saved_nics);
+       conf->saved_nics = NULL;
 }

 static char *default_rootfs_mount = LXCROOTFSMOUNT;
@@ -4119,8 +4120,8 @@ static void lxc_clear_saved_nics(struct
                return;
        for (i=0; i < conf->num_savednics; i++)
                free(conf->saved_nics[i].orig_name);
-       conf->saved_nics = 0;
        free(conf->saved_nics);
+       conf->saved_nics = NULL;
 }

 void lxc_conf_free(struct lxc_conf *conf)
--
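
For illustration, here is a minimal standalone sketch of why the pointer has
to be reset. It does not use the real LXC types: struct conf_sketch,
save_nic() and clear_saved_nics() are hypothetical stand-ins for
struct lxc_conf, save_phys_nics() and lxc_clear_saved_nics(). The point is
only that after free() the pointer is set to NULL, so a later realloc() sees
NULL (and behaves like malloc()) instead of a dangling pointer. Note also
that the old second hunk zeroed the pointer before free(), so the array was
never released.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the LXC structures (assumption: only the
 * fields needed for this sketch are modelled). */
struct saved_nic {
	int ifindex;
	char *orig_name;
};

struct conf_sketch {
	struct saved_nic *saved_nics;
	int num_savednics;
};

/* Append one entry, growing the array with realloc().
 * This is only safe if saved_nics is NULL or a live allocation. */
static int save_nic(struct conf_sketch *conf, int ifindex, const char *name)
{
	struct saved_nic *tmp;
	size_t len = strlen(name) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, name, len);

	tmp = realloc(conf->saved_nics,
		      (conf->num_savednics + 1) * sizeof(*tmp));
	if (!tmp) {
		free(copy);
		return -1;
	}
	conf->saved_nics = tmp;
	conf->saved_nics[conf->num_savednics].ifindex = ifindex;
	conf->saved_nics[conf->num_savednics].orig_name = copy;
	conf->num_savednics++;
	return 0;
}

/* Free the entries first, then the array, then reset the pointer and
 * the counter so the structure can be reused on the next start. */
static void clear_saved_nics(struct conf_sketch *conf)
{
	int i;

	for (i = 0; i < conf->num_savednics; i++)
		free(conf->saved_nics[i].orig_name);
	free(conf->saved_nics);
	conf->saved_nics = NULL;
	conf->num_savednics = 0;
}
```

With this ordering, a save / clear / save sequence (which is what a reboot
amounts to) never hands a freed pointer to realloc().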


But there is a harder problem.

The function lxc_rename_phys_nics_on_shutdown() does not always work as it
should.

------
lxc-start 1393409939.368 INFO     lxc_conf - running to reset 1 nic names
lxc-start 1393409939.368 WARN     lxc_conf - resetting nic 3 to eth2 failed: No such device
------

I added a wait loop with debug output, and this is what I got:
-----
lxc-start 1393433485.531 INFO     lxc_conf - running to reset 1 nic names
lxc-start 1393433485.532 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 1ms
lxc-start 1393433485.533 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 2ms
lxc-start 1393433485.534 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 3ms
lxc-start 1393433485.536 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 4ms
lxc-start 1393433485.537 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 5ms
lxc-start 1393433485.538 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 6ms
lxc-start 1393433485.539 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 7ms
lxc-start 1393433485.540 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 8ms
lxc-start 1393433485.541 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 9ms
lxc-start 1393433485.542 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 10ms
lxc-start 1393433485.543 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 11ms
lxc-start 1393433485.544 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 12ms
lxc-start 1393433485.545 WARN     lxc_conf - resetting nic 3 to eth2 failed: 'No such device', delay 13ms
lxc-start 1393433485.562 INFO     lxc_conf - resetting nic 3 to eth2, delay 14ms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Renaming the network interface succeeds once you wait 12-20 ms.
The same problem occurs with vlan interfaces.
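
For reference, a wait loop of this kind could look roughly like the sketch
below. This is not the code I actually ran, only an illustration under
assumptions: rename_with_retry() and the rename_fn callback are hypothetical
names, the callback is assumed to return a negative errno value (-ENODEV for
"No such device"), and the upper bound max_wait_ms is my own addition so the
loop cannot spin forever. fake_rename() is only a stand-in that simulates
the interface showing up late.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical callback type: tries the rename once and returns 0 on
 * success or a negative errno value on failure. */
typedef int (*rename_fn)(int ifindex, const char *newname, void *ctx);

/* Retry the rename in 1 ms steps while it fails with -ENODEV, giving up
 * after max_wait_ms sleeps. The number of sleeps is stored in
 * *waited_ms when that pointer is not NULL. */
static int rename_with_retry(rename_fn do_rename, void *ctx, int ifindex,
			     const char *newname, int max_wait_ms,
			     int *waited_ms)
{
	struct timespec one_ms = { 0, 1000000L };
	int waited = 0;
	int ret;

	for (;;) {
		ret = do_rename(ifindex, newname, ctx);
		if (ret != -ENODEV || waited >= max_wait_ms)
			break;
		nanosleep(&one_ms, NULL);
		waited++;
	}
	if (waited_ms)
		*waited_ms = waited;
	return ret;
}

/* Demo stand-in: fails with -ENODEV until the counter behind ctx
 * reaches zero, then succeeds. */
static int fake_rename(int ifindex, const char *newname, void *ctx)
{
	int *remaining = ctx;

	(void)ifindex;
	(void)newname;
	if (*remaining > 0) {
		(*remaining)--;
		return -ENODEV;
	}
	return 0;
}
```

Polling like this only hides the race; it would still be better to find out
what the rename is actually waiting for.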


_______________________________________________
lxc-devel mailing list
[email protected]
http://lists.linuxcontainers.org/listinfo/lxc-devel

