Re: [lxc-devel] [PATCH] fix multithreaded create()
On Tue, Nov 12, 2013 at 2:04 PM, Dwight Engen dwight.en...@oracle.com wrote: We were calling save_config() twice within the create() flow, each from a different process. Depending on order of scheduling, sometimes the data from the first save_config() (which was just the stuff from LXC_DEFAULT_CONFIG) would overwrite the config we wanted (the full config), causing a truncated config file which would then cause lxc to segfault once it read it back in because no rootfs.path was set. This fixes it by only calling save_config() once in the create() flow. A rejected alternative was to call fsync(fileno(fout)) before the fclose in save_config. Signed-off-by: Dwight Engen dwight.en...@oracle.com Great news; This solved the concurrent create problems for me Bad news; I started to observe destroy() failures (see below) [caglar@qp:~/go/src/github.com/caglar10ur/lxc/examples] sudo ./concurrent_stress -iteration=100 -count=10 lxc_container: Error destroying rootfs for 5 2013/11/12 14:31:43 [ERROR] ERROR: destroying the container 5 failed 2013/11/12 14:31:43 [ERROR] ERROR: container 5 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:32:10 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:32:10 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:32:25 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:32:25 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 7 2013/11/12 14:32:30 [ERROR] ERROR: destroying the container 7 failed 2013/11/12 14:32:30 [ERROR] ERROR: container 7 already defined lxc_container: Error destroying rootfs for 4 2013/11/12 14:33:25 [ERROR] ERROR: destroying the container 4 failed 2013/11/12 14:33:25 [ERROR] ERROR: container 4 already defined lxc_container: Error destroying rootfs for 0 2013/11/12 14:33:40 [ERROR] ERROR: destroying the container 0 failed 2013/11/12 14:33:40 [ERROR] ERROR: container 0 already defined lxc_container: Error destroying rootfs for 0 2013/11/12 14:33:45 [ERROR] ERROR: destroying the container 0 failed 2013/11/12 14:33:45 [ERROR] ERROR: container 0 already defined lxc_container: Error destroying rootfs for 8 2013/11/12 14:33:55 [ERROR] ERROR: destroying the container 8 failed 2013/11/12 14:33:55 [ERROR] ERROR: container 8 already defined lxc_container: Error destroying rootfs for 2 2013/11/12 14:34:40 [ERROR] ERROR: destroying the container 2 failed 2013/11/12 14:34:40 [ERROR] ERROR: container 2 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:34:45 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:34:45 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 9 2013/11/12 14:35:00 [ERROR] ERROR: destroying the container 9 failed 2013/11/12 14:35:01 [ERROR] ERROR: container 9 already defined [caglar@qp:~/go/src/github.com/caglar10ur/lxc/examples] but since the original problem is gone now Acked-by: S.Çağlar Onur cag...@10ur.org --- src/lxc/lxccontainer.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index c7b2f5e..05ca643 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -1192,16 +1192,19 @@ static bool lxcapi_create(struct lxc_container *c, const char *t, if (lxcapi_is_defined(c) c-lxc_conf c-lxc_conf-rootfs.path access(c-lxc_conf-rootfs.path, F_OK) == 0 tpath) { ERROR(Container %s:%s already exists, c-config_path, c-name); - free(tpath); - return false; + goto free_tpath; } - /* Save the loaded configuration to disk */ - if (!c-save_config(c, NULL)) { - ERROR(failed to save starting configuration for %s\n, c-name); - goto out; + if (!c-lxc_conf) { + if (!c-load_config(c, LXC_DEFAULT_CONFIG)) { + ERROR(Error loading default configuration file %s\n, LXC_DEFAULT_CONFIG); + goto free_tpath; + } } + if (!create_container_dir(c)) + goto free_tpath; + /* * either template or rootfs.path should be set. * if both template and rootfs.path are set, template is setup as rootfs.path. @@ -1290,10 +1293,11 @@ out_unlock: if (partial_fd = 0) remove_partial(c, partial_fd); out: - if
Re: [lxc-devel] [PATCH] fix multithreaded create()
On Tue, 12 Nov 2013 14:35:58 -0500 S.Çağlar Onur cag...@10ur.org wrote: On Tue, Nov 12, 2013 at 2:04 PM, Dwight Engen dwight.en...@oracle.com wrote: We were calling save_config() twice within the create() flow, each from a different process. Depending on order of scheduling, sometimes the data from the first save_config() (which was just the stuff from LXC_DEFAULT_CONFIG) would overwrite the config we wanted (the full config), causing a truncated config file which would then cause lxc to segfault once it read it back in because no rootfs.path was set. This fixes it by only calling save_config() once in the create() flow. A rejected alternative was to call fsync(fileno(fout)) before the fclose in save_config. Signed-off-by: Dwight Engen dwight.en...@oracle.com Great news; This solved the concurrent create problems for me Bad news; I started to observe destroy() failures (see below) Hmm, you are testing through the go binding correct? I ran these two without error: ./src/tests/lxc-test-concurrent -i 200 -j 8 create,destroy ./src/tests/lxc-test-concurrent -i 100 -j 8 I wonder if lxc-test-concurrent will fail for you? [caglar@qp:~/go/src/github.com/caglar10ur/lxc/examples] sudo ./concurrent_stress -iteration=100 -count=10 lxc_container: Error destroying rootfs for 5 2013/11/12 14:31:43 [ERROR] ERROR: destroying the container 5 failed 2013/11/12 14:31:43 [ERROR] ERROR: container 5 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:32:10 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:32:10 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:32:25 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:32:25 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 7 2013/11/12 14:32:30 [ERROR] ERROR: destroying the container 7 failed 2013/11/12 14:32:30 [ERROR] ERROR: container 7 already defined lxc_container: Error destroying rootfs for 4 2013/11/12 14:33:25 [ERROR] ERROR: destroying the container 4 failed 2013/11/12 14:33:25 [ERROR] ERROR: container 4 already defined lxc_container: Error destroying rootfs for 0 2013/11/12 14:33:40 [ERROR] ERROR: destroying the container 0 failed 2013/11/12 14:33:40 [ERROR] ERROR: container 0 already defined lxc_container: Error destroying rootfs for 0 2013/11/12 14:33:45 [ERROR] ERROR: destroying the container 0 failed 2013/11/12 14:33:45 [ERROR] ERROR: container 0 already defined lxc_container: Error destroying rootfs for 8 2013/11/12 14:33:55 [ERROR] ERROR: destroying the container 8 failed 2013/11/12 14:33:55 [ERROR] ERROR: container 8 already defined lxc_container: Error destroying rootfs for 2 2013/11/12 14:34:40 [ERROR] ERROR: destroying the container 2 failed 2013/11/12 14:34:40 [ERROR] ERROR: container 2 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:34:45 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:34:45 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 9 2013/11/12 14:35:00 [ERROR] ERROR: destroying the container 9 failed 2013/11/12 14:35:01 [ERROR] ERROR: container 9 already defined [caglar@qp:~/go/src/github.com/caglar10ur/lxc/examples] but since the original problem is gone now Acked-by: S.Çağlar Onur cag...@10ur.org --- src/lxc/lxccontainer.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index c7b2f5e..05ca643 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -1192,16 +1192,19 @@ static bool lxcapi_create(struct lxc_container *c, const char *t, if (lxcapi_is_defined(c) c-lxc_conf c-lxc_conf-rootfs.path access(c-lxc_conf-rootfs.path, F_OK) == 0 tpath) { ERROR(Container %s:%s already exists, c-config_path, c-name); - free(tpath); - return false; + goto free_tpath; } - /* Save the loaded configuration to disk */ - if (!c-save_config(c, NULL)) { - ERROR(failed to save starting configuration for %s\n, c-name); - goto out; + if (!c-lxc_conf) { + if (!c-load_config(c, LXC_DEFAULT_CONFIG)) { + ERROR(Error loading default configuration file %s\n, LXC_DEFAULT_CONFIG); + goto free_tpath; + } }
Re: [lxc-devel] [PATCH] fix multithreaded create()
Hi Dwight, On Tue, Nov 12, 2013 at 3:31 PM, Dwight Engen dwight.en...@oracle.com wrote: On Tue, 12 Nov 2013 14:35:58 -0500 S.Çağlar Onur cag...@10ur.org wrote: On Tue, Nov 12, 2013 at 2:04 PM, Dwight Engen dwight.en...@oracle.com wrote: We were calling save_config() twice within the create() flow, each from a different process. Depending on order of scheduling, sometimes the data from the first save_config() (which was just the stuff from LXC_DEFAULT_CONFIG) would overwrite the config we wanted (the full config), causing a truncated config file which would then cause lxc to segfault once it read it back in because no rootfs.path was set. This fixes it by only calling save_config() once in the create() flow. A rejected alternative was to call fsync(fileno(fout)) before the fclose in save_config. Signed-off-by: Dwight Engen dwight.en...@oracle.com Great news; This solved the concurrent create problems for me Bad news; I started to observe destroy() failures (see below) Hmm, you are testing through the go binding correct? I ran these two without error: ./src/tests/lxc-test-concurrent -i 200 -j 8 create,destroy ./src/tests/lxc-test-concurrent -i 100 -j 8 I wonder if lxc-test-concurrent will fail for you? Hmm will try ASAP. Also I remembered that I started to use best bdev option on my desktop machine which uses btrfs so maybe it has something to do with it. I'll revert back to dir type on Go bindings and try that again as well. Will report back as soon as I have something! [caglar@qp:~/go/src/github.com/caglar10ur/lxc/examples] sudo ./concurrent_stress -iteration=100 -count=10 lxc_container: Error destroying rootfs for 5 2013/11/12 14:31:43 [ERROR] ERROR: destroying the container 5 failed 2013/11/12 14:31:43 [ERROR] ERROR: container 5 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:32:10 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:32:10 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:32:25 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:32:25 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 7 2013/11/12 14:32:30 [ERROR] ERROR: destroying the container 7 failed 2013/11/12 14:32:30 [ERROR] ERROR: container 7 already defined lxc_container: Error destroying rootfs for 4 2013/11/12 14:33:25 [ERROR] ERROR: destroying the container 4 failed 2013/11/12 14:33:25 [ERROR] ERROR: container 4 already defined lxc_container: Error destroying rootfs for 0 2013/11/12 14:33:40 [ERROR] ERROR: destroying the container 0 failed 2013/11/12 14:33:40 [ERROR] ERROR: container 0 already defined lxc_container: Error destroying rootfs for 0 2013/11/12 14:33:45 [ERROR] ERROR: destroying the container 0 failed 2013/11/12 14:33:45 [ERROR] ERROR: container 0 already defined lxc_container: Error destroying rootfs for 8 2013/11/12 14:33:55 [ERROR] ERROR: destroying the container 8 failed 2013/11/12 14:33:55 [ERROR] ERROR: container 8 already defined lxc_container: Error destroying rootfs for 2 2013/11/12 14:34:40 [ERROR] ERROR: destroying the container 2 failed 2013/11/12 14:34:40 [ERROR] ERROR: container 2 already defined lxc_container: Error destroying rootfs for 1 2013/11/12 14:34:45 [ERROR] ERROR: destroying the container 1 failed 2013/11/12 14:34:45 [ERROR] ERROR: container 1 already defined lxc_container: Error destroying rootfs for 9 2013/11/12 14:35:00 [ERROR] ERROR: destroying the container 9 failed 2013/11/12 14:35:01 [ERROR] ERROR: container 9 already defined [caglar@qp:~/go/src/github.com/caglar10ur/lxc/examples] but since the original problem is gone now Acked-by: S.Çağlar Onur cag...@10ur.org --- src/lxc/lxccontainer.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index c7b2f5e..05ca643 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -1192,16 +1192,19 @@ static bool lxcapi_create(struct lxc_container *c, const char *t, if (lxcapi_is_defined(c) c-lxc_conf c-lxc_conf-rootfs.path access(c-lxc_conf-rootfs.path, F_OK) == 0 tpath) { ERROR(Container %s:%s already exists, c-config_path, c-name); - free(tpath); - return false; + goto free_tpath; } - /* Save the loaded configuration to disk */ - if (!c-save_config(c, NULL)) { -
Re: [lxc-devel] [PATCH] fix multithreaded create()
Quoting Dwight Engen (dwight.en...@oracle.com): We were calling save_config() twice within the create() flow, each from a different process. Depending on order of scheduling, sometimes the data from the first save_config() (which was just the stuff from LXC_DEFAULT_CONFIG) would overwrite the config we wanted (the full config), causing a truncated config file which would then cause lxc to segfault once it read it back in because no rootfs.path was set. This fixes it by only calling save_config() once in the create() flow. A rejected alternative was to call fsync(fileno(fout)) before the fclose in save_config. Signed-off-by: Dwight Engen dwight.en...@oracle.com Acked-by: Serge E. Hallyn serge.hal...@ubuntu.com --- src/lxc/lxccontainer.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c index c7b2f5e..05ca643 100644 --- a/src/lxc/lxccontainer.c +++ b/src/lxc/lxccontainer.c @@ -1192,16 +1192,19 @@ static bool lxcapi_create(struct lxc_container *c, const char *t, if (lxcapi_is_defined(c) c-lxc_conf c-lxc_conf-rootfs.path access(c-lxc_conf-rootfs.path, F_OK) == 0 tpath) { ERROR(Container %s:%s already exists, c-config_path, c-name); - free(tpath); - return false; + goto free_tpath; } - /* Save the loaded configuration to disk */ - if (!c-save_config(c, NULL)) { - ERROR(failed to save starting configuration for %s\n, c-name); - goto out; + if (!c-lxc_conf) { + if (!c-load_config(c, LXC_DEFAULT_CONFIG)) { + ERROR(Error loading default configuration file %s\n, LXC_DEFAULT_CONFIG); + goto free_tpath; + } } + if (!create_container_dir(c)) + goto free_tpath; + /* * either template or rootfs.path should be set. * if both template and rootfs.path are set, template is setup as rootfs.path. @@ -1290,10 +1293,11 @@ out_unlock: if (partial_fd = 0) remove_partial(c, partial_fd); out: - if (tpath) - free(tpath); if (!ret c) lxcapi_destroy(c); +free_tpath: + if (tpath) + free(tpath); return ret; } -- 1.8.3.1 -- DreamFactory - Open Source REST JSON Services for HTML5 Native Apps OAuth, Users, Roles, SQL, NoSQL, BLOB Storage and External API Access Free app hosting. Or install the open source package on any LAMP server. Sign up and see examples for AngularJS, jQuery, Sencha Touch and Native! http://pubads.g.doubleclick.net/gampad/clk?id=63469471iu=/4140/ostg.clktrk ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel -- DreamFactory - Open Source REST JSON Services for HTML5 Native Apps OAuth, Users, Roles, SQL, NoSQL, BLOB Storage and External API Access Free app hosting. Or install the open source package on any LAMP server. Sign up and see examples for AngularJS, jQuery, Sencha Touch and Native! http://pubads.g.doubleclick.net/gampad/clk?id=63469471iu=/4140/ostg.clktrk ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel
Re: [lxc-devel] [PATCH] fix multithreaded create()
Hey Dwight, On Tue, Nov 12, 2013 at 3:35 PM, S.Çağlar Onur cag...@10ur.org wrote: Hmm, you are testing through the go binding correct? I ran these two without error: ./src/tests/lxc-test-concurrent -i 200 -j 8 create,destroy ./src/tests/lxc-test-concurrent -i 100 -j 8 I wonder if lxc-test-concurrent will fail for you? As you suspected lxc-test-concurrent runs fine on my system, I also tried running concurrent_stress with directory backend and as long as I see it also works fine. I guess the initial failure I observed had something with btrfs backend, I'll try to reproduce it and possibly spend some time on it to see what's happening. Hmm will try ASAP. Also I remembered that I started to use best bdev option on my desktop machine which uses btrfs so maybe it has something to do with it. I'll revert back to dir type on Go bindings and try that again as well. Will report back as soon as I have something! Cheers, -- S.Çağlar Onur cag...@10ur.org -- DreamFactory - Open Source REST JSON Services for HTML5 Native Apps OAuth, Users, Roles, SQL, NoSQL, BLOB Storage and External API Access Free app hosting. Or install the open source package on any LAMP server. Sign up and see examples for AngularJS, jQuery, Sencha Touch and Native! http://pubads.g.doubleclick.net/gampad/clk?id=63469471iu=/4140/ostg.clktrk ___ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel