Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-16 Thread Serge Hallyn
So, the basic lxc-test-concurrent now passes for me with both
busybox and ubuntu templates.  That's because the daemonized
starts are mutexed from each other just fine.  However, I expect
to see problems if we did start in parallel with things like wait.

(I'll play around)

-serge

--
LIMITED TIME SALE - Full Year of Microsoft Training For Just $49.99!
1,500+ hours of tutorials including VisualStudio 2012, Windows 8, SharePoint
2013, SQL 2012, MVC 4, more. BEST VALUE: New Multi-Library Power Pack includes
Mobile, Cloud, Java, and UX Design. Lowest price ever! Ends 9/20/13. 
http://pubads.g.doubleclick.net/gampad/clk?id=58041151iu=/4140/ostg.clktrk
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-14 Thread S . Çağlar Onur
Hi Serge,

This seems to fix the creates but starts started to fail this time. One
quick note, my concurrent.c still uses ubuntu as it's much more easy to
replicate the issue with it.

diff --git a/src/tests/concurrent.c b/src/tests/concurrent.c
index 7faf34c..ffc025c 100644
--- a/src/tests/concurrent.c
+++ b/src/tests/concurrent.c
@@ -40,7 +40,7 @@ void * concurrent(void *arguments) {
 args-return_code = 1;
 if (strcmp(args-mode, create) == 0) {
 if (!c-is_defined(c)) {
-if (!c-create(c, busybox, NULL, NULL, 1, NULL)) {
+if (!c-create(c, ubuntu, NULL, NULL, 1, NULL)) {
 fprintf(stderr, Creating the container (%s) failed...\n,
name);
 goto out;
 }


[caglar@oOo:~/Projects/lxc(staging)] sudo lxc-test-concurrent
Executing (create) for 5 containers...

Executing (start) for 5 containers...
Starting the container (4) failed...
lxc_container: command get_cgroup failed to receive response
Starting the container (3) failed...
Starting the container (2) failed...
Starting the container (1) failed...
Starting the container (0) failed...

[caglar@oOo:~/go/src/github.com/caglar10ur/lxc(devel)] sudo lxc-ls
0  1  2  3  4
[caglar@oOo:~/go/src/github.com/caglar10ur/lxc(devel)] sudo lxc-start -d -n
0
lxc-start: command get_cgroup failed to receive response
[caglar@oOo:~/go/src/github.com/caglar10ur/lxc(devel)]


On Fri, Sep 13, 2013 at 10:20 PM, Serge Hallyn serge.hal...@ubuntu.comwrote:

 Quoting Serge Hallyn (serge.hal...@ubuntu.com):
  Quoting Dwight Engen (dwight.en...@oracle.com):
   On Fri, 13 Sep 2013 17:29:53 -0400
   Dwight Engen dwight.en...@oracle.com wrote:
  
On Fri, 13 Sep 2013 12:09:55 -0400
S.Çağlar Onur cag...@10ur.org wrote:
   
 Hi Dwight,

 Yes, I only observed a hang so far but not this assertion (in fact
 I
 don't remember ever seeing that). What I'm seeing is this;
   
Okay, something funny is going on, but I don't know what yet. That
assertion is coming from liblxc.so-libgnutls-libgcrypt and seems to
be complaining that we're unlocking something that is already
unlocked. So I compiled lxc without GNUTLS support (by commenting out
the check for it in configure.ac) and now I get past that and get
hangs similar to yours.
   
Interestingly, I modified your program to just do the create and
destroy and not the start nor stop and I still get the hangs during
the creation part.
  
   Sorry to reply to myself: So now I modified your program to do the
   create single threaded and everything else threaded (see attached
   patch) and that doesn't hang for me.
  
   It looked like to me that the hung threads were stuck on process_lock()
   so I added a backtrace there to debug it a bit more and it looks like
   the hang is in the flow:
  
lxcapi_create()
  c-save_config()
container_disk_lock()
  lxclock()
process_lock()
  
   Not sure why yet though. Serge, do you think this could have to do with
   the fork()ing from a thread in that flow?
 
  Well, we've got save_config() doing an fopen without the process_lock()
  for one thing which is bad.  That could explain it in a few ways - we
  could have corruption due to both tasks changing fdtable at the same
  time, or perhaps because open() is a cancelation point...
 
  Does doing a process_lock() around the fopen and fclose in
  lxcapi_save_config() fix it?

 something like this should be a start:

 Subject: api_start: make thread-safe (or at least start to)

 Signed-off-by: Serge Hallyn serge.hal...@ubuntu.com
 ---
  src/lxc/lxccontainer.c | 6 ++
  src/lxc/parse.c| 4 
  2 files changed, 10 insertions(+)

 diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c
 index 79237df..3c51c4a 100644
 --- a/src/lxc/lxccontainer.c
 +++ b/src/lxc/lxccontainer.c
 @@ -665,7 +665,9 @@ static bool create_container_dir(struct lxc_container
 *c)
 free(s);
 return false;
 }
 +   process_lock();
 ret = mkdir(s, 0755);
 +   process_unlock();
 if (ret) {
 if (errno == EEXIST)
 ret = 0;
 @@ -1362,11 +1364,15 @@ static bool lxcapi_save_config(struct
 lxc_container *c, const char *alt_file)
 if (lret)
 return false;

 +   process_lock();
 fout = fopen(alt_file, w);
 +   process_unlock();
 if (!fout)
 goto out;
 write_config(fout, c-lxc_conf);
 +   process_lock();
 fclose(fout);
 +   process_unlock();
 ret = true;

  out:
 diff --git a/src/lxc/parse.c b/src/lxc/parse.c
 index 26cbbdd..f154b89 100644
 --- a/src/lxc/parse.c
 +++ b/src/lxc/parse.c
 @@ -90,7 +90,9 @@ int lxc_file_for_each_line(const char *file, lxc_file_cb
 callback, void *data)
 char *line = NULL;
 size_t len = 0;

 +   process_lock();
 f = fopen(file, r);
 +   process_unlock();
 if (!f) 

Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-14 Thread Serge Hallyn
Quoting S.Çağlar Onur (cag...@10ur.org):
 Hi Serge,
 
 This seems to fix the creates but starts started to fail this time. One

Yeah I expected that.  Through send_state(), command.c and af_unix.c
need some process_lock()s too.  I had a preliminary patch on my other
laptop, but I wanted to see if I'd hit all the create ones and whether
that even helped at all before going on :)

 quick note, my concurrent.c still uses ubuntu as it's much more easy to
 replicate the issue with it.
 
 diff --git a/src/tests/concurrent.c b/src/tests/concurrent.c
 index 7faf34c..ffc025c 100644
 --- a/src/tests/concurrent.c
 +++ b/src/tests/concurrent.c
 @@ -40,7 +40,7 @@ void * concurrent(void *arguments) {
  args-return_code = 1;
  if (strcmp(args-mode, create) == 0) {
  if (!c-is_defined(c)) {
 -if (!c-create(c, busybox, NULL, NULL, 1, NULL)) {
 +if (!c-create(c, ubuntu, NULL, NULL, 1, NULL)) {
  fprintf(stderr, Creating the container (%s) failed...\n,
 name);
  goto out;
  }

Ok, I'm going to assume noone objects to just triggering that on
argv[1] and push a patch to do that.  It'll do busybox by default,
or argv[1] if that is passed in.

thanks,
-serge

--
LIMITED TIME SALE - Full Year of Microsoft Training For Just $49.99!
1,500+ hours of tutorials including VisualStudio 2012, Windows 8, SharePoint
2013, SQL 2012, MVC 4, more. BEST VALUE: New Multi-Library Power Pack includes
Mobile, Cloud, Java, and UX Design. Lowest price ever! Ends 9/22/13. 
http://pubads.g.doubleclick.net/gampad/clk?id=64545871iu=/4140/ostg.clktrk
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-14 Thread Serge Hallyn
Quoting S.Çağlar Onur (cag...@10ur.org):
 Hi Serge,
 
 This seems to fix the creates but starts started to fail this time. One
 quick note, my concurrent.c still uses ubuntu as it's much more easy to
 replicate the issue with it.

All right I *think* I was on the right track with the commit which I
reverted in staging, but it'll deadlock as is.  (Didn't test, but it's
obvious)  I *think* we'll just want to drop the process_lock() around
the block of code in the daemonized part of api_start(), especially
around the wait_on_daemonized_start().  But I don't have time to do it
justice right now and don't want to cause staging to be unusable for
anyone.  If you want to play with it, please feel free to play with
reverted commit 002f3cff4d83c0666cfda40599eded8e0d638c6c.

-serge

--
LIMITED TIME SALE - Full Year of Microsoft Training For Just $49.99!
1,500+ hours of tutorials including VisualStudio 2012, Windows 8, SharePoint
2013, SQL 2012, MVC 4, more. BEST VALUE: New Multi-Library Power Pack includes
Mobile, Cloud, Java, and UX Design. Lowest price ever! Ends 9/22/13. 
http://pubads.g.doubleclick.net/gampad/clk?id=64545871iu=/4140/ostg.clktrk
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-13 Thread Dwight Engen
Hi Çağlar

I was trying out your new test (with one small change, I made it use
busybox instead of ubuntu during the creates since busybox is a bit
simpler). I put it in a while true shell loop, and it goes for a few
iterations but then the error I get is:

[...]
Executing (create) for 5 containers...
lxc-test-concurrent: ath.c:213: _gcry_ath_mutex_unlock: Assertion `*lock == 
((ath_mutex_t) 1)' failed. 

I think you were seeing a hang, so this is quite different than that?

--
How ServiceNow helps IT people transform IT departments:
1. Consolidate legacy IT systems to a single system of record for IT
2. Standardize and globalize service processes across IT
3. Implement zero-touch automation to replace manual, redundant tasks
http://pubads.g.doubleclick.net/gampad/clk?id=5127iu=/4140/ostg.clktrk
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-13 Thread Dwight Engen
On Fri, 13 Sep 2013 17:29:53 -0400
Dwight Engen dwight.en...@oracle.com wrote:

 On Fri, 13 Sep 2013 12:09:55 -0400
 S.Çağlar Onur cag...@10ur.org wrote:
 
  Hi Dwight,
  
  Yes, I only observed a hang so far but not this assertion (in fact I
  don't remember ever seeing that). What I'm seeing is this;
 
 Okay, something funny is going on, but I don't know what yet. That
 assertion is coming from liblxc.so-libgnutls-libgcrypt and seems to
 be complaining that we're unlocking something that is already
 unlocked. So I compiled lxc without GNUTLS support (by commenting out
 the check for it in configure.ac) and now I get past that and get
 hangs similar to yours.
 
 Interestingly, I modified your program to just do the create and
 destroy and not the start nor stop and I still get the hangs during
 the creation part.

Sorry to reply to myself: So now I modified your program to do the
create single threaded and everything else threaded (see attached
patch) and that doesn't hang for me.

It looked like to me that the hung threads were stuck on process_lock()
so I added a backtrace there to debug it a bit more and it looks like
the hang is in the flow:

 lxcapi_create()
   c-save_config()
 container_disk_lock()
   lxclock()
 process_lock()

Not sure why yet though. Serge, do you think this could have to do with
the fork()ing from a thread in that flow?
diff --git a/src/tests/concurrent.c b/src/tests/concurrent.c
index 7faf34c..a70b252 100644
--- a/src/tests/concurrent.c
+++ b/src/tests/concurrent.c
@@ -15,6 +15,8 @@
  * with this program; if not, write to the Free Software Foundation, Inc.,
  * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
  */
+
+#include limits.h
 #include stdio.h
 #include pthread.h
 
@@ -28,12 +30,13 @@ struct thread_args {
 char *mode;
 };
 
-void * concurrent(void *arguments) {
-char name[4];
+void do_function(void *arguments)
+{
+char name[NAME_MAX+1];
 struct thread_args *args = arguments;
 struct lxc_container *c;
 
-sprintf(name, %d, args-thread_id);
+sprintf(name, lxc-test-concurrent-%d, args-thread_id);
 
 c = lxc_container_new(name, NULL);
 
@@ -79,44 +82,60 @@ void * concurrent(void *arguments) {
 args-return_code = 0;
 out:
 lxc_container_put(c);
-pthread_exit(NULL);
 }
 
+void * concurrent(void *arguments)
+{
+do_function(arguments);
+pthread_exit(NULL);
+}
 
 int main(int argc, char *argv[]) {
-int i, j;
+int i, j, loop, loops = 1;
 pthread_attr_t attr;
 pthread_t threads[NTHREADS];
 struct thread_args args[NTHREADS];
 
-char *modes[] = {create, start, stop, destroy, NULL};
+//char *modes[] = {create, start, stop, destroy, NULL};
+char *modes[] = {start, stop, destroy, NULL};
 
 pthread_attr_init(attr);
-pthread_attr_setdetachstate(attr, PTHREAD_CREATE_JOINABLE);
 
-for (i = 0; modes[i];i++) {
-printf(Executing (%s) for %d containers...\n, modes[i], NTHREADS);
-for (j = 0; j  NTHREADS; j++) {
-args[j].thread_id = j;
-args[j].mode = modes[i];
+if (argc  1)
+loops = atoi(argv[1]);
 
-if (pthread_create(threads[j], attr, concurrent, (void *) args[j]) != 0) {
-perror(pthread_create() error);
-exit(EXIT_FAILURE);
-}
+for (loop = 1; loop = loops; loop++) {
+printf(\nIteration %d/%d\n, loop, loops);
+printf(Creating non-threaded...\n);
+for(j = 0; j  NTHREADS; j++) {
+args[0].thread_id = 0;
+args[0].mode = create;
+do_function(args[0]);
 }
 
-for (j = 0; j  NTHREADS; j++) {
-if ( pthread_join(threads[j], NULL) != 0) {
-perror(pthread_join() error);
-exit(EXIT_FAILURE);
+for (i = 0; modes[i];i++) {
+printf(Executing (%s) for %d containers...\n, modes[i], NTHREADS);
+for (j = 0; j  NTHREADS; j++) {
+args[j].thread_id = j;
+args[j].mode = modes[i];
+
+if (pthread_create(threads[j], attr, concurrent, (void *) args[j]) != 0) {
+perror(pthread_create() error);
+exit(EXIT_FAILURE);
+}
 }
-if (args[j].return_code) {
-perror(thread returned an error);
-exit(EXIT_FAILURE);
+
+for (j = 0; j  NTHREADS; j++) {
+if (pthread_join(threads[j], NULL) != 0) {
+perror(pthread_join() error);
+exit(EXIT_FAILURE);
+}
+if (args[j].return_code) {
+fprintf(stderr, thread returned error %d, args[j].return_code);
+exit(EXIT_FAILURE);
+}
 }
 }
-printf(\n);
 }
 
 pthread_attr_destroy(attr);
--
How ServiceNow 

Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-13 Thread Serge Hallyn
Quoting Dwight Engen (dwight.en...@oracle.com):
 On Fri, 13 Sep 2013 17:29:53 -0400
 Dwight Engen dwight.en...@oracle.com wrote:
 
  On Fri, 13 Sep 2013 12:09:55 -0400
  S.Çağlar Onur cag...@10ur.org wrote:
  
   Hi Dwight,
   
   Yes, I only observed a hang so far but not this assertion (in fact I
   don't remember ever seeing that). What I'm seeing is this;
  
  Okay, something funny is going on, but I don't know what yet. That
  assertion is coming from liblxc.so-libgnutls-libgcrypt and seems to
  be complaining that we're unlocking something that is already
  unlocked. So I compiled lxc without GNUTLS support (by commenting out
  the check for it in configure.ac) and now I get past that and get
  hangs similar to yours.
  
  Interestingly, I modified your program to just do the create and
  destroy and not the start nor stop and I still get the hangs during
  the creation part.
 
 Sorry to reply to myself: So now I modified your program to do the
 create single threaded and everything else threaded (see attached
 patch) and that doesn't hang for me.
 
 It looked like to me that the hung threads were stuck on process_lock()
 so I added a backtrace there to debug it a bit more and it looks like
 the hang is in the flow:
 
  lxcapi_create()
c-save_config()
  container_disk_lock()
lxclock()
  process_lock()
 
 Not sure why yet though. Serge, do you think this could have to do with
 the fork()ing from a thread in that flow?

Well, we've got save_config() doing an fopen without the process_lock()
for one thing which is bad.  That could explain it in a few ways - we
could have corruption due to both tasks changing fdtable at the same
time, or perhaps because open() is a cancelation point...

Does doing a process_lock() around the fopen and fclose in
lxcapi_save_config() fix it?

As for the forks, everything I've seen suggests those should be safe.
Unless we fork with some mutexes held, in which case the mutexes will
be copied held into the new task, and the new task might deadlock
against nothing.  But that shouldn't affect the parent thread which is
what you're seeing above.

-serge

--
How ServiceNow helps IT people transform IT departments:
1. Consolidate legacy IT systems to a single system of record for IT
2. Standardize and globalize service processes across IT
3. Implement zero-touch automation to replace manual, redundant tasks
http://pubads.g.doubleclick.net/gampad/clk?id=5127iu=/4140/ostg.clktrk
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-13 Thread S . Çağlar Onur
Hi Dwight,

Yes it only stuck during creating concurrent containers for me and
start/stop/freeze/unfreeze seems to work fine. If it helps I'm pretty sure
that it was working fine till last week (or I was so lucky not to hit by
this problem before). Go binding's test suite does lots of concurrent stuff
to test bindings and they were passing but now ConcurrentCreate test case
(which creates 10 containers in parallel) hangs most of the time.



On Fri, Sep 13, 2013 at 5:29 PM, Dwight Engen dwight.en...@oracle.comwrote:

 On Fri, 13 Sep 2013 12:09:55 -0400
 S.Çağlar Onur cag...@10ur.org wrote:

  Hi Dwight,
 
  Yes, I only observed a hang so far but not this assertion (in fact I
  don't remember ever seeing that). What I'm seeing is this;

 Okay, something funny is going on, but I don't know what yet. That
 assertion is coming from liblxc.so-libgnutls-libgcrypt and seems to
 be complaining that we're unlocking something that is already unlocked.
 So I compiled lxc without GNUTLS support (by commenting out the check
 for it in configure.ac) and now I get past that and get hangs similar to
 yours.

 Interestingly, I modified your program to just do the create and
 destroy and not the start nor stop and I still get the hangs during the
 creation part.

  * lxc-test-concurrent get stuck
 
  [caglar@qgq:~] sudo lxc-test-concurrent
  Executing (create) for 5 containers...
 
  * ps auwxf shows this (so no rsync etc. running anymore)
 
  caglar   21004  0.2  0.2  51344  4868 ?S11:59   0:00
  mosh-server new -s -c 256 -l LANG=en_US.UTF-8
  caglar   21005  0.0  0.2  23068  4412 pts/2Ss   11:59   0:00  \_
  -bash root 27347  0.0  0.1  60248  2080 pts/2S+   12:03
  0:00  \_ sudo lxc-test-concurrent
  root 27348  0.0  0.0 383816   884 pts/2Sl+  12:03   0:00
   \_ lxc-test-concurrent
  root 27354  0.0  0.0 381684   408 pts/2S+   12:03   0:00
 \_ lxc-test-concurrent
 
  * strace give this
 
  [caglar@qgq:~/Projects/lxc(staging)] sudo strace -p 27354
  Process 27354 attached - interrupt to quit
  futex(0x7fdc68b82cc0, FUTEX_WAIT_PRIVATE, 2, NULL^C unfinished ...
  Process 27354 detached
  [caglar@qgq:~/Projects/lxc(staging)] sudo strace -p 27348
  Process 27348 attached - interrupt to quit
  futex(0x7fdc65f3d9d0, FUTEX_WAIT, 27353, NULL^C unfinished ...
  Process 27348 detached
  [caglar@qgq:~/Projects/lxc(staging)] sudo strace -p 27347
  Process 27347 attached - interrupt to quit
  select(6, [3 5], [], NULL, NULL^C unfinished ...
  Process 27347 detached
 
  * lxc-ls
 
  [caglar@qgq:~/Projects/lxc(staging)] sudo lxc-ls --fancy
  NAME   STATEIPV4  IPV6
  --
  0  STOPPED  - -
  1  STOPPED  - -
  2  STOPPED  - -
  3  STOPPED  - -
  4  STOPPED  - -
 
  * /var/lib/lxc/4/partial still there
 
  [caglar@qgq:/var/lib/lxc] ls /var/lib/lxc/*
  /var/lib/lxc/lxc-monitord.log
 
  /var/lib/lxc/0:
  config  fstab  rootfs
 
  /var/lib/lxc/1:
  config  fstab  rootfs
 
  /var/lib/lxc/2:
  config  fstab  rootfs
 
  /var/lib/lxc/3:
  config  fstab  rootfs
 
  /var/lib/lxc/4:
  config  partial  rootfs
 
  /var/lib/lxc/bleach:
  config  fstab  lxc_snapshots  rootfs
 
  /var/lib/lxc/bleach-ng:
  config  delta0  fstab  lxc_rdepends  rootfs
  ​




-- 
S.Çağlar Onur cag...@10ur.org
--
How ServiceNow helps IT people transform IT departments:
1. Consolidate legacy IT systems to a single system of record for IT
2. Standardize and globalize service processes across IT
3. Implement zero-touch automation to replace manual, redundant tasks
http://pubads.g.doubleclick.net/gampad/clk?id=5127iu=/4140/ostg.clktrk___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-13 Thread Serge Hallyn
Quoting Serge Hallyn (serge.hal...@ubuntu.com):
 Quoting Dwight Engen (dwight.en...@oracle.com):
  On Fri, 13 Sep 2013 17:29:53 -0400
  Dwight Engen dwight.en...@oracle.com wrote:
  
   On Fri, 13 Sep 2013 12:09:55 -0400
   S.Çağlar Onur cag...@10ur.org wrote:
   
Hi Dwight,

Yes, I only observed a hang so far but not this assertion (in fact I
don't remember ever seeing that). What I'm seeing is this;
   
   Okay, something funny is going on, but I don't know what yet. That
   assertion is coming from liblxc.so-libgnutls-libgcrypt and seems to
   be complaining that we're unlocking something that is already
   unlocked. So I compiled lxc without GNUTLS support (by commenting out
   the check for it in configure.ac) and now I get past that and get
   hangs similar to yours.
   
   Interestingly, I modified your program to just do the create and
   destroy and not the start nor stop and I still get the hangs during
   the creation part.
  
  Sorry to reply to myself: So now I modified your program to do the
  create single threaded and everything else threaded (see attached
  patch) and that doesn't hang for me.
  
  It looked like to me that the hung threads were stuck on process_lock()
  so I added a backtrace there to debug it a bit more and it looks like
  the hang is in the flow:
  
   lxcapi_create()
 c-save_config()
   container_disk_lock()
 lxclock()
   process_lock()
  
  Not sure why yet though. Serge, do you think this could have to do with
  the fork()ing from a thread in that flow?
 
 Well, we've got save_config() doing an fopen without the process_lock()
 for one thing which is bad.  That could explain it in a few ways - we
 could have corruption due to both tasks changing fdtable at the same
 time, or perhaps because open() is a cancelation point...
 
 Does doing a process_lock() around the fopen and fclose in
 lxcapi_save_config() fix it?

something like this should be a start:

Subject: api_start: make thread-safe (or at least start to)

Signed-off-by: Serge Hallyn serge.hal...@ubuntu.com
---
 src/lxc/lxccontainer.c | 6 ++
 src/lxc/parse.c| 4 
 2 files changed, 10 insertions(+)

diff --git a/src/lxc/lxccontainer.c b/src/lxc/lxccontainer.c
index 79237df..3c51c4a 100644
--- a/src/lxc/lxccontainer.c
+++ b/src/lxc/lxccontainer.c
@@ -665,7 +665,9 @@ static bool create_container_dir(struct lxc_container *c)
free(s);
return false;
}
+   process_lock();
ret = mkdir(s, 0755);
+   process_unlock();
if (ret) {
if (errno == EEXIST)
ret = 0;
@@ -1362,11 +1364,15 @@ static bool lxcapi_save_config(struct lxc_container *c, 
const char *alt_file)
if (lret)
return false;
 
+   process_lock();
fout = fopen(alt_file, w);
+   process_unlock();
if (!fout)
goto out;
write_config(fout, c-lxc_conf);
+   process_lock();
fclose(fout);
+   process_unlock();
ret = true;
 
 out:
diff --git a/src/lxc/parse.c b/src/lxc/parse.c
index 26cbbdd..f154b89 100644
--- a/src/lxc/parse.c
+++ b/src/lxc/parse.c
@@ -90,7 +90,9 @@ int lxc_file_for_each_line(const char *file, lxc_file_cb 
callback, void *data)
char *line = NULL;
size_t len = 0;
 
+   process_lock();
f = fopen(file, r);
+   process_unlock();
if (!f) {
SYSERROR(failed to open %s, file);
return -1;
@@ -104,7 +106,9 @@ int lxc_file_for_each_line(const char *file, lxc_file_cb 
callback, void *data)
 
if (line)
free(line);
+   process_lock();
fclose(f);
+   process_unlock();
return err;
 }
 
-- 
1.8.3.2


--
How ServiceNow helps IT people transform IT departments:
1. Consolidate legacy IT systems to a single system of record for IT
2. Standardize and globalize service processes across IT
3. Implement zero-touch automation to replace manual, redundant tasks
http://pubads.g.doubleclick.net/gampad/clk?id=5127iu=/4140/ostg.clktrk
___
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel


Re: [lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-12 Thread Serge Hallyn
Quoting S.Çağlar Onur (cag...@10ur.org):
 Signed-off-by: S.Çağlar Onur cag...@10ur.org

Oh, great, thanks :)

Acked-by: Serge E. Hallyn serge.hal...@ubuntu.com

 ---
  .gitignore |   3 ++
  src/tests/Makefile.am  |   6 ++-
  src/tests/concurrent.c | 116 
 +
  3 files changed, 123 insertions(+), 2 deletions(-)
  create mode 100644 src/tests/concurrent.c
 
 diff --git a/.gitignore b/.gitignore
 index 660756f..c005c4d 100644
 --- a/.gitignore
 +++ b/.gitignore
 @@ -10,6 +10,7 @@
  
  .deps
  .libs
 +.dirstamp
  
  Makefile.in
  Makefile
 @@ -75,6 +76,7 @@ src/python-lxc/lxc/__pycache__/
  
  src/tests/lxc-test-cgpath
  src/tests/lxc-test-clonetest
 +src/tests/lxc-test-concurrent
  src/tests/lxc-test-console
  src/tests/lxc-test-containertests
  src/tests/lxc-test-createtest
 @@ -85,6 +87,7 @@ src/tests/lxc-test-locktests
  src/tests/lxc-test-lxcpath
  src/tests/lxc-test-saveconfig
  src/tests/lxc-test-shutdowntest
 +src/tests/lxc-test-snapshot
  src/tests/lxc-test-startone
  src/tests/lxc-usernic-test
  
 diff --git a/src/tests/Makefile.am b/src/tests/Makefile.am
 index a6dacab..8157407 100644
 --- a/src/tests/Makefile.am
 +++ b/src/tests/Makefile.am
 @@ -18,6 +18,7 @@ lxc_test_console_SOURCES = console.c
  lxc_usernic_test_SOURCES = ../lxc/lxc_user_nic.c ../lxc/nl.c
  lxc_usernic_test_CFLAGS = -DISTEST
  lxc_test_snapshot_SOURCES = snapshot.c
 +lxc_test_concurrent_SOURCES = concurrent.c
  
  AM_CFLAGS=-I$(top_srcdir)/src \
   -DLXCROOTFSMOUNT=\$(LXCROOTFSMOUNT)\ \
 @@ -30,7 +31,7 @@ bin_PROGRAMS = lxc-test-containertests lxc-test-locktests 
 lxc-test-startone \
   lxc-test-destroytest lxc-test-saveconfig lxc-test-createtest \
   lxc-test-shutdowntest lxc-test-get_item lxc-test-getkeys 
 lxc-test-lxcpath \
   lxc-test-cgpath lxc-test-clonetest lxc-test-console lxc-usernic-test \
 - lxc-test-snapshot
 + lxc-test-snapshot lxc-test-concurrent
  
  bin_SCRIPTS = lxc-test-usernic
  
 @@ -51,4 +52,5 @@ EXTRA_DIST = \
   startone.c \
   console.c \
   lxc-test-usernic \
 - snapshot.c
 + snapshot.c \
 + concurrent.c
 diff --git a/src/tests/concurrent.c b/src/tests/concurrent.c
 new file mode 100644
 index 000..41c171b
 --- /dev/null
 +++ b/src/tests/concurrent.c
 @@ -0,0 +1,116 @@
 +/* concurrent.c
 + *
 + * Copyright © 2013 S.Çağlar Onur cag...@10ur.org
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License version 2, as
 + * published by the Free Software Foundation.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 + * GNU General Public License for more details.
 + *
 + * You should have received a copy of the GNU General Public License along
 + * with this program; if not, write to the Free Software Foundation, Inc.,
 + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
 + */
 +#include stdio.h
 +#include pthread.h
 +
 +#include ../lxc/lxccontainer.h
 +
 +#define NTHREADS 5
 +
 +struct thread_args {
 +int thread_id;
 +int return_code;
 +char *mode;
 +};
 +
 +void * concurrent(void *arguments) {
 +char name[4];
 +struct thread_args *args = arguments;
 +struct lxc_container *c;
 +
 +sprintf(name, %d, args-thread_id);
 +
 +c = lxc_container_new(name, NULL);
 +
 +args-return_code = 1;
 +if (strcmp(args-mode, create) == 0) {
 +if (!c-is_defined(c)) {
 +if (!c-create(c, ubuntu, NULL, NULL, 1, NULL)) {
 +fprintf(stderr, Creating the container (%s) failed...\n, 
 name);
 +goto out;
 +}
 +}
 +} else if(strcmp(args-mode, start) == 0) {
 +if (c-is_defined(c)  !c-is_running(c)) {
 +c-want_daemonize(c);
 +if (!c-start(c, false, NULL)) {
 +fprintf(stderr, Starting the container (%s) failed...\n, 
 name);
 +goto out;
 +}
 +}
 +} else if(strcmp(args-mode, stop) == 0) {
 +if (c-is_defined(c)  c-is_running(c)) {
 +if (!c-stop(c)) {
 +fprintf(stderr, Stopping the container (%s) failed...\n, 
 name);
 +goto out;
 +}
 +}
 +} else if(strcmp(args-mode, destroy) == 0) {
 +if (c-is_defined(c)  !c-is_running(c)) {
 +if (!c-destroy(c)) {
 +fprintf(stderr, Destroying the container (%s) failed...\n, 
 name);
 +goto out;
 +}
 +}
 +}
 +args-return_code = 0;
 +out:
 +lxc_container_put(c);
 +pthread_exit(NULL);
 +}
 +
 +
 +int main(int argc, char *argv[]) {
 +int i, j;
 +pthread_attr_t attr;
 +pthread_t threads[NTHREADS];
 +struct thread_args args[NTHREADS];
 +
 +char *modes[] = {create, start, stop, 

[lxc-devel] [PATCH] tests: Introduce lxc-test-concurrent for testing basic actions concurrently

2013-09-12 Thread S . Çağlar Onur
Signed-off-by: S.Çağlar Onur cag...@10ur.org
---
 .gitignore |   3 ++
 src/tests/Makefile.am  |   6 ++-
 src/tests/concurrent.c | 116 +
 3 files changed, 123 insertions(+), 2 deletions(-)
 create mode 100644 src/tests/concurrent.c

diff --git a/.gitignore b/.gitignore
index 660756f..c005c4d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -10,6 +10,7 @@
 
 .deps
 .libs
+.dirstamp
 
 Makefile.in
 Makefile
@@ -75,6 +76,7 @@ src/python-lxc/lxc/__pycache__/
 
 src/tests/lxc-test-cgpath
 src/tests/lxc-test-clonetest
+src/tests/lxc-test-concurrent
 src/tests/lxc-test-console
 src/tests/lxc-test-containertests
 src/tests/lxc-test-createtest
@@ -85,6 +87,7 @@ src/tests/lxc-test-locktests
 src/tests/lxc-test-lxcpath
 src/tests/lxc-test-saveconfig
 src/tests/lxc-test-shutdowntest
+src/tests/lxc-test-snapshot
 src/tests/lxc-test-startone
 src/tests/lxc-usernic-test
 
diff --git a/src/tests/Makefile.am b/src/tests/Makefile.am
index a6dacab..8157407 100644
--- a/src/tests/Makefile.am
+++ b/src/tests/Makefile.am
@@ -18,6 +18,7 @@ lxc_test_console_SOURCES = console.c
 lxc_usernic_test_SOURCES = ../lxc/lxc_user_nic.c ../lxc/nl.c
 lxc_usernic_test_CFLAGS = -DISTEST
 lxc_test_snapshot_SOURCES = snapshot.c
+lxc_test_concurrent_SOURCES = concurrent.c
 
 AM_CFLAGS=-I$(top_srcdir)/src \
-DLXCROOTFSMOUNT=\$(LXCROOTFSMOUNT)\ \
@@ -30,7 +31,7 @@ bin_PROGRAMS = lxc-test-containertests lxc-test-locktests 
lxc-test-startone \
lxc-test-destroytest lxc-test-saveconfig lxc-test-createtest \
lxc-test-shutdowntest lxc-test-get_item lxc-test-getkeys 
lxc-test-lxcpath \
lxc-test-cgpath lxc-test-clonetest lxc-test-console lxc-usernic-test \
-   lxc-test-snapshot
+   lxc-test-snapshot lxc-test-concurrent
 
 bin_SCRIPTS = lxc-test-usernic
 
@@ -51,4 +52,5 @@ EXTRA_DIST = \
startone.c \
console.c \
lxc-test-usernic \
-   snapshot.c
+   snapshot.c \
+   concurrent.c
diff --git a/src/tests/concurrent.c b/src/tests/concurrent.c
new file mode 100644
index 000..41c171b
--- /dev/null
+++ b/src/tests/concurrent.c
@@ -0,0 +1,116 @@
+/* concurrent.c
+ *
+ * Copyright © 2013 S.Çağlar Onur cag...@10ur.org
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+#include stdio.h
+#include pthread.h
+
+#include ../lxc/lxccontainer.h
+
+#define NTHREADS 5
+
+struct thread_args {
+int thread_id;
+int return_code;
+char *mode;
+};
+
+void * concurrent(void *arguments) {
+char name[4];
+struct thread_args *args = arguments;
+struct lxc_container *c;
+
+sprintf(name, %d, args-thread_id);
+
+c = lxc_container_new(name, NULL);
+
+args-return_code = 1;
+if (strcmp(args-mode, create) == 0) {
+if (!c-is_defined(c)) {
+if (!c-create(c, ubuntu, NULL, NULL, 1, NULL)) {
+fprintf(stderr, Creating the container (%s) failed...\n, 
name);
+goto out;
+}
+}
+} else if(strcmp(args-mode, start) == 0) {
+if (c-is_defined(c)  !c-is_running(c)) {
+c-want_daemonize(c);
+if (!c-start(c, false, NULL)) {
+fprintf(stderr, Starting the container (%s) failed...\n, 
name);
+goto out;
+}
+}
+} else if(strcmp(args-mode, stop) == 0) {
+if (c-is_defined(c)  c-is_running(c)) {
+if (!c-stop(c)) {
+fprintf(stderr, Stopping the container (%s) failed...\n, 
name);
+goto out;
+}
+}
+} else if(strcmp(args-mode, destroy) == 0) {
+if (c-is_defined(c)  !c-is_running(c)) {
+if (!c-destroy(c)) {
+fprintf(stderr, Destroying the container (%s) failed...\n, 
name);
+goto out;
+}
+}
+}
+args-return_code = 0;
+out:
+lxc_container_put(c);
+pthread_exit(NULL);
+}
+
+
+int main(int argc, char *argv[]) {
+int i, j;
+pthread_attr_t attr;
+pthread_t threads[NTHREADS];
+struct thread_args args[NTHREADS];
+
+char *modes[] = {create, start, stop, destroy, NULL};
+
+pthread_attr_init(attr);
+pthread_attr_setdetachstate(attr, PTHREAD_CREATE_JOINABLE);
+
+for (i = 0; modes[i];i++) {
+printf(Executing (%s) for %d containers...\n, modes[i], NTHREADS);
+for (j = 0; j  NTHREADS;