On Mon, 6 May 2013 19:08:08 -0500 Serge Hallyn <serge.hal...@ubuntu.com> wrote:
> Quoting Dwight Engen (dwight.en...@oracle.com):
> > On Mon, 6 May 2013 15:31:14 -0500
> > Serge Hallyn <serge.hal...@ubuntu.com> wrote:
> >
> > > Quoting Dwight Engen (dwight.en...@oracle.com):
> > > > On Mon, 6 May 2013 13:06:43 -0400
> > > > Dwight Engen <dwight.en...@oracle.com> wrote:
> > > >
> > > > > Hi Çağlar,
> > > > >
> > > > > Thanks for the test program, I can sort of recreate it here
> > > > > with that, although once I lxc-stop them all, lxc-monitord
> > > > > does go away. I put a debug in lxc_wait() to see that the
> > > > > client always close the fd to the monitord and they all were
> > > > > closed so I'm not sure why lxc-monitord isn't seeing the
> > > > > accepted fd coming back from epoll to close. Still
> > > > > investigating...
> > > >
> > > > Okay, so I debugged this and the problem is basically down to
> > > > lxc not being thread aware. With your test program we get
> > > > multiple threads in lxcapi_start() simultaneously in the
> > > > daemonize case. One of them forks while another one has the
> > > > client fd to the monitor open and thus the fd gets duped by the
> > > > fork and that is the client fd that holds lxc-monitord open
> > > > until the container shuts down.
> > > >
> > > > Çağlar you could try out the following patch, it essentially
> > > > serializes container startup from a thread perspective. I
> > > > haven't tested it thoroughly, but it did fix the problem here.
> > > > Right now lxc doesn't support threaded use, so you may run into
> > > > other things as well. Depending on our stance on thread support
> > > > in lxc, you may need to do the serialization in the threaded
> > > > app. I guess another alternative is that initially we could
> > > > just thread serialize at the API (big lxc lock).
> > >
> > > It sounds like lxcapi_start should be locking c->slock? The
> > > priv_lock is to protect the struct lxccontainer in memory (for
> > > when you do c = lxc_container_new(); then clone a new thread),
> > > while the slock is to protect the container data. It's being
> > > taken at create, info, destroy, etc, but not at start.
> >
> > Hi Serge, thanks for pointing those out, lxc is more thread aware
> > than I realized :) Unfortunately I don't think either lock will
> > help here as both these locks are per-container and the data we
> > need to synchronize is per-calling process (ie. which fd's are open
> > at time of fork). To put it another way, this problem happens even
> > when each c is different.
>
> Right you are. I was so busy worrying about how to protect the
> on-disk container and in-memory container_struct I wasn't thinking
> about the shared fdtable.
>
> Should we have a single mutex around all fd modifications? What else?

I think that will certainly work here, I haven't considered other
places. Do we need to think about sighand and umask? Would you prefer
to use the semaphores lxc already has (ie lxc_newlock() with no name)
over pthread_mutex? Since Çağlar verified the test fix worked for him
also I'd like to make a real fix.
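For illustration only, here is a minimal sketch of the "single mutex
around fd modifications" idea, assuming pthread_mutex rather than the
existing lxc semaphores. It is not the patch from this thread. The
names fd_fork_lock, with_client_fd() and locked_fork() are hypothetical
and are not lxc symbols. The point is that a short-lived fd is opened,
used and closed under the same process-wide lock that fork() takes, so
a concurrent fork in another thread cannot duplicate it.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical process-wide lock (not an lxc symbol). It is per
 * process, not per container, because the fd table is shared by all
 * threads no matter which container they operate on. */
static pthread_mutex_t fd_fork_lock = PTHREAD_MUTEX_INITIALIZER;

/* Callback type for code that needs a short-lived fd. */
typedef int (*fd_user_fn)(int fd, void *arg);

/* Open path, hand the fd to use(), and close it again, all while
 * holding fd_fork_lock. Returns use()'s result, or -1 if open fails. */
int with_client_fd(const char *path, fd_user_fn use, void *arg)
{
    int ret = -1;

    pthread_mutex_lock(&fd_fork_lock);
    int fd = open(path, O_RDONLY);
    if (fd >= 0) {
        ret = use(fd, arg);
        close(fd);
    }
    pthread_mutex_unlock(&fd_fork_lock);
    return ret;
}

/* fork() while holding the same lock, so no fd opened through
 * with_client_fd() can be open at the moment of the fork. */
pid_t locked_fork(void)
{
    pthread_mutex_lock(&fd_fork_lock);
    pid_t pid = fork();
    /* Parent and child each release their own copy of the lock; in
     * the child, the forking thread is the only thread that exists. */
    pthread_mutex_unlock(&fd_fork_lock);
    return pid;
}
```

This only protects fds that are opened through the locked helper. Any
other code path that opens an fd without taking the lock would still
race with fork, which is why the question of what else (sighand, umask)
needs the same treatment remains open.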