On Tue, 23 Apr 2013 10:25:58 -0500 Serge Hallyn <serge.hal...@ubuntu.com> wrote:
> Quoting Dwight Engen (dwight.en...@oracle.com): > > [...] > > > > > So here is what I'm proposing: when lxc-monitor starts, it > > > > > attempts to start lxc-monitord. lxc-monitord creates a fifo > > > > > and a socket on lxcpath. > > > > > > > > Thanks, Dwight. Looks awesome. Some comments below, but I'm > > > > only not adding an ack bc you say you want to make some changes > > > > first anyway. > > > > > > > > When you send the next version I'll run it through my testsuite > > > > (and presumably ack). > > > > Serge, this one should be testable so feel free to hit it with your > > testsuite if you want, but I'm still sending it as RFC due to the > > questions raised below. Caglar, I've tested this pretty thoroughly > > with your python parallel start script doing 40 containers > > parallel 4 ways (and put into a loop running for a few minutes) and > > some other parallel startup cases I made. If you want to test it > > yourself too, that'd be great. > > > > > > ... > > > > > +static void lxc_monitord_delete(struct lxc_monitor *mon, > > > > > const char *lxcpath) +{ > > > > > + int i; > > > > > + > > > > > + lxc_mainloop_del_handler(&mon->descr, mon->listenfd); > > > > > + close(mon->listenfd); > > > > > > > > The ordering here might need to change to ensure we don't get > > > > any clients hanging between the two steps. > > > > > > Hmm, good point. Have to think about this, there might be a race > > > the other way with still having the fd in the epoll set also. > > > > I think the way I have it is the right order. It looks to me from > > the epoll_ctl manpage that we'll get EBADF if fd isn't valid at the > > time of EPOLL_CTL_DEL. > > > > > > I have a couple of new questions too :( In the current code there is > > the following flow (only when starting a container as daemon through > > the api): > > > > lxcapi_start() > > wait_on_daemonized_start() > > lxcapi_wait() > > lxc_wait() > > lxc_monitor_open() > > > > This is racy with multiple starters all trying to bind the socket > > address and is the source of Caglar's original problem. It looks to > > me like it is also racy with respect to the child container > > setting the state, but there is a timeout safety on the wait side so > > we don't hang. > > > > In the change I'm proposing, I put in a call to > > lxc_monitord_spawn() in only this daemon case which should mean > > there is always a place to deliver the status and no races between > > parent and child (because of > > Yeah, that sounds good. It might seem like overkill, but OTOH we want > the API to be able to tell us if the container start failed. Okay, I'm fine with that but I thought I should mention it. > > sync'ing on the pipe), but has the drawback of having monitord > > always get started when the container is started daemonized. This > > is the only flow I found that used lxc_wait internally. Stephane, > > you might want to comment on this as it looks like you added the > > wait_on_daemonized_start() stuff, I'd be happy if my analysis missed > > something :) At least monitord will go away after startup, but it'd > > be nice not to have to start it at all. Any thoughts? > > > > The other question is: I do getsockopt SO_PEERCRED and > > check against the effective uid which I believe is correct in case > > the caller is setuid. What I'm wondering is why the routines in > > af_unix.c are passing/checking against the real? > > Git history shows it's just always been done that way (so no rationale > for ruid over euid). Daniel? > > > > > > > -- > > > > Signed-off-by: Dwight Engen <dwight.en...@oracle.com> > > My tests all passed. > > In monitord's main(), you don't check the snprintf return value? Fixed, thanks. > (Other than that, looks good) Patch to follow. ------------------------------------------------------------------------------ Try New Relic Now & We'll Send You this Cool Shirt New Relic is the only SaaS-based application performance monitoring service that delivers powerful full stack analytics. Optimize and monitor your browser, app, & servers with just a few lines of code. Try New Relic and get this awesome Nerd Life shirt! http://p.sf.net/sfu/newrelic_d2d_apr _______________________________________________ Lxc-devel mailing list Lxc-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/lxc-devel