Hi Çağlar,

I'm confused by your output; it certainly looks like something isn't
right. Do you have a theory as to why monitord thinks it still has 9
clients?

On Sat, 4 May 2013 00:01:45 -0400
S.Çağlar Onur <cag...@10ur.org> wrote:

> Hi all,
>
> I think I understand why I was confused before while chasing another
> bug. This is what I'm seeing right now.
>
> * I patched lxc_monitord.c with the following
>
> diff --git a/src/lxc/lxc_monitord.c b/src/lxc/lxc_monitord.c
> index e76af71..59f1e9d 100644
> --- a/src/lxc/lxc_monitord.c
> +++ b/src/lxc/lxc_monitord.c
> @@ -373,6 +373,7 @@ int main(int argc, char *argv[])
>  	}
>
>  	if (lxc_monitord_create(&mon)) {
> +		NOTICE("create failed");
>  		goto out;
>  	}
>
> @@ -398,6 +399,7 @@ int main(int argc, char *argv[])
>  			NOTICE("no clients for 30 seconds, exiting");
>  			break;
>  		}
> +		NOTICE("clients %d", mon.clientfds_cnt);
>  	}
>
>  	lxc_mainloop_close(&mon.descr);
>
> * I started 10 containers using go bindings
>
> [caglar@qgq:~/Project/lxc/examples] sudo ./concurrent_start
> Starting the container (3)...
> Starting the container (2)...
> Starting the container (4)...
> Starting the container (0)...
> Starting the container (1)...
> Starting the container (8)...
> Starting the container (7)...
> Starting the container (6)...
> Starting the container (5)...
> Starting the container (9)...
>
> * Then started to stop them 1 by 1 using lxc-stop
>
> [caglar@qgq:~/Project/lxc/examples] sudo lxc-stop -n 0
> [caglar@qgq:~/Project/lxc/examples] sudo ./list
> 0 (STOPPED)
> 1 (RUNNING)
> 2 (RUNNING)
> 3 (RUNNING)
> 4 (RUNNING)
> 5 (RUNNING)
> 6 (RUNNING)
> 7 (RUNNING)
> 8 (RUNNING)
> 9 (RUNNING)

I assume you stopped 1-8 here?

> [caglar@qgq:~/Project/lxc/examples] date && sudo ./list
> Fri May 3 23:57:14 EDT 2013
> 0 (STOPPED)
> 1 (STOPPED)
> 2 (STOPPED)
> 3 (STOPPED)
> 4 (STOPPED)
> 5 (STOPPED)
> 6 (STOPPED)
> 7 (STOPPED)
> 8 (STOPPED)
> 9 (RUNNING)
> bleach (STOPPED)
>
> * lxc-monitord is still around after ~10min

Looks like it's not going away because it thinks there are still 9
clients. My guess is that somehow it's not getting notified of the
client closes (or they're still around?).

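For reference, on a connected stream socket the usual sign that a client
went away is that a read on the server side returns 0 (end of file).
Below is a minimal standalone sketch of that check. peer_has_closed() is
a made-up helper for illustration only, it is not in lxc_monitord.c, and
I'm assuming here that the clients talk to the daemon over stream
sockets.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of lxc): peek at a connected stream
 * socket without blocking and without consuming any data.
 *
 * Returns 1 if the peer has closed its end, 0 if the connection still
 * looks open, and -1 on any other error.
 */
static int peer_has_closed(int fd)
{
	char c;
	ssize_t n = recv(fd, &c, 1, MSG_PEEK | MSG_DONTWAIT);

	if (n == 0)
		return 1;	/* end of file: every copy of the peer fd is closed */
	if (n > 0)
		return 0;	/* data is pending, so nothing to conclude yet */
	if (errno == EAGAIN || errno == EWOULDBLOCK)
		return 0;	/* nothing to read, connection still open */
	return -1;
}
```

Note that end of file only shows up once every copy of the client's
descriptor has been closed. If the client process is still alive, or a
forked child inherited the fd, this keeps returning 0 and the server
would go on counting that client.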
The following patch should provide a bit more info in the log:

diff --git a/src/lxc/lxc_monitord.c b/src/lxc/lxc_monitord.c
index e76af71..537a2b3 100644
--- a/src/lxc/lxc_monitord.c
+++ b/src/lxc/lxc_monitord.c
@@ -114,6 +114,7 @@ static int lxc_monitord_fifo_delete(struct lxc_monitor *mon)
 static void lxc_monitord_sockfd_remove(struct lxc_monitor *mon, int fd) {
 	int i;
 
+	INFO("removing fd %d\n", fd);
 	if (lxc_mainloop_del_handler(&mon->descr, fd))
 		CRIT("fd:%d not found in mainloop", fd);
 	close(fd);
@@ -343,7 +344,7 @@ int main(int argc, char *argv[])
 	if (ret < 0 || ret >= sizeof(logpath))
 		return EXIT_FAILURE;
 
-	ret = lxc_log_init(NULL, logpath, "NOTICE", "lxc-monitord", 0, lxcpath);
+	ret = lxc_log_init(NULL, logpath, "INFO", "lxc-monitord", 0, lxcpath);
 	if (ret)
 		return ret;

> [caglar@qgq:~/Project/lxc/examples] ps aux | grep /usr/bin/lxc-monitord
> caglar 1170 0.0 0.0 13580 940 pts/3 S+ 23:57 0:00 grep --color=auto /usr/bin/lxc-monitord
> root 29997 0.0 0.0 15000 744 ? Ss 23:47 0:00 /usr/bin/lxc-monitord /var/lib/lxc 5
> [caglar@qgq:~/Project/lxc/examples] date
> Fri May 3 23:57:52 EDT 2013
>
> * And lastly here is what lxc-monitord.log shows
>
> [caglar@qgq:~/Project/lxc(clone)] tail -f /var/lib/lxc/lxc-monitord.log
> lxc-monitord 1367639242.631 NOTICE lxc_monitord - monitoring lxcpath /var/lib/lxc
> lxc-monitord 1367639242.633 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.633 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.636 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.639 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.643 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.643 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.651 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.654 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.665 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.678 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.681 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.681 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.682 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.707 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.710 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.710 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.722 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.733 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639242.831 NOTICE lxc_monitord - create failed
> lxc-monitord 1367639274.071 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639323.928 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639372.862 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639444.107 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639474.130 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639504.133 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639534.161 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639564.190 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639594.209 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639624.223 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639654.256 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639684.287 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639714.317 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639744.347 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639774.370 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639804.396 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639834.426 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639864.456 NOTICE lxc_monitord - clients 9
> lxc-monitord 1367639894.486 NOTICE lxc_monitord - clients 9

You might want to consider patching the log code to print out pids; I
found that helpful while working on this:

diff --git a/src/lxc/log.c b/src/lxc/log.c
index d49a544..98581c1 100644
--- a/src/lxc/log.c
+++ b/src/lxc/log.c
@@ -58,7 +58,7 @@ static int log_append_stderr(const struct lxc_log_appender *appender,
 	if (event->priority < LXC_LOG_PRIORITY_ERROR)
 		return 0;
 
-	fprintf(stderr, "%s: ", log_prefix);
+	fprintf(stderr, "%-5d %s: ", getpid(), log_prefix);
 	vfprintf(stderr, event->fmt, *event->vap);
 	fprintf(stderr, "\n");
 	return 0;
@@ -75,7 +75,8 @@ static int log_append_logfile(const struct lxc_log_appender *appender,
 		return 0;
 
 	n = snprintf(buffer, sizeof(buffer),
-		     "%15s %10ld.%03ld %-8s %s - ",
+		     "%-5d %15s %10ld.%03ld %-8s %s - ",
+		     getpid(),
 		     log_prefix,
 		     event->timestamp.tv_sec,
 		     event->timestamp.tv_usec / 1000,

> On Fri, Apr 26, 2013 at 4:52 PM, S.Çağlar Onur <cag...@10ur.org>
> wrote:
>
> > Yeah, I think you're all correct and I'm just confused - probably a
> > direct effect of lack of caffeine. And no, it's not complicating
> > something for me, it's working great. I just want to make sure that
> > I'm wrong :)
> >
> >
> > On Fri, Apr 26, 2013 at 4:37 PM, Dwight Engen
> > <dwight.en...@oracle.com> wrote:
> >
> >> On Fri, 26 Apr 2013 22:07:22 +0200
> >> Stéphane Graber <stgra...@ubuntu.com> wrote:
> >>
> >> > On 04/26/2013 09:42 PM, S.Çağlar Onur wrote:
> >> > > Hey Dwight,
> >> > >
> >> > > I'm observing the following behavior with the staging tree and
> >> > > just wanted to make sure that what I'm seeing is expected:
> >> > >
> >> > > * Initially nothing runs
> >> > >
> >> > > [caglar@qgq:~/Projects/lxc/examples] sudo ./list
> >> > > bankai (STOPPED)
> >> > > bleach (STOPPED)
> >> > > zangetsu (STOPPED)
> >> > >
> >> > > * I start one container using the API
> >> > >
> >> > > [caglar@qgq:~/Projects/lxc/examples] sudo ./start -name zangetsu
> >> > > Starting the container...
> >> > >
> >> > > [caglar@qgq:~/Projects/lxc/examples] sudo ./list
> >> > > bankai (STOPPED)
> >> > > bleach (STOPPED)
> >> > > zangetsu (RUNNING)
> >> > >
> >> > > * monitord starts as expected but exits 30 seconds later
> >> > > (although the container is still running):
> >> > >
> >> > > [caglar@qgq:~/Projects/lxc-upstream(staging)] tail -f /var/lib/lxc/lxc-monitord.log
> >> > > lxc-monitord 1367004858.616 NOTICE lxc_monitord - monitoring lxcpath /var/lib/lxc
> >> > > lxc-monitord 1367004888.677 NOTICE lxc_monitord - no clients for 30 seconds, exiting
> >> > > lxc-monitord 1367004888.677 NOTICE lxc_monitord - monitor exiting
> >> > >
> >> > > [caglar@qgq:~/Projects/lxc/examples] sudo ./list
> >> > > bankai (STOPPED)
> >> > > bleach (STOPPED)
> >> > > zangetsu (RUNNING)
> >> > >
> >> > > [caglar@qgq:~/Projects/lxc/examples] ps aux | grep monitord
> >> > > caglar 28404 0.0 0.0 7240 624 pts/54 S+ 15:34 0:00 tail -f /var/lib/lxc/lxc-monitord.log
> >> > > caglar 29037 0.0 0.0 9436 948 pts/0 S+ 15:38 0:00 grep --color=auto monitord
> >> > > [caglar@qgq:~/Projects/lxc/examples]
> >> > >
> >> > > I'm asking because I was under the impression that lxc-monitord
> >> > > will keep running as long as there is a container. Am I wrong?
> >> >
> >> > I believe the monitor will get spawned the first time something
> >> > needs it (lxc-monitor/lxc-wait) and exit 30s after the last
> >> > client disconnects. It'll then be respawned the next time
> >> > lxc-monitor or lxc-wait is started against that container.
> >>
> >> Yep Stéphane, that is correct. Also note that the monitord is per
> >> lxcpath, not per container.
> >>
> >> Çağlar, you may have been slightly confused because if you start a
> >> container in daemon mode through the API, the API does an internal
> >> lxc_wait() and thus a monitord will get spawned when you first
> >> start a container, but will go away ~30 seconds afterwards.
> >
> > --
> > S.Çağlar Onur <cag...@10ur.org>

_______________________________________________
Lxc-devel mailing list
Lxc-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lxc-devel