Thanks a lot for the answer. Yes, I'd like to checkpoint a container, migrate it to another node and restart it there. Is that possible with CRIU (both the checkpoint and the restart, I mean)?
My second question: without CRIU, if I restart a snapshot taken with lxc-snapshot on another node, does it restart correctly?

Thanks a lot.
Best regards.

2014-04-02 13:00 GMT+01:00 <[email protected]>:

> Today's Topics:
>
>    1. Snapshot of a LXC container (Thouraya TH)
>    2. Re: Snapshot of a LXC container (Rami Rosen)
>    3. Re: lxc_monitor exiting, but not cleaning monitor-fifo?
>       (Serge Hallyn)
>    4. Re: lxc_monitor exiting, but not cleaning monitor-fifo?
>       (Florian Klink)
>
>
> ---------- Forwarded message ----------
> From: Thouraya TH <[email protected]>
> To: [email protected]
> Cc:
> Date: Tue, 1 Apr 2014 16:34:01 +0100
> Subject: [lxc-users] Snapshot of a LXC container
> Hello,
>
> Please, i have another question about "Snapshot of a LXC container"
> What's the difference between a and b:
> a) stop container, migrate it to another machine and restart it
> b) snapshot with a virtual machine on the container , migrate the
> snapshot, restart
> (solution system level)
>
> Thank you so much.
> Bests.
>
>
> ---------- Forwarded message ----------
> From: Rami Rosen <[email protected]>
> To: LXC users mailing-list <[email protected]>
> Cc:
> Date: Tue, 1 Apr 2014 19:12:11 +0300
> Subject: Re: [lxc-users] Snapshot of a LXC container
>
> Hi,
> First I assume that by stopping and starting the container you mean
> checkpointing and restoring it, by CRIU.
>
> The difference is that by lxc-snapshot you save only the filesystem state
> of a container, whereas with checkpoint you save the full state of the
> container process and its children.
>
> Regards,
> Rami Rosen
> http://ramirose.wix.com/ramirosen
> On 1 Apr 2014 18:34, "Thouraya TH" <[email protected]> wrote:
>
>> Hello,
>>
>> Please, i have another question about "Snapshot of a LXC container"
>> What's the difference between a and b:
>> a) stop container, migrate it to another machine and restart it
>> b) snapshot with a virtual machine on the container , migrate the
>> snapshot, restart
>> (solution system level)
>>
>> Thank you so much.
>> Bests.
>>
>> _______________________________________________
>> lxc-users mailing list
>> [email protected]
>> http://lists.linuxcontainers.org/listinfo/lxc-users
>>
>
>
> ---------- Forwarded message ----------
> From: Serge Hallyn <[email protected]>
> To: LXC users mailing-list <[email protected]>
> Cc: [email protected]
> Date: Tue, 1 Apr 2014 13:01:36 -0500
> Subject: Re: [lxc-users] lxc_monitor exiting, but not cleaning
> monitor-fifo?
> As an alternative to doing pidfiles, how about following the way
> that lxcapi_create does it with fcntl(fd, F_SETLKW)? (see
> create_partial() and ongoing_create())
>
> Then if the monitor exited without being able to clean up, we can
> detect it and clean up.
>
>
>
> ---------- Forwarded message ----------
> From: Florian Klink <[email protected]>
> To: Dwight Engen <[email protected]>, LXC users mailing-list <
> [email protected]>
> Cc:
> Date: Tue, 01 Apr 2014 22:15:25 +0200
> Subject: Re: [lxc-users] lxc_monitor exiting, but not cleaning
> monitor-fifo?
> On 01.04.2014 01:49, Dwight Engen wrote:
> > On Mon, 31 Mar 2014 23:18:13 +0200
> > Florian Klink <[email protected]> wrote:
> >
> >> On 31.03.2014 21:13, Dwight Engen wrote:
> >>> On Mon, 31 Mar 2014 20:34:15 +0200
> >>> Florian Klink <[email protected]> wrote:
> >>>
> >>>> On 31.03.2014 20:10, Dwight Engen wrote:
> >>>>> On Sat, 29 Mar 2014 23:39:33 +0100
> >>>>> Florian Klink <[email protected]> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> when running multiple lxc actions in row using the command line
> >>>>>> tools, I sometimes observe the following state:
> >>>>>>
> >>>>>> - lxc-monitord is not running anymore
> >>>>>> - /run/lxc/var/lib/lxc/monitor-fifo still exists, but is
> >>>>>>   "refusing connection"
> >>>>>>
> >>>>>> In the logs, I then see the following:
> >>>>>>
> >>>>>> lxc-start 1395671045.703 ERROR lxc_monitor - connect : backing off 10
> >>>>>> lxc-start 1395671045.713 ERROR lxc_monitor - connect : backing off 50
> >>>>>> lxc-start 1395671045.763 ERROR lxc_monitor - connect : backing off 100
> >>>>>> lxc-start 1395671045.864 ERROR lxc_monitor - connect : Connection refused
> >>>>>>
> >>>>>> ... and the command fails.
> >>>>>
> >>>>> The only time I've seen this happen is if lxc-monitord is hard
> >>>>> killed so it doesn't have a chance to clean up and remove the
> >>>>> socket.
> >>>>
> >>>> Here, it's happening quite frequently. However, the script never
> >>>> kills lxc-monitord on its own, it just tries to detect and fix
> >>>> this state by removing the socket file...
> >>>
> >>> Right, removing the socket file makes it so another lxc-monitord
> >>> will start, but the question is why is the first one exiting without
> >>> cleaning up? Can you reliably reproduce it at will? If so then maybe
> >>> you could attach an strace to lxc-monitord and see why it is
> >>> exiting.
> >>
> >> I was so far not successful in reproducing the bug while having an
> >> strace running. :-( But I'll continue to try!
>
> Success :-) I managed to get an strace while trying to reproduce the
> bug. I gzipped and attached it to this mail.
>
> Its the output of strace -f -s 200 /usr/lib/lxc/lxc-monitord
> /var/lib/lxc /run/lxc/var/lib/lxc/monitor-fifo &> strace_output.txt
>
> I fired a bunch of lxc-starts and lxc-stops in row, then stopped my
> script and waited for lxc-monitord (and strace too) to stop.
>
> Then I started my script again and had the "leftover monitor-fifo state".
>
> >>>
> >>>>>
> >>>>>>
> >>>>>> A possible workaround would be checking for non-running
> >>>>>> lxc-monitord process but existing monitor-fifo file then removing
> >>>>>> the fifo if it exists before running the next lxc command, but
> >>>>>> thats ugly ;-)
> >>>>>
> >>>>> Is there a good non-racy way to do this? I guess monitord could
> >>>>> write its pid in $LXCPATH and we could kill(pid, 0) it.
> >>
> >> I also think that lxc should be able to recover from this problem
> >> automatically.
> >
> > I agree, though I would like to understand the root cause. Can you try
> > out the attached patch? I think it will cure your issues.
> >
>
> Thanks for the patch! Just tell me if you need more information for the
> strace above. If not, I'll happily apply the patch :-)
>
> >>>>>
> >>>>>> Is this behaviour known? Is there some missing "cleanup code" in
> >>>>>> lxc(_monitord) or why is it failing like this?
> >>>>>
> >>>>> Currently it catches SIGILL, SIGSEGV, SIGBUS, and SIGTERM and
> >>>>> cleans up. Other than hard kill I'm not sure what else might
> >>>>> cause it to exit without cleaning up.
> >>>>
> >>>> I shutdown containers with `lxc-stop -n container-name`
> >>>> (lxc.stopsignal=30 (SIGPWR)), however this signal should never go
> >>>> to lxc_monitord, right?
> >>>
> >>> Right, that goes to the init process of the container.
> >
>
> _______________________________________________
> lxc-users mailing list
> [email protected]
> http://lists.linuxcontainers.org/listinfo/lxc-users
>
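
For reference, the pidfile idea from the quoted thread (monitord writes its pid somewhere under $LXCPATH, and a client probes it with kill(pid, 0) before trusting the fifo) could be sketched roughly like this. It is only an illustration in Python, not LXC code: the pidfile itself, the function names pid_alive and remove_stale_fifo, and the "delete the fifo if the pid is gone" policy are all assumptions made for the example.

```python
import os


def pid_alive(pid: int) -> bool:
    """Return True if a process with this pid exists.

    Signal 0 delivers nothing; the kernel only performs the existence
    and permission checks.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:   # ESRCH: no such process
        return False
    except PermissionError:      # EPERM: it exists but belongs to another user
        return True
    return True


def remove_stale_fifo(pidfile: str, fifo: str) -> bool:
    """Remove `fifo` when the pid recorded in `pidfile` is not running.

    Both paths are hypothetical. Returns True if the fifo was removed.
    """
    try:
        with open(pidfile) as f:
            pid = int(f.read().strip())
        if pid <= 0:
            # 0 and negative values address process groups, not one process
            raise ValueError("not a usable pid")
    except (FileNotFoundError, ValueError):
        pid = None               # no usable pid recorded: treat the monitor as gone

    if pid is not None and pid_alive(pid):
        return False             # monitor still running, leave its fifo alone

    try:
        os.unlink(fifo)
    except FileNotFoundError:
        return False             # nothing left to clean up
    return True
```

A wrapper script could call remove_stale_fifo() before each lxc command. As the thread already points out, this check is racy: the pid may be reused by an unrelated process, and the monitor can exit between the check and the next connect. A lock held for the monitor's whole lifetime, along the lines of the fcntl(fd, F_SETLKW) suggestion, avoids the pid-reuse problem because the kernel drops the lock when the process exits.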
_______________________________________________
lxc-users mailing list
[email protected]
http://lists.linuxcontainers.org/listinfo/lxc-users
