[Devel] [PATCH 2/9] General infrastructure for checkpoint restart

2008-10-16 Thread Dave Hansen
off-by: Oren Laadan <[EMAIL PROTECTED]> Acked-by: Serge Hallyn <[EMAIL PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]> --- linux-2.6.git-dave/Makefile |2 linux-2.6.git-dave/checkpoint/Makefile|2 linux-2.6.git-dave

[Devel] [PATCH 3/9] x86 support for checkpoint/restart

2008-10-16 Thread Dave Hansen
trigger an explicit error. Signed-off-by: Oren Laadan <[EMAIL PROTECTED]> Acked-by: Serge Hallyn <[EMAIL PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]> --- linux-2.6.git-dave/arch/x86/mm/Makefile |2 linux-2.6.git-dave/arch/x86/m

[Devel] [PATCH 9/9] Restore open file descriptors

2008-10-16 Thread Dave Hansen
the file pointer from the hash as an FD. This patch only handles basic FDs - regular files, directories and also symbolic links. Signed-off-by: Oren Laadan <[EMAIL PROTECTED]> Acked-by: Serge Hallyn <[EMAIL PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]> --- linux-2.

[Devel] [PATCH 1/9] Create syscalls: sys_checkpoint, sys_restart

2008-10-16 Thread Dave Hansen
s. For sys_checkpoint the first argument identifies the target container; for sys_restart it will identify the checkpoint image. Signed-off-by: Oren Laadan <[EMAIL PROTECTED]> Acked-by: Serge Hallyn <[EMAIL PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]> --- linux-2.6.

[Devel] [PATCH 8/9] Dump open file descriptors

2008-10-16 Thread Dave Hansen
c links. Signed-off-by: Oren Laadan <[EMAIL PROTECTED]> Acked-by: Serge Hallyn <[EMAIL PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]> --- linux-2.6.git-dave/checkpoint/Makefile|2 linux-2.6.git-dave/checkpoint/checkpoint.c|4 linux-2.6.g

[Devel] [PATCH 6/9] Checkpoint/restart: initial documentation

2008-10-16 Thread Dave Hansen
From: Oren Laadan <[EMAIL PROTECTED]> Covers application checkpoint/restart, overall design, interfaces and checkpoint image format. Signed-off-by: Oren Laadan <[EMAIL PROTECTED]> Acked-by: Serge Hallyn <[EMAIL PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]>

[Devel] [PATCH 4/9] Dump memory address space

2008-10-16 Thread Dave Hansen
PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]> --- linux-2.6.git-dave/arch/x86/mm/checkpoint.c | 31 + linux-2.6.git-dave/arch/x86/mm/restart.c|1 linux-2.6.git-dave/checkpoint/Makefile |3 linux-2.6.git-dave/checkpoint/checkpoint.c

[Devel] [PATCH 0/9] Kernel-based checkpoint/restart

2008-10-16 Thread Dave Hansen
I'd like to see these merged into -mm and on the way to mainline. The entire freakin' world is cc'd. So sue me. :) Why do we want it? It allows containers to be moved between physical machines' kernels in the same way that VMware can move VMs between physical machines' hypervisors. There are c

[Devel] [PATCH 5/9] Restore memory address space

2008-10-16 Thread Dave Hansen
D]> Acked-by: Serge Hallyn <[EMAIL PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]> --- linux-2.6.git-dave/arch/x86/mm/restart.c| 64 +++ linux-2.6.git-dave/checkpoint/Makefile |2 linux-2.6.git-dave/checkpoint/checkpoint_arch.h |

[Devel] [PATCH 7/9] Infrastructure for shared objects

2008-10-16 Thread Dave Hansen
object is created, and added to the hash table (this time indexed by its identifier). Otherwise, the object in the hash table is used. Signed-off-by: Oren Laadan <[EMAIL PROTECTED]> Acked-by: Serge Hallyn <[EMAIL PROTECTED]> Signed-off-by: Dave Hansen <[EMAIL PROTECTED]>
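The lookup-or-create step described in this snippet can be sketched in userspace C. This is only an illustration under stated assumptions: `toy_objhash`, `toy_get_or_create()` and `toy_new_counter()` are hypothetical names, not the patch's API, and the chained table is a stand-in for whatever hash the patch actually uses. The first time an identifier is seen, the object is created and stored under that identifier; later lookups return the stored object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_BUCKETS 16

/* One table entry: maps an identifier from the image to a live object. */
struct toy_obj {
    unsigned int id;
    void *ptr;
    struct toy_obj *next;   /* chain for identifiers sharing a bucket */
};

struct toy_objhash {
    struct toy_obj *bucket[TOY_BUCKETS];
};

/* Hypothetical constructor used for the demonstration: a zeroed int. */
static void *toy_new_counter(void)
{
    return calloc(1, sizeof(int));
}

/*
 * Return the object registered under 'id'. If none exists yet, build one
 * with 'create', register it under 'id', and return it.
 * Returns NULL on allocation failure.
 */
static void *toy_get_or_create(struct toy_objhash *h, unsigned int id,
                               void *(*create)(void))
{
    unsigned int b = id % TOY_BUCKETS;
    struct toy_obj *o;

    for (o = h->bucket[b]; o != NULL; o = o->next)
        if (o->id == id)
            return o->ptr;          /* seen before: reuse the object */

    o = malloc(sizeof(*o));
    if (o == NULL)
        return NULL;
    o->ptr = create();
    if (o->ptr == NULL) {
        free(o);
        return NULL;
    }
    o->id = id;
    o->next = h->bucket[b];         /* first sighting: add to the table */
    h->bucket[b] = o;
    return o->ptr;
}
```

Two requests for the same identifier yield the same pointer, which is what lets several restored tasks end up sharing one object.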

[Devel] [PATCH] fix oops in checkpoint/restart error path

2008-10-16 Thread Dave Hansen
The 'ctx' is kzalloc()'d, so all its contents are zeroed. It has a list_head, which is walked during cr_ctx_free(). list_for_each() on an uninitialized list_head is bad. Whoops. Signed-off-by: Dave Hansen <[EMAIL PROTECTED]> --- linux-2.6.git-dave/checkpoint/sys.c |
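Why a zero-filled list_head is a problem can be shown with a small userspace model. This is a sketch, not kernel code: the struct mirrors the kernel's circular list head, but `init_list_head()`, `count_entries()` and `list_add_front()` are illustrative names. An initialized empty list points at itself, so a walk terminates immediately; a zeroed head has NULL links, so the real list_for_each() would dereference NULL on its first step. The model reports that case as -1 instead of crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's circular doubly linked list head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list points at itself; this mirrors what INIT_LIST_HEAD() does. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* A head left untouched after a zeroing allocation has NULL links. */
static int list_head_is_zeroed(const struct list_head *head)
{
    return head->next == NULL && head->prev == NULL;
}

/*
 * Count entries the way list_for_each() walks them: start at head->next
 * and stop when the walk returns to head. For a zeroed head the walk
 * would dereference NULL, so this model returns -1 instead.
 */
static int count_entries(const struct list_head *head)
{
    const struct list_head *pos;
    int n = 0;

    if (list_head_is_zeroed(head))
        return -1;
    for (pos = head->next; pos != head; pos = pos->next)
        n++;
    return n;
}

/* Insert 'entry' right after 'head'. */
static void list_add_front(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}
```

The fix implied by the snippet is the usual one: initialize the list head right after allocating the context, before any error path can reach the code that walks it.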

[Devel] Re: [PATCH 5/9] Restore memory address space

2008-10-17 Thread Dave Hansen
On Fri, 2008-10-17 at 10:44 +0200, Nadia Derbey wrote: > On Thu, 2008-10-16 at 11:14 -0700, Dave Hansen wrote: > > +static int cr_page_read(struct cr_ctx *ctx, struct page *page, char *buf) > > +{ > > + void *ptr; > > + int ret; > > + > > + ret = cr

[Devel] Re: [PATCH 0/9] OpenVZ kernel based checkpointing/restart

2008-10-17 Thread Dave Hansen
On Wed, 2008-09-03 at 14:57 +0400, Andrey Mirkin wrote: > This patchset introduces kernel based checkpointing/restart as it is > implemented in the OpenVZ project. This patchset has limited functionality and > is able to checkpoint/restart only a single process. Recently Oren Laadan > sent another kerne

Re: [Devel] Re: [PATCH 0/9] OpenVZ kernel based checkpointing/restart

2008-10-20 Thread Dave Hansen
On Mon, 2008-10-20 at 16:14 +0400, Andrey Mirkin wrote: > Right now my patchset (v2) provides an ability to checkpoint and restart a > group of processes. The process of checkpointing and restart can be initiated > from external process (not from the process which should be checkpointed). Absolu

[Devel] Re: [PATCH 0/9] OpenVZ kernel based checkpointing/restart

2008-10-20 Thread Dave Hansen
On Mon, 2008-10-20 at 13:10 +0200, Louis Rilling wrote: > To be fair, and since (IIRC) the initial intent was to start with OpenVZ's > approach, shouldn't Oren answer the same questions with respect to Andrey's > patchset? I'm only really "supporting" Oren's patch set because he got it out way bef

[Devel] Re: [PATCH 02/10] Make checkpoint/restart functionality modular

2008-10-20 Thread Dave Hansen
On Sat, 2008-10-18 at 03:11 +0400, Andrey Mirkin wrote: > +struct cpt_operations > +{ > + struct module * owner; > + int (*checkpoint)(pid_t pid, int fd, unsigned long flags); > + int (*restart)(int ctid, int fd, unsigned long flags); > +}; I think this is pretty useless obfuscat

[Devel] Re: [PATCH 03/10] Introduce context structure needed during checkpointing/restart

2008-10-20 Thread Dave Hansen
On Sat, 2008-10-18 at 03:11 +0400, Andrey Mirkin wrote: > +typedef struct cpt_context > +{ > + pid_t pid;/* should be changed to ctid later */ > + int ctx_id; /* context id */ > + struct list_head ctx_list; > + int refcount; > +

[Devel] Re: [RFC v6][PATCH 0/9] Kernel based checkpoint/restart

2008-10-20 Thread Dave Hansen
On Fri, 2008-10-17 at 17:00 +1030, David Newall wrote: > > The strace/gdb example is *really* hard; but for vfork, you just wait > > until it's over. The interval between vfork and exec/exit should be > > short enough not to affect the overall time for a checkpoint > > A malicious user could trivi

[Devel] Re: [PATCH 06/10] Introduce functions to dump mm

2008-10-20 Thread Dave Hansen
On Sat, 2008-10-18 at 03:11 +0400, Andrey Mirkin wrote: > +static void page_get_desc(struct vm_area_struct *vma, unsigned long addr, > + struct page_desc *pdesc, cpt_context_t * ctx) > +{ > + struct mm_struct *mm = vma->vm_mm; > + pgd_t *pgd; > + pud_t *pud

[Devel] External checkpoint patch

2008-10-21 Thread Dave Hansen
Oren, Could you take a look over Cedric's external checkpoint patch? http://git.kernel.org/?p=linux/kernel/git/daveh/linux-2.6-cr.git;a=commit;h=28ffabbc17d3641eee2a7eb66f714c266c400263 It seems pretty small to me. -- From a2f88cbc023e2e9be5eb554fe64078a3d7d2ade6 Mon Sep 17 00:00:00 2001 From

[Devel] Re: [RFC v7][PATCH 0/9] Kernel based checkpoint/restart

2008-10-21 Thread Dave Hansen
On Tue, 2008-10-21 at 12:21 -0700, Andrew Morton wrote: > On Mon, 20 Oct 2008 01:40:28 -0400 > Oren Laadan <[EMAIL PROTECTED]> wrote: > > These patches implement basic checkpoint-restart [CR]. This version > > (v7) supports basic tasks with simple private memory, and open files > > (regular files a

[Devel] Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

2008-10-21 Thread Dave Hansen
On Tue, 2008-10-21 at 22:55 -0400, Daniel Jacobowitz wrote: > I haven't been following - but why this whole container restriction? > Checkpoint/restart of individual processes is very useful too. > There are issues with e.g. IPC, but I'm not convinced they're > substantially different than the issu

Re: [Devel] Re: [PATCH 08/10] Introduce functions to restart a process

2008-10-23 Thread Dave Hansen
On Thu, 2008-10-23 at 13:54 +0400, Andrey Mirkin wrote: > We are putting special structure on stack, which is used at the very end of > the whole restart procedure to restore complex states (ptrace is one of such > cases). Right now I don't need to use this structure as we have a deal with > sim

Re: [Devel] Re: [PATCH 08/10] Introduce functions to restart a process

2008-10-23 Thread Dave Hansen
On Thu, 2008-10-23 at 13:00 +0400, Andrey Mirkin wrote: > > > >>> It is not related to the freezer code actually. > > >>> That is needed to restart syscalls. Right now I don't have a code in my > > >>> patchset which restarts a syscall, but later I plan to add it. > > >>> In OpenVZ checkpointing w

Re: [Devel] Re: [PATCH 06/10] Introduce functions to dump mm

2008-10-23 Thread Dave Hansen
On Thu, 2008-10-23 at 12:43 +0400, Andrey Mirkin wrote: > > > +#ifdef CONFIG_X86 > > > + if (pmd_huge(*pmd)) { > > > + eprintk("page_huge\n"); > > > + goto out_unsupported; > > > + } > > > +#endif > > > > I take it you know that this breaks with the 1GB (x86_

[Devel] Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

2008-10-27 Thread Dave Hansen
On Mon, 2008-10-27 at 07:03 -0400, Oren Laadan wrote: > > In our implementation, we simply refused to checkpoint setid > programs. > > True. And this works very well for HPC applications. > > However, it doesn't work so well for server applications, for > instance. > > Also, you could use file s

[Devel] Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

2008-10-27 Thread Dave Hansen
On Mon, 2008-10-27 at 17:51 -0400, Oren Laadan wrote: > > Instead, how about a flag to sys_checkpoint() -- DO_RISKY_CHECKPOINT > > -- > > which checkpoints despite !may_checkpoint? > > I also agree with Matt - so we have a quorum :) > > so just to clarify: sys_checkpoint() is to fail (with

[Devel] [BIG RFC] Filesystem-based checkpoint

2008-10-28 Thread Dave Hansen
I hate the syscall. It's a very un-Linux-y way of doing things. There, I said it. Here's an alternative. It still uses the syscall to initiate things, but it uses debugfs to transport the data instead. This is just a concept demonstration. It doesn't actually work, and I wouldn't be using debu

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-28 Thread Dave Hansen
On Tue, 2008-10-28 at 15:56 -0500, Serge E. Hallyn wrote: > Quoting Dave Hansen ([EMAIL PROTECTED]): > > I hate the syscall. It's a very un-Linux-y way of doing things. There, > > Not really the syscall, but the writing to the file from the kernel. > Any time I see

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-28 Thread Dave Hansen
On Tue, 2008-10-28 at 15:56 -0500, Serge E. Hallyn wrote: > If you like I can take a shot at whipping up the new mini-fs, though > I think you're having fun :) There are a couple of concepts that just get easier once you start thinking of this as an entire fs too. For instance, cr_ctx just become

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-30 Thread Dave Hansen
On Thu, 2008-10-30 at 12:25 -0400, Oren Laadan wrote: > Dave Hansen wrote: > > On Tue, 2008-10-28 at 15:56 -0500, Serge E. Hallyn wrote: > >> If you like I can take a shot at whipping up the new mini-fs, though > >> I think you're having fun :) > > > >

Re: [Devel] Re: [PATCH 0/9] OpenVZ kernel based checkpointing/restart

2008-10-30 Thread Dave Hansen
On Thu, 2008-10-30 at 10:02 +0400, Andrey Mirkin wrote: > Anyway we should ask everyone what they think about user- and kernel- based > process creation. > Dave, Serge, Cedric, Daniel, Louis what do you think about that? My worry is where a single sys_restart() plus in-kernel process creation tak

Re: [Devel] Re: [PATCH 0/9] OpenVZ kernel based checkpointing/restart

2008-10-30 Thread Dave Hansen
On Thu, 2008-10-30 at 12:47 +0100, Louis Rilling wrote: > 1) this prevents userspace from doing weird things, like changing the task > tree > and let the kernel detect it and deal with the mess this creates (think about > two threads being restarted in separate processes that do not even share the

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-30 Thread Dave Hansen
On Thu, 2008-10-30 at 14:19 -0400, Oren Laadan wrote: > I'm not sure why you say it's "un-linux-y" to begin with. But to the > point, here are my thoughts: > > 1. What you suggest is to expose the internal data to user space and > pull it. Isn't that what cryo tried to do? No, cryo attempted to u

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-30 Thread Dave Hansen
On Thu, 2008-10-30 at 14:28 -0500, Serge E. Hallyn wrote: > Now maybe eventually he's going to propose something more esoteric where > doing the mount() actually starts the checkpoint (that's where I figured > he'd be heading), but I think it would still be one action on the part > of userspace tel

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-30 Thread Dave Hansen
On Thu, 2008-10-30 at 15:47 -0400, Oren Laadan wrote: > 3. Your approach doesn't play well with what I call "checkpoint that > involves self". This term refers to a process that checkpoints itself > (and only itself), or to a process that attempts to checkpoint its own > container. In both

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-30 Thread Dave Hansen
On Thu, 2008-10-30 at 16:15 -0400, Oren Laadan wrote: > Dave Hansen wrote: > > This is a blob. It's simply a blob exported in a filesystem. Note that > > it exports the same format as the 'big blob' with the same types. Stick > > a couple of cr_hdr* objects

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-30 Thread Dave Hansen
On Thu, 2008-10-30 at 16:33 -0700, Eric W. Biederman wrote: > Dave Hansen <[EMAIL PROTECTED]> writes: > > I hate the syscall. It's a very un-Linux-y way of doing things. There, > > I said it. Here's an alternative. It still uses the syscall to > > in

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-10-31 Thread Dave Hansen
On Thu, 2008-10-30 at 20:12 -0700, Eric W. Biederman wrote: > Dave Hansen <[EMAIL PROTECTED]> writes: > >> > System calls in Linux are fast. Doing lots of them is not a problem. > >> > If it becomes one, we can always export a condensed version of this >

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-11-03 Thread Dave Hansen
On Fri, 2008-10-31 at 13:51 -0700, Eric W. Biederman wrote: > Dave Hansen <[EMAIL PROTECTED]> writes: > > Eric, you were saying that my interface had way too many "dangerous > > syscalls". How does this relate to user namespaces and creating objects > > wit

[Devel] Re: [BIG RFC] Filesystem-based checkpoint

2008-11-03 Thread Dave Hansen
On Mon, 2008-11-03 at 09:23 -0800, Dave Hansen wrote: > The problem is that we can't possibly use refcounts (at least the ones > we have today) alone. For instance, with the pid namespace, we would > have 1 ref for the 'init' process doing the sys_restore() call and

[Devel] Re: [patch 1/1][RFC] do not sys_reboot when not in init_pid_ns

2008-11-03 Thread Dave Hansen
On Sun, 2008-11-02 at 01:00 +0100, Daniel Lezcano wrote: > +++ net-next-2.6/kernel/sys.c > @@ -355,6 +355,9 @@ asmlinkage long sys_reboot(int magic1, i > if (!capable(CAP_SYS_BOOT)) > return -EPERM; > > + if (current->nsproxy->pid_ns != &init_pid_ns) > +

[Devel] Re: [PATCH 1/3] ftrace: add function tracing to single thread

2008-11-25 Thread Dave Hansen
On Tue, 2008-11-25 at 19:11 -0500, Steven Rostedt wrote: > > > > Steven Rostedt <[EMAIL PROTECTED]> wrote: > > > > > > > > > This patch adds the ability to function trace a single thread. > > > > > The file: > > > > > > > > > > /debugfs/tracing/set_ftrace_pid > > > > > > > > > > contains the p

[Devel] Re: [RFC v10][PATCH 08/13] Dump open file descriptors

2008-12-01 Thread Dave Hansen
On Fri, 2008-11-28 at 10:19 +, Al Viro wrote: > On Wed, Nov 26, 2008 at 08:04:39PM -0500, Oren Laadan wrote: > > +int cr_scan_fds(struct files_struct *files, int **fdtable) > > +{ > > + struct fdtable *fdt; > > + int *fds; > > + int i, n = 0; > > + int tot = CR_DEFAULT_FDTABLE; > > + >

[Devel] Re: [RFC v10][PATCH 05/13] Dump memory address space

2008-12-01 Thread Dave Hansen
On Fri, 2008-11-28 at 10:53 +, Al Viro wrote: > > > +static int cr_ctx_checkpoint(struct cr_ctx *ctx, pid_t pid) > > +{ > > + ctx->root_pid = pid; > > + > > + /* > > + * assume checkpointer is in container's root vfs > > + * FIXME: this works for now, but will change with rea

[Devel] Re: [RFC v10][PATCH 02/13] Checkpoint/restart: initial documentation

2008-12-01 Thread Dave Hansen
On Fri, 2008-11-28 at 10:45 +, Al Viro wrote: > On Wed, Nov 26, 2008 at 08:04:33PM -0500, Oren Laadan wrote: > > +Currently, namespaces are not saved or restored. They will be treated > > +as a class of a shared object. In particular, it is assumed that the > > +task's file system namespace is

[Devel] Re: [RFC v10][PATCH 09/13] Restore open file descriptors

2008-12-01 Thread Dave Hansen
On Fri, 2008-11-28 at 11:27 +, Al Viro wrote: > On Wed, Nov 26, 2008 at 08:04:40PM -0500, Oren Laadan wrote: > > +/** > > + * cr_attach_get_file - attach (and get) lonely file ptr to a file > > descriptor > > + * @file: lonely file pointer > > + */ > > +static int cr_attach_get_file(struct fil

[Devel] Re: [RFC v10][PATCH 08/13] Dump open file descriptors

2008-12-01 Thread Dave Hansen
On Mon, 2008-12-01 at 15:23 -0500, Oren Laadan wrote: > Verifying that the size doesn't change does not ensure that the table's > contents remained the same, so we can still end up with obsolete data. With the realloc() scheme, we have virtually no guarantees about how the fdtable that we read rel

[Devel] Re: [RFC v10][PATCH 09/13] Restore open file descriptors

2008-12-01 Thread Dave Hansen
On Mon, 2008-12-01 at 15:41 -0500, Oren Laadan wrote: > >>> + fd = cr_attach_file(file); /* no need to cleanup 'file' below */ > >>> + if (fd < 0) { > >>> + filp_close(file, NULL); > >>> + ret = fd; > >>> + goto out; > >>> + } > >>> + > >>> + /* register n

[Devel] Re: [RFC v10][PATCH 09/13] Restore open file descriptors

2008-12-01 Thread Dave Hansen
On Mon, 2008-12-01 at 16:00 -0500, Oren Laadan wrote: > > Is that sufficient? It seems like we're depending on the fd's reference > > to the 'struct file' to keep it valid in the hash. If something happens > > to the fd (like the other thread messing with it) the 'struct file' can > > still go aw

[Devel] Re: [RFC v10][PATCH 08/13] Dump open file descriptors

2008-12-01 Thread Dave Hansen
On Mon, 2008-12-01 at 13:02 -0800, Linus Torvalds wrote: > On Mon, 1 Dec 2008, Dave Hansen wrote: > > > > Why is this done in two steps? It first grabs a list of fd numbers > > which needs to be validated, then goes back and turns those into 'struct > > file'

[Devel] Re: [RFC v10][PATCH 09/13] Restore open file descriptors

2008-12-01 Thread Dave Hansen
On Mon, 2008-12-01 at 16:00 -0500, Oren Laadan wrote: > Dave Hansen wrote: > > On Mon, 2008-12-01 at 15:41 -0500, Oren Laadan wrote: > >>>>> + fd = cr_attach_file(file); /* no need to cleanup 'file' below > >>>>> */ > >>

[Devel] Re: [RFC v10][PATCH 09/13] Restore open file descriptors

2008-12-01 Thread Dave Hansen
On Mon, 2008-12-01 at 13:07 -0800, Dave Hansen wrote: > > When a shared object is inserted to the hash we automatically take another > > reference to it (according to its type) for as long as it remains in the > > hash. See: 'cr_obj_ref_grab()' and 'cr_obj_ref_dro

[Devel] [RFC][PATCH 4/4] checkpoint/restart: simplify cr_scan_fds()

2008-12-02 Thread Dave Hansen
I think having all the allocations in one place, plus the reduction in the number of lines speaks for itself. In any case, this is last in the series and can be dropped if you don't like it. --- linux-2.6.git-dave/checkpoint/ckpt_file.c | 13 - 1 file changed, 4 insertions(+), 9

[Devel] [RFC][PATCH 1/4] checkpoint/restart: fix code to handle open symlinks

2008-12-02 Thread Dave Hansen
There's no such thing as an opened symlink. --- linux-2.6.git-dave/checkpoint/ckpt_file.c |3 --- linux-2.6.git-dave/checkpoint/rstr_file.c |1 - linux-2.6.git-dave/include/linux/checkpoint_hdr.h |1 - 3 files changed, 5 deletions(-) diff -puN checkpoint/ckpt_file.

[Devel] [RFC][PATCH 2/4] checkpoint/restart: fix cr_ctx_checkpoint() locking

2008-12-02 Thread Dave Hansen
The existing ctx->vfsroot is dangerous. We take it from the root task via: ctx->root_task->fs->root and don't take a ref on the 'fs' in there. We also take a ref on the fs.root which is worthless. So, replace ctx->vfsroot with a reference to the 'fs' instead. This gives us easy access to fs.ro

[Devel] [RFC][PATCH 3/4] checkpoint/restart: fix 'struct file' references

2008-12-02 Thread Dave Hansen
We must hold a reference to 'file' when it gets passed in to fd_install(). After fd_install() the entry in the fdtable takes on the reference. If we don't hold a reference to it, another thread can come along after fd_install() and fput() it before we've done the get_file(). In cr_read_fd_data()
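The ordering problem can be modeled with a toy reference counter in userspace C. This is a sketch under stated assumptions: every `toy_*` name is hypothetical, the table is a plain array, and there is no real concurrency, only the sequence of steps a racing thread could produce. The point it illustrates is that the installing code must take its own reference before publishing the file in the table, because the table entry takes over the one reference the file started with.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FDS 8

/* Toy stand-in for 'struct file': only the reference count matters here. */
struct toy_file {
    int refs;
    int freed;      /* set when the last reference is dropped */
};

static struct toy_file *toy_fdtable[TOY_MAX_FDS];

static void toy_get_file(struct toy_file *f)
{
    f->refs++;
}

static void toy_fput(struct toy_file *f)
{
    if (--f->refs == 0)
        f->freed = 1;   /* the real kernel would release the object here */
}

/* Like fd_install(): the table slot takes over one reference. */
static void toy_fd_install(int fd, struct toy_file *f)
{
    toy_fdtable[fd] = f;
}

/* What another thread closing the descriptor amounts to. */
static void toy_close(int fd)
{
    struct toy_file *f = toy_fdtable[fd];

    toy_fdtable[fd] = NULL;
    if (f != NULL)
        toy_fput(f);
}

/* Safe ordering: take our own reference first, then publish the file. */
static struct toy_file *toy_attach_get_file(int fd, struct toy_file *f)
{
    toy_get_file(f);        /* the reference we keep for ourselves */
    toy_fd_install(fd, f);  /* the original reference now belongs to the table */
    return f;
}
```

With the safe ordering, a close from another thread right after the install drops only the table's reference and the file stays alive. With the reverse ordering (install first, get afterwards), the same close can drop the last reference before the get happens.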

[Devel] Re: [RFC][PATCH 2/4] checkpoint/restart: fix cr_ctx_checkpoint() locking

2008-12-02 Thread Dave Hansen
On Tue, 2008-12-02 at 17:22 -0500, Oren Laadan wrote: > any reason why you want to reference the entire ->fs instead of, as I > suggested, make a *copy* of ->fs->root and then reference its > contents ? This looked simpler to me. We're going to have to rip this sucker out anyway, so I went for wh

[Devel] Re: [RFC][PATCH 4/4] checkpoint/restart: simplify cr_scan_fds()

2008-12-03 Thread Dave Hansen
On Wed, 2008-12-03 at 04:48 -0500, Oren Laadan wrote: > (as discussed in the LKML thread) as far as I can see the existing code is > safe, and this code is not more correct (for restart) in terms of races with > changes to the file table ? Agreed. This just saves a few lines of code. -- Dave __

[Devel] Re: [PATCH 2/3] ftrace: use struct pid

2008-12-04 Thread Dave Hansen
On Thu, 2008-12-04 at 04:42 -0800, Eric W. Biederman wrote: > > > +static void clear_ftrace_pid_task(struct pid **pid) > > +{ > > + struct task_struct *p; > > + > rcu_read_lock(); > > > + do_each_pid_task(*pid, PIDTYPE_PID, p) { > > + clear_tsk_trace_trace(p); > > +

[Devel] Re: [PATCH 2/3] ftrace: use struct pid

2008-12-04 Thread Dave Hansen
On Thu, 2008-12-04 at 04:56 -0800, Dave Hansen wrote: > On Thu, 2008-12-04 at 04:42 -0800, Eric W. Biederman wrote: > > > > > +static void clear_ftrace_pid_task(struct pid **pid) > > > +{ > > > + struct task_struct *p; > > > + > >

[Devel] Re: [Testing CGROUP inside CONTAINER]: BUG#1

2008-12-04 Thread Dave Hansen
On Thu, 2008-12-04 at 18:55 +0530, gowrishankar wrote: > It is expected behaviour of container, as PIDs in other namespaces will > always be shown as 0. > So here, these 0s are from system ns (probably, as you had only system > ns and delta ns). I think it is pretty bogus to be showing the 0's.

[Devel] checkpointing VFS internals

2008-12-04 Thread Dave Hansen
Try the following: mkdir foo bar mount --bind foo bar (date; cat) > bar/file In another terminal: cat bar/file umount bar # should say busy umount -l bar cat bar/file # -EEXIST # ls -l /proc/`pidof cat`/fd/ total 0 lrwx-- 1 dave dave 6

[Devel] Re: [PATCH 2/3] ftrace: use struct pid

2008-12-04 Thread Dave Hansen
On Thu, 2008-12-04 at 05:40 -0800, Eric W. Biederman wrote: > Dave Hansen <[EMAIL PROTECTED]> writes: > > On Thu, 2008-12-04 at 04:56 -0800, Dave Hansen wrote: > >> On Thu, 2008-12-04 at 04:42 -0800, Eric W. Biederman wrote: > >> > > >> > >

[Devel] Re: [PATCH 2/3] ftrace: use struct pid

2008-12-04 Thread Dave Hansen
On Thu, 2008-12-04 at 05:40 -0800, Eric W. Biederman wrote: > Dave Hansen <[EMAIL PROTECTED]> writes: > > On Thu, 2008-12-04 at 04:56 -0800, Dave Hansen wrote: > >> On Thu, 2008-12-04 at 04:42 -0800, Eric W. Biederman wrote: > >> > > >> > >

[Devel] Re: [PATCH] checkpoint/restart: refuse checkpoint on detached file

2008-12-05 Thread Dave Hansen
On Thu, 2008-12-04 at 22:41 -0600, Serge E. Hallyn wrote: > > @@ -158,6 +173,12 @@ cr_write_fd_ent(struct cr_ctx *ctx, struct > files_struct *files, int fd) > goto out; > } > > + /* Make sure this isn't under some detached tree */ > + if (file_in_detached_tree(

[Devel] Re: [PATCH] checkpoint/restart: refuse checkpoint on detached file

2008-12-05 Thread Dave Hansen
On Fri, 2008-12-05 at 16:46 -0600, Serge E. Hallyn wrote: > Quoting Dave Hansen ([EMAIL PROTECTED]): > > On Thu, 2008-12-04 at 22:41 -0600, Serge E. Hallyn wrote: > > > > > > @@ -158,6 +173,12 @@ cr_write_fd_ent(struct cr_ctx *ctx, struct > >

[Devel] Re: [PATCH] cgroups: not to iterate other namespace process inside container

2008-12-08 Thread Dave Hansen
On Mon, 2008-12-08 at 12:22 +0530, gowrishankar wrote: > static int pid_array_load(pid_t *pidarray, int npids, struct cgroup *cgrp) > { > -int n = 0; > +int n = 0, ret; Please declare these separately unless there's a really good reason not to. > struct cgroup_iter it; > struc

[Devel] Re: [PATCH] cgroups: skip processes from other namespaces when listing a cgroup

2008-12-08 Thread Dave Hansen
On Mon, 2008-12-08 at 20:44 +0530, Balbir Singh wrote: > * gowrishankar <[EMAIL PROTECTED]> [2008-12-07 20:16:01]: > > struct cgroup_iter it; > > struct task_struct *tsk; > > cgroup_iter_start(cgrp, &it); > > while ((tsk = cgroup_iter_next(cgrp, &it))) { > > if (unlikely(n =

[Devel] Re: [PATCH] pid: improved namespaced iteration over processes list

2008-12-15 Thread Dave Hansen
On Mon, 2008-12-15 at 22:19 +0530, Gowrishankar M wrote: > Below patch addresses a common solution for any place where a process > should be checked if it is associated to caller namespace. At present, > we use 'task_pid_vnr(t) > 0' to further proceed with task 't' in current > namespace. > > To a

[Devel] Re: [RFC v11][PATCH 00/13] Kernel based checkpoint/restart

2008-12-16 Thread Dave Hansen
Andrew, I just realized that you weren't cc'd on these when they were posted. Can we give them a run in -mm? As far as I know, all review comments have been addressed and there's nothing outstanding. On Fri, 2008-12-05 at 12:31 -0500, Oren Laadan wrote: > Checkpoint-restart (c/r): fixed races i

[Devel] Re: [RFC v11][PATCH 03/13] General infrastructure for checkpoint restart

2008-12-16 Thread Dave Hansen
On Tue, 2008-12-16 at 13:54 -0800, Mike Waychison wrote: > Oren Laadan wrote: > > diff --git a/checkpoint/sys.c b/checkpoint/sys.c > > index 375129c..bd14ef9 100644 > > --- a/checkpoint/sys.c > > +++ b/checkpoint/sys.c > > > +/* > > + * During checkpoint and restart the code writes outs/reads in d

[Devel] Re: [RFC v11][PATCH 03/13] General infrastructure for checkpoint restart

2008-12-16 Thread Dave Hansen
On Tue, 2008-12-16 at 14:43 -0800, Mike Waychison wrote: > Hmm, if I'm understanding you correctly, adding ref counts explicitly > (like you suggest below) would be used to let a lower layer defer > writes. Seems like this could be just as easily done with explicits > kmallocs and transferring

[Devel] Re: [PATCH 1/2] mqueue ns: move mqueue_mnt into struct ipc_namespace

2008-12-17 Thread Dave Hansen
On Wed, 2008-12-17 at 11:55 -0600, Serge E. Hallyn wrote: > Move mqueue vfsmount plus a few tunables into the > ipc_namespace struct. The CONFIG_IPC_NS boolean > and the ipc_namespace struct will serve both the > posix message queue namespaces and the SYSV ipc > namespaces. > > Signed-off-by: Ced

[Devel] Re: [PATCH 2/2] ipc namespaces: implement support for posix msqueues

2008-12-17 Thread Dave Hansen
On Wed, 2008-12-17 at 11:55 -0600, Serge E. Hallyn wrote: > -void free_ipc_ns(struct kref *kref) > +void put_ipc_ns(struct ipc_namespace *ns) > { > - struct ipc_namespace *ns; > + if (ns && atomic_dec_and_lock(&ns->count, &mq_lock)) { > + mq_clear_sbinfo(ns); > +

[Devel] Re: [PATCH 1/2] mqueue ns: move mqueue_mnt into struct ipc_namespace

2008-12-17 Thread Dave Hansen
On Wed, 2008-12-17 at 12:52 -0600, Serge E. Hallyn wrote: > > > +void mq_init_ns(struct ipc_namespace *ns) { > > > + ns->mq_queues_count = 0; > > > + ns->mq_queues_max= DFLT_QUEUESMAX; > > > + ns->mq_msg_max = DFLT_MSGMAX; > > > + ns->mq_msgsize_max = DFLT_MSGSIZEMAX; > > > +

[Devel] Re: [RFC v11][PATCH 05/13] Dump memory address space

2008-12-18 Thread Dave Hansen
On Thu, 2008-12-18 at 06:10 -0500, Oren Laadan wrote: > >> +mutex_lock(&mm->context.lock); > >> + > >> +hh->ldt_entry_size = LDT_ENTRY_SIZE; > >> +hh->nldt = mm->context.size; > >> + > >> +cr_debug("nldt %d\n", hh->nldt); > >> + > >> +ret = cr_write_obj(ctx, &h, hh); > >> +c

[Devel] Re: [RFC v11][PATCH 05/13] Dump memory address space

2008-12-18 Thread Dave Hansen
On Thu, 2008-12-18 at 06:10 -0500, Oren Laadan wrote: > >> +for (i = pgarr->nr_used; i--; /**/) > >> +page_cache_release(pgarr->pages[i]); > > > > This is sorta hard to read (and non-intuitive). Is it easier to do: > > > > for (i = 0; i < pgarr->nr_used; i++) > > page_cache_relea

[Devel] Re: [PATCH 2/5] pid: use namespaced iteration on processes while using sysrq

2008-12-18 Thread Dave Hansen
On Thu, 2008-12-18 at 22:12 +0530, Gowrishankar M wrote: > From: Gowrishankar M > > At present, while signalling processes using sysrq, process iteration > goes beyond current PID namespace. > > Below patch uses one of the proposed namespace iteration macros to fix > the boundary. I think this

[Devel] Re: [PATCH 4/5] pid: use namespaced iteration on processes while sending signal to all

2008-12-18 Thread Dave Hansen
On Thu, 2008-12-18 at 22:12 +0530, Gowrishankar M wrote: > At present we scan all processes in init namespace, whether in new namespace > or not, to send signal to all processes for container. Also we filter out > processes belonging to same namespace using task_pid_vnr(). > > Below patch proposes

[Devel] Re: [RFC v11][PATCH 05/13] Dump memory address space

2008-12-18 Thread Dave Hansen
f(unsigned > long), > >> You used PAGE_SIZE / sizeof(void *) above. Why not > __get_free_page()? > > > > Hahaha .. well, it's a guaranteed method to keep Dave Hansen from > > barking about not using kmalloc ... > > > > Personally I p

[Devel] Re: [RFC v12][PATCH 00/14] Kernel based checkpoint/restart

2009-01-06 Thread Dave Hansen
On Mon, 2008-12-29 at 04:16 -0500, Oren Laadan wrote: > Checkpoint-restart (c/r): fixed issues in error path handling (comments > from Mike Waychison) and . Updated and tested against v2.6.28 > > We'd like to push these into -mm. Hey Andrew, I think we've exhausted all the reviewers on this one,

[Devel] Re: [RFC][PATCH] x86_64 support of checkpoint/restart (Re: Checkpoint / Restart)

2009-02-04 Thread Dave Hansen
On Wed, 2009-01-28 at 11:10 +0900, Masahiko Takahashi wrote: > I'm now working on porting to x86_64 with help from Nauman Rafique. > Here is the preliminary patch. If there is someone who is interested > in x86_64 support, please join. Do you have anything more recent than this? I think there are

[Devel] Re: [RFC cr-pipe-v13][PATCH 2/3] Checkpoint open pipes

2009-02-06 Thread Dave Hansen
On Thu, 2009-02-05 at 20:26 -0600, Nathan Lynch wrote: > On Thu, 05 Feb 2009 10:45:55 +0100 > Cedric Le Goater wrote: > > > +/* cr_write_pipebuf - dump contents of a pipe/fifo (assume i_mutex > > > taken) */ > > > +static int cr_write_pipebuf(struct cr_ctx *ctx, struct pipe_inode_info > > > *pip

[Devel] Re: [RFC cr-pipe-v13][PATCH 2/3] Checkpoint open pipes

2009-02-06 Thread Dave Hansen
On Fri, 2009-02-06 at 18:20 +0100, Cedric Le Goater wrote: > > Sleeping inside mutexes is OK. In general, they're drop-in compatible > > with semaphore behavior. > > what about the vfs_write() ? Unless vfs_write() can come back and take the same mutex, I still think you're OK. -- Dave _

[Devel] Re: [cgroup or VFS ?] WARNING: at fs/namespace.c:636 mntput_no_expire+0xac/0xf2()

2009-02-09 Thread Dave Hansen
On Mon, 2009-02-09 at 09:34 +, Al Viro wrote: > On Mon, Feb 09, 2009 at 12:40:46AM -0800, Andrew Morton wrote: > > > Thread 1: > > > for ((; ;)) > > > { > > > mount -t cgroup -o cpuset xxx /mnt > /dev/null 2>&1 > > > mkdir /mnt/0 > /dev/null 2>&1 > > > rmdir /mnt/0 > /dev/

[Devel] Re: [RFC][PATCH] x86_86 support of checkpoint/restart (Re: Checkpoint / Restart)

2009-02-09 Thread Dave Hansen
On Fri, 2009-02-06 at 16:17 -0800, Nauman Rafique wrote: > The patch sent by Masahiko assumes that all the user-space registers > are saved on > the kernel stack on a system call. This is not true for the majority > of the system calls. The callee saved registers (as defined by x86_64 > ABI) - rbx,

[Devel] Re: [RFC][PATCH] x86_86 support of checkpoint/restart (Re: Checkpoint / Restart)

2009-02-09 Thread Dave Hansen
On Mon, 2009-02-09 at 10:02 -0800, Dave Hansen wrote: > Signal handling and ptrace single stepping are two places I would > imagine we have to enter the kernel and preserve those registers. Is > that why you were suggesting overloading signal delivery? There is also, of course,

[Devel] Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-10 Thread Dave Hansen
On Tue, 2009-01-27 at 12:07 -0500, Oren Laadan wrote: > Checkpoint-restart (c/r): a couple of fixes in preparation for 64bit > architectures, and a couple of fixes for bugs (comments from Serge > Hallyn, Sukadev Bhattiprolu and Nathan Lynch). Updated and tested > against v2.6.28. > > Aiming for

[Devel] Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-12 Thread Dave Hansen
On Thu, 2009-02-12 at 10:17 +0100, Ingo Molnar wrote: > * Andrew Morton wrote: > > > On Tue, 10 Feb 2009 09:05:47 -0800 > > Dave Hansen wrote: > > > > > On Tue, 2009-01-27 at 12:07 -0500, Oren Laadan wrote: > > > > Checkpoint-restart (c/r

[Devel] Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-12 Thread Dave Hansen
On Wed, 2009-02-11 at 14:14 -0800, Andrew Morton wrote: > On Tue, 10 Feb 2009 09:05:47 -0800 > Dave Hansen wrote: > > > On Tue, 2009-01-27 at 12:07 -0500, Oren Laadan wrote: > > > Checkpoint-restart (c/r): a couple of fixes in preparation for 64bit > > > archite

[Devel] What can OpenVZ do?

2009-02-12 Thread Dave Hansen
On Thu, 2009-02-12 at 11:42 -0800, Andrew Morton wrote: > On Thu, 12 Feb 2009 13:30:35 -0600 > Matt Mackall wrote: > > > On Thu, 2009-02-12 at 10:11 -0800, Dave Hansen wrote: > > > > > > - In bullet-point form, what features are missing, and should be added?

[Devel] Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-12 Thread Dave Hansen
On Thu, 2009-02-12 at 13:30 -0600, Matt Mackall wrote: > On Thu, 2009-02-12 at 10:11 -0800, Dave Hansen wrote: ... > > * Filesystem state > > * contents of files > > * mount tree for individual processes > > * flock > > * threads and sess

[Devel] How much of a mess does OpenVZ make? ; ) Was: What can OpenVZ do?

2009-02-12 Thread Dave Hansen
On Thu, 2009-02-12 at 14:10 -0800, Andrew Morton wrote: > On Thu, 12 Feb 2009 13:51:23 -0800 > Dave Hansen wrote: > > > On Thu, 2009-02-12 at 11:42 -0800, Andrew Morton wrote: > > > On Thu, 12 Feb 2009 13:30:35 -0600 > > > Matt Mackall wrote: > > >

[Devel] Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-12 Thread Dave Hansen
On Thu, 2009-02-12 at 17:05 -0600, Matt Mackall wrote: > On Thu, 2009-02-12 at 14:57 -0800, Dave Hansen wrote: > > > Also, what happens if I checkpoint a process in 2.6.30 and restore it in > > > 2.6.31 which has an expanded idea of what should be restored? Do your > >

[Devel] Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-16 Thread Dave Hansen
On Fri, 2009-02-13 at 15:28 -0800, Andrew Morton wrote: > > > For extra marks: > > > > > > - Will any of this involve non-trivial serialisation of kernel > > > objects? If so, that's getting into the > > > unacceptably-expensive-to-maintain space, I suspect. > > > > We have some structures t

[Devel] Re: What can OpenVZ do?

2009-02-16 Thread Dave Hansen
On Fri, 2009-02-13 at 11:53 +0100, Ingo Molnar wrote: > In any case, by designing checkpointing to reuse the existing LSM > callbacks, we'd hit multiple birds with the same stone. (One of > which is the constant complaints about the runtime costs of the LSM > callbacks - with checkpointing we get a

[Devel] Re: What can OpenVZ do?

2009-02-17 Thread Dave Hansen
On Wed, 2009-02-18 at 01:32 +0100, Ingo Molnar wrote: > > > Uncheckpointable should be a one-way flag anyway. We want this > > > to become usable, so uncheckpointable functionality should be as > > > painful as possible, to make sure it's getting fixed ... > > > > Again, as these patches stand,

[Devel] Re: What can OpenVZ do?

2009-02-17 Thread Dave Hansen
On Tue, 2009-02-17 at 23:23 +0100, Ingo Molnar wrote: > * Dave Hansen wrote: > > On Fri, 2009-02-13 at 11:53 +0100, Ingo Molnar wrote: > > > In any case, by designing checkpointing to reuse the existing LSM > > > callbacks, we'd hit multiple birds with the same s

[Devel] Re: What can OpenVZ do?

2009-02-18 Thread Dave Hansen
On Wed, 2009-02-18 at 19:16 +0100, Ingo Molnar wrote: > Nothing motivates more than app designers complaining about the > one-way flag. > > Furthermore, it's _far_ easier to make a one-way flag SMP-safe. > We just set it and that's it. When we unset it, what do we do about > SMP races with other t
