Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

2008-10-22 Thread Oren Laadan
Serge E. Hallyn wrote: Quoting Oren Laadan ([EMAIL PROTECTED]): Serge E. Hallyn wrote: Quoting Andrew Morton ([EMAIL PROTECTED]): On Mon, 20 Oct 2008 01:40:30 -0400 Oren Laadan [EMAIL PROTECTED] wrote: asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags) { - pr_debug

Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

2008-10-22 Thread Oren Laadan
Serge E. Hallyn wrote: Quoting Oren Laadan ([EMAIL PROTECTED]): Serge E. Hallyn wrote: Quoting Oren Laadan ([EMAIL PROTECTED]): Just thinking aloud... Is read mode appropriate? The user can edit the statefile and restart it. Admittedly the restart code should then do all

Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

2008-10-27 Thread Oren Laadan
Peter Chubb wrote: Oren == Oren Laadan [EMAIL PROTECTED] writes: Oren Nope, since we will fail to restart in many cases. We will need Oren a way to move from caller's credentials to saved credentials, Oren and even from caller's credentials to privileged credentials Oren (e.g. to reopen

[RFC v8][PATCH 0/12] Kernel based checkpoint/restart

2008-10-30 Thread Oren Laadan
Basic checkpoint-restart [C/R]: v8 adds support for external checkpoint and improves documentation. Older announcements below. The git tree tracking v8 (branch 'ckpt-v8'), and older versions, is at: git://gorgona.ncl.cs.columbia.edu/pub/git/linux-cr-dev.git (or for the latest version -

[RFC v8][PATCH 06/12] Dump memory address space

2008-10-30 Thread Oren Laadan
-present pages Changelog[v4]: - Use standard list_... for cr_pgarr Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- arch/x86/mm/checkpoint.c | 31 +++ arch/x86/mm/restart.c|1

[RFC v8][PATCH 05/12] x86 support for checkpoint/restart

2008-10-30 Thread Oren Laadan
/restore state of FPU Changelog[v5]: - Remove preempt_disable() when restoring debug registers Changelog[v4]: - Fix header structure alignment Changelog[v2]: - Pad header structures to 64 bits to ensure compatibility Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn

[RFC v9][PATCH 07/13] Infrastructure for shared objects

2008-11-10 Thread Oren Laadan
(this time indexed by its identifier). Otherwise, the object in the hash table is used. Changelog[v4]: - Fix calculation of hash table size Changelog[v3]: - Use standard hlist_... for hash table Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off

[RFC v10][PATCH 00/13] Kernel based checkpoint/restart

2008-11-26 Thread Oren Laadan
Checkpoint-restart (c/r): fixes a couple of bugs and a DoS issue (tested against v2.6.28-rc3). We'd like these to make it into -mm. This version addresses the last of the known bugs. Please pull at least the first 11 patches, as they are similar to before. Patches 1-11 are stable, providing

[RFC v10][PATCH 10/13] External checkpoint of a task other than ourself

2008-11-26 Thread Oren Laadan
of the restarting task. Changelog[v10]: - Grab vfs root of container init, rather than current process Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] --- checkpoint/checkpoint.c| 75 ++-- checkpoint/restart.c

[RFC v10][PATCH 02/13] Checkpoint/restart: initial documentation

2008-11-26 Thread Oren Laadan
Covers application checkpoint/restart, overall design, interfaces, usage, shared objects, and checkpoint image format. Changelog[v8]: - Split into multiple files in Documentation/checkpoint/... - Extend documentation, fix typos and comments from feedback Signed-off-by: Oren Laadan [EMAIL

[RFC v10][PATCH 03/13] General infrastructure for checkpoint restart

2008-11-26 Thread Oren Laadan
- Pad header structures to 64 bits to ensure compatibility Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- Makefile |2 +- checkpoint/Makefile|2 +- checkpoint

Re: [RFC v10][PATCH 09/13] Restore open file descriptors

2008-12-01 Thread Oren Laadan
Dave Hansen wrote: On Fri, 2008-11-28 at 11:27 +, Al Viro wrote: On Wed, Nov 26, 2008 at 08:04:40PM -0500, Oren Laadan wrote: +/** + * cr_attach_get_file - attach (and get) lonely file ptr to a file descriptor + * @file: lonely file pointer + */ +static int cr_attach_get_file(struct

Re: [RFC v10][PATCH 05/13] Dump memory address space

2008-12-01 Thread Oren Laadan
Dave Hansen wrote: On Fri, 2008-11-28 at 10:53 +, Al Viro wrote: +static int cr_ctx_checkpoint(struct cr_ctx *ctx, pid_t pid) +{ + ctx->root_pid = pid; + + /* + * assume checkpointer is in container's root vfs + * FIXME: this works for now, but will change with real

Re: [RFC v10][PATCH 08/13] Dump open file descriptors

2008-12-01 Thread Oren Laadan
Dave Hansen wrote: On Mon, 2008-12-01 at 15:23 -0500, Oren Laadan wrote: Verifying that the size doesn't change does not ensure that the table's contents remained the same, so we can still end up with obsolete data. With the realloc() scheme, we have virtually no guarantees about how

[RFC v11][PATCH 02/13] Checkpoint/restart: initial documentation

2008-12-05 Thread Oren Laadan
Covers application checkpoint/restart, overall design, interfaces, usage, shared objects, and checkpoint image format. Changelog[v8]: - Split into multiple files in Documentation/checkpoint/... - Extend documentation, fix typos and comments from feedback Signed-off-by: Oren Laadan [EMAIL

[RFC v11][PATCH 03/13] General infrastructure for checkpoint restart

2008-12-05 Thread Oren Laadan
- Pad header structures to 64 bits to ensure compatibility Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- Makefile |2 +- checkpoint/Makefile|2 +- checkpoint

[RFC v11][PATCH 07/13] Infrastructure for shared objects

2008-12-05 Thread Oren Laadan
of hash table size Changelog[v3]: - Use standard hlist_... for hash table Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- checkpoint/Makefile|2 +- checkpoint/objhash.c | 278

[RFC v11][PATCH 09/13] Restore open file descriptors

2008-12-05 Thread Oren Laadan
basic FDs - regular files, directories and also symbolic links. Changelog[v6]: - Balance all calls to cr_hbuf_get() with matching cr_hbuf_put() (even though it's not really needed) Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off-by: Dave

[RFC v11][PATCH 01/13] Create syscalls: sys_checkpoint, sys_restart

2008-12-05 Thread Oren Laadan
[v5]: - Config is 'def_bool n' by default Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- arch/x86/include/asm/unistd_32.h |2 + arch/x86/kernel/syscall_table_32.S |2 + checkpoint/Kconfig

[RFC v11][PATCH 04/13] x86 support for checkpoint/restart

2008-12-05 Thread Oren Laadan
- Follow Dave Hansen's refactoring of the original post Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off-by: Dave Hansen [EMAIL PROTECTED] --- arch/x86/include/asm/checkpoint_hdr.h | 85 arch/x86/mm/Makefile |2 + arch

[RFC v11][PATCH 08/13] Dump open file descriptors

2008-12-05 Thread Oren Laadan
]: - initialize 'coe' to workaround gcc false warning Changelog[v6]: - Balance all calls to cr_hbuf_get() with matching cr_hbuf_put() (even though it's not really needed) Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] Signed-off-by: Dave Hansen [EMAIL

[RFC v11][PATCH 12/13] Checkpoint multiple processes

2008-12-05 Thread Oren Laadan
for creation of processes during restart either in userspace or by the kernel. Currently we ignore threads and zombies, as well as session ids. Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] --- checkpoint/checkpoint.c| 228

[RFC v11][PATCH 00/13] Kernel based checkpoint/restart

2008-12-05 Thread Oren Laadan
Checkpoint-restart (c/r): fixed races in file handling (comments from Al Viro). Updated and tested against v2.6.28-rc7 (feaf384...) We'd like these to make it into -mm. This version addresses the last of the known bugs. Please pull at least the first 11 patches, as they are similar to

[RFC v11][PATCH 13/13] Restart multiple processes

2008-12-05 Thread Oren Laadan
inside a container, and without restoring the original pids of the processes (that is, provided that the application can tolerate such behavior). This is useful to allow multi-process restart of tasks not isolated inside a container, and also for debugging. Signed-off-by: Oren Laadan [EMAIL

[RFC v11][PATCH 10/13] External checkpoint of a task other than ourself

2008-12-05 Thread Oren Laadan
of the restarting task. Changelog[v11]: - Copy contents of 'init->fs->root' instead of pointing to them Changelog[v10]: - Grab vfs root of container init, rather than current process Signed-off-by: Oren Laadan [EMAIL PROTECTED] Acked-by: Serge Hallyn [EMAIL PROTECTED] --- checkpoint/checkpoint.c

Re: [RFC v11][PATCH 03/13] General infrastructure for checkpoint restart

2008-12-16 Thread Oren Laadan
Dave Hansen wrote: On Tue, 2008-12-16 at 13:54 -0800, Mike Waychison wrote: Oren Laadan wrote: diff --git a/checkpoint/sys.c b/checkpoint/sys.c index 375129c..bd14ef9 100644 --- a/checkpoint/sys.c +++ b/checkpoint/sys.c +/* + * During checkpoint and restart the code writes outs/reads

Re: [RFC v11][PATCH 03/13] General infrastructure for checkpoint restart

2008-12-16 Thread Oren Laadan
Mike Waychison wrote: Oren Laadan wrote: Dave Hansen wrote: On Tue, 2008-12-16 at 13:54 -0800, Mike Waychison wrote: Oren Laadan wrote: diff --git a/checkpoint/sys.c b/checkpoint/sys.c index 375129c..bd14ef9 100644 --- a/checkpoint/sys.c +++ b/checkpoint/sys.c +/* + * During

Re: [RFC v11][PATCH 05/13] Dump memory address space

2008-12-18 Thread Oren Laadan
Mike Waychison wrote: Comments below. Thanks for the detailed review. Oren Laadan wrote: For each VMA, there is a 'struct cr_vma'; if the VMA is file-mapped, it will be followed by the file name. Then comes the actual contents, in one or more chunks: each chunk begins with a header

Re: [RFC v11][PATCH 05/13] Dump memory address space

2008-12-18 Thread Oren Laadan
Dave Hansen wrote: On Thu, 2008-12-18 at 06:10 -0500, Oren Laadan wrote: +for (i = pgarr->nr_used; i--; /**/) +page_cache_release(pgarr->pages[i]); This is sorta hard to read (and non-intuitive). Is it easier to do: for (i = 0; i < pgarr->nr_used; i++) page_cache_release

Re: [RFC v11][PATCH 05/13] Dump memory address space

2008-12-18 Thread Oren Laadan
Mike Waychison wrote: Oren Laadan wrote: Mike Waychison wrote: Comments below. Thanks for the detailed review. Oren Laadan wrote: For each VMA, there is a 'struct cr_vma'; if the VMA is file-mapped, it will be followed by the file name. Then comes the actual contents, in one or more

[RFC v12][PATCH 01/14] Create syscalls: sys_checkpoint, sys_restart

2008-12-29 Thread Oren Laadan
[v5]: - Config is 'def_bool n' by default Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn se...@us.ibm.com Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- arch/x86/include/asm/unistd_32.h |2 + arch/x86/kernel/syscall_table_32.S |2 + checkpoint/Kconfig

[RFC v12][PATCH 10/14] Restore open file descriptors

2008-12-29 Thread Oren Laadan
basic FDs - regular files, directories and also symbolic links. Changelog[v12]: - Replace obsolete cr_debug() with pr_debug() Changelog[v6]: - Balance all calls to cr_hbuf_get() with matching cr_hbuf_put() (even though it's not really needed) Signed-off-by: Oren Laadan

[RFC v12][PATCH 02/14] Checkpoint/restart: initial documentation

2008-12-29 Thread Oren Laadan
Covers application checkpoint/restart, overall design, interfaces, usage, shared objects, and checkpoint image format. Changelog[v8]: - Split into multiple files in Documentation/checkpoint/... - Extend documentation, fix typos and comments from feedback Signed-off-by: Oren Laadan

[RFC v12][PATCH 11/14] External checkpoint of a task other than ourself

2008-12-29 Thread Oren Laadan
of the restarting task. Changelog[v12]: - Replace obsolete cr_debug() with pr_debug() Changelog[v11]: - Copy contents of 'init->fs->root' instead of pointing to them Changelog[v10]: - Grab vfs root of container init, rather than current process Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked

[RFC v12][PATCH 12/14] Track in-kernel when we expect checkpoint/restart to work

2008-12-29 Thread Oren Laadan
will be. This can, of course, be fixed up in the future. We might want to reset the flag when a new pid namespace is created, for instance. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/checkpoint.c|6 ++ include

[RFC v12][PATCH 14/14] Restart multiple processes

2008-12-29 Thread Oren Laadan
cr_debug() with pr_debug() Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/restart.c | 214 +++- checkpoint/sys.c | 34 ++-- include/linux/checkpoint.h | 23 - include/linux/sched.h |1 + 4 files changed, 258

[RFC v12][PATCH 05/14] x86 support for checkpoint/restart

2008-12-29 Thread Oren Laadan
header structure alignment Changelog[v2]: - Pad header structures to 64 bits to ensure compatibility - Follow Dave Hansen's refactoring of the original post Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn se...@us.ibm.com Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com

[RFC v12][PATCH 03/14] Make file_pos_read/write() public

2008-12-29 Thread Oren Laadan
These two are used in the next patch when calling vfs_read/write() --- fs/read_write.c| 10 -- include/linux/fs.h | 10 ++ 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index 969a6d9..dda4eab 100644 --- a/fs/read_write.c

[RFC v12][PATCH 07/14] Restore memory address space

2008-12-29 Thread Oren Laadan
() Changelog[v4]: - Use standard list_... for cr_pgarr Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn se...@us.ibm.com Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- arch/x86/include/asm/checkpoint_hdr.h |5 + arch/x86/mm/restart.c | 58

[RFC v12][PATCH 04/14] General infrastructure for checkpoint restart

2008-12-29 Thread Oren Laadan
to 64 bits to ensure compatibility Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn se...@us.ibm.com Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- Makefile |2 +- checkpoint/Makefile|2 +- checkpoint/checkpoint.c| 188

[RFC v12][PATCH 08/14] Infrastructure for shared objects

2008-12-29 Thread Oren Laadan
of hash table size Changelog[v3]: - Use standard hlist_... for hash table Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn se...@us.ibm.com Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- checkpoint/Makefile|2 +- checkpoint/objhash.c | 278

[RFC v12][PATCH 09/14] Dump open file descriptors

2008-12-29 Thread Oren Laadan
() - Drop useless kfree from cr_scan_fds() Changelog[v8]: - initialize 'coe' to workaround gcc false warning Changelog[v6]: - Balance all calls to cr_hbuf_get() with matching cr_hbuf_put() (even though it's not really needed) Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked

[RFC v12][PATCH 13/14] Checkpoint multiple processes

2008-12-29 Thread Oren Laadan
for creation of processes during restart either in userspace or by the kernel. Currently we ignore threads and zombies, as well as session ids. Changelog[v12]: - Replace obsolete cr_debug() with pr_debug() Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn se...@us.ibm.com

Re: [RFC v12][PATCH 13/14] Checkpoint multiple processes

2009-01-14 Thread Oren Laadan
Nathan Lynch wrote: +/* count number of tasks in tree (and optionally fill pid's in array) */ +static int cr_tree_count_tasks(struct cr_ctx *ctx) +{ +struct task_struct *root = ctx->root_task; +struct task_struct *task = root; +struct task_struct *parent = NULL; +struct

[RFC v13][PATCH 12/14] Track in-kernel when we expect checkpoint/restart to work

2009-01-27 Thread Oren Laadan
will be. This can, of course, be fixed up in the future. We might want to reset the flag when a new pid namespace is created, for instance. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/checkpoint.c|6 ++ include

[RFC v13][PATCH 14/14] Restart multiple processes

2009-01-27 Thread Oren Laadan
-checkpoint_ctx regardless of error condition - Remove unused argument 'ctx' from do_restart_task() prototype - Remove unused member 'pids_err' from 'struct cr_ctx' Changelog[v12]: - Replace obsolete cr_debug() with pr_debug() Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn

[RFC v13][PATCH 09/14] Dump open file descriptors

2009-01-27 Thread Oren Laadan
() - Drop useless kfree from cr_scan_fds() Changelog[v8]: - initialize 'coe' to workaround gcc false warning Changelog[v6]: - Balance all calls to cr_hbuf_get() with matching cr_hbuf_put() (even though it's not really needed) Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked

[RFC v13][PATCH 07/14] Restore memory address space

2009-01-27 Thread Oren Laadan
chunks of vaddrs, pages instead of one long list of each - Memory restore now maps user pages explicitly to copy data into them, instead of reading directly to user space; got rid of mprotect_fixup() Changelog[v4]: - Use standard list_... for cr_pgarr Signed-off-by: Oren Laadan

Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-03-12 Thread Oren Laadan
Oren Laadan wrote: Hi, Just got back from 3 weeks with practically no internet, and I see that I missed a big party ! Trying to catch up with what's been said so far -- [...] - Will any of this involve non-trivial serialisation of kernel objects? If so, that's getting

Re: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?

2009-03-13 Thread Oren Laadan
Dave Hansen wrote: On Fri, 2009-03-13 at 14:01 -0700, Linus Torvalds wrote: On Fri, 13 Mar 2009, Alexey Dobriyan wrote: Let's face it, we're not going to _ever_ checkpoint any kind of general case process. Just TCP makes that fundamentally impossible in the general case, and there are

Re: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?

2009-03-13 Thread Oren Laadan
Mike Waychison wrote: Linus Torvalds wrote: On Thu, 12 Mar 2009, Sukadev Bhattiprolu wrote: Ying Han [ying...@google.com] wrote: | Hi Serge: | I made a patch based on Oren's tree recently which implement a new | syscall clone_with_pid. I tested with checkpoint/restart process tree | and

Re: [RFC v13][PATCH 05/14] x86 support for checkpoint/restart

2009-03-18 Thread Oren Laadan
Nathan Lynch wrote: Hi, this is an old thread I guess, but I just noticed some issues while looking at this code. On Tue, 27 Jan 2009 12:08:03 -0500 Oren Laadan or...@cs.columbia.edu wrote: +static int cr_read_cpu_fpu(struct cr_ctx *ctx, struct task_struct *t) +{ +void *xstate_buf

[RFC v16][PATCH 13/43] c/r: introduce method '->checkpoint()' in struct vm_operations_struct

2009-05-27 Thread Oren Laadan
Signed-off-by: Oren Laadan or...@cs.columbia.edu --- include/linux/mm.h |5 + 1 files changed, 5 insertions(+), 0 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index bff1f0d..05f0ed9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -14,6 +14,8 @@ #include

[RFC v16][PATCH 34/43] c/r: save and restore ipc namespace basics

2009-05-27 Thread Oren Laadan
Save and restore the common state (parameters) of ipc namespace. Also add logic to iterate through the objects of sysvipc shared memory, message queues and semaphores. The logic to save and restore the state of these objects will be added in the next few patches. Signed-off-by: Oren Laadan

[RFC v16][PATCH 16/43] c/r: export shmem_getpage() to support shared memory

2009-05-27 Thread Oren Laadan
Export functionality to retrieve specific pages from shared memory given an inode in shmem-fs; this will be used in the next two patches to provide support for c/r of shared memory. mm/shmem.c: - shmem_getpage() and 'enum sgp_type' moved to linux/mm.h Signed-off-by: Oren Laadan

[RFC v16][PATCH 43/43] c/r: define s390-specific checkpoint-restart code

2009-05-27 Thread Oren Laadan
From: Dan Smith da...@us.ibm.com Implement the s390 arch-specific checkpoint/restart helpers. This is on top of Oren Laadan's c/r code. With these, I am able to checkpoint and restart simple programs as per Oren's patch intro. While on x86 I never had to freeze a single task to checkpoint it,

[RFC v16][PATCH 14/43] c/r: dump memory address space (private memory)

2009-05-27 Thread Oren Laadan
) to allow chunks of vaddrs, pages instead of one long list of each - Fix use of follow_page() to avoid faulting in non-present pages Changelog[v4]: - Use standard list_... for ckpt_pgarr Signed-off-by: Oren Laadan or...@cs.columbia.edu --- arch/x86/include/asm/checkpoint_hdr.h |8 + arch/x86

[RFC v16][PATCH 31/43] deferqueue: generic queue to defer work

2009-05-27 Thread Oren Laadan
operation as a whole, and later a task in particular, to defer some action until later (but not arbitrarily later) _in the restore_ operation. Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/Kconfig |5 ++ include/linux/deferqueue.h | 58 +++ kernel

[RFC v16][PATCH 40/43] c/r: support semaphore sysv-ipc

2009-05-27 Thread Oren Laadan
into the same format on 32- and 64-bit architectures, the checkpoint format is simply the dump of this array as is. TODO: this patch does not handle semaphore-undo -- this data should be saved per-task while iterating through the tasks. Signed-off-by: Oren Laadan or...@cs.columbia.edu --- include

[RFC v16][PATCH 36/43] c/r: support share-memory sysv-ipc

2009-05-27 Thread Oren Laadan
-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/memory.c | 28 - checkpoint/sys.c | 10 ++ include/linux/checkpoint.h |3 + include/linux/checkpoint_hdr.h | 19 +++- include/linux/checkpoint_types.h |1 + include/linux/shm.h

[RFC v16][PATCH 28/43] c/r: make ckpt_may_checkpoint_task() check each namespace individually

2009-05-27 Thread Oren Laadan
From: Dan Smith da...@us.ibm.com Signed-off-by: Dan Smith da...@us.ibm.com Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/checkpoint.c| 20 ++-- checkpoint/objhash.c | 28 +++ checkpoint/process.c | 101

[RFC v16][PATCH 26/43] splice: added support for pipe-to-pipe splice()

2009-05-27 Thread Oren Laadan
call to copy buffers from one pipe to another. This obvious and trivial use case for splice() was not supported until now. It reuses the functions link_ipipe_prep() and link_opipe_prep() from the tee() system call implementation. -- Signed-off-by: Oren Laadan or...@cs.columbia.edu --- fs

[RFC v16][PATCH 29/43] c/r: support for UTS namespace

2009-05-27 Thread Oren Laadan
information record - Update Documentation to reflect new location of namespace info - Support checkpoint and restart of nested UTS namespaces Signed-off-by: Dan Smith da...@us.ibm.com Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/checkpoint.c|2 - checkpoint

[RFC v16][PATCH 21/43] c/r: restart-blocks

2009-05-27 Thread Oren Laadan
for the task to execute the signal handler (by faking a signal). The handler, in turn, already has the code to handle these restart requests gracefully. Signed-off-by: Oren Laadan or...@cs.columbia.edu --- arch/x86/include/asm/checkpoint_hdr.h |1 - arch/x86/mm/checkpoint.c | 10

[RFC v16][PATCH 24/43] c/r: detect resource leaks for whole-container checkpoint

2009-05-27 Thread Oren Laadan
. Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/checkpoint.c|8 checkpoint/objhash.c | 82 ++-- include/linux/checkpoint.h |1 + 3 files changed, 88 insertions(+), 3 deletions(-) diff --git a/checkpoint/checkpoint.c b

[RFC v16][PATCH 33/43] c/r (ipc): helpers to save and restore kern_ipc_perm structures

2009-05-27 Thread Oren Laadan
of ipc objects, but does not restore them during restart. Signed-off-by: Oren Laadan or...@cs.columbia.edu --- include/linux/checkpoint.h |7 +++- include/linux/checkpoint_hdr.h | 29 ++ ipc/Makefile |1 + ipc/checkpoint.c | 81

[RFC v16][PATCH 32/43] c/r (ipc): allow allocation of a desired ipc identifier

2009-05-27 Thread Oren Laadan
During restart, we need to allocate ipc objects with the same identifiers as recorded during checkpoint. Modify the allocation code to allow an in-kernel caller to request a specific ipc identifier. The system call interface remains unchanged. Signed-off-by: Oren Laadan or...@cs.columbia.edu

[RFC v16][PATCH 22/43] c/r: checkpoint multiple processes

2009-05-27 Thread Oren Laadan
]: - Release tasklist_lock in error path in ckpt_tree_count_tasks() - Use separate index for 'tasks_arr' and 'hh' in ckpt_write_pids() Changelog[v12]: - Replace obsolete ckpt_debug() with pr_debug() Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/checkpoint.c | 237

[RFC v16][PATCH 23/43] c/r: restart multiple processes

2009-05-27 Thread Oren Laadan
'struct ckpt_ctx' Changelog[v12]: - Replace obsolete ckpt_debug() with pr_debug() Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/restart.c | 242 -- checkpoint/sys.c | 27 - include/linux/checkpoint.h |3

[RFC v16][PATCH 02/43] c/r: make file_pos_read/write() public

2009-05-27 Thread Oren Laadan
These two are used in the next patch when calling vfs_read/write() Signed-off-by: Oren Laadan or...@cs.columbia.edu --- fs/read_write.c| 10 -- include/linux/fs.h | 10 ++ 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c

[RFC v16][PATCH 06/43] c/r: x86_32 support for checkpoint/restart

2009-05-27 Thread Oren Laadan
Signed-off-by: Oren Laadan or...@cs.columbia.edu --- arch/x86/include/asm/checkpoint_hdr.h | 110 + arch/x86/mm/Makefile |2 + arch/x86/mm/checkpoint.c | 431 + checkpoint/checkpoint.c |7 +- checkpoint

[RFC v16][PATCH 07/43] c/r: infrastructure for shared objects

2009-05-27 Thread Oren Laadan
grabbing a reference and object lifetime Changelog[v4]: - Fix calculation of hash table size Changelog[v3]: - Use standard hlist_... for hash table Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/Makefile |1 + checkpoint/objhash.c | 397

[RFC v16][PATCH 04/43] c/r: documentation

2009-05-27 Thread Oren Laadan
before they are referenced unless they are compound) Changelog[v8]: - Split into multiple files in Documentation/checkpoint/... - Extend documentation, fix typos and comments from feedback Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn se...@us.ibm.com Signed-off

[RFC v16][PATCH 05/43] c/r: basic infrastructure for checkpoint/restart

2009-05-27 Thread Oren Laadan
} to checkpoint header - Pad header structures to 64 bits to ensure compatibility Signed-off-by: Oren Laadan or...@cs.columbia.edu --- Makefile |2 +- checkpoint/Makefile |6 +- checkpoint/checkpoint.c | 261 +++ checkpoint

Re: [RFC v16][PATCH 19/43] c/r: external checkpoint of a task other than ourself

2009-05-27 Thread Oren Laadan
On Thu, 28 May 2009, Alexey Dobriyan wrote: On Wed, May 27, 2009 at 01:32:45PM -0400, Oren Laadan wrote: Now we can do external checkpoint, i.e. act on another task. +static int may_checkpoint_task(struct ckpt_ctx *ctx, struct task_struct *t) +{ + if (t->state == TASK_DEAD

[RFC v17][PATCH 09/60] Namespaces submenu

2009-07-22 Thread Oren Laadan
From: Dave Hansen d...@linux.vnet.ibm.com Let's not steal too much space in the 'General Setup' menu. Take a cue from the cgroups code and create a submenu. This can go upstream now. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com Acked-by: Oren Laadan or...@cs.columbia.edu --- init/Kconfig

[RFC v17][PATCH 01/60] c/r: extend arch_setup_additional_pages()

2009-07-22 Thread Oren Laadan
-by: Oren Laadan or...@cs.columbia.edu --- arch/powerpc/include/asm/elf.h |1 + arch/powerpc/kernel/vdso.c | 13 - arch/s390/include/asm/elf.h|2 +- arch/s390/kernel/vdso.c| 13 - arch/sh/include/asm/elf.h |1 + arch/sh/kernel

[RFC v17][PATCH 28/60] c/r: support for zombie processes

2009-07-22 Thread Oren Laadan
tasks are already zombified (as opposed to perhaps only becoming a zombie). Changelog[v17]: - Validate t->exit_signal for both threads and leader - Skip zombies in most of may_checkpoint_task() - Save/restore t->pdeath_signal - Validate ->exit_signal and ->pdeath_signal Signed-off-by: Oren Laadan

[RFC v17][PATCH 15/60] pids 5/7: Add target_pids parameter to copy_process()

2009-07-22 Thread Oren Laadan
From: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com The new parameter will be used in a follow-on patch when clone_with_pids() is implemented. Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com Acked-by: Serge Hallyn se...@us.ibm.com Reviewed-by: Oren Laadan or...@cs.columbia.edu

[RFC v17][PATCH 54/60] c/r: add CKPT_COPY() macro

2009-07-22 Thread Oren Laadan
() macro to help copying register arrays, etc . Move the macro definitions inside the CR #ifdef Feb 25: . Changed WARN_ON() to BUILD_BUG_ON() Signed-off-by: Dan Smith da...@us.ibm.com Signed-off-by: Oren Laadan or...@cs.columbia.edu 1: https://lists.linux

[RFC v17][PATCH 12/60] pids 2/7: Have alloc_pidmap() return actual error code

2009-07-22 Thread Oren Laadan
than have callers assume -ENOMEM, have alloc_pidmap() return the actual error. Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com Acked-by: Serge Hallyn se...@us.ibm.com Reviewed-by: Oren Laadan or...@cs.columbia.edu --- kernel/fork.c |5 +++-- kernel/pid.c |9 ++--- 2

[RFC v17][PATCH 35/60] c/r: add generic '->checkpoint' f_op to ext fses

2009-07-22 Thread Oren Laadan
From: Dave Hansen d...@linux.vnet.ibm.com This marks ext[234] as being checkpointable. There will be many more to do this to, but this is a start. Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com --- fs/ext2/dir.c |1 + fs/ext2/file.c |2 ++ fs/ext3/dir.c |1 + fs/ext3/file.c |

[RFC v17][PATCH 38/60] c/r: dump memory address space (private memory)

2009-07-22 Thread Oren Laadan
-by: Oren Laadan or...@cs.columbia.edu --- arch/x86/include/asm/checkpoint_hdr.h |8 + arch/x86/mm/checkpoint.c | 31 ++ checkpoint/Makefile |3 +- checkpoint/checkpoint.c |3 + checkpoint/memory.c | 688

[RFC v17][PATCH 46/60] c/r: support for UTS namespace

2009-07-22 Thread Oren Laadan
of namespace info - Support checkpoint and restart of nested UTS namespaces Signed-off-by: Dan Smith da...@us.ibm.com Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/Makefile |1 + checkpoint/checkpoint.c |5 +- checkpoint/namespace.c | 100

[RFC v17][PATCH 26/60] c/r: restart multiple processes

2009-07-22 Thread Oren Laadan
'struct ckpt_ctx' Changelog[v12]: - Replace obsolete ckpt_debug() with pr_debug() Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/restart.c | 461 -- checkpoint/sys.c | 33 ++- include/linux/checkpoint.h | 39

[RFC v17][PATCH 30/60] c/r: infrastructure for shared objects

2009-07-22 Thread Oren Laadan
]: - Use standard hlist_... for hash table Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/Makefile |1 + checkpoint/objhash.c | 419 ++ checkpoint/restart.c | 50 +- checkpoint/sys.c

[RFC v17][PATCH 50/60] c/r: support share-memory sysv-ipc

2009-07-22 Thread Oren Laadan
. Changelog[v17]: - Restore objects in the right namespace - Properly initialize ctx->deferqueue - Fix compilation with CONFIG_CHECKPOINT=n Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint/memory.c | 28 - checkpoint/sys.c | 13 ++ include/linux

[RFC v17][PATCH 56/60] c/r: clone_with_pids: define the s390 syscall

2009-07-22 Thread Oren Laadan
From: Serge E. Hallyn se...@us.ibm.com Hook up the clone_with_pids system call for s390x. clone_with_pids() takes an additional argument over clone(), which we pass in through register 7. Stub code for using the syscall looks like: struct target_pid_set { int num_pids; pid_t

[RFC v17][PATCH 58/60] c/r: checkpoint and restore task credentials

2009-07-22 Thread Oren Laadan
, resulting in a program owned by hallyn. Changelog: Jun 15: Fix user_ns handling when !CONFIG_USER_N Set creator_ref=0 for root_ns (discard @flags) Don't overwrite global user-ns if CONFIG_USER_NS Jun 10: Merge with ckpt-v16-dev (Oren Laadan

[RFC v17][PATCH 06/60] cgroup freezer: Update stale locking comments

2009-07-22 Thread Oren Laadan
From: Matt Helsley matth...@us.ibm.com Update stale comments regarding locking order and add a little more detail so it's easier to follow the locking between the cgroup freezer and the power management freezer code. Signed-off-by: Matt Helsley matth...@us.ibm.com Cc: Oren Laadan

[RFC v17][PATCH 23/60] c/r: export functionality used in next patch for restart-blocks

2009-07-22 Thread Oren Laadan
. Signed-off-by: Oren Laadan or...@cs.columbia.edu Acked-by: Serge Hallyn se...@us.ibm.com --- fs/select.c |2 +- include/linux/futex.h| 11 +++ include/linux/poll.h |3 +++ include/linux/posix-timers.h |6 ++ kernel/compat.c

[RFC v17][PATCH 16/60] pids 6/7: Define do_fork_with_pids()

2009-07-22 Thread Oren Laadan
/fork.c) Signed-off-by: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com Acked-by: Serge Hallyn se...@us.ibm.com Reviewed-by: Oren Laadan or...@cs.columbia.edu --- include/linux/sched.h |3 +++ include/linux/types.h |5 + kernel/fork.c | 16 ++-- 3 files changed, 22

[RFC v17][PATCH 55/60] c/r: define s390-specific checkpoint-restart code

2009-07-22 Thread Oren Laadan
From: Dan Smith da...@us.ibm.com Implement the s390 arch-specific checkpoint/restart helpers. This is on top of Oren Laadan's c/r code. With these, I am able to checkpoint and restart simple programs as per Oren's patch intro. While on x86 I never had to freeze a single task to checkpoint it,

[RFC v17][PATCH 48/60] c/r (ipc): allow allocation of a desired ipc identifier

2009-07-22 Thread Oren Laadan
During restart, we need to allocate ipc objects with the same identifiers as recorded during checkpoint. Modify the allocation code to allow an in-kernel caller to request a specific ipc identifier. The system call interface remains unchanged. Signed-off-by: Oren Laadan or...@cs.columbia.edu
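The idea in this snippet, an allocator that normally picks any free identifier but lets a trusted caller ask for a specific one, can be sketched in plain userspace C. This is only an illustration under stated assumptions: the names demo_ids, demo_alloc_id, DEMO_MAX_IDS and DEMO_ANY_ID are hypothetical and do not come from the patch, and the kernel's ipc code uses its own data structures rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_IDS 8     /* hypothetical table size, for illustration only */
#define DEMO_ANY_ID  (-1)  /* hypothetical "no preference" request value */

/* Toy identifier table: in_use[i] is non-zero when id i is allocated. */
struct demo_ids {
	int in_use[DEMO_MAX_IDS];
};

/*
 * Allocate an identifier.
 * - req_id == DEMO_ANY_ID: return the lowest free id (the ordinary path).
 * - otherwise: return exactly req_id, or fail if it is out of range or
 *   already taken (the path a restart would use to recreate saved ids).
 * Returns the allocated id, or -1 on failure.
 */
static int demo_alloc_id(struct demo_ids *ids, int req_id)
{
	int i;

	assert(ids != NULL);

	if (req_id != DEMO_ANY_ID) {
		if (req_id < 0 || req_id >= DEMO_MAX_IDS || ids->in_use[req_id])
			return -1;
		ids->in_use[req_id] = 1;
		return req_id;
	}

	for (i = 0; i < DEMO_MAX_IDS; i++) {
		if (!ids->in_use[i]) {
			ids->in_use[i] = 1;
			return i;
		}
	}
	return -1;
}
```

In this sketch the ordinary callers keep passing DEMO_ANY_ID, so their behaviour does not change; only a caller that knows the recorded identifier passes it explicitly, which mirrors the snippet's point that the system call interface stays the same.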

[RFC v17][PATCH 43/60] splice: export pipe/file-to-pipe/file functionality

2009-07-22 Thread Oren Laadan
splice_pipe_to_pipe() directly. Signed-off-by: Oren Laadan or...@cs.columbia.edu --- fs/splice.c| 61 --- include/linux/splice.h |9 +++ 2 files changed, 50 insertions(+), 20 deletions(-) diff --git a/fs/splice.c b/fs/splice.c index 73766d2

[RFC v17][PATCH 31/60] c/r: detect resource leaks for whole-container checkpoint

2009-07-22 Thread Oren Laadan
, while the leak-detection pre-step took place. Changelog[v17]: - Leak detection is performed in two-steps - Detect reverse-leaks (objects disappearing unexpectedly) - Skip reverse-leak detection if ops->ref_users isn't defined Signed-off-by: Oren Laadan or...@cs.columbia.edu --- checkpoint

[RFC v17][PATCH 21/60] c/r: x86_32 support for checkpoint/restart

2009-07-22 Thread Oren Laadan
structure alignment Changelog[v2]: - Pad header structures to 64 bits to ensure compatibility - Follow Dave Hansen's refactoring of the original post Signed-off-by: Oren Laadan or...@cs.columbia.edu --- arch/x86/include/asm/Kbuild |1 + arch/x86/include/asm/checkpoint_hdr.h | 122

[RFC v17][PATCH 00/60] Kernel based checkpoint/restart

2009-07-22 Thread Oren Laadan
Application checkpoint/restart (c/r) is the ability to save the state of a running application so that it can later resume its execution from the time at which it was checkpointed, on the same or a different machine. This version introduces 'clone_with_pids()' syscall to preset pid(s) for a child

[RFC v17][PATCH 20/60] c/r: basic infrastructure for checkpoint/restart

2009-07-22 Thread Oren Laadan
it's not really needed) Changelog[v5]: - Rename headers files s/ckpt/checkpoint/ Changelog[v2]: - Added utsname->{release,version,machine} to checkpoint header - Pad header structures to 64 bits to ensure compatibility Signed-off-by: Oren Laadan or...@cs.columbia.edu --- Makefile

Re: [RFC v17][PATCH 52/60] c/r: support semaphore sysv-ipc

2009-07-22 Thread Oren Laadan
Cyrill Gorcunov wrote: [Oren Laadan - Wed, Jul 22, 2009 at 06:00:14AM -0400] ... | +static struct sem *restore_sem_array(struct ckpt_ctx *ctx, int nsems) | +{ | + struct sem *sma; | + int i, ret; | + | + sma = kmalloc(nsems * sizeof(*sma), GFP_KERNEL); Forgot
