Serge Hallyn reports:
  "another question: if i run 'restart < out' and sys_restart returns
  due to a -EPERM on some object, then restart.c returns 1.  but if i
  'restart --pids', then it reports the error and returns 0.  unless i
  add --copy-status to the flags.  that seems inconsistent?"

This happened with a subtree checkpoint taken in a child pidns, where
the root-task is not pid 1, so restart goes through
ckpt_coordinator_pidns().

Commit 2000bbb4b9... ("restart: fix race in ckpt_coordinator_pidns
and --no-wait") added a pipe for a coordinator in a new pidns to
report success/failure of the restart operation back to the parent
when the parent does not wish to wait.

IOW, when the parent does wait, the coordinator's exit value is
overloaded: it is used once to report success/failure and once
(optionally) to report the root-task's exit status.

This patch fixes that by extending the previous commit: the
coordinator in the new pidns now always reports the restart status
via the pipe, and its exit status is only used in the
--wait --copy-status case.

Signed-off-by: Oren Laadan <[email protected]>
---
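For illustration only, the separation described above can be sketched
outside of restart.c as follows. This is a minimal stand-alone example,
not the code in this patch: run_coordinator() and struct coord_result
are hypothetical names, and the forked child merely stands in for the
coordinator. The child sends the restart status over a pipe and then
exits with an unrelated value, so the parent can tell the two apart.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result holder; not part of restart.c. */
struct coord_result {
	int restart_status;	/* value received over the pipe */
	int exit_status;	/* value collected with waitpid() */
};

/*
 * Fork a stand-in "coordinator" that reports restart_ret through a pipe
 * and then exits with root_task_exit (as in the --copy-status case).
 * Returns 0 if both values were collected, -1 otherwise.
 */
static int run_coordinator(int restart_ret, int root_task_exit,
			   struct coord_result *res)
{
	int pipe_coord[2];
	int status;
	ssize_t n;
	pid_t pid;

	if (pipe(pipe_coord) < 0) {
		perror("pipe");
		return -1;
	}

	pid = fork();
	if (pid < 0) {
		perror("fork");
		close(pipe_coord[0]);
		close(pipe_coord[1]);
		return -1;
	}

	if (pid == 0) {
		/* child: report restart status via the pipe ... */
		close(pipe_coord[0]);
		if (write(pipe_coord[1], &restart_ret, sizeof(restart_ret)) < 0)
			_exit(1);
		close(pipe_coord[1]);
		/* ... and keep the exit value free for something else */
		_exit(root_task_exit);
	}

	/* parent: close the write end so read() sees EOF if the child dies */
	close(pipe_coord[1]);
	n = read(pipe_coord[0], &res->restart_status,
		 sizeof(res->restart_status));
	close(pipe_coord[0]);

	/* always reap the child, even if it never reported */
	if (waitpid(pid, &status, 0) < 0) {
		perror("waitpid");
		return -1;
	}
	if (n != (ssize_t)sizeof(res->restart_status))
		return -1;	/* child exited before reporting */

	res->exit_status = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
	return 0;
}
```

Calling run_coordinator(-1, 0, &res) leaves res.restart_status at -1
while res.exit_status is 0, which is the situation where looking at the
exit value alone would hide the failure.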
 restart.c |   25 ++++++++++++-------------
 1 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/restart.c b/restart.c
index 35c54ea..5871bbf 100644
--- a/restart.c
+++ b/restart.c
@@ -942,10 +942,12 @@ static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
        ckpt_dbg("forking coordinator in new pidns\n");
 
        /*
-        * We won't wait for (collect) the coordinator, so we use a
-        * pipe instead for the coordinator to report success/failure.
+        * The coordinator reports restart success/failure via a pipe.
+        * (It cannot use its return value, because in the default
+        * --wait --copy-status case that is already used to report
+        * the root-task's return value).
         */
-       if (!ctx->args->wait && pipe(ctx->pipe_coord)) {
+       if (pipe(ctx->pipe_coord) < 0) {
                perror("pipe");
                return -1;
        }
@@ -981,10 +983,7 @@ static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
                return -1;
 
        ctx->args->copy_status = copy;
-       if (ctx->args->wait)
-               return ckpt_collect_child(ctx);
-       else
-               return ckpt_coordinator_status(ctx);
+       return ckpt_coordinator_status(ctx);
 }
 #else
 static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
@@ -1040,13 +1039,13 @@ static int ckpt_coordinator(struct ckpt_ctx *ctx)
                 * around and be reaper until all tasks are gone.
                 * Otherwise, container will die as soon as we exit.
                 */
-               if (!ctx->args->wait) {
-                       /* report status because parent won't wait for us */
-                       if (write(ctx->pipe_coord[1], &ret, sizeof(ret)) < 0) {
-                               perror("failed to report status");
-                               exit(1);
-                       }
+
+               /* Report success/failure to the parent */
+               if (write(ctx->pipe_coord[1], &ret, sizeof(ret)) < 0) {
+                       perror("failed to report status");
+                       exit(1);
                }
+
                ret = ckpt_pretend_reaper(ctx);
        } else if (ctx->args->wait) {
                ret = ckpt_collect_child(ctx);
-- 
1.6.0.4
