Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-27 Thread Oleg Nesterov
On 10/26, Josh Poimboeuf wrote:
>
> On Wed, Oct 26, 2016 at 04:14:00PM +0200, Oleg Nesterov wrote:
> > +/*
> > + * TODO: kill it and use to_kthread(). But we still need the users
> > + * like kthread_stop() which has to sync with the exiting kthread.
> > + */
> >  static struct kthread *to_live_kthread(struct task_struct *k)
>
> Now that the kthread struct is no longer on the stack, are the
> try_get_task_stack() and its corresponding put_task_stack()'s still
> needed?

It seems you missed this part

Of course, with this patch we are ready to remove
put_task_stack() from kthread.c right now. The next change should kill
to_live_kthread() altogether.

in the same email ;)

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-27 Thread Oleg Nesterov
On 10/26, Josh Poimboeuf wrote:
>
> On Wed, Oct 26, 2016 at 04:14:00PM +0200, Oleg Nesterov wrote:
> > +/*
> > + * TODO: kill it and use to_kthread(). But we still need the users
> > + * like kthread_stop() which has to sync with the exiting kthread.
> > + */
> >  static struct kthread *to_live_kthread(struct task_struct *k)
>
> Now that the kthread struct is no longer on the stack, are the
> try_get_task_stack() and its corresponding put_task_stack()'s still
> needed?

It seems you missed this part

Of course, with this patch we are ready to remove
put_task_stack() from kthread.c right now. The next change should kill
to_live_kthread() altogether.

in the same email ;)

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Josh Poimboeuf
On Wed, Oct 26, 2016 at 04:14:00PM +0200, Oleg Nesterov wrote:
> +/*
> + * TODO: kill it and use to_kthread(). But we still need the users
> + * like kthread_stop() which has to sync with the exiting kthread.
> + */
>  static struct kthread *to_live_kthread(struct task_struct *k)

Now that the kthread struct is no longer on the stack, are the
try_get_task_stack() and its corresponding put_task_stack()'s still
needed?

-- 
Josh


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Josh Poimboeuf
On Wed, Oct 26, 2016 at 04:14:00PM +0200, Oleg Nesterov wrote:
> +/*
> + * TODO: kill it and use to_kthread(). But we still need the users
> + * like kthread_stop() which has to sync with the exiting kthread.
> + */
>  static struct kthread *to_live_kthread(struct task_struct *k)

Now that the kthread struct is no longer on the stack, are the
try_get_task_stack() and its corresponding put_task_stack()'s still
needed?

-- 
Josh


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Thomas Gleixner
On Wed, 26 Oct 2016, Oleg Nesterov wrote:
> On 10/26, Thomas Gleixner wrote:
> >
> > On Wed, 26 Oct 2016, Oleg Nesterov wrote:
> > > +static inline void set_kthread_struct(void *kthread)
> > > +{
> > > + /*
> > > +  * We abuse ->set_child_tid to avoid the new member and because it
> > > +  * can't be wrongly copied by copy_process(). We also rely on fact
> > > +  * that the caller can't exec, so PF_KTHREAD can't be cleared.
> > > +  */
> > > + current->set_child_tid = (__force void __user *)kthread;
> >
> > Can we pretty please avoid this type casting? We only have 5 places using
> > set_child_tid. So we can really make it a proper union
> 
> Yes, I thought about anonymous union too, the only problem is that
> it will need more comments ;)

Be careful with anonymous unions. There are a few pitfalls with older
compilers. That's why I said make it a proper union and fixup the 5 usage
sites. It might work in this case, but you've been warned :)

Thanks,

tglx

 


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Thomas Gleixner
On Wed, 26 Oct 2016, Oleg Nesterov wrote:
> On 10/26, Thomas Gleixner wrote:
> >
> > On Wed, 26 Oct 2016, Oleg Nesterov wrote:
> > > +static inline void set_kthread_struct(void *kthread)
> > > +{
> > > + /*
> > > +  * We abuse ->set_child_tid to avoid the new member and because it
> > > +  * can't be wrongly copied by copy_process(). We also rely on fact
> > > +  * that the caller can't exec, so PF_KTHREAD can't be cleared.
> > > +  */
> > > + current->set_child_tid = (__force void __user *)kthread;
> >
> > Can we pretty please avoid this type casting? We only have 5 places using
> > set_child_tid. So we can really make it a proper union
> 
> Yes, I thought about anonymous union too, the only problem is that
> it will need more comments ;)

Be careful with anonymous unions. There are a few pitfalls with older
compilers. That's why I said make it a proper union and fixup the 5 usage
sites. It might work in this case, but you've been warned :)

Thanks,

tglx

 


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Oleg Nesterov
Damn. sorry for noise, this doesn't really matter, but ...

On 10/26, Oleg Nesterov wrote:
>
> Yes, it needs cleanups and
> only because of kthread on stack.

What I actually tried to say is "NOT only because of kthread on stack".

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Oleg Nesterov
Damn. sorry for noise, this doesn't really matter, but ...

On 10/26, Oleg Nesterov wrote:
>
> Yes, it needs cleanups and
> only because of kthread on stack.

What I actually tried to say is "NOT only because of kthread on stack".

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Oleg Nesterov
On 10/26, Thomas Gleixner wrote:
>
> On Wed, 26 Oct 2016, Oleg Nesterov wrote:
> > +static inline void set_kthread_struct(void *kthread)
> > +{
> > +   /*
> > +* We abuse ->set_child_tid to avoid the new member and because it
> > +* can't be wrongly copied by copy_process(). We also rely on fact
> > +* that the caller can't exec, so PF_KTHREAD can't be cleared.
> > +*/
> > +   current->set_child_tid = (__force void __user *)kthread;
>
> Can we pretty please avoid this type casting? We only have 5 places using
> set_child_tid. So we can really make it a proper union

Yes, I thought about anonymous union too, the only problem is that
it will need more comments ;)

And I agree with other nits, will redo/resend tomorrow.

Thanks!

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Oleg Nesterov
On 10/26, Thomas Gleixner wrote:
>
> On Wed, 26 Oct 2016, Oleg Nesterov wrote:
> > +static inline void set_kthread_struct(void *kthread)
> > +{
> > +   /*
> > +* We abuse ->set_child_tid to avoid the new member and because it
> > +* can't be wrongly copied by copy_process(). We also rely on fact
> > +* that the caller can't exec, so PF_KTHREAD can't be cleared.
> > +*/
> > +   current->set_child_tid = (__force void __user *)kthread;
>
> Can we pretty please avoid this type casting? We only have 5 places using
> set_child_tid. So we can really make it a proper union

Yes, I thought about anonymous union too, the only problem is that
it will need more comments ;)

And I agree with other nits, will redo/resend tomorrow.

Thanks!

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Thomas Gleixner
On Wed, 26 Oct 2016, Oleg Nesterov wrote:
> Some notes right now. Of course, with this patch we are ready to remove
> put_task_stack() from kthread.c right now. The next change should kill
> to_live_kthread() altogether. And stop using ->vfork_done.
> 
> And. With this patch we do not need another "workqueue: ignore dead tasks
> in a workqueue sleep hook" fix from Roman.

Nice !
 
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index be2cc1f..c6adbde 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -53,14 +53,34 @@ enum KTHREAD_BITS {
>   KTHREAD_IS_PARKED,
>  };
>  
> -#define __to_kthread(vfork)  \
> - container_of(vfork, struct kthread, exited)
> +static inline void set_kthread_struct(void *kthread)
> +{
> + /*
> +  * We abuse ->set_child_tid to avoid the new member and because it
> +  * can't be wrongly copied by copy_process(). We also rely on fact
> +  * that the caller can't exec, so PF_KTHREAD can't be cleared.
> +  */
> + current->set_child_tid = (__force void __user *)kthread;

Can we pretty please avoid this type casting? We only have 5 places using
set_child_tid. So we can really make it a proper union and fix up the 5
usage sites as a preparatory patch for this.

> +}
>  
>  static inline struct kthread *to_kthread(struct task_struct *k)
>  {
> - return __to_kthread(k->vfork_done);
> + WARN_ON(!(k->flags & PF_KTHREAD));
> + return (__force void *)k->set_child_tid;
>  }
>  
> +void free_kthread_struct(struct task_struct *k)
> +{
> + kfree(to_kthread(k)); /* can be NULL if kmalloc() failed */

Can you please not use tail comments? They really stop the reading flow.

> +#define __to_kthread(vfork)  \
> + container_of(vfork, struct kthread, exited)
> +
> +/*
> + * TODO: kill it and use to_kthread(). But we still need the users
> + * like kthread_stop() which has to sync with the exiting kthread.
> + */
>  static struct kthread *to_live_kthread(struct task_struct *k)
>  {
>   struct completion *vfork = ACCESS_ONCE(k->vfork_done);
> @@ -181,14 +201,11 @@ static int kthread(void *_create)
>   int (*threadfn)(void *data) = create->threadfn;
>   void *data = create->data;
>   struct completion *done;
> - struct kthread self;
> + struct kthread *self;
>   int ret;
>  
> - self.flags = 0;
> - self.data = data;
> - init_completion();
> - init_completion();
> - current->vfork_done = 
> + self = kmalloc(sizeof(*self), GFP_KERNEL);
> + set_kthread_struct(self);
>  
>   /* If user was SIGKILLed, I release the structure. */
>   done = xchg(>done, NULL);
> @@ -196,6 +213,19 @@ static int kthread(void *_create)
>   kfree(create);
>   do_exit(-EINTR);
>   }
> +
> + if (!self) {
> + create->result = ERR_PTR(-ENOMEM);
> + complete(done);
> + do_exit(-ENOMEM);
> + }
> +
> + self->flags = 0;
> + self->data = data;
> + init_completion(>exited);
> + init_completion(>parked);
> + current->vfork_done = >exited;
> +
>   /* OK, tell user we're spawned, wait for stop or wakeup */
>   __set_current_state(TASK_UNINTERRUPTIBLE);
>   create->result = current;
> @@ -203,12 +233,10 @@ static int kthread(void *_create)
>   schedule();
>  
>   ret = -EINTR;
> -
> - if (!test_bit(KTHREAD_SHOULD_STOP, )) {
> - __kthread_parkme();
> + if (!test_bit(KTHREAD_SHOULD_STOP, >flags)) {
> + __kthread_parkme(self);
>   ret = threadfn(data);
>   }
> - /* we can't just return, we must preserve "self" on stack */
>   do_exit(ret);
>  }

Other than the above nits, this is the right direction to go.

Thanks,

tglx


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Thomas Gleixner
On Wed, 26 Oct 2016, Oleg Nesterov wrote:
> Some notes right now. Of course, with this patch we are ready to remove
> put_task_stack() from kthread.c right now. The next change should kill
> to_live_kthread() altogether. And stop using ->vfork_done.
> 
> And. With this patch we do not need another "workqueue: ignore dead tasks
> in a workqueue sleep hook" fix from Roman.

Nice !
 
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index be2cc1f..c6adbde 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -53,14 +53,34 @@ enum KTHREAD_BITS {
>   KTHREAD_IS_PARKED,
>  };
>  
> -#define __to_kthread(vfork)  \
> - container_of(vfork, struct kthread, exited)
> +static inline void set_kthread_struct(void *kthread)
> +{
> + /*
> +  * We abuse ->set_child_tid to avoid the new member and because it
> +  * can't be wrongly copied by copy_process(). We also rely on fact
> +  * that the caller can't exec, so PF_KTHREAD can't be cleared.
> +  */
> + current->set_child_tid = (__force void __user *)kthread;

Can we pretty please avoid this type casting? We only have 5 places using
set_child_tid. So we can really make it a proper union and fix up the 5
usage sites as a preparatory patch for this.

> +}
>  
>  static inline struct kthread *to_kthread(struct task_struct *k)
>  {
> - return __to_kthread(k->vfork_done);
> + WARN_ON(!(k->flags & PF_KTHREAD));
> + return (__force void *)k->set_child_tid;
>  }
>  
> +void free_kthread_struct(struct task_struct *k)
> +{
> + kfree(to_kthread(k)); /* can be NULL if kmalloc() failed */

Can you please not use tail comments? They really stop the reading flow.

> +#define __to_kthread(vfork)  \
> + container_of(vfork, struct kthread, exited)
> +
> +/*
> + * TODO: kill it and use to_kthread(). But we still need the users
> + * like kthread_stop() which has to sync with the exiting kthread.
> + */
>  static struct kthread *to_live_kthread(struct task_struct *k)
>  {
>   struct completion *vfork = ACCESS_ONCE(k->vfork_done);
> @@ -181,14 +201,11 @@ static int kthread(void *_create)
>   int (*threadfn)(void *data) = create->threadfn;
>   void *data = create->data;
>   struct completion *done;
> - struct kthread self;
> + struct kthread *self;
>   int ret;
>  
> - self.flags = 0;
> - self.data = data;
> - init_completion();
> - init_completion();
> - current->vfork_done = 
> + self = kmalloc(sizeof(*self), GFP_KERNEL);
> + set_kthread_struct(self);
>  
>   /* If user was SIGKILLed, I release the structure. */
>   done = xchg(>done, NULL);
> @@ -196,6 +213,19 @@ static int kthread(void *_create)
>   kfree(create);
>   do_exit(-EINTR);
>   }
> +
> + if (!self) {
> + create->result = ERR_PTR(-ENOMEM);
> + complete(done);
> + do_exit(-ENOMEM);
> + }
> +
> + self->flags = 0;
> + self->data = data;
> + init_completion(>exited);
> + init_completion(>parked);
> + current->vfork_done = >exited;
> +
>   /* OK, tell user we're spawned, wait for stop or wakeup */
>   __set_current_state(TASK_UNINTERRUPTIBLE);
>   create->result = current;
> @@ -203,12 +233,10 @@ static int kthread(void *_create)
>   schedule();
>  
>   ret = -EINTR;
> -
> - if (!test_bit(KTHREAD_SHOULD_STOP, )) {
> - __kthread_parkme();
> + if (!test_bit(KTHREAD_SHOULD_STOP, >flags)) {
> + __kthread_parkme(self);
>   ret = threadfn(data);
>   }
> - /* we can't just return, we must preserve "self" on stack */
>   do_exit(ret);
>  }

Other than the above nits, this is the right direction to go.

Thanks,

tglx


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Oleg Nesterov
On 10/25, Andy Lutomirski wrote:
>
> Would it perhaps make sense to do something like Roman's patch for 4.9
> and then consider further changes down the road?

OK, lets make it kmalloc'ed. Please see my old patch below, slightly changed.

I'll send it after I do some testing and write the changelog. No separate
refcounting, no complications.

Some notes right now. Of course, with this patch we are ready to remove
put_task_stack() from kthread.c right now. The next change should kill
to_live_kthread() altogether. And stop using ->vfork_done.

And. With this patch we do not need another "workqueue: ignore dead tasks
in a workqueue sleep hook" fix from Roman.

> Roman's patch
> appears to fix a real bug,

Well, it fixes the additional problems if we already have a bug, but I
agree this is a problem anyway.

> and I think that, while not really ideal,
> the code is an incredible mess right now and Roman's patch (assuming
> it's correct) makes it considerably nicer.

This is where I disagree with you and Roman. Yes, it needs cleanups and
only because of kthread on stack. But _IMO_ Roman's patch makes it much,
much worse and adds a lot of unnecessary complications.

Could you please look at the patch below?

Oleg.
---

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index a6e82a6..c1c3e63 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -48,6 +48,7 @@ struct task_struct *kthread_create_on_cpu(int 
(*threadfn)(void *data),
__k;   \
 })
 
+void free_kthread_struct(struct task_struct *k);
 void kthread_bind(struct task_struct *k, unsigned int cpu);
 void kthread_bind_mask(struct task_struct *k, const struct cpumask *mask);
 int kthread_stop(struct task_struct *k);
diff --git a/kernel/fork.c b/kernel/fork.c
index 623259f..663c6a7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -351,6 +351,8 @@ void free_task(struct task_struct *tsk)
ftrace_graph_exit_task(tsk);
put_seccomp_filter(tsk);
arch_release_task_struct(tsk);
+   if (tsk->flags & PF_KTHREAD)
+   free_kthread_struct(tsk);
free_task_struct(tsk);
 }
 EXPORT_SYMBOL(free_task);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index be2cc1f..c6adbde 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -53,14 +53,34 @@ enum KTHREAD_BITS {
KTHREAD_IS_PARKED,
 };
 
-#define __to_kthread(vfork)\
-   container_of(vfork, struct kthread, exited)
+static inline void set_kthread_struct(void *kthread)
+{
+   /*
+* We abuse ->set_child_tid to avoid the new member and because it
+* can't be wrongly copied by copy_process(). We also rely on fact
+* that the caller can't exec, so PF_KTHREAD can't be cleared.
+*/
+   current->set_child_tid = (__force void __user *)kthread;
+}
 
 static inline struct kthread *to_kthread(struct task_struct *k)
 {
-   return __to_kthread(k->vfork_done);
+   WARN_ON(!(k->flags & PF_KTHREAD));
+   return (__force void *)k->set_child_tid;
 }
 
+void free_kthread_struct(struct task_struct *k)
+{
+   kfree(to_kthread(k)); /* can be NULL if kmalloc() failed */
+}
+
+#define __to_kthread(vfork)\
+   container_of(vfork, struct kthread, exited)
+
+/*
+ * TODO: kill it and use to_kthread(). But we still need the users
+ * like kthread_stop() which has to sync with the exiting kthread.
+ */
 static struct kthread *to_live_kthread(struct task_struct *k)
 {
struct completion *vfork = ACCESS_ONCE(k->vfork_done);
@@ -181,14 +201,11 @@ static int kthread(void *_create)
int (*threadfn)(void *data) = create->threadfn;
void *data = create->data;
struct completion *done;
-   struct kthread self;
+   struct kthread *self;
int ret;
 
-   self.flags = 0;
-   self.data = data;
-   init_completion();
-   init_completion();
-   current->vfork_done = 
+   self = kmalloc(sizeof(*self), GFP_KERNEL);
+   set_kthread_struct(self);
 
/* If user was SIGKILLed, I release the structure. */
done = xchg(>done, NULL);
@@ -196,6 +213,19 @@ static int kthread(void *_create)
kfree(create);
do_exit(-EINTR);
}
+
+   if (!self) {
+   create->result = ERR_PTR(-ENOMEM);
+   complete(done);
+   do_exit(-ENOMEM);
+   }
+
+   self->flags = 0;
+   self->data = data;
+   init_completion(>exited);
+   init_completion(>parked);
+   current->vfork_done = >exited;
+
/* OK, tell user we're spawned, wait for stop or wakeup */
__set_current_state(TASK_UNINTERRUPTIBLE);
create->result = current;
@@ -203,12 +233,10 @@ static int kthread(void *_create)
schedule();
 
ret = -EINTR;
-
-   if (!test_bit(KTHREAD_SHOULD_STOP, )) {
-   __kthread_parkme();
+   if (!test_bit(KTHREAD_SHOULD_STOP, >flags)) {
+   

Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-26 Thread Oleg Nesterov
On 10/25, Andy Lutomirski wrote:
>
> Would it perhaps make sense to do something like Roman's patch for 4.9
> and then consider further changes down the road?

OK, lets make it kmalloc'ed. Please see my old patch below, slightly changed.

I'll send it after I do some testing and write the changelog. No separate
refcounting, no complications.

Some notes right now. Of course, with this patch we are ready to remove
put_task_stack() from kthread.c right now. The next change should kill
to_live_kthread() altogether. And stop using ->vfork_done.

And. With this patch we do not need another "workqueue: ignore dead tasks
in a workqueue sleep hook" fix from Roman.

> Roman's patch
> appears to fix a real bug,

Well, it fixes the additional problems if we already have a bug, but I
agree this is a problem anyway.

> and I think that, while not really ideal,
> the code is an incredible mess right now and Roman's patch (assuming
> it's correct) makes it considerably nicer.

This is where I disagree with you and Roman. Yes, it needs cleanups and
only because of kthread on stack. But _IMO_ Roman's patch makes it much,
much worse and adds a lot of unnecessary complications.

Could you please look at the patch below?

Oleg.
---

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index a6e82a6..c1c3e63 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -48,6 +48,7 @@ struct task_struct *kthread_create_on_cpu(int 
(*threadfn)(void *data),
__k;   \
 })
 
+void free_kthread_struct(struct task_struct *k);
 void kthread_bind(struct task_struct *k, unsigned int cpu);
 void kthread_bind_mask(struct task_struct *k, const struct cpumask *mask);
 int kthread_stop(struct task_struct *k);
diff --git a/kernel/fork.c b/kernel/fork.c
index 623259f..663c6a7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -351,6 +351,8 @@ void free_task(struct task_struct *tsk)
ftrace_graph_exit_task(tsk);
put_seccomp_filter(tsk);
arch_release_task_struct(tsk);
+   if (tsk->flags & PF_KTHREAD)
+   free_kthread_struct(tsk);
free_task_struct(tsk);
 }
 EXPORT_SYMBOL(free_task);
diff --git a/kernel/kthread.c b/kernel/kthread.c
index be2cc1f..c6adbde 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -53,14 +53,34 @@ enum KTHREAD_BITS {
KTHREAD_IS_PARKED,
 };
 
-#define __to_kthread(vfork)\
-   container_of(vfork, struct kthread, exited)
+static inline void set_kthread_struct(void *kthread)
+{
+   /*
+* We abuse ->set_child_tid to avoid the new member and because it
+* can't be wrongly copied by copy_process(). We also rely on fact
+* that the caller can't exec, so PF_KTHREAD can't be cleared.
+*/
+   current->set_child_tid = (__force void __user *)kthread;
+}
 
 static inline struct kthread *to_kthread(struct task_struct *k)
 {
-   return __to_kthread(k->vfork_done);
+   WARN_ON(!(k->flags & PF_KTHREAD));
+   return (__force void *)k->set_child_tid;
 }
 
+void free_kthread_struct(struct task_struct *k)
+{
+   kfree(to_kthread(k)); /* can be NULL if kmalloc() failed */
+}
+
+#define __to_kthread(vfork)\
+   container_of(vfork, struct kthread, exited)
+
+/*
+ * TODO: kill it and use to_kthread(). But we still need the users
+ * like kthread_stop() which has to sync with the exiting kthread.
+ */
 static struct kthread *to_live_kthread(struct task_struct *k)
 {
struct completion *vfork = ACCESS_ONCE(k->vfork_done);
@@ -181,14 +201,11 @@ static int kthread(void *_create)
int (*threadfn)(void *data) = create->threadfn;
void *data = create->data;
struct completion *done;
-   struct kthread self;
+   struct kthread *self;
int ret;
 
-   self.flags = 0;
-   self.data = data;
-   init_completion();
-   init_completion();
-   current->vfork_done = 
+   self = kmalloc(sizeof(*self), GFP_KERNEL);
+   set_kthread_struct(self);
 
/* If user was SIGKILLed, I release the structure. */
done = xchg(>done, NULL);
@@ -196,6 +213,19 @@ static int kthread(void *_create)
kfree(create);
do_exit(-EINTR);
}
+
+   if (!self) {
+   create->result = ERR_PTR(-ENOMEM);
+   complete(done);
+   do_exit(-ENOMEM);
+   }
+
+   self->flags = 0;
+   self->data = data;
+   init_completion(>exited);
+   init_completion(>parked);
+   current->vfork_done = >exited;
+
/* OK, tell user we're spawned, wait for stop or wakeup */
__set_current_state(TASK_UNINTERRUPTIBLE);
create->result = current;
@@ -203,12 +233,10 @@ static int kthread(void *_create)
schedule();
 
ret = -EINTR;
-
-   if (!test_bit(KTHREAD_SHOULD_STOP, )) {
-   __kthread_parkme();
+   if (!test_bit(KTHREAD_SHOULD_STOP, >flags)) {
+   

Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Andy Lutomirski
On Tue, Oct 25, 2016 at 8:43 AM, Oleg Nesterov  wrote:
> On 10/25, Oleg Nesterov wrote:
>>
>> On 10/25, Roman Pen wrote:
>> >
>> > This patch avoids allocation of kthread structure on a stack, and simply
>> > uses kmalloc.
>>
>> Oh. I didn't even read this patch, but I have to admit I personally do not
>> like it. I can be wrong, but imo this is the step to the wrong direction.
>
> And after I tried to actually read it I dislike it even more, sorry Roman.
> Starting from the fact it moves kthread_create_info into struct kthread.

Would it perhaps make sense to do something like Roman's patch for 4.9
and then consider further changes down the road?  Roman's patch
appears to fix a real bug, and I think that, while not really ideal,
the code is an incredible mess right now and Roman's patch (assuming
it's correct) makes it considerably nicer.


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Andy Lutomirski
On Tue, Oct 25, 2016 at 8:43 AM, Oleg Nesterov  wrote:
> On 10/25, Oleg Nesterov wrote:
>>
>> On 10/25, Roman Pen wrote:
>> >
>> > This patch avoids allocation of kthread structure on a stack, and simply
>> > uses kmalloc.
>>
>> Oh. I didn't even read this patch, but I have to admit I personally do not
>> like it. I can be wrong, but imo this is the step to the wrong direction.
>
> And after I tried to actually read it I dislike it even more, sorry Roman.
> Starting from the fact it moves kthread_create_info into struct kthread.

Would it perhaps make sense to do something like Roman's patch for 4.9
and then consider further changes down the road?  Roman's patch
appears to fix a real bug, and I think that, while not really ideal,
the code is an incredible mess right now and Roman's patch (assuming
it's correct) makes it considerably nicer.


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Oleg Nesterov
Roman, I need to run away, just one note.

On 10/25, Roman Penyaev wrote:
>
> On Tue, Oct 25, 2016 at 5:43 PM, Oleg Nesterov  wrote:
>
> So in particular I do not understand this comment from the patch above
> where you abuse 'current->set_child_tid':
>
>  * This is the ugly but simple hack we will hopefully remove soon.
>
> how you are going to avoid this abuse of set_child_tid?

please ignore this comment. It actually reflects my desire to kill
struct kthread. If we won't do this, we can (ab)use this or another member
in task_struct.

> or vfork_done?
> because vfork_done is not only for waking up (yes, I totally agree, we
> can reuse task_work), it is also for getting a private data (like
> workqueue uses it):  task_struct->vfork_done->kthread->data.

kthreads simply should not use ->vfork_done at all. to_kthread() can
use the same pointer.

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Oleg Nesterov
Roman, I need to run away, just one note.

On 10/25, Roman Penyaev wrote:
>
> On Tue, Oct 25, 2016 at 5:43 PM, Oleg Nesterov  wrote:
>
> So in particular I do not understand this comment from the patch above
> where you abuse 'current->set_child_tid':
>
>  * This is the ugly but simple hack we will hopefully remove soon.
>
> how you are going to avoid this abuse of set_child_tid?

please ignore this comment. It actually reflects my desire to kill
struct kthread. If we won't do this, we can (ab)use this or another member
in task_struct.

> or vfork_done?
> because vfork_done is not only for waking up (yes, I totally agree, we
> can reuse task_work), it is also for getting a private data (like
> workqueue uses it):  task_struct->vfork_done->kthread->data.

kthreads simply should not use ->vfork_done at all. to_kthread() can
use the same pointer.

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Roman Penyaev
On Tue, Oct 25, 2016 at 5:43 PM, Oleg Nesterov  wrote:
> On 10/25, Oleg Nesterov wrote:
>>
>> On 10/25, Roman Pen wrote:
>> >
>> > This patch avoids allocation of kthread structure on a stack, and simply
>> > uses kmalloc.
>>
>> Oh. I didn't even read this patch, but I have to admit I personally do not
>> like it. I can be wrong, but imo this is the step to the wrong direction.
>
> And after I tried to actually read it I dislike it even more, sorry Roman.
> Starting from the fact it moves kthread_create_info into struct kthread.

that can be changed, of course, as I told, I wanted to keep allocations/
deallocations simpler.

>> struct kthread is already bloated, we should not bloat it more. Instead
>> we should kill it. And to_kthread() too, at least in its current form.
>
> Yes, but even if we can't or do not want to do this, even if we want to
> kmalloc struct kthread, I really think it should not be refcounted
> separately from task_struct.

it is already like that, we have to get/put references on a task stack.

>
> something like the patch in http://marc.info/?l=linux-kernel=146715459127804

the key function in that patch is:

free_kthread_struct(tsk);

so if we teach the generic free_task() to deal with kthreads, that of course
solves these kind of problems.  I did not consider that variant.

>
> Either way to_live_kthread() must go away. Currently we can't avoid it
> because we abuse vfork_done, but as I already said we no longer need this.

There is something which I do not understand.  You still need to have a
connection (a pointer) between task_struct and private data (kthread AND
private data, whatever), which is passed by the user of kthread API.
You still need to find a victim in a task_struct and abuse it :)

So in particular I do not understand this comment from the patch above
where you abuse 'current->set_child_tid':

 * This is the ugly but simple hack we will hopefully remove soon.

how you are going to avoid this abuse of set_child_tid? or vfork_done?
because vfork_done is not only for waking up (yes, I totally agree, we
can reuse task_work), it is also for getting a private data (like
workqueue uses it):  task_struct->vfork_done->kthread->data.

--
Roman


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Roman Penyaev
On Tue, Oct 25, 2016 at 5:43 PM, Oleg Nesterov  wrote:
> On 10/25, Oleg Nesterov wrote:
>>
>> On 10/25, Roman Pen wrote:
>> >
>> > This patch avoids allocation of kthread structure on a stack, and simply
>> > uses kmalloc.
>>
>> Oh. I didn't even read this patch, but I have to admit I personally do not
>> like it. I can be wrong, but imo this is the step to the wrong direction.
>
> And after I tried to actually read it I dislike it even more, sorry Roman.
> Starting from the fact it moves kthread_create_info into struct kthread.

that can be changed, of course, as I told, I wanted to keep allocations/
deallocations simpler.

>> struct kthread is already bloated, we should not bloat it more. Instead
>> we should kill it. And to_kthread() too, at least in its current form.
>
> Yes, but even if we can't or do not want to do this, even if we want to
> kmalloc struct kthread, I really think it should not be refcounted
> separately from task_struct.

it is already like that, we have to get/put references on a task stack.

>
> something like the patch in http://marc.info/?l=linux-kernel=146715459127804

the key function in that patch is:

free_kthread_struct(tsk);

so if we teach the generic free_task() to deal with kthreads, that of course
solves these kind of problems.  I did not consider that variant.

>
> Either way to_live_kthread() must go away. Currently we can't avoid it
> because we abuse vfork_done, but as I already said we no longer need this.

There is something which I do not understand.  You still need to have a
connection (a pointer) between task_struct and private data (kthread AND
private data, whatever), which is passed by the user of kthread API.
You still need to find a victim in a task_struct and abuse it :)

So in particular I do not understand this comment from the patch above
where you abuse 'current->set_child_tid':

 * This is the ugly but simple hack we will hopefully remove soon.

how you are going to avoid this abuse of set_child_tid? or vfork_done?
because vfork_done is not only for waking up (yes, I totally agree, we
can reuse task_work), it is also for getting a private data (like
workqueue uses it):  task_struct->vfork_done->kthread->data.

--
Roman


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Roman Penyaev
On Tue, Oct 25, 2016 at 4:03 PM, Oleg Nesterov  wrote:
> On 10/25, Roman Pen wrote:
>>
>> This patch avoids allocation of kthread structure on a stack, and simply
>> uses kmalloc.
>
> Oh. I didn't even read this patch, but I have to admit I personally do not
> like it. I can be wrong, but imo this is the step to the wrong direction.

it targets only one thing - we can't use stack anymore for keeping kthread
structure on it.  so basically it is not even a step, just an attempt to
fix memory corruption after oops.

> struct kthread is already bloated, we should not bloat it more.

that is clear, but I wanted to keep code simpler with one allocation and
one structure for all the needs.

> Instead we should kill it. And to_kthread() too, at least in its current
> form.

that is nice and this is a step forward, but not with this patch.

>
> For example. parked/exited/cpu should go into smp_hotplug_thread. Yes,
> this needs cleanups.
>
> All we need is kthread_data() which returns the pointer to the private
> data used by kthread.

that means that we still need to store the private data inside task_struct
and probably again abuse some member.

>
> As for kthread_stop(), we no longer need to abuse ->vfork_done, we can
> use task_works:

yes, that can be done easily. in the current patch I already use the
task_work to free the kthread structure. but still for me is not clear
where to keep the private data if you have only task_struct in your
hands.

--
Roman


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Roman Penyaev
On Tue, Oct 25, 2016 at 4:03 PM, Oleg Nesterov  wrote:
> On 10/25, Roman Pen wrote:
>>
>> This patch avoids allocation of kthread structure on a stack, and simply
>> uses kmalloc.
>
> Oh. I didn't even read this patch, but I have to admit I personally do not
> like it. I can be wrong, but imo this is the step to the wrong direction.

it targets only one thing - we can't use stack anymore for keeping kthread
structure on it.  so basically it is not even a step, just an attempt to
fix memory corruption after oops.

> struct kthread is already bloated, we should not bloat it more.

that is clear, but I wanted to keep code simpler with one allocation and
one structure for all the needs.

> Instead we should kill it. And to_kthread() too, at least in its current
> form.

that is nice and this is a step forward, but not with this patch.

>
> For example. parked/exited/cpu should go into smp_hotplug_thread. Yes,
> this needs cleanups.
>
> All we need is kthread_data() which returns the pointer to the private
> data used by kthread.

that means that we still need to store the private data inside task_struct
and probably again abuse some member.

>
> As for kthread_stop(), we no longer need to abuse ->vfork_done, we can
> use task_works:

yes, that can be done easily. in the current patch I already use the
task_work to free the kthread structure. but still for me is not clear
where to keep the private data if you have only task_struct in your
hands.

--
Roman


Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Oleg Nesterov
On 10/25, Oleg Nesterov wrote:
>
> On 10/25, Roman Pen wrote:
> >
> > This patch avoids allocation of kthread structure on a stack, and simply
> > uses kmalloc.
>
> Oh. I didn't even read this patch, but I have to admit I personally do not
> like it. I can be wrong, but imo this is the step to the wrong direction.

And after I tried to actually read it I dislike it even more, sorry Roman.
Starting from the fact it moves kthread_create_info into struct kthread.

> struct kthread is already bloated, we should not bloat it more. Instead
> we should kill it. And to_kthread() too, at least in its current form.

Yes, but even if we can't or do not want to do this, even if we want to
kmalloc struct kthread, I really think it should not be refcounted
separately from task_struct.

something like the patch in http://marc.info/?l=linux-kernel=146715459127804

Either way to_live_kthread() must go away. Currently we can't avoid it
because we abuse vfork_done, but as I already said we no longer need this.

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Oleg Nesterov
On 10/25, Oleg Nesterov wrote:
>
> On 10/25, Roman Pen wrote:
> >
> > This patch avoids allocation of kthread structure on a stack, and simply
> > uses kmalloc.
>
> Oh. I didn't even read this patch, but I have to admit I personally do not
> like it. I can be wrong, but imo this is the step to the wrong direction.

And after I tried to actually read it I dislike it even more, sorry Roman.
Starting from the fact it moves kthread_create_info into struct kthread.

> struct kthread is already bloated, we should not bloat it more. Instead
> we should kill it. And to_kthread() too, at least in its current form.

Yes, but even if we can't or do not want to do this, even if we want to
kmalloc struct kthread, I really think it should not be refcounted
separately from task_struct.

something like the patch in http://marc.info/?l=linux-kernel=146715459127804

Either way to_live_kthread() must go away. Currently we can't avoid it
because we abuse vfork_done, but as I already said we no longer need this.

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Oleg Nesterov
On 10/25, Roman Pen wrote:
>
> This patch avoids allocation of kthread structure on a stack, and simply
> uses kmalloc.

Oh. I didn't even read this patch, but I have to admit I personally do not
like it. I can be wrong, but imo this is the step to the wrong direction.

struct kthread is already bloated, we should not bloat it more. Instead
we should kill it. And to_kthread() too, at least in its current form.

For example. parked/exited/cpu should go into smp_hotplug_thread. Yes,
this needs cleanups.

All we need is kthread_data() which returns the pointer to the private
data used by kthread.

As for kthread_stop(), we no longer need to abuse ->vfork_done, we can
use task_works:

struct xxx {
struct struct callback_head work;
struct completion done;
};

static void kthread_signal_exit(struct callback_head *arg)
{
struct xxx *xxx = container_of(arg);
complete(xxx->done);
}

int kthread_stop(struct task_struct *k)
{
struct xxx xxx;
int ret;

init_task_work(, kthread_signal_exit);
init_completion();

trace_sched_kthread_stop(k);

get_task_struct(k);
if (!task_work_add(k, , false)) {
set(KTHREAD_SHOULD_STOP);
wake_up_process(k);
wait_for_completion();
}
ret = k->exit_code;
put_task_struct(k);

trace_sched_kthread_stop_ret(ret);
return ret;
}

Oleg.



Re: [PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Oleg Nesterov
On 10/25, Roman Pen wrote:
>
> This patch avoids allocation of kthread structure on a stack, and simply
> uses kmalloc.

Oh. I didn't even read this patch, but I have to admit I personally do not
like it. I can be wrong, but imo this is the step to the wrong direction.

struct kthread is already bloated, we should not bloat it more. Instead
we should kill it. And to_kthread() too, at least in its current form.

For example. parked/exited/cpu should go into smp_hotplug_thread. Yes,
this needs cleanups.

All we need is kthread_data() which returns the pointer to the private
data used by kthread.

As for kthread_stop(), we no longer need to abuse ->vfork_done, we can
use task_works:

struct xxx {
struct struct callback_head work;
struct completion done;
};

static void kthread_signal_exit(struct callback_head *arg)
{
struct xxx *xxx = container_of(arg);
complete(xxx->done);
}

int kthread_stop(struct task_struct *k)
{
struct xxx xxx;
int ret;

init_task_work(, kthread_signal_exit);
init_completion();

trace_sched_kthread_stop(k);

get_task_struct(k);
if (!task_work_add(k, , false)) {
set(KTHREAD_SHOULD_STOP);
wake_up_process(k);
wait_for_completion();
}
ret = k->exit_code;
put_task_struct(k);

trace_sched_kthread_stop_ret(ret);
return ret;
}

Oleg.



[PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Roman Pen
This patch avoids allocation of kthread structure on a stack, and simply
uses kmalloc.  Allocation on a stack became a huge problem (with memory
corruption and all other not nice consequences) after the following commit
2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()")
by Andy Lutomirski, which rewinds the stack on oops, thus ooopsed kthread
steps on a garbage memory while completion of task->vfork_done structure
on the following path:

   oops_end()
   rewind_stack_do_exit()
   exit_mm()
   mm_release()
   complete_vfork_done()

Also in this patch two structures 'struct kthread_create_info' and
'struct kthread' are merged into one 'struct kthread' and its freeing
is controlled by a reference counter.

The last reference on kthread is put from a task work, the callback,
which is invoked from do_exit().  The major thing is that the last
put is happens *after* completion_vfork_done() is invoked.

Signed-off-by: Roman Pen 
Cc: Andy Lutomirski 
Cc: Oleg Nesterov 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Tejun Heo 
Cc: linux-kernel@vger.kernel.org
---
v3:
  o handle to_live_kthread() calls, which should increase a kthread
ref or return NULL.  Function was renamed to to_live_kthread_and_get().
  o minor comments tweaks.

v2:
  o let x86/kernel/dumpstack.c rewind a stack, but do not use a stack
for a structure allocation.

 kernel/kthread.c | 198 ---
 1 file changed, 117 insertions(+), 81 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 4ab4c3766a80..e8adc10556e0 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -18,14 +18,19 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static DEFINE_SPINLOCK(kthread_create_lock);
 static LIST_HEAD(kthread_create_list);
 struct task_struct *kthreadd_task;
 
-struct kthread_create_info
-{
+struct kthread {
+   struct list_head list;
+   unsigned long flags;
+   unsigned int cpu;
+   atomic_t refs;
+
/* Information passed to kthread() from kthreadd. */
int (*threadfn)(void *data);
void *data;
@@ -33,15 +38,9 @@ struct kthread_create_info
 
/* Result passed back to kthread_create() from kthreadd. */
struct task_struct *result;
-   struct completion *done;
 
-   struct list_head list;
-};
-
-struct kthread {
-   unsigned long flags;
-   unsigned int cpu;
-   void *data;
+   struct callback_head put_work;
+   struct completion *started;
struct completion parked;
struct completion exited;
 };
@@ -56,17 +55,49 @@ enum KTHREAD_BITS {
 #define __to_kthread(vfork)\
container_of(vfork, struct kthread, exited)
 
+static inline void get_kthread(struct kthread *kthread)
+{
+   BUG_ON(atomic_read(>refs) <= 0);
+   atomic_inc(>refs);
+}
+
+static inline void put_kthread(struct kthread *kthread)
+{
+   BUG_ON(atomic_read(>refs) <= 0);
+   if (atomic_dec_and_test(>refs))
+   kfree(kthread);
+}
+
+/**
+ * put_kthread_cb - is called from do_exit() and does likely
+ *  the final put.
+ */
+static void put_kthread_cb(struct callback_head *work)
+{
+   struct kthread *kthread;
+
+   kthread = container_of(work, struct kthread, put_work);
+   put_kthread(kthread);
+}
+
 static inline struct kthread *to_kthread(struct task_struct *k)
 {
return __to_kthread(k->vfork_done);
 }
 
-static struct kthread *to_live_kthread(struct task_struct *k)
+static struct kthread *to_live_kthread_and_get(struct task_struct *k)
 {
-   struct completion *vfork = ACCESS_ONCE(k->vfork_done);
-   if (likely(vfork) && try_get_task_stack(k))
-   return __to_kthread(vfork);
-   return NULL;
+   struct kthread *kthread = NULL;
+
+   BUG_ON(!(k->flags & PF_KTHREAD));
+   task_lock(k);
+   if (likely(k->vfork_done)) {
+   kthread = __to_kthread(k->vfork_done);
+   get_kthread(kthread);
+   }
+   task_unlock(k);
+
+   return kthread;
 }
 
 /**
@@ -174,41 +205,37 @@ void kthread_parkme(void)
 }
 EXPORT_SYMBOL_GPL(kthread_parkme);
 
-static int kthread(void *_create)
+static int kthreadfn(void *_self)
 {
-   /* Copy data: it's on kthread's stack */
-   struct kthread_create_info *create = _create;
-   int (*threadfn)(void *data) = create->threadfn;
-   void *data = create->data;
-   struct completion *done;
-   struct kthread self;
-   int ret;
-
-   self.flags = 0;
-   self.data = data;
-   init_completion();
-   init_completion();
-   current->vfork_done = 
-
-   /* If user was SIGKILLed, I release the structure. */
-   done = xchg(>done, NULL);
-   if (!done) {
-   kfree(create);
-   

[PATCH v3 1/1] kthread: allocate kthread structure using kmalloc

2016-10-25 Thread Roman Pen
This patch avoids allocation of kthread structure on a stack, and simply
uses kmalloc.  Allocation on a stack became a huge problem (with memory
corruption and all other not nice consequences) after the following commit
2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()")
by Andy Lutomirski, which rewinds the stack on oops, thus ooopsed kthread
steps on a garbage memory while completion of task->vfork_done structure
on the following path:

   oops_end()
   rewind_stack_do_exit()
   exit_mm()
   mm_release()
   complete_vfork_done()

Also in this patch two structures 'struct kthread_create_info' and
'struct kthread' are merged into one 'struct kthread' and its freeing
is controlled by a reference counter.

The last reference on kthread is put from a task work, the callback,
which is invoked from do_exit().  The major thing is that the last
put is happens *after* completion_vfork_done() is invoked.

Signed-off-by: Roman Pen 
Cc: Andy Lutomirski 
Cc: Oleg Nesterov 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Tejun Heo 
Cc: linux-kernel@vger.kernel.org
---
v3:
  o handle to_live_kthread() calls, which should increase a kthread
ref or return NULL.  Function was renamed to to_live_kthread_and_get().
  o minor comments tweaks.

v2:
  o let x86/kernel/dumpstack.c rewind a stack, but do not use a stack
for a structure allocation.

 kernel/kthread.c | 198 ---
 1 file changed, 117 insertions(+), 81 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 4ab4c3766a80..e8adc10556e0 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -18,14 +18,19 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static DEFINE_SPINLOCK(kthread_create_lock);
 static LIST_HEAD(kthread_create_list);
 struct task_struct *kthreadd_task;
 
-struct kthread_create_info
-{
+struct kthread {
+   struct list_head list;
+   unsigned long flags;
+   unsigned int cpu;
+   atomic_t refs;
+
/* Information passed to kthread() from kthreadd. */
int (*threadfn)(void *data);
void *data;
@@ -33,15 +38,9 @@ struct kthread_create_info
 
/* Result passed back to kthread_create() from kthreadd. */
struct task_struct *result;
-   struct completion *done;
 
-   struct list_head list;
-};
-
-struct kthread {
-   unsigned long flags;
-   unsigned int cpu;
-   void *data;
+   struct callback_head put_work;
+   struct completion *started;
struct completion parked;
struct completion exited;
 };
@@ -56,17 +55,49 @@ enum KTHREAD_BITS {
 #define __to_kthread(vfork)\
container_of(vfork, struct kthread, exited)
 
+static inline void get_kthread(struct kthread *kthread)
+{
+   BUG_ON(atomic_read(>refs) <= 0);
+   atomic_inc(>refs);
+}
+
+static inline void put_kthread(struct kthread *kthread)
+{
+   BUG_ON(atomic_read(>refs) <= 0);
+   if (atomic_dec_and_test(>refs))
+   kfree(kthread);
+}
+
+/**
+ * put_kthread_cb - is called from do_exit() and does likely
+ *  the final put.
+ */
+static void put_kthread_cb(struct callback_head *work)
+{
+   struct kthread *kthread;
+
+   kthread = container_of(work, struct kthread, put_work);
+   put_kthread(kthread);
+}
+
 static inline struct kthread *to_kthread(struct task_struct *k)
 {
return __to_kthread(k->vfork_done);
 }
 
-static struct kthread *to_live_kthread(struct task_struct *k)
+static struct kthread *to_live_kthread_and_get(struct task_struct *k)
 {
-   struct completion *vfork = ACCESS_ONCE(k->vfork_done);
-   if (likely(vfork) && try_get_task_stack(k))
-   return __to_kthread(vfork);
-   return NULL;
+   struct kthread *kthread = NULL;
+
+   BUG_ON(!(k->flags & PF_KTHREAD));
+   task_lock(k);
+   if (likely(k->vfork_done)) {
+   kthread = __to_kthread(k->vfork_done);
+   get_kthread(kthread);
+   }
+   task_unlock(k);
+
+   return kthread;
 }
 
 /**
@@ -174,41 +205,37 @@ void kthread_parkme(void)
 }
 EXPORT_SYMBOL_GPL(kthread_parkme);
 
-static int kthread(void *_create)
+static int kthreadfn(void *_self)
 {
-   /* Copy data: it's on kthread's stack */
-   struct kthread_create_info *create = _create;
-   int (*threadfn)(void *data) = create->threadfn;
-   void *data = create->data;
-   struct completion *done;
-   struct kthread self;
-   int ret;
-
-   self.flags = 0;
-   self.data = data;
-   init_completion();
-   init_completion();
-   current->vfork_done = 
-
-   /* If user was SIGKILLed, I release the structure. */
-   done = xchg(>done, NULL);
-   if (!done) {
-   kfree(create);
-   do_exit(-EINTR);
+   struct completion *started;
+   struct kthread *self = _self;
+   int ret = -EINTR;
+
+   /* If user was SIGKILLed,