Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-05-17 Thread Dean Nelson
On Wed, May 02, 2007 at 09:44:11AM -0600, Eric W. Biederman wrote:
> Dean Nelson <[EMAIL PROTECTED]> writes:
> > On Thu, Apr 26, 2007 at 01:11:15PM -0600, Eric W. Biederman wrote:
> >>
> >> Ok.  Because of the module unloading issue, and because we don't have
> >> a lot of these threads running around, the current plan is to fix
> >> thread_create and kthread_stop so that they must always be paired,
> >> and so that kthread_stop will work correctly if the task has already
> >> exited.
> >>
> >> Basically that just involves calling get_task_struct in kthread_create
> >> and put_task_struct in kthread_stop.
> >
> > Okay, so I need to expand upon Christoph Hellwig's patch so that all
> > the kthread_create()'d threads are kthread_stop()'d.
> >
> > This is easy to do for the XPC thread that exists for the lifetime of XPC,
> > as well as for the threads created to manage the SGI system partitions.
> >
> > XPC has the one discovery thread that is created when XPC is first started
> > and exits as soon as it has finished discovering all existing SGI system
> > partitions. With your forthcoming change to kthread_stop() that will allow
> > it to be called after the thread has exited, doing this one is also easy.
> > Note that the kthread_stop() for this discovery thread won't occur until
> > XPC is rmmod'd. This means that its task_struct will not be freed for
> > possibly a very long time (i.e., weeks). Is that a problem?
> 
> As long as there is only one, not really.  It would be good if we could
> get rid of it though.
> 
> The practical problem is the race with rmmod, in particular if someone
> calls rmmod while this thread is still running.

I guess I'm not seeing the race with rmmod that you're talking
about. In XPC's case, rmmod calls xpc_exit(), which currently does a
wait_for_completion() on the discovery thread and on the other thread
mentioned above. These will be changed to kthread_stop() calls. If
the discovery thread has already exited, kthread_stop() will return
immediately; if not, it will wait until the discovery thread has
exited. rmmod won't return from xpc_exit() until both threads have exited.

Any thought as to when the changes to kthread_stop() that allow it to be
called for a kthread that has already exited will get into the -mm tree?

> > Is there any way to have a version of kthread_create() that doesn't
> > require a matching kthread_stop()? Or add a kthread_not_stopping()
> > that does the put_task_struct() call, so as to eliminate the need for
> > calling kthread_stop()?
> 
> Yes.  I was thinking of calling it kthread_orphan or something like that.
> We can't make anything like that the default, because of the modular
> remove problem, but it's not too hard.

Again, when xpc_exit() is called by rmmod it waits for XPC's pool of
threads to exit before it returns, so not a problem.

Any thought as to when kthread_orphan() will get into the -mm tree? Once
kthread_stop() is changed and kthread_orphan() added I can proceed with
a patch to change XPC to use the kthread API.

> > Or should we reconsider the kthread pool approach
> > (and get XPC out of the thread management business altogether)? Robin
> > Holt is putting together a proposal for how one could do a kthread pool,
> > it should provide a bit more justification for going down that road.

Robin has changed his mind about tying the management of a pool of
threads into the kthread API, so there won't be the aforementioned
proposal.

Thanks,
Dean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-05-02 Thread Eric W. Biederman
Dean Nelson <[EMAIL PROTECTED]> writes:

> On Mon, Apr 30, 2007 at 10:22:30AM -0500, Dean Nelson wrote:
>> On Fri, Apr 27, 2007 at 02:33:32PM -0600, Eric W. Biederman wrote:
>> > Dean Nelson <[EMAIL PROTECTED]> writes:
>> > 
>> > > Taking it one step further, if you added the notion of a thread pool,
>> > > where upon exit, a thread isn't destroyed but rather is queued ready to
>> > > handle the next kthread_create_quick() request.
>> > 
>> > That might happen.  So far I am avoiding the notion of a thread pool for
>> > as long as I can.  There is some sense in it, especially in generalizing
>> > the svc thread pool code from nfs.  But if I don't have to go there I would
>> > prefer it.
>> 
>> This means that XPC will have to retain its thread pool, but I can
>> understand you not wanting to go there.
>
> On Thu, Apr 26, 2007 at 01:11:15PM -0600, Eric W. Biederman wrote:
>>
>> Ok.  Because of the module unloading issue, and because we don't have
>> a lot of these threads running around, the current plan is to fix
>> thread_create and kthread_stop so that they must always be paired,
>> and so that kthread_stop will work correctly if the task has already
>> exited.
>>
>> Basically that just involves calling get_task_struct in kthread_create
>> and put_task_struct in kthread_stop.
>
> Okay, so I need to expand upon Christoph Hellwig's patch so that all
> the kthread_create()'d threads are kthread_stop()'d.
>
> This is easy to do for the XPC thread that exists for the lifetime of XPC,
> as well as for the threads created to manage the SGI system partitions.
>
> XPC has the one discovery thread that is created when XPC is first started
> and exits as soon as it has finished discovering all existing SGI system
> partitions. With your forthcoming change to kthread_stop() that will allow
> it to be called after the thread has exited, doing this one is also easy.
> Note that the kthread_stop() for this discovery thread won't occur until
> XPC is rmmod'd. This means that its task_struct will not be freed for
> possibly a very long time (i.e., weeks). Is that a problem?

As long as there is only one, not really.  It would be good if we could
get rid of it though.

The practical problem is the race with rmmod, in particular if someone
calls rmmod while this thread is still running.

If I get clever I think this is likely solvable with something like:

void kthread_maybe_stop(struct task_struct **loc)
{
	struct task_struct *tsk;

	/* Atomically claim the task pointer so only one side stops it. */
	tsk = xchg(loc, NULL);
	if (tsk)
		kthread_stop(tsk);
}

void kthread_stop_self(struct task_struct **loc, int exit_code)
{
	struct task_struct *tsk;

	/* If we claimed the pointer, drop the reference taken at create. */
	tsk = xchg(loc, NULL);
	if (tsk)
		put_task_struct(tsk);
	do_exit(exit_code);
}

I'm not quite convinced that is a common enough paradigm to implement
that.
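As a usage sketch, assuming helpers of the shape Eric outlines existed:
the names xpc_discovery_task and xpc_do_discovery() here are
illustrative stand-ins, not the driver's actual identifiers.

```c
static struct task_struct *xpc_discovery_task;	/* shared location */

static int xpc_discovery_thread(void *arg)
{
	xpc_do_discovery();	/* hypothetical: find SGI partitions */
	/* Whichever side claims the pointer first via xchg() drops the
	 * reference; if rmmod claimed it already, kthread_stop() is
	 * waiting for us and this do_exit() releases it. */
	kthread_stop_self(&xpc_discovery_task, 0);
	return 0;		/* never reached */
}

static void xpc_exit(void)
{
	/* Safe whether or not the discovery thread already exited. */
	kthread_maybe_stop(&xpc_discovery_task);
}
```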

> But then we come to XPC's pool of threads that deliver channel messages
> to the appropriate consumer (like XPNET) and can block indefinitely. As
> mentioned earlier there could be hundreds if not thousands of these
> (our systems keep getting bigger). So now requiring a kthread_stop()
> for each one of these becomes more of a problem, as it is a lot of
> task_struct pointers to maintain.
>
> Currently, XPC maintains these threads via a
> wait_event_interruptible_exclusive() queue so that it can wake up as many
> or as few as needed at any given moment by calling wake_up_nr(). When XPC
> is rmmod'd, a flag is set which causes them to exit and wake_up_all()
> is called.  Therefore XPC doesn't need to remember their pids or
> task_struct pointers.
>
> So what would you suggest we do for this pool of threads?

Good question.

I still find it strange that something which feels like a core part of
the platform code is modular.

> Is there any way to have a version of kthread_create() that doesn't
> require a matching kthread_stop()? Or add a kthread_not_stopping()
> that does the put_task_struct() call, so as to eliminate the need for
> calling kthread_stop()?

Yes.  I was thinking of calling it kthread_orphan or something like that.
We can't make anything like that the default, because of the modular
remove problem, but it's not too hard.

> Or should we reconsider the kthread pool approach
> (and get XPC out of the thread management business altogether)? Robin
> Holt is putting together a proposal for how one could do a kthread pool,
> it should provide a bit more justification for going down that road.

Eric


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-05-02 Thread Dean Nelson
On Mon, Apr 30, 2007 at 10:22:30AM -0500, Dean Nelson wrote:
> On Fri, Apr 27, 2007 at 02:33:32PM -0600, Eric W. Biederman wrote:
> > Dean Nelson <[EMAIL PROTECTED]> writes:
> > 
> > > Taking it one step further, if you added the notion of a thread pool,
> > > where upon exit, a thread isn't destroyed but rather is queued ready to
> > > handle the next kthread_create_quick() request.
> > 
> > That might happen.  So far I am avoiding the notion of a thread pool for
> > as long as I can.  There is some sense in it, especially in generalizing
> > the svc thread pool code from nfs.  But if I don't have to go there I would
> > prefer it.
> 
> This means that XPC will have to retain its thread pool, but I can
> understand you not wanting to go there.

On Thu, Apr 26, 2007 at 01:11:15PM -0600, Eric W. Biederman wrote:
>
> Ok.  Because of the module unloading issue, and because we don't have
> a lot of these threads running around, the current plan is to fix
> thread_create and kthread_stop so that they must always be paired,
> and so that kthread_stop will work correctly if the task has already
> exited.
>
> Basically that just involves calling get_task_struct in kthread_create
> and put_task_struct in kthread_stop.

Okay, so I need to expand upon Christoph Hellwig's patch so that all
the kthread_create()'d threads are kthread_stop()'d.

This is easy to do for the XPC thread that exists for the lifetime of XPC,
as well as for the threads created to manage the SGI system partitions.

XPC has the one discovery thread that is created when XPC is first started
and exits as soon as it has finished discovering all existing SGI system
partitions. With your forthcoming change to kthread_stop() that will allow
it to be called after the thread has exited, doing this one is also easy.
Note that the kthread_stop() for this discovery thread won't occur until
XPC is rmmod'd. This means that its task_struct will not be freed for
possibly a very long time (i.e., weeks). Is that a problem?

But then we come to XPC's pool of threads that deliver channel messages
to the appropriate consumer (like XPNET) and can block indefinitely. As
mentioned earlier there could be hundreds if not thousands of these
(our systems keep getting bigger). So now requiring a kthread_stop()
for each one of these becomes more of a problem, as it is a lot of
task_struct pointers to maintain.

Currently, XPC maintains these threads via a
wait_event_interruptible_exclusive() queue so that it can wake up as many
or as few as needed at any given moment by calling wake_up_nr(). When XPC
is rmmod'd, a flag is set which causes them to exit and wake_up_all()
is called.  Therefore XPC doesn't need to remember their pids or
task_struct pointers.
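That wake/exit protocol can be sketched roughly as follows; this is a
guess at the shape only, and names like xpc_idle_wq, xpc_msgs_pending,
and xpc_exiting are illustrative rather than XPC's real identifiers.

```c
static DECLARE_WAIT_QUEUE_HEAD(xpc_idle_wq);	/* illustrative name */
static int xpc_exiting;				/* set at rmmod time */
static atomic_t xpc_msgs_pending = ATOMIC_INIT(0);

static int xpc_pool_thread(void *arg)
{
	while (!xpc_exiting) {
		/* Exclusive wait: wake_up_nr(q, n) wakes at most n of us. */
		wait_event_interruptible_exclusive(xpc_idle_wq,
				atomic_read(&xpc_msgs_pending) > 0 ||
				xpc_exiting);
		if (xpc_exiting)
			break;
		atomic_dec(&xpc_msgs_pending);
		/* ... deliver a channel message to its consumer ... */
	}
	return 0;	/* no pids or task_structs recorded anywhere */
}

/* Wake only as many idle threads as there are messages to deliver. */
static void xpc_dispatch(int nmsgs)
{
	atomic_add(nmsgs, &xpc_msgs_pending);
	wake_up_nr(&xpc_idle_wq, nmsgs);
}

static void xpc_teardown_pool(void)
{
	xpc_exiting = 1;
	wake_up_all(&xpc_idle_wq);	/* every pool thread exits */
}
```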

So what would you suggest we do for this pool of threads?

Is there any way to have a version of kthread_create() that doesn't
require a matching kthread_stop()? Or add a kthread_not_stopping()
that does the put_task_struct() call, so as to eliminate the need for
calling kthread_stop()?  Or should we reconsider the kthread pool approach
(and get XPC out of the thread management business altogether)? Robin
Holt is putting together a proposal for how one could do a kthread pool,
it should provide a bit more justification for going down that road.

Thanks,
Dean



Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-30 Thread Dean Nelson
On Fri, Apr 27, 2007 at 02:33:32PM -0600, Eric W. Biederman wrote:
> Dean Nelson <[EMAIL PROTECTED]> writes:
> 
> > On Fri, Apr 27, 2007 at 12:34:02PM -0600, Eric W. Biederman wrote:
> >> Dean Nelson <[EMAIL PROTECTED]> writes:
> >> >
> >> > XPC is in need of threads that can block indefinitely, which is why XPC
> >> > is in the business of maintaining a pool of threads. Currently there is
> >> > no such capability (that I know of) that is provided by linux. Workqueues
> >> > can't block indefinitely.
> >> 
> >> I'm not certain I understand this requirement.  Do you mean block
> >> indefinitely unless requested to stop?
> >
> > These threads can block waiting for a hardware DMA engine, which has a 28
> > second timeout setpoint.
> 
> Ok.  So this is an interruptible sleep?

No, the hardware DMA engine's software interface doesn't sleep or
relinquish the CPU. But there are other spots where we do sleep
interruptibly.

> Do you have any problems being woken up out of that interruptible sleep
> by kthread_stop?
> 
> I am in the process of modifying kthread_stop to wake up threads in an
> interruptible sleep and set signal_pending, so they will break out.

No, this is fine. Just avoid designing the kthread stop mechanism to
require that a thread asked to stop actually stops within some finite
amount of time, or that one thread's failure to stop holds off other
kthread stop requests. Allow the thread to take as much time as it
needs to respond to the kthread stop request.

> >> > And for performance reasons these threads need to be able to be created
> >> > quickly. These threads are involved in delivering messages to XPC's users
> >> > (like XPNET) and we had latency issues that led us to use kernel_thread()
> >> > directly instead of the kthread API. Additionally, XPC may need to have
> >> > hundreds of these threads active at any given time.
> >> 
> >> Ugh.  Can you tell me a little more about the latency issues?
> >
> > After placing a message in a local message queue, one SGI system partition
> > will interrupt another to retrieve the message. We need to minimize the
> > time from entering XPC's interrupt handler to the time that the message
> > can be DMA transferred and delivered to the consumer (like XPNET) to
> > whom it was sent.
> >
> >> Is having a non-halting kthread_create enough to fix this?
> >> So you don't have to context switch several times to get the
> >> thread running?
> >> 
> >> Or do you need more severe latency reductions?
> >> 
> >> The more severe fix would require some significant changes to copy_process
> >> and every architecture would need to be touched to fix up copy_thread.
> >> It is possible, it is a lot of work, and the reward is far from obvious.
> >
> > I think a non-halting kthread_create() should be sufficient. It is in
> > effect what XPC has now in calling kernel_thread() directly.
> 
> A little different but pretty close.
> 
> We call kthread_create(); it prepares everything, places it on
> a queue, and wakes up kthreadd.
> 
> kthreadd then wakes up and forks the thread.
> 
> After the thread has finished setting up it will call complete on
> a completion so kthread_create can continue on its merry way,
> but it should not need to go to sleep waiting for someone to
> call kthread_bind.
> 
> But if you can live with what I have just described that will
> be easy to code up.
> 
> It is a little slower than kernel_thread but hopefully not much.

I was aware of this behavior of kthread_create(), which I consider
'halting' in that the thread doing the kthread_create() blocks waiting
for kthreadd to get scheduled, call kernel_thread(), and then call
complete(). By your mentioning a 'non-halting' kthread_create() I
thought you were planning to create a new flavor of kthread_create()
that called kernel_thread() directly and reparented the child thread to
kthreadd. My mistake.

So there will be more overhead (time-wise) for XPC in calling
kthread_run() than in its former direct calls to kernel_thread(),
which is why XPC needs to retain a pool of kthread_create()'d threads.
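The cost difference being accepted here, as a before/after sketch
(xpc_pool_thread and the "xpc/%d" thread name are illustrative, not
XPC's actual code):

```c
/* Before: direct creation, no handoff to another task. */
pid = kernel_thread(xpc_pool_thread, NULL, 0);

/* After: kthread_run() queues the request to kthreadd and waits on a
 * completion until the new thread is set up, so startup is slower --
 * hence keeping a pool of already-created threads on standby. */
tsk = kthread_run(xpc_pool_thread, NULL, "xpc/%d", nthreads);
if (IS_ERR(tsk))
	return PTR_ERR(tsk);
```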

> > Taking it one step further, if you added the notion of a thread pool,
> > where upon exit, a thread isn't destroyed but rather is queued ready to
> > handle the next kthread_create_quick() request.
> 
> That might happen.  So far I am avoiding the notion of a thread pool for
> as long as I can.  There is some sense in it, especially in generalizing
> the svc thread pool code from nfs.  But if I don't have to go there I would
> prefer it.

This means that XPC will have to retain its thread pool, but I can
understand you not wanting to go there.

Thanks,
Dean



Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-30 Thread Dean Nelson
On Fri, Apr 27, 2007 at 02:33:32PM -0600, Eric W. Biederman wrote:
 Dean Nelson [EMAIL PROTECTED] writes:
 
  On Fri, Apr 27, 2007 at 12:34:02PM -0600, Eric W. Biederman wrote:
  Dean Nelson [EMAIL PROTECTED] writes:
  
   XPC is in need of threads that can block indefinitely, which is why XPC
   is in the business of maintaining a pool of threads. Currently there is
   no such capability (that I know of) that is provided by linux. Workqueues
   can't block indefinitely.
  
  I'm not certain I understand this requirement.  Do you mean block 
  indefinitely
  unless requested to stop?
 
  These threads can block waiting for a hardware DMA engine, which has a 28
  second timeout setpoint.
 
 Ok.  So this is an interruptible sleep?

No, the hardware DMA engine's software interface, doesn't sleep
nor relinquish the CPU. But there are other spots where we do sleep
interruptibly.

 Do you have any problems being woken up out of that interruptible sleep
 by kthread_stop?
 
 I am in the process of modifying kthread_stop to wake up thread in an
 interruptible sleep and set signal_pending, so they will break out.

No, this is fine, just avoid designing the kthread stop mechanism
to require a thread being requested to stop to actually stop in some
finite amount of time, and that by it not stopping other kthread stop
requests are held off. Allow the thread to take as much time as it needs
to respond to the kthread stop request.

   And for performance reasons these threads need to be able to be created
   quickly. These threads are involved in delivering messages to XPC's users
   (like XPNET) and we had latency issues that led us to use kernel_thread()
   directly instead of the kthread API. Additionally, XPC may need to have
   hundreds of these threads active at any given time.
  
  Ugh.  Can you tell me a little more about the latency issues?
 
  After placing a message in a local message queue, one SGI system partition
  will interrupt another to retrieve the message. We need to minimize the
  time from entering XPC's interrupt handler to the time that the message
  can be DMA transferred and delivered to the consumer (like XPNET) to
  whom it was sent.
 
  Is having a non-halting kthread_create enough to fix this?
  So you don't have to context switch several times to get the
  thread running?
  
  Or do you need more severe latency reductions?
  
  The more severe fix would require some significant changes to copy_process
  and every architecture would need to be touched to fix up copy_thread.
  It is possible, it is a lot of work, and the reward is far from obvious.
 
  I think a non-halting kthread_create() should be sufficient. It is in
  effect what XPC has now in calling kernel_thread() directly.
 
 A little different but pretty close.
 
 We call kthread_create(); it prepares everything, places the request on
 a queue, and wakes up kthreadd.

 kthreadd then wakes up and forks the thread.

 After the thread has finished setting up it will call complete() on
 a completion so kthread_create can continue on its merry way,
 but it should not need to go to sleep waiting for someone to
 call kthread_bind.

 But if you can live with what I have just described, that will
 be easy to code up.

 It is a little slower than kernel_thread but hopefully not much.

I was aware of this behavior of kthread_create(), which I consider
'halting' in that the thread doing the kthread_create() blocks waiting
for kthreadd to get scheduled, call kernel_thread(), and then call
complete(). By your mentioning a 'non-halting' kthread_create() I
thought you were planning to create a new flavor of kthread_create()
that called kernel_thread() directly and reparented the child thread to
kthreadd. My mistake.

So there will be more overhead (time-wise) for XPC in calling
kthread_run() than there was in calling kernel_thread() directly,
thus requiring XPC to utilize a pool of kthread_create()'d threads.

  Taking it one step further, if you added the notion of a thread pool,
  where upon exit, a thread isn't destroyed but rather is queued ready to
  handle the next kthread_create_quick() request.
 
 That might happen.  So far I am avoiding the notion of a thread pool for
 as long as I can.  There is some sense in it, especially in generalizing
 the svc thread pool code from nfs.  But if I don't have to go there I would
 prefer it.

This means that XPC will have to retain its thread pool, but I can
understand you not wanting to go there.

Thanks,
Dean

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-27 Thread Eric W. Biederman
Dean Nelson <[EMAIL PROTECTED]> writes:

> On Fri, Apr 27, 2007 at 12:34:02PM -0600, Eric W. Biederman wrote:
>> Dean Nelson <[EMAIL PROTECTED]> writes:
>> >
>> > Christoph is correct in that XPC has a single thread that exists throughout
>> > its lifetime, another set of threads that exist for the time that active
>> > contact with other XPCs running on other SGI system partitions exists, and
>> > finally there is a pool of threads that exist on an as needed basis once
>> > a channel connection has been established between two partitions.
>> >
>> > In principle I approve of the kthread API and its use as opposed to what
>> > XPC currently does (calls kernel_thread(), daemonize(),
> wait_for_completion(),
>> > and complete()). So Christoph's patch that changes the single long-lived
>> > thread to use kthread_stop() and kthread_should_stop() is appreciated.
>> >
>> > But the fact that another thread, started at the xpc_init() time, that does
>> > discovery of other SGI system partitions wasn't converted points out a
>> > weakness in either my thinking or the kthread API. This discovery thread
>> > does its job and then exits. Should XPC be rmmod'd while the discovery
>> > thread is still running we would need to do a kthread_stop() against it. 
>> > But kthread_stop() isn't set up to deal with a task that has already 
>> > exited.
>> > And if what once was the task structure of this exited task has been
>> > reallocated to another new task, we'd end up stopping it should it be
>> > operating under the kthread API, or possibly waiting a very long time
>> > for it to exit if it is not.
>> 
>> Patches are currently under development to allow kthreads to exit
>> before kthread_stop is called.  The big thing is that once we allow
>> kernel threads that exited by themselves to be reaped by kthread_stop
>> we have some significant work to do.
>> 
>> > I'm also a little uneasy that kthread_stop() has an "only one thread can
>> > stop another thread at a time" design. It's a potential bottleneck on
>> > very large systems where threads are blocked and unable to respond to a
>> > kthread_should_stop() for some period of time.
>> 
>> There are already patches out there to fix this issue.
>> 
>> > XPC is in need of threads that can block indefinitely, which is why XPC
>> > is in the business of maintaining a pool of threads. Currently there is
>> > no such capability (that I know of) that is provided by Linux. Workqueues
>> > can't block indefinitely.
>> 
>> I'm not certain I understand this requirement.  Do you mean block 
>> indefinitely
>> unless requested to stop?
>
> These threads can block waiting for a hardware DMA engine, which has a 28
> second timeout setpoint.

Ok.  So this is an interruptible sleep?
Do you have any problems being woken up out of that interruptible sleep
by kthread_stop?

I am in the process of modifying kthread_stop to wake up threads in an
interruptible sleep and set signal_pending, so they will break out.

>> > And for performance reasons these threads need to be able to be created
>> > quickly. These threads are involved in delivering messages to XPC's users
>> > (like XPNET) and we had latency issues that led us to use kernel_thread()
>> > directly instead of the kthread API. Additionally, XPC may need to have
>> > hundreds of these threads active at any given time.
>> 
>> Ugh.  Can you tell me a little more about the latency issues?
>
> After placing a message in a local message queue, one SGI system partition
> will interrupt another to retrieve the message. We need to minimize the
> time from entering XPC's interrupt handler to the time that the message
> can be DMA transferred and delivered to the consumer (like XPNET) to
> whom it was sent.
>
>> Is having a non-halting kthread_create enough to fix this?
>> So you don't have to context switch several times to get the
>> thread running?
>> 
>> Or do you need more severe latency reductions?
>> 
>> The more severe fix would require some significant changes to copy_process
>> and every architecture would need to be touched to fix up copy_thread.
>> It is possible, it is a lot of work, and the reward is far from obvious.
>
> I think a non-halting kthread_create() should be sufficient. It is in
> effect what XPC has now in calling kernel_thread() directly.

A little different but pretty close.

We call kthread_create(); it prepares everything, places the request on
a queue, and wakes up kthreadd.

kthreadd then wakes up and forks the thread.

After the thread has finished setting up it will call complete() on
a completion so kthread_create can continue on its merry way,
but it should not need to go to sleep waiting for someone to
call kthread_bind.

But if you can live with what I have just described, that will
be easy to code up.

It is a little slower than kernel_thread but hopefully not much.

> Taking it one step further, if you added the notion of a thread pool,
> where upon exit, a thread isn't destroyed but rather is queued ready to
> handle the next kthread_create_quick() request.

Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-27 Thread Dean Nelson
On Fri, Apr 27, 2007 at 12:34:02PM -0600, Eric W. Biederman wrote:
> Dean Nelson <[EMAIL PROTECTED]> writes:
> >
> > Christoph is correct in that XPC has a single thread that exists throughout
> > its lifetime, another set of threads that exist for the time that active
> > contact with other XPCs running on other SGI system partitions exists, and
> > finally there is a pool of threads that exist on an as needed basis once
> > a channel connection has been established between two partitions.
> >
> > In principle I approve of the kthread API and its use as opposed to what
> > XPC currently does (calls kernel_thread(), daemonize(), 
> > wait_for_completion(),
> > and complete()). So Christoph's patch that changes the single long-lived
> > thread to use kthread_stop() and kthread_should_stop() is appreciated.
> >
> > But the fact that another thread, started at the xpc_init() time, that does
> > discovery of other SGI system partitions wasn't converted points out a
> > weakness in either my thinking or the kthread API. This discovery thread
> > does its job and then exits. Should XPC be rmmod'd while the discovery
> > thread is still running we would need to do a kthread_stop() against it. 
> > But kthread_stop() isn't set up to deal with a task that has already exited.
> > And if what once was the task structure of this exited task has been
> > reallocated to another new task, we'd end up stopping it should it be
> > operating under the kthread API, or possibly waiting a very long time
> > for it to exit if it is not.
> 
> Patches are currently under development to allow kthreads to exit
> before kthread_stop is called.  The big thing is that once we allow
> kernel threads that exited by themselves to be reaped by kthread_stop
> we have some significant work to do.
> 
> > I'm also a little uneasy that kthread_stop() has an "only one thread can
> > stop another thread at a time" design. It's a potential bottleneck on
> > very large systems where threads are blocked and unable to respond to a
> > kthread_should_stop() for some period of time.
> 
> There are already patches out there to fix this issue.
> 
> > XPC is in need of threads that can block indefinitely, which is why XPC
> > is in the business of maintaining a pool of threads. Currently there is
> > no such capability (that I know of) that is provided by Linux. Workqueues
> > can't block indefinitely.
> 
> I'm not certain I understand this requirement.  Do you mean block indefinitely
> unless requested to stop?

These threads can block waiting for a hardware DMA engine, which has a 28
second timeout setpoint.

> > And for performance reasons these threads need to be able to be created
> > quickly. These threads are involved in delivering messages to XPC's users
> > (like XPNET) and we had latency issues that led us to use kernel_thread()
> > directly instead of the kthread API. Additionally, XPC may need to have
> > hundreds of these threads active at any given time.
> 
> Ugh.  Can you tell me a little more about the latency issues?

After placing a message in a local message queue, one SGI system partition
will interrupt another to retrieve the message. We need to minimize the
time from entering XPC's interrupt handler to the time that the message
can be DMA transferred and delivered to the consumer (like XPNET) to
whom it was sent.

> Is having a non-halting kthread_create enough to fix this?
> So you don't have to context switch several times to get the
> thread running?
> 
> Or do you need more severe latency reductions?
> 
> The more severe fix would require some significant changes to copy_process
> and every architecture would need to be touched to fix up copy_thread.
> It is possible, it is a lot of work, and the reward is far from obvious.

I think a non-halting kthread_create() should be sufficient. It is in
effect what XPC has now in calling kernel_thread() directly.

Taking it one step further, if you added the notion of a thread pool,
where upon exit, a thread isn't destroyed but rather is queued ready to
handle the next kthread_create_quick() request.

> > I think it would be great if the kthread API (or underlying implementation)
> > could be changed to handle these issues. I'd love for XPC to not have to
> > maintain this sort of thing itself.
> 
> Currently daemonize is a serious maintenance problem.
> 
> Using daemonize and kernel_thread to create kernel threads is a blocker
> in implementing the pid namespace because of their use of pid_t.
> 
> So I am motivated to get this fixed.

This would also address the problems we see with huge pid spaces for
kernel threads on our largest machines.  In the example from last week,
we had 10 threads each on 4096 cpus.  If we reworked work_queues to use
the kthread_create_nonblocking() thread pool, we could probably collapse
the need for having all of those per-task, per-cpu work queues.

Dean

Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-27 Thread Eric W. Biederman
Dean Nelson <[EMAIL PROTECTED]> writes:
>
> Christoph is correct in that XPC has a single thread that exists throughout
> its lifetime, another set of threads that exist for the time that active
> contact with other XPCs running on other SGI system partitions exists, and
> finally there is a pool of threads that exist on an as needed basis once
> a channel connection has been established between two partitions.
>
> In principle I approve of the kthread API and its use as opposed to what
> XPC currently does (calls kernel_thread(), daemonize(), wait_for_completion(),
> and complete()). So Christoph's patch that changes the single long-lived
> thread to use kthread_stop() and kthread_should_stop() is appreciated.
>
> But the fact that another thread, started at the xpc_init() time, that does
> discovery of other SGI system partitions wasn't converted points out a
> weekness in either my thinking or the kthread API. This discovery thread
> does its job and then exits. Should XPC be rmmod'd while the discovery
> thread is still running we would need to do a kthread_stop() against it. 
> But kthread_stop() isn't set up to deal with a task that has already exited.
> And if what once was the task structure of this exited task has been
> reallocated to another new task, we'd end up stopping it should it be
> operating under the kthread API, or possibly waiting a very long time
> for it to exit if it is not.

Patches are currently under development to allow kthreads to exit
before kthread_stop is called.  The big thing is that once we allow
kernel threads that exited by themselves to be reaped by kthread_stop
we have some significant work to do.

> I'm also a little uneasy that kthread_stop() has an "only one thread can
> stop another thread at a time" design. It's a potential bottleneck on
> very large systems where threads are blocked and unable to respond to a
> kthread_should_stop() for some period of time.

There are already patches out there to fix this issue.

> XPC is in need of threads that can block indefinitely, which is why XPC
> is in the business of maintaining a pool of threads. Currently there is
> no such capability (that I know of) that is provided by Linux. Workqueues
> can't block indefinitely.

I'm not certain I understand this requirement.  Do you mean block indefinitely
unless requested to stop?

> And for performance reasons these threads need to be able to be created
> quickly. These threads are involved in delivering messages to XPC's users
> (like XPNET) and we had latency issues that led us to use kernel_thread()
> directly instead of the kthread API. Additionally, XPC may need to have
> hundreds of these threads active at any given time.

Ugh.  Can you tell me a little more about the latency issues?

Is having a non-halting kthread_create enough to fix this?
So you don't have to context switch several times to get the
thread running?

Or do you need more severe latency reductions?

The more severe fix would require some significant changes to copy_process
and every architecture would need to be touched to fix up copy_thread.
It is possible, it is a lot of work, and the reward is far from obvious.

> I think it would be great if the kthread API (or underlying implementation)
> could be changed to handle these issues. I'd love for XPC to not have to
> maintain this sort of thing itself.

Currently daemonize is a serious maintenance problem.

Using daemonize and kernel_thread to create kernel threads is a blocker
in implementing the pid namespace because of their use of pid_t.

So I am motivated to get this fixed.

Eric


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-27 Thread Dean Nelson

On Thu, Apr 19, 2007 at 04:51:03PM -0700, Andrew Morton wrote:
> Another driver which should be fully converted to the kthread API:
> kthread_stop() and kthread_should_stop().
> 
> And according to my logs, this driver was added to the tree more than
> a year _after_ the kthread interface was made available.
> 
> This isn't good.

On Sun, Apr 22, 2007 at 09:36:47PM +0100, Christoph Hellwig wrote:
> On Thu, Apr 19, 2007 at 01:58:44AM -0600, Eric W. Biederman wrote:
> > From: Eric W. Biederman <[EMAIL PROTECTED]>
> >
> > This patch starts the xpc kernel threads using kthread_run
> > not a combination of kernel_thread and daemonize, resulting
> > in slightly simpler and more maintainable code.
>
> This driver is a really twisted maze.  It has a lot of threads,
> some of them running through the whole lifetime of the driver,
> some short-lived and some in a sort of a pool.
>
> The patch below fixes up the long-lived thread as well as fixing
> gazillions of leaks in the init routine by switching to proper
> goto-based unwinding.

I see that the setting of 'xpc_rsvd_page->vars_pa = 0;' in xpc_init() is
considered a leak by Christoph (hch), but it really is not. If you look at
xpc_rsvd_page_init() where it is set up, you might see that the reserved page
is something XPC gets from SAL, which created it at system boot time. If XPC is
rmmod'd and insmod'd again, it will be handed the same page of memory by SAL.
So there is no memory leak here.

As for the other suspected leaks mentioned I'm not sure what they could be.
It may be the fact that XPC continues coming up when presented with error
returns from register_reboot_notifier() and register_die_notifier(). There
is no leak in this, just simply a loss of an early notification to other
SGI system partitions that this partition is going down. A fact they will
sooner or later discover on their own. A notice of this degraded
functionality is written to the console. And the likelihood that these
functions should ever return an error is very, very small (currently,
neither of them has an error return).

From my experience, the goto-based unwinding of error returns is not necessarily
a superior approach. I spent a month tracking down a difficult to reproduce
problem that ended up being an error return jumping to the wrong label in a
goto-based unwind. Problems can arise with either approach. I've also seen
the compiler generate less code for the non-goto approach. I'm not a compiler
person so I can't explain this, nor can I say that it's always the case, but
at one time when I did a bake off between the two approaches the non-goto
approach generated less code.

Having said this I have no problem with switching to a goto-based unwinding
of errors if that is what the community prefers. I personally find it more
readable than the non-goto approach.

> Note that thread pools are something we have in a few places,
> and might be worth handling in the core kthread infrastructure,
> as tearing down pools will get a bit complicated using the
> kthread APIs.

Christoph is correct in that XPC has a single thread that exists throughout
its lifetime, another set of threads that exist for the time that active
contact with other XPCs running on other SGI system partitions exists, and
finally there is a pool of threads that exist on an as needed basis once
a channel connection has been established between two partitions.

In principle I approve of the kthread API and its use as opposed to what
XPC currently does (calls kernel_thread(), daemonize(), wait_for_completion(),
and complete()). So Christoph's patch that changes the single long-lived
thread to use kthread_stop() and kthread_should_stop() is appreciated.

But the fact that another thread, started at the xpc_init() time, that does
discovery of other SGI system partitions wasn't converted points out a
weakness in either my thinking or the kthread API. This discovery thread
does its job and then exits. Should XPC be rmmod'd while the discovery
thread is still running we would need to do a kthread_stop() against it. 
But kthread_stop() isn't set up to deal with a task that has already exited.
And if what once was the task structure of this exited task has been
reallocated to another new task, we'd end up stopping it should it be
operating under the kthread API, or possibly waiting a very long time
for it to exit if it is not.

I'm also a little uneasy that kthread_stop() has an "only one thread can
stop another thread at a time" design. It's a potential bottleneck on
very large systems where threads are blocked and unable to respond to a
kthread_should_stop() for some period of time.

XPC is in need of threads that can block indefinitely, which is why XPC
is in the business of maintaining a pool of threads. Currently there is
no such capability (that I know of) that is provided by Linux. Workqueues
can't block indefinitely.

And for performance reasons these threads need to be able to be created
quickly. These threads are involved in delivering messages to XPC's users
(like XPNET) and we had latency issues that led us to use kernel_thread()
directly instead of the kthread API. Additionally, XPC may need to have
hundreds of these threads active at any given time.

Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-27 Thread Dean Nelson
On Fri, Apr 27, 2007 at 12:34:02PM -0600, Eric W. Biederman wrote:
 Dean Nelson [EMAIL PROTECTED] writes:
 
  Christoph is correct in that XPC has a single thread that exists throughout
  its lifetime, another set of threads that exist for the time that active
  contact with other XPCs running on other SGI system partitions exists, and
  finally there is a pool of threads that exist on an as needed basis once
  a channel connection has been established between two partitions.
 
  In principle I approve of the kthread API and its use as opposed to what
  XPC currently does (calls kernel_thread(), daemonize(), 
  wait_for_completion(),
  and complete()). So Christoph's patch that changes the single long-lived
  thread to use kthread_stop() and kthread_should_stop() is appreciated.
 
  But the fact that another thread, started at the xpc_init() time, that does
  discovery of other SGI system partitions wasn't converted points out a
  weekness in either my thinking or the kthread API. This discovery thread
  does its job and then exits. Should XPC be rmmod'd while the discovery
  thread is still running we would need to do a kthread_stop() against it. 
  But kthread_stop() isn't set up to deal with a task that has already exited.
  And if what once was the task structure of this exited task has been
  reallocated to another new task, we'd end up stopping it should it be
  operating under the kthread API, or possibly waiting a very long time
  for it to exit if it is not.
 
 Patches are currently under development to allow kthreads to exit
 before kthread_stop is called.  The big thing is that once we allow
 kernel threads that exited by themselves to be reaped by kthread_stop
 we have some significant work to do.
 
  I'm also a little uneasy that kthread_stop() has an only one thread can
  stop another thread at a time design. It's a potential bottleneck on
  very large systems where threads are blocked and unable to respond to a
  kthread_should_stop() for some period of time.
 
 There are already patches out there to fix this issue.
 
  XPC is in need of threads that can block indefinitely, which is why XPC
  is in the business of maintaining a pool of threads. Currently there is
  no such capability (that I know of) that is provided by Linux. Workqueues
  can't block indefinitely.
 
 I'm not certain I understand this requirement.  Do you mean block indefinitely
 unless requested to stop?

These threads can block waiting for a hardware DMA engine, which has a 28
second timeout setpoint.

  And for performance reasons these threads need to be able to be created
  quickly. These threads are involved in delivering messages to XPC's users
  (like XPNET) and we had latency issues that led us to use kernel_thread()
  directly instead of the kthread API. Additionally, XPC may need to have
  hundreds of these threads active at any given time.
 
 Ugh.  Can you tell me a little more about the latency issues?

After placing a message in a local message queue, one SGI system partition
will interrupt another to retrieve the message. We need to minimize the
time from entering XPC's interrupt handler to the time that the message
can be DMA transferred and delivered to the consumer (like XPNET) to
whom it was sent.

 Is having a non-halting kthread_create enough to fix this?
 So you don't have to context switch several times to get the
 thread running?
 
 Or do you need more severe latency reductions?
 
 The more severe fix would require some significant changes to copy_process
 and every architecture would need to be touched to fix up copy_thread.
 It is possible, it is a lot of work, and the reward is far from obvious.

I think a non-halting kthread_create() should be sufficient. It is in
effect what XPC has now in calling kernel_thread() directly.

Taking it one step further, you could add the notion of a thread pool
where, upon exit, a thread isn't destroyed but rather is queued, ready to
handle the next kthread_create_quick() request.

  I think it would be great if the kthread API (or underlying implementation)
  could be changed to handle these issues. I'd love for XPC to not have to
  maintain this sort of thing itself.
 
 Currently daemonize is a serious maintenance problem.
 
 Using daemonize and kernel_thread to create kernel threads is a blocker
 in implementing the pid namespace because of their use of pid_t.
 
 So I am motivated to get this fixed.

This would also address the problems we see with huge pid spaces for
kernel threads on our largest machines.  In the example from last week,
we had 10 threads each on 4096 cpus.  If we reworked work_queues to use
the kthread_create_nonblocking() thread pool, we could probably collapse
the need for having all of those per-task, per-cpu work queues.

Dean
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-27 Thread Eric W. Biederman
Dean Nelson [EMAIL PROTECTED] writes:

 On Fri, Apr 27, 2007 at 12:34:02PM -0600, Eric W. Biederman wrote:
 Dean Nelson [EMAIL PROTECTED] writes:
 
  Christoph is correct in that XPC has a single thread that exists throughout
  its lifetime, another set of threads that exist for the time that active
  contact with other XPCs running on other SGI system partitions exists, and
  finally there is a pool of threads that exist on an as needed basis once
  a channel connection has been established between two partitions.
 
  In principle I approve of the kthread API and its use as opposed to what
  XPC currently does (calls kernel_thread(), daemonize(),
 wait_for_completion(),
  and complete()). So Christoph's patch that changes the single long-lived
  thread to use kthread_stop() and kthread_should_stop() is appreciated.
 
  But the fact that another thread, started at the xpc_init() time, that does
  discovery of other SGI system partitions wasn't converted points out a
  weakness in either my thinking or the kthread API. This discovery thread
  does its job and then exits. Should XPC be rmmod'd while the discovery
  thread is still running we would need to do a kthread_stop() against it. 
  But kthread_stop() isn't set up to deal with a task that has already 
  exited.
  And if what once was the task structure of this exited task has been
  reallocated to another new task, we'd end up stopping it should it be
  operating under the kthread API, or possibly waiting a very long time
  for it to exit if it is not.
 
 Patches are currently under development to allow kthreads to exit
 before kthread_stop is called.  The big thing is that once we allow
 kernel threads that exited by themselves to be reaped by kthread_stop
 we have some significant work to do.
 
  I'm also a little uneasy that kthread_stop() is designed so that only
  one thread can stop another at a time. It's a potential bottleneck on
  very large systems where threads are blocked and unable to respond to a
  kthread_should_stop() for some period of time.
 
 There are already patches out there to fix this issue.
 
  XPC is in need of threads that can block indefinitely, which is why XPC
  is in the business of maintaining a pool of threads. Currently there is
  no such capability (that I know of) that is provided by Linux. Workqueues
  can't block indefinitely.
 
 I'm not certain I understand this requirement.  Do you mean block 
 indefinitely
 unless requested to stop?

 These threads can block waiting for a hardware DMA engine, which has a 28
 second timeout setpoint.

Ok.  So this is an interruptible sleep?
Do you have any problems being woken up out of that interruptible sleep
by kthread_stop?

I am in the process of modifying kthread_stop to wake up threads in an
interruptible sleep and set signal_pending, so they will break out.
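In XPC terms the loop would look roughly like this (hedged sketch; xpc_dma_wq, xpc_dma_done() and xpc_deliver_msgs() are invented stand-ins, not actual XPC symbols):

```c
/*
 * Sketch of a worker sleeping interruptibly on hardware. With the
 * kthread_stop() change described above, the stop request wakes the
 * sleeper (signal_pending() becomes true), so the thread notices
 * kthread_should_stop() without riding out the 28-second hardware
 * timeout. The xpc_* names here are invented for illustration.
 */
static int xpc_dma_worker(void *arg)
{
	while (!kthread_should_stop()) {
		wait_event_interruptible(xpc_dma_wq,
					 xpc_dma_done() ||
					 kthread_should_stop());
		if (kthread_should_stop())
			break;
		xpc_deliver_msgs();
	}
	return 0;
}
```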

  And for performance reasons these threads need to be able to be created
  quickly. These threads are involved in delivering messages to XPC's users
  (like XPNET) and we had latency issues that led us to use kernel_thread()
  directly instead of the kthread API. Additionally, XPC may need to have
  hundreds of these threads active at any given time.
 
 Ugh.  Can you tell me a little more about the latency issues?

 After placing a message in a local message queue, one SGI system partition
 will interrupt another to retrieve the message. We need to minimize the
 time from entering XPC's interrupt handler to the time that the message
 can be DMA transferred and delivered to the consumer (like XPNET) to
 whom it was sent.

 Is having a non-halting kthread_create enough to fix this?
 So you don't have to context switch several times to get the
 thread running?
 
 Or do you need more severe latency reductions?
 
 The more severe fix would require some significant changes to copy_process
 and every architecture would need to be touched to fix up copy_thread.
 It is possible, it is a lot of work, and the reward is far from obvious.

 I think a non-halting kthread_create() should be sufficient. It is in
 effect what XPC has now in calling kernel_thread() directly.

A little different but pretty close.

We call kthread_create() it prepares everything and places it on
a queue and wakes up kthreadd.

kthreadd then wakes up and forks the thread.

After the thread has finishing setting up it will call complete on
a completion so kthread_create can continue on its merry way
but it should not need to go to sleep waiting for someone to
call kthread_bind.

But if you can live with what I have just described that will
be easy to code up.

It is a little slower than kernel_thread but hopefully not much.
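The create path Eric describes can be sketched roughly as follows (a simplified illustration of the mechanism as explained above; the struct and list names are approximate, not a quote of the actual patch):

```c
/*
 * Simplified illustration of the non-halting create path described
 * above: queue a request, wake kthreadd, and wait only for the new
 * thread to exist -- not for anyone to call kthread_bind().
 */
struct task_struct *kthread_create(int (*threadfn)(void *), void *data,
				   const char *namefmt, ...)
{
	struct kthread_create_info create = {
		.threadfn = threadfn,
		.data = data,
	};

	init_completion(&create.done);

	/* hand the request to kthreadd ... */
	list_add_tail(&create.list, &kthread_create_list);
	wake_up_process(kthreadd_task);

	/* ... which forks the thread and completes create.done */
	wait_for_completion(&create.done);
	return create.result;
}
```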

 Taking it one step further, if you added the notion of a thread pool,
 where upon exit, a thread isn't destroyed but rather is queued ready to
 handle the next kthread_create_quick() request.

That might happen.  So far I am avoiding the notion of a thread pool for
as long as I can.  There is some sense in it, especially in 

Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-26 Thread Dean Nelson
On Thu, Apr 19, 2007 at 01:58:44AM -0600, Eric W. Biederman wrote:
> From: Eric W. Biederman <[EMAIL PROTECTED]>
> 
> This patch starts the xpc kernel threads using kthread_run
> not a combination of kernel_thread and daemonize.  Resulting
> in slightly simpler and more maintainable code.
> 
> Cc: Jes Sorensen <[EMAIL PROTECTED]>
> Cc: Tony Luck <[EMAIL PROTECTED]>
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> ---
>  arch/ia64/sn/kernel/xpc_main.c |   31 +--
>  1 files changed, 13 insertions(+), 18 deletions(-)

Acked-by: Dean Nelson <[EMAIL PROTECTED]>

Andrew, I've tested Eric's patch in conjunction with a fix from Christoph
Lameter, which you've already got (it cleaned up a couple of compiler errors),
and the following patch that I'd like added (it cleans up a couple of
compiler warnings and makes a few cosmetic changes).

Thanks,
Dean


Signed-off-by: Dean Nelson <[EMAIL PROTECTED]>

Index: mm-tree/arch/ia64/sn/kernel/xpc_main.c
===
--- mm-tree.orig/arch/ia64/sn/kernel/xpc_main.c 2007-04-25 14:04:51.701213426 
-0500
+++ mm-tree/arch/ia64/sn/kernel/xpc_main.c  2007-04-26 06:29:02.447330438 
-0500
@@ -568,7 +568,6 @@
 
task = kthread_run(xpc_activating, (void *) ((u64) partid),
  "xpc%02d", partid);
-
if (unlikely(IS_ERR(task))) {
spin_lock_irqsave(&part->act_lock, irq_flags);
part->act_state = XPC_P_INACTIVE;
@@ -808,7 +807,6 @@
int ignore_disconnecting)
 {
unsigned long irq_flags;
-   pid_t pid;
u64 args = XPC_PACK_ARGS(ch->partid, ch->number);
struct xpc_partition *part = &xpc_partitions[ch->partid];
struct task_struct *task;
@@ -840,7 +838,7 @@
(void) xpc_part_ref(part);
xpc_msgqueue_ref(ch);
 
-   task = kthread_run(xpc_daemonize_kthread, args,
+   task = kthread_run(xpc_daemonize_kthread, (void *) args,
   "xpc%02dc%d", ch->partid, ch->number);
if (IS_ERR(task)) {
/* the fork failed */
@@ -1381,7 +1379,8 @@
 * activate based on info provided by SAL. This new thread is short
 * lived and will exit once discovery is complete.
 */
-   task = kthread_run(xpc_initiate_discovery, NULL, 
XPC_DISCOVERY_THREAD_NAME);
+   task = kthread_run(xpc_initiate_discovery, NULL,
+  XPC_DISCOVERY_THREAD_NAME);
if (IS_ERR(task)) {
dev_err(xpc_part, "failed while forking discovery thread\n");
 


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-23 Thread Russ Anderson
Jes Sorensen wrote:
> 
> Russ/Dean/Robin - could one of you provide some feedback to this one
> please.

Dean's on vacation for a couple days and will test it when he gets back.

-- 
Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc  [EMAIL PROTECTED]


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-23 Thread Eric W. Biederman
Jes Sorensen <[EMAIL PROTECTED]> writes:

>
> Like with the previous patch from Eric, I'm CC'ing the correct people
> for this patch (forwarded it in a seperate email). CC'ing irrelevant
> lists such as containers@ and not linux-ia64@ makes it somewhat
> difficult to get proper reviews of these things.

containers is actually relevant because everything not being converted
to a kthread API is actually show stopper issue for the development of
the pid namespace because of the usage of pids by kernel_thread, and
the apparent impossibility to fix daemonize to sort this out.

Now I do agree linux-ia64 is also relevant.

> Russ/Dean/Robin - could one of you provide some feedback to this one
> please.

Eric


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-23 Thread Jes Sorensen

Christoph Hellwig wrote:

On Thu, Apr 19, 2007 at 01:58:44AM -0600, Eric W. Biederman wrote:

From: Eric W. Biederman <[EMAIL PROTECTED]>

This patch starts the xpc kernel threads using kthread_run
not a combination of kernel_thread and daemonize.  Resulting
in slightly simpler and more maintainable code.


This driver is a really twisted maze.  It has a lot of threads,
some of them running through the whole lifetime of the driver,
some short-lived and some in a sort of a pool.

The patch below fixes up the long-lived thread as well as fixing
gazillions of leaks in the init routine by switching to proper
goto-based unwinding.

Note that thread pools are something we have in a few places,
and might be worth handling in the core kthread infrastructure,
as tearing down pools will get a bit complicated using the
kthread APIs.


Like with the previous patch from Eric, I'm CC'ing the correct people
for this patch (forwarded it in a seperate email). CC'ing irrelevant
lists such as containers@ and not linux-ia64@ makes it somewhat
difficult to get proper reviews of these things.

Russ/Dean/Robin - could one of you provide some feedback to this one
please.

Thanks,
Jes



Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-22 Thread Christoph Hellwig
On Thu, Apr 19, 2007 at 01:58:44AM -0600, Eric W. Biederman wrote:
> From: Eric W. Biederman <[EMAIL PROTECTED]>
> 
> This patch starts the xpc kernel threads using kthread_run
> not a combination of kernel_thread and daemonize.  Resulting
> in slightly simpler and more maintainable code.

This driver is a really twisted maze.  It has a lot of threads,
some of them running through the whole lifetime of the driver,
some short-lived and some in a sort of a pool.

The patch below fixes up the long-lived thread as well as fixing
gazillions of leaks in the init routine by switching to proper
goto-based unwinding.

Note that thread pools are something we have in a few places,
and might be worth handling in the core kthread infrastructure,
as tearing down pools will get a bit complicated using the
kthread APIs.


Signed-off-by: Christoph Hellwig <[EMAIL PROTECTED]>

Index: linux-2.6/arch/ia64/sn/kernel/xpc_main.c
===
--- linux-2.6.orig/arch/ia64/sn/kernel/xpc_main.c   2007-04-22 
21:19:22.0 +0200
+++ linux-2.6/arch/ia64/sn/kernel/xpc_main.c2007-04-22 21:33:54.0 
+0200
@@ -55,6 +55,7 @@
 #include <linux/delay.h>
 #include <linux/reboot.h>
 #include <linux/completion.h>
+#include <linux/kthread.h>
 #include <asm/sn/intr.h>
 #include <asm/sn/sn_sal.h>
 #include <asm/kdebug.h>
@@ -159,16 +160,14 @@ static struct ctl_table_header *xpc_sysc
 int xpc_disengage_request_timedout;
 
 /* #of IRQs received */
-static atomic_t xpc_act_IRQ_rcvd;
+static atomic_t xpc_act_IRQ_rcvd = ATOMIC_INIT(0);
 
 /* IRQ handler notifies this wait queue on receipt of an IRQ */
 static DECLARE_WAIT_QUEUE_HEAD(xpc_act_IRQ_wq);
 
+static struct task_struct *xpc_hb_checker_thread;
 static unsigned long xpc_hb_check_timeout;
 
-/* notification that the xpc_hb_checker thread has exited */
-static DECLARE_COMPLETION(xpc_hb_checker_exited);
-
 /* notification that the xpc_discovery thread has exited */
 static DECLARE_COMPLETION(xpc_discovery_exited);
 
@@ -250,17 +249,10 @@ xpc_hb_checker(void *ignore)
int new_IRQ_count;
int force_IRQ=0;
 
-
/* this thread was marked active by xpc_hb_init() */
-
-   daemonize(XPC_HB_CHECK_THREAD_NAME);
-
-   set_cpus_allowed(current, cpumask_of_cpu(XPC_HB_CHECK_CPU));
-
xpc_hb_check_timeout = jiffies + (xpc_hb_check_interval * HZ);
 
-   while (!(volatile int) xpc_exiting) {
-
+   while (!kthread_should_stop()) {
dev_dbg(xpc_part, "woke up with %d ticks rem; %d IRQs have "
"been received\n",
(int) (xpc_hb_check_timeout - jiffies),
@@ -304,14 +296,10 @@ xpc_hb_checker(void *ignore)
(void) wait_event_interruptible(xpc_act_IRQ_wq,
(last_IRQ_count < atomic_read(&xpc_act_IRQ_rcvd) ||
jiffies >= xpc_hb_check_timeout ||
-   (volatile int) xpc_exiting));
+   kthread_should_stop()));
}
 
dev_dbg(xpc_part, "heartbeat checker is exiting\n");
-
-
-   /* mark this thread as having exited */
-   complete(&xpc_hb_checker_exited);
return 0;
 }
 
@@ -966,9 +954,7 @@ xpc_do_exit(enum xpc_retval reason)
/* wait for the discovery thread to exit */
wait_for_completion(&xpc_discovery_exited);
 
-   /* wait for the heartbeat checker thread to exit */
-   wait_for_completion(&xpc_hb_checker_exited);
-
+   kthread_stop(xpc_hb_checker_thread);
 
/* sleep for a 1/3 of a second or so */
(void) msleep_interruptible(300);
@@ -1219,29 +1205,29 @@ xpc_system_die(struct notifier_block *nb
 int __init
 xpc_init(void)
 {
-   int ret;
+   int ret = -ENODEV;
partid_t partid;
struct xpc_partition *part;
pid_t pid;
size_t buf_size;
 
+   if (!ia64_platform_is("sn2"))
+   goto out;
 
-   if (!ia64_platform_is("sn2")) {
-   return -ENODEV;
-   }
-
-
+   ret = -ENOMEM;
buf_size = max(XPC_RP_VARS_SIZE,
XPC_RP_HEADER_SIZE + XP_NASID_MASK_BYTES);
xpc_remote_copy_buffer = xpc_kmalloc_cacheline_aligned(buf_size,
 GFP_KERNEL, &xpc_remote_copy_buffer_base);
-   if (xpc_remote_copy_buffer == NULL)
-   return -ENOMEM;
+   if (!xpc_remote_copy_buffer)
+   goto out;
 
snprintf(xpc_part->bus_id, BUS_ID_SIZE, "part");
snprintf(xpc_chan->bus_id, BUS_ID_SIZE, "chan");
 
xpc_sysctl = register_sysctl_table(xpc_sys_dir);
+   if (!xpc_sysctl)
+   goto out_free_remote_buffer;
 
/*
 * The first few fields of each entry of xpc_partitions[] need to
@@ -1278,12 +1264,6 @@ xpc_init(void)
xpc_allow_IPI_ops();
 
/*
-* Interrupts being processed will increment this atomic variable and
-* awaken the heartbeat thread which will process the interrupts.
-*/
-   atomic_set(&xpc_act_IRQ_rcvd, 0);
-
-

Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-21 Thread Eric W. Biederman
Robin Holt <[EMAIL PROTECTED]> writes:

> I think this was originally coded with daemonize to avoid issues with
> reaping children.  Dean Nelson can correct me if I am wrong.  I assume
> this patch is going in as part of the set which will make these threads
> clear themselves from the children list and if that is the case, I can
> see no issues.

One of my earlier patches guarantees that kthreadd will have pid == 2.

daemonize actually explicitly reparents to init so using daemonize and
kernel_thread provides no help at all with respect to scaling.  It in
fact guarantees you will be on init's list of child processes.

The work to enhance wait is a little tricky and it conflicts with the
utrace patches, which makes it hard to pursue at the moment.

I'm actually sorting out kthread stop so I can complete the pid namespace.
But since all kthreads are children of kthreadd this helps in a small
way with the scaling issue.

Eric



Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-20 Thread Robin Holt
On Fri, Apr 20, 2007 at 08:23:39AM +0200, Jes Sorensen wrote:
> Andrew Morton wrote:
> >Another driver which should be fully converted to the kthread API:
> >kthread_stop() and kthread_should_stop().
> >
> >And according to my logs, this driver was added to the tree more than
> >a year _after_ the kthread interface was made available.
> >
> >This isn't good.
> 
> Andrew,
> 
> Per my previous response, I'd prefer to have either Russ or Robin ack
> the patch doesn't break before it's pushed to Linus.
> 
> I don't know much about the xpmem and I am not comfortable testing it.

I think this was originally coded with daemonize to avoid issues with
reaping children.  Dean Nelson can correct me if I am wrong.  I assume
this patch is going in as part of the set which will make these threads
clear themselves from the children list and if that is the case, I can
see no issues.

Thanks,
Robin


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-20 Thread Jes Sorensen

Andrew Morton wrote:

Another driver which should be fully converted to the kthread API:
kthread_stop() and kthread_should_stop().

And according to my logs, this driver was added to the tree more than
a year _after_ the kthread interface was made available.

This isn't good.


Andrew,

Per my previous response, I'd prefer to have either Russ or Robin ack
the patch doesn't break before it's pushed to Linus.

I don't know much about the xpmem and I am not comfortable testing it.

Cheers,
Jes


Re: [PATCH] ia64 sn xpc: Convert to use kthread API.

2007-04-19 Thread Andrew Morton
On Thu, 19 Apr 2007 01:58:44 -0600
"Eric W. Biederman" <[EMAIL PROTECTED]> wrote:

> 
> This patch starts the xpc kernel threads using kthread_run
> not a combination of kernel_thread and daemonize, resulting
> in slightly simpler and more maintainable code.
> 
> Cc: Jes Sorensen <[EMAIL PROTECTED]>
> Cc: Tony Luck <[EMAIL PROTECTED]>
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> ---
>  arch/ia64/sn/kernel/xpc_main.c |   31 +--

Another driver which should be fully converted to the kthread API:
kthread_stop() and kthread_should_stop().

And according to my logs, this driver was added to the tree more than
a year _after_ the kthread interface was made available.

This isn't good.
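For anyone following along, the full conversion Andrew is asking for
looks roughly like the sketch below.  This is illustrative kernel-style
code, not the actual xpc source; the thread function name and arguments
are hypothetical:

	/* old style: fire-and-forget; the thread must daemonize() itself,
	 * and there is no clean way to ask it to exit */
	pid = kernel_thread(xpc_thread_fn, NULL, 0);

	/* kthread style: kthread_run() creates and wakes the thread, and
	 * returns a task_struct the creator can later pass to kthread_stop() */
	struct task_struct *task = kthread_run(xpc_thread_fn, NULL, "xpc_thread");

	static int xpc_thread_fn(void *arg)
	{
		/* the thread polls for a stop request instead of running forever */
		while (!kthread_should_stop()) {
			/* do work, sleep, etc. */
		}
		return 0;
	}

	/* on module unload / teardown: asks the thread to stop and waits
	 * for it to exit */
	kthread_stop(task);

The key point for the module-unload discussion elsewhere in this thread
is that kthread_stop() blocks until the thread function returns, which
is what makes rmmod safe without hand-rolled completions.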