RE: Kernel 2.2: tq_scheduler functions scheduling and waiting
I have tested with a kernel thread running the tq_scheduler, and it is much
more stable. The kernel still ran into a problem in n_tty.c, in which the
compiler optimized out the check "if (!tty)" in n_tty_set_termios(); I am
still investigating the right solution to this. As a long-term fix, I will
review the 2.4 and latest 2.2 sources.

> Yes. The situation where one task is on two waitqueues
> is rare, but does happen. And yes, there is code out there
> which does a bare schedule() and *assumes* that once the
> schedule has returned, the thing it was waiting for has
> indeed occurred.
>
> Generally this is poor practice - it's safer to loop
> over the schedule() call until the condition you're
> sleeping on has been tested.

I see your point. It would prevent this type of problem if all code waiting
for conditions made certain those conditions were met. However, given the
way the kernel works, it is not necessary to check unless the task
specifically expects more than one condition to awaken it - at least it
wasn't until tq_scheduler was introduced. Actually, that is not fair
either - it only became a problem when functions in tq_scheduler started
blocking.

It would help me tremendously if these types of limitations and
requirements for working in the kernel were well documented. It takes
significant effort to determine the requirements, and to verify that my
understanding is correct.

> You really shouldn't be sleeping in this way on tq_scheduler
> if there's any way in which the sleep can take an extended
> period of time. You may end up putting important kernel
> tasks to sleep.

I agree. In addition, even if the tq_scheduler function did check for its
own condition, a problem still exists when the task returns to the code
using the first wait queue before its condition is met, since the code
using the second wait queue would set the task state to running and would
not set it back (which it couldn't without knowing the conditions to
check).
> Best to use schedule_task(), or an independent kernel thread.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel 2.2: tq_scheduler functions scheduling and waiting
Arthur Naseef wrote:
>
> Andrew:
>
> Excellent. I will look at the 2.4 sources.
>
> In addition to the TASK_ZOMBIE issue you mention, I believe there
> is an issue of false termination of wait queues. Consider this:
>
>    - Task places itself on a wait queue
>    - Calls schedule()
>    - tq_scheduler function does the same
>
> Now, there are two events which could place the task in TASK_RUNNING
> and no clear way to differentiate. And, since most of the kernel
> code does not check that the wait condition was actually met, this
> could lead to all types of problems, right?
>

Yes. The situation where one task is on two waitqueues
is rare, but does happen. And yes, there is code out there
which does a bare schedule() and *assumes* that once the
schedule has returned, the thing it was waiting for has
indeed occurred.

Generally this is poor practice - it's safer to loop
over the schedule() call until the condition you're
sleeping on has been tested.

You really shouldn't be sleeping in this way on tq_scheduler
if there's any way in which the sleep can take an extended
period of time. You may end up putting important kernel
tasks to sleep.

Best to use schedule_task(), or an independent kernel thread.
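The false-wakeup scenario in this exchange can be illustrated outside the
kernel. What follows is a minimal user-space sketch in C, not kernel code
and not taken from any kernel source. All names (toy_task, toy_event,
toy_world, toy_wake_up, toy_schedule, bare_wait, looping_wait,
condition_met_on_return) are hypothetical and exist only for this
illustration. It assumes schedule() can be modeled as "sleep until the next
wakeup arrives", and that a wakeup carries no information about which wait
queue it came from.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not kernel APIs. */
enum toy_state { TOY_RUNNING, TOY_SLEEPING };

struct toy_task {
    enum toy_state state;
};

/* An event a waiter may be sleeping on. */
struct toy_event {
    bool condition;              /* what the waiter is really waiting for */
};

struct toy_world {
    struct toy_task task;
    struct toy_event *pending[4]; /* wakeups that will arrive, in order */
    size_t npending;
    size_t next;
};

/* Models wake_up(): the event becomes true and the task becomes runnable.
 * The task cannot tell which event made it runnable. */
static void toy_wake_up(struct toy_event *ev, struct toy_task *t)
{
    assert(ev != NULL && t != NULL);
    ev->condition = true;
    t->state = TOY_RUNNING;
}

/* Models schedule(): sleep until the next wakeup arrives.
 * Returns false if no wakeup is left (the task would sleep forever). */
static bool toy_schedule(struct toy_world *w)
{
    if (w->next >= w->npending)
        return false;
    toy_wake_up(w->pending[w->next++], &w->task);
    return true;
}

/* Bare schedule(): returns after the first wakeup, whatever caused it.
 * The return value reports whether our own condition was actually met. */
static bool bare_wait(struct toy_world *w, struct toy_event *mine)
{
    w->task.state = TOY_SLEEPING;
    toy_schedule(w);
    return mine->condition;
}

/* Looping wait: re-test the condition after every wakeup. */
static bool looping_wait(struct toy_world *w, struct toy_event *mine)
{
    while (!mine->condition) {
        w->task.state = TOY_SLEEPING;
        if (!toy_schedule(w))
            return false;
    }
    w->task.state = TOY_RUNNING;
    return true;
}

/* Scenario from the thread: the task is on two wait queues and the wakeup
 * for the other one (the tq_scheduler function's) arrives first. */
bool condition_met_on_return(bool use_loop)
{
    struct toy_event mine = { false };
    struct toy_event other = { false };
    struct toy_world w = { { TOY_RUNNING }, { &other, &mine }, 2, 0 };

    return use_loop ? looping_wait(&w, &mine) : bare_wait(&w, &mine);
}
```

In this model, condition_met_on_return(false) shows the bare waiter
resuming while its own condition is still false, and
condition_met_on_return(true) shows the looping waiter resuming only after
its own event has fired. The sketch does not model the further problem of
the second waiter leaving the task state set to running.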
RE: Kernel 2.2: tq_scheduler functions scheduling and waiting
Andrew:

Excellent. I will look at the 2.4 sources.

In addition to the TASK_ZOMBIE issue you mention, I believe there
is an issue of false termination of wait queues. Consider this:

   - Task places itself on a wait queue
   - Calls schedule()
   - tq_scheduler function does the same

Now, there are two events which could place the task in TASK_RUNNING
and no clear way to differentiate. And, since most of the kernel
code does not check that the wait condition was actually met, this
could lead to all types of problems, right?

-art

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of Andrew Morton
Sent: Monday, May 28, 2001 10:28 PM
To: Arthur Naseef
Cc: [EMAIL PROTECTED]
Subject: Re: Kernel 2.2: tq_scheduler functions scheduling and waiting

Arthur Naseef wrote:
>
> All:
>
> I have been diagnosing kernel panics for over a week and I have
> concerns with the use of tq_scheduler for which I was hoping I
> could get some assistance.
>
> Is it considered acceptable for functions in the tq_scheduler
> task list to call schedule? Is it acceptable for such functions
> to wait on wait queues? What limitations exist?

When a task wants to exit, it cleans up all its stuff, sets
its state to TASK_ZOMBIE and then calls schedule(). The scheduler
takes it off the runqueue and the task is never again executed. It's
just a couple of stack pages which are waiting for someone in wait4()
to release.

But imagine what happens if the TASK_ZOMBIE task hits schedule()
and finds a tq_scheduler task to run. And that task calls
schedule(). In state TASK_ZOMBIE. Messy. At the very least,
the schedule() call will never return.

If the tq_scheduler task sets current->state to TASK_[UN]INTERRUPTIBLE
(as it should) before calling schedule() then it has overwritten
TASK_ZOMBIE and the task which is trying to exit has become
magically resurrected. As far as I can tell, the "dead" task will
run again, do the `fake_volatile' thing in do_exit() and try
to go zombie again.
It would be very interesting to change the test in schedule():

	sti();
-	if (tq_scheduler)
+	if (tq_scheduler && current->state != TASK_ZOMBIE)
		goto handle_tq_scheduler;

It's all rather unpleasant, and tq_scheduler was killed in 2.4. I
suggest you take a look at all the serial drivers in 2.4, see how
I converted them to use schedule_task(). Someone kindly ported
schedule_task() to 2.2.recent, so you should be able to use that
in the same way.
Re: Kernel 2.2: tq_scheduler functions scheduling and waiting
Arthur Naseef wrote:
>
> All:
>
> I have been diagnosing kernel panics for over a week and I have
> concerns with the use of tq_scheduler for which I was hoping I
> could get some assistance.
>
> Is it considered acceptable for functions in the tq_scheduler
> task list to call schedule? Is it acceptable for such functions
> to wait on wait queues? What limitations exist?

When a task wants to exit, it cleans up all its stuff, sets
its state to TASK_ZOMBIE and then calls schedule(). The scheduler
takes it off the runqueue and the task is never again executed. It's
just a couple of stack pages which are waiting for someone in wait4()
to release.

But imagine what happens if the TASK_ZOMBIE task hits schedule()
and finds a tq_scheduler task to run. And that task calls
schedule(). In state TASK_ZOMBIE. Messy. At the very least,
the schedule() call will never return.

If the tq_scheduler task sets current->state to TASK_[UN]INTERRUPTIBLE
(as it should) before calling schedule() then it has overwritten
TASK_ZOMBIE and the task which is trying to exit has become
magically resurrected. As far as I can tell, the "dead" task will
run again, do the `fake_volatile' thing in do_exit() and try
to go zombie again.

It would be very interesting to change the test in schedule():

	sti();
-	if (tq_scheduler)
+	if (tq_scheduler && current->state != TASK_ZOMBIE)
		goto handle_tq_scheduler;

It's all rather unpleasant, and tq_scheduler was killed in 2.4. I
suggest you take a look at all the serial drivers in 2.4, see how
I converted them to use schedule_task(). Someone kindly ported
schedule_task() to 2.2.recent, so you should be able to use that
in the same way.
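The TASK_ZOMBIE overwrite described in this message, and the effect of the
one-line test suggested for schedule(), can be sketched with a toy model.
This is user-space C for illustration only and is not the 2.2 scheduler.
The names zt_state, zt_task, zt_sleeping_callback, zt_schedule and
exit_stays_zombie are hypothetical. It assumes that a queued tq_scheduler
function runs in the context of whichever task called schedule(), and that
it sets that task's state before sleeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not the real scheduler. */
enum zt_state { ZT_RUNNING, ZT_INTERRUPTIBLE, ZT_ZOMBIE };

struct zt_task {
    enum zt_state state;
};

/* A tq_scheduler-style function that wants to sleep: it sets the state of
 * whatever task happens to be current before it would call schedule(). */
static void zt_sleeping_callback(struct zt_task *current_task)
{
    current_task->state = ZT_INTERRUPTIBLE;
}

/* Models the entry of schedule(): a queued callback runs in the caller's
 * context. With guard_zombie set, it is skipped for a zombie caller,
 * mirroring the proposed "current->state != TASK_ZOMBIE" test. */
static void zt_schedule(struct zt_task *current_task, bool callback_queued,
                        bool guard_zombie)
{
    assert(current_task != 0);
    if (callback_queued &&
        !(guard_zombie && current_task->state == ZT_ZOMBIE))
        zt_sleeping_callback(current_task);
}

/* An exiting task marks itself zombie and calls schedule() one last time
 * while a sleeping callback is queued. Returns true if it is still a
 * zombie afterwards. */
bool exit_stays_zombie(bool guard_zombie)
{
    struct zt_task t = { ZT_RUNNING };

    t.state = ZT_ZOMBIE;
    zt_schedule(&t, true, guard_zombie);
    return t.state == ZT_ZOMBIE;
}
```

Without the guard, the model shows the zombie state being overwritten
(exit_stays_zombie(false) is false); with it, the exiting task stays a
zombie and the callback is left for a later schedule() call by some other
task. The model only mirrors the suggested test; whether that change is
sufficient in a real 2.2 kernel is not established in this thread.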