ACPI: created a dedicated workqueue for notify() execution

2007-05-10 Thread Linux Kernel Mailing List
Gitweb: 
http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=88db5e1489f23876a226f5393fd978ddc09dc5f9
Commit: 88db5e1489f23876a226f5393fd978ddc09dc5f9
Parent: 262a7a28de060f3a63cae20035876d6f22fd7670
Author: Alexey Starikovskiy <[EMAIL PROTECTED]>
AuthorDate: Wed May 9 23:31:03 2007 -0400
Committer:  Len Brown <[EMAIL PROTECTED]>
CommitDate: Wed May 9 23:31:03 2007 -0400

ACPI: created a dedicated workqueue for notify() execution

HP nx6125/nx6325/... machines have a _GPE handler with an infinite
loop sending Notify() events to different ACPI subsystems.

Notify handler in ACPI driver is a C-routine, which may call ACPI
interpreter again to get access to some ACPI variables
(acpi_evaluate_xxx).
On these HP machines such an evaluation changes state of some variable
and lets the loop above break.

In the current ACPI implementation Notify requests are being deferred
to the same kacpid workqueue on which the above GPE handler with
infinite loop is executing. Thus we have a deadlock -- loop will
continue to spin, sending notify events, and at the same time
preventing these notify events from being run on a workqueue. All
notify events are deferred, thus we see increase in memory consumption
noticed by author of the thread. Also as GPE handling is bloked,
machines overheat. Eventually by external poll of the same
acpi_evaluate, kacpid is released and all the queued notify events are
free to run, thus 100% cpu utilization by kacpid for several seconds
or more.

To prevent all these horrors it's needed to not put notify events to
kacpid workqueue by either executing them immediately or putting them
on some other thread. It's dangerous to execute notify events in
place, as it will put several ACPI interpreter stacks on top of each
other (at least 4 in case of nx6125), thus causing kernel  stack
overflow.

First attempt to create a new thread was done by Peter Wainwright
He created a bunch of threads, which were stealing work from a kacpid
workqueue.
This patch appeared in 2.6.15 kernel shipped with Ubuntu 6.06 LTS.

Second attempt was done by me, I created a new thread for each Notify
event. This worked OK on HP nx machines, but broke Linus' Compaq
n620c, by producing threads with a speed what they stopped the machine
completely. Thus this patch was reverted from 18-rc2 as I remember.
I re-made the patch to create second workqueue just for notify events,
thus hopping it will not break Linus' machine. Patch was tested on the
same HP nx machines in #5534 and #7122, but I did not received reply
from Linus on a test patch sent to him.
Patch went to 19-rc and was rejected with much fanfare again.
There was 4th patch, which inserted schedule_timeout(1) into deferred
execution of kacpid, if we had any notify requests pending, but Linus
decided that it was too complex (involved either changes to workqueue
to see if it's empty or atomic inc/dec).
Now you see last variant which adds yield() to every GPE execution.

http://bugzilla.kernel.org/show_bug.cgi?id=5534
http://bugzilla.kernel.org/show_bug.cgi?id=8385

Signed-off-by: Alexey Starikovskiy <[EMAIL PROTECTED]>
Signed-off-by: Len Brown <[EMAIL PROTECTED]>
---
 drivers/acpi/osl.c |   45 +++--
 1 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index c2bed56..b998340 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -71,6 +71,7 @@ static unsigned int acpi_irq_irq;
 static acpi_osd_handler acpi_irq_handler;
 static void *acpi_irq_context;
 static struct workqueue_struct *kacpid_wq;
+static struct workqueue_struct *kacpi_notify_wq;
 
 static void __init acpi_request_region (struct acpi_generic_address *addr,
unsigned int length, char *desc)
@@ -137,8 +138,9 @@ acpi_status acpi_os_initialize1(void)
return AE_NULL_ENTRY;
}
kacpid_wq = create_singlethread_workqueue("kacpid");
+   kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
BUG_ON(!kacpid_wq);
-
+   BUG_ON(!kacpi_notify_wq);
return AE_OK;
 }
 
@@ -150,6 +152,7 @@ acpi_status acpi_os_terminate(void)
}
 
destroy_workqueue(kacpid_wq);
+   destroy_workqueue(kacpi_notify_wq);
 
return AE_OK;
 }
@@ -603,6 +606,23 @@ void acpi_os_derive_pci_id(acpi_handle rhandle,/* 
upper bound  */
 static void acpi_os_execute_deferred(struct work_struct *work)
 {
struct acpi_os_dpc *dpc = container_of(work, struct acpi_os_dpc, work);
+   if (!dpc) {
+   printk(KERN_ERR PREFIX "Invalid (NULL) context\n");
+   return;
+   }
+
+   dpc->f

Revert "ACPI: created a dedicated workqueue for notify() execution"

2006-11-17 Thread Linux Kernel Mailing List
commit b976fe19acc565e5137e6f12af7b6633a23e6b7c
tree f5bd7eecbee3c165ff97ab8c642cae4f421a3cec
parent 808dbbb6bb61173bf52946a28f99089d2efa4c55
author Linus Torvalds <[EMAIL PROTECTED]> 1163820669 -0800
committer Linus Torvalds <[EMAIL PROTECTED]> 1163820669 -0800

Revert "ACPI: created a dedicated workqueue for notify() execution"

This reverts commit 37605a6900f6b4d886d995751fcfeef88c4e462c.

Again.

This same bug has now been introduced twice: it was done earlier by
commit b8d35192c55fb055792ff0641408eaaec7c88988, only to be reverted
last time in commit 72945b2b90a5554975b8f72673ab7139d232a121.

We must NOT try to queue up notify handlers to another thread than the
normal ACPI execution thread, because the notifications on some systems
seem to just keep on accumulating until we run out of memory and/or
threads.

Keeping events within the one deferred execution thread automatically
throttles the events properly.

At least the Compaq N620c will lock up completely on the first thermal
event without this patch reverted.

Cc: David Brownell <[EMAIL PROTECTED]>
Cc: Len Brown <[EMAIL PROTECTED]>
Cc: Alexey Starikovskiy <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

 drivers/acpi/osl.c |   34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index c84286c..068fe4f 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -73,7 +73,6 @@ static unsigned int acpi_irq_irq;
 static acpi_osd_handler acpi_irq_handler;
 static void *acpi_irq_context;
 static struct workqueue_struct *kacpid_wq;
-static struct workqueue_struct *kacpi_notify_wq;
 
 acpi_status acpi_os_initialize(void)
 {
@@ -92,9 +91,8 @@ acpi_status acpi_os_initialize1(void)
return AE_NULL_ENTRY;
}
kacpid_wq = create_singlethread_workqueue("kacpid");
-   kacpi_notify_wq = create_singlethread_workqueue("kacpi_notify");
BUG_ON(!kacpid_wq);
-   BUG_ON(!kacpi_notify_wq);
+
return AE_OK;
 }
 
@@ -106,7 +104,6 @@ acpi_status acpi_os_terminate(void)
}
 
destroy_workqueue(kacpid_wq);
-   destroy_workqueue(kacpi_notify_wq);
 
return AE_OK;
 }
@@ -569,7 +566,10 @@ void acpi_os_derive_pci_id(acpi_handle r
 
 static void acpi_os_execute_deferred(void *context)
 {
-   struct acpi_os_dpc *dpc = (struct acpi_os_dpc *)context;
+   struct acpi_os_dpc *dpc = NULL;
+
+
+   dpc = (struct acpi_os_dpc *)context;
if (!dpc) {
printk(KERN_ERR PREFIX "Invalid (NULL) context\n");
return;
@@ -604,12 +604,14 @@ acpi_status acpi_os_execute(acpi_execute
struct acpi_os_dpc *dpc;
struct work_struct *task;
 
+   ACPI_FUNCTION_TRACE("os_queue_for_execution");
+
ACPI_DEBUG_PRINT((ACPI_DB_EXEC,
  "Scheduling function [%p(%p)] for deferred 
execution.\n",
  function, context));
 
if (!function)
-   return AE_BAD_PARAMETER;
+   return_ACPI_STATUS(AE_BAD_PARAMETER);
 
/*
 * Allocate/initialize DPC structure.  Note that this memory will be
@@ -622,20 +624,26 @@ acpi_status acpi_os_execute(acpi_execute
 * from the same memory.
 */
 
-   dpc = kmalloc(sizeof(struct acpi_os_dpc) +
-   sizeof(struct work_struct), GFP_ATOMIC);
+   dpc =
+   kmalloc(sizeof(struct acpi_os_dpc) + sizeof(struct work_struct),
+   GFP_ATOMIC);
if (!dpc)
-   return AE_NO_MEMORY;
+   return_ACPI_STATUS(AE_NO_MEMORY);
+
dpc->function = function;
dpc->context = context;
+
task = (void *)(dpc + 1);
INIT_WORK(task, acpi_os_execute_deferred, (void *)dpc);
-   if (!queue_work((type == OSL_NOTIFY_HANDLER)?
-   kacpi_notify_wq : kacpid_wq, task)) {
-   status = AE_ERROR;
+
+   if (!queue_work(kacpid_wq, task)) {
+   ACPI_DEBUG_PRINT((ACPI_DB_ERROR,
+ "Call to queue_work() failed.\n"));
kfree(dpc);
+   status = AE_ERROR;
}
-   return status;
+
+   return_ACPI_STATUS(status);
 }
 
 EXPORT_SYMBOL(acpi_os_execute);
-
To unsubscribe from this list: send the line "unsubscribe git-commits-head" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html